Data Engineering Tools and Technologies

🚀 My personal Hall of Fame of tools and technologies 🛠️ (Python ecosystem edition)

These tools and technologies have served me well over the years, so I thought I’d share them:

🔵 Click, Loguru: Creating command-line interfaces with readable logging
🔵 dotenv, PyYAML: Managing configuration
🔵 pytest: Making sure your code actually does what you want
🔵 Pandas: Extracting data from CSV/Excel/Parquet
🔵 MinIO, Parquet, SQLite, PostgreSQL, Snowflake: Storing structured and unstructured data
🔵 dbt: SQL management
🔵 Superset: Reporting and dynamic dashboarding
🔵 FastAPI: Building robust APIs
🔵 Streamlit: Rapid prototyping
🔵 Flask, Jinja2, Bootstrap CSS, jQuery, Gunicorn, Nginx: A simple yet powerful web stack
🔵 scikit-learn, LightGBM, PyTorch: Batch ML for predictive analytics
🔵 River: Online ML for real-time insights
🔵 Docker: Seamlessly packaging applications for deployment
🔵 Jenkins: Orchestrating builds and automating workflows
🔵 Git: Keeping code versioned
🔵 cron: Triggering tasks even when I’m sleeping 💤
🔵 Ubuntu: A rock-solid OS to build servers that run for years
🔵 DigitalOcean: Fast VMs in the cloud
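
To make one of these combinations concrete, here is a minimal sketch of the kind of small batch job that cron could trigger: it parses CSV data and stores it in SQLite. It is an illustration under assumptions, not a description of any real pipeline. The sample data, the `readings` table, and the `load_rows` and `warmest` helpers are all hypothetical, and Python's standard-library `csv` module stands in for Pandas so the sketch runs without extra dependencies.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; a real job would read a file instead.
SAMPLE_CSV = "city,temp_c\nZurich,21.5\nBern,19.0\nBasel,23.1\n"


def load_rows(csv_text, conn):
    """Parse CSV text and insert its rows into a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS readings (city TEXT, temp_c REAL)")
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["city"], float(r["temp_c"])) for r in reader]
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def warmest(conn):
    """Return the (city, temp_c) pair with the highest temperature."""
    return conn.execute(
        "SELECT city, temp_c FROM readings ORDER BY temp_c DESC LIMIT 1"
    ).fetchone()


# An in-memory database keeps the sketch self-contained (assumption:
# a real job would point at a file on disk).
conn = sqlite3.connect(":memory:")
inserted = load_rows(SAMPLE_CSV, conn)
print(inserted, warmest(conn))
```

Swapping `":memory:"` for a file path would persist the data between runs, which is what a scheduled job would normally want.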

If you’d like a more detailed breakdown of the whys and hows, let me know. 📩 #dataengineering #tools #productivity

Matt von Rohr

