The Data Engineering Cookbook
Updated Jul 21, 2025 - Python
One framework to develop, deploy and operate data workflows with Python and SQL.
A Data Engineering project. Repository for backend infrastructure and Streamlit app files for a Premier League Dashboard.
😎 A curated list of awesome DataOps tools
DataCamp Data Engineer with Python course. 73 hours / 19 courses / 2 skill assessments
Data Engineering Project with Hadoop HDFS and Kafka
Crawls sites to find new content and scrape it
Code, Examples, Templates and Scripts for DataWorksSummit 2017 Sydney Talk
A data engineering platform for maintaining a data ecosystem to support self-driving cars research.
Wraps the DB by opening a REST API for storing and retrieving documents info & recommendations
A comprehensive data pipeline leveraging Airflow, DBT, Google Cloud Platform (GCP), and Docker to extract, transform, and load data seamlessly from a staging layer to a data warehouse and data mart.
End-to-end data engineering processes for the NIGERIA Health Facility Registry (HFR). The project leveraged Selenium, Pandas, PySpark, PostgreSQL and Airflow
Repository dedicated to the A3 Data Challenge Woman data engineering hackathon challenge
Docker powered starter for geospatial analysis of lightning atmospheric data.
The Real-time Ecommerce Data Collection and Processing project empowers businesses with real-time insights by efficiently extracting, processing, and storing ecommerce data from multiple sources. Combining Golang and Python, this cutting-edge solution streamlines data handling from diverse ecommerce websites.
This project designs and implements an ETL pipeline using Apache Airflow (Docker Compose) to ingest, process, and store retail data. AWS S3 acts as the data lake, AWS Redshift as the data warehouse, and Looker Studio for visualization. [Data Engineer]
IGTI MBA in Data Engineering - Cloud Data Engineer Bootcamp - Final challenge
UserInsight-Streaming-Data-Pipeline is a real-time pipeline that ingests API data into Kafka, processes it with Spark, stores it in S3, and uses AWS Lambda to load it into Redshift. The data is then used to create a dashboard in Looker. [Data Engineer]
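Most of the pipelines listed above share the same extract-transform-load shape, whatever the stack (Airflow, Kafka, Spark, Redshift). A minimal, dependency-free Python sketch of that pattern follows; every function and field name here is illustrative, not taken from any listed repository, and a real pipeline would replace each stage with its own operators, streams, and sinks:

```python
# Minimal batch ETL sketch: extract -> transform -> load.
# All names are illustrative; production pipelines would swap these
# stages for Airflow tasks, Kafka/Spark jobs, and S3/Redshift targets.

def extract():
    """Stand-in for pulling raw records from a source API or staging layer."""
    return [
        {"order_id": 1, "amount": "19.99", "country": "ng"},
        {"order_id": 2, "amount": "5.00", "country": "gb"},
    ]

def transform(records):
    """Normalize types and values (the 'T' a DBT model or Spark job performs)."""
    return [
        {
            "order_id": r["order_id"],
            "amount": float(r["amount"]),       # string -> numeric
            "country": r["country"].upper(),    # canonical country code
        }
        for r in records
    ]

def load(records, warehouse):
    """Append cleaned rows to a warehouse table (here, just a dict of lists)."""
    warehouse.setdefault("orders", []).extend(records)
    return len(records)

warehouse = {}
loaded = load(transform(extract()), warehouse)
print(loaded)  # → 2
```

The three-function split mirrors why orchestrators like Airflow model pipelines as DAGs of small tasks: each stage can be retried, tested, and scheduled independently.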