The Data Engineering Cookbook
Updated Jul 21, 2025 - Python
One framework to develop, deploy and operate data workflows with Python and SQL.
A Data Engineering project. Repository for backend infrastructure and Streamlit app files for a Premier League Dashboard.
😎 A curated list of awesome DataOps tools
DataCamp Data Engineer with Python course. 73 hours / 19 courses / 2 skill assessments
Data Engineering Project with Hadoop HDFS and Kafka
Crawls sites to find new content and scrape it
Code, Examples, Templates and Scripts for DataWorksSummit 2017 Sydney Talk
A data engineering platform for maintaining a data ecosystem to support self-driving cars research.
Wraps the DB by opening a REST API for storing and retrieving documents info & recommendations
A comprehensive data pipeline leveraging Airflow, DBT, Google Cloud Platform (GCP), and Docker to extract, transform, and load data seamlessly from a staging layer to a data warehouse and data mart.
End-to-end data engineering processes for the NIGERIA Health Facility Registry (HFR). The project leveraged Selenium, Pandas, PySpark, PostgreSQL and Airflow
Repository dedicated to the A3 Data Challenge Woman data engineering hackathon challenge
Docker powered starter for geospatial analysis of lightning atmospheric data.
The Real-time Ecommerce Data Collection and Processing project empowers businesses with real-time insights by efficiently extracting, processing, and storing ecommerce data from multiple sources. Combining Golang and Python, this cutting-edge solution streamlines data handling from diverse ecommerce websites.
This project designs and implements an ETL pipeline using Apache Airflow (Docker Compose) to ingest, process, and store retail data. AWS S3 acts as the data lake, AWS Redshift as the data warehouse, and Looker Studio for visualization. [Data Engineer]
IGTI MBA in Data Engineering - Cloud Data Engineer Bootcamp - Final challenge
UserInsight-Streaming-Data-Pipeline is a real-time pipeline that ingests API data into Kafka, processes it with Spark, stores it in S3, and uses AWS Lambda to load it into Redshift. The data is then used to create a dashboard in Looker. [Data Engineer]
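Most of the pipelines listed above share the same extract-transform-load shape, whatever the stack (Airflow, Kafka, Spark, Redshift). A minimal, dependency-free Python sketch of that pattern follows; every function and field name here is illustrative, not taken from any listed repository, and a real pipeline would replace each stage with its own operators, streams, and sinks:

```python
# Minimal batch ETL sketch: extract -> transform -> load.
# All names are illustrative; production pipelines would swap these
# stages for Airflow tasks, Kafka/Spark jobs, and S3/Redshift targets.

def extract():
    """Stand-in for pulling raw records from a source API or staging layer."""
    return [
        {"order_id": 1, "amount": "19.99", "country": "ng"},
        {"order_id": 2, "amount": "5.00", "country": "gb"},
    ]

def transform(records):
    """Normalize types and values (the 'T' a DBT model or Spark job performs)."""
    return [
        {
            "order_id": r["order_id"],
            "amount": float(r["amount"]),       # string -> numeric
            "country": r["country"].upper(),    # canonical country code
        }
        for r in records
    ]

def load(records, warehouse):
    """Append cleaned rows to a warehouse table (here, just a dict of lists)."""
    warehouse.setdefault("orders", []).extend(records)
    return len(records)

warehouse = {}
loaded = load(transform(extract()), warehouse)
print(loaded)  # → 2
```

The three-function split mirrors why orchestrators like Airflow model pipelines as DAGs of small tasks: each stage can be retried, tested, and scheduled independently.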