Data Engineering Zoomcamp

Syllabus

  • Course overview
  • Introduction to GCP
  • Docker and docker-compose
  • Running Postgres locally with Docker
  • Setting up infrastructure on GCP with Terraform
  • Preparing the environment for the course
  • Homework

More details

  • Data Lake
  • Workflow orchestration
  • Workflow orchestration with Mage
  • Homework

More details

  • Reading from APIs
  • Building scalable pipelines
  • Normalising data
  • Incremental loading
  • Homework

More details

  • Data Warehouse
  • BigQuery
  • Partitioning and clustering
  • BigQuery best practices
  • Internals of BigQuery
  • BigQuery Machine Learning

More details

  • Basics of analytics engineering
  • dbt (data build tool)
  • BigQuery and dbt
  • Postgres and dbt
  • dbt models
  • Testing and documenting
  • Deployment to the cloud and locally
  • Visualizing the data with Google Data Studio and Metabase

More details

  • Batch processing
  • What is Spark
  • Spark Dataframes
  • Spark SQL
  • Internals: GroupBy and joins

More details

  • Introduction to Kafka
  • Schemas (Avro)
  • Kafka Streams
  • Kafka Connect and KSQL

More details

Putting everything we learned into practice

  • Week 1 and 2: working on your project
  • Week 3: reviewing your peers

More details

Course UI

Alternatively, you can access this course through the provided UI app, which offers a user-friendly interface for navigating the course material.

dezoomcamp-ui