├── LICENSE
├── README.md
├── docs
│   └── Understanding Data Engineering.pptx
└── img
    ├── cloud-computing-1.png
    ├── cloud-computing.png
    ├── cloud-data-engineering.png
    └── github-notes.png
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2024 Qasim Hassan
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # CLOUD DATA ENGINEERING
2 |
3 | 
13 | ---
14 |
15 | Welcome to the **Cloud Data Engineering** course! This comprehensive 6–8 month journey is designed to equip you with the necessary skills to become a proficient Data Engineer, focusing on cloud-based technologies, data acquisition, modeling, warehousing, and orchestration.
16 |
17 | Our curriculum is divided into **5 modules** that include hands-on projects, assignments, and real-world case studies to ensure a practical understanding of the technologies covered.
18 |
19 | ---
20 |
21 | This repository includes the **Roadmap for Data Engineering**. Since Data Engineering is a broad field, the course focuses on the following tools: Python, SQL, Snowflake, Apache Airflow, Apache Kafka, and AWS.
22 |
35 | ---
36 |
37 | ## 📑 Table of Contents
38 |
39 | 1. [Course Overview](#course-overview)
40 | 2. [Understanding Data Engineering](#understanding-data-engineering)
41 | 3. [Module 1: Data Acquisition](#module-1-data-acquisition)
42 | 4. [Module 2: Data Modeling](#module-2-data-modeling)
43 | 5. [Module 3: Cloud Data Warehousing](#module-3-cloud-data-warehousing)
44 | 6. [Module 4: Data Orchestration & Streaming](#module-4-data-orchestration--streaming)
45 | 7. [Module 5: Architecting AWS Data Engineering Projects](#module-5-architecting-aws-data-engineering-projects)
46 | 8. [Why These Technologies?](#why-these-technologies)
47 | 9. [Final Notes](#final-notes)
48 |
49 | ---
50 |
51 | ## 🚀 Course Overview
52 |
53 | This course covers **Cloud Data Engineering** end to end.
54 | You'll learn everything from the basics of data acquisition and transformation to advanced cloud-based data warehousing, orchestration, and streaming techniques.
55 |
56 | The course builds your skills progressively so that, by the end, you are ready to tackle complex data engineering challenges on the job.
57 |
58 |
59 | ## 📖 Understanding Data Engineering
60 |
61 | Before diving deep, one should know:
62 | - What is Data Engineering?
63 | - What is the scope of Data Engineering in **2025 and beyond**?
64 | - What tools are required for a modern Data Engineer?
65 |
66 | 📂 [Understanding Data Engineering (PPT)](./docs/Understanding%20Data%20Engineering.pptx)
67 |
68 |
69 | ## 📦 Module 1: Data Acquisition   
70 |
71 | ### Overview
72 | The focus of this module is on acquiring, manipulating, and processing data from various sources.
73 | You’ll set up your data engineering environment, explore Python, manage projects with Git, and gain hands-on experience with **web scraping** using BeautifulSoup and Selenium.
74 |
75 | ➡️ Includes **projects** like:
76 |
77 | - ETL with Python
78 | - Netflix Data Analysis
79 | - GitHub History (Scala)
80 | - Security Log Analysis
81 |
82 | ## 🗄️ Module 2: Data Modeling 
83 |
84 | Dive into **database design, SQL querying, optimization, and ETL pipelines**.
85 |
86 | 📌 Covers:
87 | - SQL Server setup
88 | - Joins, aggregations, window functions
89 | - Stored procedures, triggers, optimization
90 |
91 | ➡️ Includes **projects** like:
92 |
93 | - ETL pipeline with Python + Pandas + SQL
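
To make the ETL project above concrete, here is a minimal sketch of the extract, transform, and load steps. It is not the course's own solution: it assumes the standard-library `csv` and `sqlite3` modules in place of Pandas and SQL Server so that it runs without extra installs, and the sample data, the `titles` table, and all function names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a real source file.
RAW_CSV = """title,type,release_year
Show A,TV Show,2019
Movie B,Movie,2020
Movie C,Movie,
Show D,TV Show,2020
"""


def extract(text):
    """Extract: parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no release year and cast the year to int."""
    cleaned = []
    for row in rows:
        year = row["release_year"].strip()
        if not year:
            continue  # skip incomplete records
        cleaned.append((row["title"].strip(), row["type"].strip(), int(year)))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned tuples into a SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS titles "
        "(title TEXT, type TEXT, release_year INTEGER)"
    )
    conn.executemany("INSERT INTO titles VALUES (?, ?, ?)", rows)
    conn.commit()


def titles_per_year(conn):
    """Query the loaded data with a simple SQL aggregation."""
    cur = conn.execute(
        "SELECT release_year, COUNT(*) FROM titles "
        "GROUP BY release_year ORDER BY release_year"
    )
    return cur.fetchall()


# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
print(titles_per_year(conn))  # [(2019, 1), (2020, 2)]
```

Keeping the three steps as separate functions means each one can be tested or swapped on its own, for example replacing `extract` with a Pandas reader or pointing `load` at a different database.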
94 |
95 |
96 | ## ❄️ Module 3: Cloud Data Warehousing 
97 |
98 | Master **Snowflake Cloud Data Warehousing** through hands-on badges, a Udemy masterclass, and real-time projects.
99 |
100 | 📌 Includes **official Snowflake badges**:
101 | - Data Warehousing Workshop
102 | - Collaboration & Marketplace
103 | - Data Application Builders
104 | - Data Lake Workshop
105 | - Data Engineering Workshop
106 |
107 | ➡️ Includes **projects** like:
108 |
109 | - Snowflake Real Time Data Warehouse For Beginners
110 | - Batch pipeline using AWS S3, Lambda, EventBridge, and Snowflake for currency exchange rates
111 | - Real-time Snowflake Data Warehouse, Change Data Capture with AWS
112 |
113 |
114 | ## ⏳ Module 4: Data Orchestration & Streaming  
115 |
116 | - **Apache Airflow** for orchestration of ETL pipelines
117 | - **Apache Kafka** for real-time data streaming and decoupling producers/consumers
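
The decoupling idea behind Kafka can be illustrated without a broker. The sketch below is an analogy only, not Kafka code: it assumes an in-process `queue.Queue` standing in for a topic, and the function names and sample events are hypothetical. The producer and consumer never call each other; they only share the queue.

```python
import queue
import threading


def producer(events, topic):
    """Publish events to the shared queue, then signal completion."""
    for event in events:
        topic.put(event)
    topic.put(None)  # sentinel: no more events


def consumer(topic, sink):
    """Read events until the sentinel arrives and process each one."""
    while True:
        event = topic.get()
        if event is None:
            break
        sink.append(event.upper())  # stand-in for real processing


def run_pipeline(events):
    """Run one producer and one consumer concurrently over a bounded queue."""
    topic = queue.Queue(maxsize=2)  # small buffer: producer blocks when full
    sink = []
    t_cons = threading.Thread(target=consumer, args=(topic, sink))
    t_prod = threading.Thread(target=producer, args=(events, topic))
    t_cons.start()
    t_prod.start()
    t_prod.join()
    t_cons.join()
    return sink


print(run_pipeline(["tick-1", "tick-2", "tick-3"]))  # ['TICK-1', 'TICK-2', 'TICK-3']
```

A real Kafka topic additionally persists events and lets many independent consumer groups read the same stream, which this in-memory queue does not attempt to model.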
118 |
119 | ➡️ Includes **projects** like:
120 |
121 | - Twitter Data Pipeline, Stock Market Analysis, Airflow on AWS EC2
122 |
123 |
124 | ## ☁️ Module 5: Architecting AWS Data Engineering Projects 
125 |
126 | Dive deep into the AWS ecosystem for data engineering:
127 |
128 | 📌 Covers:
129 | - S3, Redshift, Glue, Athena, Lambda, Kinesis, RDS, EMR
130 |
131 | ➡️ Projects:
132 | - Batch Data Pipeline (S3 + Lambda + CloudWatch)
133 | - ETL pipeline with Glue & Athena
134 | - Real-time streaming with Kinesis
135 | - End-to-End AWS Data Engineering
136 |
137 | ---
138 |
139 | ## ❓ Why These Technologies?
140 |
141 | The chosen technologies (Python, SQL, Snowflake, Airflow, Kafka, AWS) are **widely used and in demand** across the industry, so the skills you build here help make you **job-ready** by the end of this course.
142 |
143 | Each module builds on the previous one, reinforcing both **theory and practical projects**.
144 |
145 | ## 📝 Final Notes
146 |
147 | Throughout this course, you will engage in **hands-on projects, assignments, and case studies** that simulate real-world data engineering challenges.
148 |
149 | ⚡ Get ready to embark on this exciting journey of becoming a **proficient Cloud Data Engineer!** 🚀
150 |
151 | ---
152 |
--------------------------------------------------------------------------------
/docs/Understanding Data Engineering.pptx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aiwithqasim/cloud-data-engineering/792f81476994e16fec9e0b29f3c0dbbe18b924ec/docs/Understanding Data Engineering.pptx
--------------------------------------------------------------------------------
/img/cloud-computing-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aiwithqasim/cloud-data-engineering/792f81476994e16fec9e0b29f3c0dbbe18b924ec/img/cloud-computing-1.png
--------------------------------------------------------------------------------
/img/cloud-computing.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aiwithqasim/cloud-data-engineering/792f81476994e16fec9e0b29f3c0dbbe18b924ec/img/cloud-computing.png
--------------------------------------------------------------------------------
/img/cloud-data-engineering.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aiwithqasim/cloud-data-engineering/792f81476994e16fec9e0b29f3c0dbbe18b924ec/img/cloud-data-engineering.png
--------------------------------------------------------------------------------
/img/github-notes.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aiwithqasim/cloud-data-engineering/792f81476994e16fec9e0b29f3c0dbbe18b924ec/img/github-notes.png
--------------------------------------------------------------------------------