├── .vscode └── settings.json ├── README.md ├── image └── sql_crash_course.jpg └── sql_scripts ├── Problems ├── Problem_01_Solutions.sql └── Problem_02_Solutions.sql └── Theory ├── 04_SQL_Basic_Commands.sql ├── 05_Sorting_and_Limiting.sql ├── 06_Aggregate_Functions.sql ├── 07_Join.sql ├── 08_Union.sql ├── 09_CASE.sql ├── 10_Functions.sql ├── 11_Subqueries.sql ├── 12_CTEs.sql ├── 13_Recursion.sql ├── 14_Views.sql ├── 15_CRUD_Operations.sql ├── 16_Database_modifications.sql ├── 17_Indexing_and_Optimization.sql ├── 18_Modular_Code.sql ├── 19_execution_order.sql └── 20_query_optimization.sql /.vscode/settings.json: -------------------------------------------------------------------------------- 1 | { 2 | "taipyStudio.gUI.elementsFilePaths": [] 3 | } 4 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # SQL Crash Course – From Zero to Hero! 🚀 2 | 3 | ![SQL Crash Course](image/sql_crash_course.jpg) 4 | 5 | Hi! You must be here because of the newest SQL Crash Course from [Cornellius from Non-Brand Data](https://www.nb-data.com/) and [Josep from databites.tech](https://www.databites.tech/). 6 | 7 | Whether you're just starting out or looking to sharpen your SQL skills, this series will take you from zero to hero with a structured, easy-to-follow approach. Each post breaks down essential SQL concepts and helps you build confidence in SQL, following the same style as my previous courses. 8 | 9 | ## What’s in the Course? 10 | 11 | We will structure the series into seven key topics to guide you step by step: 12 | 13 | 1. **Introduction** – What SQL is and why it matters 14 | 2. **SQL Fundamentals** – Basic commands, filtering, and aggregation 15 | 3. **Intermediate SQL** – Joins, unions, and functions 16 | 4. **Advanced SQL** – Subqueries, CTEs, recursion, and views 17 | 5. **Database Operations** – CRUD, schema changes, and optimization 18 | 6. 
**Crafting Good SQL Queries** – Best practices for writing efficient queries 19 | 7. **Real-world Problems** – Applying SQL to practical challenges 20 | 21 | The repository will be updated over time with all the articles and code used in the series, so stay tuned! 22 | 23 | ## Table of Contents 24 | | Article Title | Article Link | Code Script | SQL Playground | 25 | |---------------|--------------|-------------|----------------| 26 | | Launching the SQL Crash Course – From Zero to Hero! 🚀 | [Read here](https://www.databites.tech/p/launching-the-sql-crash-course-from) | | | 27 | | #1 What is SQL? | [Read here](https://www.nb-data.com/p/2-what-is-sql) | | | 28 | | #2 Why learn SQL? | [Read here](https://www.databites.tech/p/2-why-learn-sql) | | | 29 | | #3 Relational Data & Models | [Read here](https://www.databites.tech/p/3-relational-data-and-models) | | | 30 | | #4 SQL Basic Commands | [Read here](https://www.nb-data.com/p/4-sql-basic-commands) | [View script](sql_scripts/Theory/04_SQL_Basic_Commands.sql) | [Run playground](https://www.db-fiddle.com/f/tLA6Ca3iAcABo7Bkgm87nE/1) | 31 | |#5 Sorting & Limiting | [Read here](https://www.databites.tech/p/5-sorting-and-limiting) | [View script](sql_scripts/Theory/05_Sorting_and_Limiting.sql)| [Run playground](https://www.db-fiddle.com/f/gsJfafADqkwjrHBLernRZP/0) | 32 | |#6 Aggregate Functions | [Read here](https://www.nb-data.com/p/6-aggregate-functions) | [View script](sql_scripts/Theory/06_Aggregate_Functions.sql) | [Run playground](https://www.db-fiddle.com/f/w3trdsFQ23og1tYerokVMm/0) | 33 | |#7 JOINs (Left, Right, Inner & Full) | [Read here](https://www.databites.tech/p/7-joins-left-right-inner-and-full) | [View script](sql_scripts/Theory/07_Join.sql) | [Run playground](https://www.db-fiddle.com/f/8rkgKHYSFEmmhjdR9P4iii/3) | 34 | |#8 UNION & UNION ALL | [Read here](https://www.nb-data.com/p/8-union-and-union-all) | [View script](sql_scripts/Theory/08_Union.sql) | [Run 
playground](https://www.db-fiddle.com/f/pTrpDgGYGejUmAMXTtaQNm/2) | 35 | |#9 Case Expressions | [Read here](https://www.databites.tech/p/9-case-expressions) | [View script](sql_scripts/Theory/09_CASE.sql) | [Run playground](https://www.db-fiddle.com/f/sTGiHfN435PP2xCSWGd5q7/0) | 36 | |#10 Functions | [Read here](https://www.nb-data.com/p/10-functions-string-date-numeric) | [View script](sql_scripts/Theory/10_Functions.sql) | [Run playground](https://www.db-fiddle.com/f/oEaniab9PUi1eCJUqk7xjL/0) | 37 | |#11 Subqueries | [Read here](https://www.nb-data.com/p/11-subqueries) | [View script](sql_scripts/Theory/11_Subqueries.sql) | [Run playground](https://www.db-fiddle.com/f/orVEkoyFQhhppyGtSrZzYA/0) | 38 | |#12 Common Table Expressions (CTEs) | [Read here](https://www.databites.tech/p/12-common-table-expressions-ctes) | [View script](sql_scripts/Theory/12_CTEs.sql) | [Run playground](https://www.db-fiddle.com/f/tvtNVuMuynXcBM3ADX5ymp/0) | 39 | |#13 Recursion | [Read here](https://www.nb-data.com/p/13-recursion) | [View script](sql_scripts/Theory/13_Recursion.sql) | [Run playground](https://www.db-fiddle.com/f/wkeRysp7ZDzzrA9yPLYuJm/0) | 40 | |#14 Views | [Read here](https://www.databites.tech/p/14-views) | [View script](sql_scripts/Theory/14_Views.sql) | [Run playground](https://www.db-fiddle.com/f/vht5JRmZEFXFguackAoMJW/0) | 41 | |#15 CRUD Operations | [Read here](https://www.nb-data.com/p/15-crud-operations) | [View script](sql_scripts/Theory/15_CRUD_Operations.sql) | [Run playground](https://www.db-fiddle.com/f/wKgWsY7o6ua4u7EedRrfco/1) | 42 | |#16 Database Modifications | [Read here](https://www.databites.tech/p/16-database-modifications) | [View script](sql_scripts/Theory/16_Database_modifications.sql) | [Run playground](https://www.db-fiddle.com/f/fLFqwktLC3vQadZKnqZf8o/0) | 43 | |#17 Indexing and Optimization | [Read here](https://www.nb-data.com/p/17-indexing-and-optimization) | [View script](sql_scripts/Theory/17_Indexing_and_Optimization.sql) | [Run 
playground](https://www.db-fiddle.com/f/pvMeVUQx5MNp2333YyJjPu/0) | 44 | |#18 Modular Code | [Read here](https://www.databites.tech/p/18-generating-modular-code) | [View script](sql_scripts/Theory/18_Modular_Code.sql) | [Run playground](https://www.db-fiddle.com/f/ezePzh6hZtarATN33nuHRV/1) | 45 | |#19 SQL Execution Order | [Read here](https://www.databites.tech/p/19-sql-execution-order) | [View script](sql_scripts/Theory/19_execution_order.sql) | [Run playground](https://www.db-fiddle.com/f/vgfg4bJUzYuxPNvNQHThqB/1) | 46 | |#20 Query Optimization | [Read here](https://www.nb-data.com/p/20-query-optimization)|[View script](sql_scripts/Theory/20_query_optimization.sql) | [Run playground](https://www.db-fiddle.com/f/qXW9oG2DpFE8aFkzV8zUMV/1) | -------------------------------------------------------------------------------- /image/sql_crash_course.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CornelliusYW/SQL-Crash-Course/062c82bcee9779bcd2876d9d684a1d6bcc67f6d0/image/sql_crash_course.jpg -------------------------------------------------------------------------------- /sql_scripts/Problems/Problem_01_Solutions.sql: -------------------------------------------------------------------------------- 1 | -- ****************TASKS*********************************************** 2 | -- Task 1: List all newsletter issues with their titles and publish dates. 3 | SELECT title, published_date 4 | FROM issues; 5 | 6 | -- Task 2: Which 2 issues got the most views? 7 | SELECT title, views 8 | FROM issues 9 | ORDER BY views DESC 10 | LIMIT 2; 11 | 12 | -- Task 3: What is the total number of likes across all issues? 13 | SELECT SUM(likes) AS total_likes 14 | FROM issues; 15 | 16 | -- Task 4: Show each subscriber’s name and which newsletter(s) they’re subscribed to. 
17 | SELECT 18 | m.name AS member_name, 19 | n.name AS newsletter_name 20 | FROM members m 21 | JOIN subscriptions s ON m.member_id = s.member_id 22 | JOIN newsletters n ON s.newsletter_id = n.newsletter_id; 23 | 24 | -- Task 5: Return a list of all distinct countries of your members and all newsletter names. 25 | SELECT country AS value FROM members 26 | UNION 27 | SELECT name FROM newsletters; 28 | 29 | -- Task 6: Classify each member as 'Supporter' or 'Follower'. 30 | SELECT name, 31 | CASE 32 | WHEN plan = 'paid' THEN 'Supporter' 33 | ELSE 'Follower' 34 | END AS status 35 | FROM members; 36 | 37 | -- Task 7: Return each member’s name in UPPERCASE and the length of their email. 38 | SELECT UPPER(name) AS uppercase_name, 39 | LENGTH(email) AS email_length 40 | FROM members; 41 | 42 | -- Task 8: How many members are subscribed to each newsletter? 43 | SELECT n.name AS newsletter_name, 44 | COUNT(DISTINCT s.member_id) AS total_subscribers 45 | FROM newsletters n 46 | JOIN subscriptions s ON n.newsletter_id = s.newsletter_id 47 | GROUP BY n.name; 48 | 49 | -- Task 9: What is the average monthly revenue per newsletter from subscriptions? 50 | SELECT n.name AS newsletter_name, 51 | ROUND(AVG(s.monthly_cost), 2) AS avg_monthly_revenue 52 | FROM newsletters n 53 | JOIN subscriptions s ON n.newsletter_id = s.newsletter_id 54 | GROUP BY n.name; 55 | 56 | -- Task 10: What’s the total monthly revenue per country? 57 | SELECT m.country, 58 | ROUND(SUM(s.monthly_cost), 2) AS total_monthly_revenue 59 | FROM members m 60 | JOIN subscriptions s ON m.member_id = s.member_id 61 | GROUP BY m.country; 62 | 63 | -- Task 11: Categorize engagement levels for each issue. 
64 | SELECT title, likes, 65 | CASE 66 | WHEN likes > 150 THEN 'Hot' 67 | WHEN likes BETWEEN 100 AND 150 THEN 'Warm' 68 | ELSE 'Cold' 69 | END AS engagement_level 70 | FROM issues; -------------------------------------------------------------------------------- /sql_scripts/Problems/Problem_02_Solutions.sql: -------------------------------------------------------------------------------- 1 | -- ****************TASKS*********************************************** 2 | -- 1. List newsletters with above-average monthly revenue 3 | SELECT n.newsletter_id, n.name, SUM(s.monthly_cost) AS revenue 4 | FROM newsletters n 5 | JOIN subscriptions s ON n.newsletter_id = s.newsletter_id 6 | GROUP BY n.newsletter_id, n.name 7 | HAVING SUM(s.monthly_cost) > ( 8 | SELECT AVG(monthly_cost_sum) 9 | FROM ( 10 | SELECT SUM(monthly_cost) AS monthly_cost_sum 11 | FROM subscriptions 12 | GROUP BY newsletter_id 13 | ) sub 14 | ); 15 | 16 | -- 2. Find issues with more likes than their newsletter's average 17 | SELECT 18 | issue_id, 19 | title, 20 | likes, 21 | (SELECT AVG(likes) 22 | FROM issues i2 23 | WHERE i2.newsletter_id = i1.newsletter_id) AS avg_newsletter_likes 24 | FROM issues i1 25 | WHERE likes > ( 26 | SELECT AVG(likes) 27 | FROM issues i3 28 | WHERE i3.newsletter_id = i1.newsletter_id 29 | ); 30 | 31 | -- 3. Rank members by subscription spending per country 32 | WITH MemberSpending AS ( 33 | SELECT 34 | m.member_id, 35 | m.name, 36 | m.country, 37 | SUM(s.monthly_cost) AS total_spending 38 | FROM members m 39 | JOIN subscriptions s ON m.member_id = s.member_id 40 | GROUP BY m.member_id, m.name, m.country 41 | ) 42 | SELECT 43 | *, 44 | RANK() OVER (PARTITION BY country ORDER BY total_spending DESC) AS spending_rank 45 | FROM MemberSpending; 46 | 47 | -- 4. 
Calculate cumulative views per newsletter over time 48 | WITH NewsletterViews AS ( 49 | SELECT 50 | newsletter_id, 51 | published_date, 52 | views, 53 | SUM(views) OVER ( 54 | PARTITION BY newsletter_id 55 | ORDER BY published_date 56 | ) AS cumulative_views 57 | FROM issues 58 | ) 59 | SELECT * FROM NewsletterViews; 60 | 61 | -- 5. Generate a date series for 2025 and count new subscriptions 62 | WITH RECURSIVE DateSeries AS ( 63 | SELECT CAST('2025-01-01' AS DATE) AS series_date 64 | UNION ALL 65 | SELECT series_date + INTERVAL 1 DAY 66 | FROM DateSeries 67 | WHERE series_date < '2025-12-31' 68 | ) 69 | SELECT 70 | ds.series_date, 71 | COUNT(s.joined_date) AS new_subscriptions 72 | FROM DateSeries AS ds 73 | LEFT JOIN subscriptions AS s 74 | ON ds.series_date = s.joined_date 75 | GROUP BY ds.series_date 76 | ORDER BY ds.series_date; 77 | 78 | 79 | -- 6. Track subscription growth milestones per newsletter 80 | WITH RECURSIVE GrowthMilestones AS ( 81 | -- Anchor: first subscription date per newsletter 82 | SELECT 83 | newsletter_id, 84 | MIN(joined_date) AS milestone_date, 85 | 1 AS milestone_month 86 | FROM subscriptions 87 | GROUP BY newsletter_id 88 | 89 | UNION ALL 90 | 91 | -- Recursive: add one month until month 3 92 | SELECT 93 | gm.newsletter_id, 94 | gm.milestone_date + INTERVAL 1 MONTH AS milestone_date, 95 | gm.milestone_month + 1 96 | FROM GrowthMilestones AS gm 97 | WHERE gm.milestone_month < 3 98 | ) 99 | SELECT 100 | gm.newsletter_id, 101 | n.name, 102 | gm.milestone_date, 103 | gm.milestone_month, 104 | COUNT(s.subscription_id) AS cumulative_subscriptions 105 | FROM GrowthMilestones AS gm 106 | JOIN newsletters AS n 107 | ON gm.newsletter_id = n.newsletter_id 108 | LEFT JOIN subscriptions AS s 109 | ON s.newsletter_id = gm.newsletter_id 110 | AND s.joined_date <= gm.milestone_date 111 | GROUP BY 112 | gm.newsletter_id, 113 | n.name, 114 | gm.milestone_date, 115 | gm.milestone_month 116 | ORDER BY 117 | gm.newsletter_id, 118 | gm.milestone_month; 119 
| 120 | 121 | -- 7. Create a view for high-engagement issues 122 | CREATE VIEW hot_issues AS 123 | SELECT issue_id, title, newsletter_id, likes 124 | FROM issues 125 | WHERE likes > 150; 126 | 127 | -- Query the view: 128 | SELECT n.name, COUNT(*) AS hot_issue_count 129 | FROM hot_issues h 130 | JOIN newsletters n ON h.newsletter_id = n.newsletter_id 131 | GROUP BY n.name; 132 | 133 | -- 8. Modular revenue analysis by country 134 | WITH CountryRevenue AS ( 135 | SELECT 136 | m.country, 137 | SUM(s.monthly_cost) AS revenue 138 | FROM members m 139 | JOIN subscriptions s ON m.member_id = s.member_id 140 | GROUP BY m.country 141 | ), 142 | GlobalRevenue AS ( 143 | SELECT SUM(revenue) AS total_revenue 144 | FROM CountryRevenue 145 | ) 146 | SELECT 147 | cr.country, 148 | cr.revenue, 149 | ROUND((cr.revenue / gr.total_revenue) * 100, 2) AS revenue_pct 150 | FROM CountryRevenue cr, GlobalRevenue gr; 151 | 152 | -- 9. Debug query with incorrect aggregation 153 | -- Fixed query: 154 | -- Corrected query: 155 | SELECT country, plan, COUNT(*) AS member_count 156 | FROM members 157 | GROUP BY country, plan 158 | HAVING COUNT(*) > 5; -- HAVING applies AFTER aggregation 159 | 160 | -- 10. 
Optimize a slow-running issue analysis 161 | -- Optimized version: 162 | -- Optimized (CTE + JOIN): 163 | WITH NewsletterAvg AS ( 164 | SELECT 165 | newsletter_id, 166 | AVG(views) AS avg_views 167 | FROM issues 168 | GROUP BY newsletter_id 169 | ) 170 | SELECT 171 | i.issue_id, 172 | i.title, 173 | i.views, 174 | na.avg_views 175 | FROM issues i 176 | JOIN NewsletterAvg na ON i.newsletter_id = na.newsletter_id 177 | WHERE i.views > na.avg_views; 178 | -------------------------------------------------------------------------------- /sql_scripts/Theory/04_SQL_Basic_Commands.sql: -------------------------------------------------------------------------------- 1 | -- Creating table NEWSLETTERS 2 | CREATE TABLE newsletters ( 3 | id VARCHAR(10) PRIMARY KEY, 4 | name VARCHAR(100) 5 | ); 6 | 7 | -- Creating table POSTS 8 | CREATE TABLE posts ( 9 | id VARCHAR(10) PRIMARY KEY, 10 | newsletter_id VARCHAR(10), 11 | name VARCHAR(255), 12 | published_at DATE, 13 | FOREIGN KEY (newsletter_id) REFERENCES newsletters(id) 14 | ); 15 | 16 | -- Insert sample data into NEWSLETTERS 17 | INSERT INTO newsletters (id, name) VALUES 18 | ('1112A', 'DataBites'), 19 | ('1111B', 'Non-Brand Data'); 20 | 21 | -- Insert sample data into POSTS 22 | INSERT INTO posts (id, newsletter_id, name, published_at) VALUES 23 | ('1112A001', '1112A', 'SQL basics', '2024-01-10'), 24 | ('1112A002', '1112A', 'Understanding Time Series', '2024-02-15'), 25 | ('1111B001', '1111B', 'RAG model basics', '2024-03-05'), 26 | ('1111B002', '1111B', 'Crafting modular SQL queries', '2024-04-20'); 27 | 28 | -- Simple SELECT example 29 | SELECT name FROM newsletters; 30 | 31 | -- Single Filter with Comparison Operator 32 | SELECT name, published_at 33 | FROM posts 34 | WHERE published_at < '2024-03-01'; 35 | 36 | -- Multiple Filter Condition 37 | SELECT name 38 | FROM posts 39 | WHERE newsletter_id = '1112A' AND published_at >= '2024-01-01'; 40 | 41 | -- Text Filter Condition 42 | SELECT name 43 | FROM posts 44 | WHERE name 
LIKE '%SQL%'; 45 | 46 | -- Date Filter Condition 47 | SELECT id, name 48 | FROM posts 49 | WHERE published_at >= '2024-02-01'; 50 | 51 | -- Grouped Condition 52 | SELECT name 53 | FROM posts 54 | WHERE (newsletter_id = '1112A' OR newsletter_id = '1111B') 55 | AND published_at <= '2024-03-31'; 56 | -------------------------------------------------------------------------------- /sql_scripts/Theory/05_Sorting_and_Limiting.sql: -------------------------------------------------------------------------------- 1 | -- Creating table NEWSLETTERS 2 | CREATE TABLE newsletters ( 3 | id VARCHAR(10) PRIMARY KEY, 4 | name VARCHAR(100) 5 | ); 6 | 7 | -- Creating table POSTS 8 | CREATE TABLE posts ( 9 | id VARCHAR(10) PRIMARY KEY, 10 | newsletter_id VARCHAR(10), 11 | name VARCHAR(255), 12 | published_at DATE, 13 | FOREIGN KEY (newsletter_id) REFERENCES newsletters(id) 14 | ); 15 | 16 | -- Insert sample data into NEWSLETTERS 17 | INSERT INTO newsletters (id, name) VALUES 18 | ('1112A', 'DataBites'), 19 | ('1111B', 'Non-Brand Data'); 20 | 21 | -- Insert sample data into POSTS 22 | INSERT INTO posts (id, newsletter_id, name, published_at) VALUES 23 | ('1112A001', '1112A', 'SQL basics', '2024-01-10'), 24 | ('1112A002', '1112A', 'Understanding Time Series', '2024-02-15'), 25 | ('1111B001', '1111B', 'RAG model basics', '2024-03-05'), 26 | ('1111B002', '1111B', 'Crafting modular SQL queries', '2024-04-20'); 27 | 28 | -- Get all post names from the posts table: 29 | SELECT name 30 | FROM posts; 31 | 32 | -- ORDER BY 33 | -- Get all post names sorted by publish date (oldest to newest): 34 | SELECT name, published_at 35 | FROM posts 36 | ORDER BY published_at ASC; 37 | 38 | -- Get post names sorted by newsletter and then by publish date (newest first): 39 | SELECT name, newsletter_id, published_at 40 | FROM posts 41 | ORDER BY newsletter_id ASC, published_at DESC; 42 | 43 | -- LIMIT 44 | -- Get the 2 most recent posts: 45 | SELECT name, published_at 46 | FROM posts 47 | ORDER BY 
published_at DESC 48 | LIMIT 2; 49 | 50 | -- Get just 1 post from the 'Non-Brand Data' newsletter: 51 | SELECT name 52 | FROM posts 53 | WHERE newsletter_id = '1111B' 54 | LIMIT 1; -------------------------------------------------------------------------------- /sql_scripts/Theory/06_Aggregate_Functions.sql: -------------------------------------------------------------------------------- 1 | -- Create the NEWSLETTERS table 2 | CREATE TABLE newsletters ( 3 | id VARCHAR(10) PRIMARY KEY, 4 | name VARCHAR(255) NOT NULL 5 | ); 6 | 7 | -- Create the POSTS table 8 | CREATE TABLE posts ( 9 | id VARCHAR(10) PRIMARY KEY, 10 | newsletter_id VARCHAR(10), 11 | name VARCHAR(255) NOT NULL, 12 | published_at DATE, 13 | FOREIGN KEY (newsletter_id) REFERENCES newsletters(id) 14 | ); 15 | 16 | -- Create the INTERACTIONS table with an extra numeric column 'points' 17 | CREATE TABLE interactions ( 18 | id VARCHAR(10) PRIMARY KEY, 19 | post_id VARCHAR(10), 20 | datetime DATETIME, 21 | user VARCHAR(50), 22 | type_of_interaction VARCHAR(50), 23 | points INT, -- numeric column added for demonstration of SUM and AVG 24 | FOREIGN KEY (post_id) REFERENCES posts(id) 25 | ); 26 | 27 | 28 | -- Insert sample data into NEWSLETTERS 29 | INSERT INTO newsletters (id, name) VALUES 30 | ('1112A', 'DataBites'), 31 | ('1111B', 'Non-Brand Data'); 32 | 33 | 34 | -- Insert sample data into POSTS 35 | INSERT INTO posts (id, newsletter_id, name, published_at) VALUES 36 | ('1112A001', '1112A', 'SQL basics', '2024-01-10'), 37 | ('1112A002', '1112A', 'Understanding Time Series', '2024-02-15'), 38 | ('1111B001', '1111B', 'RAG model basics', '2024-03-05'), 39 | ('1111B002', '1111B', 'Crafting modular SQL queries', '2024-04-20'); 40 | 41 | 42 | -- Insert sample data into INTERACTIONS 43 | INSERT INTO interactions (id, post_id, datetime, user, type_of_interaction, points) VALUES 44 | ('INT9256', '1111B002', '2024-04-18 11:48:00', 'user3', 'like', 5), 45 | ('INT7503', '1111B002', '2024-01-04 07:30:00', 'user1', 
'share', 8), 46 | ('INT7170', '1111B002', '2024-03-12 04:23:00', 'user2', 'like', 3), 47 | ('INT2624', '1112A001', '2024-02-03 00:47:00', 'user4', 'comment', 4), 48 | ('INT6104', '1111B001', '2024-01-06 20:50:00', 'user1', 'click', 2); 49 | 50 | 51 | -- 1. SUM: Add up numeric values from the 'points' column in INTERACTIONS 52 | SELECT SUM(points) AS total_points 53 | FROM interactions; 54 | -- This query sums up the 'points' values for all interactions. 55 | 56 | -- 2. AVG: Calculate the average of the numeric values in the 'points' column 57 | SELECT AVG(points) AS average_points 58 | FROM interactions; 59 | -- This query returns the average points per interaction. 60 | 61 | -- 3. COUNT: Count rows (all interactions) or count non-NULL values in a column 62 | -- a) Count all rows in the INTERACTIONS table: 63 | SELECT COUNT(*) AS total_interactions 64 | FROM interactions; 65 | -- b) Count only the non-NULL entries in the 'points' column: 66 | SELECT COUNT(points) AS count_points 67 | FROM interactions; 68 | 69 | -- 4. MIN / MAX: Find the smallest (MIN) and largest (MAX) published dates in POSTS 70 | SELECT MIN(published_at) AS earliest_post, 71 | MAX(published_at) AS latest_post 72 | FROM posts; 73 | -- This query retrieves the earliest and latest publication dates of posts. 74 | 75 | -- 5. GROUP BY: Segment data into groups. 76 | -- For example, count the number of interactions for each type_of_interaction: 77 | SELECT type_of_interaction, COUNT(*) AS interactions_count 78 | FROM interactions 79 | GROUP BY type_of_interaction; 80 | -- This groups the interactions table by type (like, share, etc.) and counts each group. 81 | 82 | -- 6. HAVING: Filter groups after aggregation. 
83 | -- For example, only show interaction types that occur more than once: 84 | SELECT type_of_interaction, COUNT(*) AS interactions_count 85 | FROM interactions 86 | GROUP BY type_of_interaction 87 | HAVING COUNT(*) > 1; 88 | -- This filters the groups, returning only those with more than one interaction. -------------------------------------------------------------------------------- /sql_scripts/Theory/07_Join.sql: -------------------------------------------------------------------------------- 1 | -- Creating table NEWSLETTERS 2 | CREATE TABLE newsletters ( 3 | id VARCHAR(10) PRIMARY KEY, 4 | name VARCHAR(100) 5 | ); 6 | 7 | -- Creating table POSTS 8 | CREATE TABLE posts ( 9 | id VARCHAR(10) PRIMARY KEY, 10 | newsletter_id VARCHAR(10), 11 | name VARCHAR(255), 12 | published_at DATE, 13 | FOREIGN KEY (newsletter_id) REFERENCES newsletters(id) 14 | ); 15 | 16 | -- Insert sample data into NEWSLETTERS 17 | INSERT INTO newsletters (id, name) VALUES 18 | ('1112A', 'DataBites'), 19 | ('1111B', 'Non-Brand Data'); 20 | 21 | -- Insert sample data into POSTS 22 | INSERT INTO posts (id, newsletter_id, name, published_at) VALUES 23 | ('1112A001', '1112A', 'SQL basics', '2024-01-10'), 24 | ('1112A002', '1112A', 'Understanding Time Series', '2024-02-15'), 25 | ('1111B001', '1111B', 'RAG model basics', '2024-03-05'), 26 | ('1111B002', '1111B', 'Crafting modular SQL queries', '2024-04-20'); 27 | 28 | -- Creating table INTERACTIONS 29 | CREATE TABLE interactions ( 30 | id VARCHAR(10) PRIMARY KEY, 31 | post_id VARCHAR(10), 32 | datetime DATETIME, 33 | user VARCHAR(50), 34 | type_of_interaction VARCHAR(50), 35 | FOREIGN KEY (post_id) REFERENCES posts(id) 36 | ); 37 | 38 | INSERT INTO interactions (id, post_id, datetime, user, type_of_interaction) VALUES 39 | ('INT9256', '1111B002', '2024-04-18 11:48:00', 'user3', 'like'), 40 | ('INT7503', '1111B002', '2024-01-04 07:30:00', 'user1', 'share'), 41 | ('INT7170', '1111B002', '2024-03-12 04:23:00', 'user2', 'like'), 42 | ('INT2624', 
'1112A001', '2024-02-03 00:47:00', 'user4', 'comment'), 43 | ('INT6104', '1111B001', '2024-01-06 20:50:00', 'user1', 'click'), 44 | ('INT5555', '1111B002', '2024-04-07 06:42:00', 'user3', 'comment'), 45 | ('INT7674', '1112A002', '2024-01-13 06:59:00', 'user5', 'share'), 46 | ('INT4502', '1111B002', '2024-04-28 21:33:00', 'user2', 'like'), 47 | ('INT9635', '1111B001', '2024-02-27 14:00:00', 'user1', 'click'); 48 | 49 | 50 | -- LEFT JOIN 51 | -- Get all posts and the name of their corresponding newsletter (if any): 52 | SELECT 53 | posts.name AS post_title, 54 | newsletters.name AS newsletter_name 55 | FROM posts 56 | LEFT JOIN newsletters 57 | ON posts.newsletter_id = newsletters.id; 58 | 59 | 60 | -- RIGHT JOIN 61 | -- Get all newsletters and any associated posts (if available): 62 | SELECT 63 | newsletters.name AS newsletter_name, 64 | posts.name AS post_title 65 | FROM posts 66 | RIGHT JOIN newsletters 67 | ON posts.newsletter_id = newsletters.id; 68 | 69 | -- LEFT JOIN (no match) 70 | 71 | -- Get all posts that are not linked to any newsletter: 72 | SELECT 73 | posts.name AS post_title 74 | FROM posts 75 | LEFT JOIN newsletters 76 | ON posts.newsletter_id = newsletters.id 77 | WHERE newsletters.id IS NULL; 78 | 79 | -- RIGHT JOIN (no match) 80 | -- Get all newsletters that don’t have any posts: 81 | SELECT 82 | newsletters.name AS newsletter_name 83 | FROM posts 84 | RIGHT JOIN newsletters 85 | ON posts.newsletter_id = newsletters.id 86 | WHERE posts.id IS NULL; 87 | 88 | -- FULL OUTER JOIN (simulated using UNION of LEFT and RIGHT JOINs) 89 | -- Show all posts and newsletters, including any that don’t match: 90 | SELECT 91 | posts.name AS post_title, 92 | newsletters.name AS newsletter_name 93 | FROM posts 94 | LEFT JOIN newsletters 95 | ON posts.newsletter_id = newsletters.id 96 | 97 | UNION 98 | 99 | SELECT 100 | posts.name AS post_title, 101 | newsletters.name AS newsletter_name 102 | FROM posts 103 | RIGHT JOIN newsletters 104 | ON posts.newsletter_id = 
newsletters.id; 105 | 106 | -- FULL OUTER JOIN (no match only) simulated in MySQL 107 | -- Get only the non-matching records between posts and newsletters: 108 | SELECT 109 | posts.name AS post_title, 110 | newsletters.name AS newsletter_name 111 | FROM posts 112 | LEFT JOIN newsletters 113 | ON posts.newsletter_id = newsletters.id 114 | WHERE newsletters.id IS NULL 115 | 116 | UNION 117 | 118 | SELECT 119 | posts.name AS post_title, 120 | newsletters.name AS newsletter_name 121 | FROM posts 122 | RIGHT JOIN newsletters 123 | ON posts.newsletter_id = newsletters.id 124 | WHERE posts.id IS NULL; 125 | 126 | 127 | -- INNER JOIN 128 | -- Get only posts that belong to a valid newsletter: 129 | SELECT 130 | posts.name AS post_title, 131 | newsletters.name AS newsletter_name 132 | FROM posts 133 | INNER JOIN newsletters 134 | ON posts.newsletter_id = newsletters.id; 135 | 136 | -- Joining newsletters, posts, and interactions 137 | -- Show the newsletter name, post title, and interaction type: 138 | SELECT 139 | newsletters.name AS newsletter_name, 140 | posts.name AS post_title, 141 | interactions.type_of_interaction 142 | FROM newsletters 143 | JOIN posts 144 | ON newsletters.id = posts.newsletter_id 145 | JOIN interactions 146 | ON posts.id = interactions.post_id; 147 | -------------------------------------------------------------------------------- /sql_scripts/Theory/08_Union.sql: -------------------------------------------------------------------------------- 1 | -- 1. 
CREATE TABLES 2 | CREATE TABLE newsletters ( 3 | id VARCHAR(10) PRIMARY KEY, 4 | name VARCHAR(100) 5 | ); 6 | 7 | CREATE TABLE posts ( 8 | id VARCHAR(10) PRIMARY KEY, 9 | newsletter_id VARCHAR(10), 10 | name VARCHAR(255), 11 | published_at DATE, 12 | FOREIGN KEY (newsletter_id) REFERENCES newsletters(id) 13 | ); 14 | 15 | CREATE TABLE interactions ( 16 | id VARCHAR(10) PRIMARY KEY, 17 | post_id VARCHAR(10), 18 | datetime DATETIME, 19 | user VARCHAR(50), 20 | type_of_interaction VARCHAR(50), 21 | points INT, -- numeric column added for demonstration of SUM and AVG 22 | FOREIGN KEY (post_id) REFERENCES posts(id) 23 | ); 24 | 25 | -- 2. INSERT DATA 26 | -- Newsletters 27 | INSERT INTO newsletters (id, name) VALUES 28 | ('1112A', 'DataBites'), 29 | ('1111B', 'Non-Brand Data'); 30 | 31 | -- Posts (including a duplicate name "DataBites") 32 | INSERT INTO posts (id, newsletter_id, name, published_at) VALUES 33 | ('1112A001', '1112A', 'SQL basics', '2024-01-10'), 34 | ('1112A002', '1112A', 'Understanding Time Series', '2024-02-15'), 35 | ('1111B001', '1111B', 'RAG model basics', '2024-03-05'), 36 | ('1111B002', '1111B', 'Crafting modular SQL queries', '2024-04-20'), 37 | ('1112A003', '1112A', 'DataBites', '2024-05-01'); -- Duplicate name from newsletters 38 | 39 | -- Insert sample data into INTERACTIONS 40 | INSERT INTO interactions (id, post_id, datetime, user, type_of_interaction, points) VALUES 41 | ('INT9256', '1111B002', '2024-04-18 11:48:00', 'user3', 'like', 5), 42 | ('INT7503', '1111B002', '2024-01-04 07:30:00', 'user1', 'share', 8), 43 | ('INT7170', '1111B002', '2024-03-12 04:23:00', 'user2', 'like', 3), 44 | ('INT2624', '1112A001', '2024-02-03 00:47:00', 'user4', 'comment', 4), 45 | ('INT6104', '1111B001', '2024-01-06 20:50:00', 'user1', 'click', 2); 46 | 47 | -- 3. UNION QUERY (Removes Duplicate "DataBites") 48 | SELECT name FROM newsletters 49 | UNION 50 | SELECT name FROM posts; 51 | 52 | -- 4. 
UNION ALL QUERY (Keeps All Rows) 53 | SELECT name FROM newsletters 54 | UNION ALL 55 | SELECT name FROM posts; 56 | 57 | -- 5. STACK 3 TABLE WITH MIXED DATA TYPE 58 | -- Columns: id (VARCHAR), name (VARCHAR), points (INT) 59 | SELECT id, name, NULL AS points 60 | FROM newsletters 61 | 62 | UNION ALL 63 | 64 | SELECT id, name, NULL AS points 65 | FROM posts 66 | 67 | UNION ALL 68 | 69 | SELECT id, type_of_interaction AS name, points 70 | FROM interactions; -------------------------------------------------------------------------------- /sql_scripts/Theory/09_CASE.sql: -------------------------------------------------------------------------------- 1 | -- EXAMPLE 1 2 | -- We want to count how many interactions each post has received and classify them into categories: 🟥 No Interactions, 🟨 Low, 🟧 Medium, 🟩 High. 3 | SELECT 4 | P.name AS post_name, 5 | COUNT(I.id) AS num_interactions, 6 | CASE 7 | WHEN COUNT(I.id) = 0 THEN '🟥 No Interactions' 8 | WHEN COUNT(I.id) BETWEEN 1 AND 3 THEN '🟨 Low' 9 | WHEN COUNT(I.id) BETWEEN 4 AND 6 THEN '🟧 Medium' 10 | ELSE '🟩 High' 11 | END AS interaction_level 12 | FROM posts P 13 | LEFT JOIN interactions I 14 | ON P.id = I.post_id 15 | GROUP BY P.name; 16 | 17 | -- EXAMPLE 2 18 | -- We’ll classify each interaction as Like, Comment, or Other using a CASE expression. 19 | SELECT 20 | I.id, 21 | I.type_of_interaction, 22 | CASE 23 | WHEN I.type_of_interaction = 'like' THEN 'Like' 24 | WHEN I.type_of_interaction = 'comment' THEN 'Comment' 25 | ELSE 'Other' 26 | END AS interaction_category 27 | FROM interactions I; 28 | 29 | -- EXAMPLE 3 30 | -- We want to list all posts, and prioritize those with no interactions at the top of the results. 
31 | SELECT 32 | P.name AS post_name, 33 | COUNT(I.id) AS num_interactions 34 | FROM posts P 35 | LEFT JOIN interactions I 36 | ON P.id = I.post_id 37 | GROUP BY P.name 38 | ORDER BY 39 | CASE 40 | WHEN COUNT(I.id) = 0 THEN 0 41 | ELSE 1 42 | END, 43 | P.name; 44 | -------------------------------------------------------------------------------- /sql_scripts/Theory/10_Functions.sql: -------------------------------------------------------------------------------- 1 | -- Example combining String, Date, and Numeric functions 2 | SELECT 3 | -- String Functions 4 | CONCAT(n.name, ' - ', p.name) AS combined_title, -- Merge newsletter & post names 5 | SUBSTRING(p.name, 1, 10) AS title_snippet, -- Extract first 10 characters 6 | UPPER(n.name) AS uppercase_newsletter, -- Convert newsletter name to uppercase 7 | TRIM(i.type_of_interaction) AS clean_interaction, -- Remove whitespace from interaction types 8 | REPLACE(p.name, 'DataBites', 'DB') AS renamed_post,-- Replace substring in post names 9 | 10 | -- Date Functions 11 | CURRENT_DATE AS today, -- Get current date 12 | EXTRACT(YEAR FROM p.published_at) AS publish_year, -- Extract year from publication date 13 | DATE_ADD(CURRENT_DATE, INTERVAL -7 DAY) AS week_ago_start, -- Calculate date 7 days ago 14 | DATEDIFF(CURRENT_DATE, p.published_at) AS days_since_publish, -- Days between today and publish date 15 | 16 | -- Numeric Functions 17 | SUM(i.points) AS total_points, -- Sum interaction points per post 18 | ROUND(AVG(i.points), 1) AS avg_points, -- Average points (rounded to 1 decimal) 19 | CEIL(SUM(i.points)) AS rounded_up_points, -- Round total points up to nearest integer 20 | FLOOR(SUM(i.points)) AS rounded_down_points, -- Round total points down to nearest integer 21 | ABS(i.points - 5) AS deviation_from_five, -- Absolute difference from 5 points 22 | MOD(i.points, 2) AS even_odd_check -- Check if points are even (0) or odd (1) 23 | 24 | FROM posts p 25 | JOIN newsletters n ON p.newsletter_id = n.id 26 | LEFT JOIN 
    interactions i ON p.id = i.post_id
GROUP BY
    p.id, n.name, p.name, i.type_of_interaction, p.published_at, i.points;
--------------------------------------------------------------------------------
/sql_scripts/Theory/11_Subqueries.sql:
--------------------------------------------------------------------------------
-- Preserve Original Table Structure
SELECT
    n.id,
    n.name,
    (SELECT COUNT(*)
     FROM posts p
     WHERE p.newsletter_id = n.id) AS total_posts
FROM newsletters n;

-- Aggregate Comparison
SELECT
    p.name,
    SUM(i.points) AS total_points
FROM posts p
JOIN interactions i ON p.id = i.post_id
GROUP BY p.id
HAVING SUM(i.points) > (
    SELECT AVG(total)
    FROM (
        SELECT SUM(points) AS total
        FROM interactions
        GROUP BY post_id
    ) agg
);

-- Multi-Layer Calculations with GROUP BY
SELECT
    p.name AS post_name,
    p.newsletter_id,
    SUM(i.points) AS post_points,
    newsletter_avg.avg_points
FROM posts p
JOIN interactions i ON p.id = i.post_id
JOIN (
    SELECT
        newsletter_id,
        AVG(points) AS avg_points
    FROM posts
    JOIN interactions ON posts.id = interactions.post_id
    GROUP BY newsletter_id
) newsletter_avg ON p.newsletter_id = newsletter_avg.newsletter_id
GROUP BY p.name, p.newsletter_id, newsletter_avg.avg_points;



-- Existence Check
SELECT *
FROM newsletters n
WHERE EXISTS (
    SELECT 1
    FROM posts p
    WHERE p.newsletter_id = n.id
      AND p.name = 'DataBites'
);
--------------------------------------------------------------------------------
/sql_scripts/Theory/12_CTEs.sql:
--------------------------------------------------------------------------------
-- #1.
-- Modular Metric Calculation
-- WITHOUT CTEs (plain joins)
SELECT
    n.name,
    SUM(i.points) AS total_points
FROM newsletters n
JOIN posts p ON n.id = p.newsletter_id
JOIN interactions i ON p.id = i.post_id
GROUP BY n.name
ORDER BY total_points DESC;

-- WITH CTEs
WITH newsletter_points AS (
    SELECT
        n.id AS newsletter_id,
        SUM(i.points) AS total_points
    FROM newsletters n
    JOIN posts p ON n.id = p.newsletter_id
    JOIN interactions i ON p.id = i.post_id
    GROUP BY n.id
)
SELECT n.name, np.total_points
FROM newsletter_points np
JOIN newsletters n ON n.id = np.newsletter_id
ORDER BY np.total_points DESC;

-- #2. Reusable Intermediate Filters
-- Real-world scenario
WITH high_engagement_newsletters AS (
    SELECT
        n.id AS newsletter_id
    FROM newsletters n
    JOIN posts p ON n.id = p.newsletter_id
    JOIN interactions i ON p.id = i.post_id
    GROUP BY n.id
    HAVING SUM(i.points) > 10
),
top_posts AS (
    SELECT
        p.id,
        p.name,
        SUM(i.points) AS total_post_points
    FROM posts p
    JOIN interactions i ON p.id = i.post_id
    GROUP BY p.id, p.name
)
SELECT tp.name, tp.total_post_points
FROM top_posts tp
JOIN posts p ON tp.id = p.id
WHERE p.newsletter_id IN (
    SELECT newsletter_id FROM high_engagement_newsletters
);

-- #3.
-- Step-by-Step Aggregation
WITH post_points AS (
    SELECT
        p.id AS post_id,
        p.newsletter_id,
        SUM(i.points) AS total_points
    FROM posts p
    JOIN interactions i ON p.id = i.post_id
    GROUP BY p.id, p.newsletter_id
),
normalized_post_points AS (
    SELECT
        pp.post_id,
        pp.newsletter_id,
        pp.total_points / COUNT(i.id) AS normalized_score
    FROM post_points pp
    JOIN interactions i ON pp.post_id = i.post_id
    GROUP BY pp.post_id, pp.newsletter_id, pp.total_points
),
newsletter_avg_score AS (
    SELECT
        n.id AS newsletter_id,
        AVG(npp.normalized_score) AS avg_normalized_score
    FROM newsletters n
    JOIN normalized_post_points npp ON n.id = npp.newsletter_id
    GROUP BY n.id
)
SELECT n.name, nas.avg_normalized_score
FROM newsletter_avg_score nas
JOIN newsletters n ON nas.newsletter_id = n.id;
--------------------------------------------------------------------------------
/sql_scripts/Theory/13_Recursion.sql:
--------------------------------------------------------------------------------
WITH RECURSIVE InteractionDates AS (
    SELECT MIN(DATE(datetime)) AS interaction_date
    FROM interactions

    UNION ALL

    SELECT DATE_ADD(interaction_date, INTERVAL 1 DAY)
    FROM InteractionDates
    WHERE interaction_date < (SELECT MAX(DATE(datetime)) FROM interactions)
)
SELECT interaction_date
FROM InteractionDates;

WITH RECURSIVE PointsRanking AS (
    -- Anchor: only the first interaction of each post (rn = 1).
    -- Seeding the recursion with every row would emit duplicate chains.
    SELECT
        id,
        post_id,
        datetime,
        points,
        points AS running_total,
        rn
    FROM (
        SELECT
            id,
            post_id,
            datetime,
            points,
            ROW_NUMBER() OVER (PARTITION BY post_id ORDER BY datetime) AS rn
        FROM interactions
    ) first_rows
    WHERE rn = 1

    UNION ALL

    -- Recursive step: add the next-later interaction of the same post
    SELECT
        i.id,
        i.post_id,
        i.datetime,
        i.points,
        pr.running_total + i.points AS running_total,
        pr.rn + 1
    FROM interactions i
    JOIN PointsRanking pr
        ON i.post_id = pr.post_id
        AND i.datetime > pr.datetime
    WHERE NOT EXISTS (
        SELECT 1 FROM interactions i2
        WHERE i2.post_id = pr.post_id
          AND i2.datetime > pr.datetime
          AND i2.datetime < i.datetime
    )
)
SELECT post_id, datetime, points, running_total
FROM PointsRanking
ORDER BY post_id, datetime;

WITH RECURSIVE post_sequence AS (
    -- Anchor: Earliest post in the newsletter
    SELECT
        id,
        name,
        published_at,
        1 AS post_order,
        CAST(NULL AS SIGNED) AS days_since_previous
    FROM posts
    WHERE newsletter_id = '1112A'
      AND published_at = (
          SELECT MIN(published_at)
          FROM posts
          WHERE newsletter_id = '1112A'
      )

    UNION ALL

    -- Recursive step: find next-later post
    SELECT
        p.id,
        p.name,
        p.published_at,
        ps.post_order + 1,
        DATEDIFF(p.published_at, ps.published_at) AS days_since_previous
    FROM posts p
    JOIN post_sequence ps ON p.newsletter_id = '1112A'
    WHERE p.published_at > ps.published_at
      AND NOT EXISTS (
          SELECT 1
          FROM posts p2
          WHERE p2.newsletter_id = '1112A'
            AND p2.published_at > ps.published_at
            AND p2.published_at < p.published_at
      )
)
SELECT *
FROM post_sequence
ORDER BY post_order;

--------------------------------------------------------------------------------
/sql_scripts/Theory/14_Views.sql:
--------------------------------------------------------------------------------
-- #1. Simplifying Repetitive Logic
-- Without a view
SELECT *
FROM posts
WHERE newsletter_id = '1112A';

-- With a view
CREATE VIEW databites_posts AS
SELECT *
FROM posts
WHERE newsletter_id = '1112A';

-- Reuse easily:
SELECT * FROM databites_posts
WHERE published_at > '2024-02-01';

-- #2.
-- Reusable Metrics
CREATE VIEW post_performance AS
SELECT
    p.id AS post_id,
    p.name,
    n.name AS newsletter_name,
    SUM(i.points) AS total_points,
    COUNT(i.id) AS interaction_count
FROM posts p
JOIN newsletters n ON p.newsletter_id = n.id
LEFT JOIN interactions i ON p.id = i.post_id
GROUP BY p.id, p.name, n.name;

-- Use it like this:
SELECT *
FROM post_performance
WHERE total_points > 5;

-- #3. Hiding Sensitive Data
CREATE VIEW public_post_insights AS
SELECT
    p.id AS post_id,
    p.name AS post_title,
    COUNT(i.id) AS total_interactions,
    SUM(i.points) AS engagement_score
FROM posts p
LEFT JOIN interactions i ON p.id = i.post_id
GROUP BY p.id, p.name;

-- Simple and safe:
SELECT * FROM public_post_insights;
--------------------------------------------------------------------------------
/sql_scripts/Theory/15_CRUD_Operations.sql:
--------------------------------------------------------------------------------
-- ======================================
-- 1. INSERT EXAMPLES
-- ======================================

-- Single Row Insert
INSERT INTO newsletters (id, name)
VALUES ('1113C', 'Analytics Weekly');

INSERT INTO posts (id, newsletter_id, name, published_at)
VALUES ('1113C001', '1113C', 'Introduction to Python', '2024-06-01');

-- Bulk Insert
INSERT INTO interactions (id, post_id, datetime, user, type_of_interaction, points)
VALUES
    ('INT9991', '1112A003', '2024-05-02 09:00:00', 'user5', 'like', 5),
    ('INT9992', '1112A003', '2024-05-02 10:30:00', 'user6', 'share', 8);

-- Insert from Another Table
CREATE TABLE interactions_archive AS
SELECT * FROM interactions
WHERE datetime < '2024-01-01';


-- ======================================
-- 2.
-- UPDATE EXAMPLES
-- ======================================

-- Update with Subquery
UPDATE interactions
SET points = points + 2
WHERE post_id IN (
    SELECT id FROM posts
    WHERE newsletter_id = '1112A' -- "DataBites" newsletter
);

-- ======================================
-- 3. DELETE EXAMPLES (SAFE VERSION)
-- ======================================

-- Example: Delete post '1112A003' and its interactions
-- Step 1: Delete child interactions first
DELETE FROM interactions
WHERE post_id = '1112A003';

-- Step 2: Delete the post
DELETE FROM posts
WHERE id = '1112A003';

-- ======================================
-- 🔑 BEST PRACTICES EXAMPLES
-- ======================================

-- 1. Always Use WHERE in UPDATE/DELETE
-- ✅ Safe deletion of old interactions
DELETE FROM interactions
WHERE datetime < '2024-01-01';

-- 2. Test with SELECT First
-- Preview posts to delete (with no interactions)
SELECT * FROM posts
WHERE id NOT IN (SELECT post_id FROM interactions);


-- 3. Soft Delete Example (avoids foreign key issues)
ALTER TABLE posts ADD COLUMN is_active BOOLEAN DEFAULT TRUE;
UPDATE posts SET is_active = FALSE WHERE id = '1112A003'; -- Mark as inactive
--------------------------------------------------------------------------------
/sql_scripts/Theory/16_Database_modifications.sql:
--------------------------------------------------------------------------------
-- #1. Simplifying Repetitive Logic
-- Without a view
SELECT *
FROM posts
WHERE newsletter_id = '1112A';

-- With a view
CREATE VIEW databites_posts AS
SELECT *
FROM posts
WHERE newsletter_id = '1112A';

-- Reuse easily:
SELECT * FROM databites_posts
WHERE published_at > '2024-02-01';

-- #2.
-- Reusable Metrics
CREATE VIEW post_performance AS
SELECT
    p.id AS post_id,
    p.name,
    n.name AS newsletter_name,
    SUM(i.points) AS total_points,
    COUNT(i.id) AS interaction_count
FROM posts p
JOIN newsletters n ON p.newsletter_id = n.id
LEFT JOIN interactions i ON p.id = i.post_id
GROUP BY p.id, p.name, n.name;

-- Use it like this:
SELECT *
FROM post_performance
WHERE total_points > 5;

-- #3. Hiding Sensitive Data
CREATE VIEW public_post_insights AS
SELECT
    p.id AS post_id,
    p.name AS post_title,
    COUNT(i.id) AS total_interactions,
    SUM(i.points) AS engagement_score
FROM posts p
LEFT JOIN interactions i ON p.id = i.post_id
GROUP BY p.id, p.name;

-- Simple and safe:
SELECT * FROM public_post_insights;
--------------------------------------------------------------------------------
/sql_scripts/Theory/17_Indexing_and_Optimization.sql:
--------------------------------------------------------------------------------
-- Create indexes for MySQL
CREATE INDEX idx_posts_name ON posts(name);
CREATE INDEX idx_posts_newsletter_published ON posts(newsletter_id, published_at);
CREATE INDEX idx_interactions_post ON interactions(post_id);
CREATE INDEX idx_interactions_covered ON interactions(post_id, points, datetime);
--------------------------------------------------------------------------------
/sql_scripts/Theory/18_Modular_Code.sql:
--------------------------------------------------------------------------------
WITH
-- MODULE 1

post_points AS (
    SELECT
        p.id AS post_id,
        p.name AS post_name,
        p.newsletter_id,
        SUM(i.points) AS total_points
    FROM posts p
    JOIN interactions i ON p.id = i.post_id
    GROUP BY p.id, p.name, p.newsletter_id
),

-- MODULE 2
ranked_posts AS
(
    SELECT
        pp.*,
        RANK() OVER (PARTITION BY newsletter_id ORDER BY
            total_points DESC) AS rank_within_newsletter
    FROM post_points pp
),

-- MODULE 3
newsletter_avg_points AS (
    SELECT
        newsletter_id,
        AVG(total_points) AS avg_post_score
    FROM post_points
    GROUP BY newsletter_id
)

-- FINAL SELECTION
SELECT
    n.name AS newsletter_name,
    rp.post_name,
    rp.total_points,
    rp.rank_within_newsletter,
    nap.avg_post_score
FROM ranked_posts rp
JOIN newsletters n ON n.id = rp.newsletter_id
JOIN newsletter_avg_points nap ON nap.newsletter_id = rp.newsletter_id
WHERE rp.rank_within_newsletter <= 3
ORDER BY n.name, rp.rank_within_newsletter;
--------------------------------------------------------------------------------
/sql_scripts/Theory/19_execution_order.sql:
--------------------------------------------------------------------------------
-- OUR EXAMPLE TO UNDERSTAND THE SQL EXECUTION ORDER
SELECT
    n.name AS newsletter_name,
    p.name AS post_name,
    SUM(i.points) AS total_points
FROM posts p
JOIN newsletters n
    ON p.newsletter_id = n.id
JOIN interactions i
    ON p.id = i.post_id
WHERE i.points IS NOT NULL
GROUP BY n.name, p.name
HAVING SUM(i.points) >= 2
ORDER BY total_points DESC
LIMIT 10;
--------------------------------------------------------------------------------
/sql_scripts/Theory/20_query_optimization.sql:
--------------------------------------------------------------------------------
-- 1️⃣ Use EXISTS Instead of JOIN + DISTINCT
SELECT *
FROM newsletters n
WHERE EXISTS (
    SELECT 1
    FROM posts p
    WHERE p.newsletter_id = n.id
      AND p.name = 'DataBites'
);


-- 2️⃣ Select Only What You Need
SELECT id, name, published_at
FROM posts
WHERE published_at >= '2024-01-01';

-- 3️⃣ Index Strategically
CREATE INDEX idx_posts_date ON posts(published_at);

-- 4️⃣ Break Queries into CTEs
WITH
post_totals AS (
    SELECT post_id, SUM(points) AS total
    FROM interactions
    GROUP BY post_id
)
SELECT p.id
FROM posts p
JOIN post_totals pt ON p.id = pt.post_id
WHERE pt.total > (SELECT AVG(total) FROM post_totals);

-- 5️⃣ Avoid SELECT * in Subqueries
SELECT n.name,
    (SELECT COUNT(id) FROM posts p WHERE p.newsletter_id = n.id) AS post_count
FROM newsletters n;

-- 6️⃣ Use JOIN Instead of Subqueries for Filters
SELECT DISTINCT n.id
FROM newsletters n
JOIN posts p ON n.id = p.newsletter_id
WHERE p.name LIKE '%SQL%';

--------------------------------------------------------------------------------