├── placeholder.sql
├── .gitattributes
├── LICENSE
└── README.md


/placeholder.sql:
--------------------------------------------------------------------------------
1 | SELECT *


--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
1 | *.sql linguist-detectable=true
2 | *.sql linguist-language=sql
3 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2024 Ben
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # SQL tips and tricks
  2 | 
  3 | [![Stand With Ukraine](https://raw.githubusercontent.com/vshymanskyy/StandWithUkraine/main/badges/StandWithUkraine.svg)](https://stand-with-ukraine.pp.ua)
  4 | 
  5 | [![Ceasefire Now](https://badge.techforpalestine.org/default)](https://techforpalestine.org/learn-more)
  6 | 
  7 | A (somewhat opinionated) list of SQL tips and tricks that I've picked up over the years.
  8 | 
  9 | There's so much you can do with SQL but I've focused on what I find most useful in my day-to-day work as a data analyst and what 
 10 | I wish I had known when I first started writing SQL.
 11 | 
 12 | Please note that some of these tips might not be relevant for all RDBMSs.
 13 | 
 14 | ## Table of contents
 15 | 
 16 | ### Formatting/readability
 17 | 
 18 | - [Use a leading comma to separate fields](#use-a-leading-comma-to-separate-fields)
 19 | - [Use a dummy value in the `WHERE` clause](#use-a-dummy-value-in-the-where-clause)
 20 | - [Indent your code](#indent-your-code)
 21 | - [Consider CTEs when writing complex queries](#consider-ctes-when-writing-complex-queries)
 22 | - [Comment your code](#comment-your-code)
 23 | - [Simplify joins with USING](#simplify-joins-with-using)
 24 | 
 25 | ### Data wrangling
 26 | - [Anti-joins will return rows from one table that have no match in another table](#anti-joins-will-return-rows-from-one-table-that-have-no-match-in-another-table)
 27 | - [Use `QUALIFY` to filter window functions](#use-qualify-to-filter-window-functions)
 28 | - [You can (but shouldn't always) `GROUP BY` column position](#you-can-but-shouldnt-always-group-by-column-position)
 29 | - [Create a grand total with `GROUP BY ROLLUP`](#create-a-grand-total-with-group-by-rollup)
 30 | - [Use `EXCEPT` to find the difference between two tables](#use-except-to-find-the-difference-between-two-tables)
 31 | 
 32 | ### Performance
 33 | 
 34 | - [`NOT EXISTS` is faster than `NOT IN` if your column allows `NULL`](#not-exists-is-faster-than-not-in-if-your-column-allows-null)
 35 | - [Implicit casting will slow down (or break) your query](#implicit-casting-will-slow-down-or-break-your-query)
 36 | 
 37 | ### Common mistakes
 38 | 
 39 | - [Be aware of how `NOT IN` behaves with `NULL` values](#be-aware-of-how-not-in-behaves-with-null-values)
 40 | - [Avoid ambiguity when naming calculated fields](#avoid-ambiguity-when-naming-calculated-fields)
 41 | - [Always specify which column belongs to which table](#always-specify-which-column-belongs-to-which-table)
 42 | 
 43 | ### Miscellaneous
 44 | 
 45 | - [Understand the order of execution](#understand-the-order-of-execution)
 46 | - [Read the documentation (in full)](#read-the-documentation-in-full)
 47 | - [Use descriptive names for your saved queries](#use-descriptive-names-for-your-saved-queries)
 48 | 
 49 | 
 50 | ## Formatting/readability
 51 | ### Use a leading comma to separate fields
 52 | 
 53 | Use a leading comma to separate fields in the `SELECT` clause rather than a trailing comma.
 54 | 
 55 | - Clearly defines that this is a new column vs code that's wrapped to multiple lines.
 56 | 
 57 | - Visual cue to easily identify if the comma is missing or not. With trailing commas, varying line lengths make a missing comma harder to spot.
 58 |  
 59 | ```SQL
 60 | SELECT
 61 | employee_id
 62 | , employee_name
 63 | , job
 64 | , salary
 65 | FROM employees
 66 | ;
 67 | ```
 68 | 
 69 | - Also use a leading `AND` in the `WHERE` clause, for the same reasons (following tip demonstrates this).
 70 | 
 71 | -----
 72 | 
 73 | ### Use a dummy value in the `WHERE` clause
 74 | Use a dummy value in the `WHERE` clause so you can easily comment out conditions when testing or tweaking a query.
 75 | 
 76 | ```SQL
 77 | /*
 78 | If I want to comment out the job
 79 | condition the following query
 80 | will break:
 81 | */
 82 | SELECT *
 83 | FROM employees
 84 | WHERE
 85 | --job IN ('Clerk', 'Manager')
 86 | AND dept_no != 5
 87 | ;
 88 | 
 89 | /*
 90 | With a dummy value there's no issue.
 91 | I can comment out all the conditions
 92 | and 1=1 will ensure the query still runs:
 93 | */
 94 | SELECT *
 95 | FROM employees
 96 | WHERE 1=1
 97 | -- AND job IN ('Clerk', 'Manager')
 98 | AND dept_no != 5
 99 | ;
100 | ```
101 | 
102 | -----
103 | 
104 | ### Indent your code
105 | Indent your code to make it more readable to colleagues and your future self.
106 | 
107 | Opinions will vary on what this looks like, so be sure to follow your company/team's guidelines or, if none exist, go with whatever works for you.
108 | 
109 | You can also use an online formatter like [poorsql](https://poorsql.com/) or a linter like [sqlfluff](https://github.com/sqlfluff/sqlfluff).
110 | 
111 | ``` SQL
112 | -- Bad:
113 | SELECT
114 | vc.video_id
115 | , CASE WHEN meta.GENRE IN ('Drama', 'Comedy') THEN 'Entertainment' ELSE meta.GENRE END as content_type
116 | FROM video_content AS vc
117 | INNER JOIN metadata AS meta ON vc.video_id = meta.video_id
118 | ;
119 | 
120 | -- Good:
121 | SELECT 
122 | vc.video_id
123 | , CASE 
124 | 	WHEN meta.GENRE IN ('Drama', 'Comedy') THEN 'Entertainment' 
125 | 	ELSE meta.GENRE 
126 | END AS content_type
127 | FROM video_content AS vc
128 | INNER JOIN metadata AS meta
129 | 	ON vc.video_id = meta.video_id
130 | ;
131 | ```
132 | -----
133 | 
134 | ### Consider CTEs when writing complex queries
135 | For longer than I'd care to admit I would nest inline views, which would lead to
136 | queries that were hard to understand, particularly if revisited after a few weeks.
137 | 
138 | If you find yourself nesting inline views more than 2 or 3 levels deep, 
139 | consider using common table expressions, which keep your code more organised and readable and support reusability and debugging.
140 | 
141 | ```SQL
142 | -- Using inline views:
143 | SELECT 
144 | vhs.movie
145 | , vhs.vhs_revenue
146 | , cs.cinema_revenue
147 | FROM 
148 |     (
149 |     SELECT
150 |     movie_id
151 |     , SUM(ticket_sales) AS cinema_revenue
152 |     FROM tickets
153 |     GROUP BY movie_id
154 |     ) AS cs
155 |     INNER JOIN 
156 |         (
157 |         SELECT 
158 |         movie
159 |         , movie_id
160 |         , SUM(revenue) AS vhs_revenue
161 |         FROM blockbuster
162 |         GROUP BY movie, movie_id
163 |         ) AS vhs
164 |         ON cs.movie_id = vhs.movie_id
165 | ;
166 | 
167 | -- Using CTEs:
168 | WITH cinema_sales AS 
169 |     (
170 |         SELECT 
171 |         movie_id
172 |         , SUM(ticket_sales) AS cinema_revenue
173 |         FROM tickets
174 |         GROUP BY movie_id
175 |     ),
176 |     vhs_sales AS
177 |     (
178 |         SELECT 
179 |         movie
180 |         , movie_id
181 |         , SUM(revenue) AS vhs_revenue
182 |         FROM blockbuster
183 |         GROUP BY movie, movie_id
184 |     )
185 | SELECT 
186 | vhs.movie
187 | , vhs.vhs_revenue
188 | , cs.cinema_revenue
189 | FROM cinema_sales AS cs
190 |     INNER JOIN vhs_sales AS vhs
191 |     ON cs.movie_id = vhs.movie_id
192 | ;
193 | ```
194 | 
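To check that the two forms above really are interchangeable, here is a minimal sketch using Python's built-in `sqlite3` module. The sample rows are made up for illustration, and the `ORDER BY` is only there to make the comparison deterministic; the table and column names follow the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data, just enough to exercise both queries.
conn.executescript("""
CREATE TABLE tickets (movie_id INTEGER, ticket_sales INTEGER);
CREATE TABLE blockbuster (movie TEXT, movie_id INTEGER, revenue INTEGER);
INSERT INTO tickets VALUES (1, 100), (1, 50), (2, 70);
INSERT INTO blockbuster VALUES ('Alien', 1, 30), ('Alien', 1, 20), ('Heat', 2, 10);
""")

inline_view_sql = """
SELECT
vhs.movie
, vhs.vhs_revenue
, cs.cinema_revenue
FROM
    (
    SELECT
    movie_id
    , SUM(ticket_sales) AS cinema_revenue
    FROM tickets
    GROUP BY movie_id
    ) AS cs
    INNER JOIN
        (
        SELECT
        movie
        , movie_id
        , SUM(revenue) AS vhs_revenue
        FROM blockbuster
        GROUP BY movie, movie_id
        ) AS vhs
        ON cs.movie_id = vhs.movie_id
ORDER BY vhs.movie
"""

cte_sql = """
WITH cinema_sales AS
    (
        SELECT
        movie_id
        , SUM(ticket_sales) AS cinema_revenue
        FROM tickets
        GROUP BY movie_id
    ),
    vhs_sales AS
    (
        SELECT
        movie
        , movie_id
        , SUM(revenue) AS vhs_revenue
        FROM blockbuster
        GROUP BY movie, movie_id
    )
SELECT
vhs.movie
, vhs.vhs_revenue
, cs.cinema_revenue
FROM cinema_sales AS cs
    INNER JOIN vhs_sales AS vhs
    ON cs.movie_id = vhs.movie_id
ORDER BY vhs.movie
"""

inline_rows = conn.execute(inline_view_sql).fetchall()
cte_rows = conn.execute(cte_sql).fetchall()
print(inline_rows == cte_rows)  # Both forms return the same rows.
print(cte_rows)
```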
195 | -----
196 | ### Comment your code
197 | While in the moment you know why you did something, if you revisit
198 | the code weeks, months or years later you might not remember.
199 | 
200 | In general you should strive to write comments that explain why you did something, not how.
201 | 
202 | Your colleagues and future self will thank you!
203 | 
204 | ```SQL
205 | SELECT 
206 | video_content.*
207 | FROM video_content
208 |     LEFT JOIN archive
209 |     ON video_content.video_id = archive.video_id
210 | WHERE 1=1
211 | -- Need to filter out as new CMS cannot process archive video formats:
212 | AND archive.video_id IS NULL
213 | ;
214 | ```
215 | 
216 | -----
217 | ### Simplify joins with `USING`
218 | 
219 | If you're joining using a column with the same name in two tables you can use `USING` to
220 | simplify your join:
221 | 
222 | ```SQL
223 | -- USING:
224 | SELECT * 
225 | FROM album 
226 | 	INNER JOIN artist 
227 | 	USING (artistid)
228 | 
229 | -- Traditional ON clause:
230 | SELECT * 
231 | FROM album 
232 | 	INNER JOIN artist 
233 | 	ON album.artistid = artist.ArtistId 
234 | ```
235 | 
236 | The other benefit of `USING` is that the column in common between the two tables is deduplicated, with only one column returned in the result set.
237 | 
238 | This means that there is no ambiguity, unlike the following query, which would throw an `ambiguous column name` error because the database
239 | cannot tell which table's column you are referring to when you use the `ON` clause:
240 | 
241 | ```SQL
242 | SELECT ArtistId -- Which table column?
243 | FROM album
244 | 	INNER JOIN artist 
245 | 	ON album.artistid = artist.ArtistId
246 | ```
247 | 
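The deduplication and the ambiguity error can both be observed with a small sketch in Python's built-in `sqlite3` module. This assumes SQLite's behaviour, which may differ from your RDBMS, and the sample rows and the extra `albumid`, `title` and `name` columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; only artistid is shared by both tables.
conn.executescript("""
CREATE TABLE artist (artistid INTEGER, name TEXT);
CREATE TABLE album (albumid INTEGER, title TEXT, artistid INTEGER);
INSERT INTO artist VALUES (1, 'Miles Davis');
INSERT INTO album VALUES (10, 'Kind of Blue', 1);
""")

# With USING, SELECT * returns the shared column only once.
using_cursor = conn.execute(
    "SELECT * FROM album INNER JOIN artist USING (artistid)"
)
using_columns = [col[0] for col in using_cursor.description]

# With ON, SELECT * returns the shared column from both tables.
on_cursor = conn.execute(
    "SELECT * FROM album INNER JOIN artist ON album.artistid = artist.artistid"
)
on_columns = [col[0] for col in on_cursor.description]

# An unqualified reference is fine with USING...
unqualified_rows = conn.execute(
    "SELECT artistid FROM album INNER JOIN artist USING (artistid)"
).fetchall()

# ...but raises an error with ON.
try:
    conn.execute(
        "SELECT artistid FROM album INNER JOIN artist "
        "ON album.artistid = artist.artistid"
    )
    on_error = None
except sqlite3.OperationalError as exc:
    on_error = str(exc)

print(using_columns)
print(on_columns)
print(on_error)
```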
248 | ## Data wrangling
249 | 
250 | ### Anti-joins will return rows from one table that have no match in another table
251 | 
252 | Use anti-joins when you want to return rows from one table that don't have a match in another table.
253 | 
254 | For example, you only want video IDs of content that hasn't been archived.
255 | 
256 | There are multiple ways to do an anti-join:
257 | 
258 | ```SQL 
259 | -- Using a LEFT JOIN:
260 | SELECT 
261 | vc.video_id
262 | FROM video_content AS vc
263 |     LEFT JOIN archive
264 |     ON vc.video_id = archive.video_id
265 | WHERE 1=1
266 | AND archive.video_id IS NULL -- Any rows with no match will have a NULL value.
267 | ;
268 | 
269 | -- Using NOT IN/subquery:
270 | SELECT 
271 | video_id
272 | FROM video_content
273 | WHERE 1=1
274 | AND video_id NOT IN (SELECT video_id FROM archive) -- Be mindful of NULL values.
275 | 
276 | -- Using NOT EXISTS/correlated subquery:
277 | SELECT 
278 | video_id
279 | FROM video_content AS vc
280 | WHERE 1=1
281 | AND NOT EXISTS (
282 |         SELECT 1
283 |         FROM archive AS a
284 |         WHERE a.video_id = vc.video_id
285 |         )
286 | 
287 | ```
288 | 
289 | Note that I advise against using `NOT IN` - see [this tip](#be-aware-of-how-not-in-behaves-with-null-values).
290 | 
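As a quick sanity check, the sketch below runs all three anti-join styles against made-up data using Python's built-in `sqlite3` module. The rows are hypothetical and contain no `NULL` values, which is the only reason `NOT IN` agrees with the other two here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: video 2 has been archived, videos 1 and 3 have not.
conn.executescript("""
CREATE TABLE video_content (video_id INTEGER);
CREATE TABLE archive (video_id INTEGER);
INSERT INTO video_content VALUES (1), (2), (3);
INSERT INTO archive VALUES (2);
""")

anti_joins = {
    "LEFT JOIN": """
        SELECT
        vc.video_id
        FROM video_content AS vc
            LEFT JOIN archive
            ON vc.video_id = archive.video_id
        WHERE 1=1
        AND archive.video_id IS NULL
        ORDER BY vc.video_id
    """,
    "NOT IN": """
        SELECT
        video_id
        FROM video_content
        WHERE 1=1
        AND video_id NOT IN (SELECT video_id FROM archive)
        ORDER BY video_id
    """,
    "NOT EXISTS": """
        SELECT
        vc.video_id
        FROM video_content AS vc
        WHERE 1=1
        AND NOT EXISTS (
                SELECT 1
                FROM archive AS a
                WHERE a.video_id = vc.video_id
                )
        ORDER BY vc.video_id
    """,
}

# Run each variant and collect its rows for comparison.
results = {name: conn.execute(sql).fetchall() for name, sql in anti_joins.items()}
for name, rows in results.items():
    print(name, rows)
```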
291 | -----
292 | ### Use `QUALIFY` to filter window functions
293 | 
294 | `QUALIFY` lets you filter the results of a query based on a window function, so you don't need
295 | an inline view to filter your result set, which reduces the number of lines of code.
296 | 
297 | For example, if I want to return the top 10 markets per product I can use
298 | `QUALIFY` rather than an inline view:
299 | 
300 | ```SQL
301 | -- Using QUALIFY:
302 | SELECT 
303 | product
304 | , market
305 | , SUM(revenue) AS market_revenue 
306 | FROM sales
307 | GROUP BY product, market
308 | QUALIFY DENSE_RANK() OVER (PARTITION BY product ORDER BY SUM(revenue) DESC)  <= 10
309 | ORDER BY product, market_revenue
310 | ;
311 | 
312 | -- Without QUALIFY:
313 | SELECT 
314 | product
315 | , market
316 | , market_revenue 
317 | FROM
318 | (
319 | SELECT 
320 | product
321 | , market
322 | , SUM(revenue) AS market_revenue
323 | , DENSE_RANK() OVER (PARTITION BY product ORDER BY SUM(revenue) DESC) AS market_rank
324 | FROM sales
325 | GROUP BY product, market
326 | )
327 | WHERE market_rank  <= 10
328 | ORDER BY product, market_revenue
329 | ;
330 | ```
331 | 
332 | Unfortunately `QUALIFY` isn't part of standard SQL and is mostly available in data warehouses (for example Snowflake, Amazon Redshift and Google BigQuery), but I had to include it because it's so useful.
333 | 
334 | -----
335 | ### You can (but shouldn't always) `GROUP BY` column position
336 | 
337 | Instead of using the column name, you can `GROUP BY` or `ORDER BY` using the
338 | column position.
339 | 
340 | - This can be useful for ad-hoc/one-off queries, but for production code
341 | you should always refer to a column by its name.
342 | 
343 | ```SQL
344 | SELECT 
345 | dept_no
346 | , SUM(salary) AS dept_salary
347 | FROM employees
348 | GROUP BY 1 -- dept_no is the first column in the SELECT clause.
349 | ORDER BY 2 DESC
350 | ;
351 | ```
352 | 
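A small sketch with Python's built-in `sqlite3` module shows that the positional and named versions return the same result. The salary figures are made up, and this only demonstrates SQLite's behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: two departments.
conn.executescript("""
CREATE TABLE employees (dept_no INTEGER, salary INTEGER);
INSERT INTO employees VALUES (1, 100), (1, 200), (2, 500);
""")

by_position = conn.execute("""
SELECT
dept_no
, SUM(salary) AS dept_salary
FROM employees
GROUP BY 1 -- dept_no is the first column in the SELECT clause.
ORDER BY 2 DESC
""").fetchall()

by_name = conn.execute("""
SELECT
dept_no
, SUM(salary) AS dept_salary
FROM employees
GROUP BY dept_no
ORDER BY dept_salary DESC
""").fetchall()

print(by_position)
print(by_position == by_name)
```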
353 | -----
354 | ### Create a grand total with `GROUP BY ROLLUP`
355 | Creating a grand total (and/or sub-total) row is possible thanks to `GROUP BY ROLLUP`.
356 | 
357 | For example, if you've aggregated employee salaries per department you 
358 | can use `GROUP BY ROLLUP` to create a grand total that applies your aggregate functions as if 
359 | the specified grouping hadn't been applied (thus creating a grand total row).
360 | 
361 | The [Transact-SQL documentation](https://learn.microsoft.com/en-us/sql/t-sql/queries/select-group-by-transact-sql?view=sql-server-ver17) explains `GROUP BY ROLLUP` well:
362 | 
363 | _"Creates a group for each combination of column expressions. In addition, it 'rolls up' the results into subtotals and grand totals. To do this, it moves from right to left decreasing the number of column expressions over which it creates groups and the aggregation(s)."_
364 | 
365 | You may want to apply `COALESCE`, as below, to ensure the total row is labelled as such.
366 | 
367 | ```SQL
368 | SELECT 
369 | COALESCE(CAST(dept_no AS VARCHAR), 'Total') AS department_number -- Cast first if dept_no is numeric, so both arguments share a type.
370 | , SUM(salary) AS dept_salary
371 | FROM employees
372 | GROUP BY ROLLUP(dept_no)
373 | ORDER BY dept_salary -- Be sure to order by this column to ensure the Total appears last/at the bottom of the result set.
374 | ;
375 | ```
376 | 
377 | -----
378 | ### Use `EXCEPT` to find the difference between two tables
379 | 
380 | `EXCEPT` returns rows from the first query's result set that don't appear in the second query's result set.
381 | 
382 | ```SQL
383 | /*
384 | Miles Davis will be returned from
385 | this query
386 | */
387 | SELECT artist_name
388 | FROM artist
389 | WHERE artist_name = 'Miles Davis'
390 | EXCEPT 
391 | SELECT artist_name
392 | FROM artist
393 | WHERE artist_name = 'Nirvana'
394 | ;
395 | 
396 | /*
397 | Nothing will be returned from this
398 | query as 'Miles Davis' appears in
399 | both queries' result sets.
400 | */
401 | SELECT artist_name
402 | FROM artist
403 | WHERE artist_name = 'Miles Davis'
404 | EXCEPT 
405 | SELECT artist_name
406 | FROM artist
407 | WHERE artist_name = 'Miles Davis'
408 | ;
409 | ```
410 | 
411 | You can also utilise `EXCEPT` with `UNION ALL` to verify whether two tables have the same data.
412 | 
413 | If no rows are returned the tables contain the same distinct rows (`EXCEPT` removes duplicates, so differing duplicate counts go unnoticed) - otherwise, what's returned are the rows causing the difference:
414 | 
415 | ```SQL
416 | /* 
417 | The first query will return rows from
418 | employees that aren't present in
419 | department.
420 | 
421 | The second query will return rows from
422 | department that aren't present in employees.
423 | 
424 | The UNION ALL will ensure that the
425 | final result set returned combines
426 | all of these rows so you know
427 | which rows are causing the difference.
428 | */
429 | (
430 | SELECT 
431 | id
432 | , employee_name
433 | FROM employees
434 | EXCEPT 
435 | SELECT 
436 | id
437 | , employee_name
438 | FROM department
439 | )
440 | UNION ALL 
441 | (
442 | SELECT 
443 | id
444 | , employee_name
445 | FROM department
446 | EXCEPT
447 | SELECT 
448 | id
449 | , employee_name
450 | FROM employees
451 | )
452 | ;
453 | 
454 | ```
455 | 
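Here is a minimal sketch of the comparison using Python's built-in `sqlite3` module. Several things here are my assumptions rather than part of the tip: the second table is a hypothetical `employees_copy`, the sample rows are made up, and an `only_in` label column is added to show which side each row came from. SQLite does not accept parenthesised compound queries, so each `EXCEPT` is wrapped in a subquery instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: the two tables disagree on the name for id 2.
conn.executescript("""
CREATE TABLE employees (id INTEGER, employee_name TEXT);
CREATE TABLE employees_copy (id INTEGER, employee_name TEXT);
INSERT INTO employees VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO employees_copy VALUES (1, 'Ann'), (2, 'Robert');
""")

diff_sql = """
SELECT
id
, employee_name
, 'employees' AS only_in
FROM (
    SELECT id, employee_name FROM employees
    EXCEPT
    SELECT id, employee_name FROM employees_copy
)
UNION ALL
SELECT
id
, employee_name
, 'employees_copy' AS only_in
FROM (
    SELECT id, employee_name FROM employees_copy
    EXCEPT
    SELECT id, employee_name FROM employees
)
"""

# Sort in Python so the output order does not depend on the database.
diff_rows = sorted(conn.execute(diff_sql).fetchall())
for row in diff_rows:
    print(row)
```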
456 | ## Performance
457 | 
458 | ### `NOT EXISTS` is faster than `NOT IN` if your column allows `NULL`
459 | 
460 | `NOT IN` is usually slower than using `NOT EXISTS`, if the values/column you're comparing against allows `NULL`.
461 | 
462 | I've experienced this when using Snowflake and the PostgreSQL Wiki explicitly [calls this out](https://wiki.postgresql.org/wiki/Don't_Do_This#Don.27t_use_NOT_IN):
463 | 
464 | *"...NOT IN (SELECT ...) does not optimize very well."*
465 | 
466 | Aside from being slow, using `NOT IN` will not work as intended if there is a `NULL` in the values being compared against - see [this tip](#be-aware-of-how-not-in-behaves-with-null-values).
467 | 
468 | Why include this tip if `NOT IN` doesn't work with `NULL` values anyway?
469 | 
470 | Well, just because a column allows `NULL` values does not mean there **are** any `NULL` values present, and if you're working with a table that you cannot alter you'll want to use `NOT EXISTS` to speed up your query.
471 | 
472 | -----
473 | 
474 | ### Implicit casting will slow down (or break) your query
475 | 
476 | If you specify a value with a different data type than a column's, your database may automatically (implicitly) convert the value.
477 | 
478 | For example, let's say the `video_id` column has a string data type and you specify an integer in the `WHERE` clause:
479 | 
480 | ```SQL
481 | SELECT video_name
482 | FROM video_content 
483 |  -- Behind the scenes the database will implicitly attempt to convert the video_id column to an integer:
484 | WHERE video_id = 200050
485 | ```
486 | 
487 | There are a couple of problems with relying on implicit casting:
488 | 
489 | 1) An error may be thrown if the implicit conversion isn't possible - for example, if one of the video IDs has a string value of _'abc2000'_
490 | 
491 | 2) \*Your query may be slower, due to the additional work of converting each value to the specified data type.
492 | 
493 | Instead, use the same data type as the column you're operating on (`WHERE video_id = '200050'`) or, to avoid errors, use a function like [`TRY_TO_NUMBER`](https://docs.snowflake.com/en/sql-reference/functions/try_to_decimal) that 
494 | will attempt the conversion but handle any errors:
495 | 
496 | ```SQL
497 | SELECT video_name
498 | FROM video_content 
499 |  -- This won't result in an error:
500 | WHERE TRY_TO_NUMBER(video_id) = 200050
501 | ```
502 | 
503 | \* Note that this depends on the size of the dataset being operated on. 
504 | 
505 | ## Common mistakes
506 | 
507 | ### Be aware of how `NOT IN` behaves with `NULL` values
508 | 
509 | `NOT IN` returns no rows if `NULL` is present in the values being checked against. As `NULL` represents Unknown, the SQL engine can't verify that the value being checked is not present in the list.
510 | - Instead use `NOT EXISTS`.
511 | 
512 | ``` SQL
513 | INSERT INTO departments (id)
514 | VALUES (1), (2), (NULL);
515 | 
516 | -- Doesn't work due to NULL:
517 | SELECT * 
518 | FROM employees 
519 | WHERE department_id NOT IN (SELECT DISTINCT id FROM departments)
520 | ;
521 | 
522 | -- Solution:
523 | SELECT * 
524 | FROM employees e
525 | WHERE NOT EXISTS (
526 |     SELECT 1 
527 |     FROM departments d 
528 |     WHERE d.id = e.department_id
529 | )
530 | ;
531 | ```
532 | 
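The behaviour is easy to reproduce. The sketch below uses Python's built-in `sqlite3` module with made-up rows (the `employee_name` column is my addition for readability): Bob's department does not exist in `departments`, yet `NOT IN` returns nothing because of the `NULL`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: Bob's department_id (3) has no matching department.
conn.executescript("""
CREATE TABLE departments (id INTEGER);
CREATE TABLE employees (employee_name TEXT, department_id INTEGER);
INSERT INTO departments (id) VALUES (1), (2), (NULL);
INSERT INTO employees VALUES ('Ann', 1), ('Bob', 3);
""")

# 3 NOT IN (1, 2, NULL) evaluates to NULL rather than TRUE, so Bob is filtered out.
not_in_rows = conn.execute("""
SELECT employee_name
FROM employees
WHERE department_id NOT IN (SELECT DISTINCT id FROM departments)
""").fetchall()

# NOT EXISTS only asks whether a matching row exists, so the NULL is harmless.
not_exists_rows = conn.execute("""
SELECT employee_name
FROM employees e
WHERE NOT EXISTS (
    SELECT 1
    FROM departments d
    WHERE d.id = e.department_id
)
""").fetchall()

print(not_in_rows)
print(not_exists_rows)
```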
533 | -----
534 | ### Avoid ambiguity when naming calculated fields
535 | 
536 | When creating a calculated field, naming it the same as an existing column can lead to unexpected behaviour.
537 | 
538 | Note [Snowflake's documentation](https://docs.snowflake.com/en/sql-reference/constructs/group-by) on the topic:
539 | 
540 | *"If a GROUP BY clause contains a name that matches both a column name and an alias, then the GROUP BY clause uses the column name."*
541 | 
542 | For example, you might expect the following to return 2 rows, but 3 rows are actually returned:
543 | 
544 | ```SQL
545 | CREATE TABLE products (
546 |     product VARCHAR(50) NOT NULL,
547 |     revenue INT NOT NULL
548 | )
549 | ;
550 | 
551 | INSERT INTO products (product, revenue)
552 | VALUES 
553 |     ('Shark', 100),
554 |     ('Robot', 150),
555 |     ('Racecar', 90);
556 | 
557 | SELECT 
558 | LEFT(product, 1) AS product -- Returns the first letter of the product value.
559 | , MAX(revenue) AS max_revenue
560 | FROM products
561 | GROUP BY product
562 | ;
563 | ```
564 | 
565 | |PRODUCT|MAX_REVENUE|
566 | |-------|------------|
567 | |S|100|
568 | |R|150|
569 | |R|90|
570 | 
571 | What's happened is that the `GROUP BY` clause has grouped on the original `product` column rather than the alias, 
572 | so the `LEFT` function has only been applied after grouping and aggregation.
573 | 
574 | The solution is to use a unique alias or be more explicit in the `GROUP BY` clause: 
575 | 
576 | ```SQL
577 | -- Solution option 1:
578 | SELECT 
579 | LEFT(product, 1) AS product_letter
580 | , MAX(revenue) AS max_revenue
581 | FROM products
582 | GROUP BY product_letter
583 | ;
584 | 
585 | -- Solution option 2:
586 | SELECT 
587 | LEFT(product, 1) AS product
588 | , MAX(revenue) AS max_revenue
589 | FROM products
590 | GROUP BY LEFT(product, 1)
591 | ;
592 | ```
593 | 
594 | Result:
595 | 
596 | |PRODUCT_LETTER|MAX_REVENUE|
597 | |--------------|-----------|
598 | |S|100|
599 | |R|150|
600 | 
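The same pitfall can be reproduced locally with Python's built-in `sqlite3` module. This is a sketch under two assumptions of mine: SQLite resolves the ambiguous name in `GROUP BY` to the table column (as the Snowflake documentation quoted above describes for Snowflake), and `substr(product, 1, 1)` stands in for `LEFT(product, 1)`, which SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    product VARCHAR(50) NOT NULL,
    revenue INT NOT NULL
);
INSERT INTO products (product, revenue)
VALUES
    ('Shark', 100),
    ('Robot', 150),
    ('Racecar', 90);
""")

# The alias shares its name with the column, so GROUP BY uses the column.
ambiguous_rows = conn.execute("""
SELECT
substr(product, 1, 1) AS product
, MAX(revenue) AS max_revenue
FROM products
GROUP BY product
""").fetchall()

# Grouping by the expression itself removes the ambiguity.
explicit_rows = conn.execute("""
SELECT
substr(product, 1, 1) AS product
, MAX(revenue) AS max_revenue
FROM products
GROUP BY substr(product, 1, 1)
""").fetchall()

print(len(ambiguous_rows), sorted(ambiguous_rows))
print(len(explicit_rows), sorted(explicit_rows))
```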
601 | 
602 | Assigning an alias to a calculated field can also be problematic when it comes to window functions.
603 | 
604 | In this example the `CASE` statement is being applied AFTER the window function has executed:
605 | 
606 | ```SQL
607 | /*
608 | The window function will rank the 'Robot' product as 1 when it should be 3.
609 | */
610 | SELECT 
611 | product
612 | , CASE product WHEN 'Robot' THEN 0 ELSE revenue END AS revenue
613 | , RANK() OVER (ORDER BY revenue DESC)
614 | FROM products
615 | ;
616 | ```
617 | 
618 | Our earlier solutions apply:
619 | 
620 | ```SQL
621 | /*
622 | Solution option 1 (note this might not work in all RDBMS, in which case use the other solution):
623 | */
624 | SELECT 
625 | product
626 | , CASE product WHEN 'Robot' THEN 0 ELSE revenue END AS updated_revenue
627 | , RANK() OVER (ORDER BY updated_revenue DESC)
628 | FROM products
629 | ;
630 | 
631 | -- Solution option 2:
632 | SELECT 
633 | product
634 | , CASE product WHEN 'Robot' THEN 0 ELSE revenue END AS revenue
635 | , RANK() OVER (ORDER BY CASE product WHEN 'Robot' THEN 0 ELSE revenue END DESC)
636 | FROM products
637 | ;
638 | ```
639 | 
640 | My advice - use a unique alias when possible to avoid confusion.
641 | 
642 | -----
643 | ### Always specify which column belongs to which table
644 | 
645 | When you have complex queries with multiple joins, it pays to be able to 
646 | trace back an issue with a value to its source. 
647 | 
648 | Additionally, your RDBMS might raise an error if two tables share the same
649 | column name and you don't specify which column you are using.
650 | 
651 | ```SQL
652 | SELECT 
653 | vc.video_id
654 | , vc.series_name
655 | , metadata.season
656 | , metadata.episode_number
657 | FROM video_content AS vc 
658 |     INNER JOIN video_metadata AS metadata
659 |     ON vc.video_id = metadata.video_id
660 | ;
661 | ```
662 | 
663 | ## Miscellaneous
664 | 
665 | ### Understand the order of execution
666 | If I had to give one piece of advice to someone learning SQL, it'd be to understand the order of 
667 | execution (of clauses). It will completely change how you write queries. This [blog post](https://blog.jooq.org/a-beginners-guide-to-the-true-order-of-sql-operations/) is a fantastic resource for learning.
668 | 
669 | -----
670 | ### Read the documentation (in full)
671 | Using Snowflake I once needed to return the latest date from a list of columns 
672 | and so I decided to use `GREATEST()`.
673 | 
674 | What I didn't realise was that if one of the
675 | arguments is `NULL` then the function returns `NULL`. 
676 | 
677 | If I'd read the documentation in full I'd have known! In many cases it can take just a minute or less to scan
678 | the documentation and it will save you the headache of having to work
679 | out why something isn't working the way you expected:
680 | 
681 | ```SQL
682 | /*
683 | If I'd read the documentation
684 | further I'd also have realised
685 | that my solution to the NULL
686 | problem with GREATEST()... 
687 | */
688 | 
689 | SELECT COALESCE(GREATEST(signup_date, consumption_date), signup_date, consumption_date);
690 | 
691 | /*
692 | ... could have been solved with the
693 | following function:
694 | */
695 | SELECT GREATEST_IGNORE_NULLS(signup_date, consumption_date);
696 | ```
697 | 
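The `NULL` behaviour and the `COALESCE` workaround can be tried out with Python's built-in `sqlite3` module. Note the substitution: SQLite has no `GREATEST()`, so this sketch uses SQLite's multi-argument `max()`, which likewise returns `NULL` if any argument is `NULL`. The `users` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: the second user has no consumption_date.
conn.executescript("""
CREATE TABLE users (signup_date TEXT, consumption_date TEXT);
INSERT INTO users VALUES
    ('2023-01-01', '2023-02-01'),
    ('2023-03-01', NULL);
""")

rows = conn.execute("""
SELECT
max(signup_date, consumption_date) AS latest_naive
, COALESCE(max(signup_date, consumption_date), signup_date, consumption_date) AS latest_safe
FROM users
ORDER BY signup_date
""").fetchall()

for latest_naive, latest_safe in rows:
    print(latest_naive, latest_safe)
```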
698 | -----
699 | ### Use descriptive names for your saved queries
700 | 
701 | There's almost nothing worse than not being able to find a query you need to re-run/refer back to.
702 | 
703 | Use a descriptive name when saving your queries so you can easily find what you're looking for.
704 | 
705 | I usually include the subject of the query, the date the query was run and the name of the requester (if there is one).
706 | For example: `Lapsed users analysis - 2023-09-01 - Olivia Roberts`
707 | 


--------------------------------------------------------------------------------