├── .gitignore ├── README.md ├── data.sql └── solutions.sql /.gitignore: -------------------------------------------------------------------------------- 1 | # Logs 2 | logs 3 | *.log 4 | npm-debug.log* 5 | 6 | # Runtime data 7 | pids 8 | *.pid 9 | *.seed 10 | 11 | # Directory for instrumented libs generated by jscoverage/JSCover 12 | lib-cov 13 | 14 | # Coverage directory used by tools like istanbul 15 | coverage 16 | 17 | # nyc test coverage 18 | .nyc_output 19 | 20 | # Grunt intermediate storage (http://gruntjs.com/creating-plugins#storing-task-files) 21 | .grunt 22 | 23 | # node-waf configuration 24 | .lock-wscript 25 | 26 | # Compiled binary addons (http://nodejs.org/api/addons.html) 27 | build/Release 28 | 29 | # Dependency directories 30 | node_modules 31 | jspm_packages 32 | 33 | # Optional npm cache directory 34 | .npm 35 | 36 | # Optional REPL history 37 | .node_repl_history 38 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | **Author**: [@tbtommyb](https://github.com/tbtommyb) 2 | 3 | **Maintainer**: TBC 4 | 5 | # PostgreSQL Workshop 6 | 7 | This workshop is designed to build your confidence in querying data using SQL. 8 | 9 | 14 | ## Contents 15 | * [Set up](#set-up) 16 | * [Loading the file](#loading-the-file) 17 | * [psql](#psql) 18 | * [Syntax hints](#syntax-hints) 19 | * [Schema diagrams](#schema-diagrams) 20 | + [Authors](#authors) 21 | + [Books](#books) 22 | + [Publishers](#publishers) 23 | + [Book Authors](#book-authors) 24 | * [The challenges](#the-challenges) 25 | + [Introductory](#introductory) 26 | + [Intermediate](#intermediate) 27 | + [Hard](#hard) 28 | 29 | ## Set up 30 | 31 | We will be working with the dataset in the `data.sql` file. This file contains a set of SQL commands that will create a set of tables and fill them with data. We will connect to the PostgreSQL server running locally on our individual computers and tell it to run the file. 32 | 33 | Make sure that you have correctly installed PostgreSQL according to [these instructions](https://github.com/macintoshhelper/learn-sql/blob/master/postgresql/setup.md). Check that you can connect to your locally-running database by running `psql` from the command line. 34 | 35 | If you run into problems, on a Mac using Homebrew, run `brew services restart postgresql` and try to connect again. On Ubuntu, run `sudo service postgresql restart` and try to connect again. 36 | 37 | ## Loading the file 38 | 39 | Please download the file and navigate in the Terminal to its location. 40 | 41 | Now run the command `psql --file=data.sql` in a new Terminal window/tab. 42 | 43 | If it doesn't work try `psql -f data.sql`. If it still doesn't work then holler. 44 | 45 | This will connect to your PostgreSQL server and run all of the SQL in `data.sql`, setting up our database for us. 46 | 47 | ## psql 48 | 49 | We mentioned above that PostgreSQL using a server-client model. Currently we're running both on the same machine but if we wanted to we could have the server running on a different computer and connect to it via the client. We will cover this soon. 50 | 51 | Now that we're set up, we can connect to our newly-created database by running `psql` (if it gives an 'access denied' error, try `psql -U [your-user-name]`). 52 | 53 | Slightly confusingly, `psql` has its own set of commands that are entirely different from SQL. You can identify them because they start with a backwards slash (\) and **don't** end in a semicolon. 54 | 55 | Once you are in `psql` try some of the following commands: 56 | 57 | `\d` - list all tables (know as 'relations' in psql) 58 | 59 | `\d [table name]` - give information on a given table 60 | 61 | `\l` - list all databases 62 | 63 | ## Syntax hints 64 | 65 | Before we jump into the challenges, here are a few syntax points to be aware of: 66 | 67 | Don't forget to use semicolons at the end of SQL commands. If you hit enter and you just get empty lines this is probably what you're missing. 68 | 69 | In PostgreSQL words in double quotes mean identifiers like the names of tables and columns. Single quotes are used for values. If you do something like: 70 | 71 | `SELECT * FROM authors WHERE first_name = "Sharon"` 72 | 73 | You'll get an error because PostgreSQL thinks "Sharon" is the name of a column. Use single quotes so that PostgreSQL knows it's a value. You can optionally put double quotes around `authors` and `first_name`, but single quotes won't work (because they are the names of identifiers within our database). 74 | 75 | Note that a single equals is used for equality testing, not assignment. 76 | 77 | SQL keywords like `SELECT`, `WHERE` etc can be in upper or lower case. The convention is upper case to distinguish them from identifiers and values but PostgreSQL will understand either way. 78 | 79 | SQL is pretty flexible with whitespace so you can spread your statements out on to as many lines as you want. Keeping things aligned can help make big statements easier to read. Just remember to end with a semicolon! 80 | 81 | ## Schema diagrams 82 | 83 | Here are the schema diagrams to help: 84 | 85 | ### Authors 86 | Column | Type | Modifiers 87 | --- | --- | --- 88 | id | integer | not null default 89 | first_name | character varying(100) | not null 90 | surname | character varying(100) | not null 91 | location | character varying(100) | 92 | 93 | ### Books 94 | 95 | Column | Type | Modifiers 96 | --- | --- | --- 97 | id | integer | not null default 98 | name | character varying(100) | not null 99 | release_date | date | not null 100 | publisher_id | integer | foreign key (publishers.id) 101 | 102 | ### Publishers 103 | 104 | Column | Type | Modifiers 105 | --- | --- | --- 106 | id | integer | not null default 107 | name | character varying(100) | not null 108 | 109 | ### Book Authors 110 | 111 | Column | Type | Modifiers 112 | --- | --- | --- 113 | book_id | integer | foreign key (books.id) 114 | author_id | integer | foreign key (authors.id) 115 | 116 | 117 | ## The challenges 118 | 119 | Please don't feel that you have to get through all of them or be able to answer them all right away! The idea is to introduce you to the kind of queries we do regularly with SQL. 120 | 121 | ### Introductory 122 | 123 | These challenges cover the basics of SQL: selects, joins and conditions. 124 | 125 | #### 1. Find the first name and surname of every author 126 | 127 | ##### Expected result 128 | 129 | first_name | surname 130 | --- | --- 131 | Sharon | Smith 132 | Ted | Burns 133 | Stephen | Wistle 134 | Amanda | Bertwistle 135 | David | Grewal 136 | John | White 137 | Paul | Hallam-Wistle 138 | Paul | Jones 139 | 140 | #### 2. Sort everyone by surname and find the first three 141 | 142 | ##### Expected result 143 | 144 | id | first_name | surname | location 145 | --- | --- | --- | --- 146 | 4 | Amanda | Bertwistle | Nazareth 147 | 2 | Ted | Burns | London 148 | 5 | David | Grewal | 149 | 150 | #### 3. Find everyone who has a location specified 151 | 152 | ##### Expected result 153 | 154 | id | first_name | surname | location 155 | --- | --- | --- | --- 156 | 1 | Sharon | Smith | Nazareth 157 | 2 | Ted | Burns | London 158 | 4 | Amanda | Bertwistle | Nazareth 159 | 6 | John | White | London 160 | 7 | Paul | Hallam-Wistle | London 161 | 8 | Paul | Jones | Nazareth 162 | 163 | #### 4. Find everyone who is not in Nazareth (including nulls) 164 | 165 | ##### Expected result 166 | 167 | id | first_name | surname | location 168 | --- | --- | --- | --- 169 | 2 | Ted | Burns | London 170 | 3 | Stephen | Wistle | 171 | 5 | David | Grewal | 172 | 6 | John | White | London 173 | 7 | Paul | Hallam-Wistle | London 174 | 175 | #### 5. Find everyone with 'Wistle' in their surname (bonus points for case insensitivity) 176 | 177 | ##### Expected result 178 | 179 | id | first_name | surname | location 180 | --- | --- | --- | --- 181 | 3 | Stephen | Wistle | 182 | 4 | Amanda | Bertwistle | Nazareth 183 | 7 | Paul | Hallam-Wistle | London 184 | 185 | #### 6. Find the name of the publisher who released 'Python Made Easy' 186 | 187 | ##### Expected result 188 | 189 | 'No Starch Press' 190 | 191 | #### 7. Find all the books published by 'No Starch Press' 192 | 193 | ##### Expected result 194 | 195 | name | name 196 | --- | --- 197 | No Starch Press | Python Made Easy 198 | No Starch Press | JavaScript: The Really Good Parts 199 | 200 | #### 8. Show a list of every book and their authors, ordered by book name 201 | Note: Only one author per row, so the book's name may need to be repeated. 202 | 203 | ##### Expected result 204 | 205 | name | first_name | surname 206 | --- | --- | --- 207 | C++ | Sharon | Smith 208 | C++ | John | White 209 | C++ | Paul | Jones 210 | C++ | David | Grewal 211 | CSS: Cansei | Amanda | Bertwistle 212 | CSS: Cansei | Paul | Hallam-Wistle 213 | CoffeeScript in Java | Stephen | Wistle 214 | CoffeeScript in Java | Paul | Hallam-Wistle 215 | Elm Street | David | Grewal 216 | Elm Street | John | White 217 | Elm Street | Sharon | Smith 218 | Java in Japanese | Ted | Burns 219 | Java in Japanese | Amanda | Bertwistle 220 | Java in Japanese | Stephen | Wistle 221 | Java in Japanese | Paul | Jones 222 | Java in Japanese | David | Grewal 223 | JavaScript: The Really Good Parts | Stephen | Wistle 224 | JavaScript: The Really Good Parts | David | Grewal 225 | Python Made Easy | David | Grewal 226 | Python Made Easy | Sharon | Smith 227 | Python Made Easy | Amanda | Bertwistle 228 | Ruby Gems | Paul | Hallam-Wistle 229 | Ruby Gems | Ted | Burns 230 | SQL: Part 2 | Sharon | Smith 231 | Swift in 10 Days | Stephen | Wistle 232 | Swift in 10 Days | David | Grewal 233 | 234 | #### 9. Find all the books that Ted Burns authored 235 | 236 | ##### Expected result 237 | 238 | 'Java in Japanese' and 'Ruby Gems' 239 | 240 | ### Intermediate 241 | 242 | These slightly trickier challenges will require you to use [aggregate functions](https://www.postgresql.org/docs/9.6/static/tutorial-agg.html) and/or [subqueries](https://www.tutorialspoint.com/postgresql/postgresql_sub_queries.htm). 243 | 244 | #### 10. Find everyone who wrote at least three books 245 | 246 | ##### Expected result 247 | 248 | first_name | surname 249 | --- | --- 250 | Paul | Hallam-Wistle 251 | David | Grewal 252 | Sharon | Smith 253 | Amanda | Bertwistle 254 | Stephen | Wistle 255 | 256 | #### 11. Order the publishers by the number of books they have published. 257 | 258 | ##### Expected result 259 | 260 | name | count 261 | --- | --- 262 | McGraw-Hill | 4 263 | The Big Publishing House | 3 264 | No Starch Press | 2 265 | Mega Corp Ltd | 1 266 | 267 | #### 12. Find all books released after 1st Jan 1996, ordered by the number of people who wrote them. 268 | 269 | ##### Expected result 270 | 271 | name | count 272 | --- | --- 273 | Java in Japanese | 5 274 | C++ | 4 275 | Elm Street | 3 276 | Swift in 10 Days | 2 277 | CoffeeScript in Java | 2 278 | Ruby Gems | 2 279 | 280 | #### 13. What's the highest number of authors per book? The lowest? 281 | 282 | ##### Expected results 283 | 284 | Highest: 'Java in Japanese' (5 authors) 285 | 286 | Lowest: 'SQL: Part 2' (1 author) 287 | 288 | #### 14. Who wrote the most books? How many did they write? 289 | 290 | ##### Expected result 291 | 292 | David Grewal, 6 293 | 294 | ### Hard 295 | 296 | Doing these is not required! Only look at these if you have time at the end. 297 | 298 | * I forgot to make a primary key for books_authors table. Alter the table to create a new column to contain a primary key made up of 'book_id' and 'author_id'. 299 | 300 | * What's the average number of authors per book? (Answer: 2.6) 301 | 302 | * Show every author who has only written for one publisher. (Answer: Ted Burns) 303 | 304 | * Which location has the higher figure for books per author? (Answer: Nazareth) 305 | 306 | * Let's say you are the first developer at a new start up called 'Amazonia'. Your boss asks you to modify the database so that customers can add books to their shopping carts. What tables and associations would you need? 307 | -------------------------------------------------------------------------------- /data.sql: -------------------------------------------------------------------------------- 1 | BEGIN; 2 | 3 | DROP TABLE IF EXISTS authors, books, publishers, book_authors CASCADE; 4 | 5 | CREATE TABLE authors ( 6 | id serial PRIMARY KEY, 7 | first_name varchar(100) NOT NULL, 8 | surname varchar(100) NOT NULL, 9 | location varchar(100) 10 | ); 11 | 12 | CREATE TABLE publishers ( 13 | id serial PRIMARY KEY, 14 | name varchar(100) NOT NULL 15 | ); 16 | 17 | CREATE TABLE books ( 18 | id serial PRIMARY KEY, 19 | name varchar(100) NOT NULL, 20 | release_date date NOT NULL, 21 | publisher_id integer REFERENCES publishers(id) ON UPDATE CASCADE 22 | ); 23 | 24 | CREATE TABLE book_authors ( 25 | book_id integer REFERENCES books(id) ON UPDATE CASCADE, 26 | author_id integer REFERENCES authors(id) ON UPDATE CASCADE 27 | ); 28 | 29 | INSERT INTO authors(first_name, surname, location) VALUES 30 | ('Sharon', 'Smith', 'Nazareth'), 31 | ('Ted', 'Burns', 'London'), 32 | ('Stephen', 'Wistle', NULL), 33 | ('Amanda', 'Bertwistle', 'Nazareth'), 34 | ('David', 'Grewal', NULL), 35 | ('John', 'White', 'London'), 36 | ('Paul', 'Hallam-Wistle', 'London'), 37 | ('Paul', 'Jones', 'Nazareth') 38 | RETURNING ID; 39 | 40 | INSERT INTO publishers(name) VALUES 41 | ('The Big Publishing House'), 42 | ('McGraw-Hill'), 43 | ('No Starch Press'), 44 | ('Mega Corp Ltd') 45 | RETURNING ID; 46 | 47 | INSERT INTO books(name, release_date, publisher_id) VALUES 48 | ('Python Made Easy', '26-Jan-94', 3), 49 | ('SQL: Part 2', '01-Jun-79', 1), 50 | ('JavaScript: The Really Good Parts', '18-Sep-95', 3), 51 | ('Java in Japanese', '23-Jan-96', 2), 52 | ('Elm Street', '01-Apr-12', 4), 53 | ('CSS: Cansei', '10-Oct-94', 1), 54 | ('Ruby Gems', '25-Dec-96', 2), 55 | ('C++', '06-Jul-17', 1), 56 | ('CoffeeScript in Java', '24-Dec-09', 2), 57 | ('Swift in 10 Days', '02-Jun-14', 2) 58 | RETURNING ID; 59 | 60 | /* We are hard-coding ID values because we know the tables will be empty and can start from 1. 61 | Don't do this! In Real Life you would write a script to build up the relations instead. */ 62 | INSERT INTO book_authors(book_id, author_id) VALUES 63 | (9, 7), 64 | (4, 4), 65 | (10, 3), 66 | (7, 2), 67 | (1, 1), 68 | (4, 8), 69 | (8, 8), 70 | (6, 4), 71 | (5, 6), 72 | (1, 5), 73 | (2, 1), 74 | (6, 7), 75 | (3, 5), 76 | (4, 3), 77 | (10, 5), 78 | (8, 6), 79 | (8, 1), 80 | (4, 5), 81 | (3, 3), 82 | (5, 1), 83 | (4, 2), 84 | (7, 7), 85 | (1, 4), 86 | (8, 5), 87 | (9, 3), 88 | (5, 5); 89 | 90 | COMMIT; 91 | -------------------------------------------------------------------------------- /solutions.sql: -------------------------------------------------------------------------------- 1 | /* 1. Find the first name and surname of every author. */ 2 | 3 | SELECT first_name, surname FROM authors; 4 | 5 | /* 2. Sort everyone by surname and find the first three */ 6 | 7 | SELECT * FROM authors ORDER BY surname LIMIT 3; 8 | 9 | /* 3. Find everyone who has a location specified */ 10 | 11 | SELECT * FROM authors WHERE location IS NOT NULL; 12 | 13 | /* 4. Find everyone who is not in Nazareth (including nulls) */ 14 | 15 | SELECT * FROM authors 16 | WHERE location <> 'Nazareth' 17 | OR location IS NULL; 18 | 19 | /* 5. Find everyone with 'Wistle' in their surname. */ 20 | 21 | SELECT * FROM authors WHERE surname LIKE '%Wistle%'; 22 | 23 | /* Case insensitive with Postgres-specific syntax */ 24 | SELECT * FROM authors WHERE surname ILIKE '%Wistle%'; 25 | /* Case insensitive with standard syntax */ 26 | SELECT * FROM authors WHERE LOWER(surname) LIKE LOWER('%Wistle%'); 27 | 28 | /* 6. Find the name of the publisher who released 'Python Made Easy'. */ 29 | 30 | SELECT publishers.name FROM publishers 31 | INNER JOIN books 32 | ON books.publisher_id = publishers.id 33 | WHERE books.name = 'Python Made Easy'; 34 | 35 | /* 7. Find all the books published by 'No Starch Press'. */ 36 | 37 | SELECT publishers.name, books.name FROM publishers 38 | INNER JOIN books 39 | ON books.publisher_id = publishers.id 40 | WHERE publishers.name = 'No Starch Press'; 41 | 42 | /* 8. Show a list of every book and their authors (one row per author). */ 43 | 44 | SELECT books.name, authors.first_name, authors.surname FROM books 45 | INNER JOIN book_authors 46 | ON book_authors.book_id = books.id 47 | INNER JOIN authors 48 | ON book_authors.author_id = authors.id 49 | ORDER BY books.name; 50 | 51 | /* 9. Find all the books authored by Ted Burns. */ 52 | 53 | SELECT books.name FROM books 54 | INNER JOIN book_authors 55 | ON book_authors.book_id = books.id 56 | INNER JOIN authors 57 | ON book_authors.author_id = authors.id 58 | WHERE authors.first_name = 'Ted' 59 | AND authors.surname = 'Burns'; 60 | 61 | /* 10. Find everyone who wrote at least three books */ 62 | 63 | SELECT authors.first_name, authors.surname FROM authors 64 | INNER JOIN book_authors 65 | ON book_authors.author_id = authors.id 66 | GROUP BY authors.id 67 | HAVING COUNT(book_authors.author_id) > 2; 68 | 69 | /* 11. Order the publishers by the number of books they have published. */ 70 | 71 | SELECT publishers.name, COUNT(books.publisher_id) FROM publishers 72 | INNER JOIN books 73 | ON books.publisher_id = publishers.id 74 | GROUP BY publishers.name 75 | ORDER BY COUNT(books.publisher_id) DESC; 76 | 77 | /* 12. Find all languages released after 1st Jan 1996, 78 | ordered by the number of people who wrote them. */ 79 | 80 | SELECT books.name, COUNT(book_authors.book_id) FROM books 81 | INNER JOIN book_authors 82 | ON books.id = book_authors.book_id 83 | WHERE books.release_date > '01-Jan-96' 84 | GROUP BY books.name 85 | ORDER BY COUNT(book_authors.book_id) DESC; 86 | 87 | /* 13.1 What's the highest number of authors per book? */ 88 | 89 | SELECT books.name, COUNT(book_authors.book_id) FROM books 90 | INNER JOIN book_authors 91 | ON books.id = book_authors.book_id 92 | GROUP BY books.name 93 | ORDER BY COUNT(book_authors.book_id) DESC LIMIT 1; 94 | 95 | /* 13.2 What's the lowest number of authors per book? */ 96 | 97 | SELECT books.name, COUNT(book_authors.book_id) FROM books 98 | INNER JOIN book_authors 99 | ON books.id = book_authors.book_id 100 | GROUP BY books.name 101 | ORDER BY COUNT(book_authors.book_id) LIMIT 1; 102 | 103 | /* 14. Who wrote the most books? How many did they write? */ 104 | 105 | SELECT authors.first_name, authors.surname, COUNT(book_authors.author_id) AS total FROM authors 106 | INNER JOIN book_authors 107 | ON authors.id = book_authors.author_id 108 | GROUP BY authors.first_name, authors.surname 109 | ORDER BY total DESC LIMIT 1; 110 | 111 | -- I forgot to make a primary key for books_authors table. Alter the table to 112 | -- create a new column to contain a primary key made up of 'book_id' and 'author_id'. 113 | 114 | ALTER TABLE book_authors ADD book_author_id VARCHAR(20); 115 | UPDATE book_authors SET book_author_id = (Cast(book_id AS VARCHAR(20)) || '-' || Cast(author_id AS VARCHAR(20))); 116 | ALTER TABLE book_authors ADD PRIMARY KEY (book_author_id); 117 | 118 | 119 | -- What's the average number of authors per book? 120 | 121 | SELECT AVG(count) 122 | FROM (SELECT book_authors.book_id, COUNT(book_authors.author_id) 123 | FROM book_authors 124 | GROUP BY book_authors.book_id) AS avg; 125 | 126 | -- Show every author who has only written for one publisher. 127 | 128 | SELECT authors.first_name, authors.surname, authors.id 129 | FROM authors 130 | INNER JOIN book_authors 131 | ON authors.id = book_authors.author_id 132 | INNER JOIN books 133 | ON books.id = book_authors.book_id 134 | INNER JOIN publishers 135 | ON publishers.id = books.publisher_id 136 | GROUP BY authors.first_name, authors.surname, authors.id 137 | HAVING COUNT(DISTINCT books.publisher_id) = 1; 138 | 139 | 140 | -- Which location has the higher figure for books per author? 141 | 142 | SELECT location, AVG(books) FROM (SELECT authors.location, authors.first_name, authors.surname, COUNT(book_authors.book_id) AS books 143 | FROM authors 144 | INNER JOIN book_authors 145 | ON authors.id = book_authors.author_id 146 | GROUP BY authors.location, authors.first_name, authors.surname 147 | ORDER BY authors.location) AS average 148 | GROUP BY location 149 | ORDER BY avg DESC; 150 | 151 | 152 | -- Let's say you are the first developer at a new start up called 'Amazonia'. 153 | -- Your boss asks you to modify the database so that customers can add books to their shopping carts. 154 | -- What tables and associations would you need? 155 | --------------------------------------------------------------------------------