├── pcs.png ├── screenshot.png ├── README.md └── SQLslides.ipynb /pcs.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/caocscar/intro-to-SQL-1/master/pcs.png -------------------------------------------------------------------------------- /screenshot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/caocscar/intro-to-SQL-1/master/screenshot.png -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # intro-to-SQL 2 | Introduction to SQL Workshop 3 | 4 | The best way to view the slidedeck is to use the link: http://nbviewer.jupyter.org/format/slides/github/kaitcorn/intro-to-SQL/blob/master/SQLslides.ipynb#/ 5 | -------------------------------------------------------------------------------- /SQLslides.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": { 7 | "slideshow": { 8 | "slide_type": "skip" 9 | } 10 | }, 11 | "outputs": [ 12 | { 13 | "name": "stdout", 14 | "output_type": "stream", 15 | "text": [ 16 | "Populating the interactive namespace from numpy and matplotlib\n" 17 | ] 18 | } 19 | ], 20 | "source": [ 21 | "%pylab inline" 22 | ] 23 | }, 24 | { 25 | "cell_type": "markdown", 26 | "metadata": { 27 | "slideshow": { 28 | "slide_type": "slide" 29 | } 30 | }, 31 | "source": [ 32 | "
\n", 33 | "# Introduction to SQL\n", 34 | "\n", 35 | "By Kaitlin Cornwell, Maggie Orton, and Alex Cao \n", 36 | "\n", 37 | "\n", 38 | "October 6, 2017 \n", 39 | "\n", 40 | "CSCAR at The University of Michigan" 41 | ] 42 | }, 43 | { 44 | "cell_type": "markdown", 45 | "metadata": { 46 | "slideshow": { 47 | "slide_type": "fragment" 48 | } 49 | }, 50 | "source": [ 51 | "Please fill out the workshop sign-in here\n" 52 | ] 53 | }, 54 | { 55 | "cell_type": "markdown", 56 | "metadata": { 57 | "slideshow": { 58 | "slide_type": "fragment" 59 | } 60 | }, 61 | "source": [ 62 | "We'll practice SQL using the W3Schools online database here\n", 63 | "\n", 64 | "Hint: It may be easier to open multiple tabs. One tab can be used to complete the exercises while the other can be used to view the data." 65 | ] 66 | }, 67 | { 68 | "cell_type": "markdown", 69 | "metadata": { 70 | "slideshow": { 71 | "slide_type": "fragment" 72 | } 73 | }, 74 | "source": [ 75 | "Structured Query Language (\"SQL\") allows you to extract or change specific information in a relational database (i.e. a series of tables). " 76 | ] 77 | }, 78 | { 79 | "cell_type": "markdown", 80 | "metadata": { 81 | "slideshow": { 82 | "slide_type": "fragment" 83 | } 84 | }, 85 | "source": [ 86 | " MySQL, SQLite, PostgreSQL, SQL Server, etc. are all database management systems that rely on SQL. Each has its own special variety of SQL, but the general format of the queries is the same. " 87 | ] 88 | }, 89 | { 90 | "cell_type": "markdown", 91 | "metadata": { 92 | "slideshow": { 93 | "slide_type": "slide" 94 | } 95 | }, 96 | "source": [ 97 | "# SQL Queries\n", 98 | "\n", 99 | "## Format\n", 100 | "Series of commands followed by argument(s)\n", 101 | "\n", 102 | "When dealing with multiple tables, identify a column name with tablename.columnname\n", 103 | "\n", 104 | "End query with semicolon for multiple consecutive queries\n", 105 | "\n", 106 | "## Style + Readability\n", 107 | "Table names and column names can be case sensitive. 
Depends on the database.\n", 108 | "\n", 109 | "SQL keywords are not case sensitive\n", 110 | "\n", 111 | "Standard format: Commands in all-caps\n", 112 | "\n", 113 | "Each command set on new line\n", 114 | "\n", 115 | "## Comments\n", 116 | "Single-line comments:\n", 117 | "\n", 118 | "-- begin line with two hyphens\n", 119 | "\n", 120 | "Multi-line comments:\n", 121 | "\n", 122 | "/\\* enclose comment\n", 123 | "in asterisk|slashes \\*/" 124 | ] 125 | }, 126 | { 127 | "cell_type": "markdown", 128 | "metadata": { 129 | "slideshow": { 130 | "slide_type": "slide" 131 | } 132 | }, 133 | "source": [ 134 | "# SELECT, FROM, AS\n", 135 | "The SELECT command specifies the desired columns \n", 136 | "\n", 137 | "The FROM command specifies the table from which those columns should be selected\n", 138 | "\n", 139 | "The AS command temporarily renames a column or table with the specified name (an \"alias\")\n", 140 | "\n", 141 | "An '\\*' character selects all columns in a table" 142 | ] 143 | }, 144 | { 145 | "cell_type": "markdown", 146 | "metadata": { 147 | "slideshow": { 148 | "slide_type": "fragment" 149 | } 150 | }, 151 | "source": [ 152 | "## Example\n", 153 | "To retrieve all columns from this table, you would use the command: \n", 154 | "\n", 155 | " SELECT * \n", 156 | " FROM Employees" 157 | ] 158 | }, 159 | { 160 | "cell_type": "markdown", 161 | "metadata": { 162 | "slideshow": { 163 | "slide_type": "fragment" 164 | } 165 | }, 166 | "source": [ 167 | "To retrieve only the name-related columns along with the corresponding birthdays from this table, you would use the command:\n", 168 | "\n", 169 | " SELECT LastName, FirstName, BirthDate \n", 170 | " FROM Employees" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": { 176 | "slideshow": { 177 | "slide_type": "fragment" 178 | } 179 | }, 180 | "source": [ 181 | "To do the above while renaming the columns as \"last name\" and \"first\":\n", 182 | "\n", 183 | " SELECT LastName AS 'Last Name', 
FirstName AS First, BirthDate AS [Birth Date] \n", 184 | " FROM Employees" 185 | ] 186 | }, 187 | { 188 | "cell_type": "markdown", 189 | "metadata": { 190 | "slideshow": { 191 | "slide_type": "subslide" 192 | } 193 | }, 194 | "source": [ 195 | "## Practice 1\n", 196 | "In the Products table, retrieve all columns except Unit. Rename the ProductName and Price columns to \"Item Name\" and \"Dollars\". \n", 197 | "\n", 198 | "Include a comment in your solution." 199 | ] 200 | }, 201 | { 202 | "cell_type": "markdown", 203 | "metadata": { 204 | "slideshow": { 205 | "slide_type": "notes" 206 | } 207 | }, 208 | "source": [ 209 | "### Answers\n", 210 | "SELECT ProductID, ProductName AS 'Item Name', SupplierID, CategoryID, Price AS Dollars\n", 211 | "FROM Products\n", 212 | "\n", 213 | "or\n", 214 | "\n", 215 | "SELECT ProductID, ProductName AS [Item Name], SupplierID, CategoryID, Price AS Dollars\n", 216 | "FROM Products" 217 | ] 218 | }, 219 | { 220 | "cell_type": "markdown", 221 | "metadata": { 222 | "collapsed": true, 223 | "slideshow": { 224 | "slide_type": "slide" 225 | } 226 | }, 227 | "source": [ 228 | "# COUNT\n", 229 | "COUNT(column) returns the number of non-null values in the specified column; COUNT(*) returns the number of rows. " 230 | ] 231 | }, 232 | { 233 | "cell_type": "markdown", 234 | "metadata": { 235 | "slideshow": { 236 | "slide_type": "fragment" 237 | } 238 | }, 239 | "source": [ 240 | "## Example\n", 241 | "To count the number of rows in the Employees table:\n", 242 | "\n", 243 | " SELECT COUNT(*) \n", 244 | " FROM Employees" 245 | ] 246 | }, 247 | { 248 | "cell_type": "markdown", 249 | "metadata": { 250 | "slideshow": { 251 | "slide_type": "subslide" 252 | } 253 | }, 254 | "source": [ 255 | "## Practice 2\n", 256 | "Count the number of rows in the OrderDetails table." 
257 | ] 258 | }, 259 | { 260 | "cell_type": "markdown", 261 | "metadata": { 262 | "slideshow": { 263 | "slide_type": "notes" 264 | } 265 | }, 266 | "source": [ 267 | "### Answers\n", 268 | "SELECT COUNT(*) FROM OrderDetails" 269 | ] 270 | }, 271 | { 272 | "cell_type": "markdown", 273 | "metadata": { 274 | "collapsed": true, 275 | "slideshow": { 276 | "slide_type": "slide" 277 | } 278 | }, 279 | "source": [ 280 | "# WHERE, AND, OR\n", 281 | "The WHERE command retrieves rows that satisfy a given condition\n", 282 | "\n", 283 | "Simple conditions:\n", 284 | "- = Equals \n", 285 | "- != or <> Does not equal\n", 286 | "- \\> or < Is greater/less than\n", 287 | "- \\>= or <= Is greater/less than or equal to\n", 288 | "\n", 289 | "Additional conditions covered in following slides:\n", 290 | "- a AND b\n", 291 | "- a OR b\n", 292 | "- BETWEEN a AND b\n", 293 | "- IN ('a','b','c')\n", 294 | "- LIKE 'a'\n", 295 | "\n", 296 | "BETWEEN, IN, and LIKE can all be modified using the \"NOT\" keyword to retrieve rows that are not a match\n", 297 | " " 298 | ] 299 | }, 300 | { 301 | "cell_type": "markdown", 302 | "metadata": { 303 | "slideshow": { 304 | "slide_type": "fragment" 305 | } 306 | }, 307 | "source": [ 308 | "## Example\n", 309 | "Using the 'Employees' table again, suppose you want to retrieve all employee information for those born before 1960:\n", 310 | " \n", 311 | " SELECT * \n", 312 | " FROM Employees \n", 313 | " WHERE BirthDate < '1960-01-01'\n", 314 | " \n", 315 | "Note: BirthDate is of type string instead of date. This command works because the date is specified in the correct order: 'YYYY-MM-DD'." 
316 | ] 317 | }, 318 | { 319 | "cell_type": "markdown", 320 | "metadata": { 321 | "slideshow": { 322 | "slide_type": "fragment" 323 | } 324 | }, 325 | "source": [ 326 | "Filtering out everyone whose last name is King:\n", 327 | " \n", 328 | " SELECT * \n", 329 | " FROM Employees \n", 330 | " WHERE LastName != 'King'" 331 | ] 332 | }, 333 | { 334 | "cell_type": "markdown", 335 | "metadata": { 336 | "slideshow": { 337 | "slide_type": "fragment" 338 | } 339 | }, 340 | "source": [ 341 | "Applying both filters: \n", 342 | "\n", 343 | " SELECT * \n", 344 | " FROM Employees \n", 345 | " WHERE Lastname != 'King' AND BirthDate < '1960-01-01'" 346 | ] 347 | }, 348 | { 349 | "cell_type": "markdown", 350 | "metadata": { 351 | "slideshow": { 352 | "slide_type": "fragment" 353 | } 354 | }, 355 | "source": [ 356 | "Applying either filter:\n", 357 | "\n", 358 | " SELECT * \n", 359 | " FROM Employees \n", 360 | " WHERE Lastname != 'King' OR BirthDate < '1960-01-01'" 361 | ] 362 | }, 363 | { 364 | "cell_type": "markdown", 365 | "metadata": { 366 | "slideshow": { 367 | "slide_type": "subslide" 368 | } 369 | }, 370 | "source": [ 371 | "## Practice 3\n", 372 | "\n", 373 | "Retrieve a list from the Orders table of orders starting in 1997 and with shipper IDs of at least 2, or orders that were shipped before 1997 and were sent from employee 3." 
374 | ] 375 | }, 376 | { 377 | "cell_type": "markdown", 378 | "metadata": { 379 | "slideshow": { 380 | "slide_type": "notes" 381 | } 382 | }, 383 | "source": [ 384 | "### Answers\n", 385 | "SELECT * FROM Orders \n", 386 | "WHERE (OrderDate >= '1997-01-01' AND ShipperID >= 2) OR (OrderDate < '1997-01-01' AND EmployeeID = 3)" 387 | ] 388 | }, 389 | { 390 | "cell_type": "markdown", 391 | "metadata": { 392 | "collapsed": true, 393 | "slideshow": { 394 | "slide_type": "slide" 395 | } 396 | }, 397 | "source": [ 398 | "# BETWEEN, IN, and LIKE (wildcards)\n", 399 | "\n", 400 | "## Between\n", 401 | "\n", 402 | "The BETWEEN command gives a range of values (inclusive)\n", 403 | "\n", 404 | "BETWEEN can be modified using the \"NOT\" keyword to retrieve rows that are not a match" 405 | ] 406 | }, 407 | { 408 | "cell_type": "markdown", 409 | "metadata": { 410 | "slideshow": { 411 | "slide_type": "fragment" 412 | } 413 | }, 414 | "source": [ 415 | "## Example\n", 416 | "To retrieve orders placed in January of 1997:\n", 417 | " \n", 418 | " SELECT * \n", 419 | " FROM Orders \n", 420 | " WHERE OrderDate \n", 421 | " BETWEEN '1997-01-01' AND '1997-01-31'" 422 | ] 423 | }, 424 | { 425 | "cell_type": "markdown", 426 | "metadata": { 427 | "collapsed": true, 428 | "slideshow": { 429 | "slide_type": "fragment" 430 | } 431 | }, 432 | "source": [ 433 | "## IN\n", 434 | "The IN command gives multiple possible matching values" 435 | ] 436 | }, 437 | { 438 | "cell_type": "markdown", 439 | "metadata": { 440 | "slideshow": { 441 | "slide_type": "fragment" 442 | } 443 | }, 444 | "source": [ 445 | "## Example\n", 446 | "\n", 447 | "Retrieving Employee IDs for employees not named Nancy or Andrew:\n", 448 | "\n", 449 | " SELECT EmployeeID FROM Employees WHERE FirstName NOT IN ('Nancy', 'Andrew') " 450 | ] 451 | }, 452 | { 453 | "cell_type": "markdown", 454 | "metadata": { 455 | "slideshow": { 456 | "slide_type": "fragment" 457 | } 458 | }, 459 | "source": [ 460 | "## LIKE, wildcards\n", 461 | 
"The LIKE command specifies a pattern to match, such as 'Nancy' or '1999'. \n", 462 | "\n", 463 | "### Wildcards\n", 464 | "\n", 465 | "Wildcards are characters that stand in for a range of possible values. \n", 466 | "\n", 467 | "Wildcards:\n", 468 | " \n", 469 | " % A string of 0+ characters\n", 470 | " \n", 471 | " _ A single character\n", 472 | " \n", 473 | " [...] A single character from the range or list in the brackets\n", 474 | " \n", 475 | " [^...], [!...] A single character not from the range or list in the brackets\n", 476 | "\n", 477 | "Without any wildcards, LIKE will match only values equal to the exact pattern.\n", 478 | "\n", 479 | "Note: The negative wildcard statement is not supported in the version of SQL currently being used. Instead, you can practice the example here." 480 | ] 481 | }, 482 | { 483 | "cell_type": "markdown", 484 | "metadata": { 485 | "slideshow": { 486 | "slide_type": "fragment" 487 | } 488 | }, 489 | "source": [ 490 | "## Example\n", 491 | "To match all customers with names starting with M, we use the query:\n", 492 | "\n", 493 | " SELECT * FROM Customers \n", 494 | " WHERE CustomerName Like 'M%'" 495 | ] 496 | }, 497 | { 498 | "cell_type": "markdown", 499 | "metadata": { 500 | "slideshow": { 501 | "slide_type": "fragment" 502 | } 503 | }, 504 | "source": [ 505 | "To match all customers with names starting with letters after D, we can use the query:\n", 506 | "\n", 507 | " SELECT * FROM Customers \n", 508 | " WHERE CustomerName Like '[!a-d]%'" 509 | ] 510 | }, 511 | { 512 | "cell_type": "markdown", 513 | "metadata": { 514 | "slideshow": { 515 | "slide_type": "subslide" 516 | } 517 | }, 518 | "source": [ 519 | "## Practice 4\n", 520 | "1\\. From the Suppliers table, retreive the SupplierName and ContactName for suppliers with IDs between 3 and 13 and who are located in Japan or Germany.\n", 521 | "\n", 522 | "2\\. 
From the Customers table, retrieve all information about customers from countries with two- or three-letter names\n", 523 | "\n", 524 | "3\\. From the Customers table, retrieve the names of customers whose contact has a last name starting with S" 525 | ] 526 | }, 527 | { 528 | "cell_type": "markdown", 529 | "metadata": { 530 | "slideshow": { 531 | "slide_type": "notes" 532 | } 533 | }, 534 | "source": [ 535 | "### Answers\n", 536 | "1. SELECT SupplierName, ContactName FROM Suppliers WHERE SupplierID BETWEEN 3 AND 13 AND Country IN ('Japan','Germany') (5)\n", 537 | "2. SELECT * FROM Customers WHERE Country LIKE '___' OR Country LIKE '__' \\(13)\n", 538 | "3. SELECT CustomerName FROM Customers WHERE ContactName LIKE '% S%' (7)" 539 | ] 540 | }, 541 | { 542 | "cell_type": "markdown", 543 | "metadata": { 544 | "slideshow": { 545 | "slide_type": "slide" 546 | } 547 | }, 548 | "source": [ 549 | "# DISTINCT\n", 550 | "The DISTINCT command retrieves only distinct combinations of the specified columns.\n", 551 | "\n", 552 | "It is also commonly used in combination with COUNT to return the number of distinct combinations." 
553 | ] 554 | }, 555 | { 556 | "cell_type": "markdown", 557 | "metadata": { 558 | "slideshow": { 559 | "slide_type": "fragment" 560 | } 561 | }, 562 | "source": [ 563 | "## Example\n", 564 | "To retrieve all past combinations of employee IDs and shipper IDs:\n", 565 | " \n", 566 | " SELECT DISTINCT EmployeeID, ShipperID FROM Orders" 567 | ] 568 | }, 569 | { 570 | "cell_type": "markdown", 571 | "metadata": { 572 | "slideshow": { 573 | "slide_type": "fragment" 574 | } 575 | }, 576 | "source": [ 577 | "The version of SQL that is currently being used does not support the following syntax, but other versions may:\n", 578 | "\n", 579 | "To count the distinct combinations of employee IDs and shipper IDs:\n", 580 | "\n", 581 | " SELECT COUNT(DISTINCT EmployeeID, ShipperID) FROM Orders" 582 | ] 583 | }, 584 | { 585 | "cell_type": "markdown", 586 | "metadata": { 587 | "slideshow": { 588 | "slide_type": "subslide" 589 | } 590 | }, 591 | "source": [ 592 | "## Practice 5\n", 593 | "Retrieve all distinct combinations of ProductID and Quantity from the OrderDetails table." 594 | ] 595 | }, 596 | { 597 | "cell_type": "markdown", 598 | "metadata": { 599 | "slideshow": { 600 | "slide_type": "notes" 601 | } 602 | }, 603 | "source": [ 604 | "### Answers\n", 605 | "SELECT DISTINCT ProductID, Quantity FROM OrderDetails" 606 | ] 607 | }, 608 | { 609 | "cell_type": "markdown", 610 | "metadata": { 611 | "collapsed": true, 612 | "slideshow": { 613 | "slide_type": "slide" 614 | } 615 | }, 616 | "source": [ 617 | "# ORDER BY, TOP, LIMIT\n", 618 | "\n", 619 | "## Order By\n", 620 | "\n", 621 | "ORDER BY sorts the results according to the specified columns. Default is ascending, but you can use ASC/DESC to specify the ordering." 
622 | ] 623 | }, 624 | { 625 | "cell_type": "markdown", 626 | "metadata": { 627 | "slideshow": { 628 | "slide_type": "fragment" 629 | } 630 | }, 631 | "source": [ 632 | "## Example\n", 633 | "To retrieve products ordered by price, highest first:\n", 634 | "\n", 635 | " SELECT ProductName, Price FROM Products ORDER BY Price DESC" 636 | ] 637 | }, 638 | { 639 | "cell_type": "markdown", 640 | "metadata": { 641 | "slideshow": { 642 | "slide_type": "subslide" 643 | } 644 | }, 645 | "source": [ 646 | "## TOP, LIMIT\n", 647 | "TOP and LIMIT (syntax depends on type of database) retrieve only the first x results or percent of results; useful for previewing results before running a very large query, or in combination with the ORDER BY command.\n", 648 | "\n", 649 | "LIMIT is supported in MySQL, PostgreSQL, and SQLite. TOP is used in SQL Server and MS Access.\n", 650 | "\n", 651 | "TOP is not supported in the program currently used." 652 | ] 653 | }, 654 | { 655 | "cell_type": "markdown", 656 | "metadata": { 657 | "slideshow": { 658 | "slide_type": "fragment" 659 | } 660 | }, 661 | "source": [ 662 | "### Example\n", 663 | "\n", 664 | "To retrieve the ten most expensive products:\n", 665 | " \n", 666 | " SELECT TOP 10 ProductName, Price FROM Products ORDER BY Price DESC\n", 667 | " \n", 668 | " SELECT ProductName, Price FROM Products ORDER BY Price DESC LIMIT 10" 669 | ] 670 | }, 671 | { 672 | "cell_type": "markdown", 673 | "metadata": { 674 | "collapsed": true, 675 | "slideshow": { 676 | "slide_type": "fragment" 677 | } 678 | }, 679 | "source": [ 680 | "To retrieve the top ten percent of products by price:\n", 681 | "\n", 682 | " SELECT TOP 10 PERCENT ProductName, Price FROM Products ORDER BY Price DESC" 683 | ] 684 | }, 685 | { 686 | "cell_type": "markdown", 687 | "metadata": { 688 | "slideshow": { 689 | "slide_type": "fragment" 690 | } 691 | }, 692 | "source": [ 693 | "To retrieve rows not at the beginning or the end (e.g. 
to retrieve the 20 records starting at record 11):\n", 694 | "\n", 695 | " SELECT * FROM [OrderDetails] LIMIT 10, 20" 696 | ] 697 | }, 698 | { 699 | "cell_type": "markdown", 700 | "metadata": { 701 | "slideshow": { 702 | "slide_type": "subslide" 703 | } 704 | }, 705 | "source": [ 706 | "## Practice 6\n", 707 | "From the Customers table, order the customers by city (ascending) and then by country (descending). Return rows 15 - 30." 708 | ] 709 | }, 710 | { 711 | "cell_type": "markdown", 712 | "metadata": { 713 | "slideshow": { 714 | "slide_type": "notes" 715 | } 716 | }, 717 | "source": [ 718 | "### Answers\n", 719 | "SELECT * FROM Customers ORDER BY City ASC, Country DESC LIMIT 14,16" 720 | ] 721 | }, 722 | { 723 | "cell_type": "markdown", 724 | "metadata": { 725 | "slideshow": { 726 | "slide_type": "slide" 727 | } 728 | }, 729 | "source": [ 730 | "# JOIN\n", 731 | "JOIN connects two tables where the specified columns match\n", 732 | "\n", 733 | "\n", 734 | "INNER JOIN - all rows where specified columns match (default)\n", 735 | "\n", 736 | "LEFT JOIN - all rows from left table and matching rows in right table\n", 737 | "\n", 738 | "RIGHT JOIN - all rows from right table and matching rows in left table\n", 739 | "\n", 740 | "FULL OUTER JOIN - all rows from left and right table\n", 741 | "\n", 742 | "\n", 743 | "Image excerpt from Pandas Cheat Sheet" 744 | ] 745 | }, 746 | { 747 | "cell_type": "markdown", 748 | "metadata": { 749 | "slideshow": { 750 | "slide_type": "fragment" 751 | } 752 | }, 753 | "source": [ 754 | "## Example\n", 755 | "To retrieve all past combinations of customer names and employee IDs:\n", 756 | "\n", 757 | " SELECT CustomerName, EmployeeID, Orders.OrderID FROM Orders\n", 758 | " JOIN Customers\n", 759 | " ON Customers.CustomerID=Orders.CustomerID" 760 | ] 761 | }, 762 | { 763 | "cell_type": "markdown", 764 | "metadata": { 765 | "slideshow": { 766 | "slide_type": "fragment" 767 | } 768 | }, 769 | "source": [ 770 | "To retrieve all past 
combinations of customer names and employee last names:\n", 771 | "\n", 772 | " SELECT CustomerName, LastName, Orders.OrderID FROM Orders\n", 773 | " JOIN Employees\n", 774 | " ON Employees.EmployeeID=Orders.EmployeeID\n", 775 | " JOIN Customers\n", 776 | " ON Customers.CustomerID=Orders.CustomerID" 777 | ] 778 | }, 779 | { 780 | "cell_type": "markdown", 781 | "metadata": { 782 | "slideshow": { 783 | "slide_type": "subslide" 784 | } 785 | }, 786 | "source": [ 787 | "### Primary keys\n", 788 | "A primary key is a field that uniquely identifies each record in a table. Each table can have only one primary key, but multiple columns can define a primary key (called a composite primary key).\n", 789 | "\n", 790 | "### Foreign Keys\n", 791 | "A foreign key is a field in a table that matches the primary key in a different table.\n", 792 | "The table with the foreign key is called the child table, while the table containing the primary key is called the parent table.\n", 793 | "\n", 794 | "Note: Foreign and primary keys are not required." 795 | ] 796 | }, 797 | { 798 | "cell_type": "markdown", 799 | "metadata": { 800 | "slideshow": { 801 | "slide_type": "fragment" 802 | } 803 | }, 804 | "source": [ 805 | "The database we are using does not support primary or foreign keys. You can see examples using primary keys and learn how to set them here, and you can see examples using foreign keys here." 806 | ] 807 | }, 808 | { 809 | "cell_type": "markdown", 810 | "metadata": { 811 | "slideshow": { 812 | "slide_type": "fragment" 813 | } 814 | }, 815 | "source": [ 816 | "### Example\n", 817 | "In the Customers table, CustomerID could be set as a primary key.\n", 818 | "In the Orders table, OrderID could be set as a primary key while CustomerID would be a foreign key." 
819 | ] 820 | }, 821 | { 822 | "cell_type": "markdown", 823 | "metadata": { 824 | "slideshow": { 825 | "slide_type": "subslide" 826 | } 827 | }, 828 | "source": [ 829 | "## Practice 7\n", 830 | "Return the Products table with a column for the Supplier Name (found in the Suppliers table) included." 831 | ] 832 | }, 833 | { 834 | "cell_type": "markdown", 835 | "metadata": { 836 | "slideshow": { 837 | "slide_type": "notes" 838 | } 839 | }, 840 | "source": [ 841 | "### Answers\n", 842 | "SELECT ProductID, ProductName, Products.SupplierID,\tCategoryID, Unit, Price, SupplierName\n", 843 | "FROM Products\n", 844 | "JOIN Suppliers\n", 845 | "ON Products.SupplierID = Suppliers.SupplierID" 846 | ] 847 | }, 848 | { 849 | "cell_type": "markdown", 850 | "metadata": { 851 | "slideshow": { 852 | "slide_type": "slide" 853 | } 854 | }, 855 | "source": [ 856 | "# GROUP BY, HAVING\n", 857 | "\n", 858 | "## Group By\n", 859 | "GROUP BY groups the rows according to their values in the selected column and then uses an \"aggregate function\" to create a new column with information about each group (of rows). 
\n", 860 | "\n", 861 | "Example aggregate functions:\n", 862 | "- COUNT()\n", 863 | "- SUM()\n", 864 | "- MAX()\n", 865 | "- MIN()\n", 866 | "- AVG()" 867 | ] 868 | }, 869 | { 870 | "cell_type": "markdown", 871 | "metadata": { 872 | "slideshow": { 873 | "slide_type": "fragment" 874 | } 875 | }, 876 | "source": [ 877 | "## Example\n", 878 | "To retrieve the total number of each product ordered:\n", 879 | " \n", 880 | " SELECT ProductID, SUM(Quantity) \n", 881 | " FROM OrderDetails\n", 882 | " GROUP BY ProductID" 883 | ] 884 | }, 885 | { 886 | "cell_type": "markdown", 887 | "metadata": { 888 | "slideshow": { 889 | "slide_type": "subslide" 890 | } 891 | }, 892 | "source": [ 893 | "## HAVING\n", 894 | "The HAVING command acts like a WHERE command for GROUP BY" 895 | ] 896 | }, 897 | { 898 | "cell_type": "markdown", 899 | "metadata": { 900 | "slideshow": { 901 | "slide_type": "fragment" 902 | } 903 | }, 904 | "source": [ 905 | "## Example\n", 906 | "To retrieve employees with more than fifteen orders in the Orders table:\n", 907 | " \n", 908 | " SELECT EmployeeID, COUNT(*) \n", 909 | " FROM ORDERS\n", 910 | " GROUP BY EmployeeID\n", 911 | " HAVING COUNT(*) > 15" 912 | ] 913 | }, 914 | { 915 | "cell_type": "markdown", 916 | "metadata": { 917 | "slideshow": { 918 | "slide_type": "subslide" 919 | } 920 | }, 921 | "source": [ 922 | "## Practice 8\n", 923 | "Using the OrderDetails table, retrieve product IDs for products with at least 100 total units ordered." 
924 | ] 925 | }, 926 | { 927 | "cell_type": "markdown", 928 | "metadata": { 929 | "slideshow": { 930 | "slide_type": "notes" 931 | } 932 | }, 933 | "source": [ 934 | "### Answers\n", 935 | "SELECT ProductID, SUM(Quantity) \n", 936 | "FROM OrderDetails \n", 937 | "GROUP BY ProductID \n", 938 | "HAVING SUM(Quantity) >= 100" 939 | ] 940 | }, 941 | { 942 | "cell_type": "markdown", 943 | "metadata": { 944 | "slideshow": { 945 | "slide_type": "slide" 946 | } 947 | }, 948 | "source": [ 949 | "# UNION\n", 950 | "UNION combines the results of 2+ SELECT queries.\n", 951 | "\n", 952 | "Requirements: same number, type, and order of columns\n", 953 | "\n", 954 | "Default is only distinct results; use UNION ALL for all results" 955 | ] 956 | }, 957 | { 958 | "cell_type": "markdown", 959 | "metadata": { 960 | "slideshow": { 961 | "slide_type": "fragment" 962 | } 963 | }, 964 | "source": [ 965 | "## Example\n", 966 | "To retrieve a combined list of the distinct cities that have suppliers or customers:\n", 967 | "\n", 968 | " SELECT City from Customers\n", 969 | " UNION\n", 970 | " SELECT City from Suppliers\n", 971 | " ORDER BY City" 972 | ] 973 | }, 974 | { 975 | "cell_type": "markdown", 976 | "metadata": { 977 | "slideshow": { 978 | "slide_type": "fragment" 979 | } 980 | }, 981 | "source": [ 982 | "To retrieve the same combined list with duplicates kept (one row per supplier or customer): \n", 983 | "\n", 984 | " SELECT City from Customers \n", 985 | " UNION ALL \n", 986 | " SELECT City from Suppliers \n", 987 | " ORDER BY City " 988 | ] 989 | }, 990 | { 991 | "cell_type": "markdown", 992 | "metadata": { 993 | "slideshow": { 994 | "slide_type": "subslide" 995 | } 996 | }, 997 | "source": [ 998 | "## Practice 9\n", 999 | "Using the Suppliers and Customers tables, retrieve a combined list of the countries and cities that contain suppliers or customers." 
1000 | ] 1001 | }, 1002 | { 1003 | "cell_type": "markdown", 1004 | "metadata": { 1005 | "slideshow": { 1006 | "slide_type": "notes" 1007 | } 1008 | }, 1009 | "source": [ 1010 | "### Answers\n", 1011 | "SELECT Country, City from Customers \n", 1012 | "UNION \n", 1013 | "SELECT Country, City from Suppliers \n", 1014 | "ORDER BY City " 1015 | ] 1016 | }, 1017 | { 1018 | "cell_type": "markdown", 1019 | "metadata": { 1020 | "slideshow": { 1021 | "slide_type": "slide" 1022 | } 1023 | }, 1024 | "source": [ 1025 | "# Combining commands\n", 1026 | "\n", 1027 | "You can combine as many commands as you like to make your queries as specific as you require. Pay careful attention to the order of the commands." 1028 | ] 1029 | }, 1030 | { 1031 | "cell_type": "markdown", 1032 | "metadata": { 1033 | "slideshow": { 1034 | "slide_type": "fragment" 1035 | } 1036 | }, 1037 | "source": [ 1038 | "## Command order of operations\n", 1039 | "SELECT (TOP) \n", 1040 | "FROM \n", 1041 | "WHERE (BETWEEN | LIKE | IN) \n", 1042 | "GROUP BY \n", 1043 | "HAVING \n", 1044 | "ORDER BY (ASC | DESC) \n", 1045 | "LIMIT \n", 1046 | "UNION (ALL) " 1047 | ] 1048 | }, 1049 | { 1050 | "cell_type": "markdown", 1051 | "metadata": { 1052 | "slideshow": { 1053 | "slide_type": "subslide" 1054 | } 1055 | }, 1056 | "source": [ 1057 | "## Practice 10\n", 1058 | "Retrieve a table with two columns: the customer's name, and the total number of orders made by that customer. 
" 1059 | ] 1060 | }, 1061 | { 1062 | "cell_type": "markdown", 1063 | "metadata": { 1064 | "slideshow": { 1065 | "slide_type": "notes" 1066 | } 1067 | }, 1068 | "source": [ 1069 | "### Answers\n", 1070 | "SELECT CustomerName, COUNT(Orders.CustomerID)\n", 1071 | "FROM Customers\n", 1072 | "JOIN Orders\n", 1073 | "ON Customers.CustomerID = Orders.CustomerID\n", 1074 | "GROUP BY Customers.CustomerID" 1075 | ] 1076 | }, 1077 | { 1078 | "cell_type": "markdown", 1079 | "metadata": { 1080 | "slideshow": { 1081 | "slide_type": "slide" 1082 | } 1083 | }, 1084 | "source": [ 1085 | "# Quiz\n", 1086 | "Test your SQL knowledge here:\n", 1087 | "https://goo.gl/forms/jgPcRXjX5QzDsQqE3" 1088 | ] 1089 | }, 1090 | { 1091 | "cell_type": "markdown", 1092 | "metadata": { 1093 | "slideshow": { 1094 | "slide_type": "slide" 1095 | } 1096 | }, 1097 | "source": [ 1098 | "# Spatial Data in Databases\n", 1099 | "\n", 1100 | "If you are working with spatial data (GPS or any geometry that has spatial points), you SHOULD be using a spatial database. \n", 1101 | "\n", 1102 | "Most modern databases have a spatial extension plugin. This will make your life easier and allow you to do more complicated queries." 1103 | ] 1104 | }, 1105 | { 1106 | "cell_type": "markdown", 1107 | "metadata": { 1108 | "slideshow": { 1109 | "slide_type": "slide" 1110 | } 1111 | }, 1112 | "source": [ 1113 | "# Food For Thought\n", 1114 | "\"Drawing\"" 1115 | ] 1116 | }, 1117 | { 1118 | "cell_type": "markdown", 1119 | "metadata": { 1120 | "slideshow": { 1121 | "slide_type": "slide" 1122 | } 1123 | }, 1124 | "source": [ 1125 | "# More practice\n", 1126 | "1. Return a Table with the following 3 columns: CustomerID, CustomerName, Number of Orders (using the Customers and Orders tables)\n", 1127 | "\n", 1128 | " Have the table sorted by Number of Orders with the most orders at the top. Only include repeat customers\n", 1129 | "\n", 1130 | "2. 
Return a Table with the following 3 columns: ProductID, ProductName, Qty Sold (using the Products and OrderDetails tables)\n", 1131 | "\n", 1132 | " Return only the top 10 products in descending order based on quantity sold.\n", 1133 | "\n", 1134 | "3. Find out who your top 5 customers are based on how much money they've spent." 1135 | ] 1136 | }, 1137 | { 1138 | "cell_type": "markdown", 1139 | "metadata": { 1140 | "slideshow": { 1141 | "slide_type": "notes" 1142 | } 1143 | }, 1144 | "source": [ 1145 | "# Answers\n", 1146 | "1. \n", 1147 | "SELECT Customers.CustomerID, CustomerName, COUNT(OrderID)\n", 1148 | "FROM Customers\n", 1149 | "JOIN Orders\n", 1150 | "ON Customers.CustomerID = Orders.CustomerID\n", 1151 | "GROUP BY Customers.CustomerID\n", 1152 | "HAVING COUNT(OrderID) > 1\n", 1153 | "ORDER BY COUNT(OrderID) DESC\n", 1154 | "\n", 1155 | "2. \n", 1156 | "SELECT Products.ProductID, ProductName, SUM(Quantity)\n", 1157 | "FROM Products\n", 1158 | "JOIN OrderDetails\n", 1159 | "ON Products.ProductID = OrderDetails.ProductID\n", 1160 | "GROUP BY Products.ProductID\n", 1161 | "ORDER BY SUM(Quantity) DESC\n", 1162 | "LIMIT 10\n", 1163 | "\n", 1164 | "3. \n", 1165 | "SELECT CustomerName\n", 1166 | "FROM Customers\n", 1167 | "JOIN Orders\n", 1168 | "ON Customers.CustomerID = Orders.CustomerID\n", 1169 | "JOIN OrderDetails\n", 1170 | "ON Orders.OrderID = OrderDetails.OrderId\n", 1171 | "JOIN Products\n", 1172 | "ON OrderDetails.ProductID = Products.ProductID\n", 1173 | "GROUP BY Customers.CustomerID\n", 1174 | "ORDER BY SUM(Quantity*Price) DESC\n", 1175 | "LIMIT 5" 1176 | ] 1177 | }, 1178 | { 1179 | "cell_type": "markdown", 1180 | "metadata": { 1181 | "slideshow": { 1182 | "slide_type": "notes" 1183 | } 1184 | }, 1185 | "source": [ 1186 | "# Creating a table\n", 1187 | "You can create a new table in the database and initialize it with empty columns." 
1188 | ] 1189 | }, 1190 | { 1191 | "cell_type": "markdown", 1192 | "metadata": { 1193 | "slideshow": { 1194 | "slide_type": "notes" 1195 | } 1196 | }, 1197 | "source": [ 1198 | "## Example\n", 1199 | " CREATE TABLE table_name (\n", 1200 | " column1 datatype,\n", 1201 | " column2 datatype,\n", 1202 | " column3 datatype,\n", 1203 | " ....\n", 1204 | " );" 1205 | ] 1206 | }, 1207 | { 1208 | "cell_type": "markdown", 1209 | "metadata": { 1210 | "slideshow": { 1211 | "slide_type": "notes" 1212 | } 1213 | }, 1214 | "source": [ 1215 | "## Practice" 1216 | ] 1217 | }, 1218 | { 1219 | "cell_type": "markdown", 1220 | "metadata": { 1221 | "slideshow": { 1222 | "slide_type": "notes" 1223 | } 1224 | }, 1225 | "source": [ 1226 | "### Answers" 1227 | ] 1228 | }, 1229 | { 1230 | "cell_type": "markdown", 1231 | "metadata": { 1232 | "slideshow": { 1233 | "slide_type": "notes" 1234 | } 1235 | }, 1236 | "source": [ 1237 | "# Primary keys\n", 1238 | "A primary key is an identifier unique to that particular record. Each table can have only one primary key." 1239 | ] 1240 | }, 1241 | { 1242 | "cell_type": "markdown", 1243 | "metadata": { 1244 | "slideshow": { 1245 | "slide_type": "notes" 1246 | } 1247 | }, 1248 | "source": [ 1249 | "## Example\n", 1250 | "While creating a table:\n", 1251 | "\n", 1252 | " CREATE TABLE Persons (\n", 1253 | " ID int NOT NULL,\n", 1254 | " LastName varchar(255) NOT NULL,\n", 1255 | " FirstName varchar(255),\n", 1256 | " Age int,\n", 1257 | " PRIMARY KEY (ID)\n", 1258 | " )\n", 1259 | "\n", 1260 | "Adding to existing table:\n", 1261 | "\n", 1262 | " ALTER TABLE Persons\n", 1263 | " ADD PRIMARY KEY (ID);\n", 1264 | "\n", 1265 | "The format varies slightly here depending on the type of server. 
\n" 1266 | ] 1267 | }, 1268 | { 1269 | "cell_type": "markdown", 1270 | "metadata": { 1271 | "slideshow": { 1272 | "slide_type": "notes" 1273 | } 1274 | }, 1275 | "source": [ 1276 | "## Practice" 1277 | ] 1278 | }, 1279 | { 1280 | "cell_type": "markdown", 1281 | "metadata": { 1282 | "slideshow": { 1283 | "slide_type": "notes" 1284 | } 1285 | }, 1286 | "source": [ 1287 | "### Answers" 1288 | ] 1289 | } 1290 | ], 1291 | "metadata": { 1292 | "anaconda-cloud": {}, 1293 | "celltoolbar": "Slideshow", 1294 | "kernelspec": { 1295 | "display_name": "Python 3", 1296 | "language": "python", 1297 | "name": "python3" 1298 | }, 1299 | "language_info": { 1300 | "codemirror_mode": { 1301 | "name": "ipython", 1302 | "version": 3 1303 | }, 1304 | "file_extension": ".py", 1305 | "mimetype": "text/x-python", 1306 | "name": "python", 1307 | "nbconvert_exporter": "python", 1308 | "pygments_lexer": "ipython3", 1309 | "version": "3.6.1" 1310 | } 1311 | }, 1312 | "nbformat": 4, 1313 | "nbformat_minor": 1 1314 | } 1315 | --------------------------------------------------------------------------------