├── styles.css ├── db7-cover.jpg ├── Ch14_Indexing ├── 14.3a.png ├── 14.3b.png ├── 14.3c.png ├── 14.13_answer.png ├── 14.4a_insert8.jpg ├── 14.4a_insert9.jpg ├── 14.4b_insert8.jpg ├── 14.4b_insert9.png ├── 14.4c_insert8.png ├── 14.4c_insert9.png ├── 14.6_answer.jpg ├── 14.4a_delete19.png ├── 14.4a_delete23.png ├── 14.4a_insert10.jpg ├── 14.4b_delete19.png ├── 14.4b_delete23.png ├── 14.4b_insert10.png ├── 14.4c_delete19.png ├── 14.4c_delete23.png ├── 14.4c_insert10.png ├── 14.13_bitmap_index.png ├── 14.13_bitmap_intersection.png ├── 14.13_bitmap_index_on_deptname.png ├── 14.5.md ├── 14.16.md ├── 14.17.md ├── 14.23.md ├── 14.19.md ├── 14.2.md ├── 14.6.md ├── 14.10.md ├── 14.14.md ├── 14.11.md ├── 14.15.md ├── 14.12.md ├── 14.7.md ├── 14.24.md ├── 14.1.md ├── 14.26.md ├── 14.21.md ├── 14.3.md └── 14.25.md ├── Ch05_Advanced_SQL ├── Figure_5.21.png ├── Figure_5.22.png ├── 5.14.md ├── 5.11.md ├── 5.20.md ├── 5.9.md ├── 5.17.md ├── 5.10.md ├── 5.19.md ├── 5.8.md ├── 5.22.md ├── 5.13.md ├── 5.7.md ├── 5.4.md ├── 5.2.md ├── 5.1.md ├── 5.23.md ├── 5.15.md └── 5.16.md ├── Ch08_Complex_Data_Types ├── 8.1d.jpg ├── Fig8.7.png ├── Fig8.8.png ├── Fig8.9.png ├── Solution_8.7.png ├── 8.14.md ├── 8.7.md ├── 8.15.md ├── 8.10.md ├── 8.11.md └── 8.13.md ├── Ch11_Data_Analytics ├── SolutionOf11.2.png ├── SolutionOf11.5.jpg ├── 11.11.md ├── 11.12.md ├── 11.2.md ├── 11.4.md ├── 11.10.md ├── 11.7.md ├── 11.8.md └── 11.5.md ├── Ch15_Query_Processing ├── figure15.14.png ├── takes_schema.png ├── student_schema.png ├── algo_of_ex_15.13.png ├── figure_for_15.17.png ├── indexed_nl_join_next.png ├── indexed_nl_join_open.png ├── indexed_nl_join_close.png ├── semijoin_using_sorting.jpg ├── sub_operators_of_hash_join.png ├── 15.9.md ├── 15.11.md ├── 15.5.md ├── 15.2.md ├── 15.19.md ├── 15.14.md ├── 15.8.md ├── 15.15.md └── 15.7.md ├── Ch12_Physical_Storage_Systems ├── Fig12.4.png ├── 12.9.md ├── 12.13.md ├── 12.4.md ├── 12.8.md ├── 12.7.md ├── 12.5.md ├── 12.2.md ├── 12.11.md └── 12.10.md ├── 
Ch09_Application_Development ├── 9.13_final.png ├── 9.13_compile.png ├── 9.13_inital.png ├── 9.14_request1.png ├── 9.14_request2.png ├── 9.14_request3.png ├── 9.14_request4.png ├── 9.14_request5.png ├── 9.14_submit_5.png ├── thousand_stars.png ├── 9.14_inital_screen.png ├── 9.14_start_server.png ├── 9.14_install_fastapi.png ├── 9.14_redis_ping_pong.png ├── running_your_server.png ├── 9.14_installing_the_redispy_module.png ├── 9.5.md ├── 9.15.md ├── 9.23.md ├── 9.26.md ├── 9.6.md ├── 9.1.md ├── 9.3.md ├── 9.10.md ├── 9.19.md ├── 9.2.md ├── 9.18.md ├── 9.21.md ├── 9.8.md ├── 9.17.md └── 9.7.md ├── Ch07_Relational_Database_Design ├── Figure7.17.png ├── Figure7.18.png ├── 7.28.md ├── 7.38.md ├── 7.21.md ├── 7.20.md ├── 7.22.md ├── 7.37.md ├── 7.24.md ├── 7.7.md ├── 7.26.md ├── 7.27.md ├── 7.17.md ├── 7.44.md ├── 7.5.md ├── 7.43.md ├── 7.1.md ├── 7.41.md ├── 7.15.md ├── 7.35.md ├── 7.2.md ├── 7.29.md ├── 7.9.md ├── 7.10.md ├── 7.4.md ├── 7.3.md ├── 7.14.md ├── 7.23.md ├── 7.31.md └── 7.39.md ├── Ch13_Data_Storage_Structures ├── Figure_13_101.png ├── Figure_13_102.png ├── Figure_13_103.png ├── Index_metadata.png ├── 13.10.md ├── 13.12.md ├── 13.5.md ├── 13.2.md ├── 13.13.md ├── 13.6.md ├── 13.8.md └── 13.1.md ├── Ch06_Database_Design_Using_the_ER_Model ├── Figure_6.101.jpg ├── Figure_6.102.png ├── Figure_6.103.png ├── Figure_6.104.png ├── Figure_6.105.png ├── Figure_6.106.png ├── Figure_6.29.png ├── Figure_6.30.png ├── solution_for_6.15.png ├── solution_for_6.16.png ├── solution_for_6.21.jpg ├── solution_for_6.22.png ├── solution_for_6.23.png ├── solution_for_6.24.png ├── solution_for_6.29.png ├── solution_for_6.21_b.jpg ├── pic_in_exercise_6.12.png ├── generated_relations_for_6.13b.png ├── 6.16.md ├── 6.19.md ├── 6.28.md ├── 6.27.md ├── 6.9.md ├── 6.26.md ├── 6.14.md ├── 6.7.md ├── 6.17.md ├── 6.3.md ├── 6.24.md ├── 6.12.md ├── 6.25.md ├── 6.10.md ├── 6.4.md ├── 6.8.md ├── 6.21.md ├── 6.15.md └── 6.5.md ├── .gitignore ├── Ch02_Introduction_to_the_Relational_Model ├── 
schema_diagram_2_13.png ├── 2.16.md ├── 2.1.md ├── 2.10.md ├── 2.13.md ├── 2.4.md ├── 2.7.md ├── 2.11.md ├── 2.3.md ├── 2.6.md ├── 2.17.md ├── 2.2.md ├── 2.15.md ├── 2.5.md └── 2.12.md ├── index.qmd ├── Ch03_Introduction_to_SQL ├── 3.19.md ├── 3.25.md ├── 3.24.md ├── 3.6.md ├── 3.34.md ├── 3.26.md ├── 3.27.md ├── 3.22.md ├── 3.29.md ├── 3.28.md ├── 3.33.md ├── 3.35.md ├── 3.31.md ├── 3.7.md ├── 3.23.md ├── 3.17.md ├── 3.32.md ├── 3.3.md ├── 3.14.md └── 3.18.md ├── Ch04_Intermediate_SQL ├── 4.21.md ├── 4.20.md ├── 4.16.md ├── 4.24.md ├── 4.23.md ├── 4.25.md ├── 4.11.md ├── 4.19.md ├── 4.26.md ├── 4.12.md ├── 4.1.md ├── 4.17.md ├── 4.14.md ├── 4.10.md ├── 4.22.md ├── 4.18.md ├── 4.13.md ├── 4.4.md ├── 4.9.md └── 4.7.md ├── Ch01_Introduction ├── 1.11.md ├── 1.1.md ├── 1.15.md ├── 1.8.md ├── 1.6.md ├── 1.10.md ├── 1.5.md ├── 1.14.md ├── 1.9.md ├── 1.13.md ├── 1.3.md ├── 1.7.md └── 1.12.md ├── Ch10_Big_Data ├── 10.12.md ├── 10.1.md ├── 10.3.md ├── 10.15.md ├── 10.16.md ├── 10.5.md ├── 10.11.md ├── 10.14.md ├── 10.9.md ├── 10.6.md └── 10.8.md ├── _quarto.yml ├── LICENSE └── README.md /styles.css: -------------------------------------------------------------------------------- 1 | /* css styles */ 2 | -------------------------------------------------------------------------------- /db7-cover.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/db7-cover.jpg -------------------------------------------------------------------------------- /Ch14_Indexing/14.3a.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.3a.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.3b.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.3b.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.3c.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.3c.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.13_answer.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.13_answer.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.4a_insert8.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4a_insert8.jpg -------------------------------------------------------------------------------- /Ch14_Indexing/14.4a_insert9.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4a_insert9.jpg -------------------------------------------------------------------------------- /Ch14_Indexing/14.4b_insert8.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4b_insert8.jpg -------------------------------------------------------------------------------- /Ch14_Indexing/14.4b_insert9.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4b_insert9.png 
-------------------------------------------------------------------------------- /Ch14_Indexing/14.4c_insert8.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4c_insert8.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.4c_insert9.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4c_insert9.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.6_answer.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.6_answer.jpg -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/Figure_5.21.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch05_Advanced_SQL/Figure_5.21.png -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/Figure_5.22.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch05_Advanced_SQL/Figure_5.22.png -------------------------------------------------------------------------------- /Ch08_Complex_Data_Types/8.1d.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch08_Complex_Data_Types/8.1d.jpg -------------------------------------------------------------------------------- /Ch14_Indexing/14.4a_delete19.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4a_delete19.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.4a_delete23.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4a_delete23.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.4a_insert10.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4a_insert10.jpg -------------------------------------------------------------------------------- /Ch14_Indexing/14.4b_delete19.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4b_delete19.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.4b_delete23.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4b_delete23.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.4b_insert10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4b_insert10.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.4c_delete19.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4c_delete19.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.4c_delete23.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4c_delete23.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.4c_insert10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.4c_insert10.png -------------------------------------------------------------------------------- /Ch08_Complex_Data_Types/Fig8.7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch08_Complex_Data_Types/Fig8.7.png -------------------------------------------------------------------------------- /Ch08_Complex_Data_Types/Fig8.8.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch08_Complex_Data_Types/Fig8.8.png -------------------------------------------------------------------------------- /Ch08_Complex_Data_Types/Fig8.9.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch08_Complex_Data_Types/Fig8.9.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.13_bitmap_index.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.13_bitmap_index.png 
-------------------------------------------------------------------------------- /Ch11_Data_Analytics/SolutionOf11.2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch11_Data_Analytics/SolutionOf11.2.png -------------------------------------------------------------------------------- /Ch11_Data_Analytics/SolutionOf11.5.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch11_Data_Analytics/SolutionOf11.5.jpg -------------------------------------------------------------------------------- /Ch15_Query_Processing/figure15.14.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch15_Query_Processing/figure15.14.png -------------------------------------------------------------------------------- /Ch15_Query_Processing/takes_schema.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch15_Query_Processing/takes_schema.png -------------------------------------------------------------------------------- /Ch08_Complex_Data_Types/Solution_8.7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch08_Complex_Data_Types/Solution_8.7.png -------------------------------------------------------------------------------- /Ch12_Physical_Storage_Systems/Fig12.4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch12_Physical_Storage_Systems/Fig12.4.png 
-------------------------------------------------------------------------------- /Ch15_Query_Processing/student_schema.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch15_Query_Processing/student_schema.png -------------------------------------------------------------------------------- /Ch09_Application_Development/9.13_final.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.13_final.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.13_bitmap_intersection.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.13_bitmap_intersection.png -------------------------------------------------------------------------------- /Ch15_Query_Processing/algo_of_ex_15.13.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch15_Query_Processing/algo_of_ex_15.13.png -------------------------------------------------------------------------------- /Ch15_Query_Processing/figure_for_15.17.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch15_Query_Processing/figure_for_15.17.png -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/Figure7.17.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch07_Relational_Database_Design/Figure7.17.png 
-------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/Figure7.18.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch07_Relational_Database_Design/Figure7.18.png -------------------------------------------------------------------------------- /Ch09_Application_Development/9.13_compile.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.13_compile.png -------------------------------------------------------------------------------- /Ch09_Application_Development/9.13_inital.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.13_inital.png -------------------------------------------------------------------------------- /Ch09_Application_Development/9.14_request1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.14_request1.png -------------------------------------------------------------------------------- /Ch09_Application_Development/9.14_request2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.14_request2.png -------------------------------------------------------------------------------- /Ch09_Application_Development/9.14_request3.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.14_request3.png -------------------------------------------------------------------------------- /Ch09_Application_Development/9.14_request4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.14_request4.png -------------------------------------------------------------------------------- /Ch09_Application_Development/9.14_request5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.14_request5.png -------------------------------------------------------------------------------- /Ch09_Application_Development/9.14_submit_5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.14_submit_5.png -------------------------------------------------------------------------------- /Ch13_Data_Storage_Structures/Figure_13_101.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch13_Data_Storage_Structures/Figure_13_101.png -------------------------------------------------------------------------------- /Ch13_Data_Storage_Structures/Figure_13_102.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch13_Data_Storage_Structures/Figure_13_102.png -------------------------------------------------------------------------------- /Ch13_Data_Storage_Structures/Figure_13_103.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch13_Data_Storage_Structures/Figure_13_103.png -------------------------------------------------------------------------------- /Ch15_Query_Processing/indexed_nl_join_next.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch15_Query_Processing/indexed_nl_join_next.png -------------------------------------------------------------------------------- /Ch15_Query_Processing/indexed_nl_join_open.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch15_Query_Processing/indexed_nl_join_open.png -------------------------------------------------------------------------------- /Ch09_Application_Development/thousand_stars.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/thousand_stars.png -------------------------------------------------------------------------------- /Ch13_Data_Storage_Structures/Index_metadata.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch13_Data_Storage_Structures/Index_metadata.png -------------------------------------------------------------------------------- /Ch14_Indexing/14.13_bitmap_index_on_deptname.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch14_Indexing/14.13_bitmap_index_on_deptname.png -------------------------------------------------------------------------------- 
/Ch15_Query_Processing/indexed_nl_join_close.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch15_Query_Processing/indexed_nl_join_close.png -------------------------------------------------------------------------------- /Ch15_Query_Processing/semijoin_using_sorting.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch15_Query_Processing/semijoin_using_sorting.jpg -------------------------------------------------------------------------------- /Ch09_Application_Development/9.14_inital_screen.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.14_inital_screen.png -------------------------------------------------------------------------------- /Ch09_Application_Development/9.14_start_server.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.14_start_server.png -------------------------------------------------------------------------------- /Ch09_Application_Development/9.14_install_fastapi.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.14_install_fastapi.png -------------------------------------------------------------------------------- /Ch09_Application_Development/9.14_redis_ping_pong.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.14_redis_ping_pong.png 
-------------------------------------------------------------------------------- /Ch09_Application_Development/running_your_server.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/running_your_server.png -------------------------------------------------------------------------------- /Ch15_Query_Processing/sub_operators_of_hash_join.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch15_Query_Processing/sub_operators_of_hash_join.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/Figure_6.101.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/Figure_6.101.jpg -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/Figure_6.102.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/Figure_6.102.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/Figure_6.103.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/Figure_6.103.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/Figure_6.104.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/Figure_6.104.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/Figure_6.105.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/Figure_6.105.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/Figure_6.106.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/Figure_6.106.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/Figure_6.29.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/Figure_6.29.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/Figure_6.30.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/Figure_6.30.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/solution_for_6.15.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/solution_for_6.15.png 
-------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/solution_for_6.16.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/solution_for_6.16.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/solution_for_6.21.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/solution_for_6.21.jpg -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/solution_for_6.22.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/solution_for_6.22.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/solution_for_6.23.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/solution_for_6.23.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/solution_for_6.24.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/solution_for_6.24.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/solution_for_6.29.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/solution_for_6.29.png -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | /.quarto/ 2 | 3 | # ignore the files generated by quarto 4 | _site/ 5 | 6 | #ignore some private files 7 | PRIVATE_README.md 8 | 9 | .firebaserc 10 | firebase.json 11 | .firebase/ -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/solution_for_6.21_b.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/solution_for_6.21_b.jpg -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/schema_diagram_2_13.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch02_Introduction_to_the_Relational_Model/schema_diagram_2_13.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/pic_in_exercise_6.12.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/pic_in_exercise_6.12.png -------------------------------------------------------------------------------- /Ch09_Application_Development/9.14_installing_the_redispy_module.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch09_Application_Development/9.14_installing_the_redispy_module.png -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/generated_relations_for_6.13b.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/noahabe/database_system_concepts_answers/HEAD/Ch06_Database_Design_Using_the_ER_Model/generated_relations_for_6.13b.png -------------------------------------------------------------------------------- /index.qmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Database System Concepts Answers" 3 | --- 4 | 5 | Solutions to the book "Database System Concepts" 6 | 7 | ![DSC](db7-cover.jpg) 8 | 9 | [Here](https://www.db-book.com/) 10 | is a link to the book. 11 | 12 | -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.28.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 28 3 | title: '7.28' 4 | --- 5 | > Using the functional dependencies of Exercise 7.6, compute $B^+$. 6 | 7 | -------------------------------- 8 | 9 | Use the algorithm given on Figure 7.8. 10 | 11 | $B^+ = \{ B, D\} $ -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.38.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 38 3 | title: '7.38' 4 | --- 5 | > In designing a relational database, why might we choose a non-BCNF design? 6 | 7 | -------------------------------- 8 | 9 | If the design is not dependency-preserving we might choose a non-BCNF design. 
-------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.14.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 14 3 | title: '5.14' 4 | --- 5 | > Repeat Exercise 5.13 using ODBC, defining `void printTable(char* r)` as a function 6 | > instead of a method. 7 | 8 | -------------------------------- 9 | 10 | ```C 11 | void printTable(char* c) { 12 | // TODO #3 13 | } 14 | ``` -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.16.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 16 3 | title: '2.16' 4 | --- 5 | > List two reasons why null values might be introduced into a database. 6 | 7 | -------------------------------- 8 | 9 | 1. When a value of an attribute is unknown. 10 | 2. When a value of an attribute does not exist. -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.11.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 11 3 | title: '5.11' 4 | --- 5 | > Show how to express `GROUP BY CUBE(a,b,c,d)` using `ROLLUP`; your answer should 6 | > have only one `GROUP BY` clause. 7 | 8 | 9 | -------------------------------- 10 | 11 | ```sql 12 | GROUP BY ROLLUP(a), ROLLUP(b), ROLLUP(c), ROLLUP(d) 13 | ``` -------------------------------------------------------------------------------- /Ch09_Application_Development/9.5.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 5 3 | title: '9.5' 4 | --- 5 | > Why is it important to open JDBC connections using the try-with-resources 6 | > (`try (...) {...}`) syntax? 7 | 8 | -------------------------------- 9 | 10 | This ensures connections are closed properly, and you will not run out of 11 | database connections. 
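
The same guarantee can be sketched outside Java. The block below is only a rough Python analogue of try-with-resources, not JDBC itself: it assumes the standard-library `sqlite3` module as a stand-in for a JDBC driver, and the helper name `count_rows` and table `t` are hypothetical. `contextlib.closing` plays the role of the `try (...) {...}` header by closing the connection when the block exits, even if the body raises.

```python
import sqlite3
from contextlib import closing


def count_rows(db_path=":memory:"):
    """Open a connection, use it, and let the with-block close it.

    Hypothetical helper for illustration; sqlite3 stands in for a JDBC driver.
    """
    # closing(...) calls conn.close() on exit, whether or not the body raises.
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (1), (2)")
        (n,) = conn.execute("SELECT COUNT(*) FROM t").fetchone()
    # The connection object is returned only so the caller can observe
    # that it has already been closed.
    return n, conn


n, conn = count_rows()

# Using the connection after the block fails, which shows it was released.
try:
    conn.execute("SELECT 1")
    still_open = True
except sqlite3.ProgrammingError:
    still_open = False
```

Without the `with` block, an exception between `connect` and an explicit `close()` call would leak the connection, which is how an application eventually runs out of database connections.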
-------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.19.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 19 3 | title: '3.19' 4 | --- 5 | > List two reasons why null values might be introduced into the database. 6 | 7 | -------------------------------- 8 | 9 | [same question as that of 2.16] 10 | 1. When a value of an attribute is unknown. 11 | 2. When a value of an attribute does not exist 12 | 13 | -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.21.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 21 3 | title: '7.21' 4 | --- 5 | > Give a lossless decomposition into BCNF of schema $R$ of Exercise 7.1. 6 | 7 | -------------------------------- 8 | 9 | Use the algorithm given on Figure 7.11 (BCNF decomposition algorithm). 10 | 11 | One possible decomposition: $\{ (A, B, C, E), (B, D) \}$ -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.21.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 21 3 | title: '4.21' 4 | --- 5 | > For the view of Exercise 4.18, explain why the database system would not allow 6 | > a tuple to be inserted into the database through this view. 7 | 8 | -------------------------------- 9 | 10 | I think there is an error with the question. 11 | 12 | There is no view in Exercise 4.18. -------------------------------------------------------------------------------- /Ch01_Introduction/1.11.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 11 3 | title: '1.11' 4 | --- 5 | > Assume that two students are trying to register for a course in which there is only 6 | > one open seat. 
What component of a database system prevents both students 7 | > from being given that last seat? 8 | 9 | The transaction manager. More specifically, the concurrency-control manager. 10 | 11 | -------------------------------------------------------------------------------- /Ch01_Introduction/1.1.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 1 3 | title: '1.1' 4 | --- 5 | > This chapter has described several major advantages of a database system. What are two disadvantages? 6 | 7 | Two disadvantages associated with database systems are listed below. 8 | * Setup of the database system requires more knowledge, money, skills, and time. 9 | * The complexity of the database may result in poor performance. -------------------------------------------------------------------------------- /Ch11_Data_Analytics/11.11.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 11 3 | title: '11.11' 4 | --- 5 | > The organization of parts, chapters, sections, and subsections in a book is 6 | > related to clustering. Explain why, and to what form of clustering. 7 | 8 | -------------------------------- 9 | 10 | Related material is grouped together at several nested levels (subsections within sections, 11 | sections within chapters, chapters within parts), so this is **hierarchical clustering**. -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.1.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 1 3 | title: '2.1' 4 | --- 5 | > Consider the employee database of Figure 2.17. What are the appropriate 6 | > primary-keys? 7 | 8 | The appropriate primary keys are shown below: 9 | 10 | employee(person_name, street, city)
11 | works(person_name, company_name, salary)
12 | company(company_name, city) 13 | 14 | -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.16.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 16 3 | title: '6.16' 4 | --- 5 | > Extend the E-R diagram of Exercise 6.3 to track the same information 6 | > for all teams in a league. 7 | 8 | -------------------------------- 9 | 10 | 11 | 12 | The above design assumes that the game is soccer. That explains the 13 | mapping cardinalities given in the picture. -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.20.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 20 3 | title: '5.20' 4 | --- 5 | > The execution of a trigger can cause another action to be triggered. Most 6 | > database systems place a limit on how deep the nesting can be. Explain why 7 | > they might place such a limit. 8 | 9 | -------------------------------- 10 | 11 | This is to protect the user of the database system against an accidental infinite 12 | chain of trigger executions. -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.19.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 19 3 | title: '6.19' 4 | --- 5 | > We can convert any weak entity set to a strong entity set by simply adding 6 | > appropriate attributes. Why, then do we have weak entity sets? 7 | 8 | -------------------------------- 9 | 10 | We have weak entity sets because we want to make the dependence of the weak 11 | entity set on its identifying entity set **explicit**.
12 | -------------------------------------------------------------------------------- /Ch14_Indexing/14.5.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 5 3 | title: '14.5' 4 | --- 5 | > Consider the modified redistribution scheme for B+-trees described on page 651. 6 | > What is the expected height of the tree as a function of n? 7 | 8 | -------------------------------- 9 | 10 | If there are $K$ search-key values and $m - 1$ siblings are involved in the redistribution, 11 | the expected height of the tree is: $\log_{\lfloor (m-1)n/m \rfloor} (K)$ -------------------------------------------------------------------------------- /Ch01_Introduction/1.15.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 15 3 | title: '1.15' 4 | --- 5 | > Describe at least three tables that might be used to store information in a social- 6 | > networking system such as Facebook. 7 | 8 | 1. Users table - contains id, full name, phone number, email, date of birth, profile pic 9 | 2. Chats table - contains the chat messages 10 | 3. Friends table - contains two columns of user ids (foreign keys from the Users table) -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.20.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 20 3 | title: '4.20' 4 | --- 5 | > Show how to define a view _tot_credits(year,num_credits)_, giving the total number 6 | > of credits taken in each year.
7 | 8 | -------------------------------- 9 | 10 | ```sql 11 | CREATE VIEW tot_credits(year,num_credits) AS ( 12 | SELECT year, SUM(credits) 13 | FROM takes NATURAL JOIN course 14 | GROUP BY year 15 | ORDER BY year ASC 16 | ) 17 | ``` -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.28.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 28 3 | title: '6.28' 4 | --- 5 | > Explain the distinction between total and partial constraints. 6 | 7 | -------------------------------- 8 | 9 | * **Total specialization constraint** - Each higher-level entity must belong to a 10 | lower-level entity set. 11 | * **Partial specialization constraint** - Some higher-level entities may not belong 12 | to any lower-level entity set. 13 | -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.10.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 10 3 | title: '2.10' 4 | --- 5 | > Describe the differences in meaning between the terms _relation_ and _relation schema_. 6 | 7 | -------------------------------- 8 | 9 | _relation_ is a set of tuples. 10 | 11 |
12 | 13 | _relation schema_ is used to refer to the structure of a relation. A _relation schema_ 14 | consists of a list of attributes and their corresponding domains. -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.20.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 20 3 | title: '7.20' 4 | --- 5 | > Give an example of a relation schema $R$ and a set of dependencies such that 6 | > $R$ is in BCNF but is not in 4NF. 7 | 8 | -------------------------------- 9 | 10 | There are, of course, an infinite number of such examples. We show the simplest one here. 11 | 12 | Let $R$ be the schema $(A,B,C)$ with the only nontrivial dependency being $A \twoheadrightarrow B$ -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.22.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 22 3 | title: '7.22' 4 | --- 5 | > Give a lossless, dependency-preserving decomposition into 3NF of schema $R$ of Exercise 7.1. 6 | 7 | -------------------------------- 8 | 9 | Use the algorithm given in Figure 7.12. 10 | 11 | Also don't forget that $F = F_c$ (see Exercise 7.7) 12 | 13 | $$ 14 | R_1 = \{ A, B, C\} \\ 15 | R_2 = \{ C, D, E\} \\ 16 | R_3 = \{ B, D\} \\ 17 | R_4 = \{ E, A\} \\ 18 | $$ -------------------------------------------------------------------------------- /Ch09_Application_Development/9.15.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 15 3 | title: '9.15' 4 | --- 5 | > Write a servlet that authenticates a user (based on user names and 6 | > passwords stored in a database relation) and sets a session variable called 7 | > _userid_ after authentication. 8 | 9 | -------------------------------- 10 | 11 | // TODO.
12 | 13 | Servlet version: 14 | 15 | ```java 16 | 17 | ``` 18 | 19 | Flask Version: 20 | 21 | ```python 22 | 23 | ``` -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.16.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 16 3 | title: '4.16' 4 | --- 5 | > Write an SQL query using the university schema to find the ID of each student 6 | > who has never taken a course at the university. Do this using no subqueries and 7 | > no set operations (use an outer join). 8 | 9 | -------------------------------- 10 | 11 | ```sql 12 | SELECT s.ID 13 | FROM student s LEFT OUTER JOIN takes t 14 | ON s.ID = t.ID 15 | WHERE t.ID IS NULL; 16 | ``` -------------------------------------------------------------------------------- /Ch01_Introduction/1.8.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 8 3 | title: '1.8' 4 | --- 5 | > Explain the concept of physical data independence and its importance in 6 | > database systems. 7 | 8 | There are 3 levels of data abstraction in a database: 9 | Physical Level, Logical Level and View Level. **Physical data independence** 10 | is the abstraction provided by the Logical Level to hide the complex 11 | data-structures that are used at the Physical Level to retrieve data efficiently. -------------------------------------------------------------------------------- /Ch12_Physical_Storage_Systems/12.9.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 9 3 | title: '12.9' 4 | --- 5 | > How does the remapping of bad sectors by disk controllers affect 6 | > data-retrieval rates? 7 | 8 | -------------------------------- 9 | 10 | It will decrease the data-retrieval rates. This is because of the following 11 | overhead. 
Given the logical address of the bad sector, the disk controller 12 | needs to perform a lookup for the mapped physical address of the good sector. -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.13.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 13 3 | title: '2.13' 4 | --- 5 | > Construct a schema diagram for the bank database of Figure 2.18. 6 | 7 | -------------------------------- 8 | 9 |

10 | schema diagram for bank database of Figure 2.18 11 |

12 | 13 | the above picture was created using [figma](https://www.figma.com) and [Arctype](https://www.youtube.com/watch?v=bND5cWmk_nk) -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.25.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 25 3 | title: '3.25' 4 | --- 5 | > Using the university schema, write an SQL query to find the names of 6 | > those departments whose budget is higher than that of Philosophy. 7 | > List them in alphabetic order. 8 | 9 | -------------------------------- 10 | 11 | ```sql 12 | SELECT dept_name 13 | FROM department 14 | WHERE budget > (SELECT budget FROM department WHERE dept_name = 'Philosophy') 15 | ORDER BY dept_name ASC; 16 | ``` -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.9.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 9 3 | title: '5.9' 4 | --- 5 | > Given a relation _nyse(year,month,day,shares_traded,dollar_volume)_ with trading data 6 | > from the New York Stock Exchange, list each trading day in order of number of shares 7 | > traded, and show each day's rank. 8 | 9 | -------------------------------- 10 | 11 | ```sql 12 | SELECT year,month,day,shares_traded, 13 | RANK() OVER (ORDER BY (shares_traded) DESC) AS mostshares 14 | FROM nyse 15 | ``` 16 | -------------------------------------------------------------------------------- /Ch13_Data_Storage_Structures/13.10.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 10 3 | title: '13.10' 4 | --- 5 | > Explain why the allocation of records to blocks affects database-system 6 | > performance significantly. 
7 | 8 | -------------------------------- 9 | 10 | Since _allocation of records to blocks_ is such a basic operation for a database-system 11 | to perform and is required to be performed many times during a realistic use case, it 12 | affects the database-system performance significantly. -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.17.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 17 3 | title: '5.17' 4 | --- 5 | > Consider the relational schema from Exercise 5.16. Write a JDBC function 6 | > using nonrecursive SQL to find the total cost of part "P-100", including the 7 | > costs of all its subparts. Be sure to take into account the fact that a part may 8 | > have multiple occurrences of a subpart. You may use recursion in Java if you wish. 9 | 10 | -------------------------------- 11 | 12 | ```java 13 | // TODO #4 14 | ``` -------------------------------------------------------------------------------- /Ch12_Physical_Storage_Systems/12.13.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 13 3 | title: '12.13' 4 | --- 5 | > Suppose you have data that should not be lost on disk failure, 6 | > and the application is write-intensive. How would you store the data? 7 | 8 | -------------------------------- 9 | 10 | I would use **RAID level 1** (Mirroring disks). This is because RAID level 1 offers the best 11 | write performance, and data would not be lost on disk failure since we have 12 | a mirror disk for each disk in the array. 
-------------------------------------------------------------------------------- /Ch15_Query_Processing/15.9.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 9 3 | title: '15.9' 4 | --- 5 | > What is the effect on the cost of merging runs if the number of buffer blocks 6 | > per run is increased while overall memory available for buffering runs remains 7 | > fixed? 8 | 9 | -------------------------------- 10 | 11 | Seek overhead is reduced, but the number of runs that can be merged in a pass decreases, 12 | potentially leading to more passes. A value of $b_b$ that minimizes overall cost should 13 | be chosen. -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.27.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 27 3 | title: '6.27' 4 | --- 5 | > Explain the distinction between disjoint and overlapping constraints. 6 | 7 | -------------------------------- 8 | 9 | * **Disjoint Specialization Constraint** - An entity in a higher level entity set 10 | **cannot** belong to multiple specialized entity sets. 11 | 12 | * **Overlapping Specialization Constraint** - An entity in a higher level entity set 13 | **can** belong to multiple specialized entity sets. -------------------------------------------------------------------------------- /Ch11_Data_Analytics/11.12.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 12 3 | title: '11.12' 4 | --- 5 | > Suggest how predictive mining techniques can be used by a sports team, using 6 | > your favorite sport as an example. 7 | 8 | -------------------------------- 9 | 10 | If team A plays against team B, team A can study the matches that team B has played 11 | in the past, to decide/to plan the configuration of the members of team A. 
In fact, 12 | this kind of study can even include the personal abilities of the members of team B. -------------------------------------------------------------------------------- /Ch14_Indexing/14.16.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 16 3 | title: '14.16' 4 | --- 5 | > When is it preferable to use a dense index rather than a sparse index? Explain your answer. 6 | 7 | -------------------------------- 8 | 9 | 1. If the relation is **NOT** stored in a sorted order of the search key, then we cannot 10 | use sparse index. We have to use a dense index. 11 | 12 | 2. If our use case doesn't involve too many updates (insertions, deletions) and we have enough 13 | storage space then dense index is preferable. -------------------------------------------------------------------------------- /Ch13_Data_Storage_Structures/13.12.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 12 3 | title: '13.12' 4 | --- 5 | > In the sequential file organization, why is an overflow _block_ used 6 | > even if there is, at the moment, only one overflow record? 7 | 8 | -------------------------------- 9 | 10 | Because we expect future insertions into the relation. If there is no 11 | space left after a deletion in the relation, a newly inserted record will have its 12 | new home in the overflow _block_ , and pointers pointing in a sequential order. -------------------------------------------------------------------------------- /Ch01_Introduction/1.6.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 6 3 | title: '1.6' 4 | --- 5 | > List four applications you have used that most likely employed a database 6 | > system to store persistent data. 7 | 8 | 1. Google Maps - https://www.google.com/maps 9 | 2. Facebook - https://facebook.com 10 | 3. 
Medium - Probably stores all of the articles in some kind of database - https://medium.com/ 11 | 4. Bitcoin - the famous cryptocurrency 12 | 5. Domain Name System (DNS) servers - translate domain names into IP addresses; they need to use some kind of database. -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.24.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 24 3 | title: '3.24' 4 | --- 5 | > Using the university schema, write an SQL query to find the name and ID of those 6 | > Accounting students advised by an instructor in the Physics department. 7 | 8 | -------------------------------- 9 | 10 | ```sql 11 | SELECT s.id, s.name 12 | FROM student AS s INNER JOIN advisor AS a ON s.id = a.s_id 13 | INNER JOIN instructor AS i ON a.i_id = i.id 14 | WHERE s.dept_name = 'Accounting' AND i.dept_name = 'Physics'; 15 | ``` -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.6.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 6 3 | title: '3.6' 4 | --- 5 | > The SQL **LIKE** operator is case sensitive (in most systems), but the **lower()** function 6 | > on strings can be used to perform case-insensitive matching. To show how, write a query 7 | > that finds departments whose names contain the string "sci" as a substring, regardless 8 | > of the case. 9 | 10 | -------------------------------- 11 | 12 | ```sql 13 | SELECT dept_name 14 | FROM department 15 | WHERE LOWER(dept_name) LIKE '%sci%'; 16 | ``` -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.9.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 9 3 | title: '6.9' 4 | --- 5 | > Suppose the _advisor_ relationship set were one-to-one.
What extra constraints 6 | > are required on the relation _advisor_ to ensure that the one-to-one cardinality 7 | > constraint is enforced? 8 | 9 | -------------------------------- 10 | 11 | In addition to declaring $s\_ID$ as primary key for _advisor_, we declare 12 | $i\_ID$ as a superkey for _advisor_ (this can be done in SQL using the **UNIQUE** 13 | constraint on $i\_ID$). -------------------------------------------------------------------------------- /Ch09_Application_Development/9.23.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 23 3 | title: '9.23' 4 | --- 5 | > What are two advantages of encrypting data stored in the database? 6 | 7 | -------------------------------- 8 | 9 | * Encrypting data stored in databases, protects against attackers who 10 | can access the disk contents but do not have access to the encryption key. 11 | 12 | * Encryption at the database level has the advantage of requiring relatively 13 | low time and space overhead and does not require modification of applications. 14 | -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.34.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 34 3 | title: '3.34' 4 | --- 5 | > Using the university schema, write an SQL query to find the number of students 6 | > in each section. The result columns should appear in the order "course_id, sec_id, 7 | > year,semester,num". You do not need to output sections with 0 students. 
8 | 9 | -------------------------------- 10 | 11 | ```sql 12 | SELECT course_id, sec_id, year, semester, COUNT(DISTINCT ID) AS num 13 | FROM takes 14 | GROUP BY course_id, sec_id, year, semester; 15 | ``` -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.37.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 37 3 | title: '7.37' 4 | --- 5 | > List the three design goals for relational databases, and explain why each 6 | > is desirable. 7 | 8 | -------------------------------- 9 | 10 | In general, the goal of relational database design is to generate a set of 11 | relational schemas that 12 | 13 | 1. Allow us to store information without unnecessary redundancy. 14 | 2. Allow us to retrieve information easily. 15 | 3. Allow us to store information without loss. 16 | 17 | -------------------------------------------------------------------------------- /Ch14_Indexing/14.17.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 17 3 | title: '14.17' 4 | --- 5 | > What is the difference between a clustering index and a secondary index? 6 | 7 | -------------------------------- 8 | 9 | A **clustering index** is an index whose search key also defines the sequential 10 | order of the file. A **clustering index** is also called a **primary index**. 11 | 12 | A **secondary index** also known as **nonclustering index** is an index whose 13 | search key specifies an order different from the sequential order of the file. -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.26.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 26 3 | title: '6.26' 4 | --- 5 | > Design a generalization-specialization hierarchy for a motor vehicle sales 6 | > company. 
The company sells motorcycles, passenger cars, vans, and buses. Justify 7 | > your placement of attributes at each level of the hierarchy. Explain why they should 8 | > not be placed at a higher or lower level. 9 | 10 | -------------------------------- 11 | 12 | 13 | 14 | TODO: This answer requires more explanation. -------------------------------------------------------------------------------- /Ch13_Data_Storage_Structures/13.5.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 5 3 | title: '13.5' 4 | --- 5 | > It is important to be able to quickly find out if a block is present in the buffer, 6 | > and if so where in the buffer it resides. Given that database buffer sizes are very 7 | > large, what (in-memory) data structure would you use for this task? 8 | 9 | -------------------------------- 10 | 11 | A hash table is the common choice for large database buffers. The hash function locates 12 | the appropriate bucket, within which a linear search is performed. -------------------------------------------------------------------------------- /Ch15_Query_Processing/15.11.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 11 3 | title: '15.11' 4 | --- 5 | > Suppose a query retrieves only the first $K$ results of an operation 6 | > and terminates after that. Which choice of demand-driven or producer-driven 7 | > pipelining (with buffering) would be a good choice for such a query? 8 | > Explain your answer. 9 | 10 | -------------------------------- 11 | 12 | Demand-driven pipelining is better, since it generates only the first $K$ results. Producer-driven 13 | pipelining may generate many more results, many of which would never be used.
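
The demand-driven behaviour can be sketched with Python generators, which compute a tuple only when the consumer asks for one. This is a minimal illustration rather than how any particular database engine implements pipelining; the operator names `scan` and `select`, the helper `first_k`, and the integer "tuples" are all hypothetical.

```python
from itertools import islice


def scan(rows, log):
    """Leaf operator: yields one tuple per request and records each request."""
    for row in rows:
        log.append(row)  # remember which tuples were actually produced
        yield row


def select(child, predicate):
    """Selection operator: pulls from its child only when asked for a tuple."""
    for row in child:
        if predicate(row):
            yield row


def first_k(rows, k):
    """Run the plan select(scan(rows)) and stop after the first k results."""
    log = []
    plan = select(scan(rows, log), lambda r: r % 2 == 0)
    return list(islice(plan, k)), log


# Ask for the first 3 even numbers out of a million-row input.
result, produced = first_k(range(1_000_000), 3)
print(result)         # [0, 2, 4]
print(len(produced))  # 5
```

Only five input tuples are ever scanned, because nothing upstream runs once the consumer stops pulling. A producer-driven plan with buffering would keep filling its output buffer regardless of whether the consumer still wants results.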
-------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.10.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 10 3 | title: '5.10' 4 | --- 5 | > Using the relation from Exercise 5.9, write an SQL query to generate a report 6 | > showing the number of shares traded, number of trades, and total dollar volume 7 | > broken down by year, each month of each year, and each trading day. 8 | 9 | -------------------------------- 10 | 11 | ```sql 12 | SELECT year,month,day,SUM(shares_traded) AS shares, 13 | SUM(num_trades) AS trades, 14 | SUM(dollar_volume) AS total_volume 15 | FROM nyse 16 | GROUP BY ROLLUP(year,month,day); 17 | ``` -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.4.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 4 3 | title: '2.4' 4 | --- 5 | > In the instance of _instructor_ shown in Figure 2.1, no two instructors 6 | > have the same name. From this, can we conclude that _name_ can be used 7 | > as a superkey (or primary key) of _instructor_? 8 | 9 | No. For this possible instance of the instructor table the names are unique, 10 | but in general this may not always be the case (unless the university has a 11 | rule that two instructors cannot have the same name, which is a rather unlikely 12 | scenario :) ). 13 | -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.7.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 7 3 | title: '2.7' 4 | --- 5 | > Consider the bank database of Figure 2.18. Give an expression 6 | > in the relational algebra for each of the following queries:
7 | > a. Find the name of each branch located in "Chicago".
8 | > b. Find the ID of each borrower who has a loan in branch "Downtown".
9 | 10 | a. $\Pi_{branch\_name}(\sigma_{branch\_city = "Chicago"}(branch))$
11 | b. $\Pi_{ID}(\sigma_{branch\_name = "Downtown"}(loan \bowtie_{loan.loan\_number = borrower.loan\_number} borrower))$
12 | -------------------------------------------------------------------------------- /Ch14_Indexing/14.23.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 23 3 | title: '14.23' 4 | --- 5 | > What trade-offs do write-optimized indices pose as compared to B+-tree indices? 6 | 7 | -------------------------------- 8 | 9 | **Write-optimized indices** (such as **LSM trees**, **buffer trees**, ...) have very efficient 10 | write methods that are much faster than B+-tree's writes. But they offer this write efficiency 11 | at the cost of lookup efficiency. Lookup on LSM trees for example costs more than lookup on 12 | B+-trees, since a single lookup on LSM trees would require looking up multiple B+-trees. -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.24.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 24 3 | title: '7.24' 4 | --- 5 | > Why are certain functional dependencies called _trivial_ functional dependencies? 6 | 7 | -------------------------------- 8 | 9 | Let $r(R)$ be any relation. Let $\alpha \subseteq R, \beta \subseteq R, \text{ and } 10 | \beta \subseteq \alpha$. Then we know that the functional dependency $\alpha \rightarrow \beta$ holds 11 | even though we don't know anything about $r(R)$. $r(R)$ could have been **any** relation and our statement 12 | will hold true. This is uninteresting to us, so we call it trivial. -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.19.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 19 3 | title: '5.19' 4 | --- 5 | > Suppose there are two relations _r_ and _s_, such that the foreign key _B_ of _r_ 6 | > references the primary key _A_ of _s_.
Describe how the trigger mechanism can be used 7 | > to implement the **on delete cascade** option when a tuple is deleted from _s_. 8 | 9 | -------------------------------- 10 | 11 | When any row is deleted from the relation _s_ the trigger mechanism is supposed 12 | to take the following action: delete all rows from the relation _r_ that reference 13 | the deleted row from the relation _s_. -------------------------------------------------------------------------------- /Ch10_Big_Data/10.12.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 12 3 | title: '10.12' 4 | --- 5 | > Suppose your company has built a database application that runs on a 6 | > centralized database, but even with a high-end computer and appropriate indices 7 | > created on the data, the system is not able to handle the transaction load, 8 | > leading to slow processing of queries. What would be some of your options 9 | > to allow the application to handle the transaction load? 10 | 11 | -------------------------------- 12 | 13 | Use a **parallel or distributed database** such as Cloud Spanner or 14 | CockroachDB. -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.7.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 7 3 | title: '7.7' 4 | --- 5 | > Using the functional dependencies of Exercise 7.6, compute the canonical cover $F_c$. 6 | 7 | -------------------------------- 8 | 9 | The given set of FDs (Functional Dependencies) $F$ is: 10 | 11 | $$ 12 | A \rightarrow BC \\ 13 | CD \rightarrow E \\ 14 | B \rightarrow D \\ 15 | E \rightarrow A 16 | $$ 17 | 18 | The left side of each FD in $F$ is unique. Also, none of the attributes in the left side 19 | or right side of any of the FDs is extraneous. Therefore the canonical cover $F_c$ is 20 | equal to $F$.
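
The claim that no attribute is extraneous can be checked mechanically with attribute closures. The following is a small sketch, not a full canonical-cover algorithm; the helper names `closure` and `extraneous` are made up for this example, and each FD is encoded as a pair of frozensets.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def extraneous(fds):
    """Return (attribute, side, fd_index) for every extraneous attribute found."""
    found = []
    for i, (lhs, rhs) in enumerate(fds):
        # x on the left is extraneous if (lhs - x) still determines rhs under F.
        for x in sorted(lhs):
            if rhs <= closure(lhs - {x}, fds):
                found.append((x, "left", i))
        # x on the right is extraneous if lhs still determines x after x is
        # removed from this FD's right side.
        for x in sorted(rhs):
            reduced = fds[:i] + [(lhs, rhs - {x})] + fds[i + 1:]
            if x in closure(lhs, reduced):
                found.append((x, "right", i))
    return found


F = [
    (frozenset("A"), frozenset("BC")),
    (frozenset("CD"), frozenset("E")),
    (frozenset("B"), frozenset("D")),
    (frozenset("E"), frozenset("A")),
]

print(extraneous(F))
```

An empty list, together with the observation that all left sides are distinct, matches the conclusion $F_c = F$.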
-------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.26.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 26 3 | title: '3.26' 4 | --- 5 | > Using the university schema, use SQL to do the following: 6 | > For each student who has retaken a course at least twice 7 | > (i.e., the student has taken the course at least three times), 8 | > show the course ID and the student's ID. 9 | > Please display your results in order of course ID and do not 10 | > display duplicate rows. 11 | 12 | -------------------------------- 13 | 14 | ```sql 15 | SELECT id,course_id 16 | FROM takes 17 | GROUP BY id,course_id 18 | HAVING COUNT(*) >= 3 19 | ORDER BY course_id ASC; 20 | ``` -------------------------------------------------------------------------------- /Ch14_Indexing/14.19.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 19 3 | title: '14.19' 4 | --- 5 | > The solution presented in Section 14.3.5 to deal with nonunique search 6 | > keys added an extra attribute to the search key. What effect could this change 7 | > have on the height of the B+-tree? 8 | 9 | -------------------------------- 10 | 11 | It could increase the height of B+-tree, since adding extra attributes (uniquifier) 12 | to the search-key would result in a search-key that has size greater than it had before. 13 | This will decrease the fanout of the internal nodes, potentially increasing the height 14 | of the B+-tree. 
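
A rough worked example of the effect, in Python. All numbers are assumptions chosen for illustration (4 KB nodes, 8-byte pointers, a 16-byte key that grows to 24 bytes with the uniquifier, 50 million keys), and `worst_case_height` is a hypothetical helper that takes the worst case of half-full nodes and ignores the slightly different leaf layout.

```python
def worst_case_height(num_keys, key_bytes, block_bytes=4096, pointer_bytes=8):
    """Smallest h such that ceil(fanout / 2) ** h >= num_keys."""
    fanout = block_bytes // (key_bytes + pointer_bytes)
    min_children = (fanout + 1) // 2  # nodes are at least half full
    height, capacity = 1, min_children
    while capacity < num_keys:
        capacity *= min_children
        height += 1
    return height


before = worst_case_height(50_000_000, key_bytes=16)  # fanout 170
after = worst_case_height(50_000_000, key_bytes=24)   # fanout 128
print(before, after)
```

With these assumed sizes the bound grows from 4 to 5 levels. With other sizes the height may stay the same, which is why the answer only says "potentially".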
-------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.27.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 27 3 | title: '3.27' 4 | --- 5 | > Using the university schema, write an SQL query to find the IDs of those students who have 6 | > retaken at least three distinct courses at least once (i.e, the student has taken the course 7 | > at least two times). 8 | 9 | -------------------------------- 10 | 11 | ```sql 12 | WITH retakers(id,course_id,frequency) AS ( 13 | SELECT id,course_id,COUNT(*) 14 | FROM takes 15 | GROUP BY id,course_id 16 | HAVING COUNT(*) > 1 17 | ) 18 | SELECT id 19 | FROM retakers 20 | GROUP BY id 21 | HAVING COUNT(*) >= 3; 22 | ``` -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.14.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 14 3 | title: '6.14' 4 | --- 5 | > Explain the distinctions among the terms _primary key_, _candidate key_, and _superkey_. 6 | 7 | -------------------------------- 8 | 9 | A **superkey** is a set of one or more attributes that, taken collectively, allow us to 10 | identify uniquely a tuple in the relation. 11 | 12 | A **candidate key** is a superkey for which no proper subset is a superkey. 13 | 14 | A **primary key** is a candidate key that is chosen by the database designer as the principal 15 | means of identifying tuples within a relation. -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.26.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 26 3 | title: '7.26' 4 | --- 5 | > Consider the following proposed rule for functional dependencies: If $\alpha \rightarrow \beta$ 6 | > and $\gamma \rightarrow \beta$, then $\alpha \rightarrow \gamma$. 
Prove that this rule is 7 | > _not_ sound by showing a relation _r_ that satisfies $\alpha \rightarrow \beta$ and 8 | > $\gamma \rightarrow \beta$, but does not satisfy $\alpha \rightarrow \gamma$. 9 | 10 | -------------------------------- 11 | 12 | $\alpha$ | $\gamma$ | $\beta$ 13 | ---------|----------|--------- 14 | 1|6|7 15 | 2|3|5 16 | 2|4|5 17 | 18 | -------------------------------------------------------------------------------- /Ch08_Complex_Data_Types/8.14.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 14 3 | title: '8.14' 4 | --- 5 | > Web sites that want to get some publicity can join a web ring, where 6 | > they create links to other sites in the ring in exchange for other sites 7 | > in the ring creating links to their site. What is the effect of such rings 8 | > on popularity ranking techniques such as PageRank? 9 | 10 | -------------------------------- 11 | 12 | It depends on what the initial PageRanks of the web pages were before they started 13 | the "web ring". 14 | 15 | But I think it will generally increase the PageRank for most websites in the ring. -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.22.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 22 3 | title: '3.22' 4 | --- 5 | > Rewrite the **WHERE** clause 6 | ```sql 7 | WHERE UNIQUE (SELECT title FROM course) 8 | ``` 9 | > without using the **UNIQUE** construct.
10 | 11 | -------------------------------- 12 | 13 | One method 14 | 15 | ```sql 16 | WHERE 1 >= ALL ( 17 | SELECT COUNT(*) 18 | FROM course 19 | GROUP BY title 20 | ) 21 | ``` 22 | 23 | Another method 24 | 25 | ```sql 26 | WHERE NOT EXISTS ( 27 | SELECT * 28 | FROM course AS c1, course AS c2 29 | WHERE c1.course_id != c2.course_id AND c1.title = c2.title 30 | ) 31 | ``` -------------------------------------------------------------------------------- /Ch10_Big_Data/10.1.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 1 3 | title: '10.1' 4 | --- 5 | > Suppose you need to store a very large number of small files, each of size 6 | > say 2 kilobytes. If your choice is between a distributed file system and distributed 7 | > key-value store, which would your prefer and explain why. 8 | 9 | -------------------------------- 10 | 11 | The key-value store, since the distributed file system is designed to store a moderate 12 | number of large files. With each file block being multiple megabytes, kilobyte-sized 13 | files would result in a lot of wasted space in each block and poor storage performance. -------------------------------------------------------------------------------- /Ch01_Introduction/1.10.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 10 3 | title: '1.10' 4 | --- 5 | > List at least two reasons why database systems support data manipulation using 6 | > a declarative query language such as SQL, instead of just providing a library of 7 | > C or C++ functions to carry out data manipulation. 8 | 9 | 1. Declarative query languages are easier to learn and use than procedural languages 10 | such as C or C++. 11 | 12 | 2. At the logical level it is better to emphasize ease of use since we have already 13 | defined efficient algorithms at the physical level. So declarative languages fit us 14 | well at the logical level. 
-------------------------------------------------------------------------------- /Ch11_Data_Analytics/11.2.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 2 3 | title: '11.2' 4 | --- 5 | > Draw a diagram that shows how the _classroom_ relation of our university schema as shown 6 | > in Appendix A would be stored under a column-oriented storage structure. 7 | 8 | -------------------------------- 9 | 10 | The relation would be stored in three files, one per attribute, as shown below. 11 | We assume that the row number can be inferred implicitly from position, by using 12 | fixed-size space for each attribute. Otherwise, the row number would also have 13 | to be stored explicitly. 14 | 15 | 16 | -------------------------------------------------------------------------------- /Ch14_Indexing/14.2.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 2 3 | title: '14.2' 4 | --- 5 | > Is it possible in general to have two clustering indices on the same relation for 6 | > different search keys? Explain your answer. 7 | 8 | -------------------------------- 9 | 10 | In general, it is not possible to have two primary indices on the same relation for 11 | different keys because the tuples in a relation would have to be stored in different 12 | order to have the same values stored together. We could accomplish this by storing the 13 | relation twice and duplicating all values, but for a centralized system, this is not 14 | efficient. -------------------------------------------------------------------------------- /Ch14_Indexing/14.6.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 6 3 | title: '14.6' 4 | --- 5 | > Give pseudocode for a B+-tree function `findRangeIterator()`, which is like the 6 | > function `findRange()`, except that it returns an iterator object, as described in 7 | > Section 14.3.2. 
Also give pseudocode for the iterator class, including the 8 | > variables in the iterator object, and the `next()` method. 9 | 10 | -------------------------------- 11 | 12 | 13 | 14 | If you want to learn more about the magic of Python Generators, head on over to [python.org](https://docs.python.org/3/tutorial/classes.html#generators). 15 | 16 | -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.8.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 8 3 | title: '5.8' 4 | --- 5 | > Given a relation _S(student,subject,marks)_, write a query to find the top 10 students 6 | > by total marks, by using SQL ranking. Include all students tied for the final spot 7 | > in the ranking, even if that results in more than 10 total students. 8 | 9 | -------------------------------- 10 | 11 | ```sql 12 | SELECT * 13 | FROM ( 14 | SELECT student,total,RANK() OVER (ORDER BY (total) DESC) AS t_rank 15 | FROM ( 16 | SELECT student, SUM(marks) AS total 17 | FROM S 18 | GROUP BY student 19 | ) 20 | ) 21 | WHERE t_rank <= 10 22 | ``` -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.24.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 24 3 | title: '4.24' 4 | --- 5 | > Suppose user _A_, who has all authorization privileges on a relation _r_, grants **select** 6 | > on relation _r_ to **public** with grant option. Suppose user _B_ then grants **select** on 7 | > r to _A_. Does this cause a cycle in the authorization graph? Explain why. 8 | 9 | -------------------------------- 10 | 11 | I don't think so. Since user _A_ has all authorization privileges on a relation _r_, _B_ granting 12 | **select** on r to _A_ doesn't bring any new privilege to user _A_. 13 | 14 | I guess this depends on the internals of the Database Management System. 
-------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.22.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 22 3 | title: '5.22' 4 | --- 5 | > Given relation _s(a,b,c)_, write an SQL statement to generate a histogram 6 | > showing the sum of _c_ values versus _a_, dividing _a_ into 20 equal-sized 7 | > partitions (i.e., where each partition contains 5 percent of tuples in _s_, 8 | > sorted by _a_). 9 | 10 | -------------------------------- 11 | 12 | ```sql 13 | WITH s_with_partition(a,b,c,partition_id) AS ( 14 | SELECT a,b,c,NTILE(20) OVER (ORDER BY (a) ASC) AS partition_id 15 | FROM s 16 | ) 17 | SELECT partition_id,SUM(c) 18 | FROM s_with_partition 19 | GROUP BY partition_id 20 | ORDER BY partition_id; 21 | ``` -------------------------------------------------------------------------------- /_quarto.yml: -------------------------------------------------------------------------------- 1 | project: 2 | type: website 3 | render: 4 | - "*.md" 5 | - "*.qmd" 6 | - "!PRIVATE_README.md" 7 | 8 | website: 9 | title: "Database System Concepts Answers" 10 | navbar: 11 | background: primary 12 | left: 13 | - href: index.qmd 14 | text: Home 15 | tools: 16 | - icon: github 17 | href: https://github.com/noahabe/database_system_concepts_answers 18 | sidebar: 19 | style: "floating" 20 | collapse-level: 1 21 | search: false 22 | contents: auto 23 | 24 | format: 25 | html: 26 | theme: cosmo 27 | css: styles.css 28 | toc: true 29 | 30 | 31 | 32 | -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.27.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 27 3 | title: '7.27' 4 | --- 5 | > Use Armstrong's axioms to prove the soundness of the decomposition rule.
6 | 7 | -------------------------------- 8 | 9 | * **Decomposition Rule** states: If $\alpha \rightarrow \beta\gamma$ holds, then $\alpha \rightarrow \beta$ holds 10 | and $\alpha \rightarrow \gamma$ holds. 11 | 12 | Suppose $\alpha \rightarrow \beta\gamma$ holds. By **Reflexivity rule** we know that, 13 | $\beta\gamma \rightarrow \beta$ and $\beta\gamma \rightarrow \gamma$ holds. 14 | 15 | By **Transitivity rule** $\alpha \rightarrow \beta$ holds 16 | and $\alpha \rightarrow \gamma$ holds. -------------------------------------------------------------------------------- /Ch11_Data_Analytics/11.4.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 4 3 | title: '11.4' 4 | --- 5 | > Consider the data warehouse schema depicted in Figure 11.2. Give an SQL query 6 | > to summarize sales numbers and price by store and date, along with the hierarchies 7 | > on store and date. 8 | 9 | -------------------------------- 10 | 11 | ```sql 12 | SELECT store_id, city, state, country, 13 | date, month, quarter, year, 14 | sum(number), sum(price) 15 | FROM sales, store, date 16 | WHERE sales.store_id = store.store_id AND 17 | sales.date = date.date 18 | GROUP BY ROLLUP(country, state, city, store_id), 19 | ROLLUP(year, quarter, month, date) 20 | ``` -------------------------------------------------------------------------------- /Ch14_Indexing/14.10.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 10 3 | title: '14.10' 4 | --- 5 | > Suppose you are given a database schema and some queries that are executed frequently. 6 | > How would you use the above information to decide what indices to create? 7 | 8 | -------------------------------- 9 | 10 | Indices on any attributes on which there are selection conditions; if there are only a 11 | few distinct values for that attribute, a bitmap index may be created, otherwise a normal 12 | B+-tree index. 
13 | 14 | B+-tree indices on primary-key and foreign-key attributes. 15 | 16 | Also indices on attributes that are involved in join conditions in the queries. 17 | -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.11.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 11 3 | title: '2.11' 4 | --- 5 | > Consider the _advisor_ relation shown in the schema diagram in Figure 2.9, 6 | > with _s_id_ as the primary key of _advisor_. Suppose a student can have more than one 7 | > advisor. Then, would _s_id_ still be a primary key of the _advisor_ relation? If not, 8 | > what should the primary key of _advisor_ be? 9 | 10 | -------------------------------- 11 | 12 | _s_id_ alone **cannot** be a primary key of the _advisor_ relation, since it 13 | doesn't identify uniquely a tuple in the relation _advisor_ (It is possible 14 | for one student to have many advisors). -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.23.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 23 3 | title: '4.23' 4 | --- 5 | > Explain why, when a manager, say Satoshi, grants an authorization, the grant 6 | > should be done by the manager role, rather than by the user Satoshi. 7 | 8 | -------------------------------- 9 | 10 | Assume the grant is done by the user Satoshi instead of the manager role. Then if Satoshi 11 | leaves the company one day, and the DBA revokes Satoshi's authorization, all of the employees 12 | granted by Satoshi will have their authorization revoked. 13 | 14 | If the DBA doesn't want that, he should have the manager role grant an authorization instead of 15 | the user Satoshi. 
16 | -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.17.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 17 3 | title: '7.17' 4 | --- 5 | > Give an example of a relation schema $R'$ and set $F'$ of functional dependencies 6 | > such that there are at least three distinct lossless decompositions of $R'$ into 7 | > BCNF. 8 | 9 | -------------------------------- 10 | 11 | Given the relation $R' = (A, B, C, D)$ the set of functional dependencies 12 | $F' = A \rightarrow B, C \rightarrow D, B \rightarrow C$ allows three distinct 13 | BCNF decompositions. 14 | 15 | $$R_1 = \{ (A, B), (C, D), (B, C) \}$$ 16 | is in BCNF as is 17 | $$R_2 = \{ (A, B), (C, D), (A, C) \}$$ 18 | $$R_3 = \{ (B, C), (A, D), (A, B) \}$$ -------------------------------------------------------------------------------- /Ch10_Big_Data/10.3.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 3 3 | title: '10.3' 4 | --- 5 | > Suppose you wish to store utility bills for a large number of users, where 6 | > each bill is identified by a customer ID and a date. How would you store the 7 | > bills in a key-value store that supports range queries, if queries request the bills 8 | > of a specified customer for a specified date range. 9 | 10 | -------------------------------- 11 | 12 | Create a key by concatenating the customer ID and date (with date represented 13 | in the form year/month/date, e.g., 2018/02/28) and store the records indexed 14 | on this key. Now the required records can be retrieved by a range query. 
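
A small Python sketch of this key design. The real store is replaced by a hypothetical in-memory `SortedStore` built on `bisect`; the fixed-width, zero-padded key layout in `make_key` is an assumption added here so that lexicographic key order agrees with (customer, date) order.

```python
from bisect import bisect_left, bisect_right, insort


def make_key(customer_id, year, month, day):
    """Zero-padded fields keep string order identical to (customer, date) order."""
    return f"{customer_id:08d}/{year:04d}/{month:02d}/{day:02d}"


class SortedStore:
    """Toy stand-in for a key-value store that supports range queries."""

    def __init__(self):
        self._keys = []     # kept sorted
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            insort(self._keys, key)
        self._values[key] = value

    def range(self, low, high):
        """All (key, value) pairs with low <= key <= high, in key order."""
        lo = bisect_left(self._keys, low)
        hi = bisect_right(self._keys, high)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


store = SortedStore()
store.put(make_key(42, 2018, 1, 15), "bill-jan")
store.put(make_key(42, 2018, 2, 28), "bill-feb")
store.put(make_key(42, 2018, 3, 10), "bill-mar")
store.put(make_key(43, 2018, 2, 1), "other-customer")

february = store.range(make_key(42, 2018, 2, 1), make_key(42, 2018, 2, 28))
print(february)
```

Putting the customer ID first keeps all bills of one customer adjacent, so a date range for that customer is a single contiguous key range.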
-------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.44.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 44 3 | title: '7.44' 4 | --- 5 | > Given two relations $r(A, B, \text{validtime})$ and $s(B, C, \text{validtime})$ where 6 | > $\text{validtime}$ denotes the valid time interval, write an SQL query to compute the 7 | > temporal natural join of the two relations. You can use the && operator to check if 8 | > two intervals overlap and the * operator to compute the intersection of two intervals. 9 | 10 | -------------------------------- 11 | 12 | ```sql 13 | SELECT A, r.B, C, r.validtime * s.validtime 14 | FROM r INNER JOIN s 15 | ON (r.B = s.B 16 | AND 17 | r.validtime && s.validtime) 18 | ``` -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.29.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 29 3 | title: '3.29' 4 | --- 5 | > Using the university schema, write an SQL query to find the name and ID of each History 6 | > student whose name begins with the letter 'D' and who has _not_ taken at least five 7 | > Music courses.
8 | 9 | -------------------------------- 10 | 11 | ```sql 12 | SELECT id,name 13 | FROM student AS s 14 | WHERE dept_name = 'History' 15 | AND name LIKE 'D%' 16 | AND ( 17 | SELECT COUNT(DISTINCT course_id) 18 | FROM takes 19 | WHERE takes.id = s.id AND 20 | course_id IN (SELECT course_id FROM course WHERE dept_name = 'Music') 21 | ) < 5; 22 | ``` -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.7.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 7 3 | title: '6.7' 4 | --- 5 | > A weak entity set can always be made into a strong entity set 6 | > by adding to its attributes the primary-key attributes of its 7 | > identifying entity set. Outline what sort of redundancy will 8 | > result if we do so. 9 | 10 | -------------------------------- 11 | 12 | The primary key of a weak entity set can be inferred from its relationship 13 | with the strong entity set. If we add primary-key attributes to the weak 14 | entity set, they will be present in both the entity set, and the relationship 15 | set and they have to be the same. Hence there will be redundancy. -------------------------------------------------------------------------------- /Ch12_Physical_Storage_Systems/12.4.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 4 3 | title: '12.4' 4 | --- 5 | > Consider the following data and parity-block arrangement on four disks: 6 | > 7 | > 8 | > 9 | > The $B_i$ s represent data blocks; the $P_i$ s represent parity blocks. Parity block 10 | > $P_i$ is the parity block for data blocks $B_{4i-3}$ to $B_{4i}$. What, if any, 11 | > problem might this arrangement present? 12 | 13 | -------------------------------- 14 | 15 | This arrangement has the problem that $P_i$ and $B_{4i-3}$ are on the same disk. 
So if 16 | that disk fails, reconstruction of $B_{4i-3}$ is not possible, since data and parity 17 | are both lost. -------------------------------------------------------------------------------- /Ch14_Indexing/14.14.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 14 3 | title: '14.14' 4 | --- 5 | > Suppose you have a relation containing the $x, y$ coordinates and names of restaurants. 6 | > Suppose also that the only queries that will be asked are of the following form: The query 7 | > specifies a point and asks if there is a restaurant exactly at that point. Which type of index 8 | > would be preferable, R-tree or B-tree? Why? 9 | 10 | -------------------------------- 11 | 12 | B-tree index on the $(x, y)$ coordinates would be preferable. Even though we have spatial data 13 | in two dimensions, the query that we are performing is neither a **range query** nor a **nearest neighbor** query. -------------------------------------------------------------------------------- /Ch08_Complex_Data_Types/8.7.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 7 3 | title: '8.7' 4 | --- 5 | > Compute the relevance (using appropriate definitions of term frequency and 6 | > inverse document frequency) of each of the Practice Exercises in this chapter 7 | > by the query "SQL relation". 8 | 9 | -------------------------------- 10 | 11 | We do not consider the questions containing neither of the keywords because 12 | their relevance to the keywords is zero. The number of words in a question 13 | include stop words. We use the equations given in Section 31.2 to compute 14 | relevance; the $\log$ term in the equation is assumed to be to the base $2$.
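
A hedged Python sketch of the computation. It assumes the definitions $TF(d,t) = \log_2(1 + n(d,t)/n(d))$ and $IDF(t) = 1/n(t)$, with relevance being the sum of $TF \cdot IDF$ over the query terms; these should be checked against the section cited above. The function names and the sample document are made up for illustration.

```python
import math


def term_frequency(term, doc_words):
    """log2(1 + n(d, t) / n(d)); stop words count toward n(d)."""
    return math.log2(1 + doc_words.count(term) / len(doc_words))


def relevance(query_terms, doc_words, docs_containing):
    """Sum of TF * IDF over the query terms, with IDF(t) = 1 / n(t)."""
    return sum(
        term_frequency(t, doc_words) / docs_containing[t]
        for t in query_terms
        if docs_containing.get(t)
    )


# Hypothetical four-word "exercise": two occurrences of sql, one of relation.
doc = ["sql", "relation", "sql", "query"]
counts = {"sql": 2, "relation": 4}  # assumed number of exercises containing each term
score = relevance(["sql", "relation"], doc, counts)
print(score)
```

A document containing neither keyword gets a score of zero, because $\log_2(1 + 0) = 0$, which is why such questions can be skipped.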
15 | 16 | -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.28.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 28 3 | title: '3.28' 4 | --- 5 | > Using the university schema, write an SQL query to find the names and IDs of 6 | > those instructors who teach every course taught in his or her department 7 | > (i.e., every course that appears in the _course_ relation with the instructor's 8 | > department name). Order result by name. 9 | 10 | -------------------------------- 11 | 12 | ```sql 13 | SELECT id, name 14 | FROM instructor AS i 15 | WHERE NOT EXISTS ( 16 | (SELECT course_id FROM course WHERE dept_name = i.dept_name) 17 | EXCEPT 18 | (SELECT course_id FROM teaches WHERE teaches.id = i.id) 19 | ) 20 | ORDER BY name ASC 21 | ``` -------------------------------------------------------------------------------- /Ch15_Query_Processing/15.5.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 5 3 | title: '15.5' 4 | --- 5 | > Let $r$ and $s$ be relations with no indices, and assume that the relations are not sorted. 6 | > Assuming infinite memory, what is the lowest-cost way (in terms of I/O operations) 7 | > to compute $r \bowtie s$? What is the amount of memory required for this algorithm. 8 | 9 | -------------------------------- 10 | 11 | We can store the entire smaller relation in memory, read the larger relation block by block, 12 | and perform nested-loop join using the larger one as the outer relation. The number of I/O operations 13 | is equal to $b_r + b_s$, and the memory requirement is $\min(b_r, b_s) + 2$ pages. 
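
A minimal Python sketch of this strategy, counting block reads. Relations are modelled as lists of blocks and each block as a list of tuples; `in_memory_join` and its arguments are hypothetical names for this illustration.

```python
def in_memory_join(r_blocks, s_blocks, match):
    """Keep the smaller relation resident, stream the larger one block by block."""
    reads = 0
    swap = len(r_blocks) > len(s_blocks)  # True when s is the smaller relation
    inner, outer = (s_blocks, r_blocks) if swap else (r_blocks, s_blocks)

    resident = []
    for block in inner:          # read the smaller relation once
        reads += 1
        resident.extend(block)

    out = []
    for block in outer:          # read the larger relation once
        reads += 1
        for o in block:
            for i in resident:
                r_t, s_t = (o, i) if swap else (i, o)
                if match(r_t, s_t):
                    out.append((r_t, s_t))
    return out, reads


r = [[(1, "a"), (2, "b")], [(3, "c")]]   # b_r = 2 blocks
s = [[(2, "x")], [(3, "y")], [(4, "z")]]  # b_s = 3 blocks
result, reads = in_memory_join(r, s, lambda rt, st: rt[0] == st[0])
print(result, reads)
```

Every block of both relations is read exactly once, which gives the $b_r + b_s$ figure. The resident copy of the smaller relation accounts for the $\min(b_r, b_s)$ pages, and the two extra pages hold the current outer block and the output.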
-------------------------------------------------------------------------------- /Ch01_Introduction/1.5.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 5 3 | title: '1.5' 4 | --- 5 | > Keyword queries used in web search are quite different from database queries. 6 | > List key differences between the two, in terms of the way the queries are 7 | > specified and in terms of what is the result of a query. 8 | 9 | Queries used in the web are specified by providing a list of keywords with no 10 | specific syntax. The result is typically an ordered list of URLs, along with snippets 11 | of information about the content of the URLs. In contrast, database queries have 12 | a specific syntax allowing complex queries to be specified. And in the relational 13 | world the result of query is always a table. -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.3.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 3 3 | title: '2.3' 4 | --- 5 | > Consider the _time_slot_ relation. Given that a particular time slot 6 | > can meet more than once in a week, explain why _day_ and _start_time_ 7 | > are part of the primary key of this relation, while _end_time_ is not. 8 | 9 | The attributes _day_ and _start_time_ are part of the primary key 10 | since a particular class will most likely meet on several different 11 | days and may even meet more than once in a day. However, _end_time_ 12 | is not part of the primary key since a particular class that starts 13 | at a particular time on a particular day cannot end at more than one time. 
14 | -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.33.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 33 3 | title: '3.33' 4 | --- 5 | > Using the university schema, write an SQL query to find the ID and title 6 | > of each course in Comp. Sci. that has had at least one section with afternoon 7 | > hours (i.e., ends at or after 12:00). (You should eliminate duplicates if any.) 8 | 9 | -------------------------------- 10 | 11 | ```sql 12 | SELECT course_id,title 13 | FROM course AS c 14 | WHERE dept_name = 'Comp. Sci.' AND 15 | EXISTS ( 16 | SELECT * 17 | FROM section 18 | WHERE section.course_id = c.course_id AND 19 | time_slot_id IN (SELECT time_slot_id FROM time_slot WHERE end_hr >= 12) 20 | ) 21 | ``` -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.17.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 17 3 | title: '6.17' 4 | --- 5 | > Explain the difference between a weak and a strong entity set. 6 | 7 | -------------------------------- 8 | 9 | A **weak entity set** is one whose existence is dependent on another entity set, 10 | called its **identifying entity set**; instead of associating a primary key with a weak 11 | entity, we use the primary key of the identifying entity, along with extra attributes, 12 | called **discriminator attributes** to uniquely identify a weak entity. 13 | 14 | An entity set that is not a weak entity set is termed a **strong entity set**. 15 | 16 | Learn more in section 6.5.3 of the book. 
-------------------------------------------------------------------------------- /Ch08_Complex_Data_Types/8.15.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 15 3 | title: '8.15' 4 | --- 5 | > The Google search engine provides a feature whereby websites 6 | > can display advertisements supplied by Google. The advertisements 7 | > supplied are based on the contents of the page. Suggest how Google 8 | > might choose which advertisements to supply for a page, given 9 | > the page contents. 10 | 11 | -------------------------------- 12 | 13 | Google might use **data clustering** algorithms to cluster a given webpage. 14 | Possible clusters might look like: [_sports_, _car engines_, _cloth_]. And based 15 | on which cluster a webpage ends up in, it can show ads that are relevant to that 16 | cluster. -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.25.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 25 3 | title: '4.25' 4 | --- 5 | > Suppose a user creates a new relation _r1_ with a foreign key referencing another relation 6 | > _r2_. What authorization privilege does the user need on _r2_? Why should this not simply 7 | > be allowed without any such authorization? 8 | 9 | -------------------------------- 10 | 11 | The **references** privilege needs to be granted to the user on the relation _r2_. This should not 12 | be allowed without any such authorization because, foreign-key constraints restrict deletion and 13 | update operations on the referenced relation. That is, operations such as **update** and **delete** 14 | on _r2_ may bring changes to _r1_ as well. 
15 | -------------------------------------------------------------------------------- /Ch10_Big_Data/10.15.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 15 3 | title: '10.15' 4 | --- 5 | > Suppose a stream can deliver tuples out of order with respect to tuple 6 | > timestamps. What extra information should the stream provide, so a stream 7 | > query processing system can decide when all tuples in a window have been 8 | > seen? 9 | 10 | -------------------------------- 11 | 12 | Such streams should contain **punctuations**, that is, metadata tuples that state 13 | that all future tuples will have a timestamp greater than some value. Such punctuations 14 | are emitted periodically and can be used by window operators to decide when an aggregate 15 | result, such as aggregates for an hourly window, is complete and can be output. -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.35.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 35 3 | title: '3.35' 4 | --- 5 | > Using the university schema, write an SQL query to find section(s) with maximum 6 | > enrollment. The result columns should appear in the order "courseid, secid, year, 7 | > semester, num". (It may be convenient to use the _with_ construct.) 
8 | 9 | -------------------------------- 10 | 11 | ```sql 12 | WITH section_student_frequency(courseid, secid, year, semester, num) AS ( 13 | SELECT course_id, sec_id, year, semester, COUNT(DISTINCT ID) 14 | FROM takes 15 | GROUP BY course_id, sec_id, year, semester 16 | ) 17 | SELECT * 18 | FROM section_student_frequency 19 | WHERE num = (SELECT MAX(num) FROM section_student_frequency); 20 | ``` -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.11.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 11 3 | title: '4.11' 4 | --- 5 | > Operating systems usually offer only two types of authorization control for data 6 | > files: read access and write access. Why do database systems offer so many kinds of 7 | > authorization? 8 | 9 | -------------------------------- 10 | 11 | There are many reasons - we list a few here. One might wish to allow 12 | a user only to append new information without altering old information. 13 | One might wish to allow a user to access a relation but not change its schema. 14 | One might wish to limit access to aspects of the database that are not technically 15 | data access but instead impact resource utilization, such as creating an index. 16 | -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.31.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 31 3 | title: '3.31' 4 | --- 5 | > Using the university schema, write an SQL query to find the ID and name 6 | > of each instructor who has never given an A grade in any course she or 7 | > he has taught. (Instructors who have never taught a course trivially satisfy 8 | > this condition.) 
9 | 10 | -------------------------------- 11 | 12 | ```sql 13 | SELECT id, name 14 | FROM instructor AS i 15 | WHERE NOT EXISTS ( 16 | SELECT * 17 | FROM takes INNER JOIN teaches 18 | ON (takes.course_id,takes.sec_id,takes.semester,takes.year) = 19 | (teaches.course_id,teaches.sec_id,teaches.semester,teaches.year) 20 | WHERE teaches.id = i.id AND takes.grade = 'A' 21 | ) 22 | ``` -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.5.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 5 3 | title: '7.5' 4 | --- 5 | > Use Armstrong's axioms to prove the soundness of the pseudotransitivity rule. 6 | 7 | -------------------------------- 8 | 9 | Proof using Armstrong's axioms of the pseudotransitivity rule: 10 | 11 | $$ 12 | \text{if $\alpha \rightarrow \beta$ and $\gamma\beta \rightarrow \delta$ then $\alpha\gamma \rightarrow \delta$ } 13 | $$ 14 | 15 | Proof: 16 | 17 | $$ 18 | \alpha \rightarrow \beta \quad \text{given} \\ 19 | \alpha\gamma \rightarrow \gamma\beta \quad \text{augmentation rule and set union commutativity} \\ 20 | \gamma\beta \rightarrow \delta \quad \text{given} \\ 21 | \alpha\gamma \rightarrow \delta \quad \text{transitivity rule} \\ 22 | $$ -------------------------------------------------------------------------------- /Ch14_Indexing/14.11.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 11 3 | title: '14.11' 4 | --- 5 | > In write-optimized trees such as the LSM tree or the stepped-merge index, entries in 6 | > one level are merged into the next level only when the level is full. Suggest how this 7 | > policy can be changed to improve read performance during periods when there are many 8 | > reads but no updates.
9 | 10 | -------------------------------- 11 | 12 | If there have been no updates in a while, but there are a lot of index lookups on an 13 | index, then entries at one level, say $i$, can be merged into the next level, even if the 14 | level is not full. The benefit is that reads would then not have to look up indices at level 15 | $i$, reducing the cost of reads. -------------------------------------------------------------------------------- /Ch01_Introduction/1.14.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 14 3 | title: '1.14' 4 | --- 5 | > Explain why NoSQL systems emerged in the 2000s, and briefly contrast their 6 | > features with traditional database systems. 7 | 8 | Why did NoSQL systems emerge in the 2000s? 9 |
10 | -> **The variety of new data-intensive applications** and the **need for rapid development** 11 | led to NoSQL systems. 12 | 13 | Contrast NoSQL systems with traditional database systems 14 |
15 | -> Traditional database systems support **strict data consistency**, while NoSQL systems support 16 | **"eventual consistency"**, which allows distributed copies of data to be inconsistent 17 | as long as they eventually converge in the absence of further updates. 18 | -------------------------------------------------------------------------------- /Ch01_Introduction/1.9.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 9 3 | title: '1.9' 4 | --- 5 | > List five responsibilities of a database-management system. For each responsibility, 6 | > explain the problems that would arise if the responsibility were not discharged. 7 | 8 | 1. Security - A DBMS has the concept of a role (user), which makes access management easier 9 | to set up. Without it, unauthorized users could read or modify data. 10 | 11 | 2. Atomicity - If atomicity is not provided, a failure partway through a transaction can leave 12 | the data in an inconsistent state. 13 | 14 | 3. A simple and efficient way to query data - Without it, every new query would need its own application program. 15 | 16 | 4. Durability - Once an update or an insert has happened it must 17 | be persisted; otherwise committed changes could be lost after a crash. 18 | 19 | 5. Protection against concurrent-access anomalies - Without it, concurrent updates could interfere and leave the data inconsistent. -------------------------------------------------------------------------------- /Ch14_Indexing/14.15.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 15 3 | title: '14.15' 4 | --- 5 | > Suppose you have a spatial database that supports region queries with circular 6 | > regions, but not nearest-neighbor queries. Describe an algorithm to find the 7 | > nearest neighbor by making use of multiple region queries. 8 | 9 | -------------------------------- 10 | 11 | Start with a region of very small radius, and retry with a larger radius if the 12 | region does not contain any result. For example, each time the radius could be increased by 13 | a factor of (say) 1.5.
The benefit is that since we do not use a very large radius 14 | compared to the minimum radius required, there will (hopefully!) not be too many points 15 | in the circular range query result. -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.19.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 19 3 | title: '4.19' 4 | --- 5 | > Under what circumstances would the query 6 | > 7 | > ```sql 8 | >SELECT * 9 | >FROM student NATURAL FULL OUTER JOIN takes 10 | > NATURAL FULL OUTER JOIN course 11 | > ``` 12 | > 13 | > include tuples with null values for the _title_ attribute? 14 | 15 | -------------------------------- 16 | 17 | 1. If the course already had a **null** value for the _title_ attribute. 18 | This is a valid circumstance since there is no **NOT NULL** domain constraint 19 | in the schema of _course_ relation. 20 | 21 | 2. If a student takes a course that is not given by his/her department. (Recall that 22 | _dept_name_ attribute appears in both _course_ relation and _student_ relation). 23 | -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.26.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 26 3 | title: '4.26' 4 | --- 5 | > Explain the difference between integrity constraints and authorization constraints. 6 | 7 | -------------------------------- 8 | 9 | **Integrity constraints** ensure that changes made to the database by authorized users 10 | do not result in a loss of data consistency. That is, they guard us against accidental 11 | damage to the database. 12 |
13 | Example: 14 | * Domain constraints 15 | * Unique constraints 16 | * Referential Integrity constraints 17 | 18 | **Authorization constraints** guard against access to the database by unauthorized users. 19 |
20 | Example: 21 | * Authorization to read data 22 | * Authorization to insert new data 23 | * Authorization to delete data -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.43.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 43 3 | title: '7.43' 4 | --- 5 | > Although SQL does not support functional dependency constraints, if the 6 | > database system supports constraints on materialized views, and materialized 7 | > views are maintained immediately, it is possible to enforce functional 8 | > dependency constraints in SQL. Given a relation $r(A,B,C)$, explain 9 | > how constraints on materialized views can be used to enforce the functional 10 | > dependency $B \rightarrow C$. 11 | 12 | -------------------------------- 13 | 14 | We will create a materialized view of the following SQL query: 15 | 16 | ```sql 17 | SELECT B, COUNT(DISTINCT C) AS X 18 | FROM r 19 | GROUP BY B 20 | ``` 21 | 22 | with the added constraint that $X \leq 1$. -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.7.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 7 3 | title: '3.7' 4 | --- 5 | > Consider the SQL query 6 | ```sql 7 | SELECT p.a1 8 | FROM p, r1, r2 9 | WHERE p.a1 = r1.a1 OR p.a1 = r2.a1 10 | ``` 11 | > Under what conditions does the preceding query select values of _p.a1_ that are 12 | > either in _r1_ or in _r2_ ? Examine carefully the cases where either _r1_ or _r2_ 13 | > may be empty. 14 | 15 | -------------------------------- 16 | 17 | The query selects those values of _p.a1_ that are equal to some value of _r1.a1_ or 18 | _r2.a1_ if and only if both _r1_ and _r2_ are non-empty. If one or both of _r1_ and _r2_ 19 | are empty, the Cartesian product of _p_, _r1_ and _r2_ is empty, hence the result of the 20 | query is empty. 
If _p_ itself is empty, the result is empty. -------------------------------------------------------------------------------- /Ch10_Big_Data/10.16.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 16 3 | title: '10.16' 4 | --- 5 | > Explain how multiple operations can be executed on a stream using a 6 | > publish-subscribe system such as Apache Kafka. 7 | 8 | -------------------------------- 9 | 10 | Each data source is assigned a unique topic name; the output of each operator is 11 | also assigned a unique topic name. Each operator subscribes to the topics of 12 | its inputs and publishes to the topics corresponding to its output. Data sources publish 13 | to their associated topic, while data sinks subscribe to the topics of the operators 14 | whose output goes to the sink. 15 | 16 | To execute multiple operations, say _Op1_ and _Op2_, _Op2_ subscribes to the output topic 17 | of _Op1_. And _Op1_ publishes to its output topic. -------------------------------------------------------------------------------- /Ch12_Physical_Storage_Systems/12.8.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 8 3 | title: '12.8' 4 | --- 5 | > List the physical storage media available on the computers you use 6 | > routinely. Give the speed with which data can be accessed on each medium. 7 | 8 | -------------------------------- 9 | 10 | Physical Storage Media | Speed | 11 | -----------------------|---------------------| 12 | Cache (L1) | 1 TB/second | 13 | Main Memory | 100 GB/second | 14 | Flash Memory (SSD) | 2 to 3 GB/second | 15 | Magnetic-disk storage | 50 to 200 MB/second | 16 | 17 | See [here](https://www.intel.com/content/www/us/en/developer/articles/technical/memory-performance-in-a-nutshell.html#:~:text=Similar%20to%20MCDRAM-,100%20GB/second,-Main%20memory%20on). 
-------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.12.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 12 3 | title: '4.12' 4 | --- 5 | > Suppose a user wants to grant **select** access on a relation to another user. 6 | > Why should the user include (or not include) the clause **granted by current role** 7 | > in the **grant** statement? 8 | 9 | -------------------------------- 10 | 11 | Both cases give the same authorization at the time the statement is executed, but 12 | the long-term effects differ. If the grant is done based on the role, then the grant 13 | remains in effect even if the user who performed the grant leaves and that user's 14 | account is terminated. Whether that is a good or bad idea depends on the 15 | specific situation, but usually granting through a role is more consistent with a well-run 16 | enterprise. 17 | -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.1.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 1 3 | title: '7.1' 4 | --- 5 | > Suppose that we decompose the schema $R = (A, B, C, D, E)$ into 6 | $$ 7 | (A, B, C) \\ 8 | (A, D, E) 9 | $$ 10 | > Show that this decomposition is a lossless decomposition if the following 11 | > set $F$ of functional dependencies holds: 12 | $$ 13 | A \rightarrow BC \\ 14 | CD \rightarrow E \\ 15 | B \rightarrow D \\ 16 | E \rightarrow A \\ 17 | $$ 18 | 19 | -------------------------------- 20 | 21 | A decomposition $ \{ R_1, R_2 \}$ is a lossless decomposition if $R_1 \cap R_2 \rightarrow R_1$ or 22 | $R_1 \cap R_2 \rightarrow R_2$. Let $R_1 = (A, B, C)$, $R_2 = (A, D, E)$, and $R_1 \cap R_2 = A$. 23 | Since $A$ is a candidate key (see Practice Exercise 7.6), $R_1 \cap R_2 \rightarrow R_1$. 
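
The claim that $A$ is a candidate key can be checked directly by computing the attribute closure $A^+$ under $F$. A short worked derivation, using only the dependencies given in the exercise:

$$
A^+ = \{A\} \quad \text{start} \\
A^+ = \{A, B, C\} \quad \text{using } A \rightarrow BC \\
A^+ = \{A, B, C, D\} \quad \text{using } B \rightarrow D \\
A^+ = \{A, B, C, D, E\} \quad \text{using } CD \rightarrow E \\
$$

Since $A^+$ contains every attribute of $R$, in particular $A \rightarrow ABC$ holds, which is exactly $R_1 \cap R_2 \rightarrow R_1$.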
-------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.1.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 1 3 | title: '4.1' 4 | --- 5 | > Consider the following SQL query that seeks to find a list of titles of all courses 6 | > taught in Spring 2017 along with the name of the instructor. 7 | 8 | ```sql 9 | SELECT name, title 10 | FROM instructor NATURAL JOIN teaches NATURAL JOIN section NATURAL JOIN course 11 | WHERE semester = 'Spring' AND year = 2017; 12 | ``` 13 | 14 | > What is wrong with this query? 15 | 16 | -------------------------------- 17 | 18 | Although the query is syntactically correct, it does not compute the expected answer 19 | because _dept_name_ is an attribute of both _course_ and _instructor_. As a result of the 20 | natural join, results are shown only when an instructor teaches a course in her or his 21 | own department. -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.13.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 13 3 | title: '5.13' 4 | --- 5 | > Suppose you were asked to define a class `MetaDisplay` in Java, containing a 6 | > method `static void printTable(String r)`; the method takes a relation name _r_ as 7 | > input, executes the query "**SELECT * FROM r**", and prints the result out in tabular 8 | > format, with the attribute names displayed in the header of the table. 9 | > 10 | > a. What do you need to know about relation _r_ to be able to print the result in the 11 | > specified tabular format?
12 | > b. What JDBC method(s) can get you the required information?
13 | > c. Write the method `printTable(String r)` using the JDBC API.
14 | 15 | -------------------------------- 16 | 17 | ```java 18 | // TODO #2 19 | ``` 20 | 21 | -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.6.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 6 3 | title: '2.6' 4 | --- 5 | > Consider the employee database of Figure 2.17. Give an expression in the relational 6 | > algebra to express each of the following queries:
7 | > a. Find the name of each employee who lives in city "Miami".
8 | > b. Find the name of each employee whose salary is greater than \$100000.
9 | > c. Find the name of each employee who lives in "Miami" and whose salary is 10 | > greater than \$100000.
11 | 12 | a. $\Pi_{person\_name}(\sigma_{city = "Miami"}(employee))$
13 | b. $\Pi_{person\_name}(\sigma_{salary > 100000}(employee \bowtie works))$
14 | c. $\Pi_{person\_name}(\sigma_{salary > 100000 \wedge city = "Miami"}(employee \bowtie works))$
15 | 16 | -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.7.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 7 3 | title: '5.7' 4 | --- 5 | > Consider the bank database of Figure 5.21. Write an SQL trigger to carry out the following 6 | > action: On **delete** of an account, for each customer-owner of the account, check if the 7 | > owner has any remaining accounts, and if she does not, delete her from the _depositor_ 8 | > relation. 9 | 10 | 11 | 12 | -------------------------------- 13 | 14 | ```sql 15 | CREATE TRIGGER check_delete_trigger 16 | AFTER DELETE ON account 17 | REFERENCING OLD ROW AS orow 18 | FOR EACH ROW BEGIN ATOMIC 19 | DELETE FROM depositor 20 | WHERE depositor.customer_name NOT IN ( 21 | SELECT customer_name 22 | FROM depositor 23 | WHERE account_number <> orow.account_number 24 | ); 25 | END 26 | ``` 27 | 28 | -------------------------------------------------------------------------------- /Ch09_Application_Development/9.26.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 26 3 | title: '9.26' 4 | --- 5 | > Explain what is a challenge-response system for authentication. Why is 6 | > it more secure than a traditional password-based system? 7 | 8 | -------------------------------- 9 | 10 | The following is taken from [Wikipedia](https://en.wikipedia.org/wiki/Challenge%E2%80%93response_authentication). 11 | 12 | **Challenge-response authentication** is a family of protocols in which one party presents 13 | a question ("challenge") and another party must provide a valid answer ("response") to be 14 | authenticated. 15 | 16 | The simplest example of a challenge-response protocol is password authentication, where the 17 | challenge is asking for the password and the valid response is the correct password.
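
Stronger challenge-response schemes are more secure than sending a password because the server issues a fresh random challenge for each login, and the client returns a value computed from that challenge and the shared secret. The secret itself never crosses the network, and a captured response is useless for a later login. The sketch below illustrates the idea under some assumptions: HMAC-SHA256 is used as the keyed function (a choice made here for illustration, not something the exercise prescribes), and the class and method names (`ChallengeResponseSketch`, `newChallenge`, `respond`, `verify`) are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative sketch only; names and the choice of HMAC-SHA256 are assumptions.
final class ChallengeResponseSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Server side: generate a fresh random challenge for each login attempt.
    static byte[] newChallenge() {
        byte[] challenge = new byte[16];
        RANDOM.nextBytes(challenge);
        return challenge;
    }

    // Client side: prove knowledge of the secret without transmitting it.
    static byte[] respond(byte[] sharedSecret, byte[] challenge) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(sharedSecret, "HmacSHA256"));
            return mac.doFinal(challenge);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Server side: recompute the expected response and compare in constant time.
    static boolean verify(byte[] sharedSecret, byte[] challenge, byte[] response) {
        return MessageDigest.isEqual(respond(sharedSecret, challenge), response);
    }
}
```

A response computed for one challenge should fail verification against any later challenge, which is what defeats replay of an intercepted login.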
18 | 19 | -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.3.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 3 3 | title: '6.3' 4 | --- 5 | > Design an E-R diagram for keeping track of the 6 | > scoring statistics of your favorite sports team. 7 | > You should store **the matches played**, **the scores 8 | > in each match**, **the players in each match**, and 9 | > **individual player scoring statistics for each match**. 10 | > Summary statistics should be modeled as derived 11 | > attributes with an explanation as to how they 12 | > are computed. 13 | 14 | -------------------------------- 15 | 16 | The diagram is shown in Figure 6.104. The derived 17 | attribute _season_score_ is computed by summing 18 | the score values associated with the _player_ 19 | entity set via the _played_ relationship set. 20 | 21 | -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.17.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 17 3 | title: '2.17' 4 | --- 5 | > Discuss the relative merits of imperative, functional, and declarative languages. 6 | 7 | -------------------------------- 8 | 9 | Merits of imperative languages 10 | * Easy to read 11 | * Conceptual model (solution path) is very easy for beginners to understand. 12 | * Characteristics of specific applications can be taken into account. 13 | [for more](https://www.ionos.com/digitalguide/websites/web-development/imperative-programming/) 14 | 15 | Merits of functional languages 16 | * Lazy Evaluation 17 | * Seamless Parallel Programming 18 | [for more](https://en.wikipedia.org/wiki/Functional_programming) 19 | 20 | Merits of declarative languages 21 | * Easy to use (you state only what you need, not how to compute it).
-------------------------------------------------------------------------------- /Ch08_Complex_Data_Types/8.10.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 10 3 | title: '8.10' 4 | --- 5 | > Redesign the database of Exercise 8.4 into first normal form and fourth normal form. 6 | > List any functional and multivalued dependencies that you assume. Also list all 7 | > referential-integrity constraints that should be present in the first and fourth 8 | > normal form schemas. 9 | 10 | -------------------------------- 11 | 12 | _Emp = (emp_id, ename)_
13 | _Children = (child_id, name, birthday)_
14 | _EmpChild = (emp_id, child_id)_
15 | 16 | _Skills = (type)_
17 | _Exams = (year, city)_
18 | _SkillExams = (type,year, city)_
19 | 20 | _SkillExamsEmp = (emp_id,type,year, city)_
21 | 22 | -------------------------------------------------------------------------------- /Ch10_Big_Data/10.5.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 5 3 | title: '10.5' 4 | --- 5 | > What is the conceptual problem with the following snippet of Apache Spark 6 | > code meant to work on very large data. Note that the **collect()** function 7 | > returns a Java collection, and Java collections (from Java 8 onwards) support 8 | > map and reduce functions. 9 | > 10 | > ```java 11 | > JavaRDD lines = sc.textFile('logDirectory'); 12 | > int totalLength = lines.collect().map(s -> s.length()) 13 | > .reduce(0, (a,b) -> a + b); 14 | > ``` 15 | 16 | -------------------------------- 17 | 18 | The problem with the code is that the **collect()** function gathers the RDD 19 | data at a single node, and the map and reduce functions are then executed on that 20 | single node, not in parallel as intended. -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.4.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 4 3 | title: '5.4' 4 | --- 5 | > Describe the circumstances in which you would choose to use embedded SQL 6 | > rather than SQL alone or only a general-purpose programming language. 7 | 8 | -------------------------------- 9 | 10 | Writing queries in SQL is typically much easier than coding the same queries 11 | in a general-purpose programming language. However, not all kinds of queries 12 | can be written in SQL. Also, nondeclarative actions such as printing a report, 13 | interacting with a user, or sending the results of a query to a GUI cannot be 14 | done from within SQL. Under circumstances in which we want the best of both worlds, 15 | we can choose embedded SQL or dynamic SQL, rather than using SQL alone or using 16 | only a general-purpose programming language. 
17 | 18 | -------------------------------------------------------------------------------- /Ch14_Indexing/14.12.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 12 3 | title: '14.12' 4 | --- 5 | > What trade offs do buffer trees pose as compared to LSM trees? 6 | 7 | -------------------------------- 8 | 9 | The idea of buffer trees can be used with any tree-structured index to reduce the cost of 10 | inserts and updates, including spatial indices. In contrast, LSM trees can only be used with 11 | linearly ordered data that are amenable to merging. On the other hand, buffer trees require 12 | more random I/O to perform insert operations as compared to (all variants of) LSM trees. 13 | 14 | Write-optimized indices can significantly reduce the cost of inserts, and to a lesser extent, 15 | of updates, as compared to B+-trees. On the other hand, the index lookup cost can be significantly 16 | higher for write-optimized indices as compared to B+-trees. -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.2.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 2 3 | title: '5.2' 4 | --- 5 | > Write a Java method using JDBC metadata features that takes a _ResultSet_ 6 | > as an input parameter and prints out the result in tabular form, with appropriate 7 | > names as column headings. 
8 | 9 | -------------------------------- 10 | 11 | ```java 12 | import java.sql.ResultSet; 13 | import java.sql.ResultSetMetaData; 14 | import java.sql.SQLException; 15 | 16 | // The wrapper class name is arbitrary; only printTable matters. 17 | class ResultSetPrinter { 18 | // Prints the column names as a header row, then one tab-separated line per tuple. 19 | static void printTable(ResultSet result) throws SQLException { 20 | ResultSetMetaData metadata = result.getMetaData(); 21 | int num_cols = metadata.getColumnCount(); 22 | // JDBC column indices start at 1, not 0. 23 | for (int i = 1; i <= num_cols; i++) { 24 | System.out.print(metadata.getColumnName(i) + '\t'); 25 | } 26 | System.out.println(); 27 | while (result.next()) { 28 | for (int i = 1; i <= num_cols; i++) { 29 | System.out.print(result.getString(i) + '\t'); 30 | } 31 | System.out.println(); 32 | } 33 | } 34 | } 35 | ``` -------------------------------------------------------------------------------- /Ch13_Data_Storage_Structures/13.2.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 2 3 | title: '13.2' 4 | --- 5 | > Show the structure of the file of Figure 13.4 after each of the following steps: 6 | > 7 | > a. Insert(24556, Turnamian, Finance, 98000). 8 | > 9 | > b. Delete record 2. 10 | > 11 | > c. Insert (34556, Thompson, Music, 67000). 12 | 13 | -------------------------------- 14 | We use "$\uparrow i$" to denote a pointer to record "$i$". 15 | 16 | > a. Insert(24556, Turnamian, Finance, 98000). 17 | 18 | 19 | 20 | > b. Delete record 2. 21 | 22 | 23 | 24 | 25 | Note that the free record chain could have alternatively been from the header to 26 | 4, from 4 to 2, and finally from 2 to 6. 27 | 28 | > c. Insert (34556, Thompson, Music, 67000). 29 | 30 | 31 | -------------------------------------------------------------------------------- /Ch12_Physical_Storage_Systems/12.7.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 7 3 | title: '12.7' 4 | --- 5 | > Storing all blocks of a large file on consecutive disk blocks would 6 | > minimize seeks during sequential file reads. Why is it impractical to do so? 7 | > What do operating systems do instead, to minimize the number of seeks during 8 | > sequential reads?
9 | 10 | -------------------------------- 11 | 12 | Reading data sequentially from a large file could be done with only one seek 13 | if the entire file was stored on consecutive disk blocks. Ensuring availability 14 | of large numbers of consecutive free blocks is not easy, since files are created 15 | and deleted, resulting in fragmentation of the free blocks on disks. Operating systems 16 | allocate blocks on large but fixed-sized sequential extents instead, and only one seek 17 | is required per extent. -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.1.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 1 3 | title: '5.1' 4 | --- 5 | > Consider the following relations for a company database: 6 | > * _emp(ename, dname, salary)_ 7 | > * _mgr(ename,mname)_
8 | > 9 | > and the Java code in Figure 5.20, which uses the JDBC API. Assume that the 10 | > userid, password, machine name, etc. are all okay. Describe in concise English 11 | > what the Java program does. (That is, produce an English sentence like 12 | > "It finds the manager of the toy department," not a line-by-line description 13 | > of what each Java statement does.) 14 | 15 | -------------------------------- 16 | 17 | It prints out the manager of "dog", that manager's manager, etc., until we 18 | reach a manager who has no manager (presumably, the CEO, who most certainly is 19 | a cat). Note: If you try to run this, use your own Oracle ID and password. 20 | -------------------------------------------------------------------------------- /Ch12_Physical_Storage_Systems/12.5.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 5 3 | title: '12.5' 4 | --- 5 | > A database administrator can choose how many disks are organized into 6 | > a single RAID 5 array. What are the trade-offs between having fewer 7 | > disks versus more disks, in terms of cost, reliability, performance during 8 | > failure, and performance during rebuild? 9 | 10 | -------------------------------- 11 | 12 | An array with fewer disks has a higher cost per unit of usable storage, since one disk's worth of parity is spread over fewer data disks, but with more disks, the chance of two disk 13 | failures, which would lead to data loss, is higher. Further, with more disks, performance 14 | during failure would be poorer, since a block read from a failed disk would result 15 | in a large number of block reads from the other disks. Similarly, the overhead 16 | for rebuilding the failed disk would also be higher, since more disks 17 | need to be read to reconstruct the data in the failed disk.
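
A rough back-of-the-envelope model makes these trade-offs concrete. It assumes (as a simplification, not something stated in the exercise) an array of $n$ identical disks holding one disk's worth of parity, independent disk failures, and a small probability $p$ that any given disk fails during the rebuild window:

$$
\text{fraction of raw capacity used for parity} = \frac{1}{n} \\
\text{block reads needed to serve one read from the failed disk} = n - 1 \\
\text{surviving disks read in full during rebuild} = n - 1 \\
\text{probability of a second failure during rebuild} \approx (n - 1)\,p \\
$$

For $n = 4$, a quarter of the raw capacity goes to parity and a degraded read touches 3 disks. For $n = 10$, the parity share drops to a tenth, but a degraded read touches 9 disks and the exposure to a second failure is roughly three times as large.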
-------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.41.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 41 3 | title: '7.41' 4 | --- 5 | > Explain why 4NF is a normal form more desirable than BCNF. 6 | 7 | -------------------------------- 8 | 9 | 4NF is more desirable because it removes redundancy that BCNF doesn't. 10 | 11 | To quote an example from the book presented at the beginning of Section 7.6, 12 | take the relational schema:- 13 | 14 | _r2(ID,dept_name,street,city)_ 15 | 16 | We must repeat the department name once for each address that an instructor has, 17 | and we must repeat the address for each department with which an instructor is 18 | associated. Even though we have this redundancy, the relation _r2_ is still in 19 | BCNF, though it is **NOT** in 4NF. 20 | 21 | 4NF removes this redundancy, by decomposing _r2_ into the schemas:- 22 | 23 | _(ID, dept_name)_
24 | _(ID, street, city)_ -------------------------------------------------------------------------------- /Ch12_Physical_Storage_Systems/12.2.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 2 3 | title: '12.2' 4 | --- 5 | > Some databases use magnetic disks in a way that only sectors in outer tracks 6 | > are used, while sectors in inner tracks are left unused. What might be the benefits 7 | > of doing so? 8 | 9 | -------------------------------- 10 | 11 | The disk's data-transfer rate will be greater on the outer tracks than the inner tracks. 12 | This is because the disk spins at a constant rate, so more sectors pass underneath the 13 | drive head in a given amount of time when the arm is positioned on an outer track than when 14 | on an inner track. Even more importantly, by using only outer tracks, the disk arm movement is 15 | minimized, reducing the disk access latency. This aspect is important for transaction-processing 16 | systems, where latency affects the transaction-processing rate. -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.15.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 15 3 | title: '7.15' 4 | --- 5 | > The algorithm to generate a canonical cover only removes one extraneous 6 | > attribute at a time. Use the functional dependencies from Exercise 7.14 to 7 | > show what can go wrong if two attributes inferred to be extraneous are deleted 8 | > at once. 9 | 10 | -------------------------------- 11 | 12 | In $X \rightarrow YZ$, one can infer that $Y$ is extraneous, and so is $Z$. But 13 | deleting both will result in a set of dependencies from which $X \rightarrow YZ$ 14 | can no longer be inferred. Deleting $Y$ results in $Z$ no longer being extraneous, 15 | and deleting $Z$ results in $Y$ no longer being extraneous. 
The canonical cover 16 | algorithm only deletes one attribute at a time, avoiding the problem that could occur 17 | if two attributes are deleted at the same time. -------------------------------------------------------------------------------- /Ch01_Introduction/1.13.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 13 3 | title: '1.13' 4 | --- 5 | > List two features developed in the 2000s and that help database systems handle 6 | > data-analytics workloads. 7 | 8 | 1) Database systems featured physical data organizations suitable for analytic 9 | processing, such as "column-stores," in which tables are stored by column 10 | rather than the traditional row-oriented storage of the major commercial 11 | database systems. 12 | 13 | 2) The huge volumes of data, as well as the fact that much of the data used 14 | for analytics was textual or semi-structured, led to the development of programming 15 | frameworks, such as _map-reduce_, to facilitate application programmers' 16 | use of parallelism in analyzing data. 17 | 18 | [column-stores](https://en.wikipedia.org/wiki/Column-oriented_DBMS) 19 | [map-reduce](https://en.wikipedia.org/wiki/MapReduce) 20 | -------------------------------------------------------------------------------- /Ch10_Big_Data/10.11.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 11 3 | title: '10.11' 4 | --- 5 | > One of the characteristics of Big Data is the variety of data. Explain why 6 | > this characteristic has resulted in the need for languages other than 7 | > SQL for processing Big Data. 8 | 9 | -------------------------------- 10 | 11 | While much of today's data can be efficiently represented in relational form, there 12 | are many sources that have other forms of data, such as semi-structured data, textual 13 | data, and graph data. 
The SQL query language is well suited to specifying a variety 14 | of queries on relational data, and it has been extended to handle semi-structured data. 15 | However, many computations cannot be easily expressed in SQL or efficiently evaluated 16 | if represented using SQL. Hence the need for languages other than SQL for processing 17 | these types of data (Big Data). -------------------------------------------------------------------------------- /Ch01_Introduction/1.3.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 3 3 | title: '1.3' 4 | --- 5 | > List six major steps that you would take in setting up a database for a particular enterprise. 6 | 7 | Six major steps in setting up a database for a particular enterprise are: 8 | 9 | * Define the high-level requirements of the enterprise (this step generates a 10 | document known as the system requirements specification). 11 | 12 | * Define a model containing all appropriate types of data and data relationships. 13 | 14 | * Define the integrity constraints on the data. 15 | 16 | * Define the physical level. 17 | 18 | * For each known problem to be solved on a regular basis (e.g., tasks to be 19 | carried out by clerks or web users), define a user interface to carry out the 20 | task, and write the necessary application programs to implement the user interface. 21 | 22 | * Create/initialize the database. -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.17.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 17 3 | title: '4.17' 4 | --- 5 | > Express the following query in SQL using no subqueries and no set operations. 
6 | > 7 | > ```sql 8 | > SELECT id 9 | > FROM student 10 | > 11 | > EXCEPT 12 | > 13 | > SELECT s_id 14 | > FROM advisor 15 | > WHERE i_id IS NOT NULL 16 | > ``` 17 | 18 | -------------------------------- 19 | 20 | The above query returns the ids of students that don't have an advisor. 21 | That is, students whose ids do not appear in the _advisor_ relation, or 22 | whose advisor's id is null. 23 | 24 | ```sql 25 | SELECT s.id 26 | FROM student s LEFT OUTER JOIN advisor a 27 | ON s.id = a.s_id 28 | WHERE a.i_id IS NULL 29 | OR a.s_id IS NULL; 30 | ``` 31 | 32 | Note that we are assuming the primary key of _advisor_ to be _s_id_; that is, each student 33 | can have at most one advisor. -------------------------------------------------------------------------------- /Ch13_Data_Storage_Structures/13.13.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 13 3 | title: '13.13' 4 | --- 5 | > Give a normalized version of the Index_metadata relation, and 6 | > explain why using the normalized version would result in worse performance. 7 | 8 | -------------------------------- 9 | 10 | 11 | 12 | The above schema will result in worse performance, because given an **index_name** 13 | we would have to perform a join with the relation **Index_Attributes** every time 14 | we want to see the attributes that are associated with that index. 15 | 16 | Note that the above schema is in first normal form. Normalizing it further, to BCNF, 17 | will result in an even worse schema in terms of performance. It is not in 18 | BCNF because of the functional dependency index_name $\rightarrow$ relation_name 19 | in the table **Index_Attributes**. 
-------------------------------------------------------------------------------- /Ch10_Big_Data/10.14.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 14 3 | title: '10.14' 4 | --- 5 | > Fill in the blanks below to complete the following Apache Spark 6 | > program which computes the number of occurrences of each word in a file. 7 | > For simplicity we assume that words only occur in lowercase, and there 8 | > are no punctuation marks. 9 | > 10 | > ```java 11 | > JavaRDD<String> textFile = sc.textFile("hdfs://..."); 12 | > JavaPairRDD<String, Integer> counts = textFile.____(s->Arrays.asList(s.split(" "))._____()) 13 | > .mapToPair(word -> new _______).reduceByKey((a,b) -> a + b); 14 | > ``` 15 | 16 | -------------------------------- 17 | 18 | ```java 19 | JavaRDD<String> textFile = sc.textFile("hdfs://..."); 20 | JavaPairRDD<String, Integer> counts = textFile.flatMap(s->Arrays.asList(s.split(" ")).iterator()) 21 | .mapToPair(word -> new Tuple2<>(word, 1)).reduceByKey((a,b) -> a + b); 22 | ``` -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.23.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 23 3 | title: '3.23' 4 | --- 5 | > Consider the query: 6 | 7 | ```sql 8 | WITH dept_total(dept_name, value) AS ( 9 | SELECT dept_name, SUM(salary) 10 | FROM instructor 11 | GROUP BY dept_name 12 | ), dept_total_avg(value) AS ( 13 | SELECT AVG(value) 14 | FROM dept_total 15 | ) 16 | SELECT dept_name 17 | FROM dept_total, dept_total_avg 18 | WHERE dept_total.value >= dept_total_avg.value 19 | ``` 20 | Rewrite this query without using the **with** construct. 
21 | 22 | -------------------------------- 23 | 24 | ```sql 25 | SELECT dept_name 26 | FROM (SELECT dept_name, SUM(salary) AS value FROM instructor GROUP BY dept_name) AS dept_total, 27 | (SELECT AVG(value) AS value FROM (SELECT dept_name, SUM(salary) AS value FROM instructor GROUP BY dept_name) AS x) AS dept_total_avg 28 | WHERE dept_total.value >= dept_total_avg.value 29 | ``` -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.24.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 24 3 | title: '6.24' 4 | --- 5 | > Design a database for an airline. The database must keep track of customers 6 | > and their reservations, flights and their status, seat assignments on 7 | > individual flights, and the schedule and routing of future flights. 8 | > 9 | > Your design should include an E-R diagram, a set of relational schemas, and 10 | > a list of constraints, including primary-key and foreign-key constraints. 11 | 12 | -------------------------------- 13 | 14 | 15 | 16 | Relation schemas: 17 | 18 | 19 | customer(customer_id, name, phone_number, address)
20 | flights(flight_id,src,dest,timestamp_src,timestamp_dest)
21 | reservation(customer_id,flight_id)
22 |
23 | 24 | 25 |
26 | 27 | 28 | Note: This is a very simplistic design! -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.2.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 2 3 | title: '2.2' 4 | --- 5 | > Consider the foreign-key constraint from the _dept_name_ attribute 6 | > of _instructor_ to the _department_ relation. Give examples of 7 | > inserts and deletes to these relations that can cause a violation 8 | > of the foreign-key constraint. 9 | 10 | * Inserting the tuple
11 | (10111, Ostrom, Economics, 110000)
12 | into the _instructor_ table, when the _department_ table does not 13 | have the department _Economics_, would violate the foreign-key constraint. 14 | (Refer to Figure 2.4 and Figure 2.5 for the instances of the relations _instructor_ 15 | and _department_). 16 | 17 | * Deleting the tuple
18 | (Biology, Watson, 90000)
19 | from the _department_ table, where at least one _student_ or _instructor_ tuple 20 | has _dept_name_ as Biology, would violate the foreign-key constraint. -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.12.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 12 3 | title: '6.12' 4 | --- 5 | > Consider the following lattice structure of generalization and specialization (attributes 6 | > are not shown). 7 | 8 | 9 | 10 | > For entity sets $A$, $B$, and $C$, explain how attributes are inherited from the higher 11 | > level entity sets $X$ and $Y$. Discuss how to handle a case where an attribute of $X$ 12 | > has the same name as some attribute of $Y$. 13 | 14 | -------------------------------- 15 | 16 | $A$ inherits all the attributes of $X$, plus it may define its own attributes. Similarly, $C$ 17 | inherits all the attributes of $Y$ plus its own attributes. $B$ inherits the attributes of both 18 | $X$ and $Y$. If there is some attribute _name_ which belongs to both $X$ and $Y$, it may be 19 | referred to in $B$ by the qualified name $X$._name_ or $Y$._name_. -------------------------------------------------------------------------------- /Ch14_Indexing/14.7.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 7 3 | title: '14.7' 4 | --- 5 | > What would the occupancy of each leaf node of a B+-tree be if index entries 6 | > were inserted in sorted order? Explain why. 7 | 8 | -------------------------------- 9 | 10 | If the index entries are inserted in ascending order, the new entries get directed 11 | to the last leaf node. When this leaf node gets filled, it is split into two. Of the 12 | two nodes generated by the split, the left node is left untouched and the insertions 13 | take place on the right node. 
This makes the occupancy of the leaf nodes about 50 percent 14 | except for the last leaf. 15 | 16 | If keys that are inserted are sorted in descending order, the above situation would 17 | still occur, but symmetrically, with the right node of a split never getting touched again, 18 | and occupancy would again be 50 percent for all nodes other than the first leaf. -------------------------------------------------------------------------------- /Ch12_Physical_Storage_Systems/12.11.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 11 3 | title: '12.11' 4 | --- 5 | > RAID systems typically allow you to replace failed disks without stopping 6 | > access to the system. Thus, the data in the failed disk must be rebuilt 7 | > and written to the replacement disk while the system is in operation. 8 | > Which of the RAID levels yields the least amount of interference 9 | > between the rebuild and ongoing disk accesses? Explain your answer. 10 | 11 | -------------------------------- 12 | 13 | **RAID level 1** (Mirroring Disks). 14 | 15 | The time to rebuild the data of a failed disk can be significant, and it 16 | varies with the RAID level that is used. Rebuilding is easiest for **RAID 17 | level 1**, since data can be copied from another disk; for the other levels 18 | (excluding RAID level 0), we need to access all the other disks in the 19 | array to rebuild data of a failed disk. 
-------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.14.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 14 3 | title: '4.14' 4 | --- 5 | > Consider the query 6 | > ```sql 7 | > SELECT course_id,semester,year,sec_id,AVG(tot_cred) 8 | > FROM takes NATURAL JOIN student 9 | > WHERE year = 2017 10 | > GROUP BY course_id,semester,year,sec_id 11 | > HAVING COUNT(id) >= 2; 12 | > ``` 13 | > Explain why appending **NATURAL JOIN** _section_ in the **FROM** clause would not change the 14 | > result. 15 | 16 | -------------------------------- 17 | 18 | Appending **NATURAL JOIN** _section_ in the **FROM** clause would not change the 19 | result because there is no logic in the query that uses attributes in _section_ that 20 | are not found in the _takes_ relation. Also, since every tuple in the _takes_ relation has a corresponding 21 | tuple in the _section_ relation (due to the foreign-key constraint), there will be no change if we append 22 | the **NATURAL JOIN** _section_ in the **FROM** clause. -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.35.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 35 3 | title: '7.35' 4 | --- 5 | > Although the BCNF algorithm ensures that the resulting decomposition 6 | > is lossless, it is possible to have a schema and a decomposition that 7 | > was not generated by the algorithm, that is in BCNF, and is not 8 | > lossless. Give an example of such a schema and its decomposition. 9 | 10 | -------------------------------- 11 | 12 | Take the schema $R = (A,B,C,D,E)$ given in Practice Exercise 7.1. 13 | Assume the following set $F$ of functional dependencies holds: 14 | 15 | $$ 16 | F := \{A \rightarrow BC, CD \rightarrow E, B \rightarrow D, E \rightarrow A\} 17 | $$ 18 | 19 | Take the decomposition of $R$ given in Exercise 7.29. 
20 | 21 | $$ 22 | (A,B,C) \\ 23 | (C,D,E) \\ 24 | $$ 25 | 26 | The above two decompositions are clearly in BCNF. 27 | 28 | They are also **lossy decompositions** (We have shown this in Exercise 7.29). -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.25.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 25 3 | title: '6.25' 4 | --- 5 | > In Section 6.9.4, we represented a ternary relationship (repeated in Figure 6 | > 6.29a) using binary relationships, as shown in Figure 6.29b. Consider the 7 | > alternative shown in Figure 6.29c. Discuss the relative merits of these 8 | > two alternative representations of a ternary relationship by binary relationships. 9 | 10 | 11 | 12 | 13 | -------------------------------- 14 | 15 | For Figure 6.29b to represent a ternary relationship, all relationship sets 16 | $R_A$, $R_B$, and $R_C$ need to be many-to-one relationships, with a total 17 | participation from entity set $E$. 18 | 19 | Let $(a,b,c)$ in $R$ (the ternary relationship in the above figure). Then we insert 20 | 21 | * $(a,b)$ in $R_{AB}$ 22 | * $(b,c)$ in $R_{BC}$ 23 | * $(c,a)$ in $R_{AC}$ 24 | 25 | TODO: This answer needs more explanation. -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.15.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 15 3 | title: '2.15' 4 | --- 5 | > Consider the bank database of Figure 2.18. Give an expression in the 6 | > relational algebra for each of the following queries: 7 | > 8 | > a. Find each loan number with a loan amount greater than $10000.
9 | > b. Find the ID of each depositor who has an account with a balance 10 | > greater than $6000.
11 | > c. Find the ID of each depositor who has an account with a balance 12 | > greater than $6000 at the "Uptown" branch.
13 | 14 | -------------------------------- 15 | 16 | a. $\Pi_{loan\_number}(\sigma_{amount > 10000}(loan))$
17 | b. $\Pi_{ID}(depositor \bowtie_{depositor.account\_number = account.account\_number} (\sigma_{balance > 6000}(account)))$
18 | c. $\Pi_{ID}(depositor \bowtie_{depositor.account\_number = account.account\_number} (\sigma_{balance > 6000 \wedge branch\_name = "Uptown"}(account) ))$
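
One way to sanity-check the three expressions above is to run their SQL equivalents on a tiny instance. The sketch below uses Python's built-in `sqlite3` module; the column names follow the bank schema as used in the expressions, but the column types and the sample rows are made-up assumptions for illustration. `DISTINCT` plays the role of the projection $\Pi$, which removes duplicates.

```python
import sqlite3

# Hypothetical toy instance of part of the bank database (rows are made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE loan(loan_number TEXT PRIMARY KEY, branch_name TEXT, amount INTEGER);
CREATE TABLE account(account_number TEXT PRIMARY KEY, branch_name TEXT, balance INTEGER);
CREATE TABLE depositor(ID TEXT, account_number TEXT);
INSERT INTO loan VALUES ('L-1', 'Uptown', 15000), ('L-2', 'Downtown', 9000);
INSERT INTO account VALUES ('A-1', 'Uptown', 7000), ('A-2', 'Downtown', 8000),
                           ('A-3', 'Uptown', 500);
INSERT INTO depositor VALUES ('c1', 'A-1'), ('c2', 'A-2'), ('c3', 'A-3');
""")


def run(sql: str) -> list:
    """Return the first column of the query result as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


# a. Selection on amount, then projection on loan_number.
query_a = run("SELECT DISTINCT loan_number FROM loan WHERE amount > 10000")

# b. Select accounts with balance > 6000, join with depositor, project ID.
query_b = run("""
    SELECT DISTINCT d.ID
    FROM depositor d JOIN account a ON d.account_number = a.account_number
    WHERE a.balance > 6000
""")

# c. Same as b, with the extra selection on branch_name.
query_c = run("""
    SELECT DISTINCT d.ID
    FROM depositor d JOIN account a ON d.account_number = a.account_number
    WHERE a.balance > 6000 AND a.branch_name = 'Uptown'
""")

print(query_a, query_b, query_c)
```

On this toy instance, only loan `L-1` exceeds 10000, depositors `c1` and `c2` hold accounts above 6000, and only `c1` does so at the Uptown branch.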
-------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.10.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 10 3 | title: '4.10' 4 | --- 5 | > Given the relations _a(name,address,title)_ and _b(name,address,salary)_, show 6 | > how to express _a_ **natural full outer join** _b_ using the **full outer-join** 7 | > operation with an **on** condition rather than using the **natural join** syntax. 8 | > This can be done using the **coalesce** operation. Make sure that the result relation 9 | > does not contain two copies of the attributes _name_ and _address_ and that the solution 10 | > is correct even if some tuples in _a_ and _b_ have null values for attributes 11 | > _name_ or _address_. 12 | 13 | -------------------------------- 14 | 15 | ```sql 16 | SELECT COALESCE(a.name, b.name) AS name, 17 | COALESCE(a.address, b.address) AS address, 18 | a.title, 19 | b.salary 20 | FROM a FULL OUTER JOIN b 21 | ON a.name = b.name AND 22 | a.address = b.address; 23 | ``` 24 | 25 | 26 | -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.2.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 2 3 | title: '7.2' 4 | --- 5 | > List all nontrivial functional dependencies satisfied by the relation 6 | > of Figure 7.17. 7 | 8 | 9 | 10 | -------------------------------- 11 | 12 | The nontrivial functional dependencies are $A \rightarrow B$ and $C \rightarrow B$, 13 | and a dependency they logically imply: $AC \rightarrow B$. $C$ does not functionally 14 | determine $A$ because the first and third tuples have the same $C$ value but different 15 | $A$ values. The same tuples also show $B$ does not functionally determine $A$. Likewise, 16 | $A$ does not functionally determine $C$ because the first two tuples have the same $A$ 17 | value and different $C$ values. 
The same tuples also show $B$ does not functionally 18 | determine $C$. There are 19 trivial functional dependencies of the form $\alpha \rightarrow \beta$, 19 | where $\beta \subseteq \alpha$. -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.29.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 29 3 | title: '7.29' 4 | --- 5 | > Show that the following decomposition of the schema $R$ of Exercise 7.1 is not a lossless 6 | > decomposition: 7 | > 8 | > $$ 9 | > (A,B,C) \\ 10 | > (C,D,E) \\ 11 | > $$ 12 | > 13 | > _Hint_: Give an example of a relation $r(R)$ such that $\Pi_{A,B,C}(r) \bowtie \Pi_{C,D,E}(r) \not = r$ 14 | 15 | -------------------------------- 16 | 17 | Take the following instance of $r(R)$:- 18 | 19 | A|B|C|D|E 20 | -|-|-|-|- 21 | 1|6|5|7|3 22 | 2|8|5|9|4 23 | 24 | Then $\Pi_{A,B,C}(r)$ is:- 25 | 26 | A|B|C 27 | -|-|- 28 | 1|6|5 29 | 2|8|5 30 | 31 | $\Pi_{C,D,E}(r)$ is:- 32 | 33 | C|D|E 34 | -|-|- 35 | 5|7|3 36 | 5|9|4 37 | 38 | And their natural join $\Pi_{A,B,C}(r) \bowtie \Pi_{C,D,E}(r)$ is:- 39 | 40 | A|B|C|D|E 41 | -|-|-|-|- 42 | 1|6|5|7|3 43 | 1|6|5|9|4 44 | 2|8|5|7|3 45 | 2|8|5|9|4 46 | 47 | Thus, the decomposition is a **lossy decomposition**. -------------------------------------------------------------------------------- /Ch10_Big_Data/10.9.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 9 3 | title: '10.9' 4 | --- 5 | > Suppose you wish to model the university schema as a graph. For each of the 6 | > following relations, explain whether the relation would be modeled as a node 7 | > or as an edge:
8 | > (i) _student_
9 | > (ii) _instructor_
10 | > (iii) _course_
11 | > (iv) _section_
12 | > (v) _takes_
13 | > (vi) _teaches_
14 | > Does the model capture connections between sections and courses? 15 | -------------------------------- 16 | 17 | Each relation corresponding to an entity (student, instructor, course, section) would be 18 | modeled as a node. _Takes_ and _teaches_ would be modeled as edges. There is a further 19 | edge between _course_ and _section_, which has been merged into the _section_ relation 20 | and cannot be captured with the above schema. It can be modeled if we create a separate 21 | relation that links sections to courses. -------------------------------------------------------------------------------- /Ch09_Application_Development/9.6.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 6 3 | title: '9.6' 4 | --- 5 | > List three ways in which caching can be used to speed up web server 6 | > performance. 7 | 8 | -------------------------------- 9 | 10 | Caching can be used to improve performance by exploiting the commonalities 11 | between transactions. 12 | 13 | a. If the application code for servicing each request needs to open a 14 | connection to the database, which is time consuming, then a pool of 15 | open connections may be created beforehand, and each request uses one 16 | from the pool. 17 | 18 | b. The results of a query generated by a request can be cached. If the 19 | same request comes again, or generates the same query, then the cached 20 | result can be used instead of connecting to the database again. 21 | 22 | c. The final web page generated in response to a request can be cached. 23 | If the same request comes again, then the cached page can be output. 
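
The three caching levels in the 9.6 answer (connection pool, query-result cache, page cache) can be sketched together in a few lines. This is a minimal illustration, not a production design: `ConnectionPool`, `cached_query`, `render_page`, and the `/sum` URL are hypothetical names, and an in-memory SQLite database stands in for a real database server.

```python
import queue
import sqlite3
from functools import lru_cache


class ConnectionPool:
    """(a) A pool of connections opened once and reused by every request."""

    def __init__(self, size: int):
        self._free = queue.Queue()
        for _ in range(size):
            self._free.put(sqlite3.connect(":memory:"))

    def acquire(self) -> sqlite3.Connection:
        return self._free.get()

    def release(self, conn: sqlite3.Connection) -> None:
        self._free.put(conn)


pool = ConnectionPool(size=2)
db_hits = 0  # counts how often the database is actually queried


@lru_cache(maxsize=128)
def cached_query(sql: str) -> tuple:
    """(b) Query-result cache: identical SQL text is answered from memory."""
    global db_hits
    db_hits += 1
    conn = pool.acquire()
    try:
        return tuple(conn.execute(sql).fetchall())
    finally:
        pool.release(conn)


page_cache = {}


def render_page(url: str) -> str:
    """(c) Page cache: the finished page is stored per request URL."""
    if url not in page_cache:
        rows = cached_query("SELECT 1 + 1")
        page_cache[url] = f"<p>result: {rows[0][0]}</p>"
    return page_cache[url]


first = render_page("/sum")
second = render_page("/sum")  # served from page_cache, no database access
print(first, db_hits)
```

A real system would also need invalidation: cached query results and pages become stale when the underlying data changes, which this sketch ignores.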
-------------------------------------------------------------------------------- /Ch08_Complex_Data_Types/8.11.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 11 3 | title: '8.11' 4 | --- 5 | > Consider the schemas for the table _people_, and the tables _students_ 6 | > and _teachers_, which were created under _people_, in Section 8.2.1.3. 7 | > Give a relational schema in third normal form that represents the same 8 | > information. Recall the constraints on subtables, and give all constraints 9 | > that must be imposed on the relational schema so that every database instance 10 | > of the relational schema can also be represented by an instance of the schema 11 | > with inheritance. 12 | 13 | -------------------------------- 14 | 15 | Take the schema: 16 | 17 | _tpeople(ID, name, address, degree, salary)_ 18 | 19 | with the added constraint that at least one attribute of {_degree_ , _salary_ } is NULL. 20 | 21 | The functional dependency that holds on _tpeople_:- 22 | 23 | $\{$ _ID_ $\} \rightarrow $ $\{$ _name, address, degree, salary_ $\}$ -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.10.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 10 3 | title: '6.10' 4 | --- 5 | > Consider a many-to-one relationship $R$ between entity sets $A$ and $B$. 6 | > Suppose the relation created from $R$ is combined with the relation created 7 | > from $A$. In SQL, attributes participating in a foreign key constraint can be null. 8 | > Explain how a constraint on total participation of $A$ in $R$ can be enforced using 9 | > **NOT NULL** constraint in SQL. 10 | 11 | -------------------------------- 12 | 13 | The foreign-key attribute in $R$ corresponding to the primary key of $B$ should be 14 | made **NOT NULL**. 
This ensures that no tuple of $A$ which is not related to any entry 15 | in $B$ under $R$ can appear in $R$. For example, say **a** is a tuple in $A$ which has no 16 | corresponding entry in $R$. This means when $R$ is combined with $A$, it would have 17 | a foreign-key attribute corresponding to $B$ as **NULL**, which is not allowed. -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.4.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 4 3 | title: '6.4' 4 | --- 5 | > Consider an E-R diagram in which the same entity 6 | > set appears several times, with its attributes 7 | > repeated in more than one occurrence. Why is 8 | > allowing this redundancy a bad practice that 9 | > one should avoid? 10 | 11 | -------------------------------- 12 | 13 | An entity set would typically appear more than 14 | once when one is drawing a diagram that spans multiple 15 | pages. 16 | 17 | The different occurrences of an entity set may have 18 | different sets of attributes, leading to an inconsistent 19 | diagram. Instead, the attributes of an entity set should be 20 | specified only once. All other occurrences of the entity should 21 | omit attributes. Since it is not possible to have 22 | an entity set without any attributes, an occurrence of 23 | an entity set without attributes clearly indicates that the 24 | attributes are specified elsewhere. -------------------------------------------------------------------------------- /Ch14_Indexing/14.24.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 24 3 | title: '14.24' 4 | --- 5 | > An _existence bitmap_ has a bit for each record position, with the bit set to 6 | > $1$ if the record exists, and $0$ if there is no record at that position (for example, 7 | > if the record were deleted). Show how to compute the existence bitmap from other bitmaps. 
8 | > Make sure that your technique works even in the presence of null values by using a bitmap 9 | > for the value _null_. 10 | 11 | -------------------------------- 12 | 13 | Let $r(A, B, C)$ be a relation. Suppose there is a bitmap index on the attribute $A$. 14 | Suppose the attribute $A$ can have 2 distinct values and can also be _null_. Thus the bitmap 15 | index on the attribute $A$ is going to have 3 bitmaps, namely: $S_1, S_2, S_{null}$. 16 | 17 | Thus, the _existence bitmap_ can be computed as follows:- 18 | 19 | $$ 20 | S_1 \vee S_2 \vee S_{null} 21 | $$ 22 | 23 | where $\vee$ denotes the bitwise or operator. -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.23.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 23 3 | title: '5.23' 4 | --- 5 | > Consider the _nyse_ relation of Exercise 5.9. For each month of each year, 6 | > show the total monthly dollar volume and the average monthly dollar volume 7 | > for that month and the two prior months. (Hint: First write a query to find 8 | > the total dollar volume for each month of each year. Once that is right, put 9 | > that in the from clause of the outer query that solves the full problem. That 10 | > outer query will need windowing. The subquery does not.) 
11 | 12 | -------------------------------- 13 | 14 | ```sql 15 | WITH nyse_monthly(year,month,total_monthly_dollar_volume) AS ( 16 | SELECT year,month,SUM(dollar_volume) AS total_monthly_dollar_volume 17 | FROM nyse 18 | GROUP BY year,month 19 | ) 20 | SELECT year,month,total_monthly_dollar_volume, 21 | AVG(total_monthly_dollar_volume) OVER (ORDER BY (year,month) ASC ROWS 2 PRECEDING) 22 | FROM nyse_monthly; 23 | ``` -------------------------------------------------------------------------------- /Ch13_Data_Storage_Structures/13.6.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 6 3 | title: '13.6' 4 | --- 5 | > Suppose your university has a very large number of _takes_ records, accumulated over 6 | > many years. Explain how table partitioning can be done on the _takes_ relation, and 7 | > what benefits it could offer. Explain also one potential drawback of the technique. 8 | 9 | -------------------------------- 10 | 11 | The table can be partitioned on _(year, semester)_. Old _takes_ records that are no 12 | longer accessed frequently can be stored on magnetic disk, while newer records can be stored 13 | on SSD (they could also be stored on _storage-class memory_ like _Intel Optanes_). 14 | Queries that specify a year can be answered without reading records for other 15 | years. 16 | 17 | A drawback is that queries that fetch records corresponding to multiple years will have 18 | a higher overhead, since the records may be partitioned across different relations and 19 | disk blocks. -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.22.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 22 3 | title: '4.22' 4 | --- 5 | > Show how to express the **coalesce** function using the **case** construct. 
6 | 7 | -------------------------------- 8 | 9 | The following is taken from the PostgreSQL 14.1 documentation, Section 9.18.2. 10 | 11 | ``` 12 | COALESCE(value [, ...]) 13 | 14 | The COALESCE function returns the first of its arguments that is not null. 15 | Null is returned only if all arguments are null. 16 | ``` 17 | 18 | If a query is written using the **coalesce** function as:- 19 | 20 | ```sql 21 | SELECT COALESCE( 22 | d1, 23 | d2, 24 | . 25 | . 26 | . 27 | dn 28 | ) FROM r 29 | ``` 30 | 31 | It can be rewritten as:- 32 | 33 | ```sql 34 | SELECT 35 | CASE 36 | WHEN d1 IS NOT NULL THEN d1 37 | WHEN d2 IS NOT NULL THEN d2 38 | . 39 | . 40 | . 41 | WHEN dn IS NOT NULL THEN dn 42 | ELSE NULL 43 | END 44 | FROM r 45 | ``` 46 | -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.8.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 8 3 | title: '6.8' 4 | --- 5 | > Consider a relation such as _sec_course_, generated from a many-to-one relationship 6 | > set _sec_course_. Do the primary and foreign key constraints created on the relation 7 | > enforce the many-to-one cardinality constraint? Explain why. 8 | 9 | -------------------------------- 10 | 11 | In this example, the primary key of _section_ consists of the attributes 12 | _(course_id, sec_id, semester, year)_ which would also be the primary key of 13 | _sec_course_, while _course_id_ is a foreign key from _sec_course_ referencing 14 | _course_. These constraints ensure that a particular _section_ can only 15 | correspond to one _course_, and thus the many-to-one cardinality constraint 16 | is enforced. 17 | 18 | However, these constraints cannot enforce a total participation constraint, since 19 | a course or a section may not participate in the _sec_course_ relationship.
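
The constraints described above could be declared roughly as follows. This is only a sketch: the column types are assumptions, and the tables are cut down to the key attributes needed for the argument.

```sql
-- Simplified, hypothetical DDL; column types are assumed for illustration.
CREATE TABLE course (
    course_id VARCHAR(8) PRIMARY KEY
);

CREATE TABLE section (
    course_id VARCHAR(8),
    sec_id    VARCHAR(8),
    semester  VARCHAR(6),
    year      NUMERIC(4,0),
    PRIMARY KEY (course_id, sec_id, semester, year)
);

CREATE TABLE sec_course (
    course_id VARCHAR(8),
    sec_id    VARCHAR(8),
    semester  VARCHAR(6),
    year      NUMERIC(4,0),
    -- Same key as section: each section appears at most once,
    -- and course_id is part of that key, so it maps to one course.
    PRIMARY KEY (course_id, sec_id, semester, year),
    FOREIGN KEY (course_id, sec_id, semester, year)
        REFERENCES section (course_id, sec_id, semester, year),
    FOREIGN KEY (course_id) REFERENCES course (course_id)
);
```

Nothing in this DDL forces every row of _section_ or _course_ to have a matching row in _sec_course_, which is why total participation is not enforced.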
20 | 21 | -------------------------------------------------------------------------------- /Ch11_Data_Analytics/11.10.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 10 3 | title: '11.10' 4 | --- 5 | > Suppose half of all the transactions in a clothes shop purchase jeans, and one-third 6 | > of all transactions in the shop purchase T-shirts. Suppose also that half of the 7 | > transactions that purchase jeans also purchase T-shirts. Write down all the (nontrivial) 8 | > association rules you can deduce from the above information, giving support and 9 | > confidence of each rule. 10 | 11 | -------------------------------- 12 | Here is what we know: 13 | * 50% of all transactions purchase jeans. 14 | * 33.33% of all transactions purchase T-shirts. 15 | * 50% of all transactions that purchase jeans also purchase T-shirts, so 25% of all transactions purchase both jeans and T-shirts. 16 | 17 | Possible association rules: 18 | 19 | 1. jeans => T-shirts 20 | * **Support**: 25% 21 | * **Confidence**: 50% 22 | 23 | 2. T-shirts => jeans 24 | * **Support**: 25% 25 | * **Confidence**: 75% (25% of all transactions purchase both items, out of the 33.33% that purchase T-shirts) 26 | -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.9.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 9 3 | title: '7.9' 4 | --- 5 | > Given the database schema $R(A,B,C)$, and a relation $r$ on the schema $R$, 6 | > write an SQL query to test whether the functional dependency $B \rightarrow C$ holds 7 | > on relation $r$. Also write an SQL assertion that enforces the functional dependency. 8 | > Assume that no null values are present. (Although part of the SQL standard, such 9 | > assertions are not supported by any database implementation currently.) 10 | 11 | -------------------------------- 12 | 13 | a. The query is given below.
Its result is non-empty if and only if 14 | $B \rightarrow C$ **does not** hold on $r$. 15 | 16 | ```sql 17 | SELECT B 18 | FROM r 19 | GROUP BY B 20 | HAVING COUNT(DISTINCT C) > 1 21 | ``` 22 | 23 | b. 24 | 25 | ```sql 26 | CREATE ASSERTION b_to_c CHECK ( 27 | NOT EXISTS ( 28 | SELECT B 29 | FROM r 30 | GROUP BY B 31 | HAVING COUNT(DISTINCT C) > 1 32 | ) 33 | ) 34 | ``` -------------------------------------------------------------------------------- /Ch11_Data_Analytics/11.7.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 7 3 | title: '11.7' 4 | --- 5 | > Why is column-oriented storage potentially advantageous in a database system 6 | > that supports a data warehouse? 7 | 8 | -------------------------------- 9 | 10 | Column-oriented storage has at least two major benefits over row-oriented storage: 11 | 12 | 1. When a query needs to access only a few attributes of a relation with a large number 13 | of attributes, the remaining attributes need not be fetched from disk into memory. In 14 | contrast, in row-oriented storage, not only are irrelevant attributes fetched into 15 | memory, but they may also get prefetched into processor cache (L1, L2, L3), wasting 16 | cache space and memory bandwidth, if they are stored adjacent to attributes used in the 17 | query. 18 | 19 | 2. Storing values of the same type together increases the effectiveness of compression; 20 | compression can greatly reduce both the disk storage cost and the time to retrieve data 21 | from disk. -------------------------------------------------------------------------------- /Ch09_Application_Development/9.1.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 1 3 | title: '9.1' 4 | --- 5 | > What is the main reason why servlets give better performance than programs 6 | > that use the common gateway interface (CGI), even though Java programs generally 7 | > run slower than C or C++ programs? 
8 | 9 | -------------------------------- 10 | 11 | The CGI interface starts a new process to service each request, which has a 12 | significant operating system overhead. On the other hand, servlets are run as 13 | threads of an existing process, avoiding this overhead. Further, the process 14 | running threads could be the web server process itself, avoiding interprocess 15 | communication, which can be expensive. Thus, for small to moderate-sized tasks, 16 | the overhead of Java is less than the overhead saved by avoiding process 17 | creation and communication. 18 | 19 | 20 | For tasks involving a lot of CPU activity, this may not be the case, and using 21 | CGI with a C or C++ program may give better performance. -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.5.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 5 3 | title: '2.5' 4 | --- 5 | > What is the result of first performing the Cartesian product of _student_ 6 | > and _advisor_, and then performing a selection operation on the result 7 | > with the predicate _s_id_ = ID? (Using the symbolic notation of relational 8 | > algebra, this query can be written as $\sigma_{s\_id = ID}(student \times advisor)$.) 9 | 10 | The result attributes include all attributes of _student_ followed by 11 | all attributes of _advisor_. The tuples in the result are as follows: For each 12 | student who has an advisor, the result has a row containing the student's 13 | attributes, followed by an _s_id_ attribute identical to the student's ID attribute, 14 | followed by the _i_id_ attribute containing the ID of the student's advisor. 15 | 16 | Students who do not have an advisor will not appear in the result. A student who 17 | has more than one advisor will appear a corresponding number of times in the result.
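
For comparison, the same selection over a Cartesian product can be sketched in SQL. This assumes the textbook's university schema, in which _student_ has an _ID_ attribute and _advisor_ has the attributes _s_id_ and _i_id_.

```sql
-- Cartesian product of student and advisor, filtered by s_id = ID.
-- Assumes student(ID, ...) and advisor(s_id, i_id) from the university schema.
SELECT *
FROM student, advisor
WHERE s_id = ID;
```

Because the `WHERE` condition is never true for a student with no matching _advisor_ row, such students are absent from the output, matching the description above.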
-------------------------------------------------------------------------------- /Ch09_Application_Development/9.3.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 3 3 | title: '9.3' 4 | --- 5 | > Consider a carelessly written web application for an online-shopping 6 | > site, which stores the price of each item as a hidden form variable in 7 | > the web page sent to the customer; when the customer submits the form, 8 | > the information from the hidden form variable is used to compute the 9 | > bill for the customer. What is the loophole in this scheme? (There was 10 | > a real instance where the loophole was exploited by some customers of 11 | > an online-shopping site before the problem was detected and fixed.) 12 | 13 | -------------------------------- 14 | 15 | A hacker can edit the HTML source code of the web page, replace the value 16 | of the hidden variable price with another value, and then use the modified web page 17 | to place an order. The web application would then use the user-modified 18 | value as the price of the product. 19 | 20 | Note that the hacker can even set the price of the product to 0. -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.10.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 10 3 | title: '7.10' 4 | --- 5 | > Our discussion of lossless decomposition implicitly assumed that attributes 6 | > on the left-hand side of a functional dependency cannot take on null values. What 7 | > could go wrong on decomposition, if this property is violated? 8 | 9 | -------------------------------- 10 | 11 | The natural join operator is defined in terms of the Cartesian product and the 12 | selection operator. In a selection, any comparison involving a 13 | null value evaluates to _unknown_.
Thus, the natural join excludes all tuples with null values on the 14 | common attributes from the final result. Thus, the decomposition would be 15 | lossy (in a manner different from the usual case of lossy decomposition), if null 16 | values occur in the left-hand side of the functional dependency used to decompose 17 | the relation. (Null values in attributes that occur only in the right-hand side of the 18 | functional dependency do not cause any problems). -------------------------------------------------------------------------------- /Ch15_Query_Processing/15.2.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 2 3 | title: '15.2' 4 | --- 5 | > Consider the bank database of Figure 15.14, where the primary keys are underlined, and the 6 | > following SQL query: 7 | > 8 | > ```sql 9 | > SELECT T.branch_name 10 | > FROM branch T, branch S 11 | > WHERE T.assets > S.assets AND S.branch_city = "Brooklyn" 12 | > ``` 13 | > Write an efficient relational-algebra expression that is equivalent to this query. 14 | > Justify your choice. 15 | > 16 | > 17 | 18 | -------------------------------- 19 | Query: 20 | 21 | $ \Pi_{T.branch\_name} ( (\Pi_{branch\_name, assets}(\rho_T(branch))) \bowtie_{T.assets > S.assets} (\Pi_{assets} (\sigma_{branch\_city = 'Brooklyn'}(\rho_S(branch))))) $ 22 | 23 | This expression performs the theta join on the smallest amount of data possible. It does 24 | this by restricting the right-hand side operand of the join to only those branches in Brooklyn 25 | and also eliminating the unneeded attributes from both the operands. -------------------------------------------------------------------------------- /Ch01_Introduction/1.7.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 7 3 | title: '1.7' 4 | --- 5 | > List four significant differences between a file-processing system and a DBMS. 
6 | 7 | 1) A file-processing system is more specific to the problem at hand while a DBMS 8 | is more general. A file-processing system used by a university is difficult to 9 | use in a hospital setting, while a DBMS, once written, can be used in many different settings. 10 | 11 | 2) It is difficult to ensure atomicity in a conventional file-processing system while 12 | it is a lot easier in a DBMS. In the relational DBMS world, wrapping a set of SQL statements between "BEGIN TRANSACTION" 13 | and "END TRANSACTION" is often enough. 14 | 15 | 3) Protecting against concurrent-access anomalies in a file-processing system is 16 | difficult. A DBMS makes it much easier to protect against such anomalies. 17 | 18 | 4) Most DBMSs have a concept of a user and of what access that user has. Enforcing such 19 | authorization in a file-processing system is really difficult. 20 | -------------------------------------------------------------------------------- /Ch14_Indexing/14.1.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 1 3 | title: '14.1' 4 | --- 5 | > Indices speed query processing, but it is usually a bad idea to create indices on 6 | > every attribute, and every combination of attributes, that are potential search keys. 7 | > Explain why. 8 | 9 | -------------------------------- 10 | 11 | Reasons for not keeping indices on every attribute include: 12 | 13 | * Every index requires additional CPU time and disk I/O overhead during inserts 14 | and deletions. 15 | 16 | * Indices on non-primary keys might have to be changed on updates, although an 17 | index on the primary key might not (this is because updates typically do not modify 18 | the primary-key attributes). 19 | 20 | * Each extra index requires additional storage space. 21 | 22 | * For queries which involve conditions on several search keys, efficiency might not 23 | be bad even if only some of the keys have indices on them.
Therefore, database performance 24 | is improved less by adding indices when many indices already exist. -------------------------------------------------------------------------------- /Ch09_Application_Development/9.10.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 10 3 | title: '9.10' 4 | --- 5 | > Exercise 9.9 addresses the problem of encryption of certain attributes. 6 | > However, some database systems support encryption of entire databases. 7 | > Explain how the problems raised in Exercise 9.9 are avoided if the entire 8 | > database is encrypted. 9 | 10 | -------------------------------- 11 | 12 | When the entire database is encrypted, it is easy for the database to perform 13 | decryption as data are fetched from disk into memory, so in-memory storage is 14 | unencrypted. With this option, everything in the database, including indices, is 15 | encrypted when on disk, but unencrypted in memory. As a result, only the data access 16 | layer of the database system code needs to be modified to perform encryption, leaving 17 | other layers untouched. Thus, indices can be used unchanged, and primary-key and 18 | foreign-key constraints enforced without any change to the corresponding layers of the 19 | database system code. -------------------------------------------------------------------------------- /Ch15_Query_Processing/15.19.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 19 3 | title: '15.19' 4 | --- 5 | > Design a variant of the hybrid merge-join algorithm for the case where both relations 6 | > are not physically sorted, but both have a sorted secondary index on the join attributes. 7 | 8 | -------------------------------- 9 | 10 | Let $r$ and $s$ be the two relations that are unsorted but have an ordered secondary index 11 | (such as B+-trees) on the join attributes. 
We use an algorithm similar to the one given in 12 | Figure 15.7, but instead of acting on the actual relations, it will operate on the leaf nodes 13 | of the indices of $r$ and $s$. The result file contains addresses for tuples of $r$ and addresses 14 | for tuples of $s$. We will first sort the file on the addresses for tuples of $r$ and retrieve the 15 | actual tuples in physical storage order and concatenate them with their counterparts of $s$. We will 16 | then sort this new file on the addresses of $s$ and retrieve them in physical storage order. 17 | This will complete the join. -------------------------------------------------------------------------------- /Ch09_Application_Development/9.19.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 19 3 | title: '9.19' 4 | --- 5 | > Many web sites today provide rich user interfaces using Ajax. List 6 | > two features each of which reveals if a site uses Ajax, without having 7 | > to look at the source code. Using the above features, find three sites 8 | > which use Ajax; you can view the HTML source of the page to check if the 9 | > site is actually using Ajax. 10 | 11 | -------------------------------- 12 | 13 | If a website can load data without refreshing the page, it is using Ajax. 14 | 15 | Two features that reveal if a site uses Ajax: 16 | 1. Infinite scroll (like the one on Twitter) 17 | 2. Maps loading new data when you zoom (like the one on Google Maps) 18 | 19 | Three sites that use Ajax: 20 | 1. [Google Maps](https://www.google.com/maps/) 21 | 2. [Twitter](https://twitter.com/home) 22 | 3. Pretty much every website that uses [Ads](https://www.google.com/adsense/start/). 23 | Since the ads need to be refreshed without refreshing the web page, they need to use Ajax.
-------------------------------------------------------------------------------- /Ch09_Application_Development/9.2.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 2 3 | title: '9.2' 4 | --- 5 | > List some benefits and drawbacks of connectionless protocols over protocols 6 | > that maintain connections. 7 | 8 | -------------------------------- 9 | 10 | Most computers have limits on the number of simultaneous connections they can 11 | accept. With connectionless protocols, connections are broken as soon as the 12 | request is satisfied, and therefore other clients can open connections. Thus 13 | more clients can be served at the same time. A request can be routed to any one 14 | of a number of different servers to balance load, and if a server crashes, 15 | another can take over without the client noticing any problem. 16 | 17 | The drawback of connectionless protocols is that a connection has to be 18 | reestablished every time a request is sent. Also, session information has 19 | to be sent each time in the form of cookies or hidden fields. This makes 20 | them slower than protocols that maintain connections when state information 21 | is required. -------------------------------------------------------------------------------- /Ch14_Indexing/14.26.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 26 3 | title: '14.26' 4 | --- 5 | > Some attributes of relations may contain sensitive data, and may be 6 | > required to be stored in an encrypted fashion. How does data encryption 7 | > affect index schemes? In particular, how might it affect schemes that 8 | > attempt to store data in sorted order? 9 | 10 | -------------------------------- 11 | 12 | Let $r(A, B, C)$ be a relation. Suppose attribute $B$ of relation $r$ contains 13 | sensitive data and we want it encrypted. Suppose also we want to create an index 14 | on attribute $B$.
15 | 16 | The setup above leaves several questions open: 17 | 1. What kind of encryption algorithm are we going to use? 18 | 2. Are we going to create the index on attribute $B$ before it is encrypted or after? 19 | That is, are we going to index on the plain text values of attribute $B$ or the encrypted 20 | values of attribute $B$? 21 | 3. What kind of situation are we in? That is, 22 | what is the attack model? Who are Alice, Bob, and Eve? -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.18.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 18 3 | title: '4.18' 4 | --- 5 | > For the database of Figure 4.12, write a query to find the ID of each employee 6 | > with no manager. Note that an employee may simply have no manager listed or may 7 | > have a _null_ manager. Write your query using an outer join and then write it 8 | > again using no outer join at all. 9 | 10 | -------------------------------- 11 | 12 | Note that the primary key of the _manages_ relation is _id_. Thus each employee can 13 | have at most one manager.
14 | 15 | Using an outer join we can write the query as:- 16 | 17 | ```sql 18 | SELECT e.id 19 | FROM employee e LEFT OUTER JOIN manages m 20 | ON e.id = m.id 21 | WHERE m.id IS NULL 22 | OR m.manager_id IS NULL; 23 | ``` 24 | 25 | Using no outer join we can write the query as:- 26 | 27 | ```sql 28 | SELECT id 29 | FROM employee 30 | WHERE id NOT IN ( 31 | -- this gives us all of the employee ids that have a manager 32 | SELECT id 33 | FROM manages 34 | WHERE manager_id IS NOT NULL 35 | ); 36 | ``` -------------------------------------------------------------------------------- /Ch11_Data_Analytics/11.8.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 8 3 | title: '11.8' 4 | --- 5 | > Consider each of the _takes_ and _teaches_ relations as a fact table; they do not 6 | > have an explicit measure attribute, but assume each table has a measure attribute reg_count 7 | > whose value is always 1. What would the dimension attributes and dimension tables 8 | > be in each case. Would the resultant schemas be star schemas or snowflake schemas? 9 | 10 | -------------------------------- 11 | 12 | Dimension attributes: 13 | 14 | * Dimension attributes of the _takes_ relation: ID, course_id, sec_id, semester, year, grade 15 | * Dimension attributes of the _teaches_ relation: ID, course_id, sec_id, semester, year 16 | 17 | Dimension tables: 18 | 19 | * Dimension tables of the _takes_ relation: student, section, course, department, classroom, time_slot 20 | * Dimension tables of the _teaches_ relation: instructor, section, course, department, classroom, time_slot 21 | 22 | 23 | The resultant schema would be a **snowflake schema** ❄️. 
-------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.4.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 4 3 | title: '7.4' 4 | --- 5 | > Use Armstrong's axioms to prove the soundness of the union rule. (_Hint_: Use the 6 | > augmentation rule to show that, if $\alpha \rightarrow \beta$, then 7 | > $\alpha \rightarrow \alpha \beta$. Apply the augmentation rule again, using 8 | > $\alpha \rightarrow \gamma$, and then apply the transitivity rule.) 9 | 10 | -------------------------------- 11 | 12 | To prove that: 13 | 14 | $$ 15 | \text{if $\alpha \rightarrow \beta$ and $\alpha \rightarrow \gamma$ then $\alpha \rightarrow \beta \gamma$ } 16 | $$ 17 | 18 | Following the hint, we derive: 19 | 20 | $$ \alpha \rightarrow \beta \quad \text{given}\\ 21 | \alpha \alpha \rightarrow \alpha \beta \quad \text{augmentation rule}\\ 22 | \alpha \rightarrow \alpha \beta \quad \text{union of identical sets}\\ 23 | \alpha \rightarrow \gamma \quad \text{given}\\ 24 | \alpha\beta \rightarrow \gamma\beta \quad \text{augmentation rule}\\ 25 | \alpha \rightarrow \beta\gamma \quad \text{transitivity rule and set union commutativity} 26 | $$ -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.13.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 13 3 | title: '4.13' 4 | --- 5 | > Consider a view _v_ whose definition references only relation _r_. 6 | > * If a user is granted **select** authorization on _v_, does that user need to have 7 | > **select** authorization on _r_ as well? Why or why not?
8 | > * If a user is granted **update** authorization on _v_, does that user need to have 9 | > **update** authorization on _r_ as well? Why or why not?
10 | > * Give an example of an **insert** operation on a view _v_ to add a tuple _t_ that is 11 | > not visible in the result of **select * from v**. Explain your answer. 12 | 13 | -------------------------------- 14 | 15 | * No. This allows a user to be granted access to only part of relation _r_. 16 | 17 | * Yes. A valid update issued using view _v_ must update _r_ for the update to be 18 | stored in the database. 19 | 20 | * Any tuple _t_ compatible with the schema for _v_ but not satisfying the **where** 21 | clause in the definition of view _v_ is a valid example. One such example appears in 22 | Section 4.2.4. 23 | -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.3.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 3 3 | title: '7.3' 4 | --- 5 | > Explain how functional dependencies can be used to indicate the following: 6 | > * A one-to-one relationship set exists between entity sets _student_ and _instructor_. 7 | > * A many-to-one relationship set exists between entity sets _student_ and _instructor_. 8 | 9 | 10 | -------------------------------- 11 | 12 | Let $Pk(r)$ denote the primary key of relation $r$. 13 | 14 | * The functional dependencies $Pk(student) \rightarrow Pk(instructor)$ and 15 | $Pk(instructor) \rightarrow Pk(student)$ indicate a one-to-one relationship 16 | because any two tuples with the same value for _student_ must have the same 17 | value for _instructor_, and any two tuples agreeing on _instructor_ must have 18 | the same value for _student_. 19 | 20 | * The functional dependency $Pk(student) \rightarrow Pk(instructor)$ indicates a 21 | many-to-one relationship since any student value which is repeated will have the 22 | same instructor value, but many student values may have the same instructor value.
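
As a rough illustration, the two dependencies can be tested on a stored instance with grouping queries in the style of Exercise 7.9. The sketch assumes the relationship set is stored as the textbook's _advisor(s_id, i_id)_ relation, which is an assumption since the exercise names no relation.

```sql
-- Assumed relation: advisor(s_id, i_id), holding the relationship set.

-- Non-empty result iff Pk(student) -> Pk(instructor) does NOT hold,
-- i.e. some student is related to more than one instructor.
SELECT s_id
FROM advisor
GROUP BY s_id
HAVING COUNT(DISTINCT i_id) > 1;

-- Non-empty result iff Pk(instructor) -> Pk(student) does NOT hold,
-- i.e. some instructor is related to more than one student.
SELECT i_id
FROM advisor
GROUP BY i_id
HAVING COUNT(DISTINCT s_id) > 1;
```

If only the first query is empty, the instance is consistent with a many-to-one relationship from _student_ to _instructor_; if both are empty, it is consistent with a one-to-one relationship.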
23 | -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.21.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 21 3 | title: '6.21' 4 | --- 5 | > Consider the E-R diagram in Figure 6.30, which models an online bookstore. 6 | > 7 | > a. Suppose the bookstore adds Blu-ray discs and downloadable video to its 8 | > collection. The same item may be present in one or both formats, with 9 | > differing prices. Draw the part of the E-R diagram that models this addition, 10 | > show just the parts related to video. 11 | > 12 | > b. Now extend the full E-R diagram to model the case where a shopping basket 13 | > may contain any combination of books, Blu-ray discs, or downloadable video.
14 | 15 | 16 | 17 | -------------------------------- 18 | 19 | a. 20 | 21 | 22 | 23 | Note that `Blu_ray_discs` and `downloadable_videos` are weak entities while 24 | `video_in_bluray` and `video_on_net` are the identifying relationship sets. 25 | `video` is the identifying entity set and owns both of the weak entities. 26 | 27 | b. 28 | 29 | 30 | -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.14.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 14 3 | title: '7.14' 4 | --- 5 | > Show that there can be more than one canonical cover for a given set of functional 6 | > dependencies, using the following set of dependencies: 7 | > $$ 8 | > X \rightarrow YZ, Y \rightarrow XZ, \text{ and } Z \rightarrow XY 9 | > $$ 10 | 11 | -------------------------------- 12 | 13 | Consider the first functional dependency. We can verify that $Z$ is extraneous in 14 | $X \rightarrow YZ$ and delete it. Subsequently, we can similarly check that $X$ is 15 | extraneous in $Y \rightarrow XZ$ and delete it, and that $Y$ is extraneous in 16 | $Z \rightarrow XY$ and delete it, resulting in a canonical cover 17 | $X \rightarrow Y, Y \rightarrow Z, Z \rightarrow X$. 18 | 19 | However, we can also verify that $Y$ is extraneous in $X \rightarrow YZ$ and delete it. 20 | Subsequently, we can similarly check that $Z$ is extraneous in $Y \rightarrow XZ$ and 21 | delete it, and that $X$ is extraneous in $Z \rightarrow XY$ and delete it, resulting 22 | in a canonical cover $X \rightarrow Z, Y \rightarrow X, Z \rightarrow Y$.
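
As a worked check of the first step, here is one way to verify that $Z$ is extraneous in $X \rightarrow YZ$. The name $F'$ is introduced only for this illustration: it is the given set with $Z$ removed from the right-hand side of the first dependency. $Z$ is extraneous if it still lies in the closure of $X$ under $F'$.

$$
F' = \{X \rightarrow Y,\ Y \rightarrow XZ,\ Z \rightarrow XY\}\\
X^+ = \{X\} \quad \text{start}\\
X^+ = \{X, Y\} \quad \text{using } X \rightarrow Y\\
X^+ = \{X, Y, Z\} \quad \text{using } Y \rightarrow XZ
$$

Since $Z \in X^+$ under $F'$, the attribute $Z$ is extraneous in $X \rightarrow YZ$. The other deletions in both derivations can be checked the same way.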
-------------------------------------------------------------------------------- /Ch14_Indexing/14.21.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 21 3 | title: '14.21' 4 | --- 5 | > Suppose you have to create a B+-tree index on a large number of names, where 6 | > the maximum size of a name may be quite large (say 40 characters) and the 7 | > average name is itself large (say 10 characters). Explain how prefix compression 8 | > can be used to maximize the average fanout of nonleaf nodes. 9 | 10 | -------------------------------- 11 | 12 | With **prefix compression**, we do not store the entire search key value at nonleaf 13 | nodes. We only store a prefix of each search key value that is sufficient to distinguish 14 | between the key values in the subtrees that it separates. For example, if we had an 15 | index on names, the key values at a nonleaf node could be a prefix of a name; it may 16 | suffice to store "Silb" at a nonleaf node, instead of the full "Silberschatz" if the 17 | closest values in the two subtrees that it separates are, say, "Silas" and "Silver" 18 | respectively. Obviously "Silb" is going to occupy less space than "Silberschatz", thus 19 | maximizing the average fanout of nonleaf nodes. -------------------------------------------------------------------------------- /Ch09_Application_Development/9.18.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 18 3 | title: '9.18' 4 | --- 5 | > Explain the terms CRUD and REST. 6 | 7 | -------------------------------- 8 | 9 | * **CRUD** stands for **Create, Read, Update,** and **Delete**. They are the 10 | four basic operations of persistent storage. For more info, read the wiki page of CRUD: 11 | [wiki](https://en.wikipedia.org/wiki/Create,_read,_update_and_delete) 12 | 13 | * **REST** stands for **Representational state transfer**. 
It is a software architectural 14 | style that describes a uniform interface between physically separate components, often 15 | across the Internet in a Client-Server architecture. In web development REST allows 16 | content to be rendered when it's requested, often referred to as Dynamic Content. 17 | RESTful Dynamic content uses server-side rendering to generate a website and 18 | send the content to the requesting web browser, which interprets the server's code and 19 | renders the page in the user's web browser. For more info, read the wiki page of REST: 20 | [wiki](https://en.wikipedia.org/wiki/Representational_state_transfer) -------------------------------------------------------------------------------- /Ch14_Indexing/14.3.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 3 3 | title: '14.3' 4 | --- 5 | > Construct a B+-tree for the following set of key values: 6 | > $$ 7 | > (2, 3, 5, 7, 11, 17, 19, 23, 29, 31) 8 | > $$ 9 | > Assume that the tree is initially empty and values are added in 10 | > ascending order. Construct B+-trees for the case where the number 11 | > of pointers that will fit in one node is as follows: 12 | > 13 | > a. Four 14 | > 15 | > b. Six 16 | > 17 | > c. Eight 18 | 19 | -------------------------------- 20 | 21 | The following were generated by inserting values into the B+-tree in ascending 22 | order. A node (other than the root) was never allowed to have fewer than 23 | $\lceil n / 2 \rceil$ values/pointers. 24 | 25 | > a. Four 26 | 27 | 28 | 29 | > b. Six 30 | 31 | 32 | 33 | > c. Eight 34 | 35 | 36 | 37 | TODO: change the above 3 pngs into an animated gif that shows the insertion step by step. 38 | 39 | If you want to play with B+-tree animation, head on over to: [B+ tree animation](https://dichchankinh.com/~galles/visualization/BPlusTree.html). 
-------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Noah Abe 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Ch15_Query_Processing/15.14.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 14 3 | title: '15.14' 4 | --- 5 | > Suggest how a document containing a word (such as "leopard") can be indexed 6 | > such that it is efficiently retrieved by queries using a more general concept 7 | > (such as "carnivore" or "mammal"). You can assume that the concept hierarchy is 8 | > not very deep, so each concept has only a few generalizations (a concept can, 9 | > however, have a large number of specializations). 
You can also assume that you are 10 | > provided with a function that returns the concept for each word in a document. Also 11 | > suggest how a query using a specialized concept can retrieve documents using a more 12 | > general concept. 13 | 14 | -------------------------------- 15 | 16 | By using a **classification hierarchy** in which the general concepts lie in the upper parts 17 | of the hierarchy and the specializations lie in the lower parts of the hierarchy. 18 | 19 | I highly recommend reading **Section 31.9 Directories and Categories** from the 20 | online chapter of the book "[Chapter 31: Information Retrieval](https://www.db-book.com/online-chapters-dir/31.pdf)". It is free. -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.4.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 4 3 | title: '4.4' 4 | --- 5 | > Suppose we have three relations $r(A,B)$, $s(B,C)$, and $t(B,D)$, with all attributes declared 6 | > as **not null**. 7 | >
8 | > a. Give instances of relations $r$,$s$, and $t$ such that in the result of 9 | > ```sql 10 | > (r NATURAL LEFT OUTER JOIN s) NATURAL LEFT OUTER JOIN t 11 | > ``` 12 | > attribute C has a null value but attribute D has a non-null value.
13 | > b. Are there instances of $r$, $s$, and $t$ such that the result of 14 | > ```sql 15 | > r NATURAL LEFT OUTER JOIN (s NATURAL LEFT OUTER JOIN t) 16 | > ``` 17 | > has a null value for $C$ but a non-null value for $D$? Explain why or why not. 18 | 19 | -------------------------------- 20 | 21 | a. Consider $r = (a,b)$, $s = (b1,c1)$, $t = (b,d)$, with $b1 \neq b$. The expression in part (a) would give 22 | $(a,b,null,d)$. 23 | 24 | b. No. Since $s$ **natural left outer join** $t$ is computed first, the absence of nulls in both 25 | $s$ and $t$ implies that each tuple of that intermediate result can have $D$ null, but $C$ can never be null. A tuple of $r$ that matches an intermediate tuple therefore gets a non-null $C$, while an unmatched tuple of $r$ is padded with nulls for both $C$ and $D$; in neither case is $C$ null while $D$ is non-null. 26 | 27 |
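The two answers above can be checked on a concrete instance. The following is a minimal sketch, assuming a DBMS that accepts parenthesized `NATURAL LEFT OUTER JOIN` expressions (PostgreSQL does); the column types and the literal values are arbitrary choices made for illustration.

```sql
-- Single-tuple instances from part (a); VARCHAR(5) is an arbitrary choice.
CREATE TABLE r (A VARCHAR(5) NOT NULL, B VARCHAR(5) NOT NULL);
CREATE TABLE s (B VARCHAR(5) NOT NULL, C VARCHAR(5) NOT NULL);
CREATE TABLE t (B VARCHAR(5) NOT NULL, D VARCHAR(5) NOT NULL);

INSERT INTO r VALUES ('a', 'b');
INSERT INTO s VALUES ('b1', 'c1');
INSERT INTO t VALUES ('b', 'd');

-- Part (a): the r tuple has no partner in s, so C is padded with null;
-- its B value then matches t, so D = 'd'. Result: (a, b, NULL, d).
SELECT A, B, C, D
FROM (r NATURAL LEFT OUTER JOIN s) NATURAL LEFT OUTER JOIN t;

-- Part (b): the r tuple has no partner in (s joined with t), so both
-- C and D are padded with null. Result: (a, b, NULL, NULL).
SELECT A, B, C, D
FROM r NATURAL LEFT OUTER JOIN (s NATURAL LEFT OUTER JOIN t);
```

The two queries return different rows on the same instance, which is the behavior the answers describe.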
28 | 29 | Note: This implies that the operation **natural left outer join** is **NOT** associative. -------------------------------------------------------------------------------- /Ch11_Data_Analytics/11.5.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 5 3 | title: '11.5' 4 | --- 5 | > Classification can be done using _classification rules_, which have a _condition_, 6 | > a _class_, and a _confidence_; the confidence is the percentage of the inputs satisfying 7 | > the condition that fall in the specified class. 8 | > 9 | > For example, a classification rule for credit ratings may have a condition that salary 10 | > is between $30,000 and $50,000, and education level is graduate, with the credit rating 11 | > class of _good_, and a confidence of 80%. A second rule may have a condition that 12 | > salary is between $30,000 and $50,000, and education level is high-school, with the 13 | > credit rating class of _satisfactory_, and a confidence of 80%. A third rule may have 14 | > a condition that salary is above $50,001, with the credit rating class of _excellent_, 15 | > and a confidence of 90%. Show a decision tree classifier corresponding to the above 16 | > rules. 17 | > 18 | > Show how the decision tree classifier can be extended to record the confidence values. 19 | 20 | -------------------------------- 21 | 22 | 23 | -------------------------------------------------------------------------------- /Ch14_Indexing/14.25.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 25 3 | title: '14.25' 4 | --- 5 | > Spatial indices that can index spatial intervals can conceptually be used to index 6 | > temporal data by treating valid time as a time interval. What is the problem 7 | > with doing so, and how is the problem solved?
8 | 9 | -------------------------------- 10 | 11 | The problem with using a spatial index is that the end time of an interval may be infinity 12 | (perhaps represented by a very large value), whereas spatial indices typically assume that 13 | bounding boxes are finite, and may have poor performance if bounding boxes are very large. 14 | This problem can be dealt with as follows: 15 | 16 | * All current tuples (i.e., those with end time as infinity, which is perhaps represented 17 | by a large time value) are stored in a separate index from those tuples that have a non-infinite 18 | end time. The index on current tuples can be a B+-tree index. 19 | 20 | * Those tuples that are not current can be indexed by a spatial index. When performing 21 | a lookup, we look both in the spatial index and our B+-tree index. 22 | 23 | Read more in section 14.10.2 (Indexing Temporal Data). -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.9.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 9 3 | title: '4.9' 4 | --- 5 | > SQL allows a foreign-key dependency to refer to the same relation, as in the following 6 | > example: 7 | > ```sql 8 | > CREATE TABLE manager ( 9 | > employee_id char(20), 10 | > manager_id char(20), 11 | > PRIMARY KEY (employee_id), 12 | > FOREIGN KEY (manager_id) REFERENCES manager (employee_id) 13 | > ON DELETE CASCADE 14 | > ); 15 | > ``` 16 | > Here, _employee_id_ is a key to the table _manager_, meaning that each employee has 17 | > at most one manager. The foreign-key clause requires that every manager also be an employee. 18 | > Explain exactly what happens when a tuple in the relation _manager_ is deleted. 19 | 20 | -------------------------------- 21 | 22 | The tuples of all employees of the manager, at all levels, get deleted as well! 23 | This happens in a series of steps.
The initial deletion will 24 | trigger deletion of all the tuples corresponding to direct 25 | employees of the manager. These deletions will in turn cause 26 | deletions of second-level employee tuples, and so on, till all 27 | direct and indirect employee tuples are deleted. 28 | -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.15.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 15 3 | title: '6.15' 4 | --- 5 | > Construct an E-R diagram for a hospital with a set of patients and a set of medical 6 | > doctors. Associate with each patient a log of the various tests and examinations 7 | > conducted. 8 | 9 | -------------------------------- 10 | 11 | 12 | 13 | `patientTests` is a ternary relationship set. 14 | 15 | Another method is to make `testsAndExaminations` a weak entity set with `Patient` as its identifying 16 | entity set, and then add a relationship set between the weak entity set `testsAndExaminations` 17 | and `MedicalDoctor`, representing which medical doctor performed which test and examination. 18 | In fact, doing that has the added benefit of constraining each entity in `testsAndExaminations` to a 19 | single `Patient`. 20 | 21 | But using a ternary relationship, as depicted in the above diagram, also has its benefits. For example, 22 | if a group of patients are tested and examined by the same type of test and have the same result, we 23 | might associate each of the patients in the group to the same entity in `testsAndExaminations`.
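As a rough illustration of the ternary design, the tables below sketch one possible relational mapping. The diagram's attributes are not reproduced in the text, so every table name, column name, and type here (`patient_id`, `doctor_id`, `test_id`, and so on) is a hypothetical choice.

```sql
-- Hypothetical relational mapping of the ternary relationship set patientTests.
-- All column names and types are assumptions, not taken from the diagram.
CREATE TABLE patient (
    patient_id INTEGER,
    name VARCHAR(50),
    PRIMARY KEY (patient_id)
);

CREATE TABLE medical_doctor (
    doctor_id INTEGER,
    name VARCHAR(50),
    PRIMARY KEY (doctor_id)
);

CREATE TABLE tests_and_examinations (
    test_id INTEGER,
    test_name VARCHAR(50),
    test_date DATE,
    result VARCHAR(100),
    PRIMARY KEY (test_id)
);

-- One row per (patient, doctor, test) association; several patients may
-- refer to the same tests_and_examinations row.
CREATE TABLE patient_tests (
    patient_id INTEGER,
    doctor_id INTEGER,
    test_id INTEGER,
    PRIMARY KEY (patient_id, doctor_id, test_id),
    FOREIGN KEY (patient_id) REFERENCES patient (patient_id),
    FOREIGN KEY (doctor_id) REFERENCES medical_doctor (doctor_id),
    FOREIGN KEY (test_id) REFERENCES tests_and_examinations (test_id)
);
```

Under the weak-entity alternative, `tests_and_examinations` would instead carry `patient_id` as part of its primary key, which is what ties each test to exactly one patient.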
-------------------------------------------------------------------------------- /Ch13_Data_Storage_Structures/13.8.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 8 3 | title: '13.8' 4 | --- 5 | > PostgreSQL normally uses a small buffer, leaving it to the operating system 6 | > buffer manager to manage the rest of main memory available for file system 7 | > buffering. Explain (a) What is the benefit of this approach, and 8 | > (b) one key limitation of this approach. 9 | 10 | -------------------------------- 11 | 12 | The database system does not know what are the memory demands from other processes. 13 | By using a small buffer, PostgreSQL ensures that it does not grab too much of main memory. 14 | But at the same time, even if a block is evicted from the buffer, if the file system buffer 15 | manager has enough memory allocated to it, the evicted page is likely to still be cached 16 | in the file system buffer. Thus, a database buffer miss is often not very expensive 17 | since the block is still in the file system buffer. 18 | 19 | The drawback of this approach is that the database system may not be able to control 20 | the file system buffer replacement policy. Thus, the operating system may make suboptimal 21 | decisions on what to evict from the file system buffer. -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.17.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 17 3 | title: '3.17' 4 | --- 5 | > Consider the employee database of Figure 3.19. Give an expression in SQL for 6 | > each of the following queries. 7 | > 8 | > a. Give all employees for "First Bank Corporation" a 10 percent raise.
9 | > b. Give all managers of "First Bank Corporation" a 10 percent raise.
10 | > c. Delete all tuples in the _works_ relation for employees of "Small Bank Corporation".
11 | 12 | -------------------------------- 13 | 14 | a. Give all employees for "First Bank Corporation" a 10 percent raise. 15 | 16 | ```sql 17 | UPDATE works 18 | SET salary = salary * 1.1 19 | WHERE company_name = 'First Bank Corporation' 20 | ``` 21 | 22 | b. Give all managers of "First Bank Corporation" a 10 percent raise. 23 | 24 | ```sql 25 | UPDATE works 26 | SET salary = salary * 1.1 27 | WHERE company_name = 'First Bank Corporation' AND id IN ( 28 | SELECT manager_id 29 | FROM manages 30 | ) 31 | ``` 32 | 33 | c. Delete all tuples in the _works_ relation for employees of "Small Bank Corporation". 34 | 35 | ```sql 36 | DELETE FROM works 37 | WHERE company_name = 'Small Bank Corporation' 38 | ``` -------------------------------------------------------------------------------- /Ch02_Introduction_to_the_Relational_Model/2.12.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 12 3 | title: '2.12' 4 | --- 5 | > Consider the bank database of Figure 2.18. Assume that branch names 6 | > and customer names uniquely identify branches and customers, but loans 7 | > and accounts can be associated with more than one customer.
8 | > a. What are the appropriate primary keys?
9 | > b. Given your choice of primary keys, identify appropriate foreign keys. 10 | 11 | -------------------------------- 12 | 13 | a. 14 | 15 | |Relation Name|Primary key| 16 | |-------------|-----------| 17 | |branch|branch_name| 18 | |customer|ID| 19 | |loan|loan_number| 20 | |borrower|{ID, loan_number}| 21 | |account|account_number| 22 | |depositor|{ID, account_number}| 23 | 24 | b. 25 | |Relation Name|Foreign key| 26 | |-------------|-----------| 27 | |branch|No Foreign Key| 28 | |customer|No Foreign Key| 29 | |loan|branch_name| 30 | |borrower|**ID** - a foreign key referencing **customer** relation, **loan_number** - a foreign key referencing **loan** relation| 31 | |account|branch_name| 32 | |depositor|**ID** - a foreign key referencing **customer** relation, **account_number** - a foreign key referencing **account** relation| -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.32.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 32 3 | title: '3.32' 4 | --- 5 | > Rewrite the preceding query, but also ensure that you include only instructors 6 | > who have given at least one other non-null grade in some course. 7 | 8 | -------------------------------- 9 | 10 | So the question paraphrased is as follows: write an SQL query to find the ID and name 11 | of each instructor who has never given an A grade in any course she or 12 | he has taught and who has given at least one other non-null grade in some course. 
13 | 14 | 15 | ```sql 16 | SELECT id, name 17 | FROM instructor AS i 18 | WHERE 'A' NOT IN ( 19 | SELECT takes.grade 20 | FROM takes INNER JOIN teaches 21 | ON (takes.course_id,takes.sec_id,takes.semester,takes.year) = 22 | (teaches.course_id,teaches.sec_id,teaches.semester,teaches.year) 23 | WHERE teaches.id = i.id AND takes.grade IS NOT NULL -- skip NULL grades so NOT IN never evaluates to unknown 24 | ) 25 | AND 26 | ( 27 | SELECT COUNT(*) 28 | FROM takes INNER JOIN teaches 29 | ON (takes.course_id,takes.sec_id,takes.semester,takes.year) = 30 | (teaches.course_id,teaches.sec_id,teaches.semester,teaches.year) 31 | WHERE teaches.id = i.id AND takes.grade IS NOT NULL 32 | ) >= 1 33 | ``` -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.23.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 23 3 | title: '7.23' 4 | --- 5 | > Explain what is meant by _repetition of information_ and _inability to represent information_. 6 | > Explain why each of these properties may indicate a bad relational-database design. 7 | 8 | -------------------------------- 9 | 10 | * _repetition of information_ : When inserting data into our database model, if the model 11 | requires us to insert the same information multiple times, then we say our database model 12 | has the _repetition of information_ issue. Note that we may sometimes intentionally want 13 | some information to be repeated for performance reasons. 14 | 15 | * _inability to represent information_ : If the database model was poorly designed or 16 | fails to account for some real-world situations, then the issue of "_inability to represent 17 | information_" may arise.
For example, in our university schema if we removed the 18 | _department_ relation, and instead used the schema 19 | _instructor(ID, name, dept_name, salary)_ to represent both the instructor and the departments 20 | in the university, then our database model would NOT be able to represent a department 21 | having no instructors. -------------------------------------------------------------------------------- /Ch09_Application_Development/9.21.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 21 3 | title: '9.21' 4 | --- 5 | > What is multifactor authentication? How does it help safeguard against stolen passwords? 6 | 7 | -------------------------------- 8 | 9 | **Multi-factor authentication** is an electronic authentication method in which a user 10 | is granted access to a website or application only after successfully presenting two 11 | or more pieces of evidence (or factors) to an authentication mechanism: knowledge 12 | (something only the user knows), possession (something only the user has), and 13 | inherence (something only the user is). 14 | 15 | **MFA** protects user data - which may include personal identification or financial 16 | assets - from being accessed by an unauthorised third party that may have been able to 17 | discover, for example, a single password. 18 | 19 | Note that the factors in MFA should not share a common vulnerability; for example, if 20 | a system merely required two passwords, both could be vulnerable to leakage in the same 21 | manner (by network sniffing, or by a virus on the computer used by the user, for example). 22 | 23 | Read more at [wiki](https://en.wikipedia.org/wiki/Multi-factor_authentication). 
-------------------------------------------------------------------------------- /Ch12_Physical_Storage_Systems/12.10.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 10 3 | title: '12.10' 4 | --- 5 | > Operating systems try to ensure that consecutive blocks of a file are 6 | > stored on consecutive disk blocks. Why is doing so very important 7 | > with magnetic disks? If SSDs were used instead, is doing so still 8 | > important, or is it irrelevant? Explain why. 9 | 10 | -------------------------------- 11 | 12 | Operating systems try to ensure that consecutive blocks of a file are 13 | stored on consecutive disk blocks. This is because, when processing the file 14 | (that is stored on disk), most programs perform a **sequential access pattern**. 15 | To read blocks in sequential access, a disk seek may be required for the 16 | first block, but successive requests would either not require a seek, or require 17 | a seek to an adjacent track, which is faster than a seek to a track that is 18 | farther away. Data transfer rates are highest, when consecutive file blocks are stored 19 | on consecutive disk blocks (since seek time is minimal). 20 | 21 | Storing consecutive blocks of a file on consecutive disk blocks, also improves 22 | performance when using SSDs. This is because of the read-ahead caching that is performed. -------------------------------------------------------------------------------- /Ch09_Application_Development/9.8.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 8 3 | title: '9.8' 4 | --- 5 | > Testing for SQL-injection vulnerability: 6 | > 7 | > a. Suggest an approach for testing an application to find if it 8 | > is vulnerable to SQL injection attacks on text input. 9 | > 10 | > b. Can SQL injection occur with forms of HTML input other than 11 | > text boxes? If so, how would you test for vulnerability? 
12 | 13 | -------------------------------- 14 | 15 | a. One approach is to enter a string containing a single quote in each 16 | of the input text boxes of each of the forms provided by the application 17 | to see if the application correctly saves the value. If it does not save 18 | the value correctly and/or gives an error message, it is vulnerable to 19 | SQL injection. 20 | 21 | b. Yes, SQL injection can even occur with selection inputs such as 22 | drop-down menus, by modifying the value sent back to the server when 23 | the input value is chosen - for example by editing the page directly, or 24 | in the browser's DOM tree. Most modern browsers provide a way for users to 25 | edit the DOM tree. This feature can be used to modify the values sent to the 26 | application, inserting a single quote into the value. -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 |

3 | Cover page of the book database system concepts 7th edition 4 |

5 | 6 |
7 | 8 | # Solutions to **Database System Concepts** _Seventh Edition_ 9 | 10 |
11 | 12 | The solutions to the Practice Exercises are given at the book's [website](https://www.db-book.com/Practice-Exercises/index-solu.html). But I include them here for completeness sake. 13 | 14 | 15 | I try to answer the Exercises. 16 | 17 | ## Usage 18 | 19 | ### Option 1 20 | 21 | Just head on over to [https://dsc-answers.web.app](https://dsc-answers.web.app) 22 | 23 | ### Option 2 24 | 25 | 1. Clone the repo: 26 | ``` 27 | git clone https://github.com/noahabe/database_system_concepts_answers 28 | ``` 29 | 30 | 2. Open the markdown files (`x.md`) in [VS Code (Visual Studio Code)](https://code.visualstudio.com/). 31 | 3. Then press `Ctrl + Shift + V` to see the rendered file. 32 | 33 | ## Contributing 34 | 35 | If you find any mistakes, please create an issue to tell me. 36 | 37 | Pull requests are also appreciated. 38 | -------------------------------------------------------------------------------- /Ch15_Query_Processing/15.8.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 8 3 | title: '15.8' 4 | --- 5 | > Design sort-based and hash-based algorithms for computing the relational 6 | > division operation (see Practice Exercise 2.9 for a definition of the 7 | > division operation). 8 | 9 | -------------------------------- 10 | 11 | From Practice Exercise 2.9 we know that we can express the relational division operation 12 | as follows: 13 | 14 | $$ 15 | r \div s = \Pi_{R-S}(r) - \Pi_{R-S}((\Pi_{R-S}(r) \times s) - \Pi_{R-S, S}(r)) 16 | $$ 17 | 18 | From section 15.6.2 we know how to implement **projection**. 19 | 20 | From section 15.6.3 we know how to implement **set difference**. 21 | 22 | And for the **cartesian product** $\Pi_{R-S}(r) \times s$ we can use **Block Nested-Loop Join** 23 | discussed in section 15.5.2 (except that we don't have any predicate to test.) 
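As a sanity check on the algebraic expression, it can be written almost term by term with SQL set operators. This is only a sketch: the schemas `r(A, B)` and `s(B)` and the sample rows are hypothetical, and the DBMS decides on its own whether sorting or hashing is used for each `EXCEPT` and for duplicate elimination.

```sql
-- Hypothetical schemas: r(A, B) and s(B), so R - S = {A}.
CREATE TABLE r (A INTEGER, B INTEGER);
CREATE TABLE s (B INTEGER);

INSERT INTO r VALUES (1, 10), (1, 20), (2, 10);
INSERT INTO s VALUES (10), (20);

-- r / s: the A values that are paired in r with every B value of s.
SELECT A FROM r                      -- projection of r on R - S
EXCEPT
SELECT missing.A                     -- A values lacking at least one pairing
FROM (
    SELECT ra.A, s.B                 -- every pairing that should exist
    FROM (SELECT DISTINCT A FROM r) AS ra CROSS JOIN s
    EXCEPT
    SELECT r.A, r.B FROM r           -- pairings that actually exist
) AS missing;
```

With the sample rows, `A = 2` is missing the pairing `(2, 20)`, so only `A = 1` remains in the result.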
24 | 25 | Thus to implement $r \div s$ simply use **pipelined evaluation** (see section 15.7.2) on the following expression: 26 | 27 | $$ 28 | \Pi_{R-S}(r) - \Pi_{R-S}((\Pi_{R-S}(r) \times s) - \Pi_{R-S, S}(r)) 29 | $$ 30 | 31 | Note: For an alternative answer to the above you can read page 5 of https://www.db-book.com/Practice-Exercises/PDF-practice-solu-dir/15.pdf 32 | -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.15.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 15 3 | title: '5.15' 4 | --- 5 | > Consider an employee database with two relations: 6 | 7 | > _employee(employee_name, street,city)_
8 | > _works(employee_name,company_name,salary)_ 9 | 10 | > where the primary keys are underlined. Write a function _avg_salary_ 11 | > that takes a company name as an argument and finds the average salary of 12 | > employees at that company. Then, write an SQL statement, using that function, 13 | > to find companies whose employees earn a higher salary, on average, than 14 | > the average salary at "First Bank." 15 | 16 | -------------------------------- 17 | 18 | ```sql 19 | -- The following defines the sql function avg_salary. 20 | -- Takes a company name as an argument and finds the average salary of 21 | -- employees at that company. 22 | CREATE FUNCTION avg_salary(company_name VARCHAR(20)) 23 | RETURNS REAL 24 | BEGIN 25 | DECLARE retval REAL; 26 | SELECT AVG(salary) INTO retval 27 | FROM works 28 | WHERE works.company_name = company_name; 29 | RETURN retval; 30 | END; 31 | 32 | SELECT DISTINCT company_name 33 | FROM works 34 | WHERE avg_salary(company_name) > avg_salary('First Bank'); 35 | ``` 36 | 37 | -------------------------------------------------------------------------------- /Ch08_Complex_Data_Types/8.13.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 13 3 | title: '8.13' 4 | --- 5 | > Suppose you wish to perform keyword querying on a set of tuples in a 6 | > database, where each tuple has only a few attributes, each containing only 7 | > a few words. Does the concept of term frequency make sense in this context? And 8 | > that of inverse document frequency? Explain your answer. Also suggest how 9 | > you can define the similarity of two tuples using TF-IDF concepts. 10 | 11 | -------------------------------- 12 | 13 | Although querying on structured data are typically done using query languages 14 | such as SQL, users who are not familiar with the schema or the query language 15 | find it difficult to get information from such data.
Based on the success of 16 | keyword querying in the context of information retrieval from the web, techniques 17 | have been developed to support keyword queries on structured and semi-structured 18 | data. 19 | 20 | The above paragraph is taken from section 8.3.4 of the book. 21 | 22 | So yes, the concepts of **term frequency** and **inverse document frequency** do make 23 | sense when performing keyword querying on a set of tuples in a database. 24 | 25 | The tuples will be considered as the documents. 26 | 27 | -------------------------------------------------------------------------------- /Ch13_Data_Storage_Structures/13.1.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 1 3 | title: '13.1' 4 | --- 5 | > Consider the deletion of record 5 from the file of Figure 13.3. Compare the relative 6 | > merits of the following techniques for implementing the deletion: 7 | > 8 | > a. Move record 6 to the space occupied by record 5, and move record 7 to the space occupied 9 | > by record 6. 10 | > 11 | > b. Move record 7 to the space occupied by record 5. 12 | > 13 | > c. Mark record 5 as deleted, and move no records. 14 | 15 | -------------------------------- 16 | 17 | a. Although moving record 6 to the space for 5 and moving record 7 to the space for 6 is 18 | the most straightforward approach, it requires moving the most records and involves the most 19 | accesses. 20 | 21 | b. Moving record 7 to the space for 5 moves fewer records (than option **a**) but destroys 22 | any ordering in the file. 23 | 24 | c. Marking the space for 5 as deleted preserves ordering and moves no records, but it requires 25 | additional overhead to keep track of all of the free space in the file. This method may lead 26 | to too many "holes" in the file, which if not compacted from time to time, will affect performance 27 | because of the reduced availability of contiguous free records. 
28 | 29 | -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.3.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 3 3 | title: '3.3' 4 | --- 5 | > Write the following inserts, deletes, or updates in SQL, using the university schema.
6 | > a. Increase the salary of each instructor in the Comp. Sci. department by 10%.
7 | > b. Delete all courses that have never been offered (i.e., do not occur in the _section_ relation).
8 | > c. Insert every student whose _tot_cred_ attribute is greater than 100 as an instructor 9 | > in the same department, with a salary of $10,000.
10 | 11 | -------------------------------- 12 | 13 | a. Increase the salary of each instructor in the Comp. Sci. department by 10%. 14 | 15 | ```sql 16 | UPDATE instructor 17 | SET salary = salary * 1.10 18 | WHERE dept_name = 'Comp. Sci.' 19 | ``` 20 | 21 | b. Delete all courses that have never been offered (i.e., do not occur in the _section_ relation). 22 | 23 | ```sql 24 | DELETE FROM course 25 | WHERE course_id NOT IN (SELECT course_id FROM section) 26 | ``` 27 | 28 | c. Insert every student whose _tot_cred_ attribute is greater than 100 as an instructor 29 | in the same department, with a salary of $10,000. 30 | 31 | ```sql 32 | INSERT INTO instructor 33 | SELECT ID, name, dept_name, 10000 34 | FROM student 35 | WHERE tot_cred > 100 36 | ``` -------------------------------------------------------------------------------- /Ch04_Intermediate_SQL/4.7.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 7 3 | title: '4.7' 4 | --- 5 | > Consider the employee database of Figure 4.12. Give an SQL DDL definition of 6 | > this database. Identify referential-integrity constraints that should hold, and 7 | > include them in the DDL definition. 
8 | 9 | -------------------------------- 10 | 11 | ```sql 12 | CREATE TABLE employee ( 13 | id INTEGER, 14 | person_name VARCHAR(50), 15 | street VARCHAR(50), 16 | city VARCHAR(50), 17 | PRIMARY KEY (id) 18 | ); 19 | 20 | CREATE TABLE company ( 21 | company_name VARCHAR(50), 22 | city VARCHAR(50), 23 | PRIMARY KEY(company_name) 24 | ); 25 | 26 | CREATE TABLE works ( 27 | id INTEGER, 28 | company_name VARCHAR(50), 29 | salary numeric(10,2), 30 | PRIMARY KEY(id), 31 | FOREIGN KEY (id) REFERENCES employee(id), 32 | FOREIGN KEY (company_name) REFERENCES company(company_name) 33 | ); 34 | 35 | CREATE TABLE manages ( 36 | id INTEGER, 37 | manager_id INTEGER, 38 | PRIMARY KEY (id), 39 | FOREIGN KEY (id) REFERENCES employee (id), 40 | FOREIGN KEY (manager_id) REFERENCES employee (id) 41 | ); 42 | ``` 43 | 44 | Note that alternative data types are possible. Other choices for **not null** attributes 45 | may be acceptable. -------------------------------------------------------------------------------- /Ch05_Advanced_SQL/5.16.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 16 3 | title: '5.16' 4 | --- 5 | > Consider the relational schema 6 | 7 | > _part(part_id, name, cost)_
8 | > _subpart(part_id, subpart_id, count)_
9 | 10 | > where the primary-key attributes are underlined. A tuple $(p_1,p_2,3)$ in the 11 | > _subpart_ relation denotes that the part with _part_id_ $p_2$ is a direct 12 | > subpart of the part with _part_id_ $p_1$, and $p_1$ has 3 copies of $p_2$. 13 | > Note that $p_2$ may itself have further subparts. Write a recursive SQL query 14 | > that outputs the names of all subparts of the part with part-id 'P-100'. 15 | 16 | -------------------------------- 17 | 18 | ```sql 19 | WITH RECURSIVE all_sub_parts_of_p100(part_id,name) AS ( 20 | ( 21 | SELECT p.part_id,p.name 22 | FROM part p INNER JOIN subpart s 23 | ON p.part_id = s.subpart_id 24 | WHERE s.part_id = 'P-100' 25 | ) 26 | UNION 27 | ( 28 | SELECT p.part_id,p.name 29 | FROM 30 | part p INNER JOIN subpart s 31 | ON p.part_id = s.subpart_id 32 | INNER JOIN all_sub_parts_of_p100 a 33 | ON s.part_id = a.part_id 34 | ) 35 | ) 36 | SELECT name FROM all_sub_parts_of_p100; 37 | ``` -------------------------------------------------------------------------------- /Ch10_Big_Data/10.6.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 6 3 | title: '10.6' 4 | --- 5 | > Apache Spark: 6 | > 7 | > a. How does Apache Spark perform computations in parallel? 8 | > 9 | > b. Explain the statement: "Apache Spark performs transformations on RDDs in 10 | > a lazy manner." 11 | > 12 | > c. What are some of the benefits of lazy evaluation of operations in Apache Spark? 13 | 14 | -------------------------------- 15 | 16 | > a. How does Apache Spark perform computations in parallel? 17 | 18 | RDDs are stored partitioned across multiple nodes. Each of the transformation 19 | operations on an RDD are executed in parallel on multiple nodes. 20 | 21 | > b. Explain the statement: "Apache Spark performs transformations on RDDs in 22 | > a lazy manner." 
23 | 24 | Transformations are not executed immediately but postponed until the result is 25 | required for functions such as _collect()_ or _saveAsTextFile()_. 26 | 27 | > c. What are some of the benefits of lazy evaluation of operations in Apache Spark? 28 | 29 | The operations are organized into a tree, and query optimization can be applied to 30 | the tree to speed up computation. Also, answers can be pipelined from one operation to 31 | another, without being written to disk, to reduce time overheads of disk storage. 32 | 33 | -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.14.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 14 3 | title: '3.14' 4 | --- 5 | > Consider the insurance database of Figure 3.17, where the primary keys are underlined. 6 | > Construct the following SQL queries for this relational database. 7 | > 8 | > a. Find the number of accidents involving a car belonging to a person named "John Smith".
9 | > b. Update the damage amount for the car with license_plate "AABB2000" in the accident with report number 10 | > "AR2197" to $3000. 11 | 12 | -------------------------------- 13 | 14 | a. Find the number of accidents involving a car belonging to a person named "John Smith". 15 | 16 | ```sql 17 | WITH all_cars_owned_by_john_smith(license_plate) AS ( 18 | SELECT license_plate 19 | FROM person INNER JOIN owns ON person.driver_id = owns.driver_id 20 | WHERE person.name = 'John Smith' 21 | ) 22 | SELECT COUNT(DISTINCT report_number) 23 | FROM participated 24 | WHERE license_plate IN (SELECT license_plate FROM all_cars_owned_by_john_smith); 25 | ``` 26 | 27 | b. Update the damage amount for the car with license_plate "AABB2000" in the accident with report number 28 | "AR2197" to $3000. 29 | 30 | ```sql 31 | UPDATE participated 32 | SET damage_amount = 3000 33 | WHERE report_number = 'AR2197' AND license_plate = 'AABB2000'; 34 | ``` -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.31.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 31 3 | title: '7.31' 4 | --- 5 | > Consider the schema $R = (A,B,C,D,E,G)$ and the set $F$ of functional dependencies: 6 | > $$ 7 | > AB \rightarrow CD \\ 8 | > B \rightarrow D \\ 9 | > DE \rightarrow B \\ 10 | > DEG \rightarrow AB \\ 11 | > AC \rightarrow DE \\ 12 | > $$ 13 | > 14 | > $R$ is not in BCNF for many reasons, one of which arises from the functional 15 | > dependency $AB \rightarrow CD$. Explain why $AB \rightarrow CD$ shows that 16 | > $R$ is not in BCNF and then use the BCNF decomposition algorithm starting with 17 | > $AB \rightarrow CD$ to generate a BCNF decomposition of $R$. Once that is done, 18 | > determine whether your result is or is not dependency preserving, and explain 19 | > your reasoning. 
20 | 21 | -------------------------------- 22 | 23 | $AB \rightarrow CD$ is **NOT** a trivial functional dependency. 24 | 25 | Also, $(AB)^+ = \{A, B, C, D, E\}$. That is, $AB$ is **NOT** a superkey. 26 | 27 | Thus the relation $R$ is **NOT** in BCNF. 28 | 29 | BCNF decomposition of $R$:- 30 | 31 | $$ 32 | \{ \{A,B,G\}, \{A,B,E\}, \{A,B,C\}, \{B,D\} \} 33 | $$ 34 | 35 | Clearly the above decomposition is NOT dependency preserving since the functional 36 | dependency $DE \rightarrow B$ cannot be tested by using only one relation. -------------------------------------------------------------------------------- /Ch03_Introduction_to_SQL/3.18.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 18 3 | title: '3.18' 4 | --- 5 | > Give an SQL schema definition for the employee database of Figure 3.19. 6 | > Choose an appropriate domain for each attribute and an appropriate primary 7 | > key for each relation schema. Include any foreign-key constraints that might be 8 | > appropriate. 
9 | 10 | -------------------------------- 11 | 12 | ```sql 13 | CREATE TABLE employee ( 14 | id VARCHAR(8), 15 | person_name VARCHAR(30) NOT NULL, 16 | street VARCHAR(40), 17 | city VARCHAR(30), 18 | PRIMARY KEY (id) 19 | ); 20 | 21 | CREATE TABLE company ( 22 | company_name VARCHAR(40), 23 | city VARCHAR(30), 24 | PRIMARY KEY (company_name) 25 | ); 26 | 27 | CREATE TABLE works ( 28 | id VARCHAR(8), 29 | company_name VARCHAR(40), 30 | salary NUMERIC(10,2) CHECK (salary > 10000), 31 | PRIMARY KEY (id), 32 | FOREIGN KEY (id) REFERENCES employee(id) 33 | ON DELETE CASCADE, 34 | FOREIGN KEY (company_name) REFERENCES company(company_name) 35 | ON DELETE CASCADE 36 | ); 37 | 38 | CREATE TABLE manages ( 39 | id VARCHAR(8), 40 | manager_id VARCHAR(8), 41 | PRIMARY KEY (id), 42 | FOREIGN KEY (id) REFERENCES employee (id), 43 | FOREIGN KEY (manager_id) REFERENCES employee (id) 44 | ); 45 | ``` -------------------------------------------------------------------------------- /Ch15_Query_Processing/15.15.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 15 3 | title: '15.15' 4 | --- 5 | > Explain why the nested-loops join algorithm (see Section 15.5.1) would work 6 | > poorly on a database stored in a column-oriented manner. Describe an alternative algorithm 7 | > that would work better, and explain why your solution is better. 8 | 9 | -------------------------------- 10 | 11 | If the nested-loops join algorithm is used as is, it would require tuples for each of the 12 | relations to be assembled before they are joined. Assembling tuples can be expensive in a 13 | column store, since each attribute may come from a separate area of the disk; the overhead 14 | of assembly would be particularly wasteful if many tuples do not satisfy the join condition 15 | and would be discarded. 16 | 17 | In such a situation it would be better to first find which tuples match by accessing only 18 | the join columns of the relations. 
Sort-merge join, hash join, or indexed nested-loops join 19 | can be used for this task. After the join is performed, only the tuples output by the join 20 | need to be assembled; assembly can be done by sorting the join result on the record identifier 21 | of one of the relations and accessing the corresponding attributes, then re-sorting on record 22 | identifiers of the other relation to access its attributes. -------------------------------------------------------------------------------- /Ch10_Big_Data/10.8.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 8 3 | title: '10.8' 4 | --- 5 | > Consider the following query using the tumbling window operator: 6 | > ```sql 7 | > SELECT itemid, System.Timestamp as window_end, SUM(amount) 8 | > FROM order TIMESTAMP BY datetime 9 | > GROUP BY itemid, TUMBLINGWINDOW(hour, 1) 10 | > ``` 11 | > Give an equivalent query using normal SQL constructs, without using the 12 | > tumbling window operator. You can assume that the timestamp can be converted 13 | > to an integer value that represents the number of seconds elapsed since (say) 14 | > midnight, January 1, 1970, using the function _to_seconds(timestamp)_. You can 15 | > also assume that the usual arithmetic functions are available, along with the 16 | > function _floor(a)_ which returns the largest integer $\leq a$. 17 | 18 | -------------------------------- 19 | 20 | Remember that the schema of the relation _order_ is _order(orderid, datetime, itemid, amount)_. 21 | Convert the timestamp to seconds, divide by 3600, and take the floor to get an hour number; group by that hour number. To output the timestamp of the window 22 | end, add 1 to the hour number and multiply by 3600. 
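The bucket arithmetic described above can be checked on a few rows. The following is only a minimal Python sketch, not part of the SQL answer: the function names `window_end` and `tumbling_sum` and the sample rows are hypothetical, and timestamps are assumed to be already converted to integer seconds (the role played by _to_seconds_ in the exercise).

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # one-hour tumbling window


def window_end(ts_seconds, width=WINDOW_SECONDS):
    """End (in seconds) of the tumbling window that contains ts_seconds."""
    hour = ts_seconds // width      # same as floor(ts_seconds / width)
    return (hour + 1) * width


def tumbling_sum(orders, width=WINDOW_SECONDS):
    """Sum amounts per (itemid, window_end), mirroring GROUP BY itemid, hour."""
    totals = defaultdict(int)
    for itemid, ts_seconds, amount in orders:
        totals[(itemid, window_end(ts_seconds, width))] += amount
    return dict(totals)


# Hypothetical rows: (itemid, timestamp in seconds, amount).
orders = [("i1", 10, 5), ("i1", 3599, 7), ("i1", 3600, 2), ("i2", 7300, 4)]
result = tumbling_sum(orders)
print(result)
```

In this sketch a timestamp that falls exactly on an hour boundary (3600 here) is assigned to the window that starts at that boundary, which is also what the floor-based grouping does.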
23 | 24 | ```sql 25 | WITH o(itemid, hour, amount) AS ( 26 | SELECT itemid, floor(to_seconds(datetime)/3600), amount 27 | FROM order 28 | ) 29 | SELECT itemid, (hour + 1) * 3600 as window_end, SUM(amount) 30 | FROM o 31 | GROUP BY itemid, hour; 32 | ``` -------------------------------------------------------------------------------- /Ch07_Relational_Database_Design/7.39.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 39 3 | title: '7.39' 4 | --- 5 | > Given the three goals of relational database design, is there any reason 6 | > to design a database schema that is in 2NF, but is in no higher-order 7 | > normal form? (See Exercise 7.19 for the definition of 2NF). 8 | 9 | -------------------------------- 10 | 11 | 2NF does not prohibit as much repetition of information since the schema 12 | $(A,B,C)$ with dependencies $A \rightarrow B$ and $B \rightarrow C$ is 13 | allowed under 2NF, although the same $(B,C)$ pair could be associated with 14 | many $A$ values, needlessly duplicating $C$ values. To avoid this we must go to 3NF. 15 | Repetition of information is allowed in 3NF in some but not all of 16 | the cases where it is allowed in 2NF. Thus, in general, 3NF reduces 17 | repetition of information. Since we can always achieve a lossless 18 | decomposition into 3NF, no information need be lost in going 19 | from 2NF to 3NF. 20 | 21 | Note that the decomposition $\{ \{A, B\}, \{B, C\} \}$ is a dependency-preserving 22 | 3NF decomposition of the schema $(A, B, C)$. However, in case we choose this decomposition, 23 | retrieving information about the relationship between $A$, $B$ and $C$ requires a join 24 | of two relations, which is avoided in the corresponding 2NF decomposition. 
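The trade-off in the answer to 7.39 can be illustrated on a tiny instance. This is a sketch with made-up rows; the values `a1`, `b1`, `c1` and the helper `natural_join` are hypothetical and only serve to show the repeated $(B,C)$ pair and the join needed after decomposition.

```python
# Hypothetical instance of r(A, B, C) satisfying A -> B and B -> C.
r = [
    ("a1", "b1", "c1"),
    ("a2", "b1", "c1"),
    ("a3", "b1", "c1"),
    ("a4", "b2", "c2"),
]

# In the single 2NF relation the pair (b1, c1) is stored once per A value.
repeated_pairs = sum(1 for _, b, c in r if (b, c) == ("b1", "c1"))

# 3NF decomposition {A, B} and {B, C}: each (B, C) pair is stored once.
r1 = sorted({(a, b) for a, b, _ in r})
r2 = sorted({(b, c) for _, b, c in r})


def natural_join(left, right):
    """Join (A, B) rows with (B, C) rows on the shared attribute B."""
    return sorted((a, b, c) for a, b in left for b2, c in right if b == b2)


# Recovering the original rows now needs a join, but nothing is lost.
rejoined = natural_join(r1, r2)
print(repeated_pairs, len(r2), rejoined == sorted(r))
```

The decomposed form stores each $(B,C)$ pair once, at the price of a join whenever $A$, $B$ and $C$ are needed together.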
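25 | 26 | -------------------------------------------------------------------------------- /Ch09_Application_Development/9.17.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 17 3 | title: '9.17' 4 | --- 5 | > Write pseudocode to manage a connection pool. Your pseudocode must include 6 | > a function to create a pool (providing a database connection string, database 7 | > username, and password as parameters), a function to request a connection 8 | > from the pool, a function to release a connection to the pool, and a function 9 | > to close the connection pool. 10 | 11 | -------------------------------- 12 | 13 | ```cpp 14 | // C++-style pseudocode sketch. POOL_SIZE, openDatabaseConnection(), Lock, acquire() 15 | // and release() are assumed helpers, not a real API. 16 | 17 | // represents a single connection to the database. 18 | class Connection { 19 | bool inUse; // true while the connection is handed out to a caller 20 | } 21 | 22 | // represents a connection pool 23 | class ConnectionPool { 24 | // a list of connections that are in the pool. 25 | List<Connection> listOfConnections; 26 | Lock lock; // guards listOfConnections against concurrent requests 27 | } 28 | 29 | // a function to create a pool 30 | ConnectionPool createConnectionPool(string db_conn, string db_username, string db_password) { 31 | ConnectionPool pool = new ConnectionPool(); 32 | for (i = 1 to POOL_SIZE) { 33 | Connection conn = openDatabaseConnection(db_conn, db_username, db_password); 34 | conn.inUse = false; 35 | pool.listOfConnections.add(conn); 36 | } 37 | return pool; 38 | } 39 | 40 | // a function to request a connection 41 | // from the pool 42 | Connection getConnectionFromPool(ConnectionPool pool) { 43 | acquire(pool.lock); 44 | for each conn in pool.listOfConnections { 45 | if (!conn.inUse) { 46 | conn.inUse = true; 47 | release(pool.lock); 48 | return conn; 49 | } 50 | } 51 | release(pool.lock); 52 | return null; // no free connection; the caller may wait and retry 53 | } 54 | 55 | // a function to release a connection to the pool 56 | void releaseConnectionToThePool(ConnectionPool pool, Connection conn) { 57 | acquire(pool.lock); 58 | conn.inUse = false; // the connection stays open so it can be reused 59 | release(pool.lock); 60 | } 61 | 62 | // a function to close the connection pool 63 | void closeConnectionPool(ConnectionPool pool) { 64 | acquire(pool.lock); 65 | for each conn in pool.listOfConnections { 66 | conn.close(); 67 | } 68 | pool.listOfConnections.clear(); 69 | release(pool.lock); 70 | } 71 | ``` -------------------------------------------------------------------------------- /Ch09_Application_Development/9.7.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 7 3 | title: '9.7' 4 | --- 5 | > The _netstat_ command (available on Linux and on Windows) shows 6 | > the active network connections on a computer. 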
Explain how this 7 | > command can be used to find out if a particular web page is 8 | > not closing connections that it opened, or if connection pooling 9 | > is used, not returning connections to the connection pool. 10 | > You should account for the fact that with connection pooling, 11 | > the connection may not get closed immediately. 12 | 13 | -------------------------------- 14 | 15 | The tester should run _netstat_ to find all connections open to the 16 | machine/socket used by the database. (If the application server is 17 | separate from the database server, the command may be executed at 18 | either of the machines). Then the web page being tested should be 19 | accessed repeatedly (this can be automated by using tools such as 20 | JMeter to generate page accesses). The number of connections to the 21 | database would go from 0 to some value (depending on the number of 22 | connections retained in the pool), but after some time the number 23 | of connections should stop increasing. If the number keeps increasing, 24 | the code underlying the web page is clearly not closing connections or 25 | returning the connection to the pool. -------------------------------------------------------------------------------- /Ch01_Introduction/1.12.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 12 3 | title: '1.12' 4 | --- 5 | > Explain the difference between two-tier and three-tier application architectures. 6 | > Which is better suited for web applications? Why? 7 | 8 | * Earlier-generation database applications used a **two-tier architecture**, whereas 9 | modern database applications use a **three-tier architecture**. 10 | 11 | * In a **two-tier architecture** the application resides at the client machine, and invokes 12 | database system functionality at the server machine through query language statements. 
13 | In a **three-tier architecture** the client machine acts merely as a front end and 14 | _does not contain any direct database calls_; the front end communicates with an 15 | **application server**. The application server, in turn, communicates with a database 16 | system to access data. 17 | 18 | * Three-tier applications provide better security as well as better performance 19 | than two-tier applications, which makes them better suited for web applications. 20 | 21 | Note: Even though the book classifies database applications into two kinds, in practice 22 | many popular applications effectively use a **four-tier architecture**. Most mobile chat 23 | applications, for example, use a **three-tier architecture** plus a local database 24 | such as SQLite, which caches data so that it can be accessed when the device is not connected 25 | to the internet. -------------------------------------------------------------------------------- /Ch06_Database_Design_Using_the_ER_Model/6.5.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 5 3 | title: '6.5' 4 | --- 5 | > An E-R diagram can be viewed as a graph. What do 6 | > the following mean in terms of the structure of an 7 | > enterprise schema? 8 | > 9 | > a. The graph is disconnected.
10 | > b. The graph has a cycle. 11 | 12 | -------------------------------- 13 | 14 | a. If a pair of entity sets are connected by a path in an E-R 15 | diagram, the entity sets are related, though perhaps indirectly. 16 | A disconnected graph implies that there are pairs of entity sets 17 | that are unrelated to each other. In an enterprise, we can say that 18 | the two parts of the enterprise are completely independent of each 19 | other. If we split the graph into connected components, we have, in effect, a 20 | separate database corresponding to each independent part 21 | of the enterprise. 22 | 23 | b. As indicated in the answer to the previous part, a path 24 | in the graph between a pair of entity sets indicates a (possibly 25 | indirect) relationship between the two entity sets. If there is 26 | a cycle in the graph, then every pair of entity sets on the cycle 27 | is related to each other in at least two distinct ways. If the E-R 28 | diagram is acyclic, then there is a unique path between every pair of connected entity 29 | sets and thus a unique relationship between every such pair of entity sets. -------------------------------------------------------------------------------- /Ch15_Query_Processing/15.7.md: -------------------------------------------------------------------------------- 1 | --- 2 | order: 7 3 | title: '15.7' 4 | --- 5 | > Write pseudocode for an iterator that implements indexed nested-loop join, 6 | > where the outer relation is pipelined. Your pseudocode must define the standard 7 | > iterator functions open(), next(), and close(). Show what state 8 | > information the iterator must maintain between calls. 9 | 10 | -------------------------------- 11 | 12 | Let outer be the iterator which returns successive tuples from the pipelined outer 13 | relation. Let inner be the iterator which returns successive tuples of the inner relation 14 | having a given value at the join attributes. 
The inner iterator returns these tuples 15 | by performing an index lookup. The functions **IndexedNLJoin::open**, **IndexedNLJoin::close** 16 | and **IndexedNLJoin::next** to implement the indexed nested-loop join iterator are given below. 17 | 18 | The state information that **IndexedNLJoin** must remember between calls consists of 19 | the two iterators outer and inner, the last-read outer relation tuple $t_r$, and a flag 20 | $\text{done}_r$ indicating whether the end of the outer relation 21 | scan has been reached. 22 | 23 | 24 | 25 | --------------------------------------------------------------------------------
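For Exercise 15.7, the iterator described above can be sketched in Python as follows. This is one possible rendering under stated assumptions, not the original figures: the pipelined outer relation is modelled as any Python iterable, the index on the inner relation is stood in for by a dictionary from join value to matching inner tuples, and the constructor parameters `outer_source`, `inner_index` and `outer_key` are hypothetical names. The attributes `outer`, `inner`, `t_r` and `done_r` correspond to the state listed in the answer.

```python
class IndexedNLJoin:
    """Iterator for indexed nested-loop join with a pipelined outer input."""

    def __init__(self, outer_source, inner_index, outer_key):
        self.outer_source = outer_source  # iterable of outer tuples (assumed)
        self.inner_index = inner_index    # dict: join value -> inner tuples (assumed)
        self.outer_key = outer_key        # position of the join attribute in outer tuples

    def open(self):
        self.outer = iter(self.outer_source)
        self.inner = iter(())   # no outer tuple read yet, so no matches
        self.t_r = None         # last outer tuple read
        self.done_r = False     # True once the outer input is exhausted

    def next(self):
        """Return the next joined tuple, or None when the join is finished."""
        while not self.done_r:
            t_s = next(self.inner, None)
            if t_s is not None:
                return self.t_r + t_s
            # Matches for the current outer tuple are used up: advance outer.
            self.t_r = next(self.outer, None)
            if self.t_r is None:
                self.done_r = True
            else:
                key = self.t_r[self.outer_key]
                self.inner = iter(self.inner_index.get(key, []))  # index lookup
        return None

    def close(self):
        self.outer = None
        self.inner = None


# Usage with hypothetical data: the second outer tuple has no match.
outer_rows = [(1, "x"), (2, "y"), (3, "z")]
index = {1: [(1, "p"), (1, "q")], 3: [(3, "r")]}
join = IndexedNLJoin(outer_rows, index, outer_key=0)
join.open()
results = []
while (row := join.next()) is not None:
    results.append(row)
join.close()
print(results)
```

Using `None` as the end-of-stream marker assumes that no real tuple is `None`; a dedicated sentinel object would avoid that assumption.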