└── README.md

/README.md:
--------------------------------------------------------------------------------
## Cloud Database Papers

A continuously updated list of **Cloud Database** papers. Please let us know if we have missed any great papers :)

Table of Contents
=================
- [Table of Contents](#table-of-contents)
- [0. Unmerged](#0-unmerged)
- [1. Survey & Tutorial](#1-survey--tutorial)
- [2. Database as a Service](#2-database-as-a-service)
- [3. Auto-scaling & Partition](#3-auto-scaling--partition)
- [4. Disaggregation](#4-disaggregation)
- [5. Optimizer](#5-optimizer)
- [6. Safety & Recovery](#6-safety--recovery)
- [7. Hardware](#7-hardware)
- [8. Application & Industry](#8-application--industry)
- [9. Challenges](#9-challenges)

## 0. Unmerged

**[Monitor]** Curino, C., Jones, E. P. C., Madden, S., & Balakrishnan, H. (2011). Workload-aware database monitoring and consolidation. Proceedings of the ACM SIGMOD International Conference on Management of Data, 313–324. [[paper](https://doi.org/10.1145/1989323.1989357)]

## 1. Survey & Tutorial

**[Survey]** Armbrust, M., Fox, A., Griffith, R., et al. (2009). Above the clouds: A Berkeley view of cloud computing. University of California, Berkeley, Tech. Rep. UCB, 07–013. [[paper](https://doi.org/10.1145/1721654.1721672)]

**[Survey]** Jonas, E., Schleier-Smith, J., Sreekanti, V., Tsai, C.-C., Khandelwal, A., Pu, Q., Shankar, V., Carreira, J., Krauth, K., Yadwadkar, N., Gonzalez, J. E., Popa, R. A., Stoica, I., & Patterson, D. A. (2019). Cloud Programming Simplified: A Berkeley View on Serverless Computing. [[paper](http://arxiv.org/abs/1902.03383)]

**[Survey]** Li, F. (2019). Cloud native database systems at Alibaba: Opportunities and challenges. Proceedings of the VLDB Endowment, 12(12), 2263–2272. [[paper](https://doi.org/10.14778/3352063.3352141)]

## 2. Database as a Service

**[DBaaS]** Depoutovitch, A., Chen, C., Chen, J., Larson, P., Lin, S., Ng, J., Cui, W., Liu, Q., Huang, W., Xiao, Y., & He, Y. (2020). Taurus Database: How to be Fast, Available, and Frugal in the Cloud. Proceedings of the ACM SIGMOD International Conference on Management of Data, 1463–1478. [[paper](https://doi.org/10.1145/3318464.3386129)]

**[DBaaS]** Taft, R., Lang, W., Duggan, J., Elmore, A. J., Stonebraker, M., & DeWitt, D. (2016). STeP: Scalable tenant placement for managing database-as-a-service deployments. Proceedings of the 7th ACM Symposium on Cloud Computing, SoCC 2016, 388–400. [[paper](https://doi.org/10.1145/2987550.2987575)]

**[DBaaS]** Das, S., Li, F., Narasayya, V. R., & König, A. C. (2016). Automated demand-driven resource scaling in relational database-as-a-service. Proceedings of the ACM SIGMOD International Conference on Management of Data, 26-June-2016, 1923–1934. [[paper](https://doi.org/10.1145/2882903.2903733)]

**[DBaaS]** Narasayya, V., Menache, I., Singh, M., Li, F., Syamala, M., & Chaudhuri, S. (2015). Sharing Buffer Pool Memory in Multi-Tenant Relational Database-as-a-Service. Proceedings of the VLDB Endowment, 8(7), 726–737. [[paper](https://doi.org/10.14778/2752939.2752942)]

## 3. Auto-scaling & Partition

**[Auto-scaling]** Perron, M., Castro Fernandez, R., DeWitt, D., & Madden, S. (2020). Starling: A Scalable Query Engine on Cloud Functions. Proceedings of the ACM SIGMOD International Conference on Management of Data, 131–141. [[paper](https://doi.org/10.1145/3318464.3380609)]

**[Auto-scaling]** Shen, Z., Subbiah, S., Gu, X., & Wilkes, J. (2011). CloudScale: Elastic resource scaling for multi-tenant cloud systems. Proceedings of the 2nd ACM Symposium on Cloud Computing, SOCC 2011. [[paper](https://doi.org/10.1145/2038916.2038921)]

**[Auto-scaling]** Wu, C., Sreekanti, V., & Hellerstein, J. M. (2021). Autoscaling tiered cloud storage in Anna. VLDB Journal, 30(1), 25–43. [[paper](https://doi.org/10.1007/s00778-020-00632-7)]

**[Auto-scaling] [Disaggregation]** Zhang, Y., Ruan, C., Li, C., Yang, J., Cao, W., Li, F., Wang, B., Fang, J., Wang, Y., Huo, J., & Bi, C. (2021). Towards Cost-Effective and Elastic Cloud Database Deployment via Memory Disaggregation. Proceedings of the VLDB Endowment, 14(10), 1900–1912. [[paper](https://doi.org/10.14778/3467861.3467877)]

**[Partition]** Hilprecht, B., Binnig, C., & Röhm, U. (2020). Learning a Partitioning Advisor for Cloud Databases. Proceedings of the ACM SIGMOD International Conference on Management of Data, 143–157. [[paper](https://doi.org/10.1145/3318464.3389704)]

## 4. Disaggregation

**[Disaggregation]** Shan, Y., Huang, Y., Chen, Y., & Zhang, Y. (2018). LegoOS: A Disseminated, Distributed OS for Hardware Resource Disaggregation. Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’18), 69–87. [[paper](https://www.usenix.org/conference/osdi18/presentation/shan)]

**[Disaggregation]** Angel, S., Nanavati, M., & Sen, S. (2020). Disaggregation and the application. HotCloud 2020 - 12th USENIX Workshop on Hot Topics in Cloud Computing, Co-Located with USENIX ATC 2020. [[paper](https://www.usenix.org/conference/hotcloud20/presentation/angel)]

**[Disaggregation]** Klimovic, A., Kozyrakis, C., Thereska, E., John, B., & Kumar, S. (2016). Flash storage disaggregation. Proceedings of the 11th European Conference on Computer Systems, EuroSys 2016. [[paper](https://doi.org/10.1145/2901318.2901337)]

**[Disaggregation]** Zhang, Q., Cai, Y., Chen, X., Angel, S., Chen, A., Liu, V., & Loo, B. T. (2020). Understanding the effect of data center resource disaggregation on production DBMSs. Proceedings of the VLDB Endowment, 13(9), 1568–1581. [[paper](https://doi.org/10.14778/3397230.3397249)]

## 5. Optimizer

**[Optimizer]** Wu, C., Jindal, A., Amizadeh, S., Patel, H., & Le, W. (2018). Towards a learning optimizer for shared clouds. Proceedings of the VLDB Endowment, 12(3), 210–222. [[paper](https://doi.org/10.14778/3291264.3291267)]

**[Optimizer]** Leis, V., & Kuschewski, M. (2021). Towards Cost-Optimal Query Processing in the Cloud. Proceedings of the VLDB Endowment, 14(9), 1606–1612. [[paper](https://doi.org/10.14778/3461535.3461549)]

## 6. Safety & Recovery

**[Safety]** Antonopoulos, P., Arasu, A., Singh, K. D., Eguro, K., Gupta, N., Jain, R., Kaushik, R., Kodavalla, H., Kossmann, D., Ogg, N., Ramamurthy, R., Szymaszek, J., Trimmer, J., Vaswani, K., Venkatesan, R., & Zwilling, M. (2020). Azure SQL Database Always Encrypted. Proceedings of the ACM SIGMOD International Conference on Management of Data, 1, 1511–1525. [[paper](https://doi.org/10.1145/3318464.3386141)]

**[Safety]** Arasu, A., Eguro, K., Kaushik, R., & Ramamurthy, R. (2014). Querying encrypted data. Proceedings of the ACM SIGMOD International Conference on Management of Data, 1259–1261. [[paper](https://doi.org/10.1145/2588555.2588893)]

**[Recovery]** Yang, Y., Youill, M., Woicik, M., Liu, Y., Yu, X., Serafini, M., Aboulnaga, A., & Stonebraker, M. (2021). FlexPushdownDB: Hybrid Pushdown and Caching in a Cloud DBMS. Proceedings of the VLDB Endowment, 14(11), 2101–2113. [[paper](https://doi.org/10.14778/3476249.3476265)]

## 7. Hardware

**[System]** Ortiz, J., Lee, B., Balazinska, M., Gehrke, J., & Hellerstein, J. L. (2018). SLAOrchestrator: Reducing the cost of performance SLAs for cloud data analytics. Proceedings of the 2018 USENIX Annual Technical Conference, USENIX ATC 2018, 547–560. [[paper](https://www.usenix.org/conference/atc18/presentation/ortiz)]

**[Hardware]** Do, J., Sengupta, S., & Swanson, S. (2019). Programmable solid-state storage in future cloud datacenters. Communications of the ACM, 62(6), 54–62. [[paper](https://doi.org/10.1145/3286588)]

**[Hardware]** Xue, S., Zhao, S., Chen, Q., Deng, G., Liu, Z., Zhang, J., Song, Z., Ma, T., Yang, Y., Zhou, Y., Niu, K., Sun, S., & Guo, M. (2020). Spool: Reliable virtualized NVMe storage pool in public cloud infrastructure. Proceedings of the 2020 USENIX Annual Technical Conference, ATC 2020, 97–110. [[paper](https://www.usenix.org/conference/atc20/presentation/xue)]

**[Memory]** Kalia, A., Andersen, D., & Kaminsky, M. (2020). Challenges and solutions for fast remote persistent memory access. SoCC 2020 - Proceedings of the 2020 ACM Symposium on Cloud Computing, 105–119. [[paper](https://doi.org/10.1145/3419111.3421294)]

**[Memory]** Wei, X., Chen, R., & Chen, H. (2020). Fast RDMA-based Ordered Key-Value Store using Remote Learned Cache. Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation.

**[Memory]** Nelson, J., Holt, B., Myers, B., Briggs, P., Ceze, L., Kahan, S., & Oskin, M. (2015). Latency-Tolerant Software Distributed Shared Memory. Proceedings of the 2015 USENIX Annual Technical Conference, USENIX ATC 2015, 291–305. [[paper](https://www.usenix.org/conference/atc15/technical-session/presentation/nelson)]

**[Memory]** Shan, Y., Tsai, S. Y., & Zhang, Y. (2017). Distributed shared persistent memory. SoCC 2017 - Proceedings of the 2017 Symposium on Cloud Computing, 323–337. [[paper](https://doi.org/10.1145/3127479.3128610)]

**[Memory]** Fent, P., van Renen, A., Kipf, A., Leis, V., Neumann, T., & Kemper, A. (2020). Low-latency communication for fast DBMS using RDMA and shared memory. Proceedings - International Conference on Data Engineering, 2020-April, 1477–1488. [[paper](https://doi.org/10.1109/ICDE48307.2020.00131)]

**[Memory]** Aguilera, M. K., Amit, N., Calciu, I., Deguillard, X., Gandhi, J., Subrahmanyam, P., Suresh, L., Tati, K., Venkatasubramanian, R., & Wei, M. (2017). Remote memory in the age of fast networks. SoCC 2017 - Proceedings of the 2017 Symposium on Cloud Computing, 121–127. [[paper](https://doi.org/10.1145/3127479.3131612)]

**[Memory]** Lagar-Cavilla, A., Ahn, J., Souhlal, S., Agarwal, N., Burny, R., Butt, S., Chang, J., Chaugule, A., Deng, N., Shahid, J., Thelen, G., Yurtsever, K. A., Zhao, Y., & Ranganathan, P. (2019). Software-Defined Far Memory in Warehouse-Scale Computers. International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS, 317–330. [[paper](https://doi.org/10.1145/3297858.3304053)]

**[Network]** Ziegler, T., Vani, S. T., Binnig, C., Fonseca, R., & Kraska, T. (2019). Designing distributed tree-based index structures for fast RDMA-capable networks. Proceedings of the ACM SIGMOD International Conference on Management of Data, 741–758. [[paper](https://doi.org/10.1145/3299869.3300081)]

**[Network]** Tirmazi, M., Ben Basat, R., Gao, J., & Yu, M. (2020). Cheetah: Accelerating Database Queries with Switch Pruning. Proceedings of the ACM SIGMOD International Conference on Management of Data, 2407–2422. [[paper](https://doi.org/10.1145/3318464.3389698)]

**[Network]** Craddock, H., Konudula, L. P., Cheng, K., & Kul, G. (2019). The case for physical memory pools: A vision paper. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 11513 LNCS(Vm), 208–221. [[paper](https://doi.org/10.1007/978-3-030-23502-4_15)]

## 8. Application & Industry

**[Application]** Müller, I., Marroquín, R., & Alonso, G. (2020). Lambada: Interactive Data Analytics on Cold Data Using Serverless Cloud Infrastructure. Proceedings of the ACM SIGMOD International Conference on Management of Data, 115–130. [[paper](https://doi.org/10.1145/3318464.3389758)]

**[Application]** Yu, X., Youill, M., Woicik, M., Ghanem, A., Serafini, M., Aboulnaga, A., & Stonebraker, M. (2020). PushdownDB: Accelerating a DBMS Using S3 Computation. Proceedings - International Conference on Data Engineering, 2020-April, 1802–1805. [[paper](https://doi.org/10.1109/ICDE48307.2020.00174)]

**[Application]** Antonopoulos, P., Budovski, A., Diaconu, C., Saenz, A. H., Hu, J., Kodavalla, H., Kossmann, D., Lingam, S., Minhas, U. F., Prakash, N., Purohit, V., Qu, H., Ravella, C. S., Reisteter, K., Shrotri, S., Tang, D., & Wakade, V. (2019). Socrates: The new SQL server in the cloud. Proceedings of the ACM SIGMOD International Conference on Management of Data, 1743–1756. [[paper](https://doi.org/10.1145/3299869.3314047)]

**[Industry]** Verbitski, A., Gupta, A., Saha, D., Corey, J., Gupta, K., Brahmadesam, M., Mittal, R., Krishnamurthy, S., Maurice, S., Kharatishvilli, T., & Bao, X. (2018). Amazon Aurora. 789–796. [[paper](https://doi.org/10.1145/3183713.3196937)]

**[Industry]** Li, F. (2019). Cloud native database systems at Alibaba: Opportunities and challenges. Proceedings of the VLDB Endowment, 12(12), 2263–2272. [[paper](https://doi.org/10.14778/3352063.3352141)]

**[Industry]** Dobrescu, M., & Argyraki, K. (2014). Millions of Tiny Databases. Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI ’14). [[paper](https://www.usenix.org/conference/nsdi14/technical-sessions/presentation/dobrescu)]

**[Industry]** Cao, W., Liu, Y., Cheng, Z., Zheng, N., Li, W., Wu, W., Ouyang, L., Wang, P., Wang, Y., Kuan, R., Liu, Z., Zhu, F., & Zhang, T. (2020). POLARDB Meets Computational Storage: Efficiently Support Analytical Workloads in Cloud-Native Relational Database. 18th USENIX Conference on File and Storage Technologies (FAST 20). [[paper](https://www.usenix.org/conference/fast20/presentation/cao-wei)]

**[Industry]** Cao, W., Zhang, Y., Yang, X., Li, F., Wang, S., Hu, Q., Cheng, X., Chen, Z., Liu, Z., Fang, J., Wang, B., Wang, Y., Sun, H., Yang, Z., Cheng, Z., Chen, S., Wu, J., Hu, W., Zhao, J., … Tong, J. (2021). PolarDB Serverless: A Cloud Native Database for Disaggregated Data Centers. Proceedings of the ACM SIGMOD International Conference on Management of Data, 2477–2489. [[paper](https://doi.org/10.1145/3448016.3457560)]

**[Industry]** Cao, W., Liu, Z., Wang, P., Chen, S., Zhu, C., Zheng, S., Wang, Y., & Ma, G. (2018). PolarFS: An ultralow latency and failure resilient distributed file system for shared storage cloud database. Proceedings of the VLDB Endowment, 11(12), 1849–1862. [[paper](https://doi.org/10.14778/3229863.3229872)]

**[Industry]** Dageville, B., Cruanes, T., Zukowski, M., Antonov, V., Avanes, A., Bock, J., Claybaugh, J., Engovatov, D., Hentschel, M., Huang, J., Lee, A. W., Motivala, A., Munir, A. Q., Pelley, S., Povinec, P., Rahn, G., Triantafyllis, S., & Unterbrunner, P. (2016). The Snowflake elastic data warehouse. Proceedings of the ACM SIGMOD International Conference on Management of Data, 26-June-20, 215–226. [[paper](https://doi.org/10.1145/2882903.2903741)]

**[Industry]** Mattson, T., Rogers, J., & Elmore, A. J. (2018). The BigDAWG polystore system. Making Databases Work: The Pragmatic Wisdom of Michael Stonebraker, 44(2), 279–289. [[paper](https://doi.org/10.1145/3226595.3226620)]

**[Industry]** Huang, D., Liu, Q., Cui, Q., Fang, Z., Ma, X., Xu, F., Shen, L., Tang, L., Zhou, Y., Huang, M., Wei, W., Liu, C., Zhang, J., Li, J., Wu, X., Song, L., Sun, R., Yu, S., Zhao, L., … Tang, X. (2020). TiDB: a Raft-based HTAP database. Proceedings of the VLDB Endowment, 13(12), 3072–3084. [[paper](https://doi.org/10.14778/3415478.3415535)]

## 9. Challenges

**[Challenges]** Zhang, Q., Cai, Y., Angel, S., Liu, V., Chen, A., & Loo, B. T. (2020). Rethinking Data Management Systems for Disaggregated Data Centers. CIDR 2020 - Conference on Innovative Data Systems Research. [[paper](https://par.nsf.gov/biblio/10157860)]

**[Challenges]** Hellerstein, J. M., Faleiro, J., Gonzalez, J. E., Schleier-Smith, J., Sreekanti, V., Tumanov, A., & Wu, C. (2019). Serverless computing: One step forward, two steps back. CIDR 2019 - 9th Biennial Conference on Innovative Data Systems Research. [[paper](https://arxiv.org/abs/1812.03651)]

--------------------------------------------------------------------------------