└── README.md


/README.md:
--------------------------------------------------------------------------------
  1 | ## Cloud Database Papers
  2 | 
  3 | A continuously updated list of **Cloud Database** papers. Please let us know if we have missed any great papers :)
  4 | 
  5 | Table of Contents
  6 | =================
  7 | - [Table of Contents](#table-of-contents)
  8 |   - [0. Unmerged](#0-unmerged)
  9 |   - [1. Survey & Tutorial](#1-survey--tutorial)
 10 |   - [2. Database as a Service](#2-database-as-a-service)
 11 |   - [3. Auto-scaling & Partition](#3-auto-scaling--partition)
 12 |   - [4. Disaggregation](#4-disaggregation)
 13 |   - [5. Optimizer](#5-optimizer)
 14 |   - [6. Safety & Recovery](#6-safety--recovery)
 15 |   - [7. Hardware](#7-hardware)
 16 |   - [8. Application & Industry](#8-application--industry)
 17 |   - [9. Challenges](#9-challenges)
 18 | 
 19 | ## 0. Unmerged
 20 | 
 21 | **[Monitor]** Curino, C., Jones, E. P. C., Madden, S., & Balakrishnan, H. (2011). Workload-aware database monitoring and consolidation. Proceedings of the ACM SIGMOD International Conference on Management of Data, 313–324. [[paper](https://doi.org/10.1145/1989323.1989357) ]
 22 | 
 23 | ## 1. Survey & Tutorial
 24 | 
 25 | **[Survey]** Armbrust, M., Fox, A., Griffith, R., et al. (2009). Above the clouds: A Berkeley view of cloud computing. University of California, Berkeley, Tech. Rep. UCB/EECS-2009-28. [[paper](https://doi.org/10.1145/1721654.1721672) ]
 26 | 
 27 | **[Survey]** Jonas, E., Schleier-Smith, J., Sreekanti, V., Tsai, C.-C., Khandelwal, A., Pu, Q., Shankar, V., Carreira, J., Krauth, K., Yadwadkar, N., Gonzalez, J. E., Popa, R. A., Stoica, I., & Patterson, D. A. (2019). Cloud Programming Simplified: A Berkeley View on Serverless Computing. [[paper](http://arxiv.org/abs/1902.03383) ]
 28 | 
 29 | **[Survey]** Li, F. (2019). Cloud native database systems at Alibaba: Opportunities and challenges. Proceedings of the VLDB Endowment, 12(12), 2263–2272. [[paper](https://doi.org/10.14778/3352063.3352141) ]
 30 | 
 31 | ## 2. Database as a Service
 32 | 
 33 | **[DBaaS]** Depoutovitch, A., Chen, C., Chen, J., Larson, P., Lin, S., Ng, J., Cui, W., Liu, Q., Huang, W., Xiao, Y., & He, Y. (2020). Taurus Database: How to be Fast, Available, and Frugal in the Cloud. Proceedings of the ACM SIGMOD International Conference on Management of Data, 1463–1478. [[paper](https://doi.org/10.1145/3318464.3386129) ]
 34 | 
 35 | **[DBaaS]** Taft, R., Lang, W., Duggan, J., Elmore, A. J., Stonebraker, M., & DeWitt, D. (2016). STeP: Scalable tenant placement for managing database-as-a-service deployments. Proceedings of the 7th ACM Symposium on Cloud Computing, SoCC 2016, 388–400. [[paper](https://doi.org/10.1145/2987550.2987575) ]
 36 | 
 37 | **[DBaaS]** Das, S., Li, F., Narasayya, V. R., & König, A. C. (2016). Automated demand-driven resource scaling in relational database-as-a-service. Proceedings of the ACM SIGMOD International Conference on Management of Data, 26-June-2016, 1923–1934. [[paper](https://doi.org/10.1145/2882903.2903733) ]
 38 | 
 39 | **[DBaaS]** Narasayya, V., Menache, I., Singh, M., Li, F., Syamala, M., & Chaudhuri, S. (2015). Sharing Buffer Pool Memory in Multi-Tenant Relational Database-as-a-Service. Proceedings of the VLDB Endowment, 8(7), 726–737. [[paper](https://doi.org/10.14778/2752939.2752942) ]
 40 | 
 41 | ## 3. Auto-scaling & Partition
 42 | 
 43 | **[Auto-scaling]** Perron, M., Castro Fernandez, R., DeWitt, D., & Madden, S. (2020). Starling: A Scalable Query Engine on Cloud Functions. Proceedings of the ACM SIGMOD International Conference on Management of Data, 131–141. [[paper](https://doi.org/10.1145/3318464.3380609) ]
 44 | 
 45 | **[Auto-scaling]** Shen, Z., Subbiah, S., Gu, X., & Wilkes, J. (2011). CloudScale: Elastic resource scaling for multi-tenant cloud systems. Proceedings of the 2nd ACM Symposium on Cloud Computing, SOCC 2011. [[paper](https://doi.org/10.1145/2038916.2038921) ]
 46 | 
 47 | **[Auto-scaling]** Wu, C., Sreekanti, V., & Hellerstein, J. M. (2021). Autoscaling tiered cloud storage in Anna. VLDB Journal, 30(1), 25–43. [[paper](https://doi.org/10.1007/s00778-020-00632-7) ] 
 48 | 
 49 | **[Auto-scaling] [Disaggregation]** Zhang, Y., Ruan, C., Li, C., Yang, J., Cao, W., Li, F., Wang, B., Fang, J., Wang, Y., Huo, J., & Bi, C. (2021). Towards Cost-Effective and Elastic Cloud Database Deployment via Memory Disaggregation. Proceedings of the VLDB Endowment, 14(10), 1900–1912. [[paper](https://doi.org/10.14778/3467861.3467877) ]
 50 | 
 51 | **[Partition]** Hilprecht, B., Binnig, C., & Röhm, U. (2020). Learning a Partitioning Advisor for Cloud Databases. Proceedings of the ACM SIGMOD International Conference on Management of Data, 143–157. [[paper](https://doi.org/10.1145/3318464.3389704) ]
 52 | 
 53 | ## 4. Disaggregation
 54 | 
 55 | **[Disaggregation]** Shan, Y., Huang, Y., Chen, Y., & Zhang, Y. (2018). LegoOS: A Disseminated, Distributed OS for Hardware Resource Disaggregation. Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’18), 69–87. [[paper](https://www.usenix.org/conference/osdi18/presentation/shan)]
 56 | 
 57 | **[Disaggregation]** Angel, S., Nanavati, M., & Sen, S. (2020). Disaggregation and the application. HotCloud 2020 - 12th USENIX Workshop on Hot Topics in Cloud Computing, Co-Located with USENIX ATC 2020. [[paper](https://www.usenix.org/conference/hotcloud20/presentation/angel)]
 58 | 
 59 | **[Disaggregation]** Klimovic, A., Kozyrakis, C., Thereska, E., John, B., & Kumar, S. (2016). Flash storage disaggregation. Proceedings of the 11th European Conference on Computer Systems, EuroSys 2016. [[paper](https://doi.org/10.1145/2901318.2901337)]
 60 | 
 61 | **[Disaggregation]** Zhang, Q., Cai, Y., Chen, X., Angel, S., Chen, A., Liu, V., & Loo, B. T. (2020). Understanding the effect of data center resource disaggregation on production DBMSs. Proceedings of the VLDB Endowment, 13(9), 1568–1581. [[paper](https://doi.org/10.14778/3397230.3397249)]
 62 | 
 63 | ## 5. Optimizer
 64 | 
 65 | **[Optimizer]** Wu, C., Jindal, A., Amizadeh, S., Patel, H., & Le, W. (2018). Towards a learning optimizer for shared clouds. Proceedings of the VLDB Endowment, 12(3), 210–222. [[paper](https://doi.org/10.14778/3291264.3291267)]
 66 | 
 67 | **[Optimizer]** Leis, V., & Kuschewski, M. (2021). Towards Cost-Optimal Query Processing in the Cloud. Proceedings of the VLDB Endowment, 14(9), 1606–1612. [[paper](https://doi.org/10.14778/3461535.3461549)]
 68 | 
 69 | ## 6. Safety & Recovery
 70 | 
 71 | **[Safety]** Antonopoulos, P., Arasu, A., Singh, K. D., Eguro, K., Gupta, N., Jain, R., Kaushik, R., Kodavalla, H., Kossmann, D., Ogg, N., Ramamurthy, R., Szymaszek, J., Trimmer, J., Vaswani, K., Venkatesan, R., & Zwilling, M. (2020). Azure SQL Database Always Encrypted. Proceedings of the ACM SIGMOD International Conference on Management of Data, 1, 1511–1525. [[paper](https://doi.org/10.1145/3318464.3386141)]
 72 | 
 73 | **[Safety]** Arasu, A., Eguro, K., Kaushik, R., & Ramamurthy, R. (2014). Querying encrypted data. Proceedings of the ACM SIGMOD International Conference on Management of Data, 1259–1261. [[paper](https://doi.org/10.1145/2588555.2588893)]
 74 | 
 75 | **[Recovery]** Yang, Y., Youill, M., Woicik, M., Liu, Y., Yu, X., Serafini, M., Aboulnaga, A., & Stonebraker, M. (2021). FlexPushdownDB: Hybrid Pushdown and Caching in a Cloud DBMS. Proceedings of the VLDB Endowment, 14(11), 2101–2113. [[paper](https://doi.org/10.14778/3476249.3476265)]
 76 | 
 77 | ## 7. Hardware
 78 | 
 79 | **[System]** Ortiz, J., Lee, B., Balazinska, M., Gehrke, J., & Hellerstein, J. L. (2018). SLAOrchestrator: Reducing the cost of performance SLAs for cloud data analytics. Proceedings of the 2018 USENIX Annual Technical Conference, USENIX ATC 2018, 547–560. [[paper](https://www.usenix.org/conference/atc18/presentation/ortiz)]
 80 | 
 81 | **[Hardware]** Do, J., Sengupta, S., & Swanson, S. (2019). Programmable solid-state storage in future cloud datacenters. Communications of the ACM, 62(6), 54–62. [[paper](https://doi.org/10.1145/3286588)]
 82 | 
 83 | **[Hardware]** Xue, S., Zhao, S., Chen, Q., Deng, G., Liu, Z., Zhang, J., Song, Z., Ma, T., Yang, Y., Zhou, Y., Niu, K., Sun, S., & Guo, M. (2020). Spool: Reliable virtualized NVMe storage pool in public cloud infrastructure. Proceedings of the 2020 USENIX Annual Technical Conference, ATC 2020, 97–110. [[paper](https://www.usenix.org/conference/atc20/presentation/xue)]
 84 | 
 85 | **[Memory]** Kalia, A., Andersen, D., & Kaminsky, M. (2020). Challenges and solutions for fast remote persistent memory access. SoCC 2020 - Proceedings of the 2020 ACM Symposium on Cloud Computing, 105–119. [[paper](https://doi.org/10.1145/3419111.3421294)]
 86 | 
 87 | **[Memory]** Wei, X., Chen, R., & Chen, H. (2020). Fast RDMA-based Ordered Key-Value Store using Remote Learned Cache. Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’20). [[paper](https://www.usenix.org/conference/osdi20/presentation/wei)]
 88 | 
 89 | **[Memory]** Nelson, J., Holt, B., Myers, B., Briggs, P., Ceze, L., Kahan, S., & Oskin, M. (2015). Latency-Tolerant Software Distributed Shared Memory. Proceedings of the 2015 USENIX Annual Technical Conference, USENIX ATC 2015, 291–305. [[paper](https://www.usenix.org/conference/atc15/technical-session/presentation/nelson)]
 90 | 
 91 | **[Memory]** Shan, Y., Tsai, S. Y., & Zhang, Y. (2017). Distributed shared persistent memory. SoCC 2017 - Proceedings of the 2017 Symposium on Cloud Computing, 323–337. [[paper](https://doi.org/10.1145/3127479.3128610)]
 92 | 
 93 | **[Memory]** Fent, P., van Renen, A., Kipf, A., Leis, V., Neumann, T., & Kemper, A. (2020). Low-Latency Communication for Fast DBMS Using RDMA and Shared Memory. Proceedings - International Conference on Data Engineering, 2020-April, 1477–1488. [[paper](https://doi.org/10.1109/ICDE48307.2020.00131)]
 94 | 
 95 | **[Memory]** Aguilera, M. K., Amit, N., Calciu, I., Deguillard, X., Gandhi, J., Subrahmanyam, P., Suresh, L., Tati, K., Venkatasubramanian, R., & Wei, M. (2017). Remote memory in the age of fast networks. SoCC 2017 - Proceedings of the 2017 Symposium on Cloud Computing, 121–127. [[paper](https://doi.org/10.1145/3127479.3131612)]
 96 | 
 97 | **[Memory]** Lagar-Cavilla, A., Ahn, J., Souhlal, S., Agarwal, N., Burny, R., Butt, S., Chang, J., Chaugule, A., Deng, N., Shahid, J., Thelen, G., Yurtsever, K. A., Zhao, Y., & Ranganathan, P. (2019). Software-Defined Far Memory in Warehouse-Scale Computers. International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS, 317–330. [[paper](https://doi.org/10.1145/3297858.3304053)]
 98 | 
 99 | **[Network]** Ziegler, T., Vani, S. T., Binnig, C., Fonseca, R., & Kraska, T. (2019). Designing distributed tree-based index structures for fast RDMA-capable networks. Proceedings of the ACM SIGMOD International Conference on Management of Data, 741–758. [[paper](https://doi.org/10.1145/3299869.3300081)]
100 | 
101 | **[Network]** Tirmazi, M., Ben Basat, R., Gao, J., & Yu, M. (2020). Cheetah: Accelerating Database Queries with Switch Pruning. Proceedings of the ACM SIGMOD International Conference on Management of Data, 2407–2422. [[paper](https://doi.org/10.1145/3318464.3389698)]
102 | 
103 | **[Network]** Craddock, H., Konudula, L. P., Cheng, K., & Kul, G. (2019). The case for physical memory pools: A vision paper. Lecture Notes in Computer Science, 11513, 208–221. [[paper](https://doi.org/10.1007/978-3-030-23502-4_15)]
104 | 
105 | ## 8. Application & Industry
106 | 
107 | **[Application]** Müller, I., Marroquín, R., & Alonso, G. (2020). Lambada: Interactive Data Analytics on Cold Data Using Serverless Cloud Infrastructure. Proceedings of the ACM SIGMOD International Conference on Management of Data, 115–130. [[paper](https://doi.org/10.1145/3318464.3389758)]
108 | 
109 | **[Application]** Yu, X., Youill, M., Woicik, M., Ghanem, A., Serafini, M., Aboulnaga, A., & Stonebraker, M. (2020). PushdownDB: Accelerating a DBMS Using S3 Computation. Proceedings - International Conference on Data Engineering, 2020-April, 1802–1805. [[paper](https://doi.org/10.1109/ICDE48307.2020.00174)]
110 | 
111 | **[Application]** Antonopoulos, P., Budovski, A., Diaconu, C., Saenz, A. H., Hu, J., Kodavalla, H., Kossmann, D., Lingam, S., Minhas, U. F., Prakash, N., Purohit, V., Qu, H., Ravella, C. S., Reisteter, K., Shrotri, S., Tang, D., & Wakade, V. (2019). Socrates: The New SQL Server in the Cloud. Proceedings of the ACM SIGMOD International Conference on Management of Data, 1743–1756. [[paper](https://doi.org/10.1145/3299869.3314047)]
112 | 
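113 | **[Industry]** Verbitski, A., Gupta, A., Saha, D., Corey, J., Gupta, K., Brahmadesam, M., Mittal, R., Krishnamurthy, S., Maurice, S., Kharatishvili, T., & Bao, X. (2018). Amazon Aurora: On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes. Proceedings of the ACM SIGMOD International Conference on Management of Data, 789–796. [[paper](https://doi.org/10.1145/3183713.3196937)]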
114 | 
115 | **[Industry]** Li, F. (2019). Cloud native database systems at Alibaba: Opportunities and challenges. Proceedings of the VLDB Endowment, 12(12), 2263–2272. [[paper](https://doi.org/10.14778/3352063.3352141)]
116 | 
117 | **[Industry]** Brooker, M., Chen, T., & Ping, F. (2020). Millions of Tiny Databases. Proceedings of the 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI ’20). [[paper](https://www.usenix.org/conference/nsdi20/presentation/brooker)]
118 | 
119 | **[Industry]** Cao, W., Liu, Y., Cheng, Z., Zheng, N., Li, W., Wu, W., Ouyang, L., Wang, P., Wang, Y., Kuan, R., Liu, Z., Zhu, F., & Zhang, T. (2020). POLARDB Meets Computational Storage: Efficiently Support Analytical Workloads in Cloud-Native Relational Database. 18th USENIX Conference on File and Storage Technologies (FAST ’20). [[paper](https://www.usenix.org/conference/fast20/presentation/cao-wei)]
120 | 
121 | **[Industry]** Cao, W., Zhang, Y., Yang, X., Li, F., Wang, S., Hu, Q., Cheng, X., Chen, Z., Liu, Z., Fang, J., Wang, B., Wang, Y., Sun, H., Yang, Z., Cheng, Z., Chen, S., Wu, J., Hu, W., Zhao, J., … Tong, J. (2021). PolarDB Serverless: A Cloud Native Database for Disaggregated Data Centers. Proceedings of the ACM SIGMOD International Conference on Management of Data, 2477–2489. [[paper](https://doi.org/10.1145/3448016.3457560)]
122 | 
123 | **[Industry]** Cao, W., Liu, Z., Wang, P., Chen, S., Zhu, C., Zheng, S., Wang, Y., & Ma, G. (2018). PolarFS: An ultralow latency and failure resilient distributed file system for shared storage cloud database. Proceedings of the VLDB Endowment, 11(12), 1849–1862. [[paper](https://doi.org/10.14778/3229863.3229872)]
124 | 
125 | **[Industry]** Dageville, B., Cruanes, T., Zukowski, M., Antonov, V., Avanes, A., Bock, J., Claybaugh, J., Engovatov, D., Hentschel, M., Huang, J., Lee, A. W., Motivala, A., Munir, A. Q., Pelley, S., Povinec, P., Rahn, G., Triantafyllis, S., & Unterbrunner, P. (2016). The Snowflake Elastic Data Warehouse. Proceedings of the ACM SIGMOD International Conference on Management of Data, 26-June-2016, 215–226. [[paper](https://doi.org/10.1145/2882903.2903741)]
126 | 
127 | **[Industry]** Mattson, T., Rogers, J., & Elmore, A. J. (2018). The BigDAWG polystore system. Making Databases Work: The Pragmatic Wisdom of Michael Stonebraker, 44(2), 279–289. [[paper](https://doi.org/10.1145/3226595.3226620)]
128 | 
129 | **[Industry]** Huang, D., Liu, Q., Cui, Q., Fang, Z., Ma, X., Xu, F., Shen, L., Tang, L., Zhou, Y., Huang, M., Wei, W., Liu, C., Zhang, J., Li, J., Wu, X., Song, L., Sun, R., Yu, S., Zhao, L., … Tang, X. (2020). TiDB: a Raft-based HTAP database. Proceedings of the VLDB Endowment, 13(12), 3072–3084. [[paper](https://doi.org/10.14778/3415478.3415535)]
130 | 
131 | ## 9. Challenges
132 | 
133 | **[Challenges]** Zhang, Q., Cai, Y., Angel, S., Liu, V., Chen, A., & Loo, B. T. (2020). Rethinking Data Management Systems for Disaggregated Data Centers. CIDR 2020 - 10th Conference on Innovative Data Systems Research. [[paper](https://par.nsf.gov/biblio/10157860)]
134 | 
135 | **[Challenges]** Hellerstein, J. M., Faleiro, J., Gonzalez, J. E., Schleier-Smith, J., Sreekanti, V., Tumanov, A., & Wu, C. (2019). Serverless computing: One step forward, two steps back. CIDR 2019 - 9th Biennial Conference on Innovative Data Systems Research. [[paper](https://arxiv.org/abs/1812.03651)]
136 | 
137 | 


--------------------------------------------------------------------------------