├── README.md ├── chapter-01-refs.md ├── chapter-02-refs.md ├── chapter-03-refs.md ├── chapter-04-refs.md ├── chapter-05-refs.md ├── chapter-06-refs.md ├── chapter-07-refs.md ├── chapter-08-refs.md ├── chapter-09-refs.md ├── chapter-10-refs.md ├── chapter-11-refs.md ├── chapter-12-refs.md ├── ddia-poster.jpg └── ddia-poster.pdf /README.md: -------------------------------------------------------------------------------- 1 | Designing Data-Intensive Applications 2 | ===================================== 3 | 4 | Literature References 5 | --------------------- 6 | 7 | This repository accompanies the book [Designing Data-Intensive Applications](http://dataintensive.net/) 8 | by [Martin Kleppmann](http://martin.kleppmann.com/), published by 9 | [O'Reilly Media](http://shop.oreilly.com/product/0636920032175.do). 10 | 11 | The book contains a large number of references to further reading material for anyone who wants to 12 | go into more depth, ranging from books and research papers to blog posts, bug reports and tweets. 13 | Many of the references are freely available online. 14 | 15 | The purpose of this repository is to maintain up-to-date links to the full text of online resources, 16 | where available. If you are reading the print edition, you may find it quicker than using a search 17 | engine to find the material. If you are reading an ebook edition, we have included links directly in 18 | the ebook, but unfortunately links tend to break frequently due to the nature of the web. 19 | 20 | If you find a broken link or any error in the references, please submit a pull request to fix it. 21 | For academic papers, you can search for the title in [Google Scholar](https://scholar.google.co.uk/) 22 | to find open-access PDF files. 23 | 24 | Chapters 25 | -------- 26 | 27 | 1. [References for Chapter 1](https://github.com/wikibook/data-intensive-applications/blob/master/chapter-01-refs.md) 28 | 2. 
[References for Chapter 2](https://github.com/wikibook/data-intensive-applications/blob/master/chapter-02-refs.md) 29 | 3. [References for Chapter 3](https://github.com/wikibook/data-intensive-applications/blob/master/chapter-03-refs.md) 30 | 4. [References for Chapter 4](https://github.com/wikibook/data-intensive-applications/blob/master/chapter-04-refs.md) 31 | 5. [References for Chapter 5](https://github.com/wikibook/data-intensive-applications/blob/master/chapter-05-refs.md) 32 | 6. [References for Chapter 6](https://github.com/wikibook/data-intensive-applications/blob/master/chapter-06-refs.md) 33 | 7. [References for Chapter 7](https://github.com/wikibook/data-intensive-applications/blob/master/chapter-07-refs.md) 34 | 8. [References for Chapter 8](https://github.com/wikibook/data-intensive-applications/blob/master/chapter-08-refs.md) 35 | 9. [References for Chapter 9](https://github.com/wikibook/data-intensive-applications/blob/master/chapter-09-refs.md) 36 | 10. [References for Chapter 10](https://github.com/wikibook/data-intensive-applications/blob/master/chapter-10-refs.md) 37 | 11. [References for Chapter 11](https://github.com/wikibook/data-intensive-applications/blob/master/chapter-11-refs.md) 38 | 12. [References for Chapter 12](https://github.com/wikibook/data-intensive-applications/blob/master/chapter-12-refs.md) 39 | 40 | Maps 41 | ---- 42 | 43 | As an additional special touch, the book has a graphical table of contents for each chapter, 44 | [drawn in the style of a geographic map](https://www.oreilly.com/ideas/drawing-a-map-of-distributed-data-systems). 45 | Each chapter is represented by an island in the sea of distributed data. 
We have also assembled 46 | the archipelago into a poster which you can download here: 47 | 48 | * [Poster of maps in PDF format](https://github.com/wikibook/data-intensive-applications/blob/master/ddia-poster.pdf) 49 | * [Poster of maps in JPEG format](https://github.com/wikibook/data-intensive-applications/blob/master/ddia-poster.jpg) 50 | 51 | License 52 | ------- 53 | 54 | Copyright (c) 2017 [Martin Kleppmann](http://martin.kleppmann.com/). 55 | 56 | Creative Commons License 57 | 58 | You may freely use the material in this repository under a 59 | Creative Commons 60 | Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). 61 | 62 | Thank you to 63 | [Shabbir Diwan](http://shabbirdiwan.com/), 64 | [Edie Freedman](http://www.ediefreedman.com/), 65 | [Ron Bilodeau](http://www.oreilly.com/pub/au/3771), and 66 | [Marie Beaugureau](https://twitter.com/cmariebeau) 67 | for designing the maps, and to [O'Reilly Media](https://www.oreilly.com/) for supporting the project. 68 | -------------------------------------------------------------------------------- /chapter-01-refs.md: -------------------------------------------------------------------------------- 1 | Designing Data-Intensive Applications 2 | ===================================== 3 | 4 | Chapter 1 References 5 | -------------------- 6 | 7 | 1. Michael Stonebraker and Uğur Çetintemel: 8 | “['One Size Fits All': An Idea Whose Time Has Come and Gone](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.68.9136&rep=rep1&type=pdf),” at *21st International Conference 9 | on Data Engineering* (ICDE), April 2005. 10 | 11 | 1. Walter L. Heimerdinger and Charles B. Weinstock: 12 | “[A Conceptual Framework for System Fault Tolerance](https://resources.sei.cmu.edu/asset_files/TechnicalReport/1992_005_001_16112.pdf),” Technical Report CMU/SEI-92-TR-033, Software Engineering Institute, Carnegie 13 | Mellon University, October 1992. 14 | 15 | 1. 
Ding Yuan, Yu Luo, Xin Zhuang, et al.: 16 | “[Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems](https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf),” at *11th USENIX Symposium on Operating Systems Design 17 | and Implementation* (OSDI), October 2014. 18 | 19 | 1. Yury Izrailevsky and Ariel Tseitlin: 20 | “[The Netflix Simian Army](https://netflixtechblog.com/the-netflix-simian-army-16e57fbab116),” 21 | *netflixtechblog.com*, July 19, 2011. 22 | 23 | 1. Daniel Ford, François Labelle, Florentina I. Popovici, et al.: 24 | “[Availability in Globally Distributed Storage Systems](http://research.google.com/pubs/archive/36737.pdf),” 25 | at *9th USENIX Symposium on Operating Systems Design and Implementation* (OSDI), 26 | October 2010. 27 | 28 | 1. Brian Beach: 29 | “[Hard Drive Reliability Update – Sep 2014](https://www.backblaze.com/blog/hard-drive-reliability-update-september-2014/),” *backblaze.com*, September 23, 2014. 30 | 31 | 1. Laurie Voss: 32 | “[AWS: The Good, the Bad and the Ugly](https://web.archive.org/web/20160429075023/http://blog.awe.sm/2012/12/18/aws-the-good-the-bad-and-the-ugly/),” *blog.awe.sm*, December 18, 2012. 33 | 34 | 1. Haryadi S. Gunawi, Mingzhe Hao, Tanakorn 35 | Leesatapornwongsa, et al.: “[What Bugs Live in the Cloud?](http://ucare.cs.uchicago.edu/pdf/socc14-cbs.pdf),” at *5th ACM Symposium on Cloud Computing* (SoCC), November 2014. 36 | [doi:10.1145/2670979.2670986](http://dx.doi.org/10.1145/2670979.2670986) 37 | 38 | 1. Nelson Minar: 39 | “[Leap Second Crashes Half the Internet](http://www.somebits.com/weblog/tech/bad/leap-second-2012.html),” *somebits.com*, July 3, 2012. 40 | 41 | 1. Amazon Web Services: 42 | “[Summary of the Amazon EC2 and Amazon RDS Service Disruption in the US East Region](http://aws.amazon.com/message/65648/),” *aws.amazon.com*, April 29, 2011. 43 | 44 | 1. Richard I. 
Cook: 45 | “[How Complex Systems Fail](http://web.mit.edu/2.75/resources/random/How%20Complex%20Systems%20Fail.pdf),” Cognitive Technologies Laboratory, April 2000. 46 | 47 | 1. Jay Kreps: 48 | “[Getting Real About Distributed System Reliability](http://blog.empathybox.com/post/19574936361/getting-real-about-distributed-system-reliability),” *blog.empathybox.com*, March 19, 2012. 49 | 50 | 1. David Oppenheimer, Archana Ganapathi, and David A. Patterson: 51 | “[Why Do Internet Services Fail, and What Can Be Done About It?](http://static.usenix.org/legacy/events/usits03/tech/full_papers/oppenheimer/oppenheimer.pdf),” at *4th USENIX Symposium on 52 | Internet Technologies and Systems* (USITS), March 2003. 53 | 54 | 1. Nathan Marz: 55 | “[Principles of Software Engineering, Part 1](http://nathanmarz.com/blog/principles-of-software-engineering-part-1.html),” *nathanmarz.com*, April 2, 2013. 56 | 57 | 1. Michael Jurewitz: 58 | “[The Human Impact of Bugs](http://jury.me/blog/2013/3/14/the-human-impact-of-bugs),” 59 | *jury.me*, March 15, 2013. 60 | 61 | 1. Raffi Krikorian: 62 | “[Timelines at Scale](http://www.infoq.com/presentations/Twitter-Timeline-Scalability),” 63 | at *QCon San Francisco*, November 2012. 64 | 65 | 1. Martin Fowler: 66 | *Patterns of Enterprise Application Architecture*. Addison Wesley, 2002. 67 | ISBN: 978-0-321-12742-6 68 | 69 | 1. Kelly Sommers: 70 | “[After all that run around, what caused 500ms disk latency even when we replaced physical server?](https://twitter.com/kellabyte/status/532930540777635840)” *twitter.com*, November 13, 2014. 71 | 72 | 1. Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, et al.: 73 | “[Dynamo: Amazon's Highly Available Key-Value Store](http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf),” at *21st ACM Symposium on Operating 74 | Systems Principles* (SOSP), October 2007. 75 | 76 | 1. 
Greg Linden: 77 | “[Make Data Useful](http://glinden.blogspot.co.uk/2006/12/slides-from-my-talk-at-stanford.html),” slides from presentation at Stanford University Data Mining class (CS345), December 2006. 78 | 79 | 1. Tammy Everts: 80 | “[The Real Cost of Slow Time vs Downtime](https://www.slideshare.net/Radware/radware-cmg2014-tammyevertsslowtimevsdowntime),” *slideshare.net*, November 5, 2014. 81 | 82 | 1. Jake Brutlag: 83 | “[Speed Matters](https://ai.googleblog.com/2009/06/speed-matters.html),” *ai.googleblog.com*, June 23, 2009. 84 | 85 | 1. Tyler Treat: 86 | “[Everything You Know About Latency Is Wrong](http://bravenewgeek.com/everything-you-know-about-latency-is-wrong/),” *bravenewgeek.com*, December 12, 2015. 87 | 88 | 1. Jeffrey Dean and Luiz André Barroso: 89 | “[The Tail at Scale](http://cacm.acm.org/magazines/2013/2/160173-the-tail-at-scale/fulltext),” 90 | *Communications of the ACM*, volume 56, number 2, pages 74–80, February 2013. 91 | [doi:10.1145/2408776.2408794](http://dx.doi.org/10.1145/2408776.2408794) 92 | 93 | 1. Graham Cormode, Vladislav 94 | Shkapenyuk, Divesh Srivastava, and Bojian Xu: 95 | “[Forward Decay: A Practical Time Decay Model for Streaming Systems](http://dimacs.rutgers.edu/~graham/pubs/papers/fwddecay.pdf),” at *25th IEEE International Conference on Data 96 | Engineering* (ICDE), March 2009. 97 | 98 | 1. Ted Dunning and Otmar Ertl: 99 | “[Computing Extremely Accurate Quantiles Using t-Digests](https://github.com/tdunning/t-digest),” *github.com*, March 2014. 100 | 101 | 1. Gil Tene: 102 | “[HdrHistogram](http://www.hdrhistogram.org/),” *hdrhistogram.org*. 103 | 104 | 1. Baron Schwartz: 105 | “[Why Percentiles Don’t Work the Way You Think](https://orangematter.solarwinds.com/2016/11/18/why-percentiles-dont-work-the-way-you-think/),” *solarwinds.com*, November 18, 2016. 106 | 107 | 1. 
James Hamilton: 108 | “[On Designing and Deploying Internet-Scale Services](https://www.usenix.org/legacy/events/lisa07/tech/full_papers/hamilton/hamilton.pdf),” at *21st Large Installation 109 | System Administration Conference* (LISA), November 2007. 110 | 111 | 1. Brian Foote and Joseph Yoder: 112 | “[Big Ball of Mud](http://www.laputan.org/pub/foote/mud.pdf),” at 113 | *4th Conference on Pattern Languages of Programs* (PLoP), 114 | September 1997. 115 | 116 | 1. Frederick P. Brooks: “No Silver Bullet – Essence and 117 | Accident in Software Engineering,” in *The Mythical Man-Month*, Anniversary 118 | edition, Addison-Wesley, 1995. ISBN: 978-0-201-83595-3 119 | 120 | 1. Ben Moseley and Peter Marks: 121 | “[Out of the Tar Pit](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.93.8928),” 122 | at *BCS Software Practice Advancement* (SPA), 2006. 123 | 124 | 1. Rich Hickey: 125 | “[Simple Made Easy](http://www.infoq.com/presentations/Simple-Made-Easy),” 126 | at *Strange Loop*, September 2011. 127 | 128 | 1. Hongyu Pei Breivold, Ivica Crnkovic, and Peter J. Eriksson: 129 | “[Analyzing Software Evolvability](http://www.es.mdh.se/pdf_publications/1251.pdf),” 130 | at *32nd Annual IEEE International Computer Software and Applications Conference* 131 | (COMPSAC), July 2008. 132 | [doi:10.1109/COMPSAC.2008.50](http://dx.doi.org/10.1109/COMPSAC.2008.50) 133 | 134 | -------------------------------------------------------------------------------- /chapter-02-refs.md: -------------------------------------------------------------------------------- 1 | Designing Data-Intensive Applications 2 | ===================================== 3 | 4 | Chapter 2 References 5 | -------------------- 6 | 7 | 1. Edgar F. Codd: 8 | “[A Relational Model of Data for Large Shared Data Banks](https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf),” *Communications of the ACM*, volume 13, number 9 | 6, pages 377–387, June 1970. 
10 | [doi:10.1145/362384.362685](http://dx.doi.org/10.1145/362384.362685) 11 | 12 | 1. Michael Stonebraker and Joseph M. Hellerstein: 13 | “[What Goes Around Comes Around](http://mitpress2.mit.edu/books/chapters/0262693143chapm1.pdf),” 14 | in *Readings in Database Systems*, 4th edition, MIT Press, pages 2–41, 2005. 15 | ISBN: 978-0-262-69314-1 16 | 17 | 1. Pramod J. Sadalage and 18 | Martin Fowler: *NoSQL Distilled*. Addison-Wesley, August 2012. ISBN: 19 | 978-0-321-82662-6 20 | 21 | 1. Eric Evans: 22 | “[NoSQL: What's in a Name?](https://web.archive.org/web/20190623045155/http://blog.sym-link.com/2009/10/30/nosql_whats_in_a_name.html),” *blog.sym-link.com*, October 30, 2009. 23 | 24 | 1. James Phillips: 25 | “[Surprises in Our NoSQL Adoption Survey](http://blog.couchbase.com/nosql-adoption-survey-surprises),” *blog.couchbase.com*, February 8, 2012. 26 | 27 | 1. Michael Wagner: 28 | *SQL/XML:2006 – Evaluierung der Standardkonformität ausgewählter Datenbanksysteme*. 29 | Diplomica Verlag, Hamburg, 2010. ISBN: 978-3-836-64609-3 30 | 31 | 1. “[XML Data (SQL Server)](https://docs.microsoft.com/en-us/sql/relational-databases/xml/xml-data-sql-server?view=sql-server-ver15),” SQL Server documentation, *docs.microsoft.com*, 2013. 32 | 33 | 1. “[PostgreSQL 9.3.1 Documentation](http://www.postgresql.org/docs/9.3/static/index.html),” The PostgreSQL Global Development Group, 2013. 34 | 35 | 1. “[The MongoDB 2.4 Manual](http://docs.mongodb.org/manual/),” MongoDB, Inc., 2013. 36 | 37 | 1. “[RethinkDB 1.11 Documentation](http://www.rethinkdb.com/docs/),” *rethinkdb.com*, 2013. 38 | 39 | 1. “[Apache CouchDB 1.6 Documentation](http://docs.couchdb.org/en/latest/),” *docs.couchdb.org*, 2014. 40 | 41 | 1. Lin Qiao, Kapil Surlaker, Shirshanka Das, et al.: 42 | “[On Brewing Fresh Espresso: LinkedIn’s Distributed Data Serving Platform](http://www.slideshare.net/amywtang/espresso-20952131),” at *ACM International Conference on Management 43 | of Data* (SIGMOD), June 2013. 44 | 45 | 1. 
Rick Long, Mark Harrington, Robert Hain, and Geoff Nicholls: 46 | [*IMS Primer*](http://www.redbooks.ibm.com/redbooks/pdfs/sg245352.pdf). 47 | IBM Redbook SG24-5352-00, IBM International Technical Support Organization, January 2000. 48 | 49 | 1. Stephen D. Bartlett: 50 | “[IBM’s IMS—Myths, Realities, and Opportunities](ftp://public.dhe.ibm.com/software/data/ims/pdf/TCG2013015LI.pdf),” The Clipper Group Navigator, TCG2013015LI, July 2013. 51 | 52 | 1. Sarah Mei: 53 | “[Why You Should Never Use MongoDB](http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/),” 54 | *sarahmei.com*, November 11, 2013. 55 | 56 | 1. J. S. Knowles and D. M. R. Bell: 57 | “The CODASYL Model,” in *Databases—Role and Structure: An Advanced Course*, edited by P. M. 58 | Stocker, P. M. D. Gray, and M. P. Atkinson, pages 19–56, Cambridge University Press, 1984. ISBN: 59 | 978-0-521-25430-4 60 | 61 | 1. Charles W. Bachman: 62 | “[The Programmer as Navigator](http://dl.acm.org/citation.cfm?id=362534),” 63 | *Communications of the ACM*, volume 16, number 11, pages 653–658, November 1973. 64 | [doi:10.1145/355611.362534](http://dx.doi.org/10.1145/355611.362534) 65 | 66 | 1. Joseph M. Hellerstein, Michael Stonebraker, and James Hamilton: 67 | “[Architecture of a Database System](http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf),” 68 | *Foundations and Trends in Databases*, volume 1, number 2, pages 141–259, November 2007. 69 | [doi:10.1561/1900000002](http://dx.doi.org/10.1561/1900000002) 70 | 71 | 1. Sandeep Parikh and Kelly Stirman: 72 | “[Schema Design for Time Series Data in MongoDB](http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb),” *blog.mongodb.org*, October 30, 2013. 73 | 74 | 1. Martin Fowler: 75 | “[Schemaless Data Structures](http://martinfowler.com/articles/schemaless/),” 76 | *martinfowler.com*, January 7, 2013. 77 | 78 | 1. Amr Awadallah: 79 | “[Schema-on-Read vs. 
Schema-on-Write](http://www.slideshare.net/awadallah/schemaonread-vs-schemaonwrite),” at *Berkeley EECS RAD Lab Retreat*, Santa Cruz, CA, May 2009. 80 | 81 | 1. Martin Odersky: 82 | “[The Trouble with Types](http://www.infoq.com/presentations/data-types-issues),” 83 | at *Strange Loop*, September 2013. 84 | 85 | 1. Conrad Irwin: 86 | “[MongoDB—Confessions of a PostgreSQL Lover](https://speakerdeck.com/conradirwin/mongodb-confessions-of-a-postgresql-lover),” at *HTML5DevConf*, October 2013. 87 | 88 | 1. “[Percona Toolkit Documentation: pt-online-schema-change](http://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html),” Percona Ireland Ltd., 2013. 89 | 90 | 1. Rany Keddo, Tobias Bielohlawek, and Tobias Schmidt: 91 | “[Large Hadron Migrator](https://github.com/soundcloud/lhm),” SoundCloud, 2013. 92 | 93 | 1. Shlomi Noach: 94 | “[gh-ost: GitHub's Online Schema Migration Tool for MySQL](http://githubengineering.com/gh-ost-github-s-online-migration-tool-for-mysql/),” *githubengineering.com*, August 1, 2016. 95 | 96 | 1. James C. Corbett, Jeffrey Dean, Michael Epstein, et al.: 97 | “[Spanner: Google’s Globally-Distributed Database](https://research.google/pubs/pub39966/),” 98 | at *10th USENIX Symposium on Operating System Design and Implementation* (OSDI), 99 | October 2012. 100 | 101 | 1. Donald K. Burleson: 102 | “[Reduce I/O with Oracle Cluster Tables](http://www.dba-oracle.com/oracle_tip_hash_index_cluster_table.htm),” *dba-oracle.com*. 103 | 104 | 1. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, et al.: 105 | “[Bigtable: A Distributed Storage System for Structured Data](https://research.google/pubs/pub27898/),” at *7th USENIX Symposium on Operating System Design and 106 | Implementation* (OSDI), November 2006. 107 | 108 | 1. Bobbie J. Cochrane and Kathy A. McKnight: 109 | “[DB2 JSON Capabilities, Part 1: Introduction to DB2 JSON](http://www.ibm.com/developerworks/data/library/techarticle/dm-1306nosqlforjson1/),” IBM developerWorks, June 20, 2013. 
110 | 111 | 1. Herb Sutter: 112 | “[The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software](http://www.gotw.ca/publications/concurrency-ddj.htm),” *Dr. Dobb's Journal*, 113 | volume 30, number 3, pages 202-210, March 2005. 114 | 115 | 1. Joseph M. Hellerstein: 116 | “[The Declarative Imperative: Experiences and Conjectures in Distributed Logic](http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-90.pdf),” Electrical Engineering and 117 | Computer Sciences, University of California at Berkeley, Tech report UCB/EECS-2010-90, June 118 | 2010. 119 | 120 | 1. Jeffrey Dean and Sanjay Ghemawat: 121 | “[MapReduce: Simplified Data Processing on Large Clusters](https://research.google/pubs/pub62/),” at *6th USENIX Symposium on Operating System Design and 122 | Implementation* (OSDI), December 2004. 123 | 124 | 1. Craig Kerstiens: 125 | “[JavaScript in Your Postgres](https://blog.heroku.com/javascript_in_your_postgres),” 126 | *blog.heroku.com*, June 5, 2013. 127 | 128 | 1. Nathan Bronson, Zach Amsden, George Cabrera, et al.: 129 | “[TAO: Facebook’s Distributed Data Store for the Social Graph](https://www.usenix.org/conference/atc13/technical-sessions/presentation/bronson),” at 130 | *USENIX Annual Technical Conference* (USENIX ATC), June 2013. 131 | 132 | 1. “[Apache TinkerPop3.2.3 Documentation](http://tinkerpop.apache.org/docs/3.2.3/reference/),” *tinkerpop.apache.org*, October 2016. 133 | 134 | 1. “[The Neo4j Manual v2.0.0](http://docs.neo4j.org/chunked/2.0.0/index.html),” 135 | Neo Technology, 2013. 136 | 137 | 1. Emil Eifrem: 138 | [Twitter correspondence](https://twitter.com/emileifrem/status/419107961512804352), January 3, 2014. 139 | 140 | 1. David Beckett and Tim Berners-Lee: 141 | “[Turtle – Terse RDF Triple Language](http://www.w3.org/TeamSubmission/turtle/),” 142 | W3C Team Submission, March 28, 2011. 143 | 144 | 1. “[Datomic Development Resources](http://docs.datomic.com/),” Metadata Partners, LLC, 2013. 145 | 146 | 1. 
W3C RDF Working Group: 147 | “[Resource Description Framework (RDF)](http://www.w3.org/RDF/),” 148 | *w3.org*, 10 February 2004. 149 | 150 | 1. “[Apache Jena](http://jena.apache.org/),” 151 | Apache Software Foundation. 152 | 153 | 1. Steve Harris, Andy Seaborne, and Eric 154 | Prud'hommeaux: “[SPARQL 1.1 Query Language](http://www.w3.org/TR/sparql11-query/),” 155 | W3C Recommendation, March 2013. 156 | 157 | 1. Todd J. Green, Shan Shan Huang, Boon Thau Loo, and Wenchao Zhou: 158 | “[Datalog and Recursive Query Processing](http://blogs.evergreen.edu/sosw/files/2014/04/Green-Vol5-DBS-017.pdf),” *Foundations and Trends in Databases*, 159 | volume 5, number 2, pages 105–195, November 2013. 160 | [doi:10.1561/1900000017](http://dx.doi.org/10.1561/1900000017) 161 | 162 | 1. Stefano Ceri, Georg Gottlob, and Letizia Tanca: 163 | “[What You Always Wanted to Know About Datalog (And Never Dared to Ask)](https://www.researchgate.net/profile/Letizia_Tanca/publication/3296132_What_you_always_wanted_to_know_about_Datalog_and_never_dared_to_ask/links/0fcfd50ca2d20473ca000000.pdf),” *IEEE 164 | Transactions on Knowledge and Data Engineering*, volume 1, number 1, pages 146–166, March 1989. 165 | [doi:10.1109/69.43410](http://dx.doi.org/10.1109/69.43410) 166 | 167 | 1. Serge Abiteboul, Richard Hull, and Victor Vianu: 168 | [*Foundations of Databases*](http://webdam.inria.fr/Alice/). Addison-Wesley, 1995. 169 | ISBN: 978-0-201-53771-0, available online at *webdam.inria.fr/Alice* 170 | 171 | 1. Nathan Marz: 172 | “[Cascalog](https://github.com/nathanmarz/cascalog),” *github.com*. 173 | 174 | 1. Dennis A. Benson, 175 | Ilene Karsch-Mizrachi, David J. Lipman, et al.: 176 | “[GenBank](https://academic.oup.com/nar/article/36/suppl_1/D25/2507746),” 177 | *Nucleic Acids Research*, volume 36, Database issue, pages D25–D30, December 2007. 178 | [doi:10.1093/nar/gkm929](http://dx.doi.org/10.1093/nar/gkm929) 179 | 180 | 1. 
Fons Rademakers: 181 | “[ROOT for Big Data Analysis](https://indico.cern.ch/event/246453/contributions/1566610/attachments/423154/587535/ROOT-BigData-Analysis-London-2013.pdf),” at *Workshop on the Future of Big Data Management*, 182 | London, UK, June 2013. 183 | 184 | -------------------------------------------------------------------------------- /chapter-03-refs.md: -------------------------------------------------------------------------------- 1 | Designing Data-Intensive Applications 2 | ===================================== 3 | 4 | Chapter 3 References 5 | -------------------- 6 | 7 | 1. Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman: 8 | *Data Structures and Algorithms*. Addison-Wesley, 1983. ISBN: 978-0-201-00023-8 9 | 10 | 1. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and 11 | Clifford Stein: *Introduction to Algorithms*, 3rd edition. MIT Press, 2009. 12 | ISBN: 978-0-262-53305-8 13 | 14 | 1. Justin Sheehy and David Smith: 15 | “[Bitcask: A Log-Structured Hash Table for Fast Key/Value Data](https://riak.com/assets/bitcask-intro.pdf),” Basho Technologies, April 2010. 16 | 17 | 1. Yinan Li, Bingsheng He, Robin Jun Yang, et al.: 18 | “[Tree Indexing on Solid State Drives](http://www.vldb.org/pvldb/vldb2010/papers/R106.pdf),” 19 | *Proceedings of the VLDB Endowment*, volume 3, number 1, pages 1195–1206, 20 | September 2010. 21 | 22 | 1. Goetz Graefe: 23 | “[Modern B-Tree Techniques](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.219.7269&rep=rep1&type=pdf),” 24 | *Foundations and Trends in Databases*, volume 3, number 4, pages 203–402, August 2011. 25 | [doi:10.1561/1900000028](http://dx.doi.org/10.1561/1900000028) 26 | 27 | 1. Jeffrey Dean and Sanjay Ghemawat: 28 | “[LevelDB Implementation Notes](https://github.com/google/leveldb/blob/master/doc/impl.md),” 29 | *github.com*. 30 | 31 | 1. 
Dhruba Borthakur: 32 | “[The History of RocksDB](https://rocksdb.blogspot.com/2013/11/the-history-of-rocksdb.html),” 33 | *rocksdb.blogspot.com*, November 24, 2013. 34 | 35 | 1. Matteo Bertozzi: 36 | “[Apache HBase I/O – HFile](https://blog.cloudera.com/apache-hbase-i-o-hfile/),” *blog.cloudera.com*, June 29, 2012. 37 | 38 | 1. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, et al.: 39 | “[Bigtable: A Distributed Storage System for Structured Data](https://research.google/pubs/pub27898/),” at *7th USENIX Symposium on Operating System Design and 40 | Implementation* (OSDI), November 2006. 41 | 42 | 1. Patrick 43 | O'Neil, Edward Cheng, Dieter Gawlick, and Elizabeth O'Neil: 44 | “[The Log-Structured Merge-Tree (LSM-Tree)](http://www.cs.umb.edu/~poneil/lsmtree.pdf),” 45 | *Acta Informatica*, volume 33, number 4, pages 351–385, June 1996. 46 | [doi:10.1007/s002360050048](http://dx.doi.org/10.1007/s002360050048) 47 | 48 | 1. Mendel Rosenblum and John K. Ousterhout: 49 | “[The Design and Implementation of a Log-Structured File System](http://research.cs.wisc.edu/areas/os/Qual/papers/lfs.pdf),” 50 | *ACM Transactions on Computer Systems*, volume 10, number 1, pages 26–52, February 1992. 51 | [doi:10.1145/146941.146943](http://dx.doi.org/10.1145/146941.146943) 52 | 53 | 1. Adrien Grand: 54 | “[What Is in a Lucene Index?](http://www.slideshare.net/lucenerevolution/what-is-inaluceneagrandfinal),” at *Lucene/Solr Revolution*, November 14, 2013. 55 | 56 | 1. Deepak Kandepet: 57 | “[Hacking Lucene—The Index Format](https://web.archive.org/web/20160316190830/http://hackerlabs.github.io/blog/2011/10/01/hacking-lucene-the-index-format/index.html),” *hackerlabs.github.io*, October 1, 2011. 58 | 59 | 1. Michael McCandless: 60 | “[Visualizing Lucene's Segment Merges](http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html),” *blog.mikemccandless.com*, February 11, 2011. 61 | 62 | 1. Burton H. 
Bloom: 63 | “[Space/Time Trade-offs in Hash Coding with Allowable Errors](https://people.cs.umass.edu/~emery/classes/cmpsci691st/readings/Misc/p422-bloom.pdf),” 64 | *Communications of the ACM*, volume 13, number 7, pages 422–426, July 1970. 65 | [doi:10.1145/362686.362692](http://dx.doi.org/10.1145/362686.362692) 66 | 67 | 1. “[Operating Cassandra: Compaction](https://cassandra.apache.org/doc/latest/operating/compaction.html),” Apache Cassandra Documentation v4.0, 2016. 68 | 69 | 1. Rudolf Bayer and Edward M. McCreight: 70 | “[Organization and Maintenance of Large Ordered Indices](http://www.dtic.mil/cgi-bin/GetTRDoc?AD=AD0712079),” Boeing Scientific Research Laboratories, Mathematical and Information Sciences 71 | Laboratory, report no. 20, July 1970. 72 | 73 | 1. Douglas Comer: 74 | “[The Ubiquitous B-Tree](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.96.6637&rep=rep1&type=pdf),” *ACM Computing Surveys*, volume 11, number 2, pages 121–137, June 1979. 75 | [doi:10.1145/356770.356776](http://dx.doi.org/10.1145/356770.356776) 76 | 77 | 1. Emmanuel Goossaert: 78 | “[Coding for SSDs](http://codecapsule.com/2014/02/12/coding-for-ssds-part-1-introduction-and-table-of-contents/),” *codecapsule.com*, February 12, 2014. 79 | 80 | 1. C. Mohan and Frank Levine: 81 | “[ARIES/IM: An Efficient and High Concurrency Index Management Method Using Write-Ahead Logging](http://www.ics.uci.edu/~cs223/papers/p371-mohan.pdf),” at *ACM 82 | International Conference on Management of Data* (SIGMOD), June 1992. 83 | [doi:10.1145/130283.130338](http://dx.doi.org/10.1145/130283.130338) 84 | 85 | 1. Howard Chu: 86 | “[LDAP at Lightning Speed](https://buildstuff14.sched.com/event/08a1a368e272eb599a52e08b4c3c779d),” 87 | at *Build Stuff '14*, November 2014. 88 | 89 | 1. Bradley C. 
Kuszmaul: 90 | “[A Comparison of Fractal Trees to Log-Structured Merge (LSM) Trees](http://www.pandademo.com/wp-content/uploads/2017/12/A-Comparison-of-Fractal-Trees-to-Log-Structured-Merge-LSM-Trees.pdf),” *tokutek.com*, 91 | April 22, 2014. 92 | 93 | 1. Manos Athanassoulis, Michael S. Kester, 94 | Lukas M. Maas, et al.: “[Designing Access Methods: The RUM Conjecture](http://openproceedings.org/2016/conf/edbt/paper-12.pdf),” at *19th International Conference on Extending Database 95 | Technology* (EDBT), March 2016. 96 | [doi:10.5441/002/edbt.2016.42](http://dx.doi.org/10.5441/002/edbt.2016.42) 97 | 98 | 1. Peter Zaitsev: 99 | “[Innodb Double Write](https://www.percona.com/blog/2006/08/04/innodb-double-write/),” 100 | *percona.com*, August 4, 2006. 101 | 102 | 1. Tomas Vondra: 103 | “[On the Impact of Full-Page Writes](http://blog.2ndquadrant.com/on-the-impact-of-full-page-writes/),” *blog.2ndquadrant.com*, November 23, 2016. 104 | 105 | 1. Mark Callaghan: 106 | “[The Advantages of an LSM vs a B-Tree](http://smalldatum.blogspot.co.uk/2016/01/summary-of-advantages-of-lsm-vs-b-tree.html),” *smalldatum.blogspot.co.uk*, January 19, 2016. 107 | 108 | 1. Mark Callaghan: 109 | “[Choosing Between Efficiency and Performance with RocksDB](https://codemesh.io/codemesh2016/mark-callaghan),” at *Code Mesh*, November 4, 2016. 110 | 111 | 1. Michi Mutsuzaki: “MySQL vs. LevelDB” [URL inactive], August 2011. 112 | 113 | 1. Benjamin Coverston, 114 | Jonathan Ellis, et al.: “[CASSANDRA-1608: Redesigned Compaction](https://issues.apache.org/jira/browse/CASSANDRA-1608),” *issues.apache.org*, July 2011. 115 | 116 | 1. Igor Canadi, Siying Dong, and Mark Callaghan: 117 | “[RocksDB Tuning Guide](https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide),” 118 | *github.com*, 2016. 119 | 120 | 1. [*MySQL 5.7 Reference Manual*](http://dev.mysql.com/doc/refman/5.7/en/index.html). 121 | Oracle, 2014. 122 | 123 | 1. 
[*Books Online for SQL Server 2012*](http://msdn.microsoft.com/en-us/library/ms130214.aspx). 124 | Microsoft, 2012. 125 | 126 | 1. Joe Webb: 127 | “[Using Covering Indexes to Improve Query Performance](https://www.simple-talk.com/sql/learn-sql-server/using-covering-indexes-to-improve-query-performance/),” *simple-talk.com*, 29 September 2008. 128 | 129 | 1. Frank Ramsak, Volker Markl, Robert Fenk, et al.: 130 | “[Integrating the UB-Tree into a Database System Kernel](http://www.vldb.org/conf/2000/P263.pdf),” 131 | at *26th International Conference on Very Large Data Bases* (VLDB), September 2000. 132 | 133 | 1. The PostGIS Development Group: 134 | “[PostGIS 2.1.2dev Manual](http://postgis.net/docs/manual-2.1/),” 135 | *postgis.net*, 2014. 136 | 137 | 1. Robert Escriva, Bernard Wong, and Emin Gün Sirer: 138 | “[HyperDex: A Distributed, Searchable Key-Value Store](http://www.cs.princeton.edu/courses/archive/fall13/cos518/papers/hyperdex.pdf),” at *ACM SIGCOMM Conference*, August 2012. 139 | [doi:10.1145/2377677.2377681](http://dx.doi.org/10.1145/2377677.2377681) 140 | 141 | 1. Michael McCandless: 142 | “[Lucene's FuzzyQuery Is 100 Times Faster in 4.0](http://blog.mikemccandless.com/2011/03/lucenes-fuzzyquery-is-100-times-faster.html),” *blog.mikemccandless.com*, March 24, 2011. 143 | 144 | 1. Steffen Heinz, Justin Zobel, and Hugh E. Williams: 145 | “[Burst Tries: A Fast, Efficient Data Structure for String Keys](http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3499),” 146 | *ACM Transactions on Information Systems*, volume 20, number 2, pages 192–223, April 2002. 147 | [doi:10.1145/506309.506312](http://dx.doi.org/10.1145/506309.506312) 148 | 149 | 1. Klaus U. Schulz and Stoyan Mihov: 150 | “[Fast String Correction with Levenshtein Automata](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.16.652),” 151 | *International Journal on Document Analysis and Recognition*, 152 | volume 5, number 1, pages 67–85, November 2002. 
153 | [doi:10.1007/s10032-002-0082-8](http://dx.doi.org/10.1007/s10032-002-0082-8) 154 | 155 | 1. Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze: 156 | [*Introduction to Information Retrieval*](http://nlp.stanford.edu/IR-book/). 157 | Cambridge University Press, 2008. ISBN: 978-0-521-86571-5, available online at *nlp.stanford.edu/IR-book* 158 | 159 | 1. Michael Stonebraker, Samuel Madden, Daniel J. Abadi, et al.: 160 | “[The End of an Architectural Era (It’s Time for a Complete Rewrite)](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.137.3697&rep=rep1&type=pdf),” at 161 | *33rd International Conference on Very Large Data Bases* (VLDB), September 2007. 162 | 163 | 1. “[VoltDB Technical Overview White Paper](https://www.voltdb.com/files/voltdb-technical-overview/),” VoltDB, 2014. 164 | 165 | 1. Stephen M. Rumble, Ankita Kejriwal, and John K. Ousterhout: 166 | “[Log-Structured Memory for DRAM-Based Storage](https://www.usenix.org/system/files/conference/fast14/fast14-paper_rumble.pdf),” at *12th USENIX Conference on File and Storage 167 | Technologies* (FAST), February 2014. 168 | 169 | 1. Stavros Harizopoulos, Daniel J. Abadi, 170 | Samuel Madden, and Michael Stonebraker: 171 | “[OLTP Through the Looking Glass, and What We Found There](http://hstore.cs.brown.edu/papers/hstore-lookingglass.pdf),” at *ACM International Conference on Management of Data* 172 | (SIGMOD), June 2008. 173 | [doi:10.1145/1376616.1376713](http://dx.doi.org/10.1145/1376616.1376713) 174 | 175 | 1. Justin DeBrabant, Andrew Pavlo, Stephen Tu, et al.: 176 | “[Anti-Caching: A New Approach to Database Management System Architecture](http://www.vldb.org/pvldb/vol6/p1942-debrabant.pdf),” *Proceedings of the VLDB Endowment*, volume 6, 177 | number 14, pages 1942–1953, September 2013. 178 | 179 | 1. Joy Arulraj, Andrew Pavlo, and Subramanya R. 
Dulloor: 180 | “[Let's Talk About Storage & Recovery Methods for Non-Volatile Memory Database Systems](http://www.pdl.cmu.edu/PDL-FTP/NVM/storage.pdf),” at *ACM International Conference on 181 | Management of Data* (SIGMOD), June 2015. 182 | [doi:10.1145/2723372.2749441](http://dx.doi.org/10.1145/2723372.2749441) 183 | 184 | 1. Edgar F. Codd, S. B. Codd, and C. T. Salley: 185 | “[Providing OLAP to User-Analysts: An IT Mandate](https://pdfs.semanticscholar.org/a0bd/1491a54a4de428c5eef9b836ef6ee2915fe7.pdf),” 186 | E. F. Codd Associates, 1993. 187 | 188 | 1. Surajit Chaudhuri and Umeshwar Dayal: 189 | “[An Overview of Data Warehousing and OLAP Technology](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/sigrecord.pdf),” *ACM SIGMOD Record*, volume 26, number 1, pages 65–74, 190 | March 1997. [doi:10.1145/248603.248616](http://dx.doi.org/10.1145/248603.248616) 191 | 192 | 1. Per-Åke Larson, Cipri Clinciu, Campbell Fraser, et al.: 193 | “[Enhancements to SQL Server Column Stores](http://research.microsoft.com/pubs/193599/Apollo3%20-%20Sigmod%202013%20-%20final.pdf),” at *ACM International Conference on Management of Data* 194 | (SIGMOD), June 2013. 195 | 196 | 1. Franz Färber, Norman May, Wolfgang Lehner, et al.: 197 | “[The SAP HANA Database – An Architecture Overview](http://sites.computer.org/debull/A12mar/hana.pdf),” 198 | *IEEE Data Engineering Bulletin*, volume 35, number 1, pages 28–33, March 2012. 199 | 200 | 1. Michael Stonebraker: 201 | “[The Traditional RDBMS Wisdom Is (Almost Certainly) All Wrong](http://slideshot.epfl.ch/talks/166),” presentation at *EPFL*, May 2013. 202 | 203 | 1. Daniel J. Abadi: 204 | “[Classifying the SQL-on-Hadoop Solutions](https://web.archive.org/web/20150622074951/http://hadapt.com/blog/2013/10/02/classifying-the-sql-on-hadoop-solutions/),” *hadapt.com*, October 2, 2013. 205 | 206 | 1. 
Marcel Kornacker, Alexander Behm, Victor Bittorf, et al.: 207 | “[Impala: A Modern, Open-Source SQL Engine for Hadoop](http://pandis.net/resources/cidr15impala.pdf),” at *7th Biennial Conference on Innovative Data Systems 208 | Research* (CIDR), January 2015. 209 | 210 | 1. Sergey Melnik, Andrey Gubarev, Jing Jing Long, et al.: 211 | “[Dremel: Interactive Analysis of Web-Scale Datasets](https://research.google/pubs/pub36632/),” at *36th International Conference on Very Large Data Bases* (VLDB), pages 212 | 330–339, September 2010. 213 | 214 | 1. Ralph Kimball and Margy Ross: 215 | *The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling*, 216 | 3rd edition. John Wiley & Sons, July 2013. ISBN: 978-1-118-53080-1 217 | 218 | 1. Derrick Harris: 219 | “[Why Apple, eBay, and Walmart Have Some of the Biggest Data Warehouses You’ve Ever Seen](http://gigaom.com/2013/03/27/why-apple-ebay-and-walmart-have-some-of-the-biggest-data-warehouses-youve-ever-seen/),” 220 | *gigaom.com*, March 27, 2013. 221 | 222 | 1. Julien Le Dem: 223 | “[Dremel Made Simple with Parquet](https://blog.twitter.com/engineering/en_us/a/2013/dremel-made-simple-with-parquet.html),” 224 | *blog.twitter.com*, September 11, 2013. 225 | 226 | 1. Daniel J. Abadi, Peter Boncz, Stavros 227 | Harizopoulos, et al.: 228 | “[The Design and Implementation of Modern Column-Oriented Database Systems](http://cs-www.cs.yale.edu/homes/dna/papers/abadi-column-stores.pdf),” *Foundations and Trends in 229 | Databases*, volume 5, number 3, pages 197–280, December 2013. 230 | [doi:10.1561/1900000024](http://dx.doi.org/10.1561/1900000024) 231 | 232 | 1. Peter Boncz, Marcin Zukowski, and Niels Nes: 233 | “[MonetDB/X100: Hyper-Pipelining Query Execution](http://cidrdb.org/cidr2005/papers/P19.pdf),” 234 | at *2nd Biennial Conference on Innovative Data Systems Research* (CIDR), January 2005. 235 | 236 | 1. Jingren Zhou and Kenneth A. 
Ross: 237 | “[Implementing Database Operations Using SIMD Instructions](http://www1.cs.columbia.edu/~kar/pubsk/simd.pdf),” 238 | at *ACM International Conference on Management of Data* (SIGMOD), pages 145–156, June 2002. 239 | [doi:10.1145/564691.564709](http://dx.doi.org/10.1145/564691.564709) 240 | 241 | 1. Michael Stonebraker, Daniel J. Abadi, Adam Batkin, et al.: 242 | “[C-Store: A Column-oriented DBMS](http://www.cs.umd.edu/~abadi/vldb.pdf),” 243 | at *31st International Conference on Very Large Data Bases* (VLDB), pages 553–564, September 2005. 244 | 245 | 1. Andrew Lamb, Matt Fuller, Ramakrishna Varadarajan, et al.: 246 | “[The Vertica Analytic Database: C-Store 7 Years Later](http://vldb.org/pvldb/vol5/p1790_andrewlamb_vldb2012.pdf),” 247 | *Proceedings of the VLDB Endowment*, volume 5, number 12, pages 1790–1801, August 2012. 248 | 249 | 1. Julien Le Dem and Nong Li: 250 | “[Efficient Data Storage for Analytics with Apache Parquet 2.0](http://www.slideshare.net/julienledem/th-210pledem),” at *Hadoop Summit*, San Jose, 251 | June 2014. 252 | 253 | 1. Jim Gray, Surajit Chaudhuri, Adam Bosworth, et al.: 254 | “[Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals](http://arxiv.org/pdf/cs/0701155.pdf),” *Data Mining and Knowledge 255 | Discovery*, volume 1, number 1, pages 29–53, March 1997. 256 | [doi:10.1023/A:1009726021843](http://dx.doi.org/10.1023/A:1009726021843) 257 | 258 | -------------------------------------------------------------------------------- /chapter-04-refs.md: -------------------------------------------------------------------------------- 1 | Designing Data-Intensive Applications 2 | ===================================== 3 | 4 | Chapter 4 References 5 | -------------------- 6 | 7 | 1. “[Java Object Serialization Specification](http://docs.oracle.com/javase/7/docs/platform/serialization/spec/serialTOC.html),” *docs.oracle.com*, 2010. 8 | 9 | 1. 
“[Ruby 2.2.0 API Documentation](http://ruby-doc.org/core-2.2.0/),” *ruby-doc.org*, December 2014. 10 | 11 | 1. “[The Python 3.4.3 Standard Library Reference Manual](https://docs.python.org/3/library/pickle.html),” *docs.python.org*, February 2015. 12 | 13 | 1. “[EsotericSoftware/kryo](https://github.com/EsotericSoftware/kryo),” 14 | *github.com*, October 2014. 15 | 16 | 1. “[CWE-502: Deserialization of Untrusted Data](http://cwe.mitre.org/data/definitions/502.html),” Common Weakness Enumeration, *cwe.mitre.org*, 17 | July 30, 2014. 18 | 19 | 1. Steve Breen: 20 | “[What Do WebLogic, WebSphere, JBoss, Jenkins, OpenNMS, and Your Application Have in Common? This Vulnerability](http://foxglovesecurity.com/2015/11/06/what-do-weblogic-websphere-jboss-jenkins-opennms-and-your-application-have-in-common-this-vulnerability/),” *foxglovesecurity.com*, November 6, 2015. 21 | 22 | 1. Patrick McKenzie: 23 | “[What the Rails Security Issue Means for Your Startup](http://www.kalzumeus.com/2013/01/31/what-the-rails-security-issue-means-for-your-startup/),” *kalzumeus.com*, January 31, 2013. 24 | 25 | 1. Eishay Smith: 26 | “[jvm-serializers wiki](https://github.com/eishay/jvm-serializers/wiki),” 27 | *github.com*, November 2014. 28 | 29 | 1. “[XML Is a Poor Copy of S-Expressions](http://c2.com/cgi/wiki?XmlIsaPoorCopyOfEssExpressions),” *c2.com* wiki. 30 | 31 | 1. Matt Harris: 32 | “[Snowflake: An Update and Some Very Important Information](https://groups.google.com/forum/#!topic/twitter-development-talk/ahbvo3VTIYI),” email to *Twitter Development 33 | Talk* mailing list, October 19, 2010. 34 | 35 | 1. Shudi (Sandy) Gao, C. M. Sperberg-McQueen, and 36 | Henry S. Thompson: “[XML Schema 1.1](http://www.w3.org/XML/Schema),” W3C Recommendation, 37 | May 2001. 38 | 39 | 1. Francis Galiegue, Kris Zyp, and Gary Court: 40 | “[JSON Schema](http://json-schema.org/),” IETF Internet-Draft, February 2013. 41 | 42 | 1. 
Yakov Shafranovich: 43 | “[RFC 4180: Common Format and MIME Type for Comma-Separated Values (CSV) Files](https://tools.ietf.org/html/rfc4180),” October 2005. 44 | 45 | 1. “[MessagePack Specification](http://msgpack.org/),” *msgpack.org*. 46 | 47 | 1. Mark Slee, Aditya Agarwal, and Marc Kwiatkowski: 48 | “[Thrift: Scalable Cross-Language Services Implementation](http://thrift.apache.org/static/files/thrift-20070401.pdf),” Facebook technical report, April 2007. 49 | 50 | 1. “[Protocol Buffers Developer Guide](https://developers.google.com/protocol-buffers/docs/overview),” Google, Inc., *developers.google.com*. 51 | 52 | 1. Igor Anishchenko: 53 | “[Thrift vs Protocol Buffers vs Avro - Biased Comparison](http://www.slideshare.net/IgorAnishchenko/pb-vs-thrift-vs-avro),” *slideshare.net*, September 17, 2012. 54 | 55 | 1. “[A Matrix of the Features Each Individual Language Library Supports](http://wiki.apache.org/thrift/LibraryFeatures),” 56 | *wiki.apache.org*. 57 | 58 | 1. Martin Kleppmann: 59 | “[Schema Evolution in Avro, Protocol Buffers and Thrift](http://martin.kleppmann.com/2012/12/05/schema-evolution-in-avro-protocol-buffers-thrift.html),” *martin.kleppmann.com*, December 5, 2012. 60 | 61 | 1. “[Apache Avro 1.7.7 Documentation](http://avro.apache.org/docs/1.7.7/),” *avro.apache.org*, July 2014. 62 | 63 | 1. Doug Cutting, Chad Walters, Jim Kellerman, et al.: 64 | “[[PROPOSAL] New Subproject: Avro](http://mail-archives.apache.org/mod_mbox/hadoop-general/200904.mbox/%3C49D53694.1050906@apache.org%3E),” email thread on *hadoop-general* mailing list, 65 | *mail-archives.apache.org*, April 2009. 66 | 67 | 1. Tony Hoare: 68 | “[Null References: The Billion Dollar Mistake](http://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare),” at *QCon London*, 69 | March 2009. 70 | 71 | 1. 
Aditya Auradkar and Tom Quiggle: 72 | “[Introducing Espresso—LinkedIn's Hot New Distributed Document Store](https://engineering.linkedin.com/espresso/introducing-espresso-linkedins-hot-new-distributed-document-store),” *engineering.linkedin.com*, January 21, 2015. 73 | 74 | 1. Jay Kreps: 75 | “[Putting Apache Kafka to Use: A Practical Guide to Building a Stream Data Platform (Part 2)](http://blog.confluent.io/2015/02/25/stream-data-platform-2/),” *blog.confluent.io*, 76 | February 25, 2015. 77 | 78 | 1. Gwen Shapira: 79 | “[The Problem of Managing Schemas](http://radar.oreilly.com/2014/11/the-problem-of-managing-schemas.html),” *radar.oreilly.com*, November 4, 2014. 80 | 81 | 1. “[Apache Pig 0.14.0 Documentation](http://pig.apache.org/docs/r0.14.0/),” *pig.apache.org*, November 2014. 82 | 83 | 1. John Larmouth: 84 | [*ASN.1 Complete*](http://www.oss.com/asn1/resources/books-whitepapers-pubs/larmouth-asn1-book.pdf). 85 | Morgan Kaufmann, 1999. ISBN: 978-0-122-33435-1 86 | 87 | 1. Russell Housley, Warwick Ford, Tim Polk, and David Solo: 88 | “[RFC 2459: Internet X.509 Public Key Infrastructure: Certificate and CRL Profile](https://www.ietf.org/rfc/rfc2459.txt),” IETF Network Working Group, Standards Track, 89 | January 1999. 90 | 91 | 1. Lev Walkin: 92 | “[Question: Extensibility and Dropping Fields](http://lionet.info/asn1c/blog/2010/09/21/question-extensibility-removing-fields/),” *lionet.info*, September 21, 2010. 93 | 94 | 1. Jesse James Garrett: 95 | “[Ajax: A New Approach to Web Applications](https://web.archive.org/web/20181231094556/https://www.adaptivepath.com/ideas/ajax-new-approach-web-applications/),” *adaptivepath.com*, February 18, 2005. 96 | 97 | 1. Sam Newman: *Building Microservices*. 98 | O'Reilly Media, 2015. ISBN: 978-1-491-95035-7 99 | 100 | 1. Chris Richardson: 101 | “[Microservices: Decomposing Applications for Deployability and Scalability](http://www.infoq.com/articles/microservices-intro),” *infoq.com*, May 25, 2014. 102 | 103 | 1. 
Pat Helland: 104 | “[Data on the Outside Versus Data on the Inside](http://cidrdb.org/cidr2005/papers/P12.pdf),” at *2nd Biennial Conference on Innovative Data Systems Research* (CIDR), 105 | January 2005. 106 | 107 | 1. Roy Thomas Fielding: 108 | “[Architectural Styles and the Design of Network-Based Software Architectures](https://www.ics.uci.edu/~fielding/pubs/dissertation/fielding_dissertation.pdf),” PhD Thesis, University of 109 | California, Irvine, 2000. 110 | 111 | 1. Roy Thomas Fielding: 112 | “[REST APIs Must Be Hypertext-Driven](http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven),” *roy.gbiv.com*, October 20, 2008. 113 | 114 | 1. “[REST in Peace, SOAP](https://royal.pingdom.com/rest-in-peace-soap/),” *royal.pingdom.com*, October 15, 2010. 115 | 116 | 1. “[Web Services Standards as of Q1 2007](https://www.innoq.com/resources/ws-standards-poster/),” *innoq.com*, February 2007. 117 | 118 | 1. Pete Lacey: 119 | “[The S Stands for Simple](http://harmful.cat-v.org/software/xml/soap/simple),” *harmful.cat-v.org*, November 15, 2006. 120 | 121 | 1. Stefan Tilkov: 122 | “[Interview: Pete Lacey Criticizes Web Services](http://www.infoq.com/articles/pete-lacey-ws-criticism),” *infoq.com*, December 12, 2006. 123 | 124 | 1. “[OpenAPI Specification (fka Swagger RESTful API Documentation Specification) Version 2.0](http://swagger.io/specification/),” 125 | *swagger.io*, September 8, 2014. 126 | 127 | 1. Michi Henning: 128 | “[The Rise and Fall of CORBA](https://cacm.acm.org/magazines/2008/8/5336-the-rise-and-fall-of-corba/fulltext),” 129 | *Communications of the ACM*, volume 51, number 8, pages 52–57, August 2008. 130 | [doi:10.1145/1378704.1378718](http://dx.doi.org/10.1145/1378704.1378718) 131 | 132 | 1. Andrew D. 
Birrell and Bruce Jay Nelson: 133 | “[Implementing Remote Procedure Calls](http://www.cs.princeton.edu/courses/archive/fall03/cs518/papers/rpc.pdf),” *ACM Transactions on Computer Systems* (TOCS), 134 | volume 2, number 1, pages 39–59, February 1984. 135 | [doi:10.1145/2080.357392](http://dx.doi.org/10.1145/2080.357392) 136 | 137 | 1. Jim Waldo, Geoff Wyant, Ann Wollrath, and Sam Kendall: 138 | “[A Note on Distributed Computing](http://m.mirror.facebook.net/kde/devel/smli_tr-94-29.pdf),” 139 | Sun Microsystems Laboratories, Inc., Technical Report TR-94-29, November 1994. 140 | 141 | 1. Steve Vinoski: 142 | “[Convenience over Correctness](http://steve.vinoski.net/pdf/IEEE-Convenience_Over_Correctness.pdf),” *IEEE Internet Computing*, volume 12, number 4, pages 89–92, July 2008. 143 | [doi:10.1109/MIC.2008.75](http://dx.doi.org/10.1109/MIC.2008.75) 144 | 145 | 1. Marius Eriksen: 146 | “[Your Server as a Function](http://monkey.org/~marius/funsrv.pdf),” at 147 | *7th Workshop on Programming Languages and Operating Systems* (PLOS), November 2013. 148 | [doi:10.1145/2525528.2525538](http://dx.doi.org/10.1145/2525528.2525538) 149 | 150 | 1. “[gRPC concepts](https://grpc.io/docs/guides/concepts/),” The Linux Foundation, *grpc.io*. 151 | 152 | 1. Aditya Narayan and Irina Singh: 153 | “[Designing and Versioning Compatible Web Services](https://web.archive.org/web/20141016000136/http://www.ibm.com/developerworks/websphere/library/techarticles/0705_narayan/0705_narayan.html),” *ibm.com*, March 28, 2007. 154 | 155 | 1. Troy Hunt: 156 | “[Your API Versioning Is Wrong, Which Is Why I Decided to Do It 3 Different Wrong Ways](http://www.troyhunt.com/2014/02/your-api-versioning-is-wrong-which-is.html),” *troyhunt.com*, 157 | February 10, 2014. 158 | 159 | 1. “[API Upgrades](https://stripe.com/docs/upgrades),” Stripe, Inc., April 2015. 160 | 161 | 1. 
Jonas Bonér: 162 | “[Upgrade in an Akka Cluster](http://grokbase.com/t/gg/akka-user/138wd8j9e3/upgrade-in-an-akka-cluster),” email to *akka-user* mailing list, *grokbase.com*, August 28, 2013. 163 | 164 | 1. Philip A. Bernstein, Sergey Bykov, Alan Geller, et al.: 165 | “[Orleans: Distributed Virtual Actors for Programmability and Scalability](https://www.microsoft.com/en-us/research/publication/orleans-distributed-virtual-actors-for-programmability-and-scalability/),” Microsoft Research 166 | Technical Report MSR-TR-2014-41, March 2014. 167 | 168 | 1. “[Microsoft Project Orleans Documentation](http://dotnet.github.io/orleans/),” Microsoft Research, *dotnet.github.io*, 2015. 169 | 170 | 1. David Mercer, Sean Hinde, Yinso Chen, and Richard A O'Keefe: 171 | “[beginner: Updating Data Structures](http://erlang.org/pipermail/erlang-questions/2007-October/030318.html),” email thread on *erlang-questions* mailing list, *erlang.org*, 172 | October 29, 2007. 173 | 174 | 1. Fred Hebert: 175 | “[Postscript: Maps](http://learnyousomeerlang.com/maps),” *learnyousomeerlang.com*, 176 | April 9, 2014. 177 | 178 | -------------------------------------------------------------------------------- /chapter-05-refs.md: -------------------------------------------------------------------------------- 1 | Designing Data-Intensive Applications 2 | ===================================== 3 | 4 | Chapter 5 References 5 | -------------------- 6 | 7 | 1. Bruce G. Lindsay, Patricia Griffiths Selinger, C. Galtieri, et al.: 8 | “[Notes on Distributed Databases](http://domino.research.ibm.com/library/cyberdig.nsf/papers/A776EC17FC2FCE73852579F100578964/$File/RJ2571.pdf),” IBM Research, Research Report RJ2571(33471), July 1979. 9 | 10 | 1. “[Oracle Active Data Guard Real-Time Data Protection and Availability](http://www.oracle.com/technetwork/database/availability/active-data-guard-wp-12c-1896127.pdf),” Oracle White Paper, June 2013. 11 | 12 | 1. 
“[AlwaysOn Availability Groups](http://msdn.microsoft.com/en-us/library/hh510230.aspx),” in *SQL Server Books Online*, Microsoft, 2012. 13 | 14 | 1. Lin Qiao, Kapil Surlaker, Shirshanka Das, et al.: 15 | “[On Brewing Fresh Espresso: LinkedIn’s Distributed Data Serving Platform](http://www.slideshare.net/amywtang/espresso-20952131),” at *ACM International Conference on 16 | Management of Data* (SIGMOD), June 2013. 17 | 18 | 1. Jun Rao: 19 | “[Intra-Cluster Replication for Apache Kafka](http://www.slideshare.net/junrao/kafka-replication-apachecon2013),” at *ApacheCon North America*, February 2013. 20 | 21 | 1. “[Highly Available Queues](https://www.rabbitmq.com/ha.html),” in *RabbitMQ Server Documentation*, Pivotal Software, Inc., 2014. 22 | 23 | 1. Yoshinori Matsunobu: 24 | “[Semi-Synchronous Replication at Facebook](http://yoshinorimatsunobu.blogspot.co.uk/2014/04/semi-synchronous-replication-at-facebook.html),” *yoshinorimatsunobu.blogspot.co.uk*, April 1, 2014. 25 | 26 | 1. Robbert van Renesse and Fred B. Schneider: 27 | “[Chain Replication for Supporting High Throughput and Availability](http://static.usenix.org/legacy/events/osdi04/tech/full_papers/renesse/renesse.pdf),” at *6th USENIX Symposium on 28 | Operating System Design and Implementation* (OSDI), December 2004. 29 | 30 | 1. Jeff Terrace and Michael J. Freedman: 31 | “[Object Storage on CRAQ: High-Throughput Chain Replication for Read-Mostly Workloads](https://www.usenix.org/legacy/event/usenix09/tech/full_papers/terrace/terrace.pdf),” at *USENIX 32 | Annual Technical Conference* (ATC), June 2009. 33 | 34 | 1. Brad Calder, Ju Wang, Aaron Ogus, et al.: 35 | “[Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency](http://sigops.org/sosp/sosp11/current/2011-Cascais/printable/11-calder.pdf),” at *23rd ACM 36 | Symposium on Operating Systems Principles* (SOSP), October 2011. 37 | 38 | 1. 
Andrew Wang: 39 | “[Windows Azure Storage](https://www.umbrant.com/2016/02/04/windows-azure-storage/),” 40 | *umbrant.com*, February 4, 2016. 41 | 42 | 1. “[Percona Xtrabackup - Documentation](https://www.percona.com/doc/percona-xtrabackup/2.1/index.html),” Percona LLC, 2014. 43 | 44 | 1. Jesse Newland: 45 | “[GitHub Availability This Week](https://github.com/blog/1261-github-availability-this-week),” *github.com*, September 14, 2012. 46 | 47 | 1. Mark Imbriaco: 48 | “[Downtime Last Saturday](https://github.com/blog/1364-downtime-last-saturday),” 49 | *github.com*, December 26, 2012. 50 | 51 | 1. John Hugg: 52 | “[‘All in’ with Determinism for Performance and Testing in Distributed Systems](https://www.youtube.com/watch?v=gJRj3vJL4wE),” at *Strange Loop*, September 2015. 53 | 54 | 1. Amit Kapila: 55 | “[WAL Internals of PostgreSQL](http://www.pgcon.org/2012/schedule/attachments/258_212_Internals%20Of%20PostgreSQL%20Wal.pdf),” at *PostgreSQL Conference* (PGCon), May 2012. 56 | 57 | 1. [*MySQL Internals Manual*](http://dev.mysql.com/doc/internals/en/index.html). 58 | Oracle, 2014. 59 | 60 | 1. Yogeshwer Sharma, Philippe Ajoux, Petchean Ang, et al.: 61 | “[Wormhole: Reliable Pub-Sub to Support Geo-Replicated Internet Services](https://www.usenix.org/system/files/conference/nsdi15/nsdi15-paper-sharma.pdf),” at *12th USENIX 62 | Symposium on Networked Systems Design and Implementation* (NSDI), May 2015. 63 | 64 | 1. “[Oracle GoldenGate 12c: Real-Time Access to Real-Time Information](http://www.oracle.com/us/products/middleware/data-integration/oracle-goldengate-realtime-access-2031152.pdf),” Oracle White Paper, October 2013. 65 | 66 | 1. Shirshanka Das, Chavdar Botev, Kapil Surlaker, et al.: 67 | “[All Aboard the Databus!](http://www.socc2012.org/s18-das.pdf),” at 68 | *ACM Symposium on Cloud Computing* (SoCC), October 2012. 69 | 70 | 1. 
Greg Sabino Mullane: 71 | “[Version 5 of Bucardo Database Replication System](http://blog.endpoint.com/2014/06/bucardo-5-multimaster-postgres-released.html),” *blog.endpoint.com*, June 23, 2014. 72 | 73 | 1. Werner Vogels: 74 | “[Eventually Consistent](http://queue.acm.org/detail.cfm?id=1466448),” 75 | *ACM Queue*, volume 6, number 6, pages 14–19, October 2008. 76 | [doi:10.1145/1466443.1466448](http://dx.doi.org/10.1145/1466443.1466448) 77 | 78 | 1. Douglas B. Terry: 79 | “[Replicated Data Consistency Explained Through Baseball](https://www.microsoft.com/en-us/research/publication/replicated-data-consistency-explained-through-baseball/),” Microsoft Research, Technical Report 80 | MSR-TR-2011-137, October 2011. 81 | 82 | 1. Douglas B. Terry, Alan J. Demers, Karin Petersen, et al.: 83 | “[Session Guarantees for Weakly Consistent Replicated Data](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.71.2269&rep=rep1&type=pdf),” at *3rd International Conference 84 | on Parallel and Distributed Information Systems* (PDIS), September 1994. 85 | [doi:10.1109/PDIS.1994.331722](http://dx.doi.org/10.1109/PDIS.1994.331722) 86 | 87 | 1. Terry Pratchett: *Reaper Man: A Discworld 88 | Novel*. Victor Gollancz, 1991. ISBN: 978-0-575-04979-6 89 | 90 | 1. “[Tungsten Replicator](https://github.com/holys/tungsten-replicator),” *github.com*. 91 | 92 | 1. “[BDR 0.10.0 Documentation](http://bdr-project.org/docs/next/index.html),” The PostgreSQL Global Development Group, *bdr-project.org*, 2015. 93 | 94 | 1. Robert Hodges: 95 | “[If You *Must* Deploy Multi-Master Replication, Read This First](http://scale-out-blog.blogspot.co.uk/2012/04/if-you-must-deploy-multi-master.html),” *scale-out-blog.blogspot.co.uk*, 96 | March 30, 2012. 97 | 98 | 1. J. Chris Anderson, Jan Lehnardt, and Noah 99 | Slater: *CouchDB: The Definitive Guide*. O'Reilly Media, 2010. 100 | ISBN: 978-0-596-15589-6 101 | 102 | 1. 
AppJet, Inc.: 103 | “[Etherpad and EasySync Technical Manual](https://github.com/ether/etherpad-lite/blob/e2ce9dc/doc/easysync/easysync-full-description.pdf),” *github.com*, March 26, 2011. 104 | 105 | 1. John Day-Richter: 106 | “[What’s Different About the New Google Docs: Making Collaboration Fast](https://drive.googleblog.com/2010/09/whats-different-about-new-google-docs.html),” 107 | *drive.googleblog.com*, September 23, 2010. 108 | 109 | 1. Martin Kleppmann and Alastair R. Beresford: 110 | “[A Conflict-Free Replicated JSON Datatype](http://arxiv.org/abs/1608.03960),” 111 | arXiv:1608.03960, August 13, 2016. 112 | 113 | 1. Frazer Clement: 114 | “[Eventual Consistency – Detecting Conflicts](http://messagepassing.blogspot.co.uk/2011/10/eventual-consistency-detecting.html),” *messagepassing.blogspot.co.uk*, October 20, 2011. 115 | 116 | 1. Robert Hodges: 117 | “[State of the Art for MySQL Multi-Master Replication](https://web.archive.org/web/20161010052017/https://www.percona.com/live/mysql-conference-2013/sites/default/files/slides/mysql-multi-master-state-of-art-2013-04-24_0.pdf),” at *Percona Live: MySQL Conference & 118 | Expo*, April 2013. 119 | 120 | 1. John Daily: 121 | “[Clocks Are Bad, or, Welcome to the Wonderful World of Distributed Systems](https://riak.com/clocks-are-bad-or-welcome-to-distributed-systems/),” *riak.com*, November 12, 2013. 122 | 123 | 1. Riley Berton: 124 | “[Is Bi-Directional Replication (BDR) in Postgres Transactional?](http://sdf.org/~riley/blog/2016/01/04/is-bi-directional-replication-bdr-in-postgres-transactional/),” *sdf.org*, January 4, 2016. 125 | 126 | 1. Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, et al.: 127 | “[Dynamo: Amazon's Highly Available Key-Value Store](http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf),” at *21st ACM Symposium on Operating 128 | Systems Principles* (SOSP), October 2007. 129 | 130 | 1. 
Marc Shapiro, Nuno Preguiça, Carlos Baquero, 131 | and Marek Zawirski: “[A Comprehensive Study of Convergent and Commutative Replicated Data Types](http://hal.inria.fr/inria-00555588/),” INRIA Research Report no. 7506, 132 | January 2011. 133 | 134 | 1. Sam Elliott: 135 | “[CRDTs: An UPDATE (or Maybe Just a PUT)](https://speakerdeck.com/lenary/crdts-an-update-or-just-a-put),” at *RICON West*, October 2013. 136 | 137 | 1. Russell Brown: 138 | “[A Bluffers Guide to CRDTs in Riak](https://gist.github.com/russelldb/f92f44bdfb619e089a4d),” *gist.github.com*, October 28, 2013. 139 | 140 | 1. Benjamin Farinier, Thomas Gazagnaire, and 141 | Anil Madhavapeddy: “[Mergeable Persistent Data Structures](http://gazagnaire.org/pub/FGM15.pdf),” at *26es Journées Francophones des Langages Applicatifs* (JFLA), 142 | January 2015. 143 | 144 | 1. Chengzheng Sun and Clarence Ellis: 145 | “[Operational Transformation in Real-Time Group Editors: Issues, Algorithms, and Achievements](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.53.933&rep=rep1&type=pdf),” at 146 | *ACM Conference on Computer Supported Cooperative Work* (CSCW), November 1998. 147 | 148 | 1. Lars Hofhansl: 149 | “[HBASE-7709: Infinite Loop Possible in Master/Master Replication](https://issues.apache.org/jira/browse/HBASE-7709),” *issues.apache.org*, January 29, 2013. 150 | 151 | 1. David K. Gifford: 152 | “[Weighted Voting for Replicated Data](https://www.cs.cmu.edu/~15-749/READINGS/required/availability/gifford79.pdf),” 153 | at *7th ACM Symposium on Operating Systems Principles* (SOSP), December 1979. 154 | [doi:10.1145/800215.806583](http://dx.doi.org/10.1145/800215.806583) 155 | 156 | 1. Heidi Howard, Dahlia Malkhi, and Alexander Spiegelman: 157 | “[Flexible Paxos: Quorum Intersection Revisited](https://arxiv.org/abs/1608.06696),” 158 | *arXiv:1608.06696*, August 24, 2016. 159 | 160 | 1. 
Joseph Blomstedt: 161 | “[Re: Absolute Consistency](https://web.archive.org/web/20190919171316/http://lists.basho.com:80/pipermail/riak-users_lists.basho.com/2012-January/007157.html),” email to *riak-users* mailing list, *lists.basho.com*, 162 | January 11, 2012. 163 | 164 | 1. Joseph Blomstedt: 165 | “[Bringing Consistency to Riak](https://vimeo.com/51973001),” at *RICON West*, 166 | October 2012. 167 | 168 | 1. Peter Bailis, Shivaram Venkataraman, 169 | Michael J. Franklin, et al.: 170 | “[Quantifying Eventual Consistency with PBS](http://www.bailis.org/papers/pbs-cacm2014.pdf),” 171 | *Communications of the ACM*, volume 57, number 8, pages 93–102, August 2014. 172 | [doi:10.1145/2632792](http://dx.doi.org/10.1145/2632792) 173 | 174 | 1. Jonathan Ellis: 175 | “[Modern Hinted Handoff](http://www.datastax.com/dev/blog/modern-hinted-handoff),” 176 | *datastax.com*, December 11, 2012. 177 | 178 | 1. “[Project Voldemort Wiki](https://github.com/voldemort/voldemort/wiki),” *github.com*, 2013. 179 | 180 | 1. “[Apache Cassandra Documentation](https://cassandra.apache.org/doc/latest/),” Apache Software Foundation, *cassandra.apache.org*. 181 | 182 | 1. “[Riak Enterprise: Multi-Datacenter Replication](https://web.archive.org/web/20150513041837/http://basho.com/assets/MultiDatacenter_Replication.pdf).” 183 | Technical whitepaper, Basho Technologies, Inc., September 2014. 184 | 185 | 1. Jonathan Ellis: 186 | “[Why Cassandra Doesn't Need Vector Clocks](http://www.datastax.com/dev/blog/why-cassandra-doesnt-need-vector-clocks),” *datastax.com*, September 2, 2013. 187 | 188 | 1. Leslie Lamport: 189 | “[Time, Clocks, and the Ordering of Events in a Distributed System](https://www.microsoft.com/en-us/research/publication/time-clocks-ordering-events-distributed-system/),” *Communications of the ACM*, 190 | volume 21, number 7, pages 558–565, July 1978. 191 | [doi:10.1145/359545.359563](http://dx.doi.org/10.1145/359545.359563) 192 | 193 | 1. 
Joel Jacobson: 194 | “[Riak 2.0: Data Types](http://blog.joeljacobson.com/riak-2-0-data-types/),” 195 | *blog.joeljacobson.com*, March 23, 2014. 196 | 197 | 1. D. Stott Parker Jr., Gerald J. Popek, Gerard Rudisin, et al.: 198 | “[Detection of Mutual Inconsistency in Distributed Systems](http://zoo.cs.yale.edu/classes/cs426/2013/bib/parker83detection.pdf),” *IEEE Transactions on Software Engineering*, 199 | volume 9, number 3, pages 240–247, May 1983. 200 | [doi:10.1109/TSE.1983.236733](http://dx.doi.org/10.1109/TSE.1983.236733) 201 | 202 | 1. Nuno Preguiça, Carlos Baquero, Paulo Sérgio 203 | Almeida, et al.: “[Dotted Version Vectors: Logical Clocks for Optimistic Replication](http://arxiv.org/pdf/1011.5808v1.pdf),” arXiv:1011.5808, November 26, 204 | 2010. 205 | 206 | 1. Sean Cribbs: 207 | “[A Brief History of Time in Riak](https://speakerdeck.com/seancribbs/a-brief-history-of-time-in-riak),” 208 | at *RICON*, October 2014. 209 | 210 | 1. Russell Brown: 211 | “[Vector Clocks Revisited Part 2: Dotted Version Vectors](https://riak.com/posts/technical/vector-clocks-revisited-part-2-dotted-version-vectors/),” *basho.com*, November 10, 2015. 212 | 213 | 1. Carlos Baquero: 214 | “[Version Vectors Are Not Vector Clocks](https://haslab.wordpress.com/2011/07/08/version-vectors-are-not-vector-clocks/),” *haslab.wordpress.com*, July 8, 2011. 215 | 216 | 1. Reinhard Schwarz and Friedemann Mattern: 217 | “[Detecting Causal Relationships in Distributed Computations: In Search of the Holy Grail](http://dcg.ethz.ch/lectures/hs08/seminar/papers/mattern4.pdf),” *Distributed 218 | Computing*, volume 7, number 3, pages 149–174, March 1994. 
219 | [doi:10.1007/BF02277859](http://dx.doi.org/10.1007/BF02277859) 220 | 221 | -------------------------------------------------------------------------------- /chapter-06-refs.md: -------------------------------------------------------------------------------- 1 | Designing Data-Intensive Applications 2 | ===================================== 3 | 4 | Chapter 6 References 5 | -------------------- 6 | 7 | 1. David J. DeWitt and Jim N. Gray: 8 | “[Parallel Database Systems: The Future of High Performance Database Systems](http://www.cs.cmu.edu/~pavlo/courses/fall2013/static/papers/dewittgray92.pdf),” 9 | *Communications of the ACM*, volume 35, number 6, pages 85–98, June 1992. 10 | [doi:10.1145/129888.129894](http://dx.doi.org/10.1145/129888.129894) 11 | 12 | 1. Lars George: 13 | “[HBase vs. BigTable Comparison](http://www.larsgeorge.com/2009/11/hbase-vs-bigtable-comparison.html),” 14 | *larsgeorge.com*, November 2009. 15 | 16 | 1. “[The Apache HBase Reference Guide](https://hbase.apache.org/book/book.html),” Apache Software Foundation, *hbase.apache.org*, 2014. 17 | 18 | 1. MongoDB, Inc.: 19 | “[New Hash-Based Sharding Feature in MongoDB 2.4](https://www.mongodb.com/blog/post/new-hash-based-sharding-feature-in-mongodb-24),” *blog.mongodb.org*, April 10, 2013. 20 | 21 | 1. Ikai Lan: 22 | “[App Engine Datastore Tip: Monotonically Increasing Values Are Bad](http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/),” *ikaisays.com*, 23 | January 25, 2011. 24 | 25 | 1. Martin Kleppmann: 26 | “[Java's hashCode Is Not Safe for Distributed Systems](http://martin.kleppmann.com/2012/06/18/java-hashcode-unsafe-for-distributed-systems.html),” *martin.kleppmann.com*, June 18, 2012. 27 | 28 | 1. 
David Karger, Eric Lehman, Tom Leighton, et al.: 29 | “[Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web](http://www.akamai.com/dl/technical_publications/ConsistenHashingandRandomTreesDistributedCachingprotocolsforrelievingHotSpotsontheworldwideweb.pdf),” 30 | at *29th Annual ACM Symposium on Theory of Computing* (STOC), pages 654–663, 1997. 31 | [doi:10.1145/258533.258660](http://dx.doi.org/10.1145/258533.258660) 32 | 33 | 1. John Lamping and Eric Veach: 34 | “[A Fast, Minimal Memory, Consistent Hash Algorithm](http://arxiv.org/pdf/1406.2294v1.pdf),” *arxiv.org*, June 2014. 35 | 36 | 1. Eric Redmond: 37 | “[A Little Riak Book](https://web.archive.org/web/20160807123307/http://www.littleriakbook.com/),” Version 1.4.0, 38 | Basho Technologies, September 2013. 39 | 40 | 1. “[Couchbase 2.5 Administrator Guide](http://docs.couchbase.com/couchbase-manual-2.5/cb-admin/),” Couchbase, Inc., 2014. 41 | 42 | 1. Avinash Lakshman and Prashant Malik: 43 | “[Cassandra – A Decentralized Structured Storage System](http://www.cs.cornell.edu/Projects/ladis2009/papers/Lakshman-ladis2009.PDF),” at *3rd ACM SIGOPS International Workshop on 44 | Large Scale Distributed Systems and Middleware* (LADIS), October 2009. 45 | 46 | 1. Jonathan Ellis: 47 | “[Facebook’s Cassandra Paper, Annotated and Compared to Apache Cassandra 2.0](https://docs.datastax.com/en/articles/cassandra/cassandrathenandnow.html),” 48 | *docs.datastax.com*, September 12, 2013. 49 | 50 | 1. “[Introduction to Cassandra Query Language](https://docs.datastax.com/en/cql-oss/3.1/cql/cql_intro_c.html),” DataStax, Inc., 2014. 51 | 52 | 1. Samuel Axon: 53 | “[3% of Twitter's Servers Dedicated to Justin Bieber](https://mashable.com/archive/justin-bieber-twitter),” *mashable.com*, September 7, 2010. 54 | 55 | 1. “[Riak KV Docs](https://docs.riak.com/riak/kv/latest/index.html),” *docs.riak.com*. 56 | 57 | 1. 
Richard Low: 58 | “[The Sweet Spot for Cassandra Secondary Indexing](https://web.archive.org/web/20190831132955/http://www.wentnet.com/blog/?p=77),” *wentnet.com*, October 21, 2013. 59 | 60 | 1. Zachary Tong: 61 | “[Customizing Your Document Routing](https://www.elastic.co/blog/customizing-your-document-routing/),” *elastic.co*, June 3, 2013. 62 | 63 | 1. “[Apache Solr Reference Guide](https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide),” Apache Software Foundation, 2014. 64 | 65 | 1. Andrew Pavlo: 66 | “[H-Store Frequently Asked Questions](http://hstore.cs.brown.edu/documentation/faq/),” 67 | *hstore.cs.brown.edu*, October 2013. 68 | 69 | 1. “[Amazon DynamoDB Developer Guide](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/),” Amazon Web Services, Inc., 2014. 70 | 71 | 1. Rusty Klophaus: 72 | “[Difference Between 2I and Search](https://web.archive.org/web/20150926053350/http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-October/006220.html),” email to *riak-users* mailing list, *lists.basho.com*, October 25, 2011. 73 | 74 | 1. Donald K. Burleson: 75 | “[Object Partitioning in Oracle](http://www.dba-oracle.com/art_partit.htm),” 76 | *dba-oracle.com*, November 8, 2000. 77 | 78 | 1. Eric Evans: 79 | “[Rethinking Topology in Cassandra](http://www.slideshare.net/jericevans/virtual-nodes-rethinking-topology-in-cassandra),” at *ApacheCon Europe*, November 2012. 80 | 81 | 1. Rafał Kuć: 82 | “[Reroute API Explained](http://elasticsearchserverbook.com/reroute-api-explained/),” 83 | *elasticsearchserverbook.com*, September 30, 2013. 84 | 85 | 1. “[Project Voldemort Documentation](http://www.project-voldemort.com/voldemort/),” *project-voldemort.com*. 86 | 87 | 1. Enis Soztutar: 88 | “[Apache HBase Region Splitting and Merging](http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/),” *hortonworks.com*, February 1, 2013. 89 | 90 | 1. 
Brandon Williams: 91 | “[Virtual Nodes in Cassandra 1.2](http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2),” *datastax.com*, December 4, 2012. 92 | 93 | 1. Richard Jones: 94 | “[libketama: Consistent Hashing Library for Memcached Clients](https://www.metabrew.com/article/libketama-consistent-hashing-algo-memcached-clients),” *metabrew.com*, April 10, 2007. 95 | 96 | 1. Branimir Lambov: 97 | “[New Token Allocation Algorithm in Cassandra 3.0](http://www.datastax.com/dev/blog/token-allocation-algorithm),” *datastax.com*, January 28, 2016. 98 | 99 | 1. Jason Wilder: 100 | “[Open-Source Service Discovery](http://jasonwilder.com/blog/2014/02/04/service-discovery-in-the-cloud/),” *jasonwilder.com*, February 2014. 101 | 102 | 1. Kishore Gopalakrishna, Shi Lu, Zhen Zhang, et al.: 103 | “[Untangling Cluster Management with Helix](http://www.socc2012.org/helix_onecol.pdf?attredirects=0),” at *ACM Symposium on Cloud Computing* (SoCC), October 2012. 104 | [doi:10.1145/2391229.2391248](http://dx.doi.org/10.1145/2391229.2391248) 105 | 106 | 1. “[Moxi 1.8 Manual](http://docs.couchbase.com/moxi-manual-1.8/),” Couchbase, Inc., 2014. 107 | 108 | 1. Shivnath Babu and Herodotos Herodotou: 109 | “[Massively Parallel Databases and MapReduce Systems](https://www.microsoft.com/en-us/research/wp-content/uploads/2013/11/db-mr-survey-final.pdf),” 110 | *Foundations and Trends in Databases*, volume 5, number 1, pages 1–104, November 2013. 111 | [doi:10.1561/1900000036](http://dx.doi.org/10.1561/1900000036) 112 | 113 | -------------------------------------------------------------------------------- /chapter-07-refs.md: -------------------------------------------------------------------------------- 1 | Designing Data-Intensive Applications 2 | ===================================== 3 | 4 | Chapter 7 References 5 | -------------------- 6 | 7 | 1. Donald D. Chamberlin, Morton M. Astrahan, Michael W. 
Blasgen, et al.: 8 | “[A History and Evaluation of System R](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.84.348&rep=rep1&type=pdf),” *Communications of the ACM*, 9 | volume 24, number 10, pages 632–646, October 1981. 10 | [doi:10.1145/358769.358784](http://dx.doi.org/10.1145/358769.358784) 11 | 12 | 1. Jim N. Gray, Raymond A. Lorie, Gianfranco R. Putzolu, and Irving L. Traiger: 13 | “[Granularity of Locks and Degrees of Consistency in a Shared Data Base](http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.92.8248&rep=rep1&type=pdf),” in *Modelling in Data 14 | Base Management Systems: Proceedings of the IFIP Working Conference on Modelling in Data Base 15 | Management Systems*, edited by G. M. Nijssen, pages 16 | 364–394, Elsevier/North Holland Publishing, 1976. Also in *Readings in Database Systems*, 4th edition, edited by Joseph M. 17 | Hellerstein and Michael Stonebraker, MIT Press, 2005. ISBN: 978-0-262-69314-1 18 | 19 | 1. Kapali P. Eswaran, Jim N. Gray, Raymond A. Lorie, and Irving L. Traiger: 20 | “[The Notions of Consistency and Predicate Locks in a Database System](http://research.microsoft.com/en-us/um/people/gray/papers/On%20the%20Notions%20of%20Consistency%20and%20Predicate%20Locks%20in%20a%20Database%20System%20CACM.pdf),” *Communications of the 21 | ACM*, volume 19, number 11, pages 624–633, November 1976. 22 | 23 | 1. “[ACID Transactions Are Incredibly Helpful](http://web.archive.org/web/20150320053809/https://foundationdb.com/acid-claims),” FoundationDB, LLC, 2013. 24 | 25 | 1. John D. Cook: 26 | “[ACID Versus BASE for Database Transactions](http://www.johndcook.com/blog/2009/07/06/brewer-cap-theorem-base/),” *johndcook.com*, July 6, 2009. 27 | 28 | 1. Gavin Clarke: 29 | “[NoSQL's CAP Theorem Busters: We Don't Drop ACID](http://www.theregister.co.uk/2012/11/22/foundationdb_fear_of_cap_theorem/),” *theregister.co.uk*, November 22, 2012. 30 | 31 | 1. 
Theo Härder and Andreas Reuter: 32 | “[Principles of Transaction-Oriented Database Recovery](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.87.2812&rep=rep1&type=pdf),” *ACM Computing Surveys*, 33 | volume 15, number 4, pages 287–317, December 1983. 34 | [doi:10.1145/289.291](http://dx.doi.org/10.1145/289.291) 35 | 36 | 1. Peter Bailis, Alan Fekete, Ali Ghodsi, et al.: 37 | “[HAT, not CAP: Towards Highly Available Transactions](http://www.bailis.org/papers/hat-hotos2013.pdf),” 38 | at *14th USENIX Workshop on Hot Topics in Operating Systems* (HotOS), May 2013. 39 | 40 | 1. Armando Fox, Steven D. Gribble, Yatin Chawathe, et al.: 41 | “[Cluster-Based Scalable Network Services](http://www.cs.berkeley.edu/~brewer/cs262b/TACC.pdf),” at 42 | *16th ACM Symposium on Operating Systems Principles* (SOSP), October 1997. 43 | 44 | 1. Philip A. Bernstein, Vassos Hadzilacos, and Nathan Goodman: 45 | [*Concurrency Control and Recovery in Database Systems*](http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx). 46 | Addison-Wesley, 1987. ISBN: 978-0-201-10715-9, available online at *research.microsoft.com*. 47 | 48 | 1. Alan Fekete, Dimitrios Liarokapis, Elizabeth O'Neil, et al.: 49 | “[Making Snapshot Isolation Serializable](https://www.cse.iitb.ac.in/infolab/Data/Courses/CS632/2009/Papers/p492-fekete.pdf),” *ACM Transactions on Database Systems*, 50 | volume 30, number 2, pages 492–528, June 2005. 51 | [doi:10.1145/1071610.1071615](http://dx.doi.org/10.1145/1071610.1071615) 52 | 53 | 1. Mai Zheng, Joseph Tucek, Feng Qin, and Mark Lillibridge: 54 | “[Understanding the Robustness of SSDs Under Power Fault](https://www.usenix.org/system/files/conference/fast13/fast13-final80.pdf),” at *11th USENIX Conference on File and 55 | Storage Technologies* (FAST), February 2013. 56 | 57 | 1. Laurie Denness: 58 | “[SSDs: A Gift and a Curse](https://laur.ie/blog/2015/06/ssds-a-gift-and-a-curse/),” 59 | *laur.ie*, June 2, 2015. 60 | 61 | 1. 
Adam Surak: 62 | “[When Solid State Drives Are Not That Solid](https://blog.algolia.com/when-solid-state-drives-are-not-that-solid/),” *blog.algolia.com*, June 15, 2015. 63 | 64 | 1. Thanumalayan Sankaranarayana Pillai, Vijay Chidambaram, 65 | Ramnatthan Alagappan, et al.: “[All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications](http://research.cs.wisc.edu/wind/Publications/alice-osdi14.pdf),” 66 | at *11th USENIX Symposium on Operating Systems Design and Implementation* (OSDI), 67 | October 2014. 68 | 69 | 1. Chris Siebenmann: 70 | “[Unix's File Durability Problem](https://utcc.utoronto.ca/~cks/space/blog/unix/FileSyncProblem),” *utcc.utoronto.ca*, April 14, 2016. 71 | 72 | 1. Lakshmi N. Bairavasundaram, Garth R. 73 | Goodson, Bianca Schroeder, et al.: 74 | “[An Analysis of Data Corruption in the Storage Stack](http://research.cs.wisc.edu/adsl/Publications/corruption-fast08.pdf),” at *6th USENIX Conference on File and Storage 75 | Technologies* (FAST), February 2008. 76 | 77 | 1. Bianca Schroeder, Raghav Lagisetty, and Arif Merchant: 78 | “[Flash Reliability in Production: The Expected and the Unexpected](https://www.usenix.org/conference/fast16/technical-sessions/presentation/schroeder),” at *14th USENIX Conference on 79 | File and Storage Technologies* (FAST), February 2016. 80 | 81 | 1. Don Allison: 82 | “[SSD Storage – Ignorance of Technology Is No Excuse](https://blog.korelogic.com/blog/2015/03/24),” *blog.korelogic.com*, March 24, 2015. 83 | 84 | 1. Dave Scherer: 85 | “[Those Are Not Transactions (Cassandra 2.0)](http://web.archive.org/web/20150526065247/http://blog.foundationdb.com/those-are-not-transactions-cassandra-2-0),” *blog.foundationdb.com*, September 6, 2013. 86 | 87 | 1. Kyle Kingsbury: 88 | “[Call Me Maybe: Cassandra](http://aphyr.com/posts/294-call-me-maybe-cassandra/),” 89 | *aphyr.com*, September 24, 2013. 90 | 91 | 1. 
“[ACID Support in Aerospike](https://web.archive.org/web/20170305002118/https://www.aerospike.com/docs/architecture/assets/AerospikeACIDSupport.pdf),” Aerospike, Inc., June 2014. 92 | 93 | 1. Martin Kleppmann: 94 | “[Hermitage: Testing the 'I' in ACID](http://martin.kleppmann.com/2014/11/25/hermitage-testing-the-i-in-acid.html),” *martin.kleppmann.com*, November 25, 2014. 95 | 96 | 1. Tristan D'Agosta: 97 | “[BTC Stolen from Poloniex](https://bitcointalk.org/index.php?topic=499580),” 98 | *bitcointalk.org*, March 4, 2014. 99 | 100 | 1. bitcointhief2: 101 | “[How I Stole Roughly 100 BTC from an Exchange and How I Could Have Stolen More!](http://www.reddit.com/r/Bitcoin/comments/1wtbiu/how_i_stole_roughly_100_btc_from_an_exchange_and/),” *reddit.com*, 102 | February 2, 2014. 103 | 104 | 1. Sudhir Jorwekar, Alan Fekete, Krithi Ramamritham, and S. Sudarshan: 105 | “[Automating the Detection of Snapshot Isolation Anomalies](http://www.vldb.org/conf/2007/papers/industrial/p1263-jorwekar.pdf),” at *33rd International Conference on 106 | Very Large Data Bases* (VLDB), September 2007. 107 | 108 | 1. Michael Melanson: 109 | “[Transactions: The Limits of Isolation](https://www.michaelmelanson.net/transactions-the-limits-of-isolation/),” 110 | *michaelmelanson.net*, November 30, 2014. 111 | 112 | 1. Hal Berenson, Philip A. Bernstein, Jim N. Gray, et al.: 113 | “[A Critique of ANSI SQL Isolation Levels](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-95-51.pdf),” 114 | at *ACM International Conference on Management of Data* (SIGMOD), May 1995. 115 | 116 | 1. Atul Adya: “[Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions](http://pmg.csail.mit.edu/papers/adya-phd.pdf),” 117 | PhD Thesis, Massachusetts Institute of Technology, March 1999. 118 | 119 | 1. 
Peter Bailis, Aaron Davidson, Alan Fekete, et al.: 120 | “[Highly Available Transactions: Virtues and Limitations (Extended Version)](http://arxiv.org/pdf/1302.0309.pdf),” at *40th International Conference on Very Large Data Bases* 121 | (VLDB), September 2014. 122 | 123 | 1. Bruce Momjian: 124 | “[MVCC Unmasked](http://momjian.us/main/presentations/internals.html#mvcc),” *momjian.us*, 125 | July 2014. 126 | 127 | 1. Annamalai Gurusami: 128 | “[Repeatable Read Isolation Level in InnoDB – How Consistent Read View Works](https://web.archive.org/web/20161225080947/https://blogs.oracle.com/mysqlinnodb/entry/repeatable_read_isolation_level_in),” 129 | *blogs.oracle.com*, January 15, 2013. 130 | 131 | 1. Nikita Prokopov: 132 | “[Unofficial Guide to Datomic Internals](http://tonsky.me/blog/unofficial-guide-to-datomic-internals/),” *tonsky.me*, May 6, 2014. 133 | 134 | 1. Baron Schwartz: 135 | “[Immutability, MVCC, and Garbage Collection](http://www.xaprb.com/blog/2013/12/28/immutability-mvcc-and-garbage-collection/),” *xaprb.com*, December 28, 2013. 136 | 137 | 1. J. Chris Anderson, Jan Lehnardt, and Noah Slater: 138 | *CouchDB: The Definitive Guide*. O'Reilly Media, 2010. 139 | ISBN: 978-0-596-15589-6 140 | 141 | 1. Rikdeb Mukherjee: 142 | “[Isolation in DB2 (Repeatable Read, Read Stability, Cursor Stability, Uncommitted Read) with Examples](http://mframes.blogspot.co.uk/2013/07/isolation-in-cursor.html),” 143 | *mframes.blogspot.co.uk*, July 4, 2013. 144 | 145 | 1. Steve Hilker: 146 | “[Cursor Stability (CS) – IBM DB2 Community](https://web.archive.org/web/20150420001721/http://www.toadworld.com/platforms/ibmdb2/w/wiki/6661.cursor-stability-cs.aspx),” 147 | *toadworld.com*, March 14, 2013. 148 | 149 | 1. Nate Wiger: 150 | “[An Atomic Rant](https://nateware.com/2010/02/18/an-atomic-rant/),” *nateware.com*, 151 | February 18, 2010. 152 | 153 | 1. 
Joel Jacobson: 154 | “[Riak 2.0: Data Types](https://web.archive.org/web/20161023195905/http://blog.joeljacobson.com/riak-2-0-data-types/),” 155 | *blog.joeljacobson.com*, March 23, 2014. 156 | 157 | 1. Michael J. Cahill, Uwe Röhm, and Alan Fekete: 158 | “[Serializable Isolation for Snapshot Databases](http://www.cs.nyu.edu/courses/fall12/CSCI-GA.2434-001/p729-cahill.pdf),” at *ACM International Conference on 159 | Management of Data* (SIGMOD), June 2008. 160 | [doi:10.1145/1376616.1376690](http://dx.doi.org/10.1145/1376616.1376690) 161 | 162 | 1. Dan R. K. Ports and Kevin Grittner: 163 | “[Serializable Snapshot Isolation in PostgreSQL](http://drkp.net/papers/ssi-vldb12.pdf),” 164 | at *38th International Conference on Very Large Databases* (VLDB), August 2012. 165 | 166 | 1. Tony Andrews: 167 | “[Enforcing Complex Constraints in Oracle](http://tonyandrews.blogspot.co.uk/2004/10/enforcing-complex-constraints-in.html),” *tonyandrews.blogspot.co.uk*, October 15, 2004. 168 | 169 | 1. Douglas B. Terry, Marvin M. Theimer, Karin Petersen, et al.: 170 | “[Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.141.7889&rep=rep1&type=pdf),” at 171 | *15th ACM Symposium on Operating Systems Principles* (SOSP), December 1995. 172 | [doi:10.1145/224056.224070](http://dx.doi.org/10.1145/224056.224070) 173 | 174 | 1. Gary Fredericks: 175 | “[Postgres Serializability Bug](https://github.com/gfredericks/pg-serializability-bug),” *github.com*, September 2015. 176 | 177 | 1. Michael Stonebraker, Samuel Madden, Daniel J. Abadi, et al.: 178 | “[The End of an Architectural Era (It’s Time for a Complete Rewrite)](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.137.3697&rep=rep1&type=pdf),” at *33rd International 179 | Conference on Very Large Data Bases* (VLDB), September 2007. 180 | 181 | 1. John Hugg: 182 | “[H-Store/VoltDB Architecture vs. 
CEP Systems and Newer Streaming Architectures](https://www.youtube.com/watch?v=hD5M4a1UVz8),” 183 | at *Data @Scale Boston*, November 2014. 184 | 185 | 1. Robert Kallman, Hideaki Kimura, Jonathan Natkins, et al.: 186 | “[H-Store: A High-Performance, Distributed Main Memory Transaction Processing System](http://www.vldb.org/pvldb/vol1/1454211.pdf),” 187 | *Proceedings of the VLDB Endowment*, volume 1, number 2, pages 1496–1499, August 2008. 188 | 189 | 1. Rich Hickey: 190 | “[The Architecture of Datomic](http://www.infoq.com/articles/Architecture-Datomic),” *infoq.com*, November 2, 2012. 191 | 192 | 1. John Hugg: 193 | “[Debunking Myths About the VoltDB In-Memory Database](https://dzone.com/articles/debunking-myths-about-voltdb),” *dzone.com*, May 28, 2014. 194 | 195 | 1. Joseph M. Hellerstein, Michael Stonebraker, and James Hamilton: 196 | “[Architecture of a Database System](https://dsf.berkeley.edu/papers/fntdb07-architecture.pdf),” 197 | *Foundations and Trends in Databases*, volume 1, number 2, pages 141–259, November 2007. 198 | [doi:10.1561/1900000002](http://dx.doi.org/10.1561/1900000002) 199 | 200 | 1. Michael J. Cahill: 201 | “[Serializable Isolation for Snapshot Databases](http://cahill.net.au/wp-content/uploads/2010/02/cahill-thesis.pdf),” PhD Thesis, University of Sydney, July 2009. 202 | 203 | 1. D. Z. Badal: 204 | “[Correctness of Concurrency Control and Implications in Distributed Databases](http://ieeexplore.ieee.org/abstract/document/762563/),” at *3rd International IEEE Computer Software and 205 | Applications Conference* (COMPSAC), November 1979. 206 | 207 | 1. Rakesh Agrawal, Michael J. Carey, and Miron Livny: 208 | “[Concurrency Control Performance Modeling: Alternatives and Implications](http://www.eecs.berkeley.edu/~brewer/cs262/ConcControl.pdf),” *ACM Transactions on Database 209 | Systems* (TODS), volume 12, number 4, pages 609–654, December 1987. 210 | [doi:10.1145/32204.32220](http://dx.doi.org/10.1145/32204.32220) 211 | 212 | 1. 
Dave Rosenthal: 213 | “[Databases at 14.4MHz](http://web.archive.org/web/20150427041746/http://blog.foundationdb.com/databases-at-14.4mhz),” 214 | *blog.foundationdb.com*, December 10, 2014. 215 | 216 | -------------------------------------------------------------------------------- /chapter-08-refs.md: -------------------------------------------------------------------------------- 1 | Designing Data-Intensive Applications 2 | ===================================== 3 | 4 | Chapter 8 References 5 | -------------------- 6 | 7 | 1. Mark Cavage: 8 | “[There’s Just No Getting Around It: You’re Building a Distributed System](http://queue.acm.org/detail.cfm?id=2482856),” *ACM Queue*, volume 11, number 4, pages 80–89, April 2013. 9 | [doi:10.1145/2466486.2482856](http://dx.doi.org/10.1145/2466486.2482856) 10 | 11 | 1. Jay Kreps: 12 | “[Getting Real About Distributed System Reliability](http://blog.empathybox.com/post/19574936361/getting-real-about-distributed-system-reliability),” *blog.empathybox.com*, March 19, 2012. 13 | 14 | 1. Sydney Padua: *The Thrilling Adventures of 15 | Lovelace and Babbage: The (Mostly) True Story of the First Computer*. Particular Books, April 16 | 2015. ISBN: 978-0-141-98151-2 17 | 18 | 1. Coda Hale: 19 | “[You Can’t Sacrifice Partition Tolerance](http://codahale.com/you-cant-sacrifice-partition-tolerance/),” *codahale.com*, October 7, 2010. 20 | 21 | 1. Jeff Hodges: 22 | “[Notes on Distributed Systems for Young Bloods](https://web.archive.org/web/20200218095605/https://www.somethingsimilar.com/2013/01/14/notes-on-distributed-systems-for-young-bloods/),” *somethingsimilar.com*, January 14, 2013. 23 | 24 | 1. Antonio Regalado: 25 | “[Who Coined 'Cloud Computing'?](http://www.technologyreview.com/news/425970/who-coined-cloud-computing/),” *technologyreview.com*, October 31, 2011. 26 | 27 | 1.
Luiz André Barroso, Jimmy Clidaras, and Urs Hölzle: 28 | “[The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second Edition](http://www.morganclaypool.com/doi/abs/10.2200/S00516ED2V01Y201306CAC024),” 29 | *Synthesis Lectures on Computer Architecture*, volume 8, number 3, 30 | Morgan & Claypool Publishers, July 2013. 31 | [doi:10.2200/S00516ED2V01Y201306CAC024](http://dx.doi.org/10.2200/S00516ED2V01Y201306CAC024), 32 | ISBN: 978-1-627-05010-4 33 | 34 | 1. David Fiala, Frank Mueller, Christian Engelmann, et al.: 35 | “[Detection and Correction of Silent Data Corruption for Large-Scale High-Performance Computing](http://moss.csc.ncsu.edu/~mueller/ftp/pub/mueller/papers/sc12.pdf),” at 36 | *International Conference for High Performance Computing, Networking, Storage and 37 | Analysis* (SC12), November 2012. 38 | 39 | 1. Arjun Singh, Joon Ong, Amit Agarwal, et al.: 40 | “[Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network](http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p183.pdf),” at 41 | *Annual Conference of the ACM Special Interest Group on Data Communication* (SIGCOMM), August 2015. 42 | [doi:10.1145/2785956.2787508](http://dx.doi.org/10.1145/2785956.2787508) 43 | 44 | 1. Glenn K. Lockwood: 45 | “[Hadoop's Uncomfortable Fit in HPC](http://glennklockwood.blogspot.co.uk/2014/05/hadoops-uncomfortable-fit-in-hpc.html),” *glennklockwood.blogspot.co.uk*, May 16, 2014. 46 | 47 | 1. John von Neumann: 48 | “[Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components](https://ece.uwaterloo.ca/~ssundara/courses/prob_logics.pdf),” in *Automata Studies (AM-34)*, 49 | edited by Claude E. Shannon and John McCarthy, Princeton University Press, 1956. 50 | ISBN: 978-0-691-07916-5 51 | 52 | 1. Richard W. Hamming: 53 | *The Art of Doing Science and Engineering*. Taylor & Francis, 1997. 54 | ISBN: 978-9-056-99500-3 55 | 56 | 1. Claude E. 
Shannon: 57 | “[A Mathematical Theory of Communication](http://cs.brynmawr.edu/Courses/cs380/fall2012/shannon1948.pdf),” *The Bell System Technical Journal*, volume 27, number 3, 58 | pages 379–423 and 623–656, July 1948. 59 | 60 | 1. Peter Bailis and Kyle Kingsbury: 61 | “[The Network Is Reliable](https://queue.acm.org/detail.cfm?id=2655736),” 62 | *ACM Queue*, volume 12, number 7, pages 48–55, July 2014. 63 | [doi:10.1145/2639988.2639988](http://dx.doi.org/10.1145/2639988.2639988) 64 | 65 | 1. Joshua B. Leners, Trinabh Gupta, Marcos K. Aguilera, and Michael Walfish: 66 | “[Taming Uncertainty in Distributed Systems with Help from the Network](http://www.cs.nyu.edu/~mwalfish/papers/albatross-eurosys15.pdf),” at *10th European Conference on 67 | Computer Systems* (EuroSys), April 2015. 68 | [doi:10.1145/2741948.2741976](http://dx.doi.org/10.1145/2741948.2741976) 69 | 70 | 1. Phillipa Gill, Navendu Jain, and Nachiappan Nagappan: 71 | “[Understanding Network Failures in Data Centers: Measurement, Analysis, and Implications](http://conferences.sigcomm.org/sigcomm/2011/papers/sigcomm/p350.pdf),” at 72 | *ACM SIGCOMM Conference*, August 2011. 73 | [doi:10.1145/2018436.2018477](http://dx.doi.org/10.1145/2018436.2018477) 74 | 75 | 1. Mark Imbriaco: 76 | “[Downtime Last Saturday](https://github.com/blog/1364-downtime-last-saturday),” 77 | *github.com*, December 26, 2012. 78 | 79 | 1. Will Oremus: 80 | “[The Global Internet Is Being Attacked by Sharks, Google Confirms](http://www.slate.com/blogs/future_tense/2014/08/15/shark_attacks_threaten_google_s_undersea_internet_cables_video.html),” *slate.com*, August 15, 81 | 2014. 82 | 83 | 1. Marc A. Donges: 84 | “[Re: bnx2 cards Intermittantly Going Offline](http://www.spinics.net/lists/netdev/msg210485.html),” Message to Linux *netdev* mailing list, *spinics.net*, September 13, 2012. 85 | 86 | 1.
Kyle Kingsbury: 87 | “[Call Me Maybe: Elasticsearch](https://aphyr.com/posts/317-call-me-maybe-elasticsearch),” *aphyr.com*, June 15, 2014. 88 | 89 | 1. Salvatore Sanfilippo: 90 | “[A Few Arguments About Redis Sentinel Properties and Fail Scenarios](http://antirez.com/news/80),” *antirez.com*, October 21, 2014. 91 | 92 | 1. Bert Hubert: 93 | “[The Ultimate SO_LINGER Page, or: Why Is My TCP Not Reliable](http://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable),” *blog.netherlabs.nl*, January 18, 2009. 94 | 95 | 1. Nicolas Liochon: 96 | “[CAP: If All You Have Is a Timeout, Everything Looks Like a Partition](http://blog.thislongrun.com/2015/05/CAP-theorem-partition-timeout-zookeeper.html),” *blog.thislongrun.com*, 97 | May 25, 2015. 98 | 99 | 1. Jerome H. Saltzer, David P. Reed, and David D. Clark: 100 | “[End-To-End Arguments in System Design](https://groups.csail.mit.edu/ana/Publications/PubPDFs/End-to-End%20Arguments%20in%20System%20Design.pdf),” 101 | *ACM Transactions on Computer Systems*, volume 2, number 4, pages 277–288, November 1984. 102 | [doi:10.1145/357401.357402](http://dx.doi.org/10.1145/357401.357402) 103 | 104 | 1. Matthew P. Grosvenor, Malte Schwarzkopf, Ionel Gog, et al.: 105 | “[Queues Don’t Matter When You Can JUMP Them!](https://www.usenix.org/system/files/conference/nsdi15/nsdi15-paper-grosvenor_update.pdf),” at *12th USENIX Symposium on Networked 106 | Systems Design and Implementation* (NSDI), May 2015. 107 | 108 | 1. Guohui Wang and T. S. Eugene Ng: 109 | “[The Impact of Virtualization on Network Performance of Amazon EC2 Data Center](http://www.cs.rice.edu/~eugeneng/papers/INFOCOM10-ec2.pdf),” at *29th IEEE 110 | International Conference on Computer Communications* (INFOCOM), March 2010. 111 | [doi:10.1109/INFCOM.2010.5461931](http://dx.doi.org/10.1109/INFCOM.2010.5461931) 112 | 113 | 1. 
Van Jacobson: 114 | “[Congestion Avoidance and Control](http://www.cs.usask.ca/ftp/pub/discus/seminars2002-2003/p314-jacobson.pdf),” at *ACM Symposium on Communications Architectures and 115 | Protocols* (SIGCOMM), August 1988. 116 | [doi:10.1145/52324.52356](http://dx.doi.org/10.1145/52324.52356) 117 | 118 | 1. Brandon Philips: 119 | “[etcd: Distributed Locking and Service Discovery](https://www.youtube.com/watch?v=HJIjTTHWYnE),” at *Strange Loop*, September 2014. 120 | 121 | 1. Steve Newman: 122 | “[A Systematic Look at EC2 I/O](https://web.archive.org/web/20141211094156/http://blog.scalyr.com/2012/10/a-systematic-look-at-ec2-io/),” 123 | *blog.scalyr.com*, October 16, 2012. 124 | 125 | 1. Naohiro Hayashibara, Xavier Défago, Rami Yared, and 126 | Takuya Katayama: “[The ϕ Accrual Failure Detector](http://hdl.handle.net/10119/4784),” Japan Advanced Institute of Science and Technology, School of Information 127 | Science, Technical Report IS-RR-2004-010, May 2004. 128 | 129 | 1. Jeffrey Wang: 130 | “[Phi Accrual Failure Detector](http://ternarysearch.blogspot.co.uk/2013/08/phi-accrual-failure-detector.html),” *ternarysearch.blogspot.co.uk*, August 11, 2013. 131 | 132 | 1. Srinivasan Keshav: *An Engineering Approach 133 | to Computer Networking: ATM Networks, the Internet, and the Telephone Network*. 134 | Addison-Wesley Professional, May 1997. ISBN: 978-0-201-63442-6 135 | 136 | 1. Cisco, “[Integrated Services Digital Network](https://web.archive.org/web/20181229220921/http://docwiki.cisco.com/wiki/Integrated_Services_Digital_Network),” *docwiki.cisco.com*. 137 | 138 | 1. Othmar Kyas: *ATM Networks*. 139 | International Thomson Publishing, 1995. ISBN: 978-1-850-32128-6 140 | 141 | 1. “[InfiniBand FAQ](http://www.mellanox.com/related-docs/whitepapers/InfiniBandFAQ_FQ_100.pdf),” Mellanox Technologies, December 22, 2014. 142 | 143 | 1. Jose Renato Santos, Yoshio Turner, and G. 
(John) Janakiraman: 144 | “[End-to-End Congestion Control for InfiniBand](http://www.hpl.hp.com/techreports/2002/HPL-2002-359.pdf),” at *22nd Annual Joint Conference of the IEEE Computer and 145 | Communications Societies* (INFOCOM), April 2003. Also published by HP Laboratories Palo 146 | Alto, Tech Report HPL-2002-359. 147 | [doi:10.1109/INFCOM.2003.1208949](http://dx.doi.org/10.1109/INFCOM.2003.1208949) 148 | 149 | 1. Ulrich Windl, David Dalton, Marc Martinec, and Dale R. Worley: 150 | “[The NTP FAQ and HOWTO](http://www.ntp.org/ntpfaq/NTP-a-faq.htm),” *ntp.org*, 151 | November 2006. 152 | 153 | 1. John Graham-Cumming: 154 | “[How and why the leap second affected Cloudflare DNS](https://blog.cloudflare.com/how-and-why-the-leap-second-affected-cloudflare-dns/),” *blog.cloudflare.com*, January 1, 2017. 155 | 156 | 1. David Holmes: 157 | “[Inside the Hotspot VM: Clocks, Timers and Scheduling Events – Part I – Windows](https://web.archive.org/web/20160308031939/https://blogs.oracle.com/dholmes/entry/inside_the_hotspot_vm_clocks),” 158 | *blogs.oracle.com*, October 2, 2006. 159 | 160 | 1. Steve Loughran: 161 | “[Time on Multi-Core, Multi-Socket Servers](http://steveloughran.blogspot.co.uk/2015/09/time-on-multi-core-multi-socket-servers.html),” *steveloughran.blogspot.co.uk*, September 17, 2015. 162 | 163 | 1. James C. Corbett, Jeffrey Dean, Michael Epstein, et al.: 164 | “[Spanner: Google’s Globally-Distributed Database](https://research.google/pubs/pub39966/),” at *10th USENIX Symposium on Operating System Design and 165 | Implementation* (OSDI), October 2012. 166 | 167 | 1. M. Caporaloni and R. Ambrosini: 168 | “[How Closely Can a Personal Computer Clock Track the UTC Timescale Via the Internet?](https://iopscience.iop.org/0143-0807/23/4/103/),” *European Journal of 169 | Physics*, volume 23, number 4, pages L17–L21, June 2002. 170 | [doi:10.1088/0143-0807/23/4/103](http://dx.doi.org/10.1088/0143-0807/23/4/103) 171 | 172 | 1.
Nelson Minar: 173 | “[A Survey of the NTP Network](http://alumni.media.mit.edu/~nelson/research/ntp-survey99/),” 174 | *alumni.media.mit.edu*, December 1999. 175 | 176 | 1. Viliam Holub: 177 | “[Synchronizing Clocks in a Cassandra Cluster Pt. 1 – The Problem](https://blog.rapid7.com/2014/03/14/synchronizing-clocks-in-a-cassandra-cluster-pt-1-the-problem/),” *blog.rapid7.com*, March 14, 2014. 178 | 179 | 1. Poul-Henning Kamp: 180 | “[The One-Second War (What Time Will You Die?)](http://queue.acm.org/detail.cfm?id=1967009),” *ACM Queue*, volume 9, number 4, pages 44–48, April 2011. 181 | [doi:10.1145/1966989.1967009](http://dx.doi.org/10.1145/1966989.1967009) 182 | 183 | 1. Nelson Minar: 184 | “[Leap Second Crashes Half the Internet](http://www.somebits.com/weblog/tech/bad/leap-second-2012.html),” *somebits.com*, July 3, 2012. 185 | 186 | 1. Christopher Pascoe: 187 | “[Time, Technology and Leaping Seconds](http://googleblog.blogspot.co.uk/2011/09/time-technology-and-leaping-seconds.html),” *googleblog.blogspot.co.uk*, September 15, 2011. 188 | 189 | 1. Mingxue Zhao and Jeff Barr: 190 | “[Look Before You Leap – The Coming Leap Second and AWS](https://aws.amazon.com/blogs/aws/look-before-you-leap-the-coming-leap-second-and-aws/),” *aws.amazon.com*, May 18, 2015. 191 | 192 | 1. Darryl Veitch and Kanthaiah Vijayalayan: 193 | “[Network Timing and the 2015 Leap Second](http://crin.eng.uts.edu.au/~darryl/Publications/LeapSecond_camera.pdf),” at *17th International Conference on Passive and Active 194 | Measurement* (PAM), April 2016. 195 | [doi:10.1007/978-3-319-30505-9_29](http://dx.doi.org/10.1007/978-3-319-30505-9_29) 196 | 197 | 1. “[Timekeeping in VMware Virtual Machines](https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/Timekeeping-In-VirtualMachines.pdf),” 198 | Information Guide, VMware, Inc., December 2011. 199 | 200 | 1. 
“[MiFID II / MiFIR: Regulatory Technical and Implementing Standards – Annex I (Draft)](https://www.esma.europa.eu/sites/default/files/library/2015/11/2015-esma-1464_annex_i_-_draft_rts_and_its_on_mifid_ii_and_mifir.pdf),” 201 | European Securities and Markets Authority, Report ESMA/2015/1464, September 2015. 202 | 203 | 1. Luke Bigum: 204 | “[Solving MiFID II Clock Synchronisation With Minimum Spend (Part 1)](https://web.archive.org/web/20170704030310/https://www.lmax.com/blog/staff-blogs/2015/11/27/solving-mifid-ii-clock-synchronisation-minimum-spend-part-1/),” 205 | *lmax.com*, November 27, 2015. 206 | 207 | 1. Kyle Kingsbury: 208 | “[Call Me Maybe: Cassandra](https://aphyr.com/posts/294-call-me-maybe-cassandra/),” *aphyr.com*, September 24, 2013. 209 | 210 | 1. John Daily: 211 | “[Clocks Are Bad, or, Welcome to the Wonderful World of Distributed Systems](https://riak.com/clocks-are-bad-or-welcome-to-distributed-systems/),” 212 | *riak.com*, November 12, 2013. 213 | 214 | 1. Kyle Kingsbury: 215 | “[The Trouble with Timestamps](https://aphyr.com/posts/299-the-trouble-with-timestamps),” *aphyr.com*, October 12, 2013. 216 | 217 | 1. Leslie Lamport: 218 | “[Time, Clocks, and the Ordering of Events in a Distributed System](https://www.microsoft.com/en-us/research/publication/time-clocks-ordering-events-distributed-system/),” 219 | *Communications of the ACM*, volume 21, number 7, pages 558–565, July 1978. 220 | [doi:10.1145/359545.359563](http://dx.doi.org/10.1145/359545.359563) 221 | 222 | 1. Sandeep Kulkarni, Murat Demirbas, Deepak Madeppa, et al.: 223 | “[Logical Physical Clocks and Consistent Snapshots in Globally Distributed Databases](http://www.cse.buffalo.edu/tech-reports/2014-04.pdf),” State University of New York at 224 | Buffalo, Computer Science and Engineering Technical Report 2014-04, May 2014. 225 | 226 | 1. 
Justin Sheehy: 227 | “[There Is No Now: Problems With Simultaneity in Distributed Systems](https://queue.acm.org/detail.cfm?id=2745385),” *ACM Queue*, volume 13, number 3, pages 36–41, March 2015. 228 | [doi:10.1145/2733108](http://dx.doi.org/10.1145/2733108) 229 | 230 | 1. Murat Demirbas: 231 | “[Spanner: Google's Globally-Distributed Database](http://muratbuffalo.blogspot.co.uk/2013/07/spanner-googles-globally-distributed_4.html),” *muratbuffalo.blogspot.co.uk*, July 4, 2013. 232 | 233 | 1. Dahlia Malkhi and Jean-Philippe Martin: 234 | “[Spanner's Concurrency Control](http://www.cs.cornell.edu/~ie53/publications/DC-col51-Sep13.pdf),” *ACM SIGACT News*, volume 44, number 3, pages 73–77, September 2013. 235 | [doi:10.1145/2527748.2527767](http://dx.doi.org/10.1145/2527748.2527767) 236 | 237 | 1. Manuel Bravo, Nuno Diegues, Jingna Zeng, et al.: 238 | “[On the Use of Clocks to Enforce Consistency in the Cloud](http://sites.computer.org/debull/A15mar/p18.pdf),” *IEEE Data Engineering Bulletin*, 239 | volume 38, number 1, pages 18–31, March 2015. 240 | 241 | 1. Spencer Kimball: 242 | “[Living Without Atomic Clocks](http://www.cockroachlabs.com/blog/living-without-atomic-clocks/),” *cockroachlabs.com*, February 17, 2016. 243 | 244 | 1. Cary G. Gray and David R. Cheriton: 245 | “[Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency](http://web.stanford.edu/class/cs240/readings/89-leases.pdf),” at 246 | *12th ACM Symposium on Operating Systems Principles* (SOSP), December 1989. 247 | [doi:10.1145/74850.74870](http://dx.doi.org/10.1145/74850.74870) 248 | 249 | 1. Todd Lipcon: 250 | “[Avoiding Full GCs in Apache HBase with MemStore-Local Allocation Buffers: Part 1](https://web.archive.org/web/20121101040711/http://blog.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/),” 251 | *blog.cloudera.com*, February 24, 2011. 252 | 253 | 1. 
Martin Thompson: 254 | “[Java Garbage Collection Distilled](http://mechanical-sympathy.blogspot.co.uk/2013/07/java-garbage-collection-distilled.html),” *mechanical-sympathy.blogspot.co.uk*, July 16, 2013. 255 | 256 | 1. Alexey Ragozin: 257 | “[How to Tame Java GC Pauses? Surviving 16GiB Heap and Greater](https://dzone.com/articles/how-tame-java-gc-pauses),” 258 | *dzone.com*, June 28, 2011. 259 | 260 | 1. Christopher Clark, Keir Fraser, Steven Hand, et al.: 261 | “[Live Migration of Virtual Machines](http://www.cl.cam.ac.uk/research/srg/netos/papers/2005-nsdi-migration.pdf),” at *2nd USENIX Symposium on 262 | Networked Systems Design & Implementation* (NSDI), May 2005. 263 | 264 | 1. Mike Shaver: 265 | “[fsyncers and Curveballs](http://shaver.off.net/diary/2008/05/25/fsyncers-and-curveballs/),” *shaver.off.net*, May 25, 2008. 266 | 267 | 1. Zhenyun Zhuang and Cuong Tran: 268 | “[Eliminating Large JVM GC Pauses Caused by Background IO Traffic](https://engineering.linkedin.com/blog/2016/02/eliminating-large-jvm-gc-pauses-caused-by-background-io-traffic),” *engineering.linkedin.com*, February 10, 269 | 2016. 270 | 271 | 1. David Terei and Amit Levy: 272 | “[Blade: A Data Center Garbage Collector](http://arxiv.org/pdf/1504.02578.pdf),” 273 | arXiv:1504.02578, April 13, 2015. 274 | 275 | 1. Martin Maas, Tim Harris, Krste Asanović, and John Kubiatowicz: 276 | “[Trash Day: Coordinating Garbage Collection in Distributed Systems](https://timharris.uk/papers/2015-hotos.pdf),” at *15th USENIX Workshop on Hot Topics in Operating 277 | Systems* (HotOS), May 2015. 278 | 279 | 1. “[Predictable Low Latency](http://cdn2.hubspot.net/hubfs/1624455/Website_2016/content/White%20papers/Cinnober%20on%20GC%20pause%20free%20Java%20applications.pdf),” Cinnober Financial Technology AB, *cinnober.com*, November 24, 2013. 280 | 281 | 1. Martin Fowler: 282 | “[The LMAX Architecture](http://martinfowler.com/articles/lmax.html),” 283 | *martinfowler.com*, July 12, 2011. 
284 | 285 | 1. Flavio P. Junqueira and Benjamin Reed: 286 | *ZooKeeper: Distributed Process Coordination*. O'Reilly Media, 2013. 287 | ISBN: 978-1-449-36130-3 288 | 289 | 1. Enis Söztutar: 290 | “[HBase and HDFS: Understanding Filesystem Usage in HBase](http://www.slideshare.net/enissoz/hbase-and-hdfs-understanding-filesystem-usage),” at *HBaseCon*, 291 | June 2013. 292 | 293 | 1. Caitie McCaffrey: 294 | “[Clients Are Jerks: AKA How Halo 4 DoSed the Services at Launch & How We Survived](http://caitiem.com/2015/06/23/clients-are-jerks-aka-how-halo-4-dosed-the-services-at-launch-how-we-survived/),” *caitiem.com*, 295 | June 23, 2015. 296 | 297 | 1. Leslie Lamport, Robert Shostak, and Marshall Pease: 298 | “[The Byzantine Generals Problem](https://www.microsoft.com/en-us/research/publication/byzantine-generals-problem/),” 299 | *ACM Transactions on Programming Languages and Systems* (TOPLAS), volume 4, number 3, pages 382–401, July 1982. 300 | [doi:10.1145/357172.357176](http://dx.doi.org/10.1145/357172.357176) 301 | 302 | 1. Jim N. Gray: 303 | “[Notes on Data Base Operating Systems](http://jimgray.azurewebsites.net/papers/dbos.pdf),” in *Operating Systems: An Advanced Course*, Lecture 304 | Notes in Computer Science, volume 60, edited by R. Bayer, R. M. Graham, and G. Seegmüller, 305 | pages 393–481, Springer-Verlag, 1978. ISBN: 978-3-540-08755-7 306 | 307 | 1. Brian Palmer: 308 | “[How Complicated Was the Byzantine Empire?](http://www.slate.com/articles/news_and_politics/explainer/2011/10/the_byzantine_tax_code_how_complicated_was_byzantium_anyway_.html),” *slate.com*, October 20, 2011. 309 | 310 | 1. Leslie Lamport: 311 | “[My Writings](http://lamport.azurewebsites.net/pubs/pubs.html),” *lamport.azurewebsites.net*, December 16, 2014. 312 | This page can be found by searching the web for the 23-character string obtained by removing the hyphens from the string 313 | `allla-mport-spubso-ntheweb`. 314 | 315 | 1. 
John Rushby: 316 | “[Bus Architectures for Safety-Critical Embedded Systems](http://www.csl.sri.com/papers/emsoft01/emsoft01.pdf),” at *1st International Workshop on Embedded Software* 317 | (EMSOFT), October 2001. 318 | 319 | 1. Jake Edge: 320 | “[ELC: SpaceX Lessons Learned](http://lwn.net/Articles/540368/),” *lwn.net*, 321 | March 6, 2013. 322 | 323 | 1. Andrew Miller and Joseph J. LaViola, Jr.: 324 | “[Anonymous Byzantine Consensus from Moderately-Hard Puzzles: A Model for Bitcoin](http://nakamotoinstitute.org/static/docs/anonymous-byzantine-consensus.pdf),” University of Central 325 | Florida, Technical Report CS-TR-14-01, April 2014. 326 | 327 | 1. James Mickens: 328 | “[The Saddest Moment](https://www.usenix.org/system/files/login-logout_1305_mickens.pdf),” *USENIX ;login: logout*, May 2013. 329 | 330 | 1. Evan Gilman: 331 | “[The Discovery of Apache ZooKeeper’s Poison Packet](http://www.pagerduty.com/blog/the-discovery-of-apache-zookeepers-poison-packet/),” *pagerduty.com*, May 7, 2015. 332 | 333 | 1. Jonathan Stone and Craig Partridge: 334 | “[When the CRC and TCP Checksum Disagree](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.27.7611&rep=rep1&type=pdf),” at *ACM Conference on Applications, 335 | Technologies, Architectures, and Protocols for Computer Communication* (SIGCOMM), August 2000. 336 | [doi:10.1145/347059.347561](http://dx.doi.org/10.1145/347059.347561) 337 | 338 | 1. Evan Jones: 339 | “[How Both TCP and Ethernet Checksums Fail](http://www.evanjones.ca/tcp-and-ethernet-checksums-fail.html),” *evanjones.ca*, October 5, 2015. 340 | 341 | 1. Cynthia Dwork, Nancy Lynch, and Larry Stockmeyer: 342 | “[Consensus in the Presence of Partial Synchrony](http://www.net.t-labs.tu-berlin.de/~petr/ADC-07/papers/DLS88.pdf),” *Journal of the ACM*, volume 35, number 2, pages 288–323, 343 | April 1988. [doi:10.1145/42282.42283](http://dx.doi.org/10.1145/42282.42283) 344 | 345 | 1. 
Peter Bailis and Ali Ghodsi: 346 | “[Eventual Consistency Today: Limitations, Extensions, and Beyond](http://queue.acm.org/detail.cfm?id=2462076),” *ACM Queue*, volume 11, number 3, pages 55–63, March 2013. 347 | [doi:10.1145/2460276.2462076](http://dx.doi.org/10.1145/2460276.2462076) 348 | 349 | 1. Bowen Alpern and Fred B. Schneider: 350 | “[Defining Liveness](https://www.cs.cornell.edu/fbs/publications/DefLiveness.pdf),” 351 | *Information Processing Letters*, volume 21, number 4, pages 181–185, October 1985. 352 | [doi:10.1016/0020-0190(85)90056-0](http://dx.doi.org/10.1016/0020-0190(85)90056-0) 353 | 354 | 1. Flavio P. Junqueira: 355 | “[Dude, Where’s My Metadata?](http://fpj.me/2015/05/28/dude-wheres-my-metadata/),” 356 | *fpj.me*, May 28, 2015. 357 | 358 | 1. Scott Sanders: 359 | “[January 28th Incident Report](https://github.com/blog/2106-january-28th-incident-report),” *github.com*, February 3, 2016. 360 | 361 | 1. Jay Kreps: 362 | “[A Few Notes on Kafka and Jepsen](http://blog.empathybox.com/post/62279088548/a-few-notes-on-kafka-and-jepsen),” *blog.empathybox.com*, September 25, 2013. 363 | 364 | 1. Thanh Do, Mingzhe Hao, Tanakorn 365 | Leesatapornwongsa, et al.: 366 | “[Limplock: Understanding the Impact of Limpware on Scale-out Cloud Systems](http://ucare.cs.uchicago.edu/pdf/socc13-limplock.pdf),” at *4th ACM Symposium on Cloud Computing* 367 | (SoCC), October 2013. 368 | [doi:10.1145/2523616.2523627](http://dx.doi.org/10.1145/2523616.2523627) 369 | 370 | 1. Frank McSherry, Michael Isard, and Derek G. Murray: 371 | “[Scalability! But at What COST?](http://www.frankmcsherry.org/assets/COST.pdf),” 372 | at *15th USENIX Workshop on Hot Topics in Operating Systems* (HotOS), 373 | May 2015. 
374 | 375 | -------------------------------------------------------------------------------- /chapter-09-refs.md: -------------------------------------------------------------------------------- 1 | Designing Data-Intensive Applications 2 | ===================================== 3 | 4 | Chapter 9 References 5 | -------------------- 6 | 7 | 1. Peter Bailis and Ali Ghodsi: 8 | “[Eventual Consistency Today: Limitations, Extensions, and Beyond](http://queue.acm.org/detail.cfm?id=2462076),” *ACM Queue*, volume 11, number 3, pages 55-63, March 2013. 9 | [doi:10.1145/2460276.2462076](http://dx.doi.org/10.1145/2460276.2462076) 10 | 11 | 1. Prince Mahajan, Lorenzo Alvisi, and Mike Dahlin: 12 | “[Consistency, Availability, and Convergence](http://apps.cs.utexas.edu/tech_reports/reports/tr/TR-2036.pdf),” University of Texas at Austin, Department of Computer 13 | Science, Tech Report UTCS TR-11-22, May 2011. 14 | 15 | 1. Alex Scotti: 16 | “[Adventures in Building Your Own Database](http://www.slideshare.net/AlexScotti1/allyourbase-55212398),” at *All Your Base*, November 2015. 17 | 18 | 1. Peter Bailis, Aaron Davidson, Alan Fekete, et al.: 19 | “[Highly Available Transactions: Virtues and Limitations](http://arxiv.org/pdf/1302.0309.pdf),” at *40th International Conference on Very Large Data Bases* (VLDB), 20 | September 2014. Extended version published as pre-print arXiv:1302.0309 [cs.DB]. 21 | 22 | 1. Paolo Viotti and Marko Vukolić: 23 | “[Consistency in Non-Transactional Distributed Storage Systems](http://arxiv.org/abs/1512.00168),” arXiv:1512.00168, 12 April 2016. 24 | 25 | 1. Maurice P. Herlihy and Jeannette M. Wing: 26 | “[Linearizability: A Correctness Condition for Concurrent Objects](http://cs.brown.edu/~mph/HerlihyW90/p463-herlihy.pdf),” *ACM Transactions on Programming 27 | Languages and Systems* (TOPLAS), volume 12, number 3, pages 463–492, July 1990. 28 | [doi:10.1145/78969.78972](http://dx.doi.org/10.1145/78969.78972) 29 | 30 | 1. 
Leslie Lamport: 31 | “[On interprocess communication](https://www.microsoft.com/en-us/research/publication/interprocess-communication-part-basic-formalism-part-ii-algorithms/),” 32 | *Distributed Computing*, volume 1, number 2, pages 77–101, June 1986. 33 | [doi:10.1007/BF01786228](http://dx.doi.org/10.1007/BF01786228) 34 | 35 | 1. David K. Gifford: 36 | “[Information Storage in a Decentralized Computer System](http://www.mirrorservice.org/sites/www.bitsavers.org/pdf/xerox/parc/techReports/CSL-81-8_Information_Storage_in_a_Decentralized_Computer_System.pdf),” Xerox Palo Alto Research Centers, CSL-81-8, June 1981. 37 | 38 | 1. Martin Kleppmann: 39 | “[Please Stop Calling Databases CP or AP](http://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html),” *martin.kleppmann.com*, May 11, 2015. 40 | 41 | 1. Kyle Kingsbury: 42 | “[Call Me Maybe: MongoDB Stale Reads](https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-reads),” *aphyr.com*, April 20, 2015. 43 | 44 | 1. Kyle Kingsbury: 45 | “[Computational Techniques in Knossos](https://aphyr.com/posts/314-computational-techniques-in-knossos),” *aphyr.com*, May 17, 2014. 46 | 47 | 1. Peter Bailis: 48 | “[Linearizability Versus Serializability](http://www.bailis.org/blog/linearizability-versus-serializability/),” *bailis.org*, September 24, 2014. 49 | 50 | 1. Philip A. Bernstein, Vassos Hadzilacos, and Nathan Goodman: 51 | [*Concurrency Control and Recovery in Database Systems*](http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx). 52 | Addison-Wesley, 1987. ISBN: 978-0-201-10715-9, available online at *research.microsoft.com*. 53 | 54 | 1. Mike Burrows: 55 | “[The Chubby Lock Service for Loosely-Coupled Distributed Systems](https://research.google/pubs/pub27897/),” 56 | at *7th USENIX Symposium on Operating System Design and Implementation* (OSDI), November 2006. 57 | 58 | 1. Flavio P. Junqueira and Benjamin Reed: 59 | *ZooKeeper: Distributed Process Coordination*. O'Reilly Media, 2013. 
60 | ISBN: 978-1-449-36130-3 61 | 62 | 1. “[etcd Documentation](https://etcd.io/docs/),” The Linux Foundation, *etcd.io*. 63 | 64 | 1. “[Apache Curator](http://curator.apache.org/),” Apache Software Foundation, *curator.apache.org*, 2015. 65 | 66 | 1. Murali Vallath: 67 | *Oracle 10g RAC Grid, Services & Clustering*. Elsevier Digital Press, 2006. 68 | ISBN: 978-1-555-58321-7 69 | 70 | 1. Peter Bailis, Alan Fekete, Michael J. Franklin, et al.: 71 | “[Coordination-Avoiding Database Systems](http://arxiv.org/pdf/1402.2237.pdf),” 72 | *Proceedings of the VLDB Endowment*, volume 8, number 3, pages 185–196, November 2014. 73 | 74 | 1. Kyle Kingsbury: 75 | “[Call Me Maybe: etcd and Consul](https://aphyr.com/posts/316-call-me-maybe-etcd-and-consul),” *aphyr.com*, June 9, 2014. 76 | 77 | 1. Flavio P. Junqueira, Benjamin C. Reed, and Marco Serafini: 78 | “[Zab: High-Performance Broadcast for Primary-Backup Systems](https://marcoserafini.github.io/papers/zab.pdf),” 79 | at *41st IEEE International Conference on Dependable Systems and Networks* (DSN), June 2011. 80 | [doi:10.1109/DSN.2011.5958223](http://dx.doi.org/10.1109/DSN.2011.5958223) 81 | 82 | 1. Diego Ongaro and John K. Ousterhout: 83 | “[In Search of an Understandable Consensus Algorithm](https://www.usenix.org/system/files/conference/atc14/atc14-paper-ongaro.pdf),” 84 | at *USENIX Annual Technical Conference* (ATC), June 2014. 85 | 86 | 1. Hagit Attiya, Amotz Bar-Noy, and Danny Dolev: 87 | “[Sharing Memory Robustly in Message-Passing Systems](http://www.cse.huji.ac.il/course/2004/dist/p124-attiya.pdf),” 88 | *Journal of the ACM*, volume 42, number 1, pages 124–142, January 1995. 89 | [doi:10.1145/200836.200869](http://dx.doi.org/10.1145/200836.200869) 90 | 91 | 1. 
Nancy Lynch and Alex Shvartsman: 92 | “[Robust Emulation of Shared Memory Using Dynamic Quorum-Acknowledged Broadcasts](http://groups.csail.mit.edu/tds/papers/Lynch/FTCS97.pdf),” at *27th Annual International Symposium on 93 | Fault-Tolerant Computing* (FTCS), June 1997. 94 | [doi:10.1109/FTCS.1997.614100](http://dx.doi.org/10.1109/FTCS.1997.614100) 95 | 96 | 1. Christian Cachin, Rachid Guerraoui, and Luís Rodrigues: 97 | [*Introduction to Reliable and Secure Distributed Programming*](http://www.distributedprogramming.net/), 98 | 2nd edition. Springer, 2011. ISBN: 978-3-642-15259-7, 99 | [doi:10.1007/978-3-642-15260-3](http://dx.doi.org/10.1007/978-3-642-15260-3) 100 | 101 | 1. Sam Elliott, Mark Allen, and Martin Kleppmann: 102 | [personal communication](https://twitter.com/lenary/status/654761711933648896), 103 | thread on *twitter.com*, October 15, 2015. 104 | 105 | 1. Niklas Ekström, Mikhail Panchenko, and Jonathan Ellis: 106 | “[Possible Issue with Read Repair?](http://mail-archives.apache.org/mod_mbox/cassandra-dev/201210.mbox/%3CFA480D1DC3964E2C8B0A14E0880094C9%40Robotech%3E),” email thread on *cassandra-dev* mailing list, October 2012. 107 | 108 | 1. Maurice P. Herlihy: 109 | “[Wait-Free Synchronization](https://cs.brown.edu/~mph/Herlihy91/p124-herlihy.pdf),” 110 | *ACM Transactions on Programming Languages and Systems* (TOPLAS), volume 13, number 1, 111 | pages 124–149, January 1991. 112 | [doi:10.1145/114005.102808](http://dx.doi.org/10.1145/114005.102808) 113 | 114 | 1. Armando Fox and Eric A. Brewer: 115 | “[Harvest, Yield, and Scalable Tolerant Systems](http://radlab.cs.berkeley.edu/people/fox/static/pubs/pdf/c18.pdf),” at *7th Workshop on Hot Topics in Operating 116 | Systems* (HotOS), March 1999. 117 | [doi:10.1109/HOTOS.1999.798396](http://dx.doi.org/10.1109/HOTOS.1999.798396) 118 | 119 | 1. 
Seth Gilbert and Nancy Lynch: 120 | “[Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services](http://www.comp.nus.edu.sg/~gilbert/pubs/BrewersConjecture-SigAct.pdf),” 121 | *ACM SIGACT News*, volume 33, number 2, pages 51–59, June 2002. 122 | [doi:10.1145/564585.564601](http://dx.doi.org/10.1145/564585.564601) 123 | 124 | 1. Seth Gilbert and Nancy Lynch: 125 | “[Perspectives on the CAP Theorem](http://groups.csail.mit.edu/tds/papers/Gilbert/Brewer2.pdf),” *IEEE Computer Magazine*, volume 45, number 2, pages 30–36, February 2012. 126 | [doi:10.1109/MC.2011.389](http://dx.doi.org/10.1109/MC.2011.389) 127 | 128 | 1. Eric A. Brewer: 129 | “[CAP Twelve Years Later: How the 'Rules' Have Changed](http://cs609.cs.ua.edu/CAP12.pdf),” *IEEE Computer Magazine*, volume 45, number 2, pages 23–29, February 2012. 130 | [doi:10.1109/MC.2012.37](http://dx.doi.org/10.1109/MC.2012.37) 131 | 132 | 1. Susan B. Davidson, Hector Garcia-Molina, and Dale Skeen: 133 | “[Consistency in Partitioned Networks](http://delab.csd.auth.gr/~dimitris/courses/mpc_fall05/papers/invalidation/acm_csur85_partitioned_network_consistency.pdf),” *ACM Computing Surveys*, volume 17, number 3, pages 341–370, September 1985. 134 | [doi:10.1145/5505.5508](http://dx.doi.org/10.1145/5505.5508) 135 | 136 | 1. Paul R. Johnson and Robert H. Thomas: 137 | “[RFC 677: The Maintenance of Duplicate Databases](https://tools.ietf.org/html/rfc677),” Network Working Group, January 27, 1975. 138 | 139 | 1. Bruce G. Lindsay, Patricia Griffiths Selinger, C. Galtieri, et al.: 140 | “[Notes on Distributed Databases](http://domino.research.ibm.com/library/cyberdig.nsf/papers/A776EC17FC2FCE73852579F100578964/$File/RJ2571.pdf),” IBM Research, Research Report RJ2571(33471), July 1979. 141 | 142 | 1. Michael J. 
Fischer and Alan Michael: 143 | “[Sacrificing Serializability to Attain High Availability of Data in an Unreliable Network](http://www.cs.ucsb.edu/~agrawal/spring2011/ugrad/p70-fischer.pdf),” at 144 | *1st ACM Symposium on Principles of Database Systems* (PODS), March 1982. 145 | [doi:10.1145/588111.588124](http://dx.doi.org/10.1145/588111.588124) 146 | 147 | 1. Eric A. Brewer: 148 | “[NoSQL: Past, Present, Future](http://www.infoq.com/presentations/NoSQL-History),” 149 | at *QCon San Francisco*, November 2012. 150 | 151 | 1. Henry Robinson: 152 | “[CAP Confusion: Problems with 'Partition Tolerance,'](https://web.archive.org/web/20160304020135/http://blog.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/)” 153 | *blog.cloudera.com*, April 26, 2010. 154 | 155 | 1. Adrian Cockcroft: 156 | “[Migrating to Microservices](http://www.infoq.com/presentations/migration-cloud-native),” at *QCon London*, March 2014. 157 | 158 | 1. Martin Kleppmann: 159 | “[A Critique of the CAP Theorem](http://arxiv.org/abs/1509.05393),” arXiv:1509.05393, 160 | September 17, 2015. 161 | 162 | 1. Nancy A. Lynch: 163 | “[A Hundred Impossibility Proofs for Distributed Computing](http://groups.csail.mit.edu/tds/papers/Lynch/podc89.pdf),” at *8th ACM Symposium on Principles of Distributed 164 | Computing* (PODC), August 1989. 165 | [doi:10.1145/72981.72982](http://dx.doi.org/10.1145/72981.72982) 166 | 167 | 1. Hagit Attiya, Faith Ellen, and Adam Morrison: 168 | “[Limitations of Highly-Available Eventually-Consistent Data Stores](https://www.cs.tau.ac.il/~mad/publications/podc2015-replds.pdf),” 169 | at *ACM Symposium on Principles of Distributed Computing* (PODC), July 2015. 170 | [doi:10.1145/2767386.2767419](http://dx.doi.org/10.1145/2767386.2767419) 171 | 172 | 1. 
Peter Sewell, Susmit Sarkar, 173 | Scott Owens, et al.: 174 | “[x86-TSO: A Rigorous and Usable Programmer's Model for x86 Multiprocessors](http://www.cl.cam.ac.uk/~pes20/weakmemory/cacm.pdf),” *Communications of the ACM*, 175 | volume 53, number 7, pages 89–97, July 2010. 176 | [doi:10.1145/1785414.1785443](http://dx.doi.org/10.1145/1785414.1785443) 177 | 178 | 1. Martin Thompson: 179 | “[Memory Barriers/Fences](http://mechanical-sympathy.blogspot.co.uk/2011/07/memory-barriersfences.html),” *mechanical-sympathy.blogspot.co.uk*, July 24, 2011. 180 | 181 | 1. Ulrich Drepper: 182 | “[What Every Programmer Should Know About Memory](http://www.akkadia.org/drepper/cpumemory.pdf),” *akkadia.org*, November 21, 2007. 183 | 184 | 1. Daniel J. Abadi: 185 | “[Consistency Tradeoffs in Modern Distributed Database System Design](http://cs-www.cs.yale.edu/homes/dna/papers/abadi-pacelc.pdf),” *IEEE Computer Magazine*, 186 | volume 45, number 2, pages 37–42, February 2012. 187 | [doi:10.1109/MC.2012.33](http://dx.doi.org/10.1109/MC.2012.33) 188 | 189 | 1. Hagit Attiya and Jennifer L. Welch: 190 | “[Sequential Consistency Versus Linearizability](http://courses.csail.mit.edu/6.852/01/papers/p91-attiya.pdf),” *ACM Transactions on Computer Systems* (TOCS), 191 | volume 12, number 2, pages 91–122, May 1994. 192 | [doi:10.1145/176575.176576](http://dx.doi.org/10.1145/176575.176576) 193 | 194 | 1. Mustaque Ahamad, Gil Neiger, James E. Burns, et al.: 195 | “[Causal Memory: Definitions, Implementation, and Programming](http://www-i2.informatik.rwth-aachen.de/i2/fileadmin/user_upload/documents/Seminar_MCMM11/Causal_memory_1996.pdf),” *Distributed 196 | Computing*, volume 9, number 1, pages 37–49, March 1995. 197 | [doi:10.1007/BF01784241](http://dx.doi.org/10.1007/BF01784241) 198 | 199 | 1. Wyatt Lloyd, Michael J. Freedman, 200 | Michael Kaminsky, and David G. 
Andersen: 201 | “[Stronger Semantics for Low-Latency Geo-Replicated Storage](https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final149.pdf),” at *10th USENIX Symposium on Networked 202 | Systems Design and Implementation* (NSDI), April 2013. 203 | 204 | 1. Marek Zawirski, Annette Bieniusa, Valter Balegas, et al.: 205 | “[SwiftCloud: Fault-Tolerant Geo-Replication Integrated All the Way to the Client Machine](http://arxiv.org/abs/1310.3107),” INRIA Research Report 8347, August 2013. 206 | 207 | 1. Peter Bailis, Ali Ghodsi, Joseph M Hellerstein, and Ion Stoica: 208 | “[Bolt-on Causal Consistency](http://db.cs.berkeley.edu/papers/sigmod13-bolton.pdf),” at 209 | *ACM International Conference on Management of Data* (SIGMOD), June 2013. 210 | 211 | 1. Philippe Ajoux, Nathan Bronson, Sanjeev 212 | Kumar, et al.: 213 | “[Challenges to Adopting Stronger Consistency at Scale](https://www.usenix.org/system/files/conference/hotos15/hotos15-paper-ajoux.pdf),” at *15th USENIX Workshop on Hot Topics in 214 | Operating Systems* (HotOS), May 2015. 215 | 216 | 1. Peter Bailis: 217 | “[Causality Is Expensive (and What to Do About It)](http://www.bailis.org/blog/causality-is-expensive-and-what-to-do-about-it/),” *bailis.org*, February 5, 2014. 218 | 219 | 1. Ricardo Gonçalves, Paulo Sérgio Almeida, 220 | Carlos Baquero, and Victor Fonte: 221 | “[Concise Server-Wide Causality Management for Eventually Consistent Data Stores](http://haslab.uminho.pt/tome/files/global_logical_clocks.pdf),” at *15th IFIP International 222 | Conference on Distributed Applications and Interoperable Systems* (DAIS), June 2015. 223 | [doi:10.1007/978-3-319-19129-4_6](http://dx.doi.org/10.1007/978-3-319-19129-4_6) 224 | 225 | 1. Rob Conery: 226 | “[A Better ID Generator for PostgreSQL](http://rob.conery.io/2014/05/29/a-better-id-generator-for-postgresql/),” *rob.conery.io*, May 29, 2014. 227 | 228 | 1. 
Leslie Lamport: 229 | “[Time, Clocks, and the Ordering of Events in a Distributed System](https://www.microsoft.com/en-us/research/publication/time-clocks-ordering-events-distributed-system/),” 230 | *Communications of the ACM*, volume 21, number 7, pages 558–565, July 1978. 231 | [doi:10.1145/359545.359563](http://dx.doi.org/10.1145/359545.359563) 232 | 233 | 1. Xavier Défago, André Schiper, and Péter Urbán: 234 | “[Total Order Broadcast and Multicast Algorithms: Taxonomy and Survey](https://dspace.jaist.ac.jp/dspace/bitstream/10119/4883/1/defago_et_al.pdf),” *ACM Computing 235 | Surveys*, volume 36, number 4, pages 372–421, December 2004. 236 | [doi:10.1145/1041680.1041682](http://dx.doi.org/10.1145/1041680.1041682) 237 | 238 | 1. Hagit Attiya and Jennifer Welch: *Distributed 239 | Computing: Fundamentals, Simulations and Advanced Topics*, 2nd edition. 240 | John Wiley & Sons, 2004. ISBN: 978-0-471-45324-6, 241 | [doi:10.1002/0471478210](http://dx.doi.org/10.1002/0471478210) 242 | 243 | 1. Mahesh 244 | Balakrishnan, Dahlia Malkhi, Vijayan Prabhakaran, et al.: 245 | “[CORFU: A Shared Log Design for Flash Clusters](https://www.usenix.org/system/files/conference/nsdi12/nsdi12-final30.pdf),” at *9th USENIX Symposium on Networked 246 | Systems Design and Implementation* (NSDI), April 2012. 247 | 248 | 1. Fred B. Schneider: 249 | “[Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial](http://www.cs.cornell.edu/fbs/publications/smsurvey.pdf),” *ACM Computing Surveys*, volume 250 | 22, number 4, pages 299–319, December 1990. 251 | 252 | 1. Alexander Thomson, Thaddeus Diamond, Shu-Chun Weng, et al.: 253 | “[Calvin: Fast Distributed Transactions for Partitioned Database Systems](http://cs.yale.edu/homes/thomson/publications/calvin-sigmod12.pdf),” at *ACM International Conference 254 | on Management of Data* (SIGMOD), May 2012. 255 | 256 | 1. 
Mahesh Balakrishnan, Dahlia Malkhi, Ted Wobber, et al.: 257 | “[Tango: Distributed Data Structures over a Shared Log](https://www.microsoft.com/en-us/research/publication/tango-distributed-data-structures-over-a-shared-log/),” 258 | at *24th ACM Symposium on Operating Systems Principles* (SOSP), November 2013. 259 | [doi:10.1145/2517349.2522732](http://dx.doi.org/10.1145/2517349.2522732) 260 | 261 | 1. Robbert van Renesse and Fred B. Schneider: 262 | “[Chain Replication for Supporting High Throughput and Availability](http://static.usenix.org/legacy/events/osdi04/tech/full_papers/renesse/renesse.pdf),” at *6th USENIX 263 | Symposium on Operating System Design and Implementation* (OSDI), December 2004. 264 | 265 | 1. Leslie Lamport: 266 | “[How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs](https://lamport.azurewebsites.net/pubs/multi.pdf),” 267 | *IEEE Transactions on Computers*, volume 28, number 9, pages 690–691, September 1979. 268 | [doi:10.1109/TC.1979.1675439](http://dx.doi.org/10.1109/TC.1979.1675439) 269 | 270 | 1. Enis Söztutar, Devaraj Das, and Carter Shanklin: 271 | “[Apache HBase High Availability at the Next Level](https://web.archive.org/web/20160405122821/http://hortonworks.com/blog/apache-hbase-high-availability-next-level/),” 272 | *hortonworks.com*, January 22, 2015. 273 | 274 | 1. Brian F Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, et al.: 275 | “[PNUTS: Yahoo!’s Hosted Data Serving Platform](http://www.mpi-sws.org/~druschel/courses/ds/papers/cooper-pnuts.pdf),” at *34th International Conference on Very Large Data 276 | Bases* (VLDB), August 2008. 277 | [doi:10.14778/1454159.1454167](http://dx.doi.org/10.14778/1454159.1454167) 278 | 279 | 1. Tushar Deepak Chandra and Sam Toueg: 280 | “[Unreliable Failure Detectors for Reliable Distributed Systems](http://courses.csail.mit.edu/6.852/08/papers/CT96-JACM.pdf),” *Journal of the ACM*, 281 | volume 43, number 2, pages 225–267, March 1996. 
282 | [doi:10.1145/226643.226647](http://dx.doi.org/10.1145/226643.226647) 283 | 284 | 1. Michael J. Fischer, Nancy Lynch, and Michael S. Paterson: 285 | “[Impossibility of Distributed Consensus with One Faulty Process](https://groups.csail.mit.edu/tds/papers/Lynch/jacm85.pdf),” *Journal of the ACM*, volume 32, number 2, pages 374–382, April 1985. 286 | [doi:10.1145/3149.214121](http://dx.doi.org/10.1145/3149.214121) 287 | 288 | 1. Michael Ben-Or: “Another Advantage of Free 289 | Choice: Completely Asynchronous Agreement Protocols,” at *2nd ACM Symposium on Principles of 290 | Distributed Computing* (PODC), August 1983. 291 | [doi:10.1145/800221.806707](http://dl.acm.org/citation.cfm?id=806707) 292 | 293 | 1. Jim N. Gray and Leslie Lamport: 294 | “[Consensus on Transaction Commit](http://db.cs.berkeley.edu/cs286/papers/paxoscommit-tods2006.pdf),” *ACM Transactions on Database Systems* (TODS), volume 31, 295 | number 1, pages 133–160, March 2006. 296 | [doi:10.1145/1132863.1132867](http://dx.doi.org/10.1145/1132863.1132867) 297 | 298 | 1. Rachid Guerraoui: 299 | “[Revisiting the Relationship Between Non-Blocking Atomic Commitment and Consensus](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.27.6456&rep=rep1&type=pdf),” 300 | at *9th International Workshop on Distributed Algorithms* (WDAG), September 1995. 301 | [doi:10.1007/BFb0022140](http://dx.doi.org/10.1007/BFb0022140) 302 | 303 | 1. Thanumalayan Sankaranarayana Pillai, Vijay Chidambaram, 304 | Ramnatthan Alagappan, et al.: “[All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications](http://research.cs.wisc.edu/wind/Publications/alice-osdi14.pdf),” 305 | at *11th USENIX Symposium on Operating Systems Design and Implementation* (OSDI), 306 | October 2014. 307 | 308 | 1. 
Jim Gray: 309 | “[The Transaction Concept: Virtues and Limitations](http://jimgray.azurewebsites.net/papers/thetransactionconcept.pdf),” 310 | at *7th International Conference on Very Large Data Bases* (VLDB), September 1981. 311 | 312 | 1. Hector Garcia-Molina and Kenneth Salem: 313 | “[Sagas](http://www.cs.cornell.edu/andru/cs711/2002fa/reading/sagas.pdf),” at 314 | *ACM International Conference on Management of Data* (SIGMOD), May 1987. 315 | [doi:10.1145/38713.38742](http://dx.doi.org/10.1145/38713.38742) 316 | 317 | 1. C. Mohan, Bruce G. Lindsay, and Ron Obermarck: 318 | “[Transaction Management in the R* Distributed Database Management System](https://cs.brown.edu/courses/csci2270/archives/2012/papers/dtxn/p378-mohan.pdf),” 319 | *ACM Transactions on Database Systems*, volume 11, number 4, pages 378–396, December 1986. 320 | [doi:10.1145/7239.7266](http://dx.doi.org/10.1145/7239.7266) 321 | 322 | 1. “[Distributed Transaction Processing: The XA Specification](http://pubs.opengroup.org/onlinepubs/009680699/toc.pdf),” X/Open Company Ltd., Technical Standard 323 | XO/CAE/91/300, December 1991. ISBN: 978-1-872-63024-3 324 | 325 | 1. Mike Spille: 326 | “[XA Exposed, Part II](http://www.jroller.com/pyrasun/entry/xa_exposed_part_ii_schwartz),” 327 | *jroller.com*, April 3, 2004. 328 | 329 | 1. Ivan Silva Neto and Francisco Reverbel: 330 | “[Lessons Learned from Implementing WS-Coordination and WS-AtomicTransaction](http://www.ime.usp.br/~reverbel/papers/icis2008.pdf),” at *7th IEEE/ACIS International Conference on 331 | Computer and Information Science* (ICIS), May 2008. 332 | [doi:10.1109/ICIS.2008.75](http://dx.doi.org/10.1109/ICIS.2008.75) 333 | 334 | 1. James E. Johnson, David E. Langworthy, Leslie Lamport, and Friedrich H. 
Vogt: 335 | “[Formal Specification of a Web Services Protocol](https://www.microsoft.com/en-us/research/publication/formal-specification-of-a-web-services-protocol/),” 336 | at *1st International Workshop on Web Services and Formal Methods* (WS-FM), February 2004. 337 | [doi:10.1016/j.entcs.2004.02.022](http://dx.doi.org/10.1016/j.entcs.2004.02.022) 338 | 339 | 1. Dale Skeen: 340 | “[Nonblocking Commit Protocols](http://www.cs.utexas.edu/~lorenzo/corsi/cs380d/papers/Ske81.pdf),” at *ACM International Conference on Management of Data* (SIGMOD), April 1981. 341 | [doi:10.1145/582318.582339](http://dx.doi.org/10.1145/582318.582339) 342 | 343 | 1. Gregor Hohpe: 344 | “[Your Coffee Shop Doesn’t Use Two-Phase Commit](http://www.martinfowler.com/ieeeSoftware/coffeeShop.pdf),” *IEEE Software*, volume 22, number 2, pages 64–66, March 2005. 345 | [doi:10.1109/MS.2005.52](http://dx.doi.org/10.1109/MS.2005.52) 346 | 347 | 1. Pat Helland: 348 | “[Life Beyond Distributed Transactions: An Apostate’s Opinion](http://www-db.cs.wisc.edu/cidr/cidr2007/papers/cidr07p15.pdf),” at *3rd Biennial Conference on Innovative Data Systems 349 | Research* (CIDR), January 2007. 350 | 351 | 1. Jonathan Oliver: 352 | “[My Beef with MSDTC and Two-Phase Commits](http://blog.jonathanoliver.com/my-beef-with-msdtc-and-two-phase-commits/),” *blog.jonathanoliver.com*, April 4, 2011. 353 | 354 | 1. Oren Eini (Ayende Rahien): 355 | “[The Fallacy of Distributed Transactions](http://ayende.com/blog/167362/the-fallacy-of-distributed-transactions),” *ayende.com*, July 17, 2014. 356 | 357 | 1. Clemens Vasters: 358 | “[Transactions in Windows Azure (with Service Bus) – An Email Discussion](https://blogs.msdn.microsoft.com/clemensv/2012/07/30/transactions-in-windows-azure-with-service-bus-an-email-discussion/),” *vasters.com*, July 30, 2012. 359 | 360 | 1.
“[Understanding Transactionality in Azure](https://docs.particular.net/nservicebus/azure/understanding-transactionality-in-azure),” NServiceBus Documentation, Particular Software, 2015. 361 | 362 | 1. Randy Wigginton, Ryan Lowe, Marcos Albe, and Fernando Ipar: 363 | “[Distributed Transactions in MySQL](https://web.archive.org/web/20161010054152/https://www.percona.com/live/mysql-conference-2013/sites/default/files/slides/XA_final.pdf),” 364 | at *MySQL Conference and Expo*, April 2013. 365 | 366 | 1. Mike Spille: 367 | “[XA Exposed, Part I](https://web.archive.org/web/20130523064202/http://www.jroller.com/pyrasun/entry/xa_exposed),” 368 | *jroller.com*, April 3, 2004. 369 | 370 | 1. Ajmer Dhariwal: 371 | “[Orphaned MSDTC Transactions (-2 spids)](http://www.eraofdata.com/orphaned-msdtc-transactions-2-spids/),” *eraofdata.com*, December 12, 2008. 372 | 373 | 1. Paul Randal: 374 | “[Real World Story of DBCC PAGE Saving the Day](http://www.sqlskills.com/blogs/paul/real-world-story-of-dbcc-page-saving-the-day/),” *sqlskills.com*, June 19, 2013. 375 | 376 | 1. “[in-doubt xact resolution Server Configuration Option](https://msdn.microsoft.com/en-us/library/ms179586.aspx),” SQL Server 2016 documentation, Microsoft, Inc., 377 | 2016. 378 | 379 | 1. Cynthia Dwork, Nancy Lynch, and Larry Stockmeyer: 380 | “[Consensus in the Presence of Partial Synchrony](http://www.net.t-labs.tu-berlin.de/~petr/ADC-07/papers/DLS88.pdf),” *Journal of the ACM*, volume 35, number 2, pages 288–323, 381 | April 1988. [doi:10.1145/42282.42283](http://dx.doi.org/10.1145/42282.42283) 382 | 383 | 1. Miguel Castro and Barbara H. Liskov: 384 | “[Practical Byzantine Fault Tolerance and Proactive Recovery](http://zoo.cs.yale.edu/classes/cs426/2012/bib/castro02practical.pdf),” *ACM Transactions on Computer Systems*, 385 | volume 20, number 4, pages 396–461, November 2002. 386 | [doi:10.1145/571637.571640](http://dx.doi.org/10.1145/571637.571640) 387 | 388 | 1. Brian M. Oki and Barbara H. 
Liskov: 389 | “[Viewstamped Replication: A New Primary Copy Method to Support Highly-Available Distributed Systems](http://www.cs.princeton.edu/courses/archive/fall11/cos518/papers/viewstamped.pdf),” at 390 | *7th ACM Symposium on Principles of Distributed Computing* (PODC), August 1988. 391 | [doi:10.1145/62546.62549](http://dx.doi.org/10.1145/62546.62549) 392 | 393 | 1. Barbara H. Liskov and James Cowling: 394 | “[Viewstamped Replication Revisited](http://pmg.csail.mit.edu/papers/vr-revisited.pdf),” 395 | Massachusetts Institute of Technology, Tech Report MIT-CSAIL-TR-2012-021, July 2012. 396 | 397 | 1. Leslie Lamport: 398 | “[The Part-Time Parliament](https://www.microsoft.com/en-us/research/publication/part-time-parliament/),” 399 | *ACM Transactions on Computer Systems*, volume 16, number 2, pages 133–169, May 1998. 400 | [doi:10.1145/279227.279229](http://dx.doi.org/10.1145/279227.279229) 401 | 402 | 1. Leslie Lamport: 403 | “[Paxos Made Simple](https://www.microsoft.com/en-us/research/publication/paxos-made-simple/),” *ACM SIGACT News*, volume 32, number 4, pages 51–58, December 2001. 404 | 405 | 1. Tushar Deepak Chandra, Robert Griesemer, and Joshua 406 | Redstone: “[Paxos Made Live – An Engineering Perspective](http://www.read.seas.harvard.edu/~kohler/class/08w-dsi/chandra07paxos.pdf),” at *26th ACM Symposium on Principles of Distributed 407 | Computing* (PODC), June 2007. 408 | 409 | 1. Robbert 410 | van Renesse: “[Paxos Made Moderately Complex](http://www.cs.cornell.edu/home/rvr/Paxos/paxos.pdf),” *cs.cornell.edu*, March 2011. 411 | 412 | 1. Diego Ongaro: 413 | “[Consensus: Bridging Theory and Practice](https://github.com/ongardie/dissertation),” 414 | PhD Thesis, Stanford University, August 2014. 415 | 416 | 1. 
Heidi Howard, Malte Schwarzkopf, Anil Madhavapeddy, 417 | and Jon Crowcroft: “[Raft Refloated: Do We Have Consensus?](http://www.cl.cam.ac.uk/~ms705/pub/papers/2015-osr-raft.pdf),” *ACM SIGOPS Operating Systems Review*, volume 49, 418 | number 1, pages 12–21, January 2015. 419 | [doi:10.1145/2723872.2723876](http://dx.doi.org/10.1145/2723872.2723876) 420 | 421 | 1. André Medeiros: 422 | “[ZooKeeper’s Atomic Broadcast Protocol: Theory and Practice](http://www.tcs.hut.fi/Studies/T-79.5001/reports/2012-deSouzaMedeiros.pdf),” Aalto University School of Science, March 20, 2012. 423 | 424 | 1. Robbert van Renesse, Nicolas Schiper, and 425 | Fred B. Schneider: “[Vive La Différence: Paxos vs. Viewstamped Replication vs. Zab](http://arxiv.org/abs/1309.5671),” *IEEE Transactions on Dependable and Secure Computing*, 426 | volume 12, number 4, pages 472–484, September 2014. 427 | [doi:10.1109/TDSC.2014.2355848](http://dx.doi.org/10.1109/TDSC.2014.2355848) 428 | 429 | 1. Will 430 | Portnoy: “[Lessons Learned from Implementing Paxos](http://blog.willportnoy.com/2012/06/lessons-learned-from-paxos.html),” *blog.willportnoy.com*, June 14, 2012. 431 | 432 | 1. Heidi Howard, Dahlia Malkhi, and Alexander Spiegelman: 433 | “[Flexible Paxos: Quorum Intersection Revisited](https://drops.dagstuhl.de/opus/volltexte/2017/7094/pdf/LIPIcs-OPODIS-2016-25.pdf),” 434 | at *20th International Conference on Principles of Distributed Systems* (OPODIS), December 2016. 435 | [doi:10.4230/LIPIcs.OPODIS.2016.25](http://dx.doi.org/10.4230/LIPIcs.OPODIS.2016.25) 436 | 437 | 1. Heidi Howard and Jon Crowcroft: 438 | “[Coracle: Evaluating Consensus at the Internet Edge](http://www.sigcomm.org/sites/default/files/ccr/papers/2015/August/2829988-2790010.pdf),” 439 | at *Annual Conference of the ACM Special Interest Group on Data Communication* (SIGCOMM), August 2015. 440 | [doi:10.1145/2829988.2790010](http://dx.doi.org/10.1145/2829988.2790010) 441 | 442 | 1. 
Kyle Kingsbury: 443 | “[Call Me Maybe: Elasticsearch 1.5.0](https://aphyr.com/posts/323-call-me-maybe-elasticsearch-1-5-0),” *aphyr.com*, April 27, 2015. 444 | 445 | 1. Ivan Kelly: 446 | “[BookKeeper Tutorial](https://github.com/ivankelly/bookkeeper-tutorial),” 447 | *github.com*, October 2014. 448 | 449 | 1. Camille Fournier: 450 | “[Consensus Systems for the Skeptical Architect](https://vimeo.com/102667163),” 451 | at *Philly ETE*, Philadelphia, PA, USA, April 2014. 452 | 453 | 1. Kenneth P. Birman: 454 | “[A History of the Virtual Synchrony Replication Model](https://ptolemy.berkeley.edu/projects/truststc/pubs/713/History%20of%20the%20Virtual%20Synchrony%20Replication%20Model%202010.pdf),” 455 | in *Replication: Theory and Practice*, Springer LNCS volume 5959, chapter 6, pages 91–120, 2010. 456 | ISBN: 978-3-642-11293-5, [doi:10.1007/978-3-642-11294-2_6](http://dx.doi.org/10.1007/978-3-642-11294-2_6) 457 | 458 | -------------------------------------------------------------------------------- /chapter-10-refs.md: -------------------------------------------------------------------------------- 1 | Designing Data-Intensive Applications 2 | ===================================== 3 | 4 | Chapter 10 References 5 | -------------------- 6 | 7 | 1. Jeffrey Dean and Sanjay Ghemawat: 8 | “[MapReduce: Simplified Data Processing on Large Clusters](https://research.google/pubs/pub62/),” 9 | at *6th USENIX Symposium on Operating System Design and Implementation* (OSDI), December 2004. 10 | 11 | 1. Joel Spolsky: 12 | “[The Perils of JavaSchools](https://www.joelonsoftware.com/2005/12/29/the-perils-of-javaschools-2/),” *joelonsoftware.com*, December 29, 2005. 13 | 14 | 1. Shivnath Babu and Herodotos Herodotou: 15 | “[Massively Parallel Databases and MapReduce Systems](https://www.microsoft.com/en-us/research/wp-content/uploads/2013/11/db-mr-survey-final.pdf),” 16 | *Foundations and Trends in Databases*, volume 5, number 1, pages 1–104, November 2013. 
17 | [doi:10.1561/1900000036](http://dx.doi.org/10.1561/1900000036) 18 | 19 | 1. David J. DeWitt and Michael Stonebraker: 20 | “[MapReduce: A Major Step Backwards](https://homes.cs.washington.edu/~billhowe/mapreduce_a_major_step_backwards.html),” originally published at *databasecolumn.vertica.com*, January 17, 2008. 21 | 22 | 1. Henry Robinson: 23 | “[The Elephant Was a Trojan Horse: On the Death of Map-Reduce at Google](https://www.the-paper-trail.org/post/2014-06-25-the-elephant-was-a-trojan-horse-on-the-death-of-map-reduce-at-google/),” 24 | *the-paper-trail.org*, June 25, 2014. 25 | 26 | 1. “[The Hollerith Machine](https://www.census.gov/history/www/innovations/technology/the_hollerith_tabulator.html),” United States Census Bureau, *census.gov*. 27 | 28 | 1. “[IBM 82, 83, and 84 Sorters Reference Manual](http://www.textfiles.com/bitsavers/pdf/ibm/punchedCard/Sorter/A24-1034-1_82-83-84_sorters.pdf),” Edition A24-1034-1, International Business 29 | Machines Corporation, July 1962. 30 | 31 | 1. Adam Drake: 32 | “[Command-Line Tools Can Be 235x Faster than Your Hadoop Cluster](http://aadrake.com/command-line-tools-can-be-235x-faster-than-your-hadoop-cluster.html),” *aadrake.com*, January 25, 2014. 33 | 34 | 1. “[GNU Coreutils 8.23 Documentation](http://www.gnu.org/software/coreutils/manual/html_node/index.html),” Free Software Foundation, Inc., 2014. 35 | 36 | 1. Martin Kleppmann: 37 | “[Kafka, Samza, and the Unix Philosophy of Distributed Data](http://martin.kleppmann.com/2015/08/05/kafka-samza-unix-philosophy-distributed-data.html),” *martin.kleppmann.com*, August 5, 2015. 38 | 39 | 1. Doug McIlroy: 40 | [Internal Bell Labs memo](https://swtch.com/~rsc/thread/mdmpipe.pdf), 41 | October 1964. Cited in: Dennis M. Ritchie: 42 | “[Advice from Doug McIlroy](https://www.bell-labs.com/usr/dmr/www/mdmpipe.html),” 43 | *bell-labs.com*. 44 | 45 | 1. M. D. McIlroy, E. N. Pinson, and B. A.
Tague: 46 | “[UNIX Time-Sharing System: Foreword](https://archive.org/details/bstj57-6-1899),” 47 | *The Bell System Technical Journal*, volume 57, number 6, pages 1899–1904, 48 | July 1978. 49 | 50 | 1. Eric S. Raymond: 51 | [*The Art of UNIX Programming*](http://www.catb.org/~esr/writings/taoup/html/). 52 | Addison-Wesley, 2003. ISBN: 978-0-13-142901-7 53 | 54 | 1. Ronald Duncan: 55 | “[Text File Formats – ASCII Delimited Text – Not CSV or TAB Delimited Text](https://ronaldduncan.wordpress.com/2009/10/31/text-file-formats-ascii-delimited-text-not-csv-or-tab-delimited-text/),” 56 | *ronaldduncan.wordpress.com*, October 31, 2009. 57 | 58 | 1. Alan Kay: 59 | “[Is 'Software Engineering' an Oxymoron?](http://tinlizzie.org/~takashi/IsSoftwareEngineeringAnOxymoron.pdf),” *tinlizzie.org*. 60 | 61 | 1. Martin Fowler: 62 | “[InversionOfControl](http://martinfowler.com/bliki/InversionOfControl.html),” 63 | *martinfowler.com*, June 26, 2005. 64 | 65 | 1. Daniel J. Bernstein: 66 | “[Two File Descriptors for Sockets](http://cr.yp.to/tcpip/twofd.html),” *cr.yp.to*. 67 | 68 | 1. Rob Pike and Dennis M. Ritchie: 69 | “[The Styx Architecture for Distributed Systems](http://doc.cat-v.org/inferno/4th_edition/styx),” *Bell Labs Technical Journal*, volume 4, number 2, pages 70 | 146–152, April 1999. 71 | 72 | 1. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak 73 | Leung: “[The Google File System](http://research.google.com/archive/gfs-sosp2003.pdf),” 74 | at *19th ACM Symposium on Operating Systems Principles* (SOSP), October 2003. 75 | [doi:10.1145/945445.945450](http://dx.doi.org/10.1145/945445.945450) 76 | 77 | 1. Michael Ovsiannikov, Silvius Rus, Damian Reeves, et al.: 78 | “[The Quantcast File System](http://db.disi.unitn.eu/pages/VLDBProgram/pdf/industry/p808-ovsiannikov.pdf),” *Proceedings of the VLDB Endowment*, volume 6, number 11, pages 1092–1101, August 2013. 79 | [doi:10.14778/2536222.2536234](http://dx.doi.org/10.14778/2536222.2536234) 80 | 81 | 1. 
“[OpenStack Swift 2.6.1 Developer Documentation](http://docs.openstack.org/developer/swift/),” OpenStack Foundation, *docs.openstack.org*, March 2016. 82 | 83 | 1. Zhe Zhang, Andrew Wang, Kai Zheng, et al.: 84 | “[Introduction to HDFS Erasure Coding in Apache Hadoop](https://blog.cloudera.com/introduction-to-hdfs-erasure-coding-in-apache-hadoop/),” 85 | *blog.cloudera.com*, September 23, 2015. 86 | 87 | 1. Peter Cnudde: 88 | “[Hadoop Turns 10](https://web.archive.org/web/20190119112713/https://yahoohadoop.tumblr.com/post/138739227316/hadoop-turns-10),” 89 | *yahoohadoop.tumblr.com*, February 5, 2016. 90 | 91 | 1. Eric Baldeschwieler: 92 | “[Thinking About the HDFS vs. Other Storage Technologies](https://web.archive.org/web/20190529215115/http://hortonworks.com/blog/thinking-about-the-hdfs-vs-other-storage-technologies/),” 93 | *hortonworks.com*, July 25, 2012. 94 | 95 | 1. Brendan Gregg: 96 | “[Manta: Unix Meets Map Reduce](http://dtrace.org/blogs/brendan/2013/06/25/manta-unix-meets-map-reduce/),” *dtrace.org*, June 25, 2013. 97 | 98 | 1. Tom White: *Hadoop: The Definitive Guide*, 99 | 4th edition. O'Reilly Media, 2015. ISBN: 978-1-491-90163-2 100 | 101 | 1. Jim N. Gray: 102 | “[Distributed Computing Economics](http://arxiv.org/pdf/cs/0403019.pdf),” Microsoft 103 | Research Tech Report MSR-TR-2003-24, March 2003. 104 | 105 | 1. Márton Trencséni: 106 | “[Luigi vs Airflow vs Pinball](http://bytepawn.com/luigi-airflow-pinball.html),” 107 | *bytepawn.com*, February 6, 2016. 108 | 109 | 1. Roshan Sumbaly, Jay Kreps, and Sam Shah: 110 | “[The 'Big Data' Ecosystem at LinkedIn](http://www.slideshare.net/s_shah/the-big-data-ecosystem-at-linkedin-23512853),” at *ACM International Conference on Management of Data* 111 | (SIGMOD), July 2013. 112 | [doi:10.1145/2463676.2463707](http://dx.doi.org/10.1145/2463676.2463707) 113 | 114 | 1. Alan F. 
Gates, Olga Natkovich, Shubham Chopra, et al.: 115 | “[Building a High-Level Dataflow System on Top of Map-Reduce: The Pig Experience](http://www.vldb.org/pvldb/vol2/vldb09-1074.pdf),” 116 | at *35th International Conference on Very Large Data Bases* (VLDB), August 2009. 117 | 118 | 1. Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, et al.: 119 | “[Hive – A Petabyte Scale Data Warehouse Using Hadoop](http://i.stanford.edu/~ragho/hive-icde2010.pdf),” at *26th IEEE International Conference on Data Engineering* (ICDE), March 2010. 120 | [doi:10.1109/ICDE.2010.5447738](http://dx.doi.org/10.1109/ICDE.2010.5447738) 121 | 122 | 1. “[Cascading 3.0 User Guide](http://docs.cascading.org/cascading/3.0/userguide/),” Concurrent, Inc., *docs.cascading.org*, January 2016. 123 | 124 | 1. “[Apache Crunch User Guide](https://crunch.apache.org/user-guide.html),” Apache Software Foundation, *crunch.apache.org*. 125 | 126 | 1. Craig Chambers, Ashish Raniwala, Frances 127 | Perry, et al.: “[FlumeJava: Easy, Efficient Data-Parallel Pipelines](https://research.google.com/pubs/archive/35650.pdf),” at *31st ACM SIGPLAN Conference on Programming Language 128 | Design and Implementation* (PLDI), June 2010. 129 | [doi:10.1145/1806596.1806638](http://dx.doi.org/10.1145/1806596.1806638) 130 | 131 | 1. Jay Kreps: 132 | “[Why Local State is a Fundamental Primitive in Stream Processing](https://www.oreilly.com/ideas/why-local-state-is-a-fundamental-primitive-in-stream-processing),” *oreilly.com*, July 31, 2014. 133 | 134 | 1. Martin Kleppmann: 135 | “[Rethinking Caching in Web Apps](http://martin.kleppmann.com/2012/10/01/rethinking-caching-in-web-apps.html),” *martin.kleppmann.com*, October 1, 2012. 136 | 137 | 1. Mark Grover, Ted Malaska, Jonathan 138 | Seidman, and Gwen Shapira: *[Hadoop Application Architectures](http://shop.oreilly.com/product/0636920033196.do)*. O'Reilly Media, 2015. ISBN: 978-1-491-90004-8 139 | 140 | 1. 
Philippe Ajoux, Nathan Bronson, 141 | Sanjeev Kumar, et al.: 142 | “[Challenges to Adopting Stronger Consistency at Scale](https://www.usenix.org/system/files/conference/hotos15/hotos15-paper-ajoux.pdf),” at *15th USENIX Workshop on Hot Topics in 143 | Operating Systems* (HotOS), May 2015. 144 | 145 | 1. “[Performance and Efficiency](https://pig.apache.org/docs/latest/perf.html),” 146 | Apache Pig Documentation, *pig.apache.org*, 2017. 147 | 148 | 1. Sriranjan Manjunath: 149 | “[Skewed Join](https://wiki.apache.org/pig/PigSkewedJoinSpec),” *wiki.apache.org*, 150 | 2009. 151 | 152 | 1. David J. DeWitt, Jeffrey F. Naughton, Donovan A. 153 | Schneider, and S. Seshadri: “[Practical Skew Handling in Parallel Joins](http://www.vldb.org/conf/1992/P027.PDF),” at *18th International Conference on Very Large Data Bases* (VLDB), August 1992. 154 | 155 | 1. Marcel Kornacker, Alexander Behm, Victor 156 | Bittorf, et al.: “[Impala: A Modern, Open-Source SQL Engine for Hadoop](http://pandis.net/resources/cidr15impala.pdf),” at *7th Biennial Conference on Innovative Data Systems 157 | Research* (CIDR), January 2015. 158 | 159 | 1. Matthieu Monsch: 160 | “[Open-Sourcing PalDB, a Lightweight Companion for Storing Side Data](https://engineering.linkedin.com/blog/2015/10/open-sourcing-paldb--a-lightweight-companion-for-storing-side-da),” *engineering.linkedin.com*, October 26, 2015. 161 | 162 | 1. Daniel Peng and Frank Dabek: 163 | “[Large-Scale Incremental Processing Using Distributed Transactions and Notifications](https://www.usenix.org/legacy/event/osdi10/tech/full_papers/Peng.pdf),” at *9th USENIX 164 | conference on Operating Systems Design and Implementation* (OSDI), October 2010. 165 | 166 | 1. “[Cloudera Search User Guide](http://www.cloudera.com/documentation/cdh/5-1-x/Search/Cloudera-Search-User-Guide/Cloudera-Search-User-Guide.html),” Cloudera, Inc., September 2015. 167 | 168 | 1.
Lili Wu, Sam Shah, Sean Choi, et al.: 169 | “[The Browsemaps: Collaborative Filtering at LinkedIn](http://ceur-ws.org/Vol-1271/Paper3.pdf),” 170 | at *6th Workshop on Recommender Systems and the Social Web* (RSWeb), October 2014. 171 | 172 | 1. Roshan Sumbaly, Jay Kreps, Lei Gao, et al.: 173 | “[Serving Large-Scale Batch Computed Data with Project Voldemort](http://static.usenix.org/events/fast12/tech/full_papers/Sumbaly.pdf),” at *10th USENIX Conference on File and Storage 174 | Technologies* (FAST), February 2012. 175 | 176 | 1. Varun Sharma: 177 | “[Open-Sourcing Terrapin: A Serving System for Batch Generated Data](https://web.archive.org/web/20170215032514/https://engineering.pinterest.com/blog/open-sourcing-terrapin-serving-system-batch-generated-data-0),” 178 | *engineering.pinterest.com*, September 14, 2015. 179 | 180 | 1. Nathan Marz: 181 | “[ElephantDB](http://www.slideshare.net/nathanmarz/elephantdb),” *slideshare.net*, May 30, 2011. 182 | 183 | 1. Jean-Daniel (JD) Cryans: 184 | “[How-to: Use HBase Bulk Loading, and Why](https://blog.cloudera.com/how-to-use-hbase-bulk-loading-and-why/),” 185 | *blog.cloudera.com*, September 27, 2013. 186 | 187 | 1. Nathan Marz: 188 | “[How to Beat the CAP Theorem](http://nathanmarz.com/blog/how-to-beat-the-cap-theorem.html),” *nathanmarz.com*, October 13, 2011. 189 | 190 | 1. Molly Bartlett Dishman and Martin Fowler: 191 | “[Agile Architecture](http://conferences.oreilly.com/software-architecture/sa2015/public/schedule/detail/40388),” at *O'Reilly Software Architecture Conference*, March 2015. 192 | 193 | 1. David J. DeWitt and Jim N. Gray: 194 | “[Parallel Database Systems: The Future of High Performance Database Systems](http://www.cs.cmu.edu/~pavlo/courses/fall2013/static/papers/dewittgray92.pdf),” 195 | *Communications of the ACM*, volume 35, number 6, pages 85–98, June 1992. 196 | [doi:10.1145/129888.129894](http://dx.doi.org/10.1145/129888.129894) 197 | 198 | 1. 
Jay Kreps: 199 | “[But the multi-tenancy thing is actually really really hard](https://twitter.com/jaykreps/status/528235702480142336),” tweetstorm, *twitter.com*, October 31, 2014. 200 | 201 | 1. Jeffrey Cohen, Brian Dolan, Mark Dunlap, et al.: 202 | “[MAD Skills: New Analysis Practices for Big Data](http://www.vldb.org/pvldb/vol2/vldb09-219.pdf),” 203 | *Proceedings of the VLDB Endowment*, volume 2, number 2, pages 1481–1492, August 2009. 204 | [doi:10.14778/1687553.1687576](http://dx.doi.org/10.14778/1687553.1687576) 205 | 206 | 1. Ignacio 207 | Terrizzano, Peter Schwarz, Mary Roth, and John E. Colino: 208 | “[Data Wrangling: The Challenging Journey from the Wild to the Lake](http://cidrdb.org/cidr2015/Papers/CIDR15_Paper2.pdf),” at *7th Biennial Conference on Innovative Data Systems 209 | Research* (CIDR), January 2015. 210 | 211 | 1. Paige Roberts: 212 | “[To Schema on Read or to Schema on Write, That Is the Hadoop Data Lake Question](http://adaptivesystemsinc.com/blog/to-schema-on-read-or-to-schema-on-write-that-is-the-hadoop-data-lake-question/),” *adaptivesystemsinc.com*, July 2, 2015. 213 | 214 | 1. Bobby Johnson and Joseph Adler: 215 | “[The Sushi Principle: Raw Data Is Better](https://conferences.oreilly.com/strata/big-data-conference-ca-2015/public/schedule/detail/38737),” 216 | at *Strata+Hadoop World*, February 2015. 217 | 218 | 1. Vinod Kumar Vavilapalli, Arun C. Murthy, Chris Douglas, et al.: 219 | “[Apache Hadoop YARN: Yet Another Resource Negotiator](http://www.socc2013.org/home/program/a5-vavilapalli.pdf),” at *4th ACM Symposium on Cloud Computing* (SoCC), October 2013. 220 | [doi:10.1145/2523616.2523633](http://dx.doi.org/10.1145/2523616.2523633) 221 | 222 | 1. Abhishek Verma, Luis Pedrosa, Madhukar Korupolu, et al.: 223 | “[Large-Scale Cluster Management at Google with Borg](http://research.google.com/pubs/pub43438.html),” at *10th European Conference on Computer Systems* (EuroSys), April 2015. 
224 | [doi:10.1145/2741948.2741964](http://dx.doi.org/10.1145/2741948.2741964) 225 | 226 | 1. Malte Schwarzkopf: 227 | “[The Evolution of Cluster Scheduler Architectures](http://www.firmament.io/blog/scheduler-architectures.html),” *firmament.io*, March 9, 2016. 228 | 229 | 1. Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, et al.: 230 | “[Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing](https://www.usenix.org/system/files/conference/nsdi12/nsdi12-final138.pdf),” at *9th 231 | USENIX Symposium on Networked Systems Design and Implementation* (NSDI), April 2012. 232 | 233 | 1. Holden Karau, Andy Konwinski, Patrick Wendell, and Matei 234 | Zaharia: *Learning Spark*. O'Reilly Media, 2015. ISBN: 978-1-449-35904-1 235 | 236 | 1. Bikas Saha and Hitesh Shah: 237 | “[Apache Tez: Accelerating Hadoop Query Processing](http://www.slideshare.net/Hadoop_Summit/w-1205phall1saha),” at *Hadoop Summit*, June 2014. 238 | 239 | 1. Bikas Saha, Hitesh Shah, Siddharth Seth, et al.: 240 | “[Apache Tez: A Unifying Framework for Modeling and Building Data Processing Applications](http://home.cse.ust.hk/~weiwa/teaching/Fall15-COMP6611B/reading_list/Tez.pdf),” at *ACM 241 | International Conference on Management of Data* (SIGMOD), June 2015. 242 | [doi:10.1145/2723372.2742790](http://dx.doi.org/10.1145/2723372.2742790) 243 | 244 | 1. Kostas Tzoumas: 245 | “[Apache Flink: API, Runtime, and Project Roadmap](http://www.slideshare.net/KostasTzoumas/apache-flink-api-runtime-and-project-roadmap),” *slideshare.net*, January 14, 2015. 246 | 247 | 1. Alexander Alexandrov, Rico Bergmann, Stephan Ewen, et al.: 248 | “[The Stratosphere Platform for Big Data Analytics](https://ssc.io/pdf/2014-VLDBJ_Stratosphere_Overview.pdf),” *The VLDB Journal*, volume 23, number 6, pages 939–964, May 2014. 249 | [doi:10.1007/s00778-014-0357-y](http://dx.doi.org/10.1007/s00778-014-0357-y) 250 | 251 | 1. 
Michael Isard, Mihai Budiu, Yuan Yu, et al.: 252 | “[Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks](https://www.microsoft.com/en-us/research/publication/dryad-distributed-data-parallel-programs-from-sequential-building-blocks/),” at *European Conference on Computer 253 | Systems* (EuroSys), March 2007. 254 | [doi:10.1145/1272996.1273005](http://dx.doi.org/10.1145/1272996.1273005) 255 | 256 | 1. Daniel Warneke and Odej Kao: 257 | “[Nephele: Efficient Parallel Data Processing in the Cloud](https://stratosphere2.dima.tu-berlin.de/assets/papers/Nephele_09.pdf),” at *2nd Workshop on Many-Task Computing on Grids and 258 | Supercomputers* (MTAGS), November 2009. 259 | [doi:10.1145/1646468.1646476](http://dx.doi.org/10.1145/1646468.1646476) 260 | 261 | 1. Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd: 262 | “[The PageRank Citation Ranking: Bringing Order to the Web](http://ilpubs.stanford.edu:8090/422/),” 263 | Stanford InfoLab Technical Report 422, 1999. 264 | 265 | 1. Leslie G. Valiant: 266 | “[A Bridging Model for Parallel Computation](http://dl.acm.org/citation.cfm?id=79181),” 267 | *Communications of the ACM*, volume 33, number 8, pages 103–111, August 1990. 268 | [doi:10.1145/79173.79181](http://dx.doi.org/10.1145/79173.79181) 269 | 270 | 1. Stephan Ewen, Kostas Tzoumas, Moritz Kaufmann, and Volker Markl: 271 | “[Spinning Fast Iterative Data Flows](http://vldb.org/pvldb/vol5/p1268_stephanewen_vldb2012.pdf),” *Proceedings of the VLDB Endowment*, volume 5, number 11, pages 1268–1279, July 2012. 272 | [doi:10.14778/2350229.2350245](http://dx.doi.org/10.14778/2350229.2350245) 273 | 274 | 1. Grzegorz Malewicz, Matthew H. 275 | Austern, Aart J. C. Bik, et al.: “[Pregel: A System for Large-Scale Graph Processing](https://kowshik.github.io/JPregel/pregel_paper.pdf),” at *ACM International Conference on Management of 276 | Data* (SIGMOD), June 2010.
277 | [doi:10.1145/1807167.1807184](http://dx.doi.org/10.1145/1807167.1807184) 278 | 279 | 1. Frank McSherry, Michael Isard, and Derek G. Murray: 280 | “[Scalability! But at What COST?](http://www.frankmcsherry.org/assets/COST.pdf),” at 281 | *15th USENIX Workshop on Hot Topics in Operating Systems* (HotOS), May 2015. 282 | 283 | 1. Ionel Gog, Malte Schwarzkopf, Natacha Crooks, et al.: 284 | “[Musketeer: All for One, One for All in Data Processing Systems](http://www.cl.cam.ac.uk/research/srg/netos/camsas/pubs/eurosys15-musketeer.pdf),” at *10th European Conference on 285 | Computer Systems* (EuroSys), April 2015. 286 | [doi:10.1145/2741948.2741968](http://dx.doi.org/10.1145/2741948.2741968) 287 | 288 | 1. Aapo Kyrola, Guy Blelloch, and Carlos Guestrin: 289 | “[GraphChi: Large-Scale Graph Computation on Just a PC](https://www.usenix.org/system/files/conference/osdi12/osdi12-final-126.pdf),” at *10th USENIX Symposium on Operating Systems 290 | Design and Implementation* (OSDI), October 2012. 291 | 292 | 1. Andrew Lenharth, Donald Nguyen, and Keshav Pingali: 293 | “[Parallel Graph Analytics](http://cacm.acm.org/magazines/2016/5/201591-parallel-graph-analytics/fulltext),” *Communications of the ACM*, volume 59, number 5, pages 78–87, May 294 | 2016. [doi:10.1145/2901919](http://dx.doi.org/10.1145/2901919) 295 | 296 | 1. Fabian Hüske: 297 | “[Peeking into Apache Flink's Engine Room](http://flink.apache.org/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html),” *flink.apache.org*, March 13, 2015. 298 | 299 | 1. Mostafa Mokhtar: 300 | “[Hive 0.14 Cost Based Optimizer (CBO) Technical Overview](https://web.archive.org/web/20170607112708/http://hortonworks.com/blog/hive-0-14-cost-based-optimizer-cbo-technical-overview/),” 301 | *hortonworks.com*, March 2, 2015. 302 | 303 | 1. 
Michael Armbrust, Reynold S Xin, Cheng Lian, et al.: 304 | “[Spark SQL: Relational Data Processing in Spark](http://people.csail.mit.edu/matei/papers/2015/sigmod_spark_sql.pdf),” at *ACM International Conference on Management of Data* (SIGMOD), June 2015. 305 | [doi:10.1145/2723372.2742797](http://dx.doi.org/10.1145/2723372.2742797) 306 | 307 | 1. Daniel Blazevski: 308 | “[Planting Quadtrees for Apache Flink](http://insightdataengineering.com/blog/flink-knn/),” *insightdataengineering.com*, March 25, 2016. 309 | 310 | 1. Tom White: 311 | “[Genome Analysis Toolkit: Now Using Apache Spark for Data Processing](https://web.archive.org/web/20190215132904/http://blog.cloudera.com/blog/2016/04/genome-analysis-toolkit-now-using-apache-spark-for-data-processing/),” 312 | *blog.cloudera.com*, April 6, 2016. 313 | 314 | -------------------------------------------------------------------------------- /chapter-11-refs.md: -------------------------------------------------------------------------------- 1 | Designing Data-Intensive Applications 2 | ===================================== 3 | 4 | Chapter 11 References 5 | -------------------- 6 | 7 | 1. Tyler Akidau, Robert Bradshaw, Craig Chambers, et al.: 8 | “[The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing](http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf),” 9 | *Proceedings of the VLDB Endowment*, volume 8, number 12, pages 1792–1803, August 2015. 10 | [doi:10.14778/2824032.2824076](http://dx.doi.org/10.14778/2824032.2824076) 11 | 12 | 1. Harold Abelson, Gerald Jay Sussman, and Julie Sussman: 13 | [*Structure and Interpretation of Computer Programs*](https://mitpress.mit.edu/sicp/), 14 | 2nd edition. MIT Press, 1996. ISBN: 978-0-262-51087-5, available online at *mitpress.mit.edu* 15 | 16 | 1. Patrick Th. Eugster, Pascal A. 
Felber, 17 | Rachid Guerraoui, and Anne-Marie Kermarrec: 18 | “[The Many Faces of Publish/Subscribe](http://www.cs.ru.nl/~pieter/oss/manyfaces.pdf),” 19 | *ACM Computing Surveys*, volume 35, number 2, pages 114–131, June 2003. 20 | [doi:10.1145/857076.857078](http://dx.doi.org/10.1145/857076.857078) 21 | 22 | 1. Joseph M. Hellerstein and Michael Stonebraker: 23 | [*Readings in Database Systems*](http://redbook.cs.berkeley.edu/), 4th edition. 24 | MIT Press, 2005. ISBN: 978-0-262-69314-1, available online at *redbook.cs.berkeley.edu* 25 | 26 | 1. Don Carney, Uğur Çetintemel, Mitch Cherniack, et al.: 27 | “[Monitoring Streams – A New Class of Data Management Applications](http://www.vldb.org/conf/2002/S07P02.pdf),” at *28th International Conference on Very Large Data Bases* 28 | (VLDB), August 2002. 29 | 30 | 1. Matthew Sackman: 31 | “[Pushing Back](http://www.lshift.net/blog/2016/05/05/pushing-back/),” 32 | *lshift.net*, May 5, 2016. 33 | 34 | 1. Vicent Martí: 35 | “[Brubeck, a statsd-Compatible Metrics Aggregator](http://githubengineering.com/brubeck/),” *githubengineering.com*, June 15, 2015. 36 | 37 | 1. Seth Lowenberger: 38 | “[MoldUDP64 Protocol Specification V 1.00](http://www.nasdaqtrader.com/content/technicalsupport/specifications/dataproducts/moldudp64.pdf),” *nasdaqtrader.com*, July 2009. 39 | 40 | 1. Pieter Hintjens: 41 | [*ZeroMQ – The Guide*](http://zguide.zeromq.org/page:all). O'Reilly Media, 2013. 42 | ISBN: 978-1-449-33404-8 43 | 44 | 1. Ian Malpass: 45 | “[Measure Anything, Measure Everything](https://codeascraft.com/2011/02/15/measure-anything-measure-everything/),” *codeascraft.com*, February 15, 2011. 46 | 47 | 1. Dieter Plaetinck: 48 | “[25 Graphite, Grafana and statsd Gotchas](https://grafana.com/blog/2016/03/03/25-graphite-grafana-and-statsd-gotchas/),” 49 | *grafana.com*, March 3, 2016. 50 | 51 | 1. 
Jeff Lindsay: 52 | “[Web Hooks to Revolutionize the Web](http://progrium.com/blog/2007/05/03/web-hooks-to-revolutionize-the-web/),” *progrium.com*, May 3, 2007. 53 | 54 | 1. Jim N. Gray: 55 | “[Queues Are Databases](https://arxiv.org/pdf/cs/0701158.pdf),” 56 | Microsoft Research Technical Report MSR-TR-95-56, December 1995. 57 | 58 | 1. Mark Hapner, Rich Burridge, Rahul Sharma, et al.: 59 | “[JSR-343 Java Message Service (JMS) 2.0 Specification](https://jcp.org/en/jsr/detail?id=343),” *jms-spec.java.net*, March 2013. 60 | 61 | 1. Sanjay Aiyagari, Matthew Arrott, Mark Atwell, et al.: 62 | “[AMQP: Advanced Message Queuing Protocol Specification](http://www.rabbitmq.com/resources/specs/amqp0-9-1.pdf),” Version 0-9-1, November 2008. 63 | 64 | 1. “[Google Cloud Pub/Sub: A Google-Scale Messaging Service](https://cloud.google.com/pubsub/architecture),” *cloud.google.com*, 2016. 65 | 66 | 1. “[Apache Kafka 0.9 Documentation](http://kafka.apache.org/documentation.html),” *kafka.apache.org*, November 2015. 67 | 68 | 1. Jay Kreps, Neha Narkhede, and Jun Rao: 69 | “[Kafka: A Distributed Messaging System for Log Processing](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/09/Kafka.pdf),” at *6th International Workshop on 70 | Networking Meets Databases* (NetDB), June 2011. 71 | 72 | 1. “[Amazon Kinesis Streams Developer Guide](http://docs.aws.amazon.com/streams/latest/dev/introduction.html),” *docs.aws.amazon.com*, April 2016. 73 | 74 | 1. Leigh Stewart and Sijie Guo: 75 | “[Building DistributedLog: Twitter’s High-Performance Replicated Log Service](https://blog.twitter.com/2015/building-distributedlog-twitter-s-high-performance-replicated-log-service),” *blog.twitter.com*, 76 | September 16, 2015. 77 | 78 | 1. “[DistributedLog Documentation](https://bookkeeper.apache.org/distributedlog/docs/latest/),” 79 | Apache Software Foundation, *distributedlog.io*. 80 | 81 | 1. 
Jay Kreps: 82 | “[Benchmarking Apache Kafka: 2 Million Writes Per Second (On Three Cheap Machines)](https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines),” *engineering.linkedin.com*, 83 | April 27, 2014. 84 | 85 | 1. Kartik Paramasivam: 86 | “[How We’re Improving and Advancing Kafka at LinkedIn](https://engineering.linkedin.com/apache-kafka/how-we_re-improving-and-advancing-kafka-linkedin),” *engineering.linkedin.com*, September 2, 2015. 87 | 88 | 1. Jay Kreps: 89 | “[The Log: What Every Software Engineer Should Know About Real-Time Data's Unifying Abstraction](http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying),” 90 | *engineering.linkedin.com*, December 16, 2013. 91 | 92 | 1. Shirshanka Das, Chavdar Botev, Kapil Surlaker, 93 | et al.: “[All Aboard the Databus!](http://www.socc2012.org/s18-das.pdf),” at *3rd ACM 94 | Symposium on Cloud Computing* (SoCC), October 2012. 95 | 96 | 1. Yogeshwer Sharma, Philippe Ajoux, Petchean Ang, et al.: 97 | “[Wormhole: Reliable Pub-Sub to Support Geo-Replicated Internet Services](https://www.usenix.org/system/files/conference/nsdi15/nsdi15-paper-sharma.pdf),” at *12th USENIX Symposium on 98 | Networked Systems Design and Implementation* (NSDI), May 2015. 99 | 100 | 1. P. P. S. Narayan: 101 | “[Sherpa Update](http://web.archive.org/web/20160801221400/https://developer.yahoo.com/blogs/ydn/sherpa-7992.html),” 102 | *developer.yahoo.com*, June 8, . 103 | 104 | 1. Martin Kleppmann: 105 | “[Bottled Water: Real-Time Integration of PostgreSQL and Kafka](http://martin.kleppmann.com/2015/04/23/bottled-water-real-time-postgresql-kafka.html),” *martin.kleppmann.com*, April 23, 2015. 106 | 107 | 1. 
Ben Osheroff: 108 | “[Introducing Maxwell, a mysql-to-kafka Binlog Processor](https://web.archive.org/web/20170208100334/https://developer.zendesk.com/blog/introducing-maxwell-a-mysql-to-kafka-binlog-processor),” 109 | *developer.zendesk.com*, August 20, 2015. 110 | 111 | 1. Randall Hauch: 112 | “[Debezium 0.2.1 Released](http://debezium.io/blog/2016/06/10/Debezium-0/),” *debezium.io*, 113 | June 10, 2016. 114 | 115 | 1. Prem Santosh Udaya Shankar: 116 | “[Streaming MySQL Tables in Real-Time to Kafka](https://engineeringblog.yelp.com/2016/08/streaming-mysql-tables-in-real-time-to-kafka.html),” *engineeringblog.yelp.com*, August 1, 2016. 117 | 118 | 1. “[Mongoriver](https://github.com/stripe/mongoriver),” 119 | Stripe, Inc., *github.com*, September 2014. 120 | 121 | 1. Dan Harvey: 122 | “[Change Data Capture with Mongo + Kafka](http://www.slideshare.net/danharvey/change-data-capture-with-mongodb-and-kafka),” at *Hadoop Users Group UK*, August 2015. 123 | 124 | 1. “[Oracle GoldenGate 12c: Real-Time Access to Real-Time Information](http://www.oracle.com/us/products/middleware/data-integration/oracle-goldengate-realtime-access-2031152.pdf),” Oracle White Paper, March 2015. 125 | 126 | 1. “[Oracle GoldenGate Fundamentals: How Oracle GoldenGate Works](https://www.youtube.com/watch?v=6H9NibIiPQE),” Oracle Corporation, *youtube.com*, November 2012. 127 | 128 | 1. Slava Akhmechet: 129 | “[Advancing the Realtime Web](http://rethinkdb.com/blog/realtime-web/),” *rethinkdb.com*, 130 | January 27, 2015. 131 | 132 | 1. “[Firebase Realtime Database Documentation](https://firebase.google.com/docs/database/),” Google, Inc., *firebase.google.com*, May 2016. 133 | 134 | 1. “[Apache CouchDB 1.6 Documentation](http://docs.couchdb.org/en/latest/),” *docs.couchdb.org*, 2014. 135 | 136 | 1. 
Matt DeBergalis: 137 | “[Meteor 0.7.0: Scalable Database Queries Using MongoDB Oplog Instead of Poll-and-Diff](http://info.meteor.com/blog/meteor-070-scalable-database-queries-using-mongodb-oplog-instead-of-poll-and-diff),” *info.meteor.com*, 138 | December 17, 2013. 139 | 140 | 1. “[Chapter 15. Importing and Exporting Live Data](https://docs.voltdb.com/UsingVoltDB/ChapExport.php),” VoltDB 6.4 User Manual, *docs.voltdb.com*, June 2016. 141 | 142 | 1. Neha Narkhede: 143 | “[Announcing Kafka Connect: Building Large-Scale Low-Latency Data Pipelines](http://www.confluent.io/blog/announcing-kafka-connect-building-large-scale-low-latency-data-pipelines),” *confluent.io*, 144 | February 18, 2016. 145 | 146 | 1. Greg Young: 147 | “[CQRS and Event Sourcing](https://www.youtube.com/watch?v=JHGkaShoyNs),” at *Code on 148 | the Beach*, August 2014. 149 | 150 | 1. Martin Fowler: 151 | “[Event Sourcing](http://martinfowler.com/eaaDev/EventSourcing.html),” *martinfowler.com*, 152 | December 12, 2005. 153 | 154 | 1. Vaughn Vernon: 155 | [*Implementing Domain-Driven Design*](https://www.informit.com/store/implementing-domain-driven-design-9780321834577). 156 | Addison-Wesley Professional, 2013. ISBN: 978-0-321-83457-7 157 | 158 | 1. H. V. Jagadish, Inderpal Singh Mumick, and Abraham Silberschatz: 159 | “[View Maintenance Issues for the Chronicle Data Model](https://dl.acm.org/doi/10.1145/212433.220201),” 160 | at *14th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems* (PODS), May 1995. 161 | [doi:10.1145/212433.220201](http://dx.doi.org/10.1145/212433.220201) 162 | 163 | 1. “[Event Store 3.5.0 Documentation](http://docs.geteventstore.com/),” Event Store LLP, *docs.geteventstore.com*, February 2016. 164 | 165 | 1. Martin Kleppmann: 166 | [*Making Sense of Stream Processing*](http://www.oreilly.com/data/free/stream-processing.csp). Report, O'Reilly Media, May 2016. 167 | 168 | 1. 
Sander Mak: 169 | “[Event-Sourced Architectures with Akka](http://www.slideshare.net/SanderMak/eventsourced-architectures-with-akka),” at *JavaOne*, September 2014. 170 | 171 | 1. Julian Hyde: 172 | [personal communication](https://twitter.com/julianhyde/status/743374145006641153), 173 | June 2016. 174 | 175 | 1. Ashish Gupta and Inderpal Singh Mumick: 176 | *Materialized Views: Techniques, Implementations, and Applications*. MIT Press, 1999. 177 | ISBN: 978-0-262-57122-7 178 | 179 | 1. Timothy Griffin and Leonid Libkin: 180 | “[Incremental Maintenance of Views with Duplicates](http://homepages.inf.ed.ac.uk/libkin/papers/sigmod95.pdf),” at *ACM International Conference on Management of 181 | Data* (SIGMOD), May 1995. 182 | [doi:10.1145/223784.223849](http://dx.doi.org/10.1145/223784.223849) 183 | 184 | 1. Pat Helland: 185 | “[Immutability Changes Everything](http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf),” 186 | at *7th Biennial Conference on Innovative Data Systems Research* (CIDR), January 2015. 187 | 188 | 1. Martin Kleppmann: 189 | “[Accounting for Computer Scientists](http://martin.kleppmann.com/2011/03/07/accounting-for-computer-scientists.html),” *martin.kleppmann.com*, March 7, 2011. 190 | 191 | 1. Pat Helland: 192 | “[Accountants Don't Use Erasers](https://blogs.msdn.microsoft.com/pathelland/2007/06/14/accountants-dont-use-erasers/),” *blogs.msdn.com*, June 14, 2007. 193 | 194 | 1. Fangjin Yang: 195 | “[Dogfooding with Druid, Samza, and Kafka: Metametrics at Metamarkets](https://metamarkets.com/2015/dogfooding-with-druid-samza-and-kafka-metametrics-at-metamarkets/),” *metamarkets.com*, June 3, 2015. 196 | 197 | 1. Gavin Li, Jianqiu Lv, and Hang Qi: 198 | “[Pistachio: Co-Locate the Data and Compute for Fastest Cloud Compute](https://web.archive.org/web/20181214032620/https://yahoohadoop.tumblr.com/post/116365275781/pistachio-co-locate-the-data-and-compute-for),” 199 | *yahoohadoop.tumblr.com*, April 13, 2015. 200 | 201 | 1. 
Kartik Paramasivam: 202 | “[Stream Processing Hard Problems – Part 1: Killing Lambda](https://engineering.linkedin.com/blog/2016/06/stream-processing-hard-problems-part-1-killing-lambda),” *engineering.linkedin.com*, June 27, 2016. 203 | 204 | 1. Martin Fowler: 205 | “[CQRS](http://martinfowler.com/bliki/CQRS.html),” *martinfowler.com*, July 14, 2011. 206 | 207 | 1. Greg Young: 208 | “[CQRS Documents](https://cqrs.files.wordpress.com/2010/11/cqrs_documents.pdf),” 209 | *cqrs.files.wordpress.com*, November 2010. 210 | 211 | 1. Baron Schwartz: 212 | “[Immutability, MVCC, and Garbage Collection](http://www.xaprb.com/blog/2013/12/28/immutability-mvcc-and-garbage-collection/),” *xaprb.com*, December 28, 2013. 213 | 214 | 1. Daniel Eloff, Slava Akhmechet, Jay Kreps, et al.: 215 | “[Re: Turning the Database Inside-out with Apache Samza](https://news.ycombinator.com/item?id=9145197),” Hacker News discussion, *news.ycombinator.com*, March 4, 2015. 216 | 217 | 1. “[Datomic Development Resources: Excision](http://docs.datomic.com/excision.html),” Cognitect, Inc., *docs.datomic.com*. 218 | 219 | 1. “[Fossil Documentation: Deleting Content from Fossil](http://fossil-scm.org/index.html/doc/trunk/www/shunning.wiki),” *fossil-scm.org*, 2016. 220 | 221 | 1. Jay Kreps: 222 | “[The irony of distributed systems is that data loss is really easy but deleting data is surprisingly hard](https://twitter.com/jaykreps/status/582580836425330688),” *twitter.com*, March 30, 223 | 2015. 224 | 225 | 1. David C. Luckham: 226 | “[What’s the Difference Between ESP and CEP?](http://www.complexevents.com/2006/08/01/what%E2%80%99s-the-difference-between-esp-and-cep/),” *complexevents.com*, August 1, 2006. 227 | 228 | 1. Srinath Perera: 229 | “[How Is Stream Processing and Complex Event Processing (CEP) Different?](https://www.quora.com/How-is-stream-processing-and-complex-event-processing-CEP-different),” *quora.com*, December 3, 2015. 230 | 231 | 1. 
Arvind Arasu, Shivnath Babu, and Jennifer Widom: 232 | “[The CQL Continuous Query Language: Semantic Foundations and Query Execution](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/cql.pdf),” 233 | *The VLDB Journal*, volume 15, number 2, pages 121–142, June 2006. 234 | [doi:10.1007/s00778-004-0147-z](http://dx.doi.org/10.1007/s00778-004-0147-z) 235 | 236 | 1. Julian Hyde: 237 | “[Data in Flight: How Streaming SQL Technology Can Help Solve the Web 2.0 Data Crunch](http://queue.acm.org/detail.cfm?id=1667562),” *ACM Queue*, volume 7, number 11, December 2009. 238 | [doi:10.1145/1661785.1667562](http://dx.doi.org/10.1145/1661785.1667562) 239 | 240 | 1. “[Esper Reference, Version 5.4.0](http://esper.espertech.com/release-5.4.0/esper-reference/html_single/index.html),” 241 | EsperTech, Inc., *espertech.com*, April 2016. 242 | 243 | 1. Zubair Nabi, Eric Bouillet, Andrew Bainbridge, and Chris Thomas: 244 | “[Of Streams and Storms](https://developer.ibm.com/streamsdev/wp-content/uploads/sites/15/2014/04/Streams-and-Storm-April-2014-Final.pdf),” IBM technical report, *developer.ibm.com*, April 2014. 245 | 246 | 1. Milinda Pathirage, Julian Hyde, Yi Pan, and Beth Plale: 247 | “[SamzaSQL: Scalable Fast Data Management with Streaming SQL](https://github.com/milinda/samzasql-hpbdc2016/blob/master/samzasql-hpbdc2016.pdf),” at *IEEE International Workshop on 248 | High-Performance Big Data Computing* (HPBDC), May 2016. 249 | [doi:10.1109/IPDPSW.2016.141](http://dx.doi.org/10.1109/IPDPSW.2016.141) 250 | 251 | 1. Philippe Flajolet, Éric Fusy, Olivier 252 | Gandouet, and Frédéric Meunier: 253 | “[HyperLogLog: The Analysis of a Near-Optimal Cardinality Estimation Algorithm](http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf),” 254 | at *Conference on Analysis of Algorithms* (AofA), June 2007. 255 | 256 | 1. 
Jay Kreps: 257 | “[Questioning the Lambda Architecture](https://www.oreilly.com/ideas/questioning-the-lambda-architecture),” *oreilly.com*, July 2, 2014. 258 | 259 | 1. Ian Hellström: 260 | “[An Overview of Apache Streaming Technologies](https://databaseline.bitbucket.io/an-overview-of-apache-streaming-technologies/),” *databaseline.bitbucket.io*, March 12, 2016. 261 | 262 | 1. Jay Kreps: 263 | “[Why Local State Is a Fundamental Primitive in Stream Processing](https://www.oreilly.com/ideas/why-local-state-is-a-fundamental-primitive-in-stream-processing),” *oreilly.com*, July 31, 2014. 264 | 265 | 1. Shay Banon: 266 | “[Percolator](https://www.elastic.co/blog/percolator),” *elastic.co*, February 8, 267 | 2011. 268 | 269 | 1. Alan Woodward and Martin Kleppmann: 270 | “[Real-Time Full-Text Search with Luwak and Samza](http://martin.kleppmann.com/2015/04/13/real-time-full-text-search-luwak-samza.html),” *martin.kleppmann.com*, April 13, 2015. 271 | 272 | 1. “[Apache Storm 2.1.0 Documentation](https://storm.apache.org/releases/2.1.0/index.html),” *storm.apache.org*, October 2019. 273 | 274 | 1. Tyler Akidau: 275 | “[The World Beyond Batch: Streaming 102](https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102),” *oreilly.com*, January 20, 2016. 276 | 277 | 1. Stephan Ewen: 278 | “[Streaming Analytics with Apache Flink](https://www.confluent.io/resources/kafka-summit-2016/advanced-streaming-analytics-apache-flink-apache-kafka/),” 279 | at *Kafka Summit*, April 2016. 280 | 281 | 1. Tyler Akidau, Alex Balikov, Kaya Bekiroğlu, et al.: 282 | “[MillWheel: Fault-Tolerant Stream Processing at Internet Scale](http://research.google.com/pubs/pub41378.html),” at *39th International Conference on Very Large Data Bases* (VLDB), 283 | August 2013. 284 | 285 | 1. Alex Dean: 286 | “[Improving Snowplow's Understanding of Time](http://snowplowanalytics.com/blog/2015/09/15/improving-snowplows-understanding-of-time/),” *snowplowanalytics.com*, September 15, 2015. 287 | 288 | 1. 
“[Windowing (Azure Stream Analytics)](https://msdn.microsoft.com/en-us/library/azure/dn835019.aspx),” Microsoft Azure Reference, *msdn.microsoft.com*, April 2016. 289 | 290 | 1. “[State Management](http://samza.apache.org/learn/documentation/0.10/container/state-management.html),” Apache Samza 0.10 Documentation, *samza.apache.org*, December 2015. 291 | 292 | 1. Rajagopal Ananthanarayanan, 293 | Venkatesh Basker, Sumit Das, et al.: 294 | “[Photon: Fault-Tolerant and Scalable Joining of Continuous Data Streams](http://research.google.com/pubs/pub41318.html),” at *ACM International Conference on Management of 295 | Data* (SIGMOD), June 2013. 296 | [doi:10.1145/2463676.2465272](http://dx.doi.org/10.1145/2463676.2465272) 297 | 298 | 1. Martin Kleppmann: 299 | “[Samza Newsfeed Demo](https://github.com/ept/newsfeed),” *github.com*, 300 | September 2014. 301 | 302 | 1. Ben Kirwin: 303 | “[Doing the Impossible: Exactly-Once Messaging Patterns in Kafka](http://ben.kirw.in/2014/11/28/kafka-patterns/),” *ben.kirw.in*, November 28, 2014. 304 | 305 | 1. Pat Helland: 306 | “[Data on the Outside Versus Data on the Inside](http://cidrdb.org/cidr2005/papers/P12.pdf),” at *2nd Biennial Conference on Innovative Data Systems Research* (CIDR), January 307 | 2005. 308 | 309 | 1. Ralph Kimball and Margy Ross: 310 | *The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling*, 311 | 3rd edition. John Wiley & Sons, 2013. ISBN: 978-1-118-53080-1 312 | 313 | 1. Viktor Klang: 314 | “[I'm coining the phrase 'effectively-once' for message processing with at-least-once + idempotent operations](https://twitter.com/viktorklang/status/789036133434978304),” 315 | *twitter.com*, October 20, 2016. 316 | 317 | 1. 
Matei Zaharia, Tathagata Das, Haoyuan Li, et al.: 318 | “[Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters](https://www.usenix.org/system/files/conference/hotcloud12/hotcloud12-final28.pdf),” at 319 | *4th USENIX Workshop on Hot Topics in Cloud Computing* (HotCloud), June 2012. 320 | 321 | 1. Kostas Tzoumas, Stephan Ewen, and Robert Metzger: 322 | “[High-Throughput, Low-Latency, and Exactly-Once Stream Processing with Apache Flink](https://www.ververica.com/blog/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink),” 323 | *ververica.com*, August 5, 2015. 324 | 325 | 1. Paris Carbone, Gyula Fóra, Stephan Ewen, et al.: 326 | “[Lightweight Asynchronous Snapshots for Distributed Dataflows](http://arxiv.org/abs/1506.08603),” arXiv:1506.08603 [cs.DC], June 29, 2015. 327 | 328 | 1. Ryan Betts and John Hugg: 329 | [*Fast Data: Smart and at Scale*](http://www.oreilly.com/data/free/fast-data-smart-and-at-scale.csp). 330 | Report, O'Reilly Media, October 2015. 331 | 332 | 1. Flavio Junqueira: 333 | “[Making Sense of Exactly-Once Semantics](http://conferences.oreilly.com/strata/hadoop-big-data-eu/public/schedule/detail/49690),” at *Strata+Hadoop World London*, June 2016. 334 | 335 | 1. Jason Gustafson, Flavio Junqueira, Apurva Mehta, Sriram Subramanian, and Guozhang Wang: “[KIP-98 – Exactly Once Delivery and Transactional Messaging](https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging),” *cwiki.apache.org*, November 2016. 336 | 337 | 1. Pat Helland: 338 | “[Idempotence Is Not a Medical Condition](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.401.1539&rep=rep1&type=pdf),” *Communications of the ACM*, volume 55, number 5, page 56, May 2012. 339 | [doi:10.1145/2160718.2160734](http://dx.doi.org/10.1145/2160718.2160734) 340 | 341 | 1. 
Jay Kreps: 342 | “[Re: Trying to Achieve Deterministic Behavior on Recovery/Rewind](http://mail-archives.apache.org/mod_mbox/samza-dev/201409.mbox/%3CCAOeJiJg%2Bc7Ei%3DgzCuOz30DD3G5Hm9yFY%3DUJ6SafdNUFbvRgorg%40mail.gmail.com%3E),” email to *samza-dev* mailing list, 343 | September 9, 2014. 344 | 345 | 1. E. N. (Mootaz) Elnozahy, 346 | Lorenzo Alvisi, Yi-Min Wang, and David B. Johnson: 347 | “[A Survey of Rollback-Recovery Protocols in Message-Passing Systems](http://www.cs.utexas.edu/~lorenzo/papers/SurveyFinal.pdf),” *ACM Computing Surveys*, volume 34, number 3, 348 | pages 375–408, September 2002. 349 | [doi:10.1145/568522.568525](http://dx.doi.org/10.1145/568522.568525) 350 | 351 | 1. Adam Warski: 352 | “[Kafka Streams – How Does It Fit the Stream Processing Landscape?](https://softwaremill.com/kafka-streams-how-does-it-fit-stream-landscape/),” *softwaremill.com*, June 1, 2016. 353 | 354 | -------------------------------------------------------------------------------- /chapter-12-refs.md: -------------------------------------------------------------------------------- 1 | Designing Data-Intensive Applications 2 | ===================================== 3 | 4 | Chapter 12 References 5 | -------------------- 6 | 7 | 1. Rachid Belaid: 8 | “[Postgres Full-Text Search is Good Enough!](http://rachbelaid.com/postgres-full-text-search-is-good-enough/),” *rachbelaid.com*, July 13, 2015. 9 | 10 | 1. Philippe Ajoux, Nathan Bronson, Sanjeev Kumar, et al.: 11 | “[Challenges to Adopting Stronger Consistency at Scale](https://www.usenix.org/system/files/conference/hotos15/hotos15-paper-ajoux.pdf),” at *15th USENIX Workshop on Hot Topics 12 | in Operating Systems* (HotOS), May 2015. 13 | 14 | 1. Pat Helland and Dave Campbell: 15 | “[Building on Quicksand](https://database.cs.wisc.edu/cidr/cidr2009/Paper_133.pdf),” at 16 | *4th Biennial Conference on Innovative Data Systems Research* (CIDR), January 2009. 17 | 18 | 1. 
Jessica Kerr: 19 | “[Provenance and Causality in Distributed Systems](https://web.archive.org/web/20190425150540/http://blog.jessitron.com/2016/09/provenance-and-causality-in-distributed.html),” 20 | *blog.jessitron.com*, September 25, 2016. 21 | 22 | 1. Kostas Tzoumas: 23 | “[Batch Is a Special Case of Streaming](http://data-artisans.com/batch-is-a-special-case-of-streaming/),” *data-artisans.com*, September 15, 2015. 24 | 25 | 1. Shinji Kim and Robert Blafford: 26 | “[Stream Windowing Performance Analysis: Concord and Spark Streaming](https://web.archive.org/web/20180125074821/http://concord.io/posts/windowing_performance_analysis_w_spark_streaming),” 27 | *concord.io*, July 6, 2016. 28 | 29 | 1. Jay Kreps: 30 | “[The Log: What Every Software Engineer Should Know About Real-Time Data's Unifying Abstraction](http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying),” 31 | *engineering.linkedin.com*, December 16, 2013. 32 | 33 | 1. Pat Helland: 34 | “[Life Beyond Distributed Transactions: An Apostate’s Opinion](http://www-db.cs.wisc.edu/cidr/cidr2007/papers/cidr07p15.pdf),” at *3rd Biennial Conference on Innovative Data 35 | Systems Research* (CIDR), January 2007. 36 | 37 | 1. “[Great Western Railway (1835–1948)](https://web.archive.org/web/20160122155425/https://www.networkrail.co.uk/VirtualArchive/great-western/),” 38 | Network Rail Virtual Archive, *networkrail.co.uk*. 39 | 40 | 1. Jacqueline Xu: 41 | “[Online Migrations at Scale](https://stripe.com/blog/online-migrations),” 42 | *stripe.com*, February 2, 2017. 43 | 44 | 1. Molly Bartlett Dishman and Martin Fowler: 45 | “[Agile Architecture](http://conferences.oreilly.com/software-architecture/sa2015/public/schedule/detail/40388),” at *O'Reilly Software Architecture Conference*, March 2015. 46 | 47 | 1. 
Nathan Marz and James Warren: 48 | [*Big Data: Principles and Best Practices of Scalable Real-Time Data Systems*](https://www.manning.com/books/big-data). 49 | Manning, 2015. ISBN: 978-1-617-29034-3 50 | 51 | 1. Oscar Boykin, Sam Ritchie, Ian O'Connell, and 52 | Jimmy Lin: “[Summingbird: A Framework for Integrating Batch and Online MapReduce Computations](http://www.vldb.org/pvldb/vol7/p1441-boykin.pdf),” at *40th International Conference on 53 | Very Large Data Bases* (VLDB), September 2014. 54 | 55 | 1. Jay Kreps: 56 | “[Questioning the Lambda Architecture](https://www.oreilly.com/ideas/questioning-the-lambda-architecture),” *oreilly.com*, July 2, 2014. 57 | 58 | 1. Raul Castro Fernandez, Peter Pietzuch, Jay Kreps, et al.: 59 | “[Liquid: Unifying Nearline and Offline Big Data Integration](http://cidrdb.org/cidr2015/Papers/CIDR15_Paper25u.pdf),” 60 | at *7th Biennial Conference on Innovative Data Systems Research* (CIDR), January 2015. 61 | 62 | 1. Dennis M. Ritchie and Ken Thompson: 63 | “[The UNIX Time-Sharing System](http://web.eecs.utk.edu/~qcao1/cs560/papers/paper-unix.pdf),” 64 | *Communications of the ACM*, volume 17, number 7, pages 365–375, July 1974. 65 | [doi:10.1145/361011.361061](http://dx.doi.org/10.1145/361011.361061) 66 | 67 | 1. Eric A. Brewer and Joseph M. Hellerstein: 68 | “[CS262a: Advanced Topics in Computer Systems](http://people.eecs.berkeley.edu/~brewer/cs262/systemr.html),” lecture notes, University of California, Berkeley, *cs.berkeley.edu*, 69 | August 2011. 70 | 71 | 1. Michael Stonebraker: 72 | “[The Case for Polystores](http://wp.sigmod.org/?p=1629),” *wp.sigmod.org*, 73 | July 13, 2015. 74 | 75 | 1. Jennie Duggan, 76 | Aaron J. Elmore, Michael Stonebraker, et al.: 77 | “[The BigDAWG Polystore System](http://dspace.mit.edu/openaccess-disseminate/1721.1/100936),” *ACM SIGMOD Record*, volume 44, number 2, pages 11–16, June 2015. 78 | [doi:10.1145/2814710.2814713](http://dx.doi.org/10.1145/2814710.2814713) 79 | 80 | 1. 
Patrycja Dybka: 81 | “[Foreign Data Wrappers for PostgreSQL](http://www.vertabelo.com/blog/technical-articles/foreign-data-wrappers-for-postgresql),” *vertabelo.com*, March 24, 2015. 82 | 83 | 1. David B. Lomet, Alan Fekete, Gerhard Weikum, and Mike Zwilling: 84 | “[Unbundling Transaction Services in the Cloud](https://www.microsoft.com/en-us/research/publication/unbundling-transaction-services-in-the-cloud/),” at *4th Biennial Conference on Innovative Data Systems 85 | Research* (CIDR), January 2009. 86 | 87 | 1. Martin Kleppmann and Jay Kreps: 88 | “[Kafka, Samza and the Unix Philosophy of Distributed Data](http://martin.kleppmann.com/papers/kafka-debull15.pdf),” *IEEE Data Engineering Bulletin*, volume 38, number 4, pages 4–14, 89 | December 2015. 90 | 91 | 1. John Hugg: 92 | “[Winning Now and in the Future: Where VoltDB Shines](https://voltdb.com/blog/winning-now-and-future-where-voltdb-shines),” *voltdb.com*, March 23, 2016. 93 | 94 | 1. Frank McSherry, Derek G. Murray, Rebecca Isaacs, and Michael Isard: 95 | “[Differential Dataflow](http://cidrdb.org/cidr2013/Papers/CIDR13_Paper111.pdf),” 96 | at *6th Biennial Conference on Innovative Data Systems Research* (CIDR), January 2013. 97 | 98 | 1. Derek G Murray, Frank McSherry, Rebecca Isaacs, et al.: 99 | “[Naiad: A Timely Dataflow System](http://sigops.org/s/conferences/sosp/2013/papers/p439-murray.pdf),” 100 | at *24th ACM Symposium on Operating Systems Principles* (SOSP), pages 439–455, November 2013. 101 | [doi:10.1145/2517349.2522738](http://dx.doi.org/10.1145/2517349.2522738) 102 | 103 | 1. Gwen Shapira: 104 | “[We have a bunch of customers who are implementing ‘database inside-out’ concept and they all ask ‘is anyone else doing it? are we crazy?’](https://twitter.com/gwenshap/status/758800071110430720)” *twitter.com*, July 28, 2016. 105 | 106 | 1. 
Martin Kleppmann: 107 | “[Turning the Database Inside-out with Apache Samza](http://martin.kleppmann.com/2015/03/04/turning-the-database-inside-out.html),” at *Strange Loop*, September 2014. 108 | 109 | 1. Peter Van Roy and Seif Haridi: 110 | [*Concepts, Techniques, and Models of Computer Programming*](https://www.info.ucl.ac.be/~pvr/book.html). 111 | MIT Press, 2004. ISBN: 978-0-262-22069-9 112 | 113 | 1. “[Juttle Documentation](http://juttle.github.io/juttle/),” *juttle.github.io*, 2016. 114 | 115 | 1. Evan Czaplicki and Stephen Chong: 116 | “[Asynchronous Functional Reactive Programming for GUIs](http://people.seas.harvard.edu/~chong/pubs/pldi13-elm.pdf),” at *34th ACM SIGPLAN Conference on Programming Language 117 | Design and Implementation* (PLDI), June 2013. 118 | [doi:10.1145/2491956.2462161](http://dx.doi.org/10.1145/2491956.2462161) 119 | 120 | 1. Engineer Bainomugisha, Andoni Lombide Carreton, 121 | Tom van Cutsem, Stijn Mostinckx, and Wolfgang de Meuter: 122 | “[A Survey on Reactive Programming](http://soft.vub.ac.be/Publications/2012/vub-soft-tr-12-13.pdf),” *ACM Computing Surveys*, volume 45, number 4, pages 1–34, August 2013. 123 | [doi:10.1145/2501654.2501666](http://dx.doi.org/10.1145/2501654.2501666) 124 | 125 | 1. Peter Alvaro, Neil Conway, Joseph M. Hellerstein, and William R. Marczak: 126 | “[Consistency Analysis in Bloom: A CALM and Collected Approach](https://dsf.berkeley.edu/cs286/papers/calm-cidr2011.pdf),” 127 | at *5th Biennial Conference on Innovative Data Systems Research* (CIDR), January 2011. 128 | 129 | 1. Felienne Hermans: 130 | “[Spreadsheets Are Code](https://vimeo.com/145492419),” at *Code Mesh*, November 2015. 131 | 132 | 1. Dan Bricklin and Bob 133 | Frankston: “[VisiCalc: Information from Its Creators](http://danbricklin.com/visicalc.htm),” *danbricklin.com*. 134 | 135 | 1. D. 
Sculley, Gary Holt, Daniel Golovin, et al.: 136 | “[Machine Learning: The High-Interest Credit Card of Technical Debt](http://research.google.com/pubs/pub43146.html),” at *NIPS Workshop on Software Engineering for Machine Learning* 137 | (SE4ML), December 2014. 138 | 139 | 1. Peter Bailis, Alan Fekete, Michael J Franklin, 140 | et al.: “[Feral Concurrency Control: An Empirical Investigation of Modern Application Integrity](http://www.bailis.org/papers/feral-sigmod2015.pdf),” at *ACM International Conference on 141 | Management of Data* (SIGMOD), June 2015. 142 | [doi:10.1145/2723372.2737784](http://dx.doi.org/10.1145/2723372.2737784) 143 | 144 | 1. Guy Steele: 145 | “[Re: Need for Macros (Was Re: Icon)](https://people.csail.mit.edu/gregs/ll1-discuss-archive-html/msg01134.html),” email to *ll1-discuss* mailing list, *people.csail.mit.edu*, December 24, 146 | 2001. 147 | 148 | 1. David Gelernter: 149 | “[Generative Communication in Linda](http://cseweb.ucsd.edu/groups/csag/html/teaching/cse291s03/Readings/p80-gelernter.pdf),” *ACM Transactions on Programming Languages and Systems* 150 | (TOPLAS), volume 7, number 1, pages 80–112, January 1985. 151 | [doi:10.1145/2363.2433](http://dx.doi.org/10.1145/2363.2433) 152 | 153 | 1. Patrick Th. Eugster, Pascal A. Felber, 154 | Rachid Guerraoui, and Anne-Marie Kermarrec: 155 | “[The Many Faces of Publish/Subscribe](http://www.cs.ru.nl/~pieter/oss/manyfaces.pdf),” 156 | *ACM Computing Surveys*, volume 35, number 2, pages 114–131, June 2003. 157 | [doi:10.1145/857076.857078](http://dx.doi.org/10.1145/857076.857078) 158 | 159 | 1. Ben Stopford: 160 | “[Microservices in a Streaming World](https://www.infoq.com/presentations/microservices-streaming),” at *QCon London*, March 2016. 161 | 162 | 1. 
Christian Posta: 163 | “[Why Microservices Should Be Event Driven: Autonomy vs Authority](http://blog.christianposta.com/microservices/why-microservices-should-be-event-driven-autonomy-vs-authority/),” *blog.christianposta.com*, May 27, 2016. 164 | 165 | 1. Alex Feyerke: 166 | “[Say Hello to Offline First](http://hood.ie/blog/say-hello-to-offline-first.html),” 167 | *hood.ie*, November 5, 2013. 168 | 169 | 1. Sebastian Burckhardt, Daan Leijen, Jonathan 170 | Protzenko, and Manuel Fähndrich: 171 | “[Global Sequence Protocol: A Robust Abstraction for Replicated Shared State](http://drops.dagstuhl.de/opus/volltexte/2015/5238/),” at *29th European Conference on Object-Oriented 172 | Programming* (ECOOP), July 2015. 173 | [doi:10.4230/LIPIcs.ECOOP.2015.568](http://dx.doi.org/10.4230/LIPIcs.ECOOP.2015.568) 174 | 175 | 1. Mark Soper: 176 | “[Clearing Up React Data Management Confusion with Flux, Redux, and Relay](https://medium.com/@marksoper/clearing-up-react-data-management-confusion-with-flux-redux-and-relay-aad504e63cae),” *medium.com*, December 3, 2015. 177 | 178 | 1. Eno Thereska, Damian Guy, Michael Noll, and Neha Narkhede: 179 | “[Unifying Stream Processing and Interactive Queries in Apache Kafka](http://www.confluent.io/blog/unifying-stream-processing-and-interactive-queries-in-apache-kafka/),” *confluent.io*, October 26, 2016. 180 | 181 | 1. Frank McSherry: 182 | “[Dataflow as Database](https://github.com/frankmcsherry/blog/blob/master/posts/2016-07-17.md),” *github.com*, July 17, 2016. 183 | 184 | 1. Peter Alvaro: 185 | “[I See What You Mean](https://www.youtube.com/watch?v=R2Aa4PivG0g),” at *Strange 186 | Loop*, September 2015. 187 | 188 | 1. Nathan Marz: 189 | “[Trident: A High-Level Abstraction for Realtime Computation](https://blog.twitter.com/2012/trident-a-high-level-abstraction-for-realtime-computation),” *blog.twitter.com*, August 2, 2012. 190 | 191 | 1. 
Edi Bice: 192 | “[Low Latency Web Scale Fraud Prevention with Apache Samza, Kafka and Friends](http://www.slideshare.net/edibice/extremely-low-latency-web-scale-fraud-prevention-with-apache-samza-kafka-and-friends),” at *Merchant Risk 193 | Council MRC Vegas Conference*, March 2016. 194 | 195 | 1. Charity Majors: 196 | “[The Accidental DBA](https://charity.wtf/2016/10/02/the-accidental-dba/),” *charity.wtf*, 197 | October 2, 2016. 198 | 199 | 1. Arthur J. Bernstein, Philip M. Lewis, and Shiyong Lu: 200 | “[Semantic Conditions for Correctness at Different Isolation Levels](http://db.cs.berkeley.edu/cs286/papers/isolation-icde2000.pdf),” at *16th International Conference on Data 201 | Engineering* (ICDE), February 2000. 202 | [doi:10.1109/ICDE.2000.839387](http://dx.doi.org/10.1109/ICDE.2000.839387) 203 | 204 | 1. Sudhir Jorwekar, Alan Fekete, Krithi Ramamritham, and 205 | S. Sudarshan: “[Automating the Detection of Snapshot Isolation Anomalies](http://www.vldb.org/conf/2007/papers/industrial/p1263-jorwekar.pdf),” at *33rd International Conference on Very 206 | Large Data Bases* (VLDB), September 2007. 207 | 208 | 1. Kyle Kingsbury: 209 | [Jepsen blog post series](https://aphyr.com/tags/jepsen), *aphyr.com*, 2013–2016. 210 | 211 | 1. Michael Jouravlev: 212 | “[Redirect After Post](http://www.theserverside.com/news/1365146/Redirect-After-Post),” 213 | *theserverside.com*, August 1, 2004. 214 | 215 | 1. Jerome H. Saltzer, David P. Reed, and David D. Clark: 216 | “[End-to-End Arguments in System Design](https://groups.csail.mit.edu/ana/Publications/PubPDFs/End-to-End%20Arguments%20in%20System%20Design.pdf),” 217 | *ACM Transactions on Computer Systems*, volume 2, number 4, pages 277–288, November 1984. 218 | [doi:10.1145/357401.357402](http://dx.doi.org/10.1145/357401.357402) 219 | 220 | 1. Peter Bailis, Alan Fekete, Michael J. 
Franklin, et al.: 221 | “[Coordination-Avoiding Database Systems](http://arxiv.org/pdf/1402.2237.pdf),” 222 | *Proceedings of the VLDB Endowment*, volume 8, number 3, pages 185–196, November 2014. 223 | 224 | 1. Alex Yarmula: 225 | “[Strong Consistency in Manhattan](https://blog.twitter.com/2016/strong-consistency-in-manhattan),” *blog.twitter.com*, March 17, 2016. 226 | 227 | 1. Douglas B. Terry, Marvin M. Theimer, Karin Petersen, et al.: 228 | “[Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System](http://css.csail.mit.edu/6.824/2014/papers/bayou-conflicts.pdf),” at *15th ACM Symposium on Operating 229 | Systems Principles* (SOSP), pages 172–182, December 1995. 230 | [doi:10.1145/224056.224070](http://dx.doi.org/10.1145/224056.224070) 231 | 232 | 1. Jim Gray: 233 | “[The Transaction Concept: Virtues and Limitations](http://jimgray.azurewebsites.net/papers/thetransactionconcept.pdf),” 234 | at *7th International Conference on Very Large Data Bases* (VLDB), September 1981. 235 | 236 | 1. Hector Garcia-Molina and Kenneth Salem: 237 | “[Sagas](http://www.cs.cornell.edu/andru/cs711/2002fa/reading/sagas.pdf),” at 238 | *ACM International Conference on Management of Data* (SIGMOD), May 1987. 239 | [doi:10.1145/38713.38742](http://dx.doi.org/10.1145/38713.38742) 240 | 241 | 1. Pat Helland: 242 | “[Memories, Guesses, and Apologies](https://web.archive.org/web/20160304020907/http://blogs.msdn.com/b/pathelland/archive/2007/05/15/memories-guesses-and-apologies.aspx),” 243 | *blogs.msdn.com*, May 15, 2007. 244 | 245 | 1. Yoongu Kim, Ross Daly, Jeremie Kim, et al.: 246 | “[Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors](https://users.ece.cmu.edu/~yoonguk/papers/kim-isca14.pdf),” at *41st Annual 247 | International Symposium on Computer Architecture* (ISCA), June 2014. 248 | [doi:10.1145/2678373.2665726](http://dx.doi.org/10.1145/2678373.2665726) 249 | 250 | 1.
Mark Seaborn and Thomas Dullien: 251 | “[Exploiting the DRAM Rowhammer Bug to Gain Kernel Privileges](https://googleprojectzero.blogspot.co.uk/2015/03/exploiting-dram-rowhammer-bug-to-gain.html),” *googleprojectzero.blogspot.co.uk*, March 9, 252 | 2015. 253 | 254 | 1. Jim N. Gray and Catharine van Ingen: 255 | “[Empirical Measurements of Disk Failure Rates and Error Rates](https://www.microsoft.com/en-us/research/publication/empirical-measurements-of-disk-failure-rates-and-error-rates/),” Microsoft Research, MSR-TR-2005-166, 256 | December 2005. 257 | 258 | 1. Annamalai Gurusami and Daniel Price: 259 | “[Bug #73170: Duplicates in Unique Secondary Index Because of Fix of Bug#68021](http://bugs.mysql.com/bug.php?id=73170),” *bugs.mysql.com*, July 2014. 260 | 261 | 1. Gary Fredericks: 262 | “[Postgres Serializability Bug](https://github.com/gfredericks/pg-serializability-bug),” 263 | *github.com*, September 2015. 264 | 265 | 1. Xiao Chen: 266 | “[HDFS DataNode Scanners and Disk Checker Explained](http://blog.cloudera.com/blog/2016/12/hdfs-datanode-scanners-and-disk-checker-explained/),” *blog.cloudera.com*, December 20, 267 | 2016. 268 | 269 | 1. Jay Kreps: 270 | “[Getting Real About Distributed System Reliability](http://blog.empathybox.com/post/19574936361/getting-real-about-distributed-system-reliability),” *blog.empathybox.com*, March 19, 2012. 271 | 272 | 1. Martin Fowler: 273 | “[The LMAX Architecture](http://martinfowler.com/articles/lmax.html),” 274 | *martinfowler.com*, July 12, 2011. 275 | 276 | 1. Sam Stokes: 277 | “[Move Fast with Confidence](http://blog.samstokes.co.uk/blog/2016/07/11/move-fast-with-confidence/),” *blog.samstokes.co.uk*, July 11, 2016. 278 | 279 | 1. “[Hyperledger Sawtooth documentation](https://sawtooth.hyperledger.org/docs/core/releases/latest/introduction.html),” 280 | Intel Corporation, *sawtooth.hyperledger.org*, 2017. 281 | 282 | 1. 
Richard Gendal Brown: 283 | “[Introducing R3 Corda™: A Distributed Ledger Designed for Financial Services](https://gendal.me/2016/04/05/introducing-r3-corda-a-distributed-ledger-designed-for-financial-services/),” *gendal.me*, April 5, 2016. 284 | 285 | 1. Trent McConaghy, Rodolphe Marques, Andreas Müller, et al.: 286 | “[BigchainDB: A Scalable Blockchain Database](https://www.bigchaindb.com/whitepaper/bigchaindb-whitepaper.pdf),” *bigchaindb.com*, June 8, 2016. 287 | 288 | 1. Ralph C. Merkle: 289 | “[A Digital Signature Based on a Conventional Encryption Function](https://people.eecs.berkeley.edu/~raluca/cs261-f15/readings/merkle.pdf),” at *CRYPTO '87*, August 1987. 290 | [doi:10.1007/3-540-48184-2_32](http://dx.doi.org/10.1007/3-540-48184-2_32) 291 | 292 | 1. Ben Laurie: 293 | “[Certificate Transparency](http://queue.acm.org/detail.cfm?id=2668154),” *ACM 294 | Queue*, volume 12, number 8, pages 10–19, August 2014. 295 | [doi:10.1145/2668152.2668154](http://dx.doi.org/10.1145/2668152.2668154) 296 | 297 | 1. Mark D. Ryan: 298 | “[Enhanced Certificate Transparency and End-to-End Encrypted Mail](https://www.ndss-symposium.org/wp-content/uploads/2017/09/12_2_1.pdf),” 299 | at *Network and Distributed System Security Symposium* (NDSS), February 2014. 300 | [doi:10.14722/ndss.2014.23379](http://dx.doi.org/10.14722/ndss.2014.23379) 301 | 302 | 1. “[Software Engineering Code of Ethics and Professional Practice](http://www.acm.org/about/se-code),” 303 | Association for Computing Machinery, *acm.org*, 1999. 304 | 305 | 1. “[ACM Code of Ethics and Professional Conduct](https://www.acm.org/code-of-ethics),” 306 | Association for Computing Machinery, *acm.org*, 2018. 307 | 308 | 1. François Chollet: 309 | “[Software development is starting to involve important ethical choices](https://twitter.com/fchollet/status/792958695722201088),” *twitter.com*, October 30, 2016. 310 | 311 | 1.
Igor Perisic: 312 | “[Making Hard Choices: The Quest for Ethics in Machine Learning](https://engineering.linkedin.com/blog/2016/11/making-hard-choices--the-quest-for-ethics-in-machine-learning),” *engineering.linkedin.com*, November 313 | 2016. 314 | 315 | 1. John Naughton: 316 | “[Algorithm Writers Need a Code of Conduct](https://www.theguardian.com/commentisfree/2015/dec/06/algorithm-writers-should-have-code-of-conduct),” *theguardian.com*, December 6, 2015. 317 | 318 | 1. Logan Kugler: 319 | “[What Happens When Big Data Blunders?](http://cacm.acm.org/magazines/2016/6/202655-what-happens-when-big-data-blunders/fulltext),” *Communications of the ACM*, volume 59, number 6, pages 320 | 15–16, June 2016. [doi:10.1145/2911975](http://dx.doi.org/10.1145/2911975) 321 | 322 | 1. Bill Davidow: 323 | “[Welcome to Algorithmic Prison](http://www.theatlantic.com/technology/archive/2014/02/welcome-to-algorithmic-prison/283985/),” *theatlantic.com*, February 20, 2014. 324 | 325 | 1. Don Peck: 326 | “[They're Watching You at Work](http://www.theatlantic.com/magazine/archive/2013/12/theyre-watching-you-at-work/354681/),” *theatlantic.com*, December 2013. 327 | 328 | 1. Leigh Alexander: 329 | “[Is an Algorithm Any Less Racist Than a Human?](https://www.theguardian.com/technology/2016/aug/03/algorithm-racist-human-employers-work)” *theguardian.com*, August 3, 2016. 330 | 331 | 1. Jesse Emspak: 332 | “[How a Machine Learns Prejudice](https://www.scientificamerican.com/article/how-a-machine-learns-prejudice/),” *scientificamerican.com*, December 29, 2016. 333 | 334 | 1. Maciej Cegłowski: 335 | “[The Moral Economy of Tech](http://idlewords.com/talks/sase_panel.htm),” 336 | *idlewords.com*, June 2016. 337 | 338 | 1. Cathy O'Neil: 339 | [*Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy*](https://weaponsofmathdestructionbook.com/). 340 | Crown Publishing, 2016. ISBN: 978-0-553-41881-1 341 | 342 | 1. 
Julia Angwin: 343 | “[Make Algorithms Accountable](http://www.nytimes.com/2016/08/01/opinion/make-algorithms-accountable.html),” *nytimes.com*, August 1, 2016. 344 | 345 | 1. Bryce Goodman and Seth Flaxman: 346 | “[European Union Regulations on Algorithmic Decision-Making and a ‘Right to Explanation’](https://arxiv.org/abs/1606.08813),” *arXiv:1606.08813*, August 31, 347 | 2016. 348 | 349 | 1. “[A Review of the Data Broker Industry: Collection, Use, and Sale of Consumer Data for Marketing Purposes](http://educationnewyork.com/files/rockefeller_databroker.pdf),” 350 | Staff Report, *United States Senate Committee on Commerce, Science, and Transportation*, *commerce.senate.gov*, December 2013. 351 | 352 | 1. Olivia Solon: 353 | “[Facebook’s Failure: Did Fake News and Polarized Politics Get Trump Elected?](https://www.theguardian.com/technology/2016/nov/10/facebook-fake-news-election-conspiracy-theories)” *theguardian.com*, November 10, 354 | 2016. 355 | 356 | 1. Donella H. Meadows and Diana Wright: 357 | *Thinking in Systems: A Primer*. Chelsea Green Publishing, 2008. ISBN: 978-1-603-58055-7 358 | 359 | 1. Daniel J. Bernstein: 360 | “[Listening to a ‘big data’/‘data science’ talk](https://twitter.com/hashbreaker/status/598076230437568512),” *twitter.com*, May 12, 2015. 361 | 362 | 1. Marc Andreessen: 363 | “[Why Software Is Eating the World](http://genius.com/Marc-andreessen-why-software-is-eating-the-world-annotated),” *The Wall Street Journal*, August 20, 2011. 364 | 365 | 1. J. M. Porup: 366 | “[‘Internet of Things’ Security Is Hilariously Broken and Getting Worse](http://arstechnica.com/security/2016/01/how-to-search-the-internet-of-things-for-photos-of-sleeping-babies/),” *arstechnica.com*, January 23, 2016. 367 | 368 | 1. Bruce Schneier: 369 | [*Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World*](https://www.schneier.com/books/data_and_goliath/). 370 | W. W. Norton, 2015. ISBN: 978-0-393-35217-7 371 | 372 | 1.
The Grugq: 373 | “[Nothing to Hide](https://grugq.tumblr.com/post/142799983558/nothing-to-hide),” 374 | *grugq.tumblr.com*, April 15, 2016. 375 | 376 | 1. Tony Beltramelli: 377 | “[Deep-Spying: Spying Using Smartwatch and Deep Learning](https://arxiv.org/abs/1512.05616),” Master's Thesis, IT University of Copenhagen, December 2015. Available at 378 | *arxiv.org/abs/1512.05616* 379 | 380 | 1. Shoshana Zuboff: 381 | “[Big Other: Surveillance Capitalism and the Prospects of an Information Civilization](http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2594754),” *Journal of Information 382 | Technology*, volume 30, number 1, pages 75–89, April 2015. 383 | [doi:10.1057/jit.2015.5](http://dx.doi.org/10.1057/jit.2015.5) 384 | 385 | 1. Carina C. Zona: 386 | “[Consequences of an Insightful Algorithm](https://www.youtube.com/watch?v=YRI40A4tyWU),” 387 | at *GOTO Berlin*, November 2016. 388 | 389 | 1. Bruce Schneier: 390 | “[Data Is a Toxic Asset, So Why Not Throw It Out?](https://www.schneier.com/essays/archives/2016/03/data_is_a_toxic_asse.html),” *schneier.com*, March 1, 2016. 391 | 392 | 1. John E. Dunn: 393 | “[The UK’s 15 Most Infamous Data Breaches](http://www.techworld.com/security/uks-most-infamous-data-breaches-2016-3604586/),” *techworld.com*, November 18, 2016. 394 | 395 | 1. Cory Scott: 396 | “[Data is not toxic - which implies no benefit - but rather hazardous material, where we must balance need vs. want](https://twitter.com/cory_scott/status/706586399483437056),” 397 | *twitter.com*, March 6, 2016. 398 | 399 | 1. Bruce Schneier: 400 | “[Mission Creep: When Everything Is Terrorism](https://www.schneier.com/essays/archives/2013/07/mission_creep_when_e.html),” *schneier.com*, July 16, 2013. 401 | 402 | 1. Lena Ulbricht and Maximilian von Grafenstein: 403 | “[Big Data: Big Power Shifts?](http://policyreview.info/articles/analysis/big-data-big-power-shifts),” *Internet Policy Review*, volume 5, number 1, March 2016.
404 | [doi:10.14763/2016.1.406](http://dx.doi.org/10.14763/2016.1.406) 405 | 406 | 1. Ellen P. Goodman and Julia Powles: 407 | “[Facebook and Google: Most Powerful and Secretive Empires We've Ever Known](https://www.theguardian.com/technology/2016/sep/28/google-facebook-powerful-secretive-empire-transparency),” *theguardian.com*, September 28, 408 | 2016. 409 | 410 | 1. [Directive 95/46/EC on the protection of individuals with regard to the processing of personal data and on the free movement of such data](http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:31995L0046), Official Journal of the European Communities No. L 281/31, 411 | *eur-lex.europa.eu*, November 1995. 412 | 413 | 1. Brendan Van Alsenoy: 414 | “[Regulating Data Protection: The Allocation of Responsibility and Risk Among Actors Involved in Personal Data Processing](https://lirias.kuleuven.be/handle/123456789/545027),” 415 | Thesis, KU Leuven Centre for IT and IP Law, August 2016. 416 | 417 | 1. Michiel Rhoen: 418 | “[Beyond Consent: Improving Data Protection Through Consumer Protection Law](http://policyreview.info/articles/analysis/beyond-consent-improving-data-protection-through-consumer-protection-law),” *Internet Policy 419 | Review*, volume 5, number 1, March 2016. 420 | [doi:10.14763/2016.1.404](http://dx.doi.org/10.14763/2016.1.404) 421 | 422 | 1. Jessica Leber: 423 | “[Your Data Footprint Is Affecting Your Life in Ways You Can’t Even Imagine](https://www.fastcoexist.com/3057514/your-data-footprint-is-affecting-your-life-in-ways-you-cant-even-imagine),” *fastcoexist.com*, March 15, 424 | 2016. 425 | 426 | 1. Maciej Cegłowski: 427 | “[Haunted by Data](http://idlewords.com/talks/haunted_by_data.htm),” *idlewords.com*, 428 | October 2015. 429 | 430 | 1. Sam Thielman: 431 | “[You Are Not What You Read: Librarians Purge User Data to Protect Privacy](https://www.theguardian.com/us-news/2016/jan/13/us-library-records-purged-data-privacy),” *theguardian.com*, 432 | January 13, 2016. 
433 | 434 | 1. Conor Friedersdorf: 435 | “[Edward Snowden’s Other Motive for Leaking](http://www.theatlantic.com/politics/archive/2014/05/edward-snowdens-other-motive-for-leaking/370068/),” *theatlantic.com*, May 13, 2014. 436 | 437 | 1. Phillip Rogaway: 438 | “[The Moral Character of Cryptographic Work](http://web.cs.ucdavis.edu/~rogaway/papers/moral-fn.pdf),” Cryptology ePrint 2015/1162, December 2015. 439 | 440 | -------------------------------------------------------------------------------- /ddia-poster.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wikibook/data-intensive-applications/0e9b3790f436c1628a515825c9f1149d7a765d14/ddia-poster.jpg -------------------------------------------------------------------------------- /ddia-poster.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wikibook/data-intensive-applications/0e9b3790f436c1628a515825c9f1149d7a765d14/ddia-poster.pdf --------------------------------------------------------------------------------