├── README.md
├── clojure
│   ├── 20130821-testing-principles.md
│   ├── 20130926-data-representation.md
│   └── 20130927-ns-organization.md
├── git
│   └── 20140403-git.md
├── ios
│   └── 20131009-profiling-using-instruments.md
└── swe
    ├── Migrations_and_future_proofing.md
    └── migrations
        ├── 1.png
        ├── 1.svg
        ├── 2.png
        ├── 2.svg
        ├── 3.png
        ├── 3.svg
        ├── 4.png
        ├── 4.svg
        ├── 5.png
        ├── 5.svg
        ├── 6.png
        ├── 6.svg
        ├── 7.png
        └── 7.svg
/README.md:
--------------------------------------------------------------------------------
1 | Prismatic's Engineering Practices Sessions
2 | =============
3 | Every week, all the Prismatic engineers get together for an Eng Practices session.
4 | We use it as a venue for us to each share our individual experience
5 | and talk about how we can become better software engineers.
6 | We've polished up some of our better discussions and released them here.
7 | Hopefully you too will find them to be a useful tool for improving your craft.
8 |
9 | ## Software Engineering
10 | - [Prismatic Git practices](git/20140403-git.md): our branching model, PR and commit rules, etc.
11 | - [Migrations and Future Proofing](swe/Migrations_and_future_proofing.md): recipes for migrations, and suggestions for forward compatibility.
12 |
13 | ## Clojure
14 | - [Data Representation in Clojure](clojure/20130926-data-representation.md): design choices for when to use maps, defrecords, deftypes, and reify.
15 | - [Readable Clojure ns Layout](clojure/20130927-ns-organization.md): conventions about the layout of a namespace file.
16 | - [Testing Principles](clojure/20130821-testing-principles.md): how and why to write good tests.
17 |
18 | ## iOS
19 | - [Profiling and Debugging](ios/20131009-profiling-using-instruments.md): an overview of how to effectively use Instruments and the debugger.
20 |
--------------------------------------------------------------------------------
/clojure/20130821-testing-principles.md:
--------------------------------------------------------------------------------
1 | # A Few Testing Principles
2 |
3 | (by [ihat](http://github.com/ihat))
4 |
5 | ## Agenda
6 |
7 | 1. [Why test?](#why)
8 | 2. [Different types of tests](#types)
9 | 3. [Real world testing](#real)
10 |
11 | ### 1. Why test?
12 |
13 | #### A short story
14 |
15 | This past Monday, @cpetzold and I paired on creating activity notifications for comment submissions.
16 |
17 | It turned out that the submit-comment API endpoint contained a lot of (unrefactored) logic, not only for the creation of the comment but also for email notifications to the original poster. We wanted to extract that functionality without breaking the endpoint.
18 |
19 | Fortunately, there were tests! @cpetzold and I refactored with confidence, and eventually slimmed down the submit comment test and created finer-grained unit tests around email notifications and activity notifications.
20 |
21 | #### Moral of the story
22 |
23 | The three goals of testing (from ["Fast Test, Slow Test"](http://www.youtube.com/watch?v=RAxiiRPHS9k), Gary Bernhardt)
24 |
25 | 1. Prevent regression
26 | 2. Prevent fear
27 | 3. Prevent bad design
28 |
29 | The short story shows:
30 |
31 | - (Prevent regression) We could introduce change that would not break current functionality.
32 | - (Prevent regression) "If you don't test your code, your customers will" ([Pragmatic Programmer](http://www.amazon.com/Pragmatic-Programmer-Journeyman-Master-ebook/dp/B000SEGEKI)).
33 | - (Prevent fear) The tests encouraged us to refactor aggressively.
34 | - (Prevent fear) The tests served as documentation and contract; got us up to speed faster.
35 |
36 | I'm skipping how testing drives the code design process (TDD) for now. But generally speaking, if you are able to write testable code, then chances are the code is loosely coupled and highly cohesive.
37 |
38 | #### Two more reasons!
39 |
40 | I learned these the hard way from past and current jobs.
41 |
42 | 1. When the person who wrote the code (where the mistake happened) and the person testing it are different, the pain is not felt where it should be felt.
43 | 2. Consider the lifetime value of code: don't optimize for write-only.
44 |
45 |
46 | ### 2. Different types of tests: unit, integration and acceptance tests
47 | The following material is a summary of these two sources:
48 |
49 | - [Clean Code Talks](http://misko.hevery.com/2008/11/04/clean-code-talks-unit-testing/) by Misko Hevery (author of AngularJS)
50 | - [Growing Object-Oriented Software, Guided by Tests](http://www.amazon.com/Growing-Object-Oriented-Software-Guided-Tests/dp/0321503627) by Freeman, Pryce
51 |
52 | #### Analogy: Testing a car
53 |
54 | - Framework for testing a car: build a machine that pretends to be a driver, stepping on the brakes and accelerator, etc.
55 | - Known as a scenario test
56 | - Prove the car works
57 | - The problem is with the execution
58 | - It's horribly slow
59 | - When something breaks, it doesn't pinpoint what's wrong
60 | - You'd want to test each main component separately
61 | - e.g., the powertrain, which consists of the engine, transmission, drive shafts, differentials, etc.
62 | - And you may want to isolate a component of the powertrain for testing
63 | - e.g., the engine
64 |
65 |
66 | #### Three types of tests defined
67 |
68 | Unit tests
69 |
70 | - What: Test individual methods in isolation
71 | - Answers: Do our functions do the right thing? Are they convenient to work with?
72 |
73 | Integration tests
74 |
75 | - What: Test a collection of functions across namespaces as subsystems
76 | - Answers: Does our code work against code we can't change?
77 | - Examples:
78 | - In Rails / Django, these are typically the controller tests that require the database
79 |     - Any test that involves actual MySQL, Mongo, or other third-party software for which we haven't provided a test substitute
80 |
81 | Acceptance tests
82 |
83 | - What: Test the whole system pretending to be the user
84 | - Answers: Does the whole system work?
85 | - Examples:
86 | - Spinning up selenium
87 | - api/system-test: spinning up server
88 |
89 |
90 |
91 | #### Trade-offs between these tests
92 |
93 | 
94 |
95 |
96 | Acceptance tests:
97 | - High confidence happy paths OK
98 | - Hard / slow to reproduce
99 | - Many things come into play
100 | - Highest customer / external feedback
101 | - Slow!
102 |
103 | Integration tests:
104 | - High confidence integrated sub-systems OK
105 | - Lower app coverage, need more of them
106 | - Still need to debug what went wrong
107 |
108 | Unit tests:
109 | - High confidence the function under test OK
110 | - Need lots of these (a good thing)
111 | - Quickest developer feedback
112 | - Shows developers the internal quality
113 |
114 |
115 | #### "Integration tests are a scam"
116 |
117 | (see [The Code Whisperer](http://blog.thecodewhisperer.com/2010/10/16/integrated-tests-are-a-scam/))
118 |
119 | - Path count: integration tests need on the order of 2^n tests to cover all execution paths
120 |   - every `if` statement or `try` ... `catch` doubles the number of paths
121 |   - to test the whole thing end-to-end
122 |   - with 500 such conditions, that's 2^500 paths
123 | - Suite runtime also grows superlinearly as the system grows
124 | - The worst feeling ever? Make a bunch of changes, then run the system test or selenium test, and
125 |   a whole bunch of stuff breaks, e.g., a random 500 response. Worst. Start up the logger and
126 |   off we go...
127 |
128 |
129 | ### 3. Real world testing
130 |
131 | > "There is no secret to writing tests... there are only secrets to writing testable code"
132 |
133 | > -- Misko Hevery
134 |
135 |
136 | #### In an ideal world
137 |
138 | - All functions are pure
139 | - All computation is local
140 | - Data in and data out (no side effects)
141 | - Testing is simple: you'd just assert the expected output given input
142 |
143 | The idealized view is:
144 | ```
145 | x -> (f) -> y
146 | ```
147 | ... where `x`, `y` are data and `(f)` is a function.
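For instance, here's what that looks like with `clojure.test` (a minimal sketch; `munge-public-id` is a hypothetical helper):

```clojure
(ns example.pure-test
  (:require [clojure.string :as str]
            [clojure.test :refer [deftest is]]))

;; A pure function: data in, data out, no side effects.
(defn munge-public-id [public-id]
  (str/replace public-id "-" "_"))

;; Testing it is just asserting the expected output for a given input.
(deftest munge-public-id-test
  (is (= "foobar_cookie" (munge-public-id "foobar-cookie"))))
```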
148 |
149 |
150 |
151 |
152 | #### But real life is messy
153 |
154 | - Real life is full of mutation and state, so we write functions that:
155 | - Have side effects, e.g.,
156 | - Creating a user record in a database
157 | - Issues an API request to mark docs as viewed
158 | - Depend on other functions that have side effects
159 |
160 | Our simple diagram above now looks like this:
161 | ```
162 | x -> (f) -> y
163 | |
164 | ---> (g)
165 | ```
166 |
167 |
168 |
169 | #### Rules of the game change when using functions with side effects
170 |
171 | - The order of evaluation matters
172 | - We need to know the context in which a function is run
173 | - Testing is no longer simple
174 |
175 |
176 |
177 | Scenario: How would you add functionality to the following untested code?
178 | - We would like to record public linkage events to Mixpanel
179 | - The API endpoints end up calling `record-public-linkage-event` below:
180 |
181 | ```clojure
182 | (ns user
183 |   (:require [clojure.test :as test]
184 |             [user-data.event-log :as event-log]))
185 |
186 | ;; (assumes the user-data.event-log namespace is defined elsewhere and already loaded)
187 |
188 | (defn record-public-linkage-event
189 | [mongo-datastore env request user-id event-body]
190 | (let [event-type (:type event-body)
191 | public-id (:public-id event-body)
192 | event-payload (assoc event-body
193 | :munged_public_id (.replace public-id "-" "_")
194 | :user_id user-id)]
195 | (event-log/write-public-linkage-event mongo-datastore event-payload)))
196 | ```
197 |
198 |
199 | What is this function doing?
200 | - it munges `request`, `user-id` and `event-body` into a form suitable for logging (functional)
201 | - it calls `event-log/write-public-linkage-event`, a collaborating function
202 |
203 |
204 | Suppose we wanted to test this, how do we do it?
205 | 1. First write the expectation
206 | - `event-log/write-public-linkage-event` should receive `mongo-datastore` and the munged event data
207 |
208 | ```clojure
209 | (deftest record-public-linkage-event-test
210 | (record-public-linkage-event mongo-datastore env request user-id event-body)
211 | (testing "event-log/write-public-linkage-event should be called with transformed event"
212 |     (should-receive event-log/write-public-linkage-event
213 |                     mongo-datastore
214 |                     munged-event-body)))
215 | ```
216 |
217 |
218 | 2. Fill in the details...
219 |
220 | ```clojure
221 | (deftest record-public-linkage-event-test
222 | (let [mongo-datastore (gensym)
223 | env :test
224 | request {:cookies {"stage_p_public" {:value "foobar-cookie"}}}
225 | user-id (rand-int 10)
226 | event-body {:type "public-linkage" :key1 "val1"}]
227 |
228 | (testing "event-log/write-public-linkage-event should be called with transformed event"
229 | (record-public-linkage-event mongo-datastore env request user-id event-body))))
230 | ```
231 |
232 | But how do I assert? Use a test double! (For details, see [Martin Fowler's blog](http://martinfowler.com/articles/mocksArentStubs.html).)
233 |
234 | ```clojure
235 | (defmacro test-double
236 | "Creates a test double of a function given a fully qualified function name."
237 | ([] `(gensym))
238 |
239 | ([fqfn]
240 | (let [args (first (:arglists (meta (resolve fqfn))))
241 | args-count (count args)]
242 | `(let [received-args# (repeatedly ~args-count #(atom nil))]
243 | (reify
244 | clojure.lang.IFn
245 | (invoke [this# ~@args]
246 | (doseq [[arg# received#] (map list ~args received-args#)]
247 | (reset! received# arg#)))
248 |
249 | clojure.lang.ILookup
250 | (valAt [this# k# not-found#]
251 | (-> (map (fn [arg# received#] [(keyword arg#) @received#]) '~args received-args#)
252 | (#(into {} %))
253 | (get k# not-found#)))
254 |
255 | (valAt [this# k#] (.valAt this# k# nil)))))))
256 |
257 | (test/deftest record-public-linkage-event-test
258 | (let [mongo-datastore (test-double)
259 | env :test
260 | request {:cookies {"stage_p_public" {:value "foobar-cookie"}}}
261 | user-id (rand-int 10)
262 | event-body {:type "public-linkage" :key1 "val1" :public-id "foobar-cookie"}
263 | event-log-double (test-double event-log/write-public-linkage-event)
264 |
265 | expected-event {:type "public-linkage"
266 | :key1 "val1"
267 | :public-id "foobar-cookie"
268 | :munged_public_id "foobar_cookie"
269 | :user_id user-id}]
270 |
271 | (with-redefs [event-log/write-public-linkage-event event-log-double]
272 | (test/testing "event-log/write-public-linkage-event should be called with transformed event"
273 | (record-public-linkage-event mongo-datastore env request user-id event-body)
274 | (test/is (= mongo-datastore (:mongo-datastore event-log-double)))
275 | (test/is (= expected-event (:event event-log-double)))))))
276 | ```
277 |
278 |
279 | The `with-redefs` call is what *isolates* the function under test.
280 |
281 | Taking a step back, the only thing worth testing in this function is the data manipulation, and that's borderline trivial. The data manipulation part could in principle be decoupled and tested. Whether the components are hooked up may be tested via an integration test, but we don't get much benefit out of that test.
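For example, the payload construction could be pulled out into a pure function and unit tested directly, with no doubles required (a sketch; `public-linkage-event-payload` is a hypothetical name):

```clojure
;; Pure data manipulation, extracted from record-public-linkage-event.
(defn public-linkage-event-payload [user-id event-body]
  (assoc event-body
    :munged_public_id (.replace (:public-id event-body) "-" "_")
    :user_id user-id))

;; The side-effecting wrapper shrinks to a thin shell...
(defn record-public-linkage-event
  [mongo-datastore env request user-id event-body]
  (event-log/write-public-linkage-event
   mongo-datastore
   (public-linkage-event-payload user-id event-body)))

;; ...and the interesting logic gets a plain unit test.
(test/deftest public-linkage-event-payload-test
  (test/is (= {:type "public-linkage"
               :public-id "foobar-cookie"
               :munged_public_id "foobar_cookie"
               :user_id 101}
              (public-linkage-event-payload
               101 {:type "public-linkage" :public-id "foobar-cookie"}))))
```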
282 |
283 | One wrinkle in the above example: it turns out `event-log/write-public-linkage-event` had hidden dependencies defined elsewhere in the `event-log` namespace. This is a good example of how implicit dependencies can hurt both testability and our ability to reason about the code.
284 |
285 | #### Holy grail
286 |
287 | Have a functional core (data manipulation)
288 | - Lots of decision paths, no dependencies, isolated
289 | - Conducive to unit testing
290 |
291 | Surrounded by an imperative shell (side effects)
292 | - Lots of dependencies, few decision paths
293 | - Conducive to integration testing
294 |
295 |
296 | One could actually make the argument that this code is not worth testing because it's so obviously correct.
297 |
298 | But now that we have a test in place, we know that changes to this will:
299 | - Prevent regression
300 | - Prevent fear of refactoring
301 |
302 | ```clojure
303 | (deftest record-public-linkage-event-test
304 | (let [mongo-datastore (test-double)
305 | env :test
306 | request {:cookies {"stage_p_public" {:value "foobar-cookie"}}}
307 | user-id 101
308 | event-body {:type "public-linkage" :key1 "val1"}
309 | event-log-double (test-double user-data.event-log/write-public-linkage-event)
310 | mixpanel-double (test-double mixpanel/enqueue-event!)
311 |
312 | expected-event {:type "public-linkage"
313 | :key1 "val1"
314 | :public_id "foobar-cookie"
315 | :user_id user-id}]
316 |
317 | (with-redefs [event-log/write-public-linkage-event event-log-double
318 | mixpanel/enqueue-event! mixpanel-double]
319 | (record-public-linkage-event mongo-datastore env request user-id event-body)
320 |
321 | (testing "event-log/write-public-linkage-event should be called with transformed event"
322 |       (is-= mongo-datastore (:mongo-datastore event-log-double))
323 |       (is-= expected-event (:event event-log-double)))
324 |
325 | (testing "mixpanel/enqueue-event! should be called with transformed event"
326 |       ;; I can now add a test to check whether mixpanel is called properly, perhaps with more data transformation as well.
327 | ))))
328 | ```
329 |
330 |
331 | #### Other (inchoate) critiques of this function
332 |
333 | - I shouldn't need to know that `event-log/write-public-linkage-event` requires `mongo-datastore`
334 | - How can we best separate instantiation of object graph from business logic under test?
335 |
336 |
337 | #### Parting thoughts
338 |
339 | - Testing is all about trade-offs
340 | - [Test Driven Development (TDD) is a pragmatic choice](http://blog.8thlight.com/uncle-bob/2013/03/06/ThePragmaticsOfTDD.html)
341 | - When you are optimizing for the very short term, TDD may be too costly, as in the case of:
342 | - GUIs
343 | - functions that are trivial and obviously correct
344 | - There are a few limited cases where the benefits of testing are outweighed by the cost of maintenance.
345 | - Some things are also very hard to test, e.g., UI, especially when they are still being designed.
346 | - Many front-end projects don't have large testing components.
347 | - But remember, programmers are constantly in maintenance mode. Optimize your code for readability and maintenance, not the first 10 minutes of its life.
348 |
--------------------------------------------------------------------------------
/clojure/20130926-data-representation.md:
--------------------------------------------------------------------------------
1 | "Objects" in Clojure: design choices for data and (polymorphic) operations
2 | ----
3 | (by [w01fe](http://github.com/w01fe))
4 |
5 | In Clojure, there are a potentially daunting number of ways to represent a slice of data that would have been an Object in an OO-land. For example, we could represent a `Person` with a `firstName` and `lastName` as any of:
6 |
7 | - tuple: `["joe" "schmoe"]`
8 | - plain old map: `{:first-name "joe" :last-name "schmoe"}`
9 | - struct-map: (Clojure 1.0, basically subsumed by records, forget I mentioned them.)
10 | - defrecord: `(defrecord Person [first-name last-name])`
11 | - deftype: `(deftype Person [first-name last-name])`
12 | - reify: `(defn person [first last] (reify Human (first-name [this] first) (last-name [this] last)))`
13 |
14 | At first, I thought this session would just cover these data representations. But, the whole reason we care about data representation is because we want to make it easy to do the **operations** we want on our data -- thus, it makes no sense to think about data in the absence of functions. A complicating factor is that we sometimes want these functions to be **polymorphic** -- that is, work (differently) across a variety of different data types. Again, we are provided with a family of options:
15 |
16 | - plain old functions (with `instance?` and explicit conditional logic for polymorphism)
17 | - multimethods
18 | - protocols
19 | - raw Java interfaces
20 |
21 | We'll start by surveying these ingredients and their pros and cons independently, and then discuss some "recipes" for combining them in ways I've found fruitful in various circumstances.
22 |
23 |
24 | ## Operations
25 |
26 | While data abstractions are arguably more fundamental, we'll start by reviewing the options for operations on data. This way, we'll have full context when we get to the data types. Here's a table describing major features of the four options mentioned above.
27 |
28 |
111 |
112 |
113 |
114 | ## Data types
115 |
116 | Moving on to data types, we'll cover the features of the four main contenders above. We intentionally omit
117 |
118 | - tuples, which in our experience are rarely the right choice, and certainly aren't well-suited for polymorphism
119 | - `struct-map`s, which are deprecated
120 | - `proxy` and `gen-class`, which exist primarily for Java interop.
121 |
122 |
123 |
245 |
246 |
247 | ## Recipes
248 |
249 | Now I understand how all the pieces work. But how should I represent my Widget?
250 |
251 | This section will aim to provide some rules and heuristics for selecting appropriate data and operation types. Of course, please keep in mind that the answer is not always clear-cut, since it involves the interactions of the above features, which can be even more complex and nuanced than the individual differences in data types and operations.
252 |
253 |
254 | ### No polymorphism
255 |
256 | You have a single data type, and one or more operations you want to perform on it.
257 |
258 | This is a relatively easy case, since you can mostly refer to the 'data types' table to figure out your best course of action. If you need primitive support (etc.), you should probably use a defrecord. If you (really) need mutable fields, you're stuck with deftypes. If efficiency is not a concern, maps are the simplest and easiest option.
259 |
260 | Regardless, if your data is shared across many namespaces, or serialized and stored or shared across process boundaries, you should probably have a concrete schema to refer back to. deftypes and defrecords give you some of this, but don't capture field constraints beyond primitive values, so it's prudent to use a schema library [[1]](#footnotes) to precisely describe your data for documentation and safety, which can work just as well with plain maps as more structured types.
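For instance, here's a minimal Schema sketch (assuming a current version of the library; the `Person` shape is hypothetical), which works just as well for plain maps as for records:

```clojure
(ns example.person
  (:require [schema.core :as s]))

;; A concrete, checkable description of the data.
(def Person
  {:first-name s/Str
   :last-name  s/Str
   (s/optional-key :age) s/Int})

(s/validate Person {:first-name "joe" :last-name "schmoe"})
;; => {:first-name "joe", :last-name "schmoe"}

(s/validate Person {:first-name "joe"})
;; => throws: Value does not match schema: {:last-name missing-required-key}
```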
261 |
262 | For operating on your data, plain old functions are the simplest and typically best option. We recommend that you use `safe-get` [[3]](#footnotes) to access fields, use docstrings and schemas to document your code, and organize your namespace into clear public and private sections.
263 |
264 | With this discipline under your belt, the only real benefits of using records and interfaces or protocols are primitive support, and the appearance of lexical scope for your data members. The price you pay for these features is the extra ceremony around declaring interfaces and data classes, plus (in my opinion) slightly decreased ease of use.
265 |
266 | If you do need polymorphism, however, ordinary functions are usually not a great choice. A single `instance?` check or `case` on `:type` isn't the end of the world, and sometimes is the simplest and cleanest solution -- but once these conditionals start appearing in multiple places, there's a good chance you're doing it wrong.
267 |
268 |
269 |
270 | ### Extreme polymorphism, or polymorphism without data types
271 |
272 | If you need polymorphism of an exotic form, where you're not just conditioning on the class of the first argument, then you need multimethods. In our experience, this is a pretty rare occurrence. Multimethods are so general that you can pick the data format best suited to your application on its own merits. If you want to use maps for your data representation, a simple `:type` field mapping to a keyword can be used for dispatch.
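For example, here's a sketch of keyword-based dispatch over plain maps (the document shapes and `display-title` are hypothetical):

```clojure
;; Dispatch on an explicit :type field rather than on the class of the argument.
(defmulti display-title :type)

(defmethod display-title :post [doc]
  (:title doc))

(defmethod display-title :url [doc]
  (or (:title doc) (:url doc)))

(defmethod display-title :default [_]
  "untitled")

(display-title {:type :post :title "Hello"})            ;; => "Hello"
(display-title {:type :url :url "http://example.com"})  ;; => "http://example.com"
```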
273 |
274 | Similarly, if you want an extensible method without a corresponding concrete data type, multimethods give you a way to declare open dispatch without tying you down to a concrete data representation. For example, you can make a function that dispatches on its first argument value (not class), which anyone can extend.
275 |
276 | Beyond these cases, you should probably think hard before using a multimethod. This is especially true if you have multiple polymorphic methods, since you'll need to repeat your dispatch logic in each multimethod if you choose this option.
277 |
278 |
279 | ### Maximum mungeability
280 |
281 | We've almost reached the end of the road for plain old maps as well. But before we get there, it's probably worth mentioning one more way to achieve polymorphism: storing functions as fields, à la JavaScript (or many languages that came before it).
282 |
283 | ```clojure
284 | (def my-obj
285 | {:foo (fn [this y] (bar (:baz this) y))
286 | :baz 12})
287 |
288 | ;; caller
289 | ((:foo my-obj) my-obj 12)
290 | ```
291 |
292 | This gives you flexibility to do crazy things that are difficult to achieve with more rigid interfaces and records; you can `merge` "objects", `assoc` new "methods", and bring all the other tools you usually use to manipulate data to bear on constructing your polymorphic objects.
293 |
294 | That said, this method is rather clunky and hard to understand. (Where the hell is the :foo function of my-other-obj defined?). We've only had one or two cases where we felt that we needed this power, and they've all been replaced with Graph [[3]](#footnotes), which has a similar model but abstracts away some of the complexity of this approach, when your object is really trying to represent a flexible computation process with many steps.
295 |
296 | And now, we've hit the end of the road for plain old maps.
297 |
298 |
299 | ### Maximum efficiency
300 |
301 | If you need maximal memory efficiency and/or unboxed primitives, you must use defrecord, deftype, or reify. If you need mutable members, you must use deftype. If you need custom equality semantics or map semantics, you must use deftype or reify. If you want efficiency and (less efficient) extensibility, you probably want defrecord.
302 |
303 | If you need complex logic that returns primitives, you need to use Java interfaces to work with these objects.
304 |
305 | If you need none of these things, you don't want `deftype` or `definterface`.
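As a sketch of the primitive-friendly end of the spectrum (the `IScorer`/`LinearScorer` names are hypothetical), `definterface` lets methods take and return unboxed doubles, and `deftype` provides unboxed fields:

```clojure
;; definterface supports primitive argument and return types...
(definterface IScorer
  (^double score [^double x]))

;; ...and deftype supports unboxed primitive fields.
(deftype LinearScorer [^double slope ^double intercept]
  IScorer
  (score [_ x] (+ intercept (* slope x))))

(.score (LinearScorer. 2.0 1.0) 3.0) ;; => 7.0
```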
306 |
307 |
308 | ### Simple polymorphism
309 |
310 | We're left with a common case, where you do need polymorphism but don't require extreme performance or complex dispatch.
311 |
312 | We've already covered the cases where you want to use plain old functions, multimethods, and interfaces; and plain old maps and `deftype`s on the data type side. This leaves us with a single operation type, protocols (or `definterface+` [[2]](#footnotes), if you're concerned about [memory/perf implications of protocol dispatch](https://groups.google.com/forum/?fromgroups#!topic/clojure-dev/jKMliKIf9Fg)), and two data types, `defrecords` and `reify`.
313 |
314 | At this point, the decision is pretty simple, based on your use case.
315 |
316 | `reify` is simpler -- you don't have to explicitly name your fields since Clojure automatically captures things referenced in lexical scope, and you don't need to define a separate constructor function if you want constructor-like logic. The price you pay for this simplicity is that the objects you create are *opaque*: if you want access to any 'fields' you will have to create protocol methods for this purpose.
317 |
318 | On the other hand, `defrecord` is *transparent* -- people can examine your object, pull out fields, reason about the `Class` of your data, and so on.
319 |
320 | Thus, the choice of `reify` or `defrecord` primarily comes down to what you are trying to represent; if you're primarily concerned with *data*, you probably want `defrecord`, whereas if you only care about *behavior* then `reify` may be a simpler choice.
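A quick side-by-side sketch (the protocol and names here are hypothetical):

```clojure
(defprotocol Greeter
  (greet [this]))

;; defrecord: transparent data plus behavior; fields are visible to callers.
(defrecord Person [first-name last-name]
  Greeter
  (greet [_] (str "Hi, " first-name)))

(:first-name (->Person "joe" "schmoe")) ;; => "joe"

;; reify: behavior only; locals are captured, but the result is opaque.
(defn person [first-name last-name]
  (reify Greeter
    (greet [_] (str "Hi, " first-name))))

(greet (person "joe" "schmoe")) ;; => "Hi, joe"
```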
321 |
322 |
323 | ### Abstract data members
324 |
325 | We're basically done with our tour, but there's one issue we haven't touched on yet: abstract data members. What if you have multiple data types (posts and URL documents, employees and customers, etc.) and want a data-centric interface (all documents have titles, all people have names, etc.)? None of Clojure's data types allows for implementation inheritance, so if your employees and customers are separate records, you're out of luck for getting the static checking of Java-style field access (`(.first-name r)`).
326 |
327 | In this case, there are three options at your disposal, none of which is really ideal:
328 |
329 | - Use an informal interface (a.k.a docstring): "All people have :first-name and :last-name keys". This should probably be backed by schemas [[1]](#footnotes) and liberal use of `safe-get` [[3]](#footnotes) to ensure your data measures up.
330 | - Be oh-so-formal: declare a protocol full of 'getter' functions, and fill each of your records with methods like `(first-name [this] first-name)`. This pain can sometimes be alleviated by defining a *single* record or reify that goes through this ceremony, and letting each of your 'objects' share this single constructor -- they can even pass in functions that the single implementation delegates to, if you need limited polymorphism (see the sketch after this list).
331 | - If you're not concerned with polymorphism but just a hierarchy of data types, you may be able to flip things around so that all data types are represented with a single `defrecord` that has `:type` and `:type-info` fields to allow extensibility.
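Here's a sketch of that 'oh-so-formal' option with a single shared record (all names are hypothetical):

```clojure
(defprotocol PNamed
  (first-name [this])
  (last-name  [this]))

;; One record goes through the getter ceremony once...
(defrecord NamedThing [fname lname type-info]
  PNamed
  (first-name [_] fname)
  (last-name  [_] lname))

;; ...and each kind of 'object' shares its constructor.
(defn employee [first last team]
  (->NamedThing first last {:type :employee :team team}))

(defn customer [first last plan]
  (->NamedThing first last {:type :customer :plan plan}))

(first-name (employee "joe" "schmoe" :infra)) ;; => "joe"
```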
332 |
333 |
334 | -----------------
335 |
336 |
337 |
338 | 1. [Schema](https://github.com/plumatic/schema) is a library for declaring data shapes and annotating functions with input and output schemas. Besides their other benefits, records have the documentation advantage of having a concrete description that is type-hintable; among other things, Schema brings these same benefits to ordinary Clojure maps.
339 | 2. [Potemkin](https://github.com/ztellman/potemkin) provides some great tools for dealing with interfaces, protocols, records, and so on. In particular, it provides variants of `defprotocol` and `defrecord` that are more repl-friendly, and an implementation of `definterface` that's a drop-in replacement for `defprotocol`, allowing full primitive support with automatic wrapper functions (but without the open-ness of protocols, of course).
340 | 3. [Plumbing](https://github.com/plumatic/plumbing) is a library of Clojure utility functions, including Graph, a tool for declarative description of functional processes.
341 |
342 |
343 |
344 |
--------------------------------------------------------------------------------
/clojure/20130927-ns-organization.md:
--------------------------------------------------------------------------------
1 | Readable Clojure ns Layout
2 | ----
3 | (by [aria42](http://github.com/aria42))
4 |
5 | It's crucial to create namespaces that not only do their job, but minimize the amount of time it takes a fresh pair of eyes to understand the goal of the namespace, what its abstractions are, and how to use it.
6 |
7 | Clojure places virtually no constraints on how a file should be laid out. You could, for instance, have multiple namespaces in a single file. (Please don't ever do this, or you will be ostracized.)
8 |
9 | For these reasons, it's crucial to have conventions about the layout of a namespace file. Below is an example of what a good `ns` structure might look like:
10 |
11 | ```clojure
12 | (ns flop.linear-regression
13 | "Standard least squares linear regression algorithm.
14 |
15 | Usage (only public function):
16 | (learn-linear-regression
17 | [[50.0, {:slobber 1 :furry 1}],
18 | [20.0, {:meowy 1 :furry 1}]])
19 | > {:meowy -3.3333315890376487,
20 | :furry 23.33332778138144,
21 | :slobber 26.666659370419094}
22 |
23 | Uses numerical optimization on least squares objective and internally
24 | indexes features, so should work on very large problems."
25 | (:use plumbing.core)
26 | (:require
27 | [schema.core :as s]
28 | [flop.optimize :as optimize]
29 | [flop.array :as array]))
30 |
31 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
32 | ;;; Schemas
33 |
34 | (def SparseFeatureVector
35 | {Object double})
36 |
37 | (def LabeledRegressionExample
38 | [(s/one double "target value")
39 | (s/one SparseFeatureVector "sparse feature vector")])
40 |
41 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
42 | ;;; Feature Indexing
43 |
44 | (defn build-feat-index [training-data]
45 | (let [all-feats (vec (set (mapcat (comp keys second) training-data)))
46 | feat-idx (into {} (map-indexed (fn [i x] [x i]) all-feats))]
47 | [all-feats feat-idx]))
48 |
49 | (defn indexed-feat-vec [feat-idx fv]
50 | (for [[f v] fv
51 | :let [i (feat-idx f)]
52 | :when i]
53 | [i v]))
54 |
55 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
56 | ;;; Objective Function
57 |
58 | (defn least-square-predict [weights indexed-fv]
59 | (sum (fn [[feat-idx v]]
60 | (* (array/aget weights feat-idx) v))
61 | indexed-fv))
62 |
63 | (defn least-square-objective [num-feats indexed-training-data weights]
64 | (let [grad (double-array num-feats)
65 | val (sum
66 | (fn [[target idx-fv]]
67 | (let [guess (least-square-predict weights idx-fv)
68 | diff (- guess target)]
69 | (doseq [[feat-idx v] idx-fv]
70 | (array/ainc grad feat-idx (* diff v)))
71 | (* 0.5 diff diff)))
72 | indexed-training-data)]
73 | [val grad ]))
74 |
75 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
76 | ;;; Public
77 |
78 | (s/defn learn-linear-regression :- SparseFeatureVector
79 | "Takes a sequence of LabeledRegressionExamples and returns a SparseFeatureVector from feature keys to their learned least-squares double value"
80 | [training-data :- [LabeledRegressionExample]]
81 | (let [[all-feats feat-idx] (build-feat-index training-data)
82 | indexed-training-data (map (fn [[target fv]]
83 | [target (indexed-feat-vec feat-idx fv)])
84 | training-data)
85 | weights (optimize/lbfgs-optimize
86 | (partial least-square-objective (count all-feats) indexed-training-data)
87 | (double-array (count all-feats))
88 | {:print-progress true :max-iters 1000 :thresh 1e-6})]
89 | (for-map [[f i] feat-idx]
90 | f (array/aget weights i))))
91 | ```
92 | The conventions here aren't without exception. In particular, for a namespace like `plumbing.core`, which is exceptional for a number of reasons, this layout wouldn't make sense; but that's the exception, not the rule. If you create a new namespace, you should have a good reason to deviate.
93 |
94 | Let's go over this file section by section.
95 |
96 | ## `ns` form
97 |
98 | A good layout starts with the `ns` form at the top of each file.
99 |
100 | ### Always format dependencies in namespace form in `use`/`require`/`import` order with each section sorted by topological dependency:
101 |
102 | Make sure that when you `:require`, you alias each namespace by the last segment of its fully-qualified name, unless (1) the last segment is `core`, in which case use the second-to-last segment, or (2) it conflicts with another required `ns`.
103 |
104 | #### JVM Clojure
105 | The only thing you should `use` is `plumbing.core`; in tests, you can also use `clojure.test` and the ns under test. An exception to the aliasing rule above exists for `schema.core`, which can be required as just `s`.
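For example (a sketch; `example.doc-ranker`, `store.mongo`, and `flop.core` are hypothetical namespace names):

```clojure
(ns example.doc-ranker
  (:use plumbing.core)
  (:require
   [schema.core :as s]           ;; allowed exception: schema.core as s
   [flop.optimize :as optimize]  ;; alias = last segment of the ns
   [store.mongo :as mongo]       ;; alias = last segment of the ns
   [flop.core :as flop]))        ;; last segment is core, so use the 2nd-to-last
```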
106 |
107 | #### ClojureScript
108 | There is no `use`. Ensure your `require`s are dependency-sorted.
109 |
110 | ### Use an `ns` doc string to describe the purpose of the namespace
111 |
112 | A good ns doc-string can tell a reader about the basic purpose of the namespace as well as highlight important public functions and namespace conventions. Things you might want to include in the `ns` doc-string:
113 |
114 | * a 1-2 sentence description of the purpose of the namespace
115 | * example usage
116 | * what is meant to be used by clients? Protocols, functions, schemas, etc.
117 | * more detailed information about implementation
118 | * namespace internal conventions about variables
119 |
120 | ## Separate code into pragma blocks
121 |
122 | Actual `ns` code should be separated into 'pragma' blocks as much as possible to allow for high-level navigation (we can even have a keyboard command to show an outline mode of the file and jump to a pragma block). For instance,
123 |
124 | ```clojure
125 | ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
126 | ;;; Data serialization
127 |
128 | (defn load-articles [^java.io.File in-file]
129 | ...)
130 |
131 | (defn write-articles [^java.io.File out-file articles]
132 | ...)
133 | ```
134 | to delimit the serialization functions.
135 |
136 | There should be an overall structure of pragma blocks and there should be a convention about the names of pragma blocks we expect to re-use between files, especially for code which is meant to be read/used by clients of the `ns`.
137 |
138 | * Schemas: All the schemas that are part of the interface of the `ns` (meaning that others could use the schema for their own functions or these schemas are part of the arguments or return values to public functions in your namespace)
139 | * Protocols: All the protocols which are part of the interface of the `ns` (either because client `ns` will provide record/type implementations of the protocols or use them via objects you provide).
140 | * Records/Types: Any records or types you provide (in the case of records, you are also saying the generated factory functions are part of your public interface).
141 | * Public: There should be a `Public` pragma block at the bottom of your file which is meant to be the public functions of your namespace. The only reason this goes at the bottom is because all the private functions you are using to support these functions must come earlier (Using forward `declare` for all your private functions is not tenable). There are special more stringent rules for Public functions (see below).
142 |
143 | You can and should organize all your private/internal code into descriptive pragma blocks even though they aren't part of the interface.
144 |
145 | ## The Public Pragma Block
146 |
147 | Virtually every `ns` should have a Public pragma block, which is what you should read after the `ns` form itself and the public protocols/schemas/records/types (see pragmas above). All functions in Public should obey the following:
148 |
149 | * All should have a doc-string describing what they take and return
150 | * For complex parameter or return types, type-hint using the schemas/protocols/records/types from the pragmas at the top.
151 |
152 | The golden rule for public functions really is **they should be understandable after reading the `ns` form, the top protocol/schema/record pragmas, and the function doc-string**. You shouldn't have to read any of the rest of the files or other files (unless they are explicitly used in the exposed schemas/protocols/records) to understand what the function does.
153 |
154 |
155 |
--------------------------------------------------------------------------------
/git/20140403-git.md:
--------------------------------------------------------------------------------
1 | # Prismatic and Git
2 |
3 | ## What do we like?
4 |
5 | Here, we start zoomed out at history, and zoom into the level of individual commits, looking at what we think are good and bad examples of each.
6 |
7 | ### History
8 |
9 | We care about history because it helps us understand where code came from and how it evolved. Keeping a clean history makes these tasks (sometimes much) easier. Git comes with tools like git-bisect that can only usefully be used with **linearizable histories where the tests pass on every commit**. In real life, development is nonlinear; that's why we have branches; but at the end of the day we'd like to create the *appearance* that development was done linearly, without losing the benefits of a distributed workflow (or forgetting how the work was actually accomplished).
10 |
11 | We can probably all agree that we don't want our history to look like this:
12 |
13 |
14 |
15 | And (arguably) ideally, we'd like our history to look like this:
16 |
17 |
18 | ```
19 | feat1 o-o-o-o-o-o PR
20 | / |
21 | master - - o - - - - - o - - - - - o - -
22 | \ |
23 | feat2 o-o-o-o-o-o PR
24 | ```
25 |
26 | Note that despite the apparent branching and preservation of the context of each commit, master is actually a linear history of commits that can be easily bisected and understood. Most of the rest of this document is about how and why we do this.
27 |
28 |
29 | ### Pull Requests
30 |
31 | Pull requests are one of the most important ways we communicate about, share context on, and get feedback on code.
32 |
33 | A good pull request:
34 |
35 | * is made as soon as a meaningful, mergeable chunk of work can be completed, to minimize the possibility of conflicts with other team members
36 | * accomplishes a single product or engineering goal (or fraction thereof)
37 | * is composed of a sequence of good commits (see next section), with the tests passing after each commit, and no merge commits if possible
38 | * once you introduce merge commits, you can no longer rebase, and you will introduce a permanent nonlinearity into master.
39 | * every commit should ideally pass the tests. later we'll explain how to reconcile this with 'commit early, commit often'.
40 | * comes with a descriptive message explaining:
41 | * what product or engineering goal it accomplishes (with links to necessary context)
42 | * anything that might be unclear or unusual about the code (although this should probably be documented in the code as well)
43 | * a description of how the code was tested (hopefully locally and/or on dogfood), with exposition of possible things that could go wrong that you tested.
44 |   * extra description if a change is irreversible (i.e., it produces backwards-incompatible data) or introduces deploy dependencies
45 | * is of manageable size, so it can be meaningfully reviewed quickly
46 | * (except when necessary, e.g. for large refactors -- which should still be broken down and described in the commit message as much as possible)
47 |
48 | Reviewing PRs should be a high priority for your team members, so strive to make this easy for them.
49 |
50 |
51 | ### Commits
52 |
53 | Not surprisingly, what makes a good commit (on master) is basically the same as what makes a good PR. That doesn't mean you can't make commits locally that break these rules; just clean them up before submitting your PR.
54 |
55 | A good commit:
56 |
57 | * accomplishes a single, coherent objective (e.g., add a fn and test; add a ns and test; refactor to rename a fn; change behavior to do x)
58 | * begins and ends in a state where the tests all pass
59 | * comes with a descriptive message explaining what the commit accomplishes.
60 | * These should be sufficiently detailed to make sense in a `git log` on master.
61 | * "Fix bug" and "WIP" do not count
62 |   * Should start with a one-line summary, and can (and often should) contain additional lines or bullets to explain more context about the change. For more details on what makes a good commit message, refer to this [blog post](http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html).
63 |
64 |
65 | ## How do we make nice things?
66 |
67 | ### Clean History
68 |
69 | We're based almost exactly on the [GitHub Flow](http://scottchacon.com/2011/08/31/github-flow.html) model. Ignore the talk about git-flow in that post, and start at "GitHub Flow". You should probably read this whole document. Reproducing the summary here:
70 |
71 | * Anything in the master branch is deployable
72 | * To work on something new, create a descriptively named branch off of master (ie: new-oauth2-scopes)
73 | * Commit to that branch locally and regularly push your work to the same named branch on the server
74 | * When you need feedback or help, or you think the branch is ready for merging, open a pull request
75 | * After someone else has reviewed and signed off on the feature, you can merge it into master (after rebasing it on the latest)
76 | * Once it is merged and pushed to ‘master’, you can and should deploy (at least to staging) immediately, so that bugs and other unanticipated issues are caught as quickly as possible.
77 |
78 | The only difference from GitHub Flow is that at Prismatic we currently deploy only to staging after every merge, with periodic prod deploys (at least once a week); but the principal idea is still that production should be deployable at any time from master.
79 |
80 | Note that in this model, no code is ever committed directly to master, only merged from branches. If every branch is rebased on master and merged using `--no-ff` (which GitHub does), we get the clean history model shown above.
81 |
82 |
83 | ### Branches/PRs
84 |
85 | Our workflow for making branches that follow the rules laid out above is:
86 |
87 | * first `git checkout master`, `ppull` to get the latest (a shortcut for `git pull && git submodule update --init --recursive` run at the repo root), and finally `git checkout -b my-awesome-branch` to get your branch started
88 |
89 | * Do your work. Commit as early and often as you like, make "WIP" and "bug fixes", do whatever makes you happy
90 | * While working, `git fetch` and `git rebase origin/master` often to keep your branch up-to-date so you don't end up with huge merge conflicts later.
91 | * **Do not merge master into your branch**, since merge commits clutter up history, make it nonlinear, are difficult to remove later, and make it hard to apply post-production to your branch before sending it up for PR.
92 |
93 | * When the tests pass and you're ready to ship your code or share it with the rest of the team, make it look nice and follow the rules above. For an overly pedantic treatment of the subject, [this is a great resource](http://sethrobertson.github.io/GitPostProduction/gpp.html). Typically, a single `git rebase -i HEAD~N`, where `N` is the number of commits on your branch, is all you need to reorder and squash commits so that everything left represents a good state with a reasonable commit message.
94 |
95 | * Finally, rebase on master one last time and put it up for PR.
96 |
97 | * Address PR comments. If you disagree with a comment, now is the perfect time to have a discussion about the points made, and add an entry in or update the style guide if necessary.
98 |
99 | * If significant time passes between making a PR and its approval, rebase on master one more time; then merge into master.
100 |
101 | * Deploy to staging! And production?!
102 |
103 | One caveat: if you're collaborating on a branch, you must be careful about rewriting history. Never rebase work that is shared with someone else without careful coordination first.
104 |
105 |
106 | ### Commits
107 |
108 | For commits, it should be pretty straightforward to just follow the above rules. Some common gotchas and tools to make this nicer:
109 |
110 | * Always use `git status` before committing. It will tell you if you have unsaved files in emacs, or new files you're forgetting to add.
111 | * If you forget to put something in a commit or want to change the message, `git commit --amend` is your friend. You can use this in a "commit early, commit often" workflow where you continually amend your commit until it's in tip-top shape.
112 | * If you make a bunch of unrelated changes, you can still break them into separate commits. You can use `git add` on individual files to stage your ideal commit, or even `git add -p` to stage changes to *parts* of files. Use `git diff --staged` to see exactly what you've staged before you commit.
113 |
114 |
115 |
116 |
117 |
118 | ## Odds and ends
119 |
120 | Choose your own adventure on how to fix mistakes:
121 |
122 | http://sethrobertson.github.io/GitFixUm/fixup.html
123 |
124 | Some handy aliases and settings for your `~/.gitconfig`:
125 |
126 | ```
127 | [alias]
128 | lg = log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit
129 | lga = log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit --all
130 | lgd = log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit -p
131 | df = diff --color --color-words --abbrev
132 | [color]
133 | branch = auto
134 | diff = auto
135 | interactive = auto
136 | status = auto
137 |
138 | [core]
139 | editor = /Applications/Emacs.app/Contents/MacOS/bin/emacsclient -t -a=\\\"\\\"
140 | [push]
141 | default = simple
142 | ```
143 |
--------------------------------------------------------------------------------
/ios/20131009-profiling-using-instruments.md:
--------------------------------------------------------------------------------
1 | Profiling and Debugging
2 | ===============
3 | (by [AshFurrow](http://github.com/AshFurrow))
4 |
5 | # Instruments
6 |
7 | Instruments is a tool that ships with Xcode that profiles your app for different characteristics. There are different profilers you can use from within Instruments – we'll cover the common ones later. For now, we need to make an important point.
8 |
9 | Deferred Mode
10 | ----------------
11 |
12 | Instruments profiles your app on the *device on which it is being profiled*. That means that if you profile an app running on a MacBook Pro, you're not going to get the same results as when you profile on an iPhone 4.
13 |
14 | Additionally, Instruments collecting data on-device and transferring it to the computer carries a not-insignificant cost. It's possible to put Instruments into *deferred mode*, so that it will collect usage information on-device and wait until profiling has stopped to transfer it to the computer for analysis. A downside to using deferred mode is that you don't see real-time results (you have to wait until the profile has completed).
15 |
16 | To enable deferred mode, open Instruments' preferences and check the top box "Always use deferred mode."
17 |
18 | What's more, some profilers are available on one platform, but not the other. Play around – profiling is fun!
19 |
20 | Traces
21 | ----------------
22 |
23 | Data collected and interpreted by Instruments is stored in a file called a *trace*. These can be saved for later viewing. A trace can also contain more than one run of a profiling session. To see all the runs in a trace, expand the arrow to the left of a profiler.
24 |
25 | 
26 |
27 | This is useful for verifying that a change to the code resulted in a desirable change to the run.
28 |
29 | Time Profiler
30 | ----------------
31 |
32 | The *Time Profiler* is one of the most useful profilers in Instruments. It is used to examine, on a per-thread basis, the activity occurring on the CPU.
33 |
34 | 
35 |
36 | This is a graph of the CPU time spent executing code from your app. The bottom part of the screen displays a tree of every method invocation on a per-thread basis. The easiest way to find the heaviest (most expensive, time-wise) method stack trace on a thread is to open the *Extended Detail* pane.
37 |
38 | 
39 |
40 | It's a wonder this isn't open by default. The Extended Detail pane shows the heaviest stack trace belonging to the currently selected leaf of the tree in the main view. System libraries are greyed out while your application code is in black.
41 |
42 | By default, the Time Profiler displays information about the *entire duration* of the trace. To focus on a specific part of the trace, hold the ⎇ key and click-and-drag.
43 |
44 | 
45 |
46 | As we can see here, the most expensive operation is the `LineHeightLabel`'s `sizeToFit` method invocation(s).
47 |
48 | Core Animation Profiler
49 | ----------------
50 |
51 | The Core Animation template contains two profilers: a *Core Animation Profiler* for measuring screen refresh rates and the Time Profiler from the last section. This tool is very useful when you notice dropped frame rates.
52 |
53 | In general, it's desirable to maintain a screen refresh rate of 60 frames per second. That means that, in between screen refreshes, your app needs to spend fewer than 16 milliseconds executing.
54 |
55 | There are also options on the left side of the window to color views that are drawn offscreen, etc.
56 |
57 | *Note*: The Core Animation Profiler is only available on actual devices, *not* on the simulator.
58 |
59 | Allocations Profiler
60 | ----------------
61 |
62 | The *Allocations Profiler* is useful for measuring the total amount of memory used by an application. It contains two measurement tools: an allocations tool for measuring the total amount of memory in use by objects within an application, and a VM Tracker for measuring the total virtual memory in use by the system.
63 |
64 | The VM Tracker is off by default and needs to be enabled (VM scans are expensive).
65 |
66 | 
67 |
68 | A common and useful approach to using the Allocations Profiler is to perform some action, undo that action, and repeat. For example, tap a user name to push a new view controller onto the navigation stack, then tap the back button, and repeat. The memory use before the action is performed should be equivalent to the memory use after the action has been undone. This will help you detect memory leaks (often caused by reference cycles).
69 |
70 | Another useful technique is to simulate a memory warning from the simulator to verify that your application responds accordingly by freeing up memory. Unfortunately, there is no (documented) way to simulate a memory warning on an actual device ([wink wink](http://stackoverflow.com/questions/12425720/a-way-to-send-low-memory-warning-to-app-on-iphone)).
71 |
72 | Leaks Profiler
73 | ----------------
74 |
75 | The *Leaks Profiler* helps you find memory leaks in your application code. It includes the Allocation Profiler (sans VM tracker).
76 |
77 | 
78 |
79 | Any leaks found are shown in the Leaks Profiler as red bars. Focus (⎇-click-and-drag) and select the Leaks profiler for more information.
80 |
81 | Open the Extended Detail pane and select the leaked object to see the stack trace of where the leak occurred. In this example, it's a problem with `stringWithURIEncoded`.
82 |
83 |
84 |
85 | # Debugger
86 |
87 | Xcode 5 ships with the LLDB debugger, which is automatically attached to a running application. You can add breakpoints to your project by clicking in the gutter to the left of your code.
88 |
89 | 
90 |
91 | You can see all of the breakpoints set in your current workspace with the *Breakpoints Navigator*.
92 |
93 | 
94 |
95 | Active breakpoints are in dark blue and inactive breakpoints are ghosted out.
96 |
97 | When your application hits a breakpoint, the app will pause and the debugger becomes active.
98 |
99 | 
100 |
101 | You can use the pane on the left to navigate objects, print their descriptions, etc. The console on the right side of the debugger pane, which can be shown and hidden with ⌘⇧Y, is the debugger. Here you can continue (`c`) execution until the next breakpoint, step to the next line (`n`), and print pointer descriptions (`p`).
102 |
103 | If it can determine it, LLDB will cast structs to their type. So `p size` becomes `p (CGSize)size`. Sometimes, LLDB can't suss out what type to cast to, so you might have to do it yourself (i.e.: `p (CGSize)[someObj someMethodReturningSize]`).
104 |
105 | You can also print out object descriptions with the `po` command: `po object` will call `object`'s `description` and print that out to the debugging console. This is useful for views because their description contains their frame.
106 |
107 | The debugger can be used to invoke methods, like `p [obj method]`, or `p obj.property`.
108 |
109 | The debugger doesn't have access to `enum`s or `#define`s. This makes things tricky sometimes. For example, if you've received data from an API and you want to turn it into an `NSString`, you'd typically use `[[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding]`, but that encoding is an `enum` member, so you have to use `4` instead.
110 |
--------------------------------------------------------------------------------
/swe/Migrations_and_future_proofing.md:
--------------------------------------------------------------------------------
1 | # Migrations and Future Proofing
2 |
3 | You've implemented a new feature, which involved refactoring some existing data formats or server handlers. Everything looks good, the tests all pass, and (if applicable) the latest clients (frontend and/or backend) all seem to work with the latest servers. Ship it?!
4 |
5 | Not so fast.
6 |
7 | If your changes involve formatting changes to **data at rest**, and/or interface changes to servers with **clients running in other processes**, you also have to worry about (and test) how your changes will interact with **old data or code**.
8 |
9 | This document first talks about the issues (**inertia**), how to deal with them when they come up (**migrations**), and then leads into a discussion of best practices (**forward compatibility**) to reduce the need for or difficulty of future migrations.
10 |
11 | ## Examples
12 |
13 | Not all code has inertia. For example, consider a one-off script that you run on your laptop. The code is self-contained, has no other processes relying on it, and can effectively be changed and rerun at any time.
14 |
15 | At Prismatic, we run dozens of highly available services in the cloud, process and store millions of documents every day, and regularly ship client apps to customers' iOS and Android devices. Here are some hypothetical situations illustrating how inertia of code and data can manifest in practice in this more complex environment:
16 |
17 | - We developed a new feature, which required a small change to the data returned by a handler in our API server. Compatibility was tested against the latest version of our iOS app, after which the change was deployed to production. Immediately, 5% of our users' iOS apps crashed, because those users had turned off auto-update and were still on version 1.0, which turned out to be incompatible with the modified API handler.
18 | - We have many terabytes of documents in a data store, which we both write to and read from in production. A migration job is started to add a new field to each document, and the existing production code starts crashing when it sees documents with the new field. It seems like any step in the right direction (migrate the data, deploy code that expects the new field) will fail; at the same time, we also don't want to resort to scheduled downtime to do the migration.
19 |
20 | Later, we describe concrete recipes for dealing with these and related problems without downtime.
21 |
22 | # Inertia (why do we have to deal with old stuff?)
23 |
24 | ## Old code
25 |
26 | Since code in a single process is deployed atomically, breaking changes localized to code running in a single process are typically safe. But when we change APIs that cross process boundaries (such as updating the interface for an API handler), we have to think about the transitional period where both old and new code is running at the same time.
27 |
28 | In the worst case, some of the old code is not under our control. We have external clients for our APIs, which could range from a web app (possible to force-refresh) to an iOS app (possible to ship updates) to third-party API clients (not our code). In the latter two cases, we have no explicit control over when old code is replaced with new, and might have to support old API clients for months or years after we might wish they were dead and gone.
29 |
30 | Moreover, even in the best case where all affected code is running under our control (e.g., on EC2), it's often impossible (or at least unwise) to update all affected processes **simultaneously** (as we'll discuss shortly). Thus, we typically still have to think about the interaction of old and new code even in this "best" case scenario.
31 |
32 |
33 | ## Old data
34 |
35 | Old data is in some ways better, and in many ways worse, than old code. On the upside, at least the data is typically under our control; we can choose to rewrite it at any time. Downsides are:
36 |
37 | - Sometimes the data is not actually under our control. For example, consider URLs linked from emails we have sent, or (to a lesser extent) data stored on iOS clients we control.
38 | - Data can be **big**; it might take hours or days to migrate a data store from an old format to a new format, and we probably want our application to continue working throughout the migration process.
39 | - Data often must be **consistent** -- we often want all code (old and new) interacting with a given piece of data to reflect the latest value at a given point (including updates made while the migration is ongoing). For example, if a user adds a comment to a document after the document has been migrated to a new format with new keys, it's not acceptable for processes during or after the migration to fail to report the new comment.
40 | - Data is **dumb** -- we can try to make our new code smart about interacting with old code, but there's often no parallel to this for our new data formats.
41 |
42 |
43 | ## Simultaneity and backwards compatibility
44 |
45 | If we could simultaneously replace all our old code with new code, and (as applicable) upgrade all our old data to a new format, we'd be happy campers (and this document would be much shorter). Unfortunately, this is typically impossible, and even when it's possible it's often unwise.
46 |
47 | - If we introduce incompatibilities with code or data that's not under our control (e.g., old iOS app versions), it's obviously impossible to ensure simultaneity.
48 | - If we introduce incompatibilities of code with large datasets, it's probably impossible to update the code and migrate all the data simultaneously without downtime.
49 | - Even if there is no data involved and all the code is under our control, we probably can't redeploy *everything* simultaneously without downtime (unless we're willing to bring up a new copy of the entire system and then move people over).
50 | - Even if in principle we can update all the code (and data) simultaneously, if something goes wrong it may be very difficult to **roll back** a big bang release, especially when data format changes are involved.
51 |
52 | When we cannot update all our code (and data) simultaneously, we have to follow a *deployment strategy* that provides a **path of backwards compatibility**. This is a sequence of steps, where at each step all running code is able to interoperate with the other code and data visible at this step.
53 |
54 |
55 |
56 | This example shows a simple system where two processes read and write from a data store. There is also a server that reads and writes to the same data store and provides an API for clients (e.g., our iOS app). In this case the system is "happy", since all communication links are compatible (indicated by green links). Conceptually, our deployment strategy must ensure that the entire system stays "happy" in this way at every step.
57 |
58 | The next section describes recipes for such deployment strategies.
59 |
60 |
61 | # Migrations (scenarios and strategies)
62 |
63 | This section covers various scenarios and strategies for providing a path of backwards compatibility, in roughly increasing order of difficulty.
64 |
65 | ## Full forwards and backwards compatibility
66 |
67 | The best-case scenario is full compatibility. All old code can interact with all new code and data, and all new code can interact with all old code and data out there.
68 |
69 |
70 |
71 | In this happy scenario, new code can be deployed at any time, in any order. Achieving this result often requires careful forethought (and sometimes incurs an accumulation of cruft, which can often only be removed by breaking changes). A common significant change that can sometimes be carried out with full compatibility is the addition of new optional keys/columns (or removal of existing ones); this works when old code is sufficiently flexible to handle the new data as-is. More on this in the next section on forward compatibility.
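For instance, here is a minimal Clojure sketch (function and key names are hypothetical) of a reader that tolerates the addition of an optional key, because it only destructures the keys it actually uses:

```clojure
;; A reader that only looks at the keys it needs, so adding a new optional
;; key (here :tags) to the stored map does not break it.
(defn interest-name
  [interest]
  (let [{:keys [name]} interest]
    name))

(interest-name {:name "cats"})                    ;=> "cats"
(interest-name {:name "cats" :tags ["animals"]})  ;=> "cats"
```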
72 |
73 | ## Breaking code changes (no data changes)
74 |
75 | As mentioned above, breaking changes to code are typically easier to deal with than breaking changes to data formats.
76 |
77 | ### Deploy dependencies (additions only)
78 |
79 | If your change only involves adding new methods to an API, or a new data format, old clients can often continue to function as-is. For example, perhaps we add a new handler to our API to support a new feature in our iOS app. In this case, the typical complication is the introduction of a **deploy dependency** -- the server with the new APIs (or producer of the new data) must be deployed before all consumers of the new APIs/data. The only exception is if the consumers are made robustly backwards compatible (able to gracefully handle missing new APIs/data), in which case we're in the previous happy situation.
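As a rough illustration of that exception, here is a hedged Clojure sketch (all names hypothetical) of a consumer that degrades gracefully when the new API or data is not available yet:

```clojure
;; Consumer of a not-yet-deployed API: if the call to the new endpoint fails
;; or returns nothing, fall back to the old behavior instead of crashing.
(defn fetch-new-suggestions
  "Stand-in for a call to the new endpoint; returns nil when unavailable."
  [user-id]
  nil)

(defn onboarding-topics [user-id default-topics]
  (or (try (fetch-new-suggestions user-id)
           (catch Exception _ nil))
      default-topics))

(onboarding-topics "user-1" ["technology" "startups"])
;;=> ["technology" "startups"]  ; falls back until the new server is deployed
```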
80 |
81 |
82 |
83 | The basic steps in this deployment strategy are:
84 |
85 | 1. The initial situation, with old server and client
86 | 2. The new server(s) are deployed, which are backwards-compatible with the old client
87 | 3. The new client(s) are deployed, which can rely on the functionality in the new server
88 | 4. (optional) At some later point, the old client is fully replaced by the new
89 |
90 | Deploy dependencies are relatively harmless, but should be handled carefully to ensure that the new code is not deployed in the wrong order. Whenever possible, server and client changes should be made in separate changesets, loudly *declared* in pull requests, and the client change should only be merged to a production branch *after* the server change has been merged and fully deployed.
91 |
92 | As we'll discuss in the final section, extensions to existing methods (such as the addition of new fields to responses) can often be handled in this setup if clients are carefully designed to be forward-compatible with such changes.
93 |
94 |
95 | ### API versioning (breaking code changes)
96 |
97 | Suppose we have an existing API method `/interests` that returns a list of `String`s to describe a user's interests. A new feature of the iOS app requires more information about each interest, so we want to change this to a list of maps like `{'name':foo, ...}`. Breaking changes like this introduce fatal cycles into the "deploy dependency graph", since new clients need new server APIs and old clients need old APIs, but all the old code can't usually be replaced simultaneously.
98 |
99 |
100 |
101 | Assuming there are no changes to data at rest, these changes can be converted into the happier "additions only" scenario above by using **API versioning** (e.g., adding a new method `v2/interests` rather than making breaking changes to `v1/interests`).
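Concretely, the two versions are served side by side for the duration of the transition. A minimal Clojure sketch (handler names, fields, and routing are hypothetical; the actual wiring depends on your web framework):

```clojure
;; v1 keeps its old contract (a list of strings); v2 returns the richer maps.
(defn v1-interests [user-id]
  ["cats" "dogs"])

(defn v2-interests [user-id]
  [{:name "cats" :follower-count 12000}
   {:name "dogs" :follower-count 34000}])

;; Conceptual routing -- both endpoints stay live until the old client dies:
;;   GET /v1/interests -> (v1-interests user-id)
;;   GET /v2/interests -> (v2-interests user-id)
```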
102 |
103 |
104 |
105 | The deployment strategy here is the same as in the previous section:
106 |
107 | 1. The initial situation, with old server and client
108 | 2. The new server is deployed, with the old endpoint preserved for the old client and a new endpoint for the new client
109 | 3. The new client is deployed, which uses the new endpoint
110 | 4. (optional) At some later point, the old client is fully replaced and the old endpoint can be removed from the server.
111 |
112 | The cost of API versioning is that until (4) you have two API methods that do similar things, and they must both be maintained and tested until all code that accesses the old one is gone and it can be deleted. Maintenance is further complicated if both methods interact with common data that must be kept in sync, especially when the new method needs to store data not representable in the old format.
113 |
114 |
115 | ## Breaking data format changes
116 |
117 | As described in the previous section, backwards-incompatible data changes are more complex since data cannot typically be updated atomically the way code can; because data is both **big** and **dumb**, it may take a long time to migrate, and it typically can't smartly present itself to old code in a way that papers over the incompatibilities.
118 |
119 | *Note that this document primarily describes general approaches applicable to any data storage technology; for sufficiently "smart" systems (such as SQL) other approaches may be applicable, which we only touch on briefly here.*
120 |
121 | ### Aside: data versioning
122 |
123 | As with APIs, overcoming breaking data format changes typically requires **versioning** (or downtime). There are at least two ways to version data:
124 |
125 | 1. **In-place:** store a version number inside each datum (or otherwise infer the version from the data), and as you upgrade, overwrite the old datum (backing it up if desired). Adapting the above example to data-at-rest, we might have a single storage location `interests` that stores a mapping from `user-id` to **either** `{:version "1" :data ["cats" "dogs"]}` or `{:version "2" :data [{:name "cats" ...} {:name "dogs" ...}]}`.
126 | 2. **Multi-place:** store data in a versioned location, so that new versions can live alongside old versions. Using the same example, we might have one location `interests_1` that maps `user-id` to `["cats" "dogs"]` and a new location `interests_2` that maps `user-id` to `[{:name "cats" ...} {:name "dogs" ...}]`.
127 |
128 | The advantage of in-place versioning is that there's a single location to find the latest version of a datum. This can make it much easier to ensure data consistency between processes (as we'll see in a second), especially when there are multiple concurrent writers that need transactional semantics. However, the disadvantage is that during deployment you must ensure that all running code can read all data versions currently in play.
129 |
130 | Multi-place versioning can be simpler because each process can read data in a single format, by just choosing the appropriate location to read from. However, now the onus is on the programmer to ensure consistency requirements are met across all versions of a datum in play. When consistency requirements are lax (e.g., write-once data), this approach can be much simpler, and also has a much easier story for supporting old code and rolling back as needed.
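With in-place versioning, readers typically dispatch on the version and normalize to a single in-memory shape. A minimal Clojure sketch, reusing the hypothetical `interests` shapes from above:

```clojure
;; Normalize either stored version of an interests datum to the new shape
;; (a vector of maps). Version "2" data passes through untouched.
(defn upgrade-interests [{:keys [version data]}]
  (case version
    "1" (mapv (fn [topic] {:name topic}) data)
    "2" data))

(upgrade-interests {:version "1" :data ["cats" "dogs"]})
;;=> [{:name "cats"} {:name "dogs"}]
```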
131 |
132 | As we will see, zero-downtime approaches to data migration typically involve either **pausing writes**, or deploying code that can **simultaneously read *or* write both versions** of data for the duration of the migration.
133 |
134 | ### Migrating static data (no writers)
135 |
136 | Things are simple if the data being migrated is static (or writes can be avoided or deferred during the migration). In this case, the steps to migrate are:
137 |
138 | 1. The data is migrated in the background to the new format (multi-place versioning must be used to avoid breaking existing readers)
139 | 2. Readers are deployed to read the new format from the new location
140 | 3. (optional) The old data can be deleted, and writes can resume on the new data
141 |
142 | This is effectively a simplified version of the "Write Both" migration in the next subsection. To migrate in-place, there is also a simplified version of the "Read Both" migration below that works when writes are paused.
143 |
144 | ### "Write Both" migrations (often best for single writer)
145 |
146 | Things become significantly more complicated when you need to accommodate writes to your data during the migration process, because you have to ensure that writes during the migration are all captured in the new format by the end of the migration.
147 |
148 | If there is at most a single writer (per datum), the simplest option is often the "Write Both" migration with multi-place versioning.
149 |
150 | For example, when a new user signs up for Prismatic with a Twitter account, within seconds a worker process fetches the user's tweets, analyzes all shared URLs, and stores suggested topics in S3 for presentation later in the onboarding process. While we have multiple workers computing topic suggestions, the data is effectively write-once with no concurrency concerns, so a "Write Both" migration was appropriate when we wanted to make breaking changes to the suggestions data format.
151 |
152 |
153 |
154 | In this deployment strategy, the writer propagates changes to both the new and old formats throughout the duration of the migration:
155 |
156 | 1. The initial situation, with one read/write process and any number of additional readers interacting with the old data
157 | 2. The writer is deployed with code that continues to read from the old location, but writes to **both** the old location as well as a new location in the new format
158 | 3. The data is batch migrated from the old to new format. Note that the batch migration process is technically an additional writer that must be properly synchronized with other writer(s) to avoid losing their concurrent updates. This can sometimes be simplified by running the migration **inside** the writer (which could be combined with step (2) in a single deploy).
159 | 4. Readers are deployed to read from the new location
160 | 5. The writer is deployed to read and write only from the new location, and old data can be archived
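As a concrete sketch of step 2, here is how the topic-suggestions writer described above might propagate each write to both locations. All names and formats are hypothetical, with in-memory atoms standing in for the two storage locations:

```clojure
;; "Write Both": every write goes to both the old and the new location;
;; reads (not shown) keep coming from the old location until step 4.
(defn ->old-format [suggestions]   ; old shape: just the topic names
  (mapv :name suggestions))

(defn ->new-format [suggestions]   ; new shape: the richer maps, as-is
  suggestions)

(defn store-suggestions!
  "store-old! and store-new! stand in for writes to the two storage locations."
  [store-old! store-new! user-id suggestions]
  (store-old! user-id (->old-format suggestions))
  (store-new! user-id (->new-format suggestions)))

;; Usage with atoms in place of real storage:
(def old-store (atom {}))
(def new-store (atom {}))
(store-suggestions! #(swap! old-store assoc %1 %2)
                    #(swap! new-store assoc %1 %2)
                    "user-1"
                    [{:name "clojure"} {:name "coffee"}])
```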
161 |
162 | The primary advantages of this approach are:
163 |
164 | - Readers only need to be deployed once, and the writer only needs to be deployed twice
165 | - Readers do not need to be concerned with multiple versions of data, and it's always clear where to find the latest version of a datum
166 | - Readers can be updated incrementally -- if you have many different clients for your data, you can take as long as you want to move them all over to the new format in step (4) -- and each reader only needs to be updated and deployed once
167 |
168 | However, there are several disadvantages that are alleviated by the more complex "Read Latest" migration described next:
169 |
170 | - The new format cannot be used to store new information not representable in the old format until all old data is migrated and the migration is completed in step (5).
171 | - Multi-place versioning must be used, which requires twice as much storage and write bandwidth during the migration
172 | - It can be difficult to adapt this approach to multiple concurrent writers, because unless all writers are deployed concurrently in step (5) data loss can ensue (new writer 1 writes a new datum to the new format only, while old writer 2 is still reading from the old format)
173 | - Even with a single writer, it can be difficult to properly synchronize the writer and batch migration job across the two data locations to avoid concurrency issues
174 |
175 | #### Variations: SQL databases and transaction logs
176 |
177 | That said, there are some common variations of the "Write Both" approach for multiple writers that can work well for "sufficiently smart" data, such as a SQL database. For instance, several systems ([1](https://github.com/soundcloud/lhm),[2](https://www.facebook.com/note.php?note_id=430801045932),[3](http://openarkkit.googlecode.com/svn/trunk/openarkkit/doc/html/oak-online-alter-table.html)) use *triggers* to propagate updates to the old format to the new format during a migration. [Others](https://github.com/freels/table_migrator) rely instead on an indexed `updated_at` timestamp column to find rows that still need to be migrated.
178 |
179 | Similar schemes can be concocted to work with any system that maintains an explicit write log, since the latest state of the new data can always be reconstructed from the write log.
180 |
181 |
182 | ### "Read Latest" migrations
183 |
184 | The "Read Latest" migration is more complex than "Write Both". In exchange for this complexity, it easily accommodates multiple writers, in-place migrations, and the storage of new information in the new format before (or without) a batch migration of old data.
185 |
186 | For example, we have many terabytes of old documents stored at Prismatic, and at any time any document could be retrieved (or modified) by any user visiting their profile page (and potentially doing a new action). When we've made breaking changes to the format, we've done a "Read Latest" migration to ensure safety in the face of concurrent writes.
187 |
188 |
189 |
190 | The basic idea here is that rather than trying to keep both new and old formats up-to-date at all times, we ensure that readers are able to find and access the latest version of a datum at any point, whether it be old or new:
191 |
192 | 1. The initial situation, with any number of readers and writers interacting with the old data.
193 | 2. In any order:
194 | - All readers are deployed with the ability to read both the new and old formats. If the new data will be stored in a new location, readers should always check the new location first, and only fall back to the old if the new version is not yet present (note potential concurrency issue, which can be avoided by using in-place versioning).
195 | - All writers are deployed with the same reading behavior, where they write data back *in the format / location it was read in*.
196 | 3. Writers can be deployed to *upgrade on write* to the new version.
197 | 4. (optional) The data is batch migrated from the old to new format
198 | 5. (optional) Cleanup of old data and code for interacting with the old data format
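A minimal Clojure sketch of steps 2 and 3 under in-place versioning, again reusing the hypothetical `interests` shapes (the in-memory representation always uses the new shape):

```clojure
;; Step 2: everyone can read both versions, normalizing to the new in-memory
;; shape, and writes go back in the version the datum was read in.
(defn read-latest [{:keys [version data]}]
  {:version   version
   :interests (case version
                "1" (mapv (fn [topic] {:name topic}) data)
                "2" data)})

(defn write-back [{:keys [version interests]}]
  (case version
    "1" {:version "1" :data (mapv :name interests)}
    "2" {:version "2" :data interests}))

;; Step 3: writers switch to upgrade-on-write, always emitting version "2".
(defn write-latest [{:keys [interests]}]
  {:version "2" :data interests})
```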
199 |
200 | The primary advantages of this approach are:
201 |
202 | - The migration can be done with in-place versioning, which works fine with multiple writers (without adding any additional concurrency concerns)
203 | - New information can be stored in the new format starting at (3), before old data is batch migrated
204 | - The batch migration can be put off indefinitely, so long as a code path for lazy migration of all old datums is maintained
205 |
206 | The disadvantages are:
207 |
208 | - More total deployments are needed to fully move through the process (three deployments for writers and two for readers). Note that if the last two steps are skipped, the deploy count is actually the same as "Write Both".
209 | - Readers need to be concerned with handling both formats. In principle, this could involve a lot of forked code paths for handling old and new data. In practice, this can typically be done simply by always using a "new" in-memory representation, upgrading datums on read, and downgrading on write as necessary.
210 |
211 | #### Variations: single reader/writer
212 |
213 | If only a single process interacts with the data whose format is changing, a much simpler version of the "Read Latest" migration can be used where steps 2 and 3 are combined into a single deploy that lazily migrates the data to the new format (data is always upgraded on read, and written in the new format).
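A hedged sketch of that single-deploy variation (`fetch` and `store!` are hypothetical storage functions, and the datum shapes follow the earlier examples):

```clojure
;; Lazy migration for a single reader/writer: upgrade on read, always write
;; back in the new format, so data migrates as it is touched.
(defn upgrade [datum]
  (if (= "1" (:version datum))
    {:version "2" :data (mapv (fn [topic] {:name topic}) (:data datum))}
    datum))

(defn read-and-migrate! [fetch store! user-id]
  (let [datum (upgrade (fetch user-id))]
    (store! user-id datum)   ; persist the upgraded form immediately
    datum))
```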
214 |
215 |
216 | # Pay it forward
217 |
218 | Ensuring **forward compatibility** means writing code that will accommodate future changes to data and clients with a minimum of backwards incompatibilities. This is very difficult because doing it perfectly involves predicting the future. However, there are a variety of things we can do to make it less likely that we'll fall into the more difficult situations described above.
219 |
220 | ## Don't be rash
221 |
222 | When you introduce a new API endpoint or format for data at rest, think hard. These decisions should generally be taken much more seriously than API or data decisions that won't leave a single process. You may end up with terabytes of data stored in this format, or stubborn users who refuse to update their two-year-old iOS client that accesses your API, and be stuck with this decision for a long time. Think it through, bounce it off a co-worker or two, and imagine yourself two years in the future working through the worst options in the previous section before committing.
223 |
224 | If you can test a new feature on the web (where you have to support old clients for a day), that's probably preferable to supporting your API for a year on iOS. If testing on iOS is the best option, try to design the client code such that it degrades gracefully if you remove the server-side API, so you're not stuck with it. If you're designing an experimental server-side feature, see if you can store the data off to the side (e.g., in a different location, rather than together with currently critical data) so that if the experiment fails you can simply delete it, rather than being saddled with the data forever or facing a huge migration project to get rid of it.
225 |
226 | ## Version up-front
227 |
228 | Most of the migration options above involve versioning of data and APIs. You'll probably make things simpler if you version up-front. Provide space to introduce new versions of API methods, and store your data with a version inside it, or inside a versioned bucket.
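For example (layout hypothetical), the version can live in the storage key or in the datum itself from day one:

```clojure
;; Multi-place: bake the version into the storage key / bucket name.
(defn interests-key [version user-id]
  (str "interests_" version "/" user-id))    ; e.g. "interests_1/u42"

;; In-place: bake the version into the datum itself.
(def example-datum
  {:version "1"
   :data    ["cats" "dogs"]})
```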
229 |
230 | ## Constrain access when appropriate
231 |
232 | As we've seen above, migrations become increasingly difficult as more processes need to read (and especially write) a given piece of data. Multiple writers always come with potential concurrency issues, and these are only made worse when the data format needs to change in the future. Hiding data behind an API rather than accessing it directly from a variety of systems can simplify concurrency issues and future migrations, although it can also increase the overall system complexity and add latency, so it should be considered carefully on a case-by-case basis.
233 |
234 | ## Be safe, but not overly strict
235 |
236 | Know that things will change in the future, and try to ensure that your code won't fail silently when dealing with incompatible code or data. This means [schematizing](https://github.com/plumatic/schema) API endpoints and reads and writes of data when it crosses process boundaries.
237 |
238 | That said, the best case above happens when your data and endpoints are both forward and backwards compatible. Overly strict schemas can make forward compatibility very difficult. Think about **schema evolution** in advance, and about ways that you can make your code flexible without hampering safety. For example, if you allow your code to accept arbitrary new keys on data, the system will be much easier to extend without explicit API or data versioning. (There is a lot of existing literature about this with regard to protocol buffers, where one common but controversial suggestion is to make all fields optional to maximize the potential for forward compatibility.) One potential "gotcha" here is what to do when writing back a modified datum with new fields, or sending datums with new fields back to an API in subsequent requests -- neither dropping nor including the field can always be correct, so there are no easy answers without thinking hard about your application.
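As one way to strike this balance with the [schema](https://github.com/plumatic/schema) library linked above, a map schema can be strict about the keys you rely on while leaving room for future additions (the field names here are hypothetical):

```clojure
(require '[schema.core :as s])

;; Strict about the fields we actually use, open to any extra keyword keys,
;; so a newer server can add fields without breaking this client's validation.
(s/defschema Interest
  {:name                  s/Str
   (s/optional-key :tags) [s/Str]
   s/Keyword              s/Any})

(s/validate [Interest]
            [{:name "cats" :tags ["animals"] :follower-count 12000}])
;;=> passes, even though :follower-count is not explicitly declared
```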
239 |
240 |
241 |
--------------------------------------------------------------------------------
/swe/migrations/1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/plumatic/eng-practices/8ef32d48dcb4a10f58f4dcd62895c1f07e0db1a5/swe/migrations/1.png
--------------------------------------------------------------------------------
/swe/migrations/1.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
--------------------------------------------------------------------------------
/swe/migrations/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/plumatic/eng-practices/8ef32d48dcb4a10f58f4dcd62895c1f07e0db1a5/swe/migrations/2.png
--------------------------------------------------------------------------------
/swe/migrations/2.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
--------------------------------------------------------------------------------
/swe/migrations/3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/plumatic/eng-practices/8ef32d48dcb4a10f58f4dcd62895c1f07e0db1a5/swe/migrations/3.png
--------------------------------------------------------------------------------
/swe/migrations/3.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
--------------------------------------------------------------------------------
/swe/migrations/4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/plumatic/eng-practices/8ef32d48dcb4a10f58f4dcd62895c1f07e0db1a5/swe/migrations/4.png
--------------------------------------------------------------------------------
/swe/migrations/4.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
--------------------------------------------------------------------------------
/swe/migrations/5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/plumatic/eng-practices/8ef32d48dcb4a10f58f4dcd62895c1f07e0db1a5/swe/migrations/5.png
--------------------------------------------------------------------------------
/swe/migrations/5.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
--------------------------------------------------------------------------------
/swe/migrations/6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/plumatic/eng-practices/8ef32d48dcb4a10f58f4dcd62895c1f07e0db1a5/swe/migrations/6.png
--------------------------------------------------------------------------------
/swe/migrations/6.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
--------------------------------------------------------------------------------
/swe/migrations/7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/plumatic/eng-practices/8ef32d48dcb4a10f58f4dcd62895c1f07e0db1a5/swe/migrations/7.png
--------------------------------------------------------------------------------
/swe/migrations/7.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
--------------------------------------------------------------------------------