├── .gitignore
├── README.md
├── project.clj
├── src
│   └── interval_metrics
│       ├── ThreadLocalRandom.java
│       ├── core.clj
│       └── measure.clj
└── test
    └── interval_metrics
        ├── t_core.clj
        └── t_measure.clj


/.gitignore:
--------------------------------------------------------------------------------
 1 | .cake
 2 | pom.xml
 3 | *.jar
 4 | *.tar
 5 | *.tar.bz2
 6 | *.war
 7 | *.deb
 8 | *~
 9 | .*.swp
10 | *.log
11 | .lein-repl-history
12 | lib
13 | classes
14 | build
15 | .lein-deps-sum
16 | .lein-failures
17 | protosrc/
18 | reimann-*.zip
19 | /site
20 | site/**
21 | bench/**
22 | target/**
23 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # interval-metrics
  2 | 
  3 | Data structures for measuring performance. Provides lockfree, high-performance
  4 | mutable state, wrapped in idiomatic Clojure identities, without any external
  5 | dependencies.
  6 | 
  7 | Codahale's metrics library is designed for slowly-evolving metrics with stable
  8 | dynamics, where multiple readers may request the current value of a metric at
  9 | any time. Sometimes, you have a *single* reader which collects metrics over a
 10 | specific time window--and you want the value of the metric at the end of the
 11 | window to reflect observations from that window only, rather than including
 12 | observations from prior windows.
 13 | 
 14 | ## Clojars
 15 | 
 16 | https://clojars.org/interval-metrics
 17 | 
 18 | ## Rates
 19 | 
 20 | The `Metric` protocol defines an identity which wraps some mutable
 21 | measurements. Think of it like an atom or ref, only instead of (swap!) or
 22 | (alter), it accepts new measurements to merge into its state. All values in
 23 | this implementation are longs. Let's keep track of a *rate* of events per
 24 | second:
 25 | 
 26 | ``` clj
 27 | user=> (use 'interval-metrics.core)
 28 | nil
 29 | user=> (def r (rate))
 30 | #'user/r
 31 | ```
 32 | 
 33 | All operations are thread-safe, naturally. Let's tell the rate that we handled
 34 | 2 events, then 5 more events, then... I dunno, negative 200 events.
 35 | 
 36 | ``` clj
 37 | user=> (update! r 2)
 38 | #<Rate@df69935: 0.5458097134611932>
 39 | user=> (update! r 5)
 40 | #<Rate@df69935: 1.0991987655505422>
 41 | user=> (update! r -200)
 42 | #<Rate@df69935: -22.796646374437813>
 43 | ```
 44 | 
 45 | Metrics implement IDeref, so you can always ask for their current value without changing anything. Rate's value is the sum of all updates, divided by the time since the last snapshot:
 46 | 
 47 | ``` clj
 48 | user=> (deref r)
 49 | -21.14793138384688
 50 | ```
 51 | 
 52 | The `Snapshot` protocol defines an operation which gets the current value of an
 53 | identity *and atomically resets the value*. For instance, you might call
 54 | (snapshot! some-rate) every second, and log the resulting value or send it to
 55 | Graphite:
 56 | 
 57 | ``` clj
 58 | user=> (snapshot! r)
 59 | -14.292050358924806
 60 | user=> r
 61 | #<Rate@df69935: 0.0>
 62 | ```
 63 | 
 64 | Note that the rate became zero when we took a snapshot. It'll start
 65 | accumulating new state afresh.
 66 | 
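The snapshot-and-reset behaviour above can be sketched without the library. This is only an illustration, assuming a bare `java.util.concurrent.atomic.AtomicLong` as the state; `counter`, `record!`, and `take-snapshot!` are hypothetical names, not part of interval-metrics:

```clj
(import 'java.util.concurrent.atomic.AtomicLong)

;; Hypothetical stand-in for a metric: a single atomic counter.
(def ^AtomicLong counter (AtomicLong. 0))

(defn record!
  "Adds n to the counter."
  [n]
  (.addAndGet counter (long n)))

(defn take-snapshot!
  "Returns the accumulated total and resets the counter to zero in one atomic
  step, so an update lands either in this window or in the next one."
  []
  (.getAndSet counter 0))

(record! 2)
(record! 5)
(take-snapshot!) ; returns 7
(take-snapshot!) ; returns 0, because the window started afresh
```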
 67 | ## Reservoirs
 68 | 
 69 | Sometimes you want a probabilistic sample of numbers. `uniform-reservoir` creates a pool of longs which represents a uniformly distributed sample of the updates. Let's create a reservoir which holds three numbers:
 70 | 
 71 | ``` clj
 72 | user=> (def r (uniform-reservoir 3))
 73 | #'user/r
 74 | ```
 75 | 
 76 | And update it with some values:
 77 | 
 78 | ``` clj
 79 | user=> (update! r 2)
 80 | #<UniformReservoir@49f17507: (2)>
 81 | user=> (update! r 1)
 82 | #<UniformReservoir@49f17507: (1 2)>
 83 | user=> (update! r 3)
 84 | #<UniformReservoir@49f17507: (1 2 3)>
 85 | ```
 86 | 
 87 | The *value* of a reservoir is a sorted list of the numbers in its current
 88 | sample. What happens when we add more than three elements?
 89 | 
 90 | ``` clj
 91 | #<UniformReservoir@399d2d53: (1 2 3)>
 92 | user=> (update! r 4)
 93 | #<UniformReservoir@399d2d53: (1 3 4)>
 94 | user=> (update! r 5)
 95 | #<UniformReservoir@399d2d53: (1 3 4)>
 96 | ```
 97 | 
 98 | Sampling is *probabilistic*: 4 made it in, but 5 didn't. Updates are always
 99 | constant time.
100 | 
101 | ``` clj
102 | user=> (update! r -2)
103 | #<UniformReservoir@399d2d53: (-2 1 3)>
104 | user=> (update! r -3)
105 | #<UniformReservoir@399d2d53: (-2 1 3)>
106 | user=> (update! r -4)
107 | #<UniformReservoir@399d2d53: (-4 -2 3)>
108 | ```
109 | 
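Why some updates land in the sample and others don't can be seen in a small, purely functional sketch of reservoir sampling (Vitter's Algorithm R). It is an illustration only: `reservoir-step` and `sample-stream` are hypothetical names, and it works on an immutable vector rather than the library's atomic array:

```clj
(defn reservoir-step
  "One step of Algorithm R. `sample` is a vector of at most `capacity`
  elements; `seen` counts the values offered so far, including `x`."
  [sample capacity seen x]
  (if (<= seen capacity)
    (conj sample x)               ; not full yet: always keep x
    (let [i (long (rand seen))]   ; uniform index in [0, seen)
      (if (< i capacity)
        (assoc sample i x)        ; x replaces a random slot
        sample))))                ; x is discarded

(defn sample-stream
  "Feeds every value of coll through reservoir-step and returns the sample."
  [capacity coll]
  (first
    (reduce (fn [[sample seen] x]
              (let [seen (inc seen)]
                [(reservoir-step sample capacity seen x) seen]))
            [[] 0]
            coll)))

;; Three values drawn from 1..7; which ones varies from run to run.
(sort (sample-stream 3 (range 1 8)))
```

Once the sample is full, the k-th value offered is kept with probability capacity/k, which is why later updates are admitted less and less often.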
110 | Snapshots (like deref) return the sorted list in O(n log n) time. Note that
111 | when we take a snapshot, the reservoir becomes empty again (nil). 
112 | 
113 | ``` clj
114 | user=> (snapshot! r)
115 | (-4 -2 3)
116 | user=> r
117 | #<AtomicMetric@3018fc1a: nil>
118 | user=> (update! r 42)
119 | #<UniformReservoir@593c7b26: (42)>
120 | ```
121 | 
122 | There are also functions to extract specific values from sorted sequences. To
123 | get the 95th percentile value:
124 | 
125 | ```clj
126 | #'user/r
127 | user=> (dotimes [_ 10000] (update! r (rand 1000)))
128 | nil
129 | user=> (quantile @r 0.95)
130 | 965
131 | ```
132 | 
133 | Typically, you'd have a thread periodically take snapshots of your metrics and report some derived values. Here, we extract the median, 95th percentile, 99th percentile, and maximum seen since the last snapshot:
134 | 
135 | ```clj
136 | user=> (map (partial quantile (snapshot! r)) [0.5 0.95 0.99 1])
137 | (496 945 983 999)
138 | ```
139 | 
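The lookup behind `quantile` is a plain index computation on the sorted sample. A self-contained sketch of the same rule, assuming a sorted vector as input; `nearest-quantile` and `sample` are illustrative names:

```clj
(defn nearest-quantile
  "Returns the element of the sorted vector `sorted` at index floor(n * q),
  clamped to the last element; nil when the vector is empty."
  [sorted q]
  (let [n (count sorted)]
    (when (pos? n)
      (nth sorted (min (dec n) (int (Math/floor (* n q))))))))

(def sample [10 20 30 40 50 60 70 80 90 100])

(nearest-quantile sample 0.5)  ; index floor(10 * 0.5) = 5, the value 60
(nearest-quantile sample 0.95) ; index floor(9.5) = 9, the value 100
(nearest-quantile sample 1)    ; index 10 is clamped to 9, the value 100
(nearest-quantile [] 0.5)      ; nil
```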
140 | ## Measuring your code's performance
141 | 
142 | The `interval-metrics.measure` namespace has some helpers for measuring common
143 | things about your code.
144 | 
145 | ```clj
146 | (use ['interval-metrics.measure :only '[periodically measure-latency]]
147 |      ['interval-metrics.core    :only '[snapshot! rate+latency]])
148 | ```
149 | 
150 | Define a hybrid metric which tracks both rates and latency distributions.
151 | 
152 | ```clj
153 | (def latencies (rate+latency))
154 | ```
155 | 
156 | Start a thread to snapshot the latencies every 5 seconds.
157 | 
158 | ```clj
159 | (def poller
160 |   (periodically 5 
161 |     (clojure.pprint/pprint (snapshot! latencies))))
162 | ```
163 | 
164 | The measure-latency macro times how long its body takes to execute, and updates
165 | the latencies metric each time.
166 | 
167 | ```clj
168 | (while true
169 |   (measure-latency latencies
170 |     (into [] (range 10))))
171 | ```
172 | 
173 | You'll see a map like this printed every 5 seconds, showing the rate of calls
174 | per second, and the latency distribution, in milliseconds.
175 | 
176 | ``` clj
177 | {:time 1369438387321/1000,
178 |  :rate 316831.2493798337,
179 |  :latencies
180 |  {0.0 2397/1000000,
181 |   0.5 2463/1000000,
182 |   0.95 2641/1000000,
183 |   0.99 2371/500000,
184 |   0.999 9597/1000000}}
185 | ```
186 | 
187 | Don't like rationals? Who doesn't! It's easy to map those latencies to a less
188 | precise type:
189 | 
190 | ```clj
191 | (measure/periodically 5
192 |   (-> latencies
193 |       metrics/snapshot!
194 |       (update-in [:latencies]
195 |                  (partial map (juxt key (comp float val))))
196 |       pprint))
197 | 
198 | ...
199 | 
200 | {:time 699491725283/500,
201 |  :rate 554.994854087713,
202 |  :latencies
203 |  ([0.0 9.388433]
204 |   [0.5 39.118896]
205 |   [0.95 50.673603]
206 |   [0.99 53.583065]
207 |   [0.999 57.83346])}
208 | ```
209 | 
210 | Kill the loop with ^C, then shut down the poller thread by calling `(poller)`.
211 | 
212 | You can configure the quantiles, reservoir size, and units for both the rate
213 | and the latencies by passing an options map to `rate+latency`:
214 | 
215 | ```clj
216 | (def latencies (rate+latency {:latency-unit :microseconds
217 |                               :rate-unit    :weeks}))
218 | ```
219 | 
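The unit options come down to a ratio of unit sizes. A minimal sketch of that arithmetic, assuming a trimmed-down copy of the unit table; `unit-nanos` and `conversion-factor` are illustrative names, not the library's API:

```clj
;; Size of each unit in nanoseconds (a subset, for illustration).
(def unit-nanos
  {:nanos   1
   :micros  1000
   :millis  1000000
   :seconds 1000000000
   :minutes 60000000000})

(defn conversion-factor
  "Factor s such that (* s measurement-in-from) is expressed in `to`."
  [from to]
  (/ (unit-nanos from) (unit-nanos to)))

;; A 2500 ns latency expressed in microseconds, as an exact ratio.
(* 2500 (conversion-factor :nanos :micros))  ; 5/2

;; A rate of 10 events per second expressed per minute.
(* 10 (conversion-factor :minutes :seconds)) ; 600
```

Integer division in Clojure yields exact ratios, which is why snapshot maps contain values such as `2463/1000000` until you coerce them with `float` or `double`.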
220 | ## Performance
221 | 
222 | All algorithms are lockless. I sacrifice some correctness for performance, but
223 | never drop writes. Synchronization drift should be negligible compared to
224 | typical (~1s) sampling intervals. Rates on a highly contended JVM are
225 | accurate, in experiments, to at least three sigfigs.
226 | 
227 | With four threads updating, and one thread taking a snapshot every n seconds,
228 | my laptop can push between 10 and 18 million updates per second to a single rate+latency, reservoir, or rate object, saturating 99% of four cores.
229 | 
230 | ## License
231 | 
232 | Licensed under the Eclipse Public License.
233 | 
234 | ## How to run the tests
235 | 
236 | `lein midje` will run all tests.
237 | 
238 | `lein midje namespace.*` will run only tests beginning with "namespace.".
239 | 
240 | `lein midje :autotest` will run all the tests indefinitely. It sets up a
241 | watcher on the code files. If they change, only the relevant tests will be
242 | run again.
243 | 


--------------------------------------------------------------------------------
/project.clj:
--------------------------------------------------------------------------------
 1 | (defproject interval-metrics "1.0.2-SNAPSHOT"
 2 |   :description "Time-windowed metric collection objects."
 3 |   :dependencies []
 4 |   :java-source-paths ["src/interval_metrics"]
 5 |   :javac-options ["-target" "11" "-source" "11"]
 6 | ;  :warn-on-reflection true
 7 |   :profiles {:dev {:dependencies [[org.clojure/clojure "1.11.1"]
 8 |                                   [criterium "0.4.6"]
 9 |                                   [midje "1.10.6"]]
10 |                    :plugins [[lein-midje "3.2.1"]]}})
11 | 


--------------------------------------------------------------------------------
/src/interval_metrics/ThreadLocalRandom.java:
--------------------------------------------------------------------------------
  1 | /*
  2 |  * Written by Doug Lea with assistance from members of JCP JSR-166
  3 |  * Expert Group and released to the public domain, as explained at
  4 |  * http://creativecommons.org/publicdomain/zero/1.0/
  5 |  *
  6 |  * http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ThreadLocalRandom.java?view=markup
  7 |  */
  8 | 
  9 | package com.aphyr.interval_metrics;
 10 | 
 11 | import java.util.Random;
 12 | 
 13 | // CHECKSTYLE:OFF
 14 | /**
 15 |  * Copied directly from the JSR-166 project.
 16 |  */
 17 | @SuppressWarnings("all")
 18 | public class ThreadLocalRandom extends Random {
 19 |     // same constants as Random, but must be redeclared because private
 20 |     private static final long multiplier = 0x5DEECE66DL;
 21 |     private static final long addend = 0xBL;
 22 |     private static final long mask = (1L << 48) - 1;
 23 | 
 24 |     /**
 25 |      * The random seed. We can't use super.seed.
 26 |      */
 27 |     private long rnd;
 28 | 
 29 |     /**
 30 |      * Initialization flag to permit calls to setSeed to succeed only while executing the Random
 31 |      * constructor.  We can't allow others since it would cause setting seed in one part of a
 32 |      * program to unintentionally impact other usages by the thread.
 33 |      */
 34 |     boolean initialized;
 35 | 
 36 |     // Padding to help avoid memory contention among seed updates in
 37 |     // different TLRs in the common case that they are located near
 38 |     // each other.
 39 |     private long pad0, pad1, pad2, pad3, pad4, pad5, pad6, pad7;
 40 | 
 41 |     /**
 42 |      * The actual ThreadLocal
 43 |      */
 44 |     private static final ThreadLocal<ThreadLocalRandom> localRandom =
 45 |             new ThreadLocal<ThreadLocalRandom>() {
 46 |                 protected ThreadLocalRandom initialValue() {
 47 |                     return new ThreadLocalRandom();
 48 |                 }
 49 |             };
 50 | 
 51 | 
 52 |     /**
 53 |      * Constructor called only by localRandom.initialValue.
 54 |      */
 55 |     ThreadLocalRandom() {
 56 |         super();
 57 |         initialized = true;
 58 |     }
 59 | 
 60 |     /**
 61 |      * Returns the current thread's {@code ThreadLocalRandom}.
 62 |      *
 63 |      * @return the current thread's {@code ThreadLocalRandom}
 64 |      */
 65 |     public static ThreadLocalRandom current() {
 66 |         return localRandom.get();
 67 |     }
 68 | 
 69 |     /**
 70 |      * Throws {@code UnsupportedOperationException}.  Setting seeds in this generator is not
 71 |      * supported.
 72 |      *
 73 |      * @throws UnsupportedOperationException always
 74 |      */
 75 |     public void setSeed(long seed) {
 76 |         if (initialized)
 77 |             throw new UnsupportedOperationException();
 78 |         rnd = (seed ^ multiplier) & mask;
 79 |     }
 80 | 
 81 |     protected int next(int bits) {
 82 |         rnd = (rnd * multiplier + addend) & mask;
 83 |         return (int) (rnd >>> (48 - bits));
 84 |     }
 85 | 
 86 |     /**
 87 |      * Returns a pseudorandom, uniformly distributed value between the given least value (inclusive)
 88 |      * and bound (exclusive).
 89 |      *
 90 |      * @param least the least value returned
 91 |      * @param bound the upper bound (exclusive)
 92 |      * @return the next value
 93 |      * @throws IllegalArgumentException if least greater than or equal to bound
 94 |      */
 95 |     public int nextInt(int least, int bound) {
 96 |         if (least >= bound)
 97 |             throw new IllegalArgumentException();
 98 |         return nextInt(bound - least) + least;
 99 |     }
100 | 
101 |     /**
102 |      * Returns a pseudorandom, uniformly distributed value between 0 (inclusive) and the specified
103 |      * value (exclusive).
104 |      *
105 |      * @param n the bound on the random number to be returned.  Must be positive.
106 |      * @return the next value
107 |      * @throws IllegalArgumentException if n is not positive
108 |      */
109 |     public long nextLong(long n) {
110 |         if (n <= 0)
111 |             throw new IllegalArgumentException("n must be positive");
112 |         // Divide n by two until small enough for nextInt. On each
113 |         // iteration (at most 31 of them but usually much less),
114 |         // randomly choose both whether to include high bit in result
115 |         // (offset) and whether to continue with the lower vs upper
116 |         // half (which makes a difference only if odd).
117 |         long offset = 0;
118 |         while (n >= Integer.MAX_VALUE) {
119 |             final int bits = next(2);
120 |             final long half = n >>> 1;
121 |             final long nextn = ((bits & 2) == 0) ? half : n - half;
122 |             if ((bits & 1) == 0)
123 |                 offset += n - nextn;
124 |             n = nextn;
125 |         }
126 |         return offset + nextInt((int) n);
127 |     }
128 | 
129 |     // Stolen from java.util.Random#nextInt().
130 |     public static final int BITS_PER_LONG = 63;
131 |     public static long nextLong2(final long n) {
132 |       long bits, val;
133 |       do {
134 |         bits = ThreadLocalRandom.current().nextLong() & (~(1L << BITS_PER_LONG));
135 |         val = bits % n;
136 |       } while (bits - val + (n - 1) < 0L);
137 |       return val;
138 |     }
139 | 
140 |     /**
141 |      * Returns a pseudorandom, uniformly distributed value between the given least value (inclusive)
142 |      * and bound (exclusive).
143 |      *
144 |      * @param least the least value returned
145 |      * @param bound the upper bound (exclusive)
146 |      * @return the next value
147 |      * @throws IllegalArgumentException if least greater than or equal to bound
148 |      */
149 |     public long nextLong(long least, long bound) {
150 |         if (least >= bound)
151 |             throw new IllegalArgumentException();
152 |         return nextLong(bound - least) + least;
153 |     }
154 | 
155 |     /**
156 |      * Returns a pseudorandom, uniformly distributed {@code double} value between 0 (inclusive) and
157 |      * the specified value (exclusive).
158 |      *
159 |      * @param n the bound on the random number to be returned.  Must be positive.
160 |      * @return the next value
161 |      * @throws IllegalArgumentException if n is not positive
162 |      */
163 |     public double nextDouble(double n) {
164 |         if (n <= 0)
165 |             throw new IllegalArgumentException("n must be positive");
166 |         return nextDouble() * n;
167 |     }
168 | 
169 |     /**
170 |      * Returns a pseudorandom, uniformly distributed value between the given least value (inclusive)
171 |      * and bound (exclusive).
172 |      *
173 |      * @param least the least value returned
174 |      * @param bound the upper bound (exclusive)
175 |      * @return the next value
176 |      * @throws IllegalArgumentException if least greater than or equal to bound
177 |      */
178 |     public double nextDouble(double least, double bound) {
179 |         if (least >= bound)
180 |             throw new IllegalArgumentException();
181 |         return nextDouble() * (bound - least) + least;
182 |     }
183 | 
184 |     private static final long serialVersionUID = -5851777807851030925L;
185 | }
186 | // CHECKSTYLE:ON
187 | 


--------------------------------------------------------------------------------
/src/interval_metrics/core.clj:
--------------------------------------------------------------------------------
  1 | (ns interval-metrics.core
  2 |   (:import (clojure.lang Counted
  3 |                          IDeref
  4 |                          Seqable
  5 |                          Indexed)
  6 |            (java.util.concurrent.atomic AtomicReference
  7 |                                         AtomicLong
  8 |                                         AtomicLongArray)
  9 |            (com.aphyr.interval_metrics ThreadLocalRandom)))
 10 | 
 11 | ;; Protocols ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 12 | 
 13 | (defprotocol Metric
 14 |   (update! [this value]
 15 |            "Destructively updates the metric with a new value."))
 16 | 
 17 | (defprotocol Snapshot
 18 |   (snapshot! [this]
 19 |              "Returns a copy of the metric, resetting the original to a blank
 20 |              state."))
 21 | 
 22 | ;; Utilities ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 23 | 
 24 | (defn next-long
 25 |   "Returns a pseudo-random long uniformly between 0 and n-1."
 26 |   [^long n]
 27 |   (ThreadLocalRandom/nextLong2 n))
 28 | 
 29 | (def default-uniform-reservoir-size 1028)
 30 | (def bits-per-long 63)
 31 | 
 32 | (def time-units
 33 |   "A map from units to their size in nanoseconds."
 34 |   {:nanoseconds   1
 35 |    :nanos         1
 36 |    :microseconds  1000
 37 |    :micros        1000
 38 |    :milliseconds  1000000
 39 |    :millis        1000000
 40 |    :seconds       1000000000
 41 |    :minutes       60000000000
 42 |    :hours         3600000000000
 43 |    :days          86400000000000
 44 |    :weeks         604800000000000})
 45 | 
 46 | (defn scale
 47 |   "Returns a conversion factor s from unit a to unit b, such that (* s
 48 |   measurement-in-a) is in units of b."
 49 |   [a b]
 50 |   (try
 51 |     (/ (get time-units a)
 52 |        (get time-units b))
 53 |     (catch NullPointerException e
 54 |       (when-not (contains? time-units a)
 55 |         (throw (IllegalArgumentException. (str "Don't know unit " a))))
 56 |       (when-not (contains? time-units b)
 57 |         (throw (IllegalArgumentException. (str "Don't know unit " b))))
 58 |       (throw e))))
 59 | 
 60 | (defn quantile
 61 |   "Given a sorted Indexed collection, returns the value nearest to the given
 62 |   quantile in [0,1]."
 63 |   [sorted quantile]
 64 |   (assert (<= 0.0 quantile 1.0))
 65 |   (let [n (count sorted)]
 66 |     (if (zero? n)
 67 |       nil
 68 |       (nth sorted
 69 |            (min (dec n)
 70 |                 (int (Math/floor (* n quantile))))))))
 71 | 
 72 | (defn mean
 73 |   [coll]
 74 |   (/ (reduce + coll)
 75 |      (count coll)))
 76 | 
 77 | ;; Atomic metrics ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 78 | 
 79 | ; Wraps another metric to support atomic snapshots. Uses a generator function
 80 | ; to generate new (empty) metric objects when snapshot is called.
 81 | (deftype AtomicMetric [^AtomicReference state generator]
 82 |   IDeref
 83 |   (deref [this]
 84 |          (deref (.get state)))
 85 | 
 86 |   Metric
 87 |   (update! [this value]
 88 |            (update! (.get state) value))
 89 | 
 90 |   Snapshot
 91 |   (snapshot! [this]
 92 |              (-> state
 93 |                (.getAndSet (generator))
 94 |                deref)))
 95 | 
 96 | ; A little gross since this is a global preference, but hopefully nobody
 97 | ; else will mind
 98 | (prefer-method clojure.core/print-method clojure.lang.IRecord IDeref)
 99 | 
100 | (defn atomic
101 |   "Given a generator function which creates blank metric objects, returns an
102 |   AtomicMetric which provides consistent snapshots of that metric."
103 |   [generator]
104 |   (AtomicMetric. (AtomicReference. (generator))
105 |                  generator))
106 | 
107 | 
108 | ;; Rates ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
109 | 
110 | (def rate-scale
111 |   "The size of a nanosecond in seconds."
112 |   1.0e-9)
113 |   
114 | (deftype Rate [^AtomicLong count ^AtomicLong time]
115 |   IDeref
116 |   (deref [this]
117 |          (/ (.get count)
118 |             rate-scale
119 |             (- (System/nanoTime) (.get time))))
120 | 
121 |   Metric
122 |   (update! [this value]
123 |            (.addAndGet count value)
124 |            this)
125 | 
126 |   Snapshot
127 |   (snapshot! [this]
128 |              (locking this
129 |                (let [now  (System/nanoTime)
130 |                      t (.getAndSet time now)
131 |                      c  (.getAndSet count 0)]
132 |                  (/ c rate-scale (- now t))))))
133 | 
134 | (defn ^Rate rate
135 |   "Tracks the rate of values per second. (update! rate 3) adds 3 to the rate's
136 |   counter. (deref rate) returns the number of values accumulated divided by the
137 |   time since the last snapshot was taken. (snapshot! rate) returns the number
138 |   of values per second since the last snapshot, and resets the count to zero."
139 |   []
140 |   (Rate. (AtomicLong. 0) (AtomicLong. (System/nanoTime))))
141 | 
142 | ;; Reservoirs ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
143 | 
144 | ; A mutable uniform Vitter R reservoir.
145 | (deftype UniformReservoir [^AtomicLong size ^AtomicLongArray values]
146 |   Counted
147 |   (count [this]
148 |            (min (.get size)
149 |                 (.length values)))
150 |   
151 |   Metric
152 |   (update! [this value]
153 |            (let [s (.incrementAndGet size)]
154 |              (if (<= s (.length values))
155 |                ; Not full
156 |                (.set values (dec s) value)
157 |                ; Full
158 |                (let [r (next-long s)]
159 |                  (when (< r (.length values))
160 |                    (.set values r value)))))
161 |            this)
162 |   
163 |   IDeref
164 |   ; Create a sorted array of longs.
165 |   (deref [this]
166 |          (let [n (count this)
167 |                ary (long-array n)]
168 |            (dotimes [i n]
169 |              (aset-long ary i (.get values i)))
170 |            (java.util.Arrays/sort ary)
171 |            (seq ary))))
172 | 
173 | (defn uniform-reservoir*
174 |   "Creates a new uniform reservoir, optionally of a given size. Does not
175 |   support snapshots; simply accrues new values for all time."
176 |   ([]
177 |    (uniform-reservoir* default-uniform-reservoir-size))
178 |   ([^long size]
179 |    (UniformReservoir. (AtomicLong. 0)
180 |                       (let [values (AtomicLongArray. size)]
181 |                         (dotimes [i size]
182 |                           (.set values i 0))
183 |                         values))))
184 | 
185 | (defn uniform-reservoir
186 |   "Creates a new uniform reservoir, optionally of a given size. Supports atomic
187 |   snapshots."
188 |   ([]
189 |    (atomic uniform-reservoir*))
190 |   ([^long size]
191 |    (atomic #(uniform-reservoir* size))))
192 | 
193 | ;; Combined rates and latencies ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
194 | (defrecord RateLatency [quantiles rate-scale latency-scale rate latencies]
195 |   Metric
196 |   (update! [this time]
197 |            (update! latencies time)
198 |            (update! rate 1))
199 | 
200 |   Snapshot
201 |   (snapshot! [this]
202 |              (let [rate      (snapshot! rate)
203 |                    latencies (snapshot! latencies)
204 |                    t         (/ (System/currentTimeMillis) 1000)]
205 |                {:time t
206 |                 :rate (* rate-scale rate)
207 |                 :latencies (->> quantiles
208 |                              (map (fn [q]
209 |                                     [q (when-let [t (quantile latencies q)]
210 |                                          (* t latency-scale))]))
211 |                              (into {}))})))
212 | 
213 | (defn rate+latency
214 |   "Returns a snapshottable metric which tracks a request rate and the latency
215 |   of requests. Values passed to update! are assumed to be in nanoseconds.
216 |   When a snapshot is taken, returns a map like
217 | 
218 |   {
219 |    ; The current posix time in seconds
220 |    :time 273885884803/200
221 |    ; The number of updates per second
222 |    :rate 250/13
223 |    ; A map of latency quantiles to latencies, in milliseconds
224 |   } 
225 | 
226 |   Options:
227 |   
228 |   :quantiles       A list of the quantiles to sample.
229 |   :rate-unit       The unit to report rates in: e.g. :seconds
230 |   :latency-unit    The unit to report latencies in: e.g. :milliseconds
231 |   :reservoir-size  The size of the uniform reservoir used to collect latencies"
232 |   ([] (rate+latency {}))
233 |   ([opts]
234 |    (let [quantiles     (get opts :quantiles [0.0 0.5 0.95 0.99 0.999])
235 |          rate-scale    (scale (get opts :rate-unit :seconds) :seconds)
236 |          latency-scale (scale :nanos (get opts :latency-unit :millis))
237 |          rate          (rate)
238 |          reservoir     (if-let [s (:reservoir-size opts)]
239 |                          (uniform-reservoir s)
240 |                          (uniform-reservoir))]
241 |      (RateLatency. quantiles rate-scale latency-scale rate reservoir))))
242 | 


--------------------------------------------------------------------------------
/src/interval_metrics/measure.clj:
--------------------------------------------------------------------------------
 1 | (ns interval-metrics.measure
 2 |   "Functions and macros for instrumenting your code."
 3 |   (:use interval-metrics.core))
 4 | 
 5 | (defn unix-time
 6 |   "Returns the current unix time, in seconds."
 7 |   []
 8 |   (/ (System/currentTimeMillis) 1000))
 9 | 
10 | (defn linear-time
11 |   "Returns a linear time source, in seconds."
12 |   []
13 |   (/ (System/nanoTime) 1e9))
14 | 
15 | (defn periodically-
16 |   "Spawns a thread which calls f every dt seconds. Returns a function which
17 |   stops that thread."
18 |   [dt f]
19 |   (let [anchor   (linear-time)
20 |         running? (promise)
21 |         looper   (bound-fn looper []
22 |                    ; Sleep until the next tick, or when the shutdown is
23 |                    ; delivered as false.
24 |                    (while (deref running?
25 |                                  (* 1000 (- dt (mod (- (linear-time) anchor)
26 |                                                     dt)))
27 |                                  true)
28 |                      (try
29 |                        (f)
30 |                        (catch Throwable t))))]
31 |     (.start (Thread. ^Runnable looper "interval-metrics periodic"))
32 |     #(deliver running? false)))
33 | 
34 | (defmacro periodically
35 |   "Spawns a thread which executes body every dt seconds, after an initial dt
36 |   delay. Returns a function which stops that thread."
37 |   [dt & body]
38 |   `(periodically- ~dt (bound-fn [] ~@body)))
39 | 
40 | (defmacro measure-latency
41 |   "Wraps body in code which reports its running time in nanoseconds to a
42 |   Metric."
43 |   [metric & body]
44 |   `(let [t0#    (System/nanoTime)
45 |          value# (do ~@body)
46 |          t1#    (System/nanoTime)]
47 |      (update! ~metric (- t1# t0#))
48 |      value#))
49 | 


--------------------------------------------------------------------------------
/test/interval_metrics/t_core.clj:
--------------------------------------------------------------------------------
  1 | (ns interval-metrics.t-core
  2 |   (:use midje.sweet
  3 |         interval-metrics.core
  4 |         criterium.core)
  5 |   (:import (interval_metrics.core Rate)
  6 |            (java.util.concurrent CountDownLatch
  7 |                                  TimeUnit)))
  8 | 
  9 | (defn fill! [reservoir coll]
 10 |   (->> coll
 11 |     (partition-all 10000)
 12 |     (map shuffle)
 13 |     (map (fn [chunk]
 14 |            (dorun (pmap #(update! reservoir %)
 15 |                        chunk))))
 16 |     dorun)
 17 |   reservoir)
 18 | 
 19 | (facts "reservoirs"
 20 |        (facts "a few numbers"
 21 |               (let [r (uniform-reservoir)]
 22 |                 (fact (snapshot! r) => nil)
 23 | 
 24 |                 (update! r 1)
 25 |                 (fact (snapshot! r) => (just [1]))
 26 | 
 27 |                 (update! r 1)
 28 |                 (update! r -10)
 29 |                 (update! r 23)
 30 |                 (fact (snapshot! r) => (just [-10 1 23]))))
 31 | 
 32 |        (facts "linear naturals"
 33 |               (let [capacity 1000
 34 |                     n (* capacity 10)
 35 |                     r (uniform-reservoir capacity)
 36 |                     tolerance capacity]
 37 | 
 38 |                 ; Uniformly distributed numbers
 39 |                 (fill! r (range n))
 40 |                 (let [snap (snapshot! r)]
 41 |                   ; Flood with zeroes
 42 |                   (dotimes [i capacity]
 43 |                     (update! r 0))
 44 | 
 45 |                   (fact (count snap) => capacity)
 46 |                   (fact (quantile snap 0)   => (roughly 0 tolerance))
 47 |                   (fact (quantile snap 0.5) => (roughly (/ n 2) tolerance))
 48 |                   (fact (quantile snap 1)   => (roughly n tolerance)))
 49 | 
 50 |                 ; Check for those zeroes (to verify the snapshot isolated us)
 51 |                 (let [snap (snapshot! r)]
 52 |                   (fact (quantile snap 0) => 0)
 53 |                   (fact (quantile snap 0.5) => 0)
 54 |                   (fact (quantile snap 1) => 0))
 55 | 
 56 |                 ; Check to see that the most recent snapshot emptied things out
 57 |                 (fact (snapshot! r) => nil))))
 58 | 
 59 | (defn nanos->seconds
 60 |   [nanos]
 61 |   (* 1e-9 nanos))
 62 | 
 63 | (defn expected-rate
 64 |   [total ^Rate r]
 65 |   (let [x (/ total (nanos->seconds
 66 |                     (- (System/nanoTime)
 67 |                        (.time r))))]
 68 |     (roughly x (/ x 20))))
 69 | 
 70 | (facts "rates"
 71 |        (let [r (rate)
 72 |              t0 (System/nanoTime)
 73 |              n 100000
 74 |              total (* 0.5 n (dec n))
 75 |              _ (fill! r (range n))
 76 |              ; Verify that dereferencing the rate in progress is correct
 77 |              _ (fact @r => (expected-rate total r))
 78 |              _ (Thread/sleep 10)
 79 |              _ (fact @r => (expected-rate total r))
 80 |              ; And take a snapshot
 81 |              snap (snapshot! r)
 82 |              t1 (System/nanoTime)]
 83 |          ; Since the snapshot completed, the rate should be zero.
 84 |          (fact @r => 0.0)
 85 | 
 86 |          ; But our snapshot should be the mean.
 87 |          (fact snap => (roughly (/ total (nanos->seconds (- t1 t0)))))
 88 | 
 89 |          ; Additional snapshots should be zero.
 90 |          (fact (snapshot! r) => 0.0)))
 91 | 
 92 | (facts "rate+latency"
 93 |        (let [n  100000
 94 |              r  (rate+latency {:rate-unit   :nanoseconds
 95 |                                :latency-unit :micros
 96 |                                :quantiles    [0 1/2 1]})
 97 |              t0 (System/nanoTime)
 98 |              _  (dotimes [i n] (update! r i))
 99 |              snap (snapshot! r)
100 |              t1 (System/nanoTime)]
101 |          (fact (:time snap) => pos?)
102 |          (fact (:rate snap) => (roughly (/ n (- t1 t0))
103 |                                         (/ (:rate snap) 5)))
104 |          (let [ls (:latencies snap)]
105 |            (fact (get ls 0)   => (roughly (/ 0 n 1000) 1))
106 |            (fact (get ls 1/2) => (roughly (/ n 2 1000) (/ n 10000)))
107 |            (fact (get ls 1)   => (roughly (/ n 1000)   1)))))
108 | 
109 | (defn stress [metric n]
110 |   (let [threads 4
111 |         per-worker (/ n threads)
112 |         latch (CountDownLatch. threads)
113 |         t0 (System/nanoTime)
114 |         workers (map (fn [_] (future
115 |                                (dotimes [i per-worker]
116 |                                  (update! metric i))
117 |                                (.countDown latch)))
118 |                      (range threads))
119 |         reader (future
120 |                  (while (not (.await latch 1 TimeUnit/SECONDS))
121 | ;                   (prn (snapshot! metric))
122 |                         ))]
123 |     (dorun (map deref workers))
124 |     @reader
125 |     (let [t1 (System/nanoTime)
126 |           dt (nanos->seconds (- t1 t0))
127 |           rate (/ n dt)]
128 |       (println "Completed" n "updates in" (format "%.3f" dt) "s: "
129 |                (format "%.3f" rate) "updates/sec"))))
130 | 
131 | ;(facts "performance"
132 | ;       (println "Benchmarking rate")
133 | ;       (stress (rate) 1e9)
134 | ;
135 | ;       (println "Benchmarking reservoir")
136 | ;       (stress (uniform-reservoir) 1e9)
137 | ;       
138 | ;       (println "Benchmarking rate+latency")
139 | ;       (stress (rate+latency) 1e9))
140 | 


--------------------------------------------------------------------------------
/test/interval_metrics/t_measure.clj:
--------------------------------------------------------------------------------
1 | (ns interval-metrics.t-measure
2 |   (:use midje.sweet
3 |         interval-metrics.core
4 |         criterium.core)
5 |   (:import (interval_metrics.core Rate)
6 |            (java.util.concurrent CountDownLatch
7 |                                  TimeUnit)))
8 | 


--------------------------------------------------------------------------------