├── .gitignore ├── README.md ├── project.clj ├── src └── interval_metrics │ ├── ThreadLocalRandom.java │ ├── core.clj │ └── measure.clj └── test └── interval_metrics ├── t_core.clj └── t_measure.clj /.gitignore: -------------------------------------------------------------------------------- 1 | .cake 2 | pom.xml 3 | *.jar 4 | *.tar 5 | *.tar.bz2 6 | *.war 7 | *.deb 8 | *~ 9 | .*.swp 10 | *.log 11 | .lein-repl-history 12 | lib 13 | classes 14 | build 15 | .lein-deps-sum 16 | .lein-failures 17 | protosrc/ 18 | reimann-*.zip 19 | /site 20 | site/** 21 | bench/** 22 | target/** 23 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # interval-metrics 2 | 3 | Data structures for measuring performance. Provides lockfree, high-performance 4 | mutable state, wrapped in idiomatic Clojure identities, without any external 5 | dependencies. 6 | 7 | Codahale's metrics library is designed for slowly-evolving metrics with stable 8 | dynamics, where multiple readers may request the current value of a metric at 9 | any time. Sometimes, you have a *single* reader which collects metrics over a 10 | specific time window--and you want the value of the metric at the end of the 11 | window to reflect observations from that window only, rather than including 12 | observations from prior windows. 13 | 14 | ## Clojars 15 | 16 | https://clojars.org/interval-metrics 17 | 18 | ## Rates 19 | 20 | The `Metric` protocol defines an identity which wraps some mutable 21 | measurements. Think of it like an atom or ref, only instead of (swap!) or 22 | (alter), it accepts new measurements to merge into its state. All values in 23 | this implementation are longs. 
Let's keep track of a *rate* of events per 24 | second: 25 | 26 | ``` clj 27 | user=> (use 'interval-metrics.core) 28 | nil 29 | user=> (def r (rate)) 30 | #'user/r 31 | ``` 32 | 33 | All operations are thread-safe, naturally. Let's tell the rate that we handled 34 | 2 events, then 5 more events, then... I dunno, negative 200 events. 35 | 36 | ``` clj 37 | user=> (update! r 2) 38 | # 39 | user=> (update! r 5) 40 | # 41 | user=> (update! r -200) 42 | # 43 | ``` 44 | 45 | Metrics implement IDeref, so you can always ask for their current value without changing anything. Rate's value is the sum of all updates, divided by the time since the last snapshot: 46 | 47 | ``` clj 48 | user=> (deref r) 49 | -21.14793138384688 50 | ``` 51 | 52 | The `Snapshot` protocol defines an operation which gets the current value of an 53 | identity *and atomically resets the value*. For instance, you might call 54 | (snapshot! some-rate) every second, and log the resulting value or send it to 55 | Graphite: 56 | 57 | ``` clj 58 | user=> (snapshot! r) 59 | -14.292050358924806 60 | user=> r 61 | # 62 | ``` 63 | 64 | Note that the rate became zero when we took a snapshot. It'll start 65 | accumulating new state afresh. 66 | 67 | ## Reservoirs 68 | 69 | Sometimes you want a probabilistic sample of numbers. `uniform-reservoir` creates a pool of longs which represents a uniformly distributed sample of the updates. Let's create a reservoir which holds three numbers: 70 | 71 | ``` clj 72 | user=> (def r (uniform-reservoir 3)) 73 | #'user/r 74 | ``` 75 | 76 | And update it with some values: 77 | 78 | ``` clj 79 | user=> (update! r 2) 80 | # 81 | user=> (update! r 1) 82 | # 83 | user=> (update! r 3) 84 | # 85 | ``` 86 | 87 | The *value* of a reservoir is a sorted list of the numbers in its current 88 | sample. What happens when we add more than three elements? 89 | 90 | ``` clj 91 | # 92 | user=> (update! r 4) 93 | # 94 | user=> (update! 
r 5) 95 | # 96 | ``` 97 | 98 | Sampling is *probabilistic*: 4 made it in, but 5 didn't. Updates are always 99 | constant time. 100 | 101 | ``` clj 102 | user=> (update! r -2) 103 | # 104 | user=> (update! r -3) 105 | # 106 | user=> (update! r -4) 107 | # 108 | ``` 109 | 110 | Snapshots (like deref) return the sorted list in O(n log n) time. Note that 111 | when we take a snapshot, the reservoir becomes empty again (nil). 112 | 113 | ``` clj 114 | user=> (snapshot! r) 115 | (-4 -2 3) 116 | user=> r 117 | # 118 | user=> (update! r 42) 119 | # 120 | ``` 121 | 122 | There are also functions to extract specific values from sorted sequences. To 123 | get the 95th percentile value: 124 | 125 | ```clj 126 | #'user/r 127 | user=> (dotimes [_ 10000] (update! r (rand 1000))) 128 | nil 129 | user=> (quantile @r 0.95) 130 | 965 131 | ``` 132 | 133 | Typically, you'd have a thread periodically take snapshots of your metrics and report some derived values. Here, we extract the median, the 95th and 99th percentiles, and the maximum seen since the last snapshot: 134 | 135 | ```clj 136 | user=> (map (partial quantile (snapshot! r)) [0.5 0.95 0.99 1]) 137 | (496 945 983 999) 138 | ``` 139 | 140 | ## Measuring your code's performance 141 | 142 | The `interval-metrics.measure` namespace has some helpers for measuring common 143 | things about your code. 144 | 145 | ```clj 146 | (use ['interval-metrics.measure :only '[periodically measure-latency]] 147 | ['interval-metrics.core :only '[snapshot! rate+latency]]) 148 | ``` 149 | 150 | Define a hybrid metric which tracks both rates and latency distributions. 151 | 152 | ```clj 153 | (def latencies (rate+latency)) 154 | ``` 155 | 156 | Start a thread to snapshot the latencies every 5 seconds. 157 | 158 | ```clj 159 | (def poller 160 | (periodically 5 161 | (clojure.pprint/pprint (snapshot! latencies)))) 162 | ``` 163 | 164 | The measure-latency macro times how long its body takes to execute, and updates 165 | the latencies metric each time. 
166 | 167 | ```clj 168 | (while true 169 | (measure-latency latencies 170 | (into [] (range 10)))) 171 | ``` 172 | 173 | You'll see a map like this printed every 5 seconds, showing the rate of calls 174 | per second, and the latency distribution, in milliseconds. 175 | 176 | ``` clj 177 | {:time 1369438387321/1000, 178 | :rate 316831.2493798337, 179 | :latencies 180 | {0.0 2397/1000000, 181 | 0.5 2463/1000000, 182 | 0.95 2641/1000000, 183 | 0.99 2371/500000, 184 | 0.999 9597/1000000}} 185 | ``` 186 | 187 | Don't like rationals? Who doesn't! It's easy to map those latencies to a less 188 | precise type: 189 | 190 | ```clj 191 | (measure/periodically 5 192 | (-> latencies 193 | metrics/snapshot! 194 | (update-in [:latencies] 195 | (partial map (juxt key (comp float val)))) 196 | pprint)) 197 | 198 | ... 199 | 200 | {:time 699491725283/500, 201 | :rate 554.994854087713, 202 | :latencies 203 | ([0.0 9.388433] 204 | [0.5 39.118896] 205 | [0.95 50.673603] 206 | [0.99 53.583065] 207 | [0.999 57.83346])} 208 | ``` 209 | 210 | Kill the loop with ^C, then shut down the poller thread by calling `(poller)`. 211 | 212 | You can configure the quantiles, reservoir size, and units for both the rate 213 | and the latencies by passing an options map to `rate+latency`: 214 | 215 | ```clj 216 | (def latencies (rate+latency {:latency-unit :microseconds 217 | :rate-unit :weeks})) 218 | ``` 219 | 220 | ## Performance 221 | 222 | All algorithms are lockless. I sacrifice some correctness for performance, but 223 | never drop writes. Synchronization drift should be negligible compared to 224 | typical (~1s) sampling intervals. Rates on a highly contended JVM are 225 | accurate, in experiments, to at least three sigfigs. 226 | 227 | With four threads updating, and one thread taking a snapshot every n seconds, 228 | my laptop can push between 10 and 18 million updates per second to a single rate+latency, reservoir, or rate object, saturating 99% of four cores. 
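
To make the "never drop writes" claim concrete, here is a minimal Java sketch of a counter that writers update without locks while a reader drains it concurrently. It is loosely modeled on the two `AtomicLong`s held by the `Rate` type in `core.clj`, but `RateSketch`, `drain`, and `stress` are hypothetical names invented for this illustration, not part of the library.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; the library's real Rate lives in core.clj.
public class RateSketch {
    private final AtomicLong count = new AtomicLong(0);
    private final AtomicLong time = new AtomicLong(System.nanoTime());

    // Writers only ever add to the counter; there is no lock to wait on.
    public void update(long n) {
        count.addAndGet(n);
    }

    // Atomically take the accumulated count and reset it to zero.
    public long drain() {
        return count.getAndSet(0);
    }

    // Events per second since the previous snapshot. The two getAndSet calls
    // are separate steps, so an update may land between them and be counted
    // in the next window: a little drift, but no update is lost.
    public double snapshot() {
        long now = System.nanoTime();
        long then = time.getAndSet(now);
        long c = drain();
        return c / ((now - then) * 1e-9);
    }

    // Starts `threads` writers doing `perThread` updates each while the
    // calling thread drains concurrently; returns the total events observed.
    public static long stress(int threads, long perThread) {
        RateSketch r = new RateSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (long j = 0; j < perThread; j++) r.update(1);
            });
            workers[i].start();
        }
        long seen = 0;
        for (Thread w : workers) {
            seen += r.drain(); // reader running alongside the writers
            try {
                w.join();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        return seen + r.drain(); // whatever arrived after the last drain
    }

    public static void main(String[] args) {
        System.out.println(stress(4, 100_000) == 400_000L);
    }
}
```

Summing every drained value accounts for all updates, which is the property the paragraph above describes; only the attribution of an update to a particular window can shift.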
229 | 230 | ## License 231 | 232 | Licensed under the Eclipse Public License. 233 | 234 | ## How to run the tests 235 | 236 | `lein midje` will run all tests. 237 | 238 | `lein midje namespace.*` will run only tests beginning with "namespace.". 239 | 240 | `lein midje :autotest` will run all the tests indefinitely. It sets up a 241 | watcher on the code files. If they change, only the relevant tests will be 242 | run again. 243 | -------------------------------------------------------------------------------- /project.clj: -------------------------------------------------------------------------------- 1 | (defproject interval-metrics "1.0.2-SNAPSHOT" 2 | :description "Time-windowed metric collection objects." 3 | :dependencies [] 4 | :java-source-paths ["src/interval_metrics"] 5 | :javac-options ["-target" "11" "-source" "11"] 6 | ; :warn-on-reflection true 7 | :profiles {:dev {:dependencies [[org.clojure/clojure "1.11.1"] 8 | [criterium "0.4.6"] 9 | [midje "1.10.6"]] 10 | :plugins [[lein-midje "3.2.1"]]}}) 11 | -------------------------------------------------------------------------------- /src/interval_metrics/ThreadLocalRandom.java: -------------------------------------------------------------------------------- 1 | /* 2 | * Written by Doug Lea with assistance from members of JCP JSR-166 3 | * Expert Group and released to the public domain, as explained at 4 | * http://creativecommons.org/publicdomain/zero/1.0/ 5 | * 6 | * http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ThreadLocalRandom.java?view=markup 7 | */ 8 | 9 | package com.aphyr.interval_metrics; 10 | 11 | import java.util.Random; 12 | 13 | // CHECKSTYLE:OFF 14 | /** 15 | * Copied directly from the JSR-166 project. 
16 | */ 17 | @SuppressWarnings("all") 18 | public class ThreadLocalRandom extends Random { 19 | // same constants as Random, but must be redeclared because private 20 | private static final long multiplier = 0x5DEECE66DL; 21 | private static final long addend = 0xBL; 22 | private static final long mask = (1L << 48) - 1; 23 | 24 | /** 25 | * The random seed. We can't use super.seed. 26 | */ 27 | private long rnd; 28 | 29 | /** 30 | * Initialization flag to permit calls to setSeed to succeed only while executing the Random 31 | * constructor. We can't allow others since it would cause setting seed in one part of a 32 | * program to unintentionally impact other usages by the thread. 33 | */ 34 | boolean initialized; 35 | 36 | // Padding to help avoid memory contention among seed updates in 37 | // different TLRs in the common case that they are located near 38 | // each other. 39 | private long pad0, pad1, pad2, pad3, pad4, pad5, pad6, pad7; 40 | 41 | /** 42 | * The actual ThreadLocal 43 | */ 44 | private static final ThreadLocal<ThreadLocalRandom> localRandom = 45 | new ThreadLocal<ThreadLocalRandom>() { 46 | protected ThreadLocalRandom initialValue() { 47 | return new ThreadLocalRandom(); 48 | } 49 | }; 50 | 51 | 52 | /** 53 | * Constructor called only by localRandom.initialValue. 54 | */ 55 | ThreadLocalRandom() { 56 | super(); 57 | initialized = true; 58 | } 59 | 60 | /** 61 | * Returns the current thread's {@code ThreadLocalRandom}. 62 | * 63 | * @return the current thread's {@code ThreadLocalRandom} 64 | */ 65 | public static ThreadLocalRandom current() { 66 | return localRandom.get(); 67 | } 68 | 69 | /** 70 | * Throws {@code UnsupportedOperationException}. Setting seeds in this generator is not 71 | * supported. 
72 | * 73 | * @throws UnsupportedOperationException always 74 | */ 75 | public void setSeed(long seed) { 76 | if (initialized) 77 | throw new UnsupportedOperationException(); 78 | rnd = (seed ^ multiplier) & mask; 79 | } 80 | 81 | protected int next(int bits) { 82 | rnd = (rnd * multiplier + addend) & mask; 83 | return (int) (rnd >>> (48 - bits)); 84 | } 85 | 86 | /** 87 | * Returns a pseudorandom, uniformly distributed value between the given least value (inclusive) 88 | * and bound (exclusive). 89 | * 90 | * @param least the least value returned 91 | * @param bound the upper bound (exclusive) 92 | * @return the next value 93 | * @throws IllegalArgumentException if least greater than or equal to bound 94 | */ 95 | public int nextInt(int least, int bound) { 96 | if (least >= bound) 97 | throw new IllegalArgumentException(); 98 | return nextInt(bound - least) + least; 99 | } 100 | 101 | /** 102 | * Returns a pseudorandom, uniformly distributed value between 0 (inclusive) and the specified 103 | * value (exclusive). 104 | * 105 | * @param n the bound on the random number to be returned. Must be positive. 106 | * @return the next value 107 | * @throws IllegalArgumentException if n is not positive 108 | */ 109 | public long nextLong(long n) { 110 | if (n <= 0) 111 | throw new IllegalArgumentException("n must be positive"); 112 | // Divide n by two until small enough for nextInt. On each 113 | // iteration (at most 31 of them but usually much less), 114 | // randomly choose both whether to include high bit in result 115 | // (offset) and whether to continue with the lower vs upper 116 | // half (which makes a difference only if odd). 117 | long offset = 0; 118 | while (n >= Integer.MAX_VALUE) { 119 | final int bits = next(2); 120 | final long half = n >>> 1; 121 | final long nextn = ((bits & 2) == 0) ? 
half : n - half; 122 | if ((bits & 1) == 0) 123 | offset += n - nextn; 124 | n = nextn; 125 | } 126 | return offset + nextInt((int) n); 127 | } 128 | 129 | // Stolen from java.util.Random#nextInt(). 130 | public static final int BITS_PER_LONG = 63; 131 | public static long nextLong2(final long n) { 132 | long bits, val; 133 | do { 134 | bits = ThreadLocalRandom.current().nextLong() & (~(1L << BITS_PER_LONG)); 135 | val = bits % n; 136 | } while (bits - val + (n - 1) < 0L); 137 | return val; 138 | } 139 | 140 | /** 141 | * Returns a pseudorandom, uniformly distributed value between the given least value (inclusive) 142 | * and bound (exclusive). 143 | * 144 | * @param least the least value returned 145 | * @param bound the upper bound (exclusive) 146 | * @return the next value 147 | * @throws IllegalArgumentException if least greater than or equal to bound 148 | */ 149 | public long nextLong(long least, long bound) { 150 | if (least >= bound) 151 | throw new IllegalArgumentException(); 152 | return nextLong(bound - least) + least; 153 | } 154 | 155 | /** 156 | * Returns a pseudorandom, uniformly distributed {@code double} value between 0 (inclusive) and 157 | * the specified value (exclusive). 158 | * 159 | * @param n the bound on the random number to be returned. Must be positive. 160 | * @return the next value 161 | * @throws IllegalArgumentException if n is not positive 162 | */ 163 | public double nextDouble(double n) { 164 | if (n <= 0) 165 | throw new IllegalArgumentException("n must be positive"); 166 | return nextDouble() * n; 167 | } 168 | 169 | /** 170 | * Returns a pseudorandom, uniformly distributed value between the given least value (inclusive) 171 | * and bound (exclusive). 
172 | * 173 | * @param least the least value returned 174 | * @param bound the upper bound (exclusive) 175 | * @return the next value 176 | * @throws IllegalArgumentException if least greater than or equal to bound 177 | */ 178 | public double nextDouble(double least, double bound) { 179 | if (least >= bound) 180 | throw new IllegalArgumentException(); 181 | return nextDouble() * (bound - least) + least; 182 | } 183 | 184 | private static final long serialVersionUID = -5851777807851030925L; 185 | } 186 | // CHECKSTYLE:ON 187 | -------------------------------------------------------------------------------- /src/interval_metrics/core.clj: -------------------------------------------------------------------------------- 1 | (ns interval-metrics.core 2 | (:import (clojure.lang Counted 3 | IDeref 4 | Seqable 5 | Indexed) 6 | (java.util.concurrent.atomic AtomicReference 7 | AtomicLong 8 | AtomicLongArray) 9 | (com.aphyr.interval_metrics ThreadLocalRandom))) 10 | 11 | ;; Protocols ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 12 | 13 | (defprotocol Metric 14 | (update! [this value] 15 | "Destructively updates the metric with a new value.")) 16 | 17 | (defprotocol Snapshot 18 | (snapshot! [this] 19 | "Returns a copy of the metric, resetting the original to a blank 20 | state.")) 21 | 22 | ;; Utilities ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 23 | 24 | (defn next-long 25 | "Returns a pseudo-random long uniformly between 0 and n-1." 26 | [^long n] 27 | (ThreadLocalRandom/nextLong2 n)) 28 | 29 | (def default-uniform-reservoir-size 1028) 30 | (def bits-per-long 63) 31 | 32 | (def time-units 33 | "A map from units to their size in nanoseconds." 
34 | {:nanoseconds 1 35 | :nanos 1 36 | :microseconds 1000 37 | :micros 1000 38 | :milliseconds 1000000 39 | :millis 1000000 40 | :seconds 1000000000 41 | :minutes 60000000000 42 | :hours 3600000000000 43 | :days 86400000000000 44 | :weeks 604800000000000}) 45 | 46 | (defn scale 47 | "Returns a conversion factor s from unit a to unit b, such that (* s 48 | measurement-in-a) is in units of b." 49 | [a b] 50 | (try 51 | (/ (get time-units a) 52 | (get time-units b)) 53 | (catch NullPointerException e 54 | (when-not (contains? time-units a) 55 | (throw (IllegalArgumentException. (str "Don't know unit " a)))) 56 | (when-not (contains? time-units b) 57 | (throw (IllegalArgumentException. (str "Don't know unit " b)))) 58 | (throw e)))) 59 | 60 | (defn quantile 61 | "Given a sorted Indexed collection, returns the value nearest to the given 62 | quantile in [0,1]." 63 | [sorted quantile] 64 | (assert (<= 0.0 quantile 1.0)) 65 | (let [n (count sorted)] 66 | (if (zero? n) 67 | nil 68 | (nth sorted 69 | (min (dec n) 70 | (int (Math/floor (* n quantile)))))))) 71 | 72 | (defn mean 73 | [coll] 74 | (/ (reduce + coll) 75 | (count coll))) 76 | 77 | ;; Atomic metrics ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 78 | 79 | ; Wraps another metric to support atomic snapshots. Uses a generator function 80 | ; to generate new (empty) metric objects when snapshot is called. 81 | (deftype AtomicMetric [^AtomicReference state generator] 82 | IDeref 83 | (deref [this] 84 | (deref (.get state))) 85 | 86 | Metric 87 | (update! [this value] 88 | (update! (.get state) value)) 89 | 90 | Snapshot 91 | (snapshot! 
[this] 92 | (-> state 93 | (.getAndSet (generator)) 94 | deref))) 95 | 96 | ; A little gross since this is a global preference, but hopefully nobody 97 | ; else will mind 98 | (prefer-method clojure.core/print-method clojure.lang.IRecord IDeref) 99 | 100 | (defn atomic 101 | "Given a generator function which creates blank metric objects, returns an 102 | AtomicMetric which provides consistent snapshots of that metric." 103 | [generator] 104 | (AtomicMetric. (AtomicReference. (generator)) 105 | generator)) 106 | 107 | 108 | ;; Rates ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 109 | 110 | (def rate-scale 111 | "The size of a nanosecond in seconds." 112 | 1.0e-9) 113 | 114 | (deftype Rate [^AtomicLong count ^AtomicLong time] 115 | IDeref 116 | (deref [this] 117 | (/ (.get count) 118 | rate-scale 119 | (- (System/nanoTime) (.get time)))) 120 | 121 | Metric 122 | (update! [this value] 123 | (.addAndGet count value) 124 | this) 125 | 126 | Snapshot 127 | (snapshot! [this] 128 | (locking this 129 | (let [now (System/nanoTime) 130 | t (.getAndSet time now) 131 | c (.getAndSet count 0)] 132 | (/ c rate-scale (- now t)))))) 133 | 134 | (defn ^Rate rate 135 | "Tracks the rate of values per second. (update! rate 3) adds 3 to the rate's 136 | counter. (deref rate) returns the number of values accumulated divided by the 137 | time since the last snapshot was taken. (snapshot! rate) returns the number 138 | of values per second since the last snapshot, and resets the count to zero." 139 | [] 140 | (Rate. (AtomicLong. 0) (AtomicLong. (System/nanoTime)))) 141 | 142 | ;; Reservoirs ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 143 | 144 | ; A mutable uniform Vitter R reservoir. 145 | (deftype UniformReservoir [^AtomicLong size ^AtomicLongArray values] 146 | Counted 147 | (count [this] 148 | (min (.get size) 149 | (.length values))) 150 | 151 | Metric 152 | (update! 
[this value] 153 | (let [s (.incrementAndGet size)] 154 | (if (<= s (.length values)) 155 | ; Not full 156 | (.set values (dec s) value) 157 | ; Full 158 | (let [r (next-long s)] 159 | (when (< r (.length values)) 160 | (.set values r value))))) 161 | this) 162 | 163 | IDeref 164 | ; Create a sorted array of longs. 165 | (deref [this] 166 | (let [n (count this) 167 | ary (long-array n)] 168 | (dotimes [i n] 169 | (aset-long ary i (.get values i))) 170 | (java.util.Arrays/sort ary) 171 | (seq ary)))) 172 | 173 | (defn uniform-reservoir* 174 | "Creates a new uniform reservoir, optionally of a given size. Does not 175 | support snapshots; simply accrues new values for all time." 176 | ([] 177 | (uniform-reservoir* default-uniform-reservoir-size)) 178 | ([^long size] 179 | (UniformReservoir. (AtomicLong. 0) 180 | (let [values (AtomicLongArray. size)] 181 | (dotimes [i size] 182 | (.set values i 0)) 183 | values)))) 184 | 185 | (defn uniform-reservoir 186 | "Creates a new uniform reservoir, optionally of a given size. Supports atomic 187 | snapshots." 188 | ([] 189 | (atomic uniform-reservoir*)) 190 | ([^long size] 191 | (atomic #(uniform-reservoir* size)))) 192 | 193 | ;; Combined rates and latencies ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 194 | (defrecord RateLatency [quantiles rate-scale latency-scale rate latencies] 195 | Metric 196 | (update! [this time] 197 | (update! latencies time) 198 | (update! rate 1)) 199 | 200 | Snapshot 201 | (snapshot! [this] 202 | (let [rate (snapshot! rate) 203 | latencies (snapshot! latencies) 204 | t (/ (System/currentTimeMillis) 1000)] 205 | {:time t 206 | :rate (* rate-scale rate) 207 | :latencies (->> quantiles 208 | (map (fn [q] 209 | [q (when-let [t (quantile latencies q)] 210 | (* t latency-scale))])) 211 | (into {}))}))) 212 | 213 | (defn rate+latency 214 | "Returns a snapshottable metric which tracks a request rate and the latency 215 | of requests. Calls to update! on this metric are assumed to be in nanoseconds. 
216 | When a snapshot is taken, returns a map like 217 | 218 | { 219 | ; The current posix time in seconds 220 | :time 273885884803/200 221 | ; The number of updates per second 222 | :rate 250/13 223 | ; A map of latency quantiles to latencies, in milliseconds 224 | } 225 | 226 | Options: 227 | 228 | :quantiles A list of the quantiles to sample. 229 | :rate-unit The unit to report rates in: e.g. :seconds 230 | :latency-unit The unit to report latencies in: e.g. :milliseconds 231 | :reservoir-size The size of the uniform reservoir used to collect latencies" 232 | ([] (rate+latency {})) 233 | ([opts] 234 | (let [quantiles (get opts :quantiles [0.0 0.5 0.95 0.99 0.999]) 235 | rate-scale (scale (get opts :rate-unit :seconds) :seconds) 236 | latency-scale (scale :nanos (get opts :latency-unit :millis)) 237 | rate (rate) 238 | reservoir (if-let [s (:reservoir-size opts)] 239 | (uniform-reservoir s) 240 | (uniform-reservoir))] 241 | (RateLatency. quantiles rate-scale latency-scale rate reservoir)))) 242 | -------------------------------------------------------------------------------- /src/interval_metrics/measure.clj: -------------------------------------------------------------------------------- 1 | (ns interval-metrics.measure 2 | "Functions and macros for instrumenting your code." 3 | (:use interval-metrics.core)) 4 | 5 | (defn unix-time 6 | "Returns the current unix time, in seconds." 7 | [] 8 | (/ (System/currentTimeMillis) 1000)) 9 | 10 | (defn linear-time 11 | "Returns a linear time source, in seconds." 12 | [] 13 | (/ (System/nanoTime) 1e9)) 14 | 15 | (defn periodically- 16 | "Spawns a thread which calls f every dt seconds. Returns a function which 17 | stops that thread." 18 | [dt f] 19 | (let [anchor (linear-time) 20 | running? (promise) 21 | looper (bound-fn looper [] 22 | ; Sleep until the next tick, or when the shutdown is 23 | ; delivered as false. 24 | (while (deref running? 
25 | (* 1000 (- dt (mod (- (linear-time) anchor) 26 | dt))) 27 | true) 28 | (try 29 | (f) 30 | (catch Throwable t))))] 31 | (.start (Thread. ^Runnable looper "interval-metrics periodic")) 32 | #(deliver running? false))) 33 | 34 | (defmacro periodically 35 | "Spawns a thread which executes body every dt seconds, after an initial dt 36 | delay. Returns a function which stops that thread." 37 | [dt & body] 38 | `(periodically- ~dt (bound-fn [] ~@body))) 39 | 40 | (defmacro measure-latency 41 | "Wraps body in a macro which reports its running time in nanoseconds to a 42 | Metric." 43 | [metric & body] 44 | `(let [t0# (System/nanoTime) 45 | value# (do ~@body) 46 | t1# (System/nanoTime)] 47 | (update! ~metric (- t1# t0#)) 48 | value#)) 49 | -------------------------------------------------------------------------------- /test/interval_metrics/t_core.clj: -------------------------------------------------------------------------------- 1 | (ns interval-metrics.t-core 2 | (:use midje.sweet 3 | interval-metrics.core 4 | criterium.core) 5 | (:import (interval_metrics.core Rate) 6 | (java.util.concurrent CountDownLatch 7 | TimeUnit))) 8 | 9 | (defn fill! [reservoir coll] 10 | (->> coll 11 | (partition-all 10000) 12 | (map shuffle) 13 | (map (fn [chunk] 14 | (dorun (pmap #(update! reservoir %) 15 | chunk)))) 16 | dorun) 17 | reservoir) 18 | 19 | (facts "reservoirs" 20 | (facts "a few numbers" 21 | (let [r (uniform-reservoir)] 22 | (fact (snapshot! r) => nil) 23 | 24 | (update! r 1) 25 | (fact (snapshot! r) => (just [1])) 26 | 27 | (update! r 1) 28 | (update! r -10) 29 | (update! r 23) 30 | (fact (snapshot! r) => (just [-10 1 23])))) 31 | 32 | (facts "linear naturals" 33 | (let [capacity 1000 34 | n (* capacity 10) 35 | r (uniform-reservoir capacity) 36 | tolerance capacity] 37 | 38 | ; Uniformly distributed numbers 39 | (fill! r (range n)) 40 | (let [snap (snapshot! r)] 41 | ; Flood with zeroes 42 | (dotimes [i capacity] 43 | (update! 
r 0)) 44 | 45 | (fact (count snap) => capacity) 46 | (fact (quantile snap 0) => (roughly 0 tolerance)) 47 | (fact (quantile snap 0.5) => (roughly (/ n 2) tolerance) 48 | (fact (quantile snap 1) => (roughly n tolerance)))) 49 | 50 | ; Check for those zeroes (to verify the snapshot isolated us) 51 | (let [snap (snapshot! r)] 52 | (fact (quantile snap 0) => 0) 53 | (fact (quantile snap 0.5) => 0) 54 | (fact (quantile snap 1) => 0)) 55 | 56 | ; Check to see that the most recent snapshot emptied things out 57 | (fact (snapshot! r) => nil)))) 58 | 59 | (defn nanos->seconds 60 | [nanos] 61 | (* 1e-9 nanos)) 62 | 63 | (defn expected-rate 64 | [total ^Rate r] 65 | (let [x (/ total (nanos->seconds 66 | (- (System/nanoTime) 67 | (.time r))))] 68 | (roughly x (/ x 20)))) 69 | 70 | (facts "rates" 71 | (let [r (rate) 72 | t0 (System/nanoTime) 73 | n 100000 74 | total (* 0.5 n (inc n)) 75 | _ (fill! r (range n)) 76 | ; Verify that dereferencing the rate in progress is correct 77 | _ (fact @r => (expected-rate total r)) 78 | _ (Thread/sleep 10) 79 | _ (fact @r => (expected-rate total r)) 80 | ; And take a snapshot 81 | snap (snapshot! r) 82 | t1 (System/nanoTime)] 83 | ; Since the snapshot completed, the rate should be zero. 84 | (fact @r => 0.0) 85 | 86 | ; But our snapshot should be the mean. 87 | (fact snap => (roughly (/ total (nanos->seconds (- t1 t0))))) 88 | 89 | ; Additional snapshots should be zero. 90 | (fact (snapshot! r) => 0.0))) 91 | 92 | (facts "rate+latency" 93 | (let [n 100000 94 | r (rate+latency {:rate-unit :nanoseconds 95 | :latency-unit :micros 96 | :quantiles [0 1/2 1]}) 97 | t0 (System/nanoTime) 98 | _ (dotimes [i n] (update! r i)) 99 | snap (snapshot! r) 100 | t1 (System/nanoTime)] 101 | (fact (:time snap) => pos?) 
102 | (fact (:rate snap) => (roughly (/ n (- t1 t0)) 103 | (/ (:rate snap) 5))) 104 | (let [ls (:latencies snap)] 105 | (fact (get ls 0) => (roughly (/ 0 n 1000) 1)) 106 | (fact (get ls 1/2) => (roughly (/ n 2 1000) (/ n 10000))) 107 | (fact (get ls 1) => (roughly (/ n 1000) 1))))) 108 | 109 | (defn stress [metric n] 110 | (let [threads 4 111 | per-worker (/ n threads) 112 | latch (CountDownLatch. threads) 113 | t0 (System/nanoTime) 114 | workers (map (fn [_] (future 115 | (dotimes [i per-worker] 116 | (update! metric i)) 117 | (.countDown latch))) 118 | (range threads)) 119 | reader (future 120 | (while (not (.await latch 1 TimeUnit/SECONDS)) 121 | ; (prn (snapshot! metric)) 122 | ))] 123 | (dorun (map deref workers)) 124 | @reader 125 | (let [t1 (System/nanoTime) 126 | dt (nanos->seconds (- t1 t0)) 127 | rate (/ n dt)] 128 | (println "Completed" n "updates in" (format "%.3f" dt) "s: " 129 | (format "%.3f" rate) "updates/sec")))) 130 | 131 | ;(facts "performance" 132 | ; (println "Benchmarking rate") 133 | ; (stress (rate) 1e9) 134 | ; 135 | ; (println "Benchmarking reservoir") 136 | ; (stress (uniform-reservoir) 1e9) 137 | ; 138 | ; (println "Benchmarking rate+latency") 139 | ; (stress (rate+latency) 1e9)) 140 | -------------------------------------------------------------------------------- /test/interval_metrics/t_measure.clj: -------------------------------------------------------------------------------- 1 | (ns interval-metrics.t-measure 2 | (:use midje.sweet 3 | interval-metrics.core 4 | criterium.core) 5 | (:import (interval_metrics.core Rate) 6 | (java.util.concurrent CountDownLatch 7 | TimeUnit))) 8 | --------------------------------------------------------------------------------