├── .github └── workflows │ └── ci.yml ├── .gitignore ├── CHANGELOG.md ├── LICENSE ├── Makefile ├── README.md ├── benchmarks └── overhead.py ├── pylintrc ├── pytest.ini ├── setup.py ├── src └── perf_timer │ ├── __init__.py │ ├── _histogram.py │ ├── _impl.py │ ├── _trio.py │ └── _version.py ├── test-requirements.in ├── test-requirements.txt ├── test-requirements_trio-0.11.txt └── tests ├── __init__.py ├── test_format_duration.py ├── test_histogram.py ├── test_observer.py ├── test_perf_timer.py └── test_trio.py /.github/workflows/ci.yml: -------------------------------------------------------------------------------- 1 | name: CI 2 | 3 | on: [push] 4 | 5 | jobs: 6 | build_and_test: 7 | runs-on: ubuntu-latest 8 | strategy: 9 | matrix: 10 | python-version: ['3.7', '3.8', '3.9', '3.10'] 11 | requirements: [test-requirements.txt] 12 | include: 13 | - python-version: '3.7' 14 | requirements: test-requirements_trio-0.11.txt 15 | steps: 16 | - uses: actions/checkout@v3 17 | - name: Setup Python 18 | uses: actions/setup-python@v4 19 | with: 20 | python-version: ${{ matrix.python-version }} 21 | cache: pip 22 | cache-dependency-path: ${{ matrix.requirements }} 23 | # TODO: unpinned "latest" build 24 | - run: pip install . -r ${{ matrix.requirements }} 25 | - run: make test lint 26 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .idea/ 2 | __pycache__/ 3 | -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | # Release history 2 | 3 | ## perf-timer 0.3.0 (pending) 4 | ### Fixed 5 | - fix `__del__` exception on badly-constructed instances 6 | 7 | ## perf-timer 0.2.2 (2021-03-02) 8 | ### Fixed 9 | - handle absence of `time.thread_timer()` gracefully. This timer, which is the 10 | default used by `ThreadPerfTimer`, may not be available in some OS X 11 | environments. 12 | 13 | ## perf-timer 0.2.1 (2020-11-09) 14 | ### Fixed 15 | - employ `atexit()` to robustly log results even when `__del__` finalizers are 16 | not called 17 | 18 | ## perf-timer 0.2.0 (2020-07-01) 19 | ### Added 20 | - perf-timer classes now support tracking various statistics 21 | including standard deviation and percentiles. The options are 22 | `AverageObserver`, `StdDevObserver` (default), and `HistogramObserver`. 23 | E.g. `PerfTimer(..., observer=HistogramObserver)`. 
24 | - Benchmark overhead of the various observer and timer types 25 | 26 | ## perf-timer 0.1.1 (2020-06-05) 27 | ### Fixed 28 | - Support rename of trio.hazmat to trio.lowlevel 29 | - Expose docs to help() 30 | 31 | ## perf-timer 0.1.0 (2019-07-31) 32 | Initial version 33 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 John Belmonte 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | all: test lint 2 | 3 | .PHONY: test 4 | test: 5 | PYTHONPATH=src python -m pytest --cov=src/ --no-cov-on-fail tests/ 6 | 7 | .PHONY: lint 8 | lint: 9 | PYTHONPATH=src python -m pylint src/ tests/ benchmarks/ 10 | 11 | # upgrade all deps: 12 | # make -W test-requirements.{in,txt} PIP_COMPILE_ARGS="-U" 13 | # upgrade specific deps: 14 | # make -W test-requirements.{in,txt} PIP_COMPILE_ARGS="-P foo" 15 | test-requirements.txt: setup.py test-requirements.in 16 | pip-compile -q $(PIP_COMPILE_ARGS) --output-file $@ $^ 17 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [![Build status](https://img.shields.io/github/workflow/status/belm0/perf-timer/CI)](https://github.com/belm0/perf-timer/actions/workflows/ci.yml?query=branch%3Amaster+) 2 | [![Package version](https://img.shields.io/pypi/v/perf-timer.svg)](https://pypi.org/project/perf-timer) 3 | [![Supported Python versions](https://img.shields.io/pypi/pyversions/perf-timer.svg)](https://pypi.org/project/perf-timer) 4 | 5 | # PerfTimer 6 | 7 | An indispensable performance timer for Python 8 | 9 | ## Background 10 | ### Taxonomy 11 | Three general tools should be employed to 12 | understand the CPU performance of your Python code: 13 | 1. **sampling profiler** - measures the relative 14 | distribution of time spent among function or 15 | lines of code during a program session. Limited by 16 | sampling resolution. Does not provide call counts, 17 | and results cannot be easily compared between sessions. 18 | 2. **microbenchmark timer** (timeit) - accurately 19 | times a contrived code snippet by running it repeatedly 20 | 3. 
**instrumenting timer** - accurately times a specific
21 | function or section of your code during a program
22 | session
23 | 
24 | _PerfTimer_ is a humble instance of #3. It's the easiest
25 | way (least amount of fuss and effort) to get insight into
26 | call count and execution time of a function or piece
27 | of code during a real session of your program.
28 | 
29 | Use cases include:
30 | * check the effects of algorithm tweaks, new implementations, etc.
31 | * confirm the performance of a library you are considering under
32 |   actual use by your app (as opposed to upstream's artificial
33 |   benchmarks)
34 | * measure CPU overhead of networking or other asynchronous I/O
35 |   (currently supported: OS threads, Trio async/await)
36 | 
37 | ### Yet another code timer?
38 | 
39 | It seems everyone has tried their hand at writing one of these timer
40 | utilities. Implementations can be found in public repos, snippets, and PyPI—
41 | there's even a Python feature request. That's not counting all the
42 | proprietary and one-off instances.
43 | 
44 | Features of this library:
45 | 
46 | * **flexible** - use as a context manager or function decorator;
47 |   pluggable logging, timer, and observer functions
48 | * **low overhead** (typically a few microseconds) - can be
49 |   employed in hot code paths or even enabled on production deployments
50 | * **async/await support** (Trio only) - first of its kind! Periods when a task
51 |   is sleeping, blocked by I/O, etc. will not be counted.
52 | * **percentile durations** - e.g. report the median and 90th percentile
53 |   execution time of the instrumented code. Implemented with a bounded-memory,
54 |   streaming histogram.
55 | 
56 | ## Usage
57 | 
58 | Typical usage is to create a `PerfTimer` instance at the global
59 | scope, so that aggregate execution time is reported at program termination:
60 | 
61 | ```python
62 | from perf_timer import PerfTimer
63 | 
64 | _timer = PerfTimer('process thumbnail')
65 | 
66 | def get_thumbnail_image(path):
67 |     img = cache.get_thumbnail(path)
68 |     if img is None:
69 |         img = read_image(path)
70 |         with _timer:
71 |             img.decode()
72 |             img.resize(THUMBNAIL_SIZE)
73 |         cache.set_thumbnail(img)
74 |     return img
75 | ```
76 | 
77 | When the program exits, assuming `get_thumbnail_image` was called
78 | several times, execution stats will be reported to stdout as
79 | follows:
80 | 
81 | ```
82 | timer "process thumbnail": avg 73.1 µs ± 18.0 µs, max 320.5 µs in 292 runs
83 | ```
84 | 
85 | ### decorator style
86 | 
87 | To instrument an entire function or class method, use `PerfTimer`
88 | as a decorator:
89 | 
90 | ```python
91 | @PerfTimer('get thumbnail')
92 | def get_thumbnail_image(path):
93 |     ...
94 | ```
95 | 
96 | ### histogram statistics
97 | 
98 | By default `PerfTimer` will track the average, standard deviation, and maximum
99 | of observed values.
Other available observers include `HistogramObserver`,
100 | which reports (customizable) percentiles:
101 | 
102 | ```python
103 | import random
104 | import time
105 | from perf_timer import PerfTimer, HistogramObserver
106 | 
107 | _timer = PerfTimer('test', observer=HistogramObserver, quantiles=(.5, .9))
108 | for _ in range(50):
109 |     with _timer:
110 |         time.sleep(random.expovariate(1/.1))
111 | 
112 | del _timer
113 | ```
114 | output:
115 | ```
116 | timer "test": avg 117ms ± 128ms, 50% ≤ 81.9ms, 90% ≤ 243ms in 50 runs
117 | ```
118 | 
119 | ### custom logging
120 | 
121 | A custom logging function may be passed to the `PerfTimer`
122 | constructor:
123 | 
124 | ```python
125 | import logging
126 | 
127 | _logger = logging.getLogger()
128 | _timer = PerfTimer('process thumbnail', log_fn=_logger.debug)
129 | ```
130 | 
131 | ### OS thread support
132 | 
133 | To minimize overhead, `PerfTimer` assumes single-thread access. Use
134 | `ThreadPerfTimer` in multi-thread scenarios:
135 | 
136 | ```python
137 | from perf_timer import ThreadPerfTimer
138 | 
139 | _timer = ThreadPerfTimer('process thumbnail')
140 | ```
141 | 
142 | ### async support
143 | 
144 | In the previous example, timing the entire function will include file
145 | I/O time since `PerfTimer` measures wall time by default. For programs
146 | which happen to do I/O via the Trio async/await library, you
147 | can use `TrioPerfTimer`, which measures time only when the current task
148 | is executing:
149 | 
150 | ```python
151 | from perf_timer import TrioPerfTimer
152 | 
153 | @TrioPerfTimer('get thumbnail')
154 | async def get_thumbnail_image(path):
155 |     img = cache.get_thumbnail(path)
156 |     if img is None:
157 |         img = await read_image(path)
158 |         img.decode()
159 |         img.resize(THUMBNAIL_SIZE)
160 |         cache.set_thumbnail(img)
161 |     return img
162 | ```
163 | 
164 | (Open challenge: support other async/await libraries)
165 | 
166 | ### trio_perf_counter()
167 | 
168 | This module also provides the `trio_perf_counter()` primitive.
169 | Following the semantics of the various performance counters in Python's `time`
170 | module, `trio_perf_counter()` provides high-resolution measurement of a Trio
171 | task's execution time, excluding periods where it's sleeping or blocked on I/O.
172 | (`TrioPerfTimer` uses this internally.)
173 | 
174 | ```python
175 | from perf_timer import trio_perf_counter
176 | 
177 | async def get_remote_object():
178 |     t0 = trio_perf_counter()
179 |     msg = await read_network_bytes()
180 |     obj = parse(msg)
181 |     print('task CPU usage (seconds):', trio_perf_counter() - t0)
182 |     return obj
183 | ```
184 | 
185 | ## Installation
186 | 
187 | ```shell
188 | pip install perf-timer
189 | ```
190 | 
191 | ## Measurement overhead
192 | 
193 | Measurement overhead is important. The smaller the timer's overhead, the
194 | less it interferes with the normal timing of your program, and the tighter
195 | the code loop it can be applied to.
196 | 
197 | The values below represent the typical overhead of one observation, as measured
198 | on ye old laptop (2014 MacBook Air 11 1.7GHz i7).
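These figures come from the bundled `benchmarks/overhead.py` (its invocation and output are shown below). For a one-off check in your own environment, the `measure_overhead()` helper it uses is also exported from the package; a minimal sketch, with the printout formatting being purely illustrative:

```python
from functools import partial
from perf_timer import PerfTimer, StdDevObserver, measure_overhead

# measure_overhead() returns the approximate cost of a single observation,
# in seconds, for timers produced by the given factory.
cost = measure_overhead(partial(PerfTimer, observer=StdDevObserver))
print(f'{cost * 1e6:.1f} µs per observation')
```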
199 | 200 | ``` 201 | $ pip install -r test-requirements.txt 202 | $ python benchmarks/overhead.py 203 | compare observers: 204 | PerfTimer(observer=AverageObserver): 1.5 µs 205 | PerfTimer(observer=StdDevObserver): 1.8 µs (default) 206 | PerfTimer(observer=HistogramObserver): 6.0 µs 207 | 208 | compare types: 209 | PerfTimer(observer=StdDevObserver): 1.8 µs 210 | ThreadPerfTimer(observer=StdDevObserver): 9.8 µs 211 | TrioPerfTimer(observer=StdDevObserver): 4.8 µs 212 | ``` 213 | 214 | ## TODO 215 | * features 216 | * faster HistogramObserver 217 | * more async/await support: asyncio, curio, etc. 218 | * [asyncio hint which no longer works](https://stackoverflow.com/revisions/34827291/3) 219 | * project infrastructure 220 | * code coverage integration 221 | * publish docs 222 | * type annotations and check 223 | -------------------------------------------------------------------------------- /benchmarks/overhead.py: -------------------------------------------------------------------------------- 1 | """Measure and report overhead of the perf-timer variants. 2 | 3 | The typical observation duration is reported for each case. 4 | 5 | Synopsis: 6 | $ python overhead.py 7 | compare observers: 8 | PerfTimer(observer=AverageObserver): 1.5 µs 9 | PerfTimer(observer=StdDevObserver): 1.8 µs (default) 10 | PerfTimer(observer=HistogramObserver): 6.0 µs 11 | 12 | compare types: 13 | PerfTimer(observer=StdDevObserver): 1.8 µs 14 | ThreadPerfTimer(observer=StdDevObserver): 9.8 µs 15 | TrioPerfTimer(observer=StdDevObserver): 4.8 µs 16 | """ 17 | 18 | from functools import partial 19 | 20 | import trio 21 | 22 | from perf_timer import (PerfTimer, ThreadPerfTimer, TrioPerfTimer, 23 | AverageObserver, StdDevObserver, HistogramObserver, 24 | measure_overhead) 25 | from perf_timer._impl import _format_duration 26 | 27 | 28 | async def main(): 29 | _format = partial(_format_duration, precision=2) 30 | default_observer = StdDevObserver 31 | print('compare observers:') 32 | timer_type = PerfTimer 33 | for observer in (AverageObserver, StdDevObserver, HistogramObserver): 34 | duration = measure_overhead(partial(timer_type, observer=observer)) 35 | item = f'{timer_type.__name__}(observer={observer.__name__}):' 36 | print(f' {item:45s}{_format(duration)}' 37 | f'{" (default)" if observer is default_observer else ""}') 38 | 39 | print() 40 | print('compare types:') 41 | observer = default_observer 42 | for timer_type in (PerfTimer, ThreadPerfTimer, TrioPerfTimer): 43 | duration = measure_overhead(partial(timer_type, observer=observer)) 44 | item = f'{timer_type.__name__}(observer={observer.__name__}):' 45 | print(f' {item:45s}{_format(duration)}') 46 | 47 | 48 | if __name__ == '__main__': 49 | trio.run(main) 50 | -------------------------------------------------------------------------------- /pylintrc: -------------------------------------------------------------------------------- 1 | [MASTER] 2 | disable=bad-whitespace, 3 | blacklisted-name, 4 | duplicate-code, 5 | invalid-name, 6 | fixme, 7 | missing-docstring, 8 | protected-access, 9 | too-few-public-methods, 10 | too-many-ancestors, 11 | too-many-arguments, 12 | too-many-boolean-expressions, 13 | too-many-branches, 14 | too-many-instance-attributes, 15 | too-many-lines, 16 | too-many-locals, 17 | too-many-nested-blocks, 18 | too-many-public-methods, 19 | too-many-return-statements, 20 | too-many-statements, 21 | unused-argument, 22 | wrong-spelling-in-comment, 23 | wrong-spelling-in-docstring 24 | 25 | [REPORTS] 26 | score=no 27 | 
-------------------------------------------------------------------------------- /pytest.ini: -------------------------------------------------------------------------------- 1 | [pytest] 2 | trio_mode = true 3 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import pathlib 2 | 3 | from setuptools import setup 4 | 5 | pkg_name = 'perf_timer' 6 | base_dir = pathlib.Path(__file__).parent 7 | with open(base_dir / 'src' / pkg_name / '_version.py') as f: 8 | version_globals = {} 9 | exec(f.read(), version_globals) 10 | version = version_globals['__version__'] 11 | 12 | setup( 13 | name=pkg_name, 14 | description='An indispensable performance timer for Python', 15 | long_description=''' 16 | PerfTimer is an instrumenting timer which provides an easy way to 17 | get insight into call count and average execution time of a function 18 | or piece of code during a real session of your program. 19 | 20 | Use cases include: 21 | * check the effects of algorithm tweaks, new implementations, etc. 22 | * confirm the performance of a library you are considering under 23 | actual use by your app (as opposed to upstream's artificial 24 | benchmarks) 25 | * measure CPU overhead of networking or other asynchronous I/O 26 | (currently supported: OS threads, Trio async/await) 27 | ''', 28 | long_description_content_type='text/markdown', 29 | version=version, 30 | author='John Belmonte', 31 | author_email='john@neggie.net', 32 | url='https://github.com/belm0/perf-timer', 33 | license='MIT', 34 | packages=[pkg_name], 35 | package_dir={'': 'src'}, 36 | install_requires=[], 37 | python_requires='>=3.7', 38 | classifiers=[ 39 | 'Development Status :: 3 - Alpha', 40 | 'Intended Audience :: Developers', 41 | 'License :: OSI Approved :: MIT License', 42 | 'Programming Language :: Python :: 3 :: Only', 43 | 'Programming Language :: Python :: 3.7', 44 | 'Programming Language :: Python :: 3.8', 45 | 'Programming Language :: Python :: 3.9', 46 | 'Programming Language :: Python :: 3.10', 47 | 'Framework :: Trio', 48 | ], 49 | ) 50 | -------------------------------------------------------------------------------- /src/perf_timer/__init__.py: -------------------------------------------------------------------------------- 1 | from ._impl import (PerfTimer, ThreadPerfTimer, AverageObserver, 2 | StdDevObserver, HistogramObserver, measure_overhead) 3 | try: 4 | from ._trio import trio_perf_counter, TrioPerfTimer 5 | except ImportError: 6 | pass 7 | from ._version import __version__ 8 | 9 | def _metadata_fix(): 10 | # don't do this for Sphinx case because it breaks "bysource" member ordering 11 | import sys # pylint: disable=import-outside-toplevel 12 | if 'sphinx' in sys.modules: 13 | return 14 | 15 | for name, value in globals().items(): 16 | if not name.startswith('_'): 17 | value.__module__ = __name__ 18 | 19 | _metadata_fix() 20 | -------------------------------------------------------------------------------- /src/perf_timer/_histogram.py: -------------------------------------------------------------------------------- 1 | from bisect import bisect_right 2 | from itertools import accumulate 3 | from math import inf, sqrt 4 | from numbers import Number 5 | 6 | 7 | class ApproximateHistogram: 8 | """ 9 | Streaming, approximate histogram 10 | 11 | Based on http://jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf 12 | 13 | Performance of adding a point is about 5x faster than 14 | 
https://github.com/carsonfarmer/streamhist (unmaintained). 15 | 16 | The output of quantile() will match numpy.quantile() exactly until 17 | the number of points reaches max_bins, and then gracefully transition 18 | to an approximation. 19 | """ 20 | 21 | def __init__(self, max_bins): 22 | self._max_bins = max_bins 23 | self._bins = [] # (point, count) 24 | self._costs = [] # item i is _bins[i+1].point - _bins[i].point 25 | self._count = 0 26 | # TODO: maintain min/max as bin entries with infinite merge cost 27 | self._min = inf 28 | self._max = -inf 29 | 30 | @staticmethod 31 | def _update_costs(costs, l, i, val): 32 | """update costs array to reflect l.insert(i, val)""" 33 | if i > 0: 34 | new_cost = val[0] - l[i - 1][0] 35 | costs.insert(i - 1, new_cost) 36 | if i < len(costs): 37 | costs[i] = l[i + 1][0] - val[0] 38 | elif len(l) > 1: 39 | costs.insert(0, l[1][0] - val[0]) 40 | # assert costs == approx([b - a for (a, _), (b, _) in zip(l, l[1:])], rel=1e-4) 41 | 42 | @staticmethod 43 | def _update_costs_for_merge(costs, l, i, val): 44 | """update costs array to reflect l[i:i+2] = (val, )""" 45 | # TODO: combine with update_costs() 46 | if 0 < i < len(costs) - 1: 47 | costs[i - 1:i + 2] = val[0] - l[i - 1][0], l[i + 1][0] - val[0] 48 | elif i > 0: 49 | costs[i - 1:i + 1] = (val[0] - l[i - 1][0], ) 50 | else: 51 | costs[i:i + 2] = (l[i + 1][0] - val[0], ) 52 | # assert costs == approx([b - a for (a, _), (b, _) in zip(l, l[1:])], rel=1e-4) 53 | 54 | @classmethod 55 | def _insert_with_cost(cls, costs, l, val): 56 | i = bisect_right(l, val) 57 | l.insert(i, val) 58 | cls._update_costs(costs, l, i, val) 59 | 60 | def add(self, point): 61 | """Add point to histogram""" 62 | # optimization: maintain cost array 63 | self._count += 1 64 | self._min = min(self._min, point) 65 | self._max = max(self._max, point) 66 | bins = self._bins 67 | costs = self._costs 68 | self._insert_with_cost(costs, bins, (point, 1)) 69 | if len(bins) > self._max_bins: 70 | i = costs.index(min(costs)) 71 | (q0, k0), (q1, k1) = bins[i:i+2] 72 | _count = k0 + k1 73 | median = (q0 * k0 + q1 * k1) / _count 74 | bins[i:i+2] = ((median, _count), ) 75 | self._update_costs_for_merge(costs, bins, i, (median, _count)) 76 | 77 | @property 78 | def count(self): 79 | """Return number of points represented by this histogram.""" 80 | return self._count 81 | 82 | @property 83 | def min(self): 84 | """Return minimum point represented by this histogram""" 85 | return self._min 86 | 87 | @property 88 | def max(self): 89 | """Return maximum point represented by this histogram""" 90 | return self._max 91 | 92 | def mean(self): 93 | """Return mean; O(max_bins) complexity.""" 94 | return sum(p * count for p, count in self._bins) / self._count 95 | 96 | def std(self): 97 | """Return standard deviation; O(max_bins) complexity.""" 98 | mean = self.mean() 99 | sum_squares = sum((p - mean) ** 2 * count for p, count in self._bins) 100 | return sqrt(sum_squares / self._count) 101 | 102 | def _quantile(self, sums, q): 103 | if q <= 0: 104 | return self._min 105 | if q >= 1: 106 | return self._max 107 | bins = self._bins 108 | target_sum = q * (self._count - 1) + 1 109 | i = bisect_right(sums, target_sum) - 1 110 | left = bins[i] if i >= 0 else (self._min, 0) 111 | right = bins[i+1] if i+1 < len(bins) else (self._max, 0) 112 | l0, r0 = left[0], right[0] 113 | l1, r1 = left[1], right[1] 114 | s = target_sum - (sums[i] if i >= 0 else 1) 115 | if l1 <= 1 and r1 <= 1: 116 | # We have exact info at this quantile. 
Match linear interpolation 117 | # strategy of numpy.quantile(). 118 | b = l0 + (r0 - l0) * s / r1 if r1 > 0 else l0 119 | else: 120 | if r1 == 1: 121 | # For exact bin on RHS, compensate for trapezoid interpolation using 122 | # only half of count. 123 | r1 = 2 124 | if l1 == r1: 125 | bp_ratio = s / l1 126 | else: 127 | bp_ratio = (l1 - (l1 ** 2 - 2 * s * (l1 - r1)) ** .5) / (l1 - r1) 128 | assert bp_ratio.imag == 0 129 | b = bp_ratio * (r0 - l0) + l0 130 | return b 131 | 132 | def sum(self): 133 | """Return sum of points; O(max_bins) complexity.""" 134 | return sum(x * count for x, count in self._bins) 135 | 136 | def quantile(self, q): 137 | """Return list of values at given quantile fraction(s); O(max_bins) complexity.""" 138 | # Deviation from Ben-Haim sum strategy: 139 | # * treat count 1 bins as "exact" rather than dividing the count at the point 140 | # * for neighboring exact bins, use simple linear interpolation matching 141 | # numpy.quantile() 142 | if isinstance(q, Number): 143 | q = (q, ) 144 | bins = self._bins 145 | sums = [x - (y/2 if y > 1 else 0) for x, (_, y) in \ 146 | zip(accumulate(bin[1] for bin in bins), bins)] 147 | return list(self._quantile(sums, q_item) for q_item in q) 148 | -------------------------------------------------------------------------------- /src/perf_timer/_impl.py: -------------------------------------------------------------------------------- 1 | import atexit 2 | import functools 3 | import math 4 | import timeit 5 | from weakref import WeakSet 6 | 7 | from contextvars import ContextVar 8 | from inspect import iscoroutinefunction 9 | from multiprocessing import Lock 10 | from time import perf_counter 11 | try: 12 | from time import thread_time 13 | except ImportError: 14 | # thread_time is not available in some OS X environments 15 | thread_time = None 16 | 17 | from perf_timer._histogram import ApproximateHistogram 18 | 19 | _start_time_by_instance = ContextVar('start_time', default={}) 20 | _timers = WeakSet() 21 | 22 | 23 | def _format_duration(duration, precision=3, delimiter=' '): 24 | """Returns human readable duration. 25 | 26 | >>> _format_duration(.0507) 27 | '50.7 ms' 28 | """ 29 | units = (('s', 1), ('ms', 1e3), ('µs', 1e6), ('ns', 1e9)) 30 | i = len(units) - 1 31 | if duration > 0: 32 | i = min(-int(math.floor(math.log10(duration)) // 3), i) 33 | symbol, scale = units[i] 34 | # constant precision, keeping trailing zeros but don't end in decimal point 35 | value = f'{duration * scale:#.{precision}g}'.rstrip('.') 36 | return f'{value}{delimiter}{symbol}' 37 | 38 | 39 | class _BetterContextDecorator: 40 | """ 41 | Equivalent to contextlib.ContextDecorator but supports decorating async 42 | functions. The context manager itself it still non-async. 43 | """ 44 | 45 | def _recreate_cm(self): 46 | return self 47 | 48 | def __call__(self, func): 49 | if iscoroutinefunction(func): 50 | @functools.wraps(func) 51 | async def inner(*args, **kwargs): 52 | with self._recreate_cm(): # pylint: disable=not-context-manager 53 | return await func(*args, **kwargs) 54 | else: 55 | @functools.wraps(func) 56 | def inner(*args, **kwargs): 57 | with self._recreate_cm(): # pylint: disable=not-context-manager 58 | return func(*args, **kwargs) 59 | return inner 60 | 61 | 62 | class _PerfTimerBase(_BetterContextDecorator): 63 | 64 | # NOTE: `observer` is handled by the metaclass, and `quantiles` is handled 65 | # by HistogramObserver. They're included here only for documentation. 
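    # Example construction, for illustration only (the timer name and custom
    # logger are hypothetical):
    #     PerfTimer('render frame', observer=HistogramObserver,
    #               quantiles=(.5, .99), log_fn=logging.getLogger('perf').debug)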
66 | def __init__(self, name, *, time_fn=perf_counter, log_fn=print, 67 | observer=None, quantiles=None): 68 | """ 69 | :param name: string used to annotate the timer output 70 | :param time_fn: optional function which returns the current time. 71 | (A None value will raise NotImplementedError.) 72 | :param log_fn: optional function which records the output string 73 | :param observer: mixin class to observe and summarize samples 74 | (AverageObserver|StdDevObserver|HistogramObserver, default StdDevObserver) 75 | :param quantiles: for HistogramObserver, a sequence of quantiles to report. 76 | Values must be in range [0..1] and monotonically increasing. 77 | (default: (0.5, 0.9, 0.98)) 78 | """ 79 | self._init_ok = False 80 | if not time_fn: 81 | raise NotImplementedError 82 | self.name = name 83 | self._time_fn = time_fn 84 | self._log_fn = log_fn 85 | self._startTimeByInstance = _start_time_by_instance.get() 86 | self._reported = False 87 | _timers.add(self) 88 | self._init_ok = True 89 | 90 | def _observe(self, duration): 91 | """called for each observed duration""" 92 | 93 | def _report(self): 94 | """called to report observation results""" 95 | 96 | def _report_once(self): 97 | if not self._reported: 98 | self._report() 99 | self._reported = True 100 | 101 | def __del__(self): 102 | if self._init_ok: 103 | self._report_once() 104 | 105 | def __enter__(self): 106 | if self in self._startTimeByInstance: 107 | raise RuntimeError('PerfTimer is not re-entrant') 108 | self._startTimeByInstance[self] = self._time_fn() 109 | 110 | def __exit__(self, exc_type, exc_value, traceback): 111 | current_time = self._time_fn() 112 | start_time = self._startTimeByInstance.pop(self) 113 | if exc_type is None: 114 | duration = current_time - start_time 115 | self._observe(duration) 116 | 117 | 118 | class AverageObserver(_PerfTimerBase): 119 | """Mixin which outputs mean and max 120 | 121 | output synopsis: 122 | timer "foo": avg 11.9 ms, max 12.8 ms in 10 runs 123 | """ 124 | 125 | def __init__(self, *args, **kwargs): 126 | super().__init__(*args, **kwargs) 127 | self._count = 0 128 | self._sum = 0 129 | self._max = -math.inf 130 | 131 | def _observe(self, duration): 132 | self._count += 1 133 | self._sum += duration 134 | self._max = max(self._max, duration) 135 | 136 | def _report(self): 137 | if self._count > 1: 138 | mean = self._sum / self._count 139 | self._log_fn(f'timer "{self.name}": ' 140 | f'avg {_format_duration(mean)}, ' 141 | f'max {_format_duration(self._max)} ' 142 | f'in {self._count} runs') 143 | elif self._count > 0: 144 | self._log_fn(f'timer "{self.name}": ' 145 | f'{_format_duration(self._sum)}') 146 | 147 | 148 | class StdDevObserver(_PerfTimerBase): 149 | """Mixin which outputs mean, stddev, and max 150 | 151 | 15 - 20% slower than _AverageObserver. 
152 | https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm 153 | 154 | output synopsis: 155 | timer "foo": avg 11.9 ms ± 961 µs, max 12.8 ms in 10 runs 156 | """ 157 | 158 | def __init__(self, *args, **kwargs): 159 | super().__init__(*args, **kwargs) 160 | self._count = 0 161 | self._mean = 0 162 | self._m2 = 0 163 | self._max = -math.inf 164 | 165 | def _observe(self, duration): 166 | self._count += 1 167 | delta = duration - self._mean 168 | self._mean += delta / self._count 169 | self._m2 += delta * (duration - self._mean) 170 | self._max = max(self._max, duration) 171 | 172 | def _report(self): 173 | if self._count > 1: 174 | std = math.sqrt(self._m2 / self._count) 175 | self._log_fn(f'timer "{self.name}": ' 176 | f'avg {_format_duration(self._mean)} ' 177 | f'± {_format_duration(std)}, ' 178 | f'max {_format_duration(self._max)} ' 179 | f'in {self._count} runs') 180 | elif self._count > 0: 181 | self._log_fn(f'timer "{self.name}": ' 182 | f'{_format_duration(self._mean)}') 183 | 184 | 185 | class HistogramObserver(_PerfTimerBase): 186 | """Mixin which outputs mean, standard deviation, and percentiles 187 | 188 | output synopsis: 189 | timer "foo": avg 11.9ms ± 961µs, 50% ≤ 12.6ms, 90% ≤ 12.7ms in 10 runs 190 | """ 191 | 192 | def __init__(self, *args, quantiles=(.5, .9, .98), max_bins=64, **kwargs): 193 | super().__init__(*args, **kwargs) 194 | self._init_ok = False 195 | if not all(0 <= x <= 1 for x in quantiles): 196 | raise ValueError('quantile values must be in the range [0, 1]') 197 | if not all(a < b for a, b in zip(quantiles, quantiles[1:])): 198 | raise ValueError('quantiles must be monotonically increasing') 199 | self._quantiles = quantiles 200 | self._hist = ApproximateHistogram(max_bins=max_bins) 201 | self._init_ok = True 202 | 203 | def _observe(self, duration): 204 | self._hist.add(duration) 205 | 206 | def _report(self): 207 | if self._hist.count > 1: 208 | _format = functools.partial(_format_duration, delimiter='') 209 | hist_quantiles = self._hist.quantile(self._quantiles) 210 | percentiles = [f"{pct * 100:.0f}% ≤ {_format(val)}" 211 | for pct, val in zip(self._quantiles, hist_quantiles)] 212 | self._log_fn(f'timer "{self.name}": ' 213 | f'avg {_format(self._hist.mean())} ' 214 | f'± {_format(self._hist.std())}, ' 215 | f'{", ".join(percentiles)} ' 216 | f'in {self._hist.count} runs') 217 | elif self._hist.count > 0: 218 | self._log_fn(f'timer "{self.name}": ' 219 | f'{_format_duration(self._hist.sum())}') 220 | 221 | 222 | class _ObservationLock(_PerfTimerBase): 223 | """Mixin which wraps _observe() in a lock""" 224 | 225 | def __init__(self, *args, **kwargs): 226 | super().__init__(*args, **kwargs) 227 | self._lock = Lock() 228 | 229 | def _observe(self, duration): 230 | with self._lock: 231 | super()._observe(duration) 232 | 233 | 234 | class _MixinMeta(type): 235 | """Metaclass which injects an observer mixin based on constructor arg""" 236 | 237 | @staticmethod 238 | @functools.lru_cache(maxsize=None) 239 | def _get_cls(observer, cls): 240 | # NOTE: bases ordering allows _ObservationLock to override the observer 241 | return type(cls.__name__, (cls, observer), {}) 242 | 243 | def __call__(cls, *args, observer=StdDevObserver, **kwargs): 244 | out_cls = _MixinMeta._get_cls(observer, cls) 245 | return type.__call__(out_cls, *args, **kwargs) 246 | 247 | 248 | class PerfTimer(_PerfTimerBase, metaclass=_MixinMeta): 249 | """Performance timer 250 | 251 | Use to measure performance of a block of code. 
The object will log 252 | performance stats when it is destroyed. 253 | 254 | perf_timer = PerfTimer('my code') 255 | ... 256 | def foo(): 257 | ... 258 | with perf_timer: 259 | # code under test 260 | ... 261 | 262 | It can also be used as a function decorator: 263 | 264 | @PerfTimer('my function') 265 | def foo(): 266 | ... 267 | 268 | This implementation is not thread safe. For a multi-threaded scenario, 269 | use `ThreadPerfTimer`. 270 | """ 271 | 272 | 273 | class ThreadPerfTimer(_ObservationLock, PerfTimer): 274 | """Variant of PerfTimer which measures CPU time of the current thread 275 | 276 | (Implemented with time.thread_time by default, which may not be available 277 | in some OS X environments.) 278 | """ 279 | 280 | def __init__(self, name, time_fn=thread_time, **kwargs): 281 | super().__init__(name, time_fn=time_fn, **kwargs) 282 | 283 | 284 | def measure_overhead(timer_factory): 285 | """Measure the overhead of a timer instance from the given factory. 286 | 287 | :param timer_factory: callable which returns a new timer instance 288 | :return: the average duration of one observation, in seconds 289 | """ 290 | timeit_timer = timeit.Timer( 291 | globals={'timer': timer_factory('foo', log_fn=lambda x: x)}, 292 | stmt='with timer: pass' 293 | ) 294 | n, duration = timeit_timer.autorange() 295 | min_duration = min([duration] + timeit_timer.repeat(number=n)) 296 | return min_duration / n 297 | 298 | 299 | @atexit.register 300 | def _atexit(): 301 | while _timers: 302 | _timers.pop()._report_once() 303 | -------------------------------------------------------------------------------- /src/perf_timer/_trio.py: -------------------------------------------------------------------------------- 1 | from collections import defaultdict 2 | from dataclasses import dataclass 3 | from time import perf_counter 4 | 5 | import trio 6 | try: 7 | import trio.lowlevel as trio_lowlevel 8 | except ImportError: 9 | import trio.hazmat as trio_lowlevel 10 | 11 | from ._impl import PerfTimer 12 | 13 | 14 | @dataclass 15 | class _TimeInfo: 16 | deschedule_start: float = 0 17 | elapsed_descheduled: float = 0 18 | 19 | 20 | class _DescheduledTimeInstrument(trio.abc.Instrument): 21 | """Trio instrument tracking elapsed descheduled time of selected tasks""" 22 | 23 | def __init__(self, time_fn=perf_counter): 24 | self._time_fn = time_fn 25 | self._info_by_task = defaultdict(_TimeInfo) 26 | 27 | def after_task_step(self, task): 28 | info = self._info_by_task.get(task) 29 | if info: 30 | info.deschedule_start = self._time_fn() 31 | 32 | def before_task_step(self, task): 33 | info = self._info_by_task.get(task) 34 | if info: 35 | info.elapsed_descheduled += self._time_fn() - info.deschedule_start 36 | 37 | def task_exited(self, task): 38 | # unregister instrument if there are no more traced tasks 39 | if self._info_by_task.pop(task, None) and not self._info_by_task: 40 | trio_lowlevel.remove_instrument(self) 41 | 42 | def get_elapsed_descheduled_time(self, task): 43 | """ 44 | Return elapsed descheduled time in seconds since the given task was 45 | first referenced by this method. The initial reference always returns 0. 46 | """ 47 | return self._info_by_task[task].elapsed_descheduled 48 | 49 | 50 | _instrument = _DescheduledTimeInstrument() 51 | 52 | 53 | def trio_perf_counter(): 54 | """Trio task-local equivalent of time.perf_counter(). 55 | 56 | For the current Trio task, return the value (in fractional seconds) of a 57 | performance counter, i.e. 
a clock with the highest available resolution to 58 | measure a short duration. It includes time elapsed during time.sleep, 59 | but not trio.sleep. The reference point of the returned value is 60 | undefined, so that only the difference between the results of consecutive 61 | calls is valid. 62 | 63 | Performance note: calling this function installs instrumentation on the 64 | Trio scheduler which may affect application performance. The 65 | instrumentation is automatically removed when the corresponding tasks 66 | have exited. 67 | """ 68 | trio_lowlevel.add_instrument(_instrument) 69 | task = trio_lowlevel.current_task() 70 | return perf_counter() - _instrument.get_elapsed_descheduled_time(task) 71 | 72 | 73 | class TrioPerfTimer(PerfTimer): 74 | """Variant of PerfTimer which measures Trio task time 75 | 76 | Use to measure performance of the current Trio tasks within a block 77 | of code. The object will log performance stats when it is destroyed. 78 | 79 | Measured time includes time.sleep, but not trio.sleep or other async 80 | blocking (due to I/O, child tasks, etc). 81 | 82 | perf_timer = PerfTimer('my code') 83 | ... 84 | async def foo(): 85 | ... 86 | with perf_timer: 87 | # code under test 88 | await trio.sleep(1) 89 | ... 90 | 91 | It can also be used as a function decorator: 92 | 93 | @PerfTimer('my function') 94 | async def foo(): 95 | ... 96 | """ 97 | 98 | def __init__(self, name, time_fn=trio_perf_counter, **kwargs): 99 | super().__init__(name, time_fn=time_fn, **kwargs) 100 | -------------------------------------------------------------------------------- /src/perf_timer/_version.py: -------------------------------------------------------------------------------- 1 | __version__ = '0.3.0-dev' 2 | -------------------------------------------------------------------------------- /test-requirements.in: -------------------------------------------------------------------------------- 1 | pylint 2 | pytest 3 | pytest-cov 4 | pytest-trio 5 | numpy 6 | trio 7 | -------------------------------------------------------------------------------- /test-requirements.txt: -------------------------------------------------------------------------------- 1 | # 2 | # This file is autogenerated by pip-compile with python 3.8 3 | # To update, run: 4 | # 5 | # pip-compile --output-file=test-requirements.txt setup.py test-requirements.in 6 | # 7 | astroid==2.8.3 8 | # via pylint 9 | async-generator==1.10 10 | # via 11 | # pytest-trio 12 | # trio 13 | attrs==21.2.0 14 | # via 15 | # outcome 16 | # pytest 17 | # trio 18 | coverage[toml]==6.0.2 19 | # via pytest-cov 20 | idna==3.3 21 | # via trio 22 | iniconfig==1.1.1 23 | # via pytest 24 | isort==5.9.3 25 | # via pylint 26 | lazy-object-proxy==1.6.0 27 | # via astroid 28 | mccabe==0.6.1 29 | # via pylint 30 | numpy==1.21.2 31 | # via -r test-requirements.in 32 | outcome==1.1.0 33 | # via 34 | # pytest-trio 35 | # trio 36 | packaging==21.0 37 | # via pytest 38 | platformdirs==2.4.0 39 | # via pylint 40 | pluggy==1.0.0 41 | # via pytest 42 | py==1.10.0 43 | # via pytest 44 | pylint==2.11.1 45 | # via -r test-requirements.in 46 | pyparsing==2.4.7 47 | # via packaging 48 | pytest==6.2.5 49 | # via 50 | # -r test-requirements.in 51 | # pytest-cov 52 | # pytest-trio 53 | pytest-cov==3.0.0 54 | # via -r test-requirements.in 55 | pytest-trio==0.7.0 56 | # via -r test-requirements.in 57 | sniffio==1.2.0 58 | # via trio 59 | sortedcontainers==2.4.0 60 | # via trio 61 | toml==0.10.2 62 | # via 63 | # pylint 64 | # pytest 65 | tomli==1.2.1 66 | # via coverage 
67 | trio==0.19.0 68 | # via 69 | # -r test-requirements.in 70 | # pytest-trio 71 | typing-extensions==3.10.0.2 72 | # via 73 | # astroid 74 | # pylint 75 | wrapt==1.13.2 76 | # via astroid 77 | 78 | # The following packages are considered to be unsafe in a requirements file: 79 | # setuptools 80 | -------------------------------------------------------------------------------- /test-requirements_trio-0.11.txt: -------------------------------------------------------------------------------- 1 | # 2 | # This file is autogenerated by pip-compile with python 3.8 3 | # To update, run: 4 | # 5 | # pip-compile --output-file=test-requirements.txt setup.py test-requirements.in 6 | # 7 | 8 | astroid==2.8.3 9 | # via pylint 10 | async-generator==1.10 11 | # via 12 | # pytest-trio 13 | # trio 14 | attrs==21.2.0 15 | # via 16 | # outcome 17 | # pytest 18 | # trio 19 | coverage[toml]==6.0.2 20 | # via pytest-cov 21 | idna==3.3 22 | # via trio 23 | iniconfig==1.1.1 24 | # via pytest 25 | isort==5.9.3 26 | # via pylint 27 | lazy-object-proxy==1.6.0 28 | # via astroid 29 | mccabe==0.6.1 30 | # via pylint 31 | numpy==1.21.2 32 | # via -r test-requirements.in 33 | outcome==1.1.0 34 | # via trio 35 | packaging==21.0 36 | # via pytest 37 | platformdirs==2.4.0 38 | # via pylint 39 | pluggy==1.0.0 40 | # via pytest 41 | py==1.10.0 42 | # via pytest 43 | pylint==2.11.1 44 | # via -r test-requirements.in 45 | pyparsing==2.4.7 46 | # via packaging 47 | pytest==6.2.5 48 | # via 49 | # -r test-requirements.in 50 | # pytest-cov 51 | # pytest-trio 52 | pytest-cov==3.0.0 53 | # via -r test-requirements.in 54 | pytest-trio==0.5.2 55 | # via -r test-requirements.in 56 | sniffio==1.2.0 57 | # via trio 58 | sortedcontainers==2.4.0 59 | # via trio 60 | toml==0.10.2 61 | # via 62 | # pylint 63 | # pytest 64 | tomli==1.2.1 65 | # via coverage 66 | trio==0.11.0 67 | # via 68 | # -r test-requirements.in 69 | # pytest-trio 70 | typing-extensions==3.10.0.2 71 | # via 72 | # astroid 73 | # pylint 74 | wrapt==1.13.2 75 | # via astroid 76 | 77 | # The following packages are considered to be unsafe in a requirements file: 78 | # setuptools 79 | -------------------------------------------------------------------------------- /tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/belm0/perf-timer/ad0d836e8f513a045e865fe1ac2236dc51ddd879/tests/__init__.py -------------------------------------------------------------------------------- /tests/test_format_duration.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from perf_timer._impl import _format_duration 4 | 5 | 6 | @pytest.mark.parametrize('in_, expected', [ 7 | (( 12, 3), '12.0 s' ), 8 | (( 120, 3), '120 s' ), 9 | (( .05071, 3), '50.7 ms'), 10 | (( .05071, 2), '51 ms' ), 11 | ((12.34e-6, 3), '12.3 µs'), 12 | ((1.234e-9, 3), '1.23 ns'), 13 | (( .5e-9, 3), '0.500 ns' ), 14 | (( 120, 3, 'X'), '120Xs'), 15 | ]) 16 | def test_format_duration(in_, expected): 17 | assert _format_duration(*in_) == expected 18 | -------------------------------------------------------------------------------- /tests/test_histogram.py: -------------------------------------------------------------------------------- 1 | import random 2 | 3 | import numpy as np 4 | import pytest 5 | from pytest import approx 6 | 7 | from perf_timer._histogram import ApproximateHistogram 8 | 9 | 10 | def test_histogram_exact(): 11 | random.seed(0) 12 | max_bins = 50 13 | h = 
ApproximateHistogram(max_bins=max_bins) 14 | points = [] 15 | 16 | for _ in range(max_bins): 17 | p = random.expovariate(1/5) 18 | points.append(p) 19 | h.add(p) 20 | 21 | q = [i / 100 for i in range(101)] 22 | assert h.quantile(q) == approx(np.quantile(points, q)) 23 | assert h.mean() == approx(np.mean(points)) 24 | assert h.std() == approx(np.std(points)) 25 | assert h.sum() == approx(np.sum(points)) 26 | assert h.min == min(points) 27 | assert h.max == max(points) 28 | assert h.count == max_bins 29 | 30 | 31 | @pytest.mark.parametrize("max_bins,num_points,expected_error", [ 32 | (50, 50, 1e-6), 33 | (100, 150, 1.5), 34 | (100, 1000, 1), 35 | (250, 1000, .5), 36 | ]) 37 | def test_histogram_approx(max_bins, num_points, expected_error): 38 | random.seed(0) 39 | h = ApproximateHistogram(max_bins=max_bins) 40 | points = [] 41 | 42 | for _ in range(num_points): 43 | p = random.expovariate(1/5) 44 | points.append(p) 45 | h.add(p) 46 | 47 | q = [i / 100 for i in range(101)] 48 | err_sum = 0 # avg percent error across samples 49 | for p, b, b_np, b_np_min, b_np_max in zip( 50 | q, 51 | h.quantile(q), 52 | np.quantile(points, q), 53 | np.quantile(points, [0] * 7 + q), 54 | np.quantile(points, q[7:] + [1] * 7)): 55 | err_denom = b_np_max - b_np_min 56 | err_sum += abs(b - b_np) / err_denom 57 | assert err_sum <= expected_error 58 | assert h.mean() == approx(np.mean(points)) 59 | assert h.std() == approx(np.std(points), rel=.05) 60 | assert h.sum() == approx(np.sum(points)) 61 | assert h.min == min(points) 62 | assert h.max == max(points) 63 | assert h.count == num_points 64 | -------------------------------------------------------------------------------- /tests/test_observer.py: -------------------------------------------------------------------------------- 1 | import random 2 | from unittest.mock import Mock 3 | 4 | import numpy 5 | import pytest 6 | 7 | from perf_timer import AverageObserver, StdDevObserver, HistogramObserver 8 | 9 | 10 | @pytest.mark.parametrize('n', (0, 1, 100)) 11 | def test_average_observer(n): 12 | name = 'foo' 13 | log_fn = Mock() 14 | observer = AverageObserver(name, log_fn=log_fn) 15 | points = [] 16 | for _ in range(n): 17 | x = random.expovariate(1/5) # expected average is 5s 18 | observer._observe(x) 19 | points.append(x) 20 | observer._report() 21 | if n > 1: 22 | # timer "foo": avg 11.9 ms, max 12.8 ms in 10 runs 23 | log_fn.assert_called_once_with( 24 | f'timer "{name}": ' 25 | f'avg {sum(points)/n:.2f} s, ' 26 | f'max {max(points):.1f} s ' 27 | f'in {n} runs') 28 | elif n > 0: 29 | # timer "foo": 12.8 ms 30 | log_fn.assert_called_once_with( 31 | f'timer "{name}": ' 32 | f'{points[0]:.1f} s') 33 | else: 34 | log_fn.assert_not_called() 35 | 36 | 37 | @pytest.mark.parametrize('n', (0, 1, 100)) 38 | def test_std_dev_observer(n): 39 | name = 'foo' 40 | log_fn = Mock() 41 | observer = StdDevObserver(name, log_fn=log_fn) 42 | points = [] 43 | for _ in range(n): 44 | x = random.expovariate(1/5) # expected average is 5s 45 | observer._observe(x) 46 | points.append(x) 47 | observer._report() 48 | if n > 1: 49 | # timer "foo": avg 11.9 ms ± 961 µs, max 12.8 ms in 10 runs 50 | log_fn.assert_called_once_with( 51 | f'timer "{name}": ' 52 | f'avg {sum(points)/n:.2f} s ± {numpy.std(points):.2f} s, ' 53 | f'max {max(points):.1f} s ' 54 | f'in {n} runs') 55 | elif n > 0: 56 | # timer "foo": 12.8 ms 57 | log_fn.assert_called_once_with( 58 | f'timer "{name}": ' 59 | f'{points[0]:.1f} s') 60 | else: 61 | log_fn.assert_not_called() 62 | 63 | 64 | @pytest.mark.parametrize('n', (0, 1, 
50)) 65 | def test_histogram_observer(n): 66 | name = 'foo' 67 | # cheating to allow a simple string compare: 68 | # * requested quantiles and distribution are such that output precision is fixed 69 | # * since n < max bins of the approximate histogram, quantile output will be exact 70 | quantiles = (.4, .5, .6) 71 | log_fn = Mock() 72 | observer = HistogramObserver(name, quantiles=quantiles, log_fn=log_fn) 73 | points = [] 74 | for _ in range(n): 75 | x = random.expovariate(1/5) # expected average is 5s 76 | observer._observe(x) 77 | points.append(x) 78 | observer._report() 79 | if n > 1: 80 | q_expected = numpy.quantile(points, quantiles) 81 | # timer "foo": avg 11.9ms ± 961µs, 50% ≤ 12.6ms, 90% ≤ 12.7ms in 10 runs 82 | log_fn.assert_called_once_with( 83 | f'timer "{name}": ' 84 | f'avg {sum(points)/n:.2f}s ± {numpy.std(points):.2f}s, ' 85 | f'{", ".join(f"{q:.0%} ≤ {out:.2f}s" for q, out in zip(quantiles, q_expected))} ' 86 | f'in {n} runs') 87 | elif n > 0: 88 | # timer "foo": 12.8 ms 89 | log_fn.assert_called_once_with( 90 | f'timer "{name}": ' 91 | f'{points[0] * 1000:.1f} ms') 92 | else: 93 | log_fn.assert_not_called() 94 | 95 | 96 | def test_histogram_observer_bad_input(): 97 | with pytest.raises(ValueError): 98 | HistogramObserver('foo', quantiles=(.5, 2)) 99 | with pytest.raises(ValueError): 100 | HistogramObserver('foo', quantiles=(.6, .5)) 101 | -------------------------------------------------------------------------------- /tests/test_perf_timer.py: -------------------------------------------------------------------------------- 1 | from functools import partial 2 | from unittest.mock import Mock, patch 3 | 4 | import pytest 5 | 6 | from perf_timer import PerfTimer, ThreadPerfTimer, \ 7 | AverageObserver, StdDevObserver, HistogramObserver, \ 8 | measure_overhead 9 | from perf_timer import _impl 10 | 11 | 12 | class _Containing: 13 | """Argument matcher for Mock""" 14 | 15 | def __init__(self, value): 16 | self.value = value 17 | 18 | def __eq__(self, other): 19 | return self.value in other 20 | 21 | def __repr__(self): 22 | return f'{self.__class__.__name__}("{self.value}")' 23 | 24 | 25 | class _NotContaining: 26 | """Argument matcher for Mock""" 27 | 28 | def __init__(self, value): 29 | self.value = value 30 | 31 | def __eq__(self, other): 32 | return self.value not in other 33 | 34 | def __repr__(self): 35 | return f'{self.__class__.__name__}("{self.value}")' 36 | 37 | 38 | def test_perf_timer(): 39 | # time_fn is called on enter and exit of each with block 40 | time_fn = Mock(side_effect=[10, 15, 41 | 15, 25]) 42 | log_fn = Mock() 43 | timer = PerfTimer('foo', observer=AverageObserver, time_fn=time_fn, 44 | log_fn=log_fn) 45 | 46 | for _ in range(2): 47 | with timer: 48 | pass 49 | 50 | assert timer._count == 2 51 | assert timer._sum == 15 52 | assert timer._max == 10 53 | timer._report() 54 | log_fn.assert_called_once_with(_Containing('in 2 runs')) 55 | 56 | 57 | def test_perf_timer_decorator(): 58 | time_fn = Mock(side_effect=[10, 15, 59 | 15, 25]) 60 | log_fn = Mock() 61 | 62 | @PerfTimer('foo', time_fn=time_fn, log_fn=log_fn) 63 | def foo(): 64 | pass 65 | 66 | for _ in range(2): 67 | foo() 68 | 69 | del foo 70 | log_fn.assert_called_once_with(_Containing('in 2 runs')) 71 | 72 | 73 | def test_perf_timer_one_run(): 74 | log_fn = Mock() 75 | timer = PerfTimer('foo', log_fn=log_fn) 76 | 77 | with timer: 78 | pass 79 | 80 | assert timer._count == 1 81 | timer._report() 82 | log_fn.assert_called_once_with(_NotContaining(' in ')) 83 | 84 | 85 | def 
test_perf_timer_non_reentrant(): 86 | timer = PerfTimer('foo') 87 | with timer: 88 | with pytest.raises(RuntimeError): 89 | with timer: 90 | pass 91 | 92 | 93 | def test_thread_perf_timer_lock(): 94 | lock_count = 0 95 | 96 | class MockLock: 97 | def __enter__(self): 98 | pass 99 | def __exit__(self, *args): 100 | nonlocal lock_count 101 | lock_count += 1 102 | 103 | timer = ThreadPerfTimer('foo') 104 | timer._lock = MockLock() 105 | 106 | with timer: 107 | pass 108 | with timer: 109 | pass 110 | timer._report() 111 | 112 | assert lock_count == 2 113 | 114 | 115 | def test_perf_timer_type(): 116 | # since metaclass is used, ensure type is cached 117 | assert type(PerfTimer('foo')) is type(PerfTimer('bar')) 118 | 119 | 120 | def test_perf_timer_not_implemented(): 121 | with pytest.raises(NotImplementedError): 122 | PerfTimer('foo', time_fn=None) 123 | 124 | 125 | @patch.object(PerfTimer, '_report_once') 126 | def test_perf_timer_atexit_and_del(_report_once): 127 | # atexit and del each cause 1 call to _report_once() 128 | timer = PerfTimer('foo') 129 | _impl._atexit() 130 | del timer 131 | assert _report_once.call_count == 2 132 | 133 | 134 | @patch.object(PerfTimer, '_report_once') 135 | def test_perf_timer_atexit_is_weak(_report_once): 136 | # atexit doesn't trigger _report_once() if object already finalized 137 | timer = PerfTimer('foo') 138 | del timer 139 | _impl._atexit() 140 | assert _report_once.call_count == 1 141 | 142 | 143 | def test_perf_timer_report(): 144 | # multiple calls to _report_once() cause only one report 145 | log_fn = Mock() 146 | timer = PerfTimer('foo', log_fn=log_fn) 147 | with timer: 148 | pass 149 | timer._report_once() 150 | timer._report_once() 151 | log_fn.assert_called_once() 152 | 153 | 154 | def test_measure_overhead(): 155 | assert measure_overhead(partial(PerfTimer, observer=AverageObserver)) < \ 156 | measure_overhead(partial(PerfTimer, observer=StdDevObserver)) < \ 157 | measure_overhead(partial(PerfTimer, observer=HistogramObserver)) 158 | -------------------------------------------------------------------------------- /tests/test_trio.py: -------------------------------------------------------------------------------- 1 | import time 2 | from unittest.mock import Mock 3 | 4 | import pytest 5 | import trio 6 | try: 7 | import trio.lowlevel as trio_lowlevel 8 | except ImportError: 9 | import trio.hazmat as trio_lowlevel 10 | 11 | from perf_timer import trio_perf_counter, _trio, TrioPerfTimer, AverageObserver 12 | 13 | 14 | async def test_descheduled_time_instrument(): 15 | time_fn = Mock(side_effect=[5, 10, 10, 20]) 16 | instrument = _trio._DescheduledTimeInstrument(time_fn=time_fn) 17 | trio_lowlevel.add_instrument(instrument) 18 | 19 | # Only tasks referenced by get_elapsed_descheduled_time() will be tracked, 20 | # so instrument is not tracking the current task. 
21 | await trio.sleep(0) 22 | assert not time_fn.called 23 | 24 | async with trio.open_nursery() as nursery: 25 | @nursery.start_soon 26 | async def _tracked_child(): 27 | # calling get_elapsed_descheduled_time() initiates tracking 28 | task = trio_lowlevel.current_task() 29 | assert instrument.get_elapsed_descheduled_time(task) == 0 30 | await trio.sleep(0) 31 | assert instrument.get_elapsed_descheduled_time(task) == 10 - 5 32 | await trio.sleep(0) 33 | assert instrument.get_elapsed_descheduled_time(task) == 20 - 5 34 | # time function is called twice for each deschedule 35 | assert time_fn.call_count == 4 36 | 37 | # the sole tracked task exited, so instrument is automatically removed 38 | with pytest.raises(KeyError): 39 | trio_lowlevel.remove_instrument(instrument) 40 | 41 | 42 | async def test_descheduled_time_instrument_exclude_children(): 43 | time_fn = Mock(side_effect=[5, 10]) 44 | instrument = _trio._DescheduledTimeInstrument(time_fn=time_fn) 45 | trio_lowlevel.add_instrument(instrument) 46 | 47 | task = trio_lowlevel.current_task() 48 | assert instrument.get_elapsed_descheduled_time(task) == 0 49 | 50 | async with trio.open_nursery() as nursery: 51 | @nursery.start_soon 52 | async def _untracked_child(): 53 | await trio.sleep(0) 54 | 55 | assert instrument.get_elapsed_descheduled_time(task) == 10 - 5 56 | assert time_fn.call_count == 2 # 2 x 1 deschedule (due to nursery) 57 | 58 | # our task is still alive, so instrument remains active 59 | trio_lowlevel.remove_instrument(instrument) 60 | 61 | 62 | async def test_trio_perf_counter_time_sleep(): 63 | # NOTE: subject to false pass due to reliance on wall time 64 | t0 = trio_perf_counter() 65 | time.sleep(.01) 66 | dt = trio_perf_counter() - t0 67 | assert dt > .008 68 | 69 | 70 | async def test_trio_perf_counter_unregister(): 71 | async def perf_counter_with_trio_sleep(): 72 | trio_perf_counter() 73 | await trio.sleep(0) 74 | trio_perf_counter() 75 | 76 | async with trio.open_nursery() as nursery: 77 | nursery.start_soon(perf_counter_with_trio_sleep) 78 | nursery.start_soon(perf_counter_with_trio_sleep) 79 | 80 | # Since all tasks using task_perf_counter() have exited, we expected 81 | # the Trio instrumentation to no longer be active (so remove call 82 | # will fail). 83 | with pytest.raises(KeyError): 84 | trio_lowlevel.remove_instrument(_trio._instrument) 85 | 86 | 87 | async def test_trio_perf_timer(autojump_clock): 88 | # time_fn is called on enter and exit of each with block 89 | time_fn = Mock(side_effect=[10, 15, 90 | 15, 25]) 91 | timer = TrioPerfTimer('foo', observer=AverageObserver, time_fn=time_fn) 92 | 93 | for _ in range(2): 94 | with timer: 95 | await trio.sleep(1) 96 | 97 | assert timer._count == 2 98 | assert timer._sum == 15 99 | assert timer._max == 10 100 | del timer 101 | 102 | 103 | async def test_trio_perf_timer_decorator(autojump_clock): 104 | time_fn = Mock(side_effect=[10, 15, 105 | 15, 25]) 106 | timer = TrioPerfTimer('foo', observer=AverageObserver, time_fn=time_fn) 107 | 108 | @timer 109 | async def foo(): 110 | await trio.sleep(1) 111 | 112 | for _ in range(2): 113 | await foo() 114 | 115 | assert timer._count == 2 116 | assert timer._sum == 15 117 | assert timer._max == 10 118 | del timer 119 | --------------------------------------------------------------------------------