\n",
177 | "\n",
178 | "**Exercise: Ghostwriting a simple roundtrip**\n",
179 | " \n",
180 | "Run the following command in your terminal, to test [the `gzip` module](https://docs.python.org/3/library/gzip.html)\n",
181 | "\n",
182 | "```shell\n",
183 | "hypothesis write gzip.compress\n",
184 | "```\n",
185 | " \n",
186 | "and inspect the output. What changes might you make to improve this test? For example:\n",
187 | " \n",
188 | "- improving input strategies\n",
189 | "- removing or updating comments\n",
190 | "- using descriptive variable names\n",
191 | " \n",
192 | "Run your improved test to confirm that it works as you expect.\n",
193 | " \n",
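"For comparison, one possible cleaned-up version of the ghostwritten test is sketched below; your own improvements may look different:\n",
"\n",
"```python\n",
"import gzip\n",
"\n",
"from hypothesis import given, strategies as st\n",
"\n",
"@given(payload=st.binary(), compresslevel=st.integers(0, 9))\n",
"def test_gzip_round_trip(payload, compresslevel):\n",
"    compressed = gzip.compress(payload, compresslevel=compresslevel)\n",
"    assert payload == gzip.decompress(compressed)\n",
"```\n",
" \n",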
194 | "Bonus round: the `mtime` argument was added to `gzip.compress()` in Python 3.8. Could you write version-independent tests, or think of extra properties that it enables?"
195 | ]
196 | },
197 | {
198 | "cell_type": "code",
199 | "execution_count": null,
200 | "id": "9c9162b0",
201 | "metadata": {},
202 | "outputs": [],
203 | "source": [
204 | "# Execute this cell to see the CLI output\n",
205 | "! hypothesis write gzip.compress"
206 | ]
207 | },
208 | {
209 | "cell_type": "markdown",
210 | "id": "22360e44",
211 | "metadata": {},
212 | "source": [
213 | "\n",
214 | "\n",
215 | "**Exercise: Ghostwriting a more complex roundtrip**\n",
216 | " \n",
217 | "Run the following command to test [the `json` module](https://docs.python.org/3/library/json.html)\n",
218 | "\n",
219 | "```shell\n",
220 | "hypothesis write json.dumps\n",
221 | "```\n",
222 | " \n",
223 | "and inspect the output. What changes might you make to improve this test? For example:\n",
224 | "\n",
225 | "- [defining a `st.recursive()` strategy](https://hypothesis.readthedocs.io/en/latest/data.html#hypothesis.strategies.recursive) to generate [JSON objects](https://www.json.org/) (for the `obj` argument)\n",
226 | "- improving other input strategies, or removing those for `loads()`\n",
227 | "- removing or updating comments\n",
228 | "- using descriptive variable names\n",
229 | " \n",
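"For the first suggestion above, one possible recursive strategy for JSON objects is sketched below (it mirrors the one used in `tests/test_settings.py`); excluding `nan` keeps the `loads(dumps(obj)) == obj` comparison meaningful:\n",
"\n",
"```python\n",
"from hypothesis import strategies as st\n",
"\n",
"# JSON values: None, bools, finite floats, and strings, plus lists/dicts thereof\n",
"json_objects = st.recursive(\n",
"    st.none() | st.booleans() | st.floats(allow_nan=False) | st.text(),\n",
"    extend=lambda inner: st.lists(inner) | st.dictionaries(st.text(), inner),\n",
")\n",
"```\n",
" \n",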
230 | "Extension: use `assume(obj == obj)` instead of excluding `nan` from your JSON strategy. What happens?"
231 | ]
232 | },
233 | {
234 | "cell_type": "code",
235 | "execution_count": null,
236 | "id": "864707a0",
237 | "metadata": {},
238 | "outputs": [],
239 | "source": [
240 | "# Execute this cell to see the CLI output\n",
241 | "! hypothesis write json.dumps"
242 | ]
243 | },
244 | {
245 | "cell_type": "markdown",
246 | "id": "c2a7adee",
247 | "metadata": {},
248 | "source": [
249 | "### Equivalent Functions\n",
250 | "\n",
251 | "There are times when we are fortunate enough to have access to two distinct functions that are meant to exhibit the same behavior.\n",
252 | "Often this comes in the form of a slow implementation (e.g. single-threaded) vs a faster one (multi-threaded), or \"cousin implementations\" of the same function (e.g. NumPy's and PyTorch's respective implementations of `matmul`). For example:\n",
253 | "\n",
254 | "- `f_old()` vs `f_new()`\n",
255 | "- `f_singlethread()` vs `f_multithread()`\n",
256 | "- `func()` vs `numba.njit(func)()`\n",
257 | "- `numpy.matmul()` vs `torch.matmul()`\n",
258 | "- `numpy.einsum(..., optimize=False)` vs `numpy.einsum(..., optimize=True)`\n",
259 | "\n",
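"As a sketch, a hand-written equivalence test for the `einsum` pair above might look like this (using `hypothesis.extra.numpy` to generate the arrays):\n",
"\n",
"```python\n",
"import numpy as np\n",
"from hypothesis import given, strategies as st\n",
"from hypothesis.extra.numpy import arrays\n",
"\n",
"@given(\n",
"    a=arrays(np.float64, (3, 4), elements=st.floats(-10, 10)),\n",
"    b=arrays(np.float64, (4, 5), elements=st.floats(-10, 10)),\n",
")\n",
"def test_einsum_paths_agree(a, b):\n",
"    # the optimized and unoptimized contraction paths should agree (up to rounding)\n",
"    assert np.allclose(\n",
"        np.einsum(\"ij,jk->ik\", a, b, optimize=False),\n",
"        np.einsum(\"ij,jk->ik\", a, b, optimize=True),\n",
"    )\n",
"```\n",
"\n",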
260 | "We can also use functions which are equivalent for *some* of their possible inputs. For example, `numpy.asarray` converts its input to an `ndarray`, while `numpy.asanyarray` passes `ndarray` subclasses (such as `numpy.ma.MaskedArray` or `numpy.matrix`) through unchanged. We can use the ghostwriter to check that *if passed a plain `ndarray`*, these functions are equivalent:"
261 | ]
262 | },
263 | {
264 | "cell_type": "code",
265 | "execution_count": null,
266 | "id": "9a255b37",
267 | "metadata": {},
268 | "outputs": [],
269 | "source": [
270 | "! hypothesis write --equivalent numpy.asarray numpy.asanyarray"
271 | ]
272 | },
273 | {
274 | "cell_type": "markdown",
275 | "id": "6d5affe7",
276 | "metadata": {},
277 | "source": [
278 | "### Metamorphic Relationships\n",
279 | "\n",
280 | "A metamorphic relationship is one in which a known transformation made to the input of a function has a *known* and *necessary* effect on the function's output.\n",
281 | "We already saw an example of this when we tested our `count_vowels` function:\n",
282 | "\n",
283 | "```python\n",
284 | "assert count_vowels(n * input_string) == n * count_vowels(input_string)\n",
285 | "```\n",
286 | "\n",
287 | "That is, if we replicate the input-string `n` times, then the number of vowels counted in the string should be scaled by a factor of `n` as well.\n",
288 | "Note that we had to evaluate our function twice to check its metamorphic property.\n",
289 | "**All metamorphic tests will require multiple evaluations of the function of interest.**\n",
290 | "\n",
291 | "Metamorphic testing is often a *highly* effective method of testing, which enables us to exercise our code and test for correctness under a wider range of inputs, without our needing to concoct a sophisticated \"oracle\" for validating the code's exact behavior.\n",
292 | "Basically, these tests give you a lot of bang for your buck!\n",
293 | "\n",
294 | "Let's consider some common metamorphic relationships that crop up in functions:\n",
295 | "\n",
296 | "**Linearity**\n",
297 | "\n",
298 | "```\n",
299 | "f(a * x) = a * f(x)\n",
300 | "```\n",
301 | "\n",
302 | "The example involving `count_vowels` demonstrates a \"linear\" metamorphic relationship (for a non-negative scale factor `n`); scaling the input to `abs` is another:\n",
303 | "\n",
304 | "```\n",
305 | "mag = abs(x)\n",
306 | "assert a * mag == abs(a * x)\n",
307 | "```\n",
308 | "\n",
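"Written as a Hypothesis test, the `count_vowels` check might look like this sketch (assuming `count_vowels` is importable from `pbt_tutorial.basic_functions`):\n",
"\n",
"```python\n",
"from hypothesis import given, strategies as st\n",
"from pbt_tutorial.basic_functions import count_vowels\n",
"\n",
"@given(input_string=st.text(), n=st.integers(0, 10))\n",
"def test_count_vowels_scales_linearly(input_string, n):\n",
"    assert count_vowels(n * input_string) == n * count_vowels(input_string)\n",
"```\n",
"\n",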
309 | "**Monotonicity**\n",
310 | "\n",
311 | "```\n",
312 | "f(x) <= f(x + |δ|)\n",
313 | "\n",
314 | "or\n",
315 | "\n",
316 | "f(x) >= f(x + |δ|)\n",
317 | "```\n",
318 | "\n",
319 | "A function is monotonic if transforming the input of the function leads to an \"unwavering\" change - only-increasing or only-decreasing - in the function's output.\n",
320 | "Consider, for example, a database that returns some number of results for a query;\n",
321 | "making the query more precise *should not increase the number of results*:\n",
322 | "\n",
323 | "```\n",
324 | "len(db.query(query_a)) >= len(db.query(query_a & query_b)) \n",
325 | "```\n",
326 | "\n",
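"As a sketch, a monotonicity test for `count_vowels` could check that appending characters to a string never decreases its vowel count:\n",
"\n",
"```python\n",
"from hypothesis import given, strategies as st\n",
"from pbt_tutorial.basic_functions import count_vowels\n",
"\n",
"@given(head=st.text(), tail=st.text())\n",
"def test_count_vowels_is_monotonic(head, tail):\n",
"    assert count_vowels(head) <= count_vowels(head + tail)\n",
"```\n",
"\n",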
327 | "**Fixed-Point Location**\n",
328 | "\n",
329 | "```python\n",
330 | "y = f(x)\n",
331 | "assert y == f(y)\n",
332 | "```\n",
333 | "\n",
334 | "A fixed point of a function `f` is any value `y` such that `f(y) == y`.\n",
335 | "It might be surprising to see just how many functions always return fixed-points of themselves:\n",
336 | "\n",
337 | "```python\n",
338 | "y = sort(x)\n",
339 | "assert y == sort(y)\n",
340 | "\n",
341 | "sanitized = sanitize_input(x)\n",
342 | "assert sanitized == sanitize_input(sanitized)\n",
343 | "\n",
344 | "formatted_code = black_formatter(code)\n",
345 | "assert formatted_code == black_formatter(formatted_code)\n",
346 | "\n",
347 | "normed_vec = l2_normalize(vec)\n",
348 | "assert normed_vec == l2_normalize(normed_vec)\n",
349 | "\n",
350 | "result = find_minimum(f, starting_point=x, err_tol=delta)\n",
351 | "assert result == find_minimum(f, starting_point=result, err_tol=delta)\n",
352 | "\n",
353 | "padded = left_pad(string=x, width=4, fillchar=\"a\")\n",
354 | "assert padded == left_pad(string=padded, width=4, fillchar=\"a\")\n",
355 | "```\n",
356 | "\n",
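"Such fixed-point (idempotence) properties translate directly into Hypothesis tests; here is a sketch for `sorted`:\n",
"\n",
"```python\n",
"from hypothesis import given, strategies as st\n",
"\n",
"@given(st.lists(st.integers()))\n",
"def test_sorted_is_a_fixed_point(xs):\n",
"    once = sorted(xs)\n",
"    assert sorted(once) == once\n",
"```\n",
"\n",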
357 | "**Invariance Under Transformation**\n",
358 | "\n",
359 | "A function `f` is *invariant* under a transformation `T(x)` if the following holds:\n",
360 | "\n",
361 | "```\n",
362 | "f(x) = f(T(x))\n",
363 | "```\n",
364 | "\n",
365 | "This might be an invariance to scaling (`f(x) == f(a * x)`), translation (`f(x) == f(x + a)`), or permutation (`f(coll) == f(shuffle(coll))`).\n",
366 | "\n",
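"As a sketch, `count_vowels` should be invariant under reversing its input (a permutation of the characters):\n",
"\n",
"```python\n",
"from hypothesis import given, strategies as st\n",
"from pbt_tutorial.basic_functions import count_vowels\n",
"\n",
"@given(st.text())\n",
"def test_count_vowels_is_permutation_invariant(input_string):\n",
"    assert count_vowels(input_string) == count_vowels(input_string[::-1])\n",
"```\n",
"\n",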
367 | "In the case of a computer-vision algorithm, like an image classifier, this could be an invariance under a change of brightness or a horizontal flip of the image (e.g. neither transformation should change whether the model sees a cat in the image)."
368 | ]
369 | },
370 | {
371 | "cell_type": "markdown",
372 | "id": "d6553d66",
373 | "metadata": {},
374 | "source": [
375 | "\n",
376 | "\n",
377 | "**Exercise: Metamorphic testing**\n",
378 | "\n",
379 | "Create the file `tests/test_metamorphic.py`.\n",
380 | "For each of the following functions identify one or more metamorphic relationships that are exhibited by the function and write tests that exercise them.\n",
381 | " \n",
382 | "- [`numpy.clip`](https://numpy.org/doc/stable/reference/generated/numpy.clip.html)\n",
383 | "- `sorted` (don't test the fixed-point relationship; identify a different metamorphic relationship)\n",
384 | "- `merge_max_mappings`\n",
385 | "- `pairwise_dists` (defined below; add it to `basic_functions.py`)\n",
386 | "\n",
387 | "```python\n",
388 | "import numpy as np\n",
389 | "\n",
390 | "def pairwise_dists(x, y):\n",
391 | " \"\"\" Computing pairwise Euclidean distance between the respective\n",
392 | " row-vectors of `x` and `y`\n",
393 | "\n",
394 | " Parameters\n",
395 | " ----------\n",
396 | " x : numpy.ndarray, shape=(M, D)\n",
397 | " y : numpy.ndarray, shape=(N, D)\n",
398 | "\n",
399 | " Returns\n",
400 | " -------\n",
401 | " numpy.ndarray, shape=(M, N)\n",
402 | " The Euclidean distance between each pair of\n",
403 | " rows between `x` and `y`.\"\"\"\n",
404 | " sqr_dists = -2 * np.matmul(x, y.T)\n",
405 | " sqr_dists += np.sum(x**2, axis=1)[:, np.newaxis]\n",
406 | " sqr_dists += np.sum(y**2, axis=1)\n",
407 | " return np.sqrt(np.clip(sqr_dists, a_min=0, a_max=None))\n",
408 | "```\n",
409 | "\n",
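"For `pairwise_dists`, here is a sketch of *one* possible metamorphic test (symmetry under swapping the inputs); there are several other valid relationships you could pick:\n",
"\n",
"```python\n",
"import numpy as np\n",
"from hypothesis import given, strategies as st\n",
"from hypothesis.extra.numpy import arrays\n",
"from pbt_tutorial.basic_functions import pairwise_dists\n",
"\n",
"coords = st.floats(-100, 100)\n",
"\n",
"@given(\n",
"    x=arrays(np.float64, (3, 2), elements=coords),\n",
"    y=arrays(np.float64, (4, 2), elements=coords),\n",
")\n",
"def test_pairwise_dists_swap_symmetry(x, y):\n",
"    # swapping the inputs should transpose the distance matrix (up to float rounding)\n",
"    assert np.allclose(pairwise_dists(x, y), pairwise_dists(y, x).T, atol=1e-4)\n",
"```\n",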
410 | ""
411 | ]
412 | },
413 | {
414 | "cell_type": "markdown",
415 | "id": "70be26fe",
416 | "metadata": {},
417 | "source": [
418 | "\n",
419 | "\n",
420 | "**Exercise: Testing the softmax function**\n",
421 | "\n",
422 | "The so-called \"softmax\" function is a generalised argmax, used to normalize a set of numbers such that **they will have the properties of a probability distribution**. I.e., post-softmax, each number will reside in $[0, 1]$ and the resulting numbers will sum to $1$.\n",
423 | "\n",
424 | "The softmax of a set of $M$ numbers is:\n",
425 | "\n",
426 | "\\begin{equation}\n",
427 | "softmax([s_{k} ]_{k=1}^{M}) = \\Bigl [ \\frac{e^{s_k}}{\\sum_{i=1}^{M}{e^{s_i}}} \\Bigr ]_{k=1}^{M}\n",
428 | "\\end{equation}\n",
429 | "\n",
430 | "```python\n",
431 | ">>> softmax([10., 10., 10.])\n",
432 | "array([0.33333333, 0.33333333, 0.33333333])\n",
433 | "\n",
434 | ">>> softmax([0., 10000., 0.])\n",
435 | "array([0., 1., 0.])\n",
436 | "\n",
437 | ">>> softmax([-100., 0., -100.])\n",
438 | "array([3.72007598e-44, 1.00000000e+00, 3.72007598e-44])\n",
439 | "```\n",
440 | "\n",
441 | "Write an implementation of `softmax` in `basic_functions.py` and test the two properties of `softmax` that we described above.\n",
442 | "Note: you should use `math.isclose` when checking if two floats are approximately equal.\n",
443 | "\n",
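"The shape of such a test might look like the sketch below (assuming your `softmax` lives in `pbt_tutorial.basic_functions` and accepts a numpy array):\n",
"\n",
"```python\n",
"import math\n",
"\n",
"import numpy as np\n",
"from hypothesis import given, strategies as st\n",
"from pbt_tutorial.basic_functions import softmax\n",
"\n",
"@given(st.lists(st.floats(-1e6, 1e6), min_size=1, max_size=10))\n",
"def test_softmax_returns_a_probability_distribution(scores):\n",
"    out = softmax(np.asarray(scores, dtype=float))\n",
"    assert all(0.0 <= p <= 1.0 for p in out)\n",
"    assert math.isclose(sum(out), 1.0)\n",
"```\n",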
444 | "\n",
445 | "If you implemented `softmax` in a straightforward way (i.e. you implemented the function directly from the exact equation above) then your property-based test should fail.\n",
446 | "This is due to the use of the exponential function in `softmax`, which overflows for even moderately large inputs and quickly creates a numerical instability.\n",
447 | "\n",
448 | "We can fix the numerical instability by recognizing a metamorphic relationship that is satisfied by the softmax equation: it exhibits translational invariance:\n",
449 | "\n",
450 | "\\begin{align}\n",
451 | "softmax([s_{k} - a]_{k=1}^{M}) &= \\Bigl [ \\frac{e^{s_k - a}}{\\sum_{i=1}^{M}{e^{s_i - a}}} \\Bigr ]_{k=1}^{M}\\\\\n",
452 | "&= \\Bigl [ \\frac{e^{-a}e^{s_k}}{e^{-a}\\sum_{i=1}^{M}{e^{s_i}}} \\Bigr ]_{k=1}^{M}\\\\\n",
453 | "&= \\Bigl [ \\frac{e^{s_k}}{\\sum_{i=1}^{M}{e^{s_i}}} \\Bigr ]_{k=1}^{M}\\\\\n",
454 | "&= softmax([s_{k}]_{k=1}^{M})\n",
455 | "\\end{align}\n",
456 | "\n",
457 | "Thus we can address this instability by finding the max value in our vector of numbers, subtracting that value from each element of the vector, and *then* computing the softmax.\n",
458 | "Update your definition of `softmax` and see that your property-based tests now pass.\n",
459 | "\n",
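"For reference, the stabilized implementation can be as short as the following sketch (it mirrors the version in `src/pbt_tutorial/basic_functions.py`):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def softmax(x):\n",
"    x = x - x.max()  # translation invariance: shift by the max for numerical stability\n",
"    return np.exp(x) / np.exp(x).sum()\n",
"```\n",
"\n",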
460 | "Reflect on the fact that using Hypothesis to drive a property-based test led us to identify a subtle-but-critical oversight in our function.\n",
461 | "Had we simply manually tested our function with known small inputs and outputs, we might not have discovered this issue.\n",
462 | " \n",
463 | "Are there any other properties that we could test here?\n",
464 | "Consider how to check that the relative ordering of the input and output values is preserved (e.g. `numpy.argsort` can get at this).\n",
465 | ""
466 | ]
467 | }
468 | ],
469 | "metadata": {
470 | "kernelspec": {
471 | "display_name": "Python 3 (ipykernel)",
472 | "language": "python",
473 | "name": "python3"
474 | }
475 | },
476 | "nbformat": 4,
477 | "nbformat_minor": 5
478 | }
479 |
--------------------------------------------------------------------------------
/notebooks/03_Putting_into_Practice_STUDENT.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "cbc50cb9",
6 | "metadata": {},
7 | "source": [
8 | "# Putting it all into Practice\n",
9 | "\n",
10 | "In this final section we'll cover the practical configuration tips that will help you apply property-based testing in the real world.\n",
11 | "\n",
12 | "Now that we know how to write property-based tests, let's look at how to use them as part of a larger project: test-suite design patterns, performance and determinism settings, reproducing test failures from a CI server, and so on. These are all practical topics that are typically left out of shorter introductions!"
13 | ]
14 | },
15 | {
16 | "cell_type": "markdown",
17 | "id": "9b6cd0c4",
18 | "metadata": {},
19 | "source": [
20 | "### The joy of `--hypothesis-show-statistics`\n",
21 | "\n",
22 | "Hypothesis has [great support for showing test statistics](https://hypothesis.readthedocs.io/en/latest/details.html#test-statistics), including better-than-`print()` debugging with `note()`, custom `event()`s you can add to the summary, and a variety of performance details.\n",
23 | "\n",
24 | "Let's explore those now: run `pytest --hypothesis-show-statistics tests/test_statistics.py`. You should see\n",
25 | "\n",
26 | "- a lot of output from `printing`... if you actually want to see every example Hypothesis generates, use the `verbosity` setting instead! You can even set it from the command-line with `--hypothesis-verbosity=verbose`.\n",
27 | "- **one** line of output from `note()`\n",
28 | "- statistics on the generate phase for both tests, and the shrink phase for the failing test. If you re-run the tests, you'll see a `reuse` phase where notable examples from previous runs are replayed.\n",
29 | "\n",
30 | "Useful, isn't it!"
31 | ]
32 | },
33 | {
34 | "cell_type": "markdown",
35 | "id": "805d759e",
36 | "metadata": {},
37 | "source": [
38 | "### Settings for Performance\n",
39 | "\n",
40 | "Hypothesis is designed to behave sensibly by default, but sometimes you have something\n",
41 | "more specific in mind. At those times, [`hypothesis.settings`](https://hypothesis.readthedocs.io/en/latest/settings.html)\n",
42 | "is here to help.\n",
43 | "\n",
44 | "The main performance-related settings to know are:\n",
45 | "\n",
46 | "- `max_examples` - the number of valid examples Hypothesis will run. Defaults to 100; turning it up or down makes your testing proportionally more or less rigorous... and also proportionally slower or faster, respectively!\n",
47 | "- `deadline` - if an input takes longer than this to run, we'll treat that as an error. Useful for catching odd performance issues, but can be flaky if VM performance is inconsistent."
48 | ]
49 | },
50 | {
51 | "cell_type": "code",
52 | "execution_count": null,
53 | "id": "f4609773",
54 | "metadata": {},
55 | "outputs": [],
56 | "source": [
57 | "from time import sleep\n",
58 | "from hypothesis import given, settings, strategies as st\n",
59 | "\n",
60 | "# TODO: add a settings decorator which reduces max_examples (for speed)\n",
61 | "# and increases or disables the deadline so the test passes.\n",
62 | "\n",
63 | "@given(st.floats(min_value=0.1, max_value=0.3))\n",
64 | "def test_really_slow(delay):\n",
65 | " sleep(delay)\n",
66 | "\n",
67 | "test_really_slow()"
68 | ]
69 | },
70 | {
71 | "cell_type": "markdown",
72 | "id": "1bab6558",
73 | "metadata": {},
74 | "source": [
75 | "### The `phases` setting\n",
76 | "\n",
77 | "The phases setting allows you to individually enable or disable [Hypothesis' six phases](https://hypothesis.readthedocs.io/en/latest/settings.html#controlling-what-runs), and has two main uses:\n",
78 | "\n",
79 | "- Disabling all but the `explicit` phase, reducing Hypothesis to parametrized tests ([e.g. here](https://github.com/python/cpython/pull/22863))\n",
80 | "- Enabling the `explain` phase, accepting some overhead to report additional feedback on failures\n",
81 | "\n",
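"For example (the second pattern mirrors `tests/test_settings.py`):\n",
"\n",
"```python\n",
"from hypothesis import Phase, given, settings, strategies as st\n",
"\n",
"# Run only explicitly-provided @example() cases (apply with @only_explicit):\n",
"only_explicit = settings(phases=[Phase.explicit])\n",
"\n",
"# Enable every phase, including the explain phase:\n",
"@settings(phases=tuple(Phase))\n",
"@given(st.integers())\n",
"def test_with_all_phases(x):\n",
"    assert isinstance(x, int)\n",
"```\n",
"\n",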
82 | "Other use-cases tend to be esoteric, but are supported if you think of one."
83 | ]
84 | },
85 | {
86 | "cell_type": "code",
87 | "execution_count": null,
88 | "id": "d9e04b44",
89 | "metadata": {},
90 | "outputs": [],
91 | "source": [
92 | "# See `tests/test_settings.py` for this exercise."
93 | ]
94 | },
95 | {
96 | "cell_type": "markdown",
97 | "id": "7e84920c",
98 | "metadata": {},
99 | "source": [
100 | "### Dealing with a PRNG\n",
101 | "\n",
102 | "If you have test behaviour that depends on a pseudo-random number generator, and it's not being seeded between inputs, you're going to have some flaky tests. [`hypothesis.register_random()` to the rescue!](https://hypothesis.readthedocs.io/en/latest/details.html#making-random-code-deterministic)\n",
103 | "\n",
104 | "Try running this test a few times - you'll see the `Flaky` error - and then un-comment `hypothesis.register_random(r)`. Instant determinism!"
105 | ]
106 | },
107 | {
108 | "cell_type": "code",
109 | "execution_count": null,
110 | "id": "586378fe",
111 | "metadata": {},
112 | "outputs": [],
113 | "source": [
114 | "import random\n",
115 | "import hypothesis\n",
116 | "from hypothesis.strategies import integers\n",
117 | "\n",
118 | "r = random.Random()\n",
119 | "\n",
120 | "# hypothesis.register_random(r)\n",
121 | "\n",
122 | "@hypothesis.given(integers(0, 100))\n",
123 | "def test_sometimes_flaky(x):\n",
124 | " y = r.randint(0, 100)\n",
125 | " assert x <= y\n",
126 | "\n",
127 | "test_sometimes_flaky()"
128 | ]
129 | },
130 | {
131 | "cell_type": "markdown",
132 | "id": "f9359f06",
133 | "metadata": {},
134 | "source": [
135 | "### `target()`ed property-based testing\n",
136 | "\n",
137 | "Random search works well... but [guided search with `hypothesis.target()`](https://hypothesis.readthedocs.io/en/latest/details.html#targeted-example-generation)\n",
138 | "is even better. Targeted search can help\n",
139 | "\n",
140 | "- find rare bugs ([e.g.](https://github.com/astropy/astropy/pull/10373))\n",
141 | "- understand bugs, by mitigating [the \"threshold problem\"](https://hypothesis.works/articles/threshold-problem/) (where shrinking makes severe bugs look marginal)"
142 | ]
143 | },
144 | {
145 | "cell_type": "code",
146 | "execution_count": null,
147 | "id": "bbf1c9a2",
148 | "metadata": {},
149 | "outputs": [],
150 | "source": [
151 | "# See `tests/test_with_target.py` for this exercise."
152 | ]
153 | },
154 | {
155 | "cell_type": "markdown",
156 | "id": "25ca56d3",
157 | "metadata": {},
158 | "source": [
159 | "### Hooks for external fuzzers\n",
160 | "\n",
161 | "If you're on Linux or OSX, you may want to [experiment with external fuzzers](https://hypothesis.readthedocs.io/en/latest/details.html#use-with-external-fuzzers).\n",
162 | "For example, [here's a fuzz-test for the Black autoformatter](https://github.com/psf/black/blob/3ef339b2e75468a09d617e6aa74bc920c317bce6/fuzz.py#L75-L85)\n",
163 | "using Atheris as the fuzzing engine.\n",
164 | "\n",
165 | "We can mock this up with our own very simple fuzzer:"
166 | ]
167 | },
168 | {
169 | "cell_type": "code",
170 | "execution_count": null,
171 | "id": "603fc32e",
172 | "metadata": {},
173 | "outputs": [],
174 | "source": [
175 | "from secrets import token_bytes as get_bytes_from_fuzzer\n",
176 | "from hypothesis import given, strategies as st\n",
177 | "\n",
178 | "\n",
179 | "@given(st.nothing())\n",
180 | "def test(_):\n",
181 | " pass \n",
182 | "\n",
183 | "\n",
184 | "# And now for the fuzzer:\n",
185 | "for _ in range(1000):\n",
186 | " payload = get_bytes_from_fuzzer(1000)\n",
187 | " test.hypothesis.fuzz_one_input(payload)"
188 | ]
189 | }
190 | ],
191 | "metadata": {
192 | "kernelspec": {
193 | "display_name": "Python 3 (ipykernel)",
194 | "language": "python",
195 | "name": "python3"
196 | }
197 | },
198 | "nbformat": 4,
199 | "nbformat_minor": 5
200 | }
201 |
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | -e .
2 | notebook
3 | numpy >= 1.11
4 | pytest >= 7.1
5 | hypothesis >= 6.45
6 |
--------------------------------------------------------------------------------
/runtime.txt:
--------------------------------------------------------------------------------
1 | python-3.10
2 |
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | from setuptools import find_packages, setup
2 |
3 | setup(
4 | name="pbt_tutorial",
5 | description="Library code for property-based testing tutorial",
6 | packages=find_packages(where="src", exclude=["tests*"]),
7 | package_dir={"": "src"},
8 | version="1.1.0",
9 | python_requires=">=3.7",
10 | install_requires=["numpy>=1.11"],
11 |     tests_require=["pytest >= 7.1", "hypothesis >= 6.45"],
12 | )
13 |
--------------------------------------------------------------------------------
/src/pbt_tutorial/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rsokl/testing-tutorial/a02f55971d73b7102ee99d30c86a328dbee4e490/src/pbt_tutorial/__init__.py
--------------------------------------------------------------------------------
/src/pbt_tutorial/basic_functions.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | from itertools import groupby
3 | from typing import Any, Dict, List, Union
4 |
5 | __all__ = ["count_vowels", "merge_max_mappings"]
6 |
7 |
8 | def count_vowels(x: str, include_y: bool = False) -> int:
9 | """Returns the number of vowels contained in `x`.
10 |
11 | The vowel 'y' is included optionally.
12 |
13 | Parameters
14 | ----------
15 | x : str
16 | The input string
17 |
18 | include_y : bool, optional (default=False)
19 | If `True` count y's as vowels
20 |
21 | Returns
22 | -------
23 | vowel_count: int
24 |
25 | Examples
26 | --------
27 | >>> count_vowels("happy")
28 | 1
29 | >>> count_vowels("happy", include_y=True)
30 | 2
31 | """
32 | vowels = set("aeiouAEIOU")
33 |
34 | if include_y:
35 | vowels.update("yY")
36 |
37 | return sum(char in vowels for char in x)
38 |
39 |
40 | def merge_max_mappings(
41 | dict1: Dict[str, float], dict2: Dict[str, float]
42 | ) -> Dict[str, float]:
43 | """Merges two dictionaries based on the largest value
44 | in a given mapping.
45 |
46 |     The values of the dictionaries are presumed to be floats.
47 |
48 | Parameters
49 | ----------
50 | dict1 : Dict[str, float]
51 | dict2 : Dict[str, float]
52 |
53 | Returns
54 | -------
55 | merged : Dict[str, float]
56 |         The merged dictionary containing all of the keys from
57 |         `dict1` and `dict2`, retaining the largest value for
58 |         keys common to both.
59 |
60 | Examples
61 | --------
62 | >>> x = {"a": 1, "b": 2}
63 | >>> y = {"b": 100, "c": -1}
64 | >>> merge_max_mappings(x, y)
65 | {'a': 1, 'b': 100, 'c': -1}
66 | """
67 | # `dict(dict1)` makes a copy of `dict1`. We do this
68 | # so that updating `merged` doesn't also update `dict1`
69 | merged = dict(dict1)
70 | for key, value in dict2.items():
71 | if key not in merged or value > merged[key]:
72 | merged[key] = value
73 | return merged
74 |
75 |
76 | # EXTRA: Test-driven development
77 |
78 | # SOLUTION
79 | def leftpad(string: str, width: int, fillchar: str) -> str:
80 | """Left-pads `string` with `fillchar` until the resulting string
81 | has length `width`.
82 |
83 | Parameters
84 | ----------
85 | string : str
86 | The input string
87 |
88 | width : int
89 | A non-negative integer specifying the minimum guaranteed
90 | width of the left-padded output string.
91 |
92 | fillchar : str
93 | The character (length-1 string) used to pad the string.
94 |
95 | Examples
96 | --------
97 | The following is the intended behaviour of this function:
98 |
99 | >>> leftpad('cat', width=5, fillchar="Z")
100 | 'ZZcat'
101 | >>> leftpad('Dog', width=2, fillchar="Z")
102 | 'Dog'
103 | """
104 | assert isinstance(width, int) and width >= 0, width
105 | assert isinstance(fillchar, str) and len(fillchar) == 1, fillchar
106 | margin = max(width - len(string), 0)
107 | return margin * fillchar + string
108 |
109 |
110 | def safe_name(obj: Any, repr_allowed: bool=True) -> str:
111 |     """Tries to get a descriptive name for an object. Returns '<unknown>'
112 |     instead of raising - useful for writing descriptive/safe error messages."""
113 | if hasattr(obj, "__qualname__"):
114 | return obj.__qualname__
115 |
116 | if hasattr(obj, "__name__"):
117 | return obj.__name__
118 |
119 | if repr_allowed and hasattr(obj, "__repr__"):
120 | return repr(obj)
121 |
122 |     return "<unknown>"
123 |
124 |
125 | def run_length_encoder(in_string: str) -> List[Union[str, int]]:
126 | """
127 | >>> run_length_encoder("aaaaabbcbc")
128 | ['a', 'a', 5, 'b', 'b', 2, 'c', 'b', 'c']
129 | """
130 | assert isinstance(in_string, str)
131 | out = []
132 | for item, group in groupby(in_string):
133 | cnt = sum(1 for x in group)
134 | if cnt == 1:
135 | out.append(item)
136 | else:
137 | out.extend((item, item, cnt))
138 | assert isinstance(out, list)
139 | assert all(isinstance(x, (str, int)) for x in out)
140 | return out
141 |
142 |
143 | # SOLUTION
144 | def run_length_decoder(in_list: List[Union[str, int]]) -> str:
145 | """
146 | >>> run_length_decoder(['a', 'a', 5, 'b', 'b', 2, 'c', 'b', 'c'])
147 |     'aaaaabbcbc'
148 | """
149 | out: str = ""
150 | for n, item in enumerate(in_list):
151 | if isinstance(item, int):
152 | char = in_list[n - 1]
153 | assert isinstance(char, str)
154 | out += char * (item - 2)
155 | else:
156 | out += item
157 | return out
158 |
159 |
160 | def pairwise_dists(x, y):
161 | """ Computing pairwise Euclidean distance between the respective
162 | row-vectors of `x` and `y`
163 |
164 | Parameters
165 | ----------
166 | x : numpy.ndarray, shape=(M, D)
167 | y : numpy.ndarray, shape=(N, D)
168 |
169 | Returns
170 | -------
171 | numpy.ndarray, shape=(M, N)
172 | The Euclidean distance between each pair of
173 | rows between `x` and `y`."""
174 | sqr_dists = -2 * np.matmul(x, y.T)
175 | sqr_dists += np.sum(x**2, axis=1)[:, np.newaxis]
176 | sqr_dists += np.sum(y**2, axis=1)
177 | return np.sqrt(np.clip(sqr_dists, a_min=0, a_max=None))
178 |
179 |
180 | def softmax(x):
181 | x = x - x.max()
182 | return np.exp(x) / np.exp(x).sum()
183 |
--------------------------------------------------------------------------------
/tests/test_settings.py:
--------------------------------------------------------------------------------
1 | import json
2 |
3 | from hypothesis import Phase, given, settings
4 | from hypothesis import strategies as st
5 |
6 | # Running
7 | # pytest --hypothesis-show-statistics tests/test_settings.py
8 | #
9 | # will report lines which were always and only run by failing examples.
10 | # How useful is the report in this case? Would it be useful on your
11 | # code? Does it report the same lines if you re-run the tests?
12 |
13 |
14 | @settings(phases=tuple(Phase)) # Activates the `explain` phase!
15 | @given(
16 | allow_nan=st.booleans(),
17 | obj=st.recursive(
18 | st.none() | st.booleans() | st.floats() | st.text(),
19 | extend=lambda x: st.lists(x) | st.dictionaries(st.text(), x),
20 | ),
21 | )
22 | def test_roundtrip_dumps_loads(allow_nan, obj):
23 | encoded = json.dumps(obj=obj, allow_nan=allow_nan)
24 | decoded = json.loads(s=encoded)
25 | assert obj == decoded
26 |
--------------------------------------------------------------------------------
/tests/test_statistics.py:
--------------------------------------------------------------------------------
1 | from hypothesis import event, given, note
2 | from hypothesis import strategies as st
3 |
4 |
5 | @given(st.integers())
6 | def test_demonstrating_note(x):
7 | note(f"noting {x}")
8 | print(f"printing {x}")
9 | assert x < 3
10 |
11 |
12 | @given(st.integers().filter(lambda x: x % 2 == 0))
13 | def test_even_integers(i):
14 | event(f"i mod 3 = {i%3}")
15 |
--------------------------------------------------------------------------------
/tests/test_with_target.py:
--------------------------------------------------------------------------------
1 | from hypothesis import given
2 | from hypothesis import strategies as st
3 | from hypothesis import target
4 |
5 | # OK, what's going on here?
6 | #
7 | # If you run this test, it'll fail with x=1, y=-1 ... and you might wonder
8 | # if we have some kind of comparison bug involving the sign bit.
9 | # Uncomment the `target()` though, and you'll see a new line of output:
10 | #
11 | # Highest target score: ______ (label='difference between x and y')
12 | #
13 | # and that large score should make it obvious that the bug is not small!
14 |
15 |
16 | @given(st.integers(), st.integers())
17 | def test_positive_and_negative_integers_are_equal(x, y):
18 | if x and y:
19 | # target(abs(x - y), label="difference between x and y")
20 | assert x == y
21 |
--------------------------------------------------------------------------------