├── c ├── VERSION.txt ├── subprojects │ └── kastore │ │ ├── VERSION.txt │ │ ├── README.md │ │ └── meson.build ├── .gitignore ├── tests │ ├── meson-subproject │ │ ├── subprojects │ │ │ └── tskit │ │ ├── meson.build │ │ └── example.c │ └── test_convert.c ├── meson_options.txt ├── examples │ ├── error_handling.c │ ├── api_structure.c │ ├── Makefile │ ├── streaming.c │ ├── take_ownership.c │ ├── tree_iteration.c │ ├── cpp_sorting_example.cpp │ ├── tree_traversal.c │ ├── haploid_wright_fisher.c │ └── multichrom_wright_fisher_singlethreaded.c ├── tskit.h ├── tskit │ ├── convert.h │ └── stats.h └── meson.build ├── python ├── lib ├── LICENSE ├── tskit │ ├── jit │ │ └── __init__.py │ ├── __main__.py │ ├── _version.py │ ├── provenance.schema.json │ ├── exceptions.py │ ├── __init__.py │ └── provenance.py ├── tests │ ├── data │ │ ├── simplify-bugs │ │ │ ├── 01-sites.txt │ │ │ ├── 02-sites.txt │ │ │ ├── 04-sites.txt │ │ │ ├── 05-sites.txt │ │ │ ├── 01-mutations.txt │ │ │ ├── 02-mutations.txt │ │ │ ├── 04-mutations.txt │ │ │ ├── 05-mutations.txt │ │ │ ├── 03-mutations.txt │ │ │ ├── 03-sites.txt │ │ │ ├── 01-nodes.txt │ │ │ ├── 04-nodes.txt │ │ │ ├── 05-edges.txt │ │ │ ├── 04-edges.txt │ │ │ ├── 05-nodes.txt │ │ │ ├── 03-edges.txt │ │ │ ├── 01-edges.txt │ │ │ ├── 03-nodes.txt │ │ │ └── 02-nodes.txt │ │ ├── SLiM │ │ │ ├── README │ │ │ ├── minimal-example.trees │ │ │ ├── single-locus-example.trees │ │ │ ├── minimal-example.txt │ │ │ └── single-locus-example.txt │ │ ├── old-formats │ │ │ └── tskit-0.3.3.trees │ │ ├── dict-encodings │ │ │ ├── msprime-0.7.4.pkl │ │ │ └── generate_msprime.py │ │ ├── hdf5-formats │ │ │ ├── msprime-0.3.0_v2.0.hdf5 │ │ │ ├── msprime-0.4.0_v3.1.hdf5 │ │ │ └── msprime-0.5.0_v10.0.hdf5 │ │ └── svg │ │ │ ├── tree_subtree.svg │ │ │ ├── tree.svg │ │ │ ├── tree_simple_collapsed.svg │ │ │ ├── tree_muts.svg │ │ │ ├── tree_both_axes.svg │ │ │ ├── tree_timed_muts.svg │ │ │ ├── tree_y_axis_rank.svg │ │ │ ├── tree_x_axis.svg │ │ │ └── tree_poly_tracked_collapse.svg │ 
├── test_version.py │ ├── test_dict_encoding.py │ └── conftest.py ├── .gitignore ├── MANIFEST.in ├── .flake8 ├── lwt_interface │ ├── cython_example │ │ ├── Makefile │ │ ├── pyproject.toml │ │ ├── setup.py │ │ ├── _lwtc.c │ │ └── example.pyx │ ├── Makefile │ ├── CHANGELOG.rst │ ├── setup.py │ ├── test_example_c_module.py │ ├── README.md │ └── example_c_module.c ├── development.yml ├── README.rst ├── Makefile ├── setup.py ├── benchmark │ ├── run-for-all-releases.py │ └── config.yaml ├── stress_lowlevel.py └── pyproject.toml ├── docs ├── .gitignore ├── _static │ ├── README │ ├── four_leaf_tree_shapes.png │ ├── bespoke.css │ └── tskit_logo_pale.svg ├── favicon.ico ├── data │ └── basic_tree_seq.trees ├── substitutions │ ├── linear_traversal_warning.rst │ ├── virtual_root_array_note.rst │ ├── table_edit_warning.rst │ ├── tree_array_warning.rst │ └── table_keep_rows_main.rst ├── changelogs.rst ├── cli.md ├── build.sh ├── Makefile ├── _toc.yml ├── convert_changelog.py ├── introduction.md ├── citation.md ├── installation.md ├── quickstart.md ├── _config.yml ├── logo.svg └── ibd.md ├── .github ├── workflows │ ├── docker │ │ ├── shared.env │ │ └── buildwheel.sh │ ├── docs.yml │ └── release.yml └── PULL_REQUEST_TEMPLATE.md ├── .gitignore ├── .circleci └── images │ └── 32bit │ └── Dockerfile ├── CONTRIBUTING.md ├── .clang-format ├── .codecov.yml ├── LICENSE ├── .pre-commit-config.yaml └── README.md /c/VERSION.txt: -------------------------------------------------------------------------------- 1 | 1.3.0 -------------------------------------------------------------------------------- /python/lib: -------------------------------------------------------------------------------- 1 | ../c/ -------------------------------------------------------------------------------- /python/LICENSE: -------------------------------------------------------------------------------- 1 | ../LICENSE -------------------------------------------------------------------------------- 
/python/tskit/jit/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /c/subprojects/kastore/VERSION.txt: -------------------------------------------------------------------------------- 1 | 2.1.1 2 | -------------------------------------------------------------------------------- /docs/.gitignore: -------------------------------------------------------------------------------- 1 | _build 2 | doxygen/xml 3 | -------------------------------------------------------------------------------- /c/.gitignore: -------------------------------------------------------------------------------- 1 | build 2 | .*.swp 3 | .*.swo 4 | -------------------------------------------------------------------------------- /c/tests/meson-subproject/subprojects/tskit: -------------------------------------------------------------------------------- 1 | ../../../ -------------------------------------------------------------------------------- /docs/_static/README: -------------------------------------------------------------------------------- 1 | Placeholder file to keep git happy. 
2 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/01-sites.txt: -------------------------------------------------------------------------------- 1 | position ancestral_state 2 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/02-sites.txt: -------------------------------------------------------------------------------- 1 | position ancestral_state 2 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/04-sites.txt: -------------------------------------------------------------------------------- 1 | position ancestral_state 2 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/05-sites.txt: -------------------------------------------------------------------------------- 1 | position ancestral_state 2 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/01-mutations.txt: -------------------------------------------------------------------------------- 1 | site node derived_state 2 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/02-mutations.txt: -------------------------------------------------------------------------------- 1 | site node derived_state 2 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/04-mutations.txt: -------------------------------------------------------------------------------- 1 | site node derived_state 2 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/05-mutations.txt: -------------------------------------------------------------------------------- 1 | site node derived_state 2 | 
-------------------------------------------------------------------------------- /c/meson_options.txt: -------------------------------------------------------------------------------- 1 | option('build_examples', type : 'boolean', value : true) 2 | -------------------------------------------------------------------------------- /docs/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tskit-dev/tskit/HEAD/docs/favicon.ico -------------------------------------------------------------------------------- /python/tests/data/SLiM/README: -------------------------------------------------------------------------------- 1 | The files in this directory are generated by SLiM. 2 | -------------------------------------------------------------------------------- /python/.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | *.so 3 | *.egg-info 4 | build 5 | .*.swp 6 | .*.swo 7 | */.ipynb_checkpoints 8 | -------------------------------------------------------------------------------- /python/tskit/__main__.py: -------------------------------------------------------------------------------- 1 | from . 
import cli 2 | 3 | if __name__ == "__main__": 4 | cli.tskit_main() 5 | -------------------------------------------------------------------------------- /docs/data/basic_tree_seq.trees: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tskit-dev/tskit/HEAD/docs/data/basic_tree_seq.trees -------------------------------------------------------------------------------- /docs/_static/four_leaf_tree_shapes.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tskit-dev/tskit/HEAD/docs/_static/four_leaf_tree_shapes.png -------------------------------------------------------------------------------- /.github/workflows/docker/shared.env: -------------------------------------------------------------------------------- 1 | PYTHON_VERSIONS=( 2 | cp313-cp313 3 | cp312-cp312 4 | cp311-cp311 5 | cp310-cp310 6 | ) 7 | -------------------------------------------------------------------------------- /python/tests/data/SLiM/minimal-example.trees: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tskit-dev/tskit/HEAD/python/tests/data/SLiM/minimal-example.trees -------------------------------------------------------------------------------- /python/tests/data/old-formats/tskit-0.3.3.trees: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tskit-dev/tskit/HEAD/python/tests/data/old-formats/tskit-0.3.3.trees -------------------------------------------------------------------------------- /python/tests/data/SLiM/single-locus-example.trees: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tskit-dev/tskit/HEAD/python/tests/data/SLiM/single-locus-example.trees -------------------------------------------------------------------------------- 
/python/tests/data/dict-encodings/msprime-0.7.4.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tskit-dev/tskit/HEAD/python/tests/data/dict-encodings/msprime-0.7.4.pkl -------------------------------------------------------------------------------- /python/tests/data/hdf5-formats/msprime-0.3.0_v2.0.hdf5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tskit-dev/tskit/HEAD/python/tests/data/hdf5-formats/msprime-0.3.0_v2.0.hdf5 -------------------------------------------------------------------------------- /python/tests/data/hdf5-formats/msprime-0.4.0_v3.1.hdf5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tskit-dev/tskit/HEAD/python/tests/data/hdf5-formats/msprime-0.4.0_v3.1.hdf5 -------------------------------------------------------------------------------- /python/tests/data/hdf5-formats/msprime-0.5.0_v10.0.hdf5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tskit-dev/tskit/HEAD/python/tests/data/hdf5-formats/msprime-0.5.0_v10.0.hdf5 -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | build-gcc 2 | .DS_Store 3 | python/benchmark/*.trees 4 | python/benchmark/*.json 5 | python/benchmark/*.html 6 | .venv 7 | .env 8 | .vscode 9 | env 10 | -------------------------------------------------------------------------------- /python/tskit/_version.py: -------------------------------------------------------------------------------- 1 | # Definitive location for the version number. 
2 | # During development, should be x.y.z.devN 3 | # For beta should be x.y.zbN 4 | tskit_version = "1.0.0" 5 | -------------------------------------------------------------------------------- /c/subprojects/kastore/README.md: -------------------------------------------------------------------------------- 1 | This directory is an abbreviated version of the kastore distribution source. 2 | 3 | All files should be updated when we are updating to a new kastore version. 4 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/03-mutations.txt: -------------------------------------------------------------------------------- 1 | site node derived_state 2 | 0 52 1 3 | 1 34 1 4 | 2 57 1 5 | 2 3 1 6 | 3 58 1 7 | 4 34 1 8 | 5 56 1 9 | 6 55 1 10 | 6 1 1 11 | 7 51 1 12 | 8 43 1 13 | 9 54 1 14 | 9 0 1 15 | -------------------------------------------------------------------------------- /python/MANIFEST.in: -------------------------------------------------------------------------------- 1 | include LICENSE 2 | include lwt_interface/tskit_lwt_interface.h 3 | include lib/subprojects/kastore/kastore.h 4 | include lib/tskit.h 5 | include lib/tskit/*.h 6 | include tskit/_version.py 7 | include tskit/provenance.schema.json 8 | -------------------------------------------------------------------------------- /docs/_static/bespoke.css: -------------------------------------------------------------------------------- 1 | /* When a code cell outputs tskit tables in plain text, widen the tab size so column 2 | contents line up. 
Invoke this by adding :tags:["output-wide-tabs"] to the cell */ 3 | .tag_output-wide-tabs .cell_output pre {tab-size: 16} 4 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/03-sites.txt: -------------------------------------------------------------------------------- 1 | position ancestral_state 2 | 284.252209 0 3 | 1313.686815 0 4 | 1554.123401 0 5 | 1736.203571 0 6 | 3310.290546 0 7 | 4208.672558 0 8 | 4995.288904 0 9 | 5187.559857 0 10 | 5211.162157 0 11 | 5483.889413 0 12 | -------------------------------------------------------------------------------- /c/tests/meson-subproject/meson.build: -------------------------------------------------------------------------------- 1 | project('example', 'c') 2 | 3 | tskit_proj = subproject('tskit') 4 | tskit_dep = tskit_proj.get_variable('tskit_dep') 5 | 6 | executable('example', 7 | 'example.c', 8 | dependencies : [tskit_dep], 9 | install : true) 10 | 11 | -------------------------------------------------------------------------------- /docs/substitutions/linear_traversal_warning.rst: -------------------------------------------------------------------------------- 1 | .. warning:: The current implementation of this operation is linear in the number of 2 | trees, so may be inefficient for large tree sequences. See 3 | `this issue `_ for more 4 | information. 5 | -------------------------------------------------------------------------------- /python/.flake8: -------------------------------------------------------------------------------- 1 | [flake8] 2 | # Based directly on Black's recommendations: 3 | # https://black.readthedocs.io/en/stable/the_black_code_style.html#line-length 4 | max-line-length = 81 5 | select = A,C,E,F,W,B,B950 6 | #B305 doesn't like `.next()` that is a key Tree method. 
7 | ignore = E203, E501, W503, B305 8 | -------------------------------------------------------------------------------- /python/lwt_interface/cython_example/Makefile: -------------------------------------------------------------------------------- 1 | all: compile run 2 | 3 | compile: 4 | pip install -e . --use-pep517 5 | 6 | run: 7 | PYTHONPATH=../.. python -c "import example; example.main()" 8 | 9 | clean: 10 | rm -rf build/ 11 | rm -rf tskit_cython_example.egg-info/ 12 | rm -f example.c 13 | rm -f *.so 14 | -------------------------------------------------------------------------------- /docs/changelogs.rst: -------------------------------------------------------------------------------- 1 | .. note: this is left in rst format to avoid Duplicate ID issues 2 | 3 | .. _sec_changelogs: 4 | 5 | ========== 6 | Changelogs 7 | ========== 8 | 9 | ****** 10 | Python 11 | ****** 12 | 13 | .. include:: ../python/CHANGELOG.rst 14 | 15 | ***** 16 | C API 17 | ***** 18 | 19 | .. include:: ../c/CHANGELOG.rst 20 | -------------------------------------------------------------------------------- /docs/substitutions/virtual_root_array_note.rst: -------------------------------------------------------------------------------- 1 | .. note:: The length of these arrays is 2 | equal to the number of nodes in the tree sequence plus 1, with the 3 | final element corresponding to the tree's :meth:`~.Tree.virtual_root`. 4 | Please see the :ref:`tree roots ` section 5 | for more details. 
6 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/01-nodes.txt: -------------------------------------------------------------------------------- 1 | is_sample time 2 | 1 0.000000 3 | 1 0.000000 4 | 1 0.000000 5 | 1 0.000000 6 | 1 0.000000 7 | 0 5.000000 8 | 0 6.000000 9 | 0 7.000000 10 | 0 8.000000 11 | 0 9.000000 12 | 0 10.000000 13 | 0 11.000000 14 | 0 12.000000 15 | 0 13.000000 16 | 0 14.000000 17 | 0 15.000000 18 | 0 16.000000 19 | -------------------------------------------------------------------------------- /.circleci/images/32bit/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM i386/python:3.10-slim-bullseye 2 | 3 | RUN apt-get update && apt-get install -y sudo rustc cargo libhdf5-dev libgsl-dev pkg-config libssl-dev llvm build-essential 4 | RUN adduser --disabled-password --gecos "" circleci 5 | RUN echo 'circleci ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers 6 | 7 | USER circleci 8 | -------------------------------------------------------------------------------- /.github/workflows/docs.yml: -------------------------------------------------------------------------------- 1 | name: Build Docs 2 | on: 3 | pull_request: 4 | merge_group: 5 | push: 6 | branches: [main] 7 | tags: 8 | - '*' 9 | jobs: 10 | Docs: 11 | uses: tskit-dev/.github/.github/workflows/docs-build-template.yml@v1 12 | with: 13 | additional-setup: sudo apt-get install -y doxygen 14 | make-command: make -C python -------------------------------------------------------------------------------- /docs/substitutions/table_edit_warning.rst: -------------------------------------------------------------------------------- 1 | .. warning:: The numpy arrays returned by table attribute accesses are copies 2 | of the underlying data. 
In particular, this means that editing 3 | individual values in the arrays will not change the table data. 4 | Instead, you should set entire columns or rows at once 5 | (see :ref:`sec_tables_api_accessing_table_data`). 6 | -------------------------------------------------------------------------------- /python/lwt_interface/Makefile: -------------------------------------------------------------------------------- 1 | 2 | all: cmodule 3 | 4 | allchecks: example_c_module.c 5 | CFLAGS="-std=c99 --coverage -Wall -Wextra -Werror -Wno-unused-parameter -Wno-cast-function-type" \ 6 | python3 setup.py build_ext --inplace 7 | 8 | cmodule: example_c_module.c 9 | python3 setup.py build_ext --inplace 10 | 11 | clean: 12 | rm -f *.so *.o tags 13 | rm -fR build 14 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/04-nodes.txt: -------------------------------------------------------------------------------- 1 | is_sample time population 2 | 1 0.000000 -1 3 | 1 0.000000 -1 4 | 1 0.000000 -1 5 | 1 0.000000 -1 6 | 1 0.000000 -1 7 | 1 0.000000 -1 8 | 0 1.000000 -1 9 | 0 1.000000 -1 10 | 0 1.000000 -1 11 | 0 1.000000 -1 12 | 0 2.000000 -1 13 | 0 3.000000 -1 14 | 0 4.000000 -1 15 | 0 2.000000 -1 16 | 0 1.000000 -1 17 | 0 2.000000 -1 18 | -------------------------------------------------------------------------------- /python/development.yml: -------------------------------------------------------------------------------- 1 | name: tskit-dev 2 | channels: 3 | - conda-forge 4 | dependencies: 5 | - python>=3.9 6 | - pip 7 | - doxygen 8 | - numpy 9 | - pandas 10 | - scipy 11 | - pytest 12 | - pytest-cov 13 | - pytest-xdist 14 | - coverage 15 | - flake8 16 | - mypy 17 | - pre-commit 18 | - sphinx>=4.4 19 | - jupyter-book 20 | - networkx 21 | - matplotlib 22 | - jsonschema 23 | -------------------------------------------------------------------------------- /python/README.rst: 
-------------------------------------------------------------------------------- 1 | 2 | The tree sequence toolkit. 3 | 4 | Tskit is a cross-platform library for the storage and analysis of large-scale 5 | genetic genealogy and variation data. 6 | Please see the `documentation `_ 7 | for further details. 8 | 9 | Tskit is highly portable, and provides a number of 10 | `installation options `_. 11 | -------------------------------------------------------------------------------- /python/lwt_interface/cython_example/pyproject.toml: -------------------------------------------------------------------------------- 1 | [build-system] 2 | requires = ["setuptools>=64", "wheel", "Cython", "numpy"] 3 | build-backend = "setuptools.build_meta" 4 | 5 | [project] 6 | name = "tskit_cython_example" 7 | version = "0.0.1" 8 | description = "Cython example for tskit" 9 | authors = [{name = "tskit developers"}] 10 | dependencies = ["numpy", "Cython"] 11 | 12 | [tool.setuptools] 13 | packages = [] 14 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/05-edges.txt: -------------------------------------------------------------------------------- 1 | left right parent child 2 | 0.0 0.8 5 9 3 | 0.3 1.0 5 10 4 | 0.0 1.0 6 8 5 | 0.0 0.3 6 10 6 | 0.0 0.9 7 11 7 | 0.0 1.0 7 12 8 | 0.8 1.0 7 9 9 | 0.9 1.0 1 11 10 | 0.4 1.0 1 6 11 | 0.0 0.4 4 6 12 | 0.0 1.0 4 7 13 | 0.0 1.0 0 1,2,4,5 14 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/04-edges.txt: -------------------------------------------------------------------------------- 1 | left right parent child 2 | 0.000000 0.500000 6 0,1 3 | 0.500000 1.000000 6 4,5 4 | 0.000000 0.400000 7 2,3 5 | 0.000000 0.500000 8 4,5 6 | 0.500000 1.000000 8 0,1 7 | 0.400000 1.000000 9 2,3 8 | 0.400000 1.000000 10 8,9 9 | 0.000000 0.100000 13 6,14 10 | 0.100000 0.400000 15 7,14 11 | 0.000000 0.100000 11 7,13 12 | 0.100000 0.400000 11 
6,15 13 | 0.000000 0.400000 12 8,11 14 | 0.400000 1.000000 12 6,10 15 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/05-nodes.txt: -------------------------------------------------------------------------------- 1 | id is_sample population time 2 | 0 0 0 6.0 3 | 1 0 0 2.0 4 | 2 0 0 2.0 5 | 3 0 0 2.0 6 | 4 0 0 2.0 7 | 5 0 0 1.0 8 | 6 0 0 1.0 9 | 7 0 0 1.0 10 | 8 1 0 0.0 11 | 9 1 0 0.0 12 | 10 1 0 0.0 13 | 11 1 0 0.0 14 | 12 1 0 0.0 15 | -------------------------------------------------------------------------------- /python/tests/data/SLiM/minimal-example.txt: -------------------------------------------------------------------------------- 1 | initialize() { 2 | initializeTreeSeq(); 3 | initializeMutationRate(0.0); 4 | initializeMutationType("m1", 0.5, "f", -0.1); 5 | initializeGenomicElementType("g1", m1, 1.0); 6 | initializeGenomicElement(g1, 0, 9); 7 | initializeRecombinationRate(1e-1); 8 | } 9 | 1 { 10 | sim.addSubpop("p1", 5); 11 | } 12 | 3 { 13 | sim.treeSeqOutput("tests/data/SLiM/minimal-example.trees"); 14 | sim.simulationFinished(); 15 | } 16 | -------------------------------------------------------------------------------- /python/tests/data/SLiM/single-locus-example.txt: -------------------------------------------------------------------------------- 1 | initialize() { 2 | initializeTreeSeq(); 3 | initializeMutationRate(0.0); 4 | initializeMutationType("m1", 0.5, "f", -0.1); 5 | initializeGenomicElementType("g1", m1, 1.0); 6 | initializeGenomicElement(g1, 0, 9); 7 | initializeRecombinationRate(0); 8 | } 9 | 1 { 10 | sim.addSubpop("p1", 5); 11 | } 12 | 3 { 13 | sim.treeSeqOutput("tests/data/SLiM/single-locus-example.trees"); 14 | sim.simulationFinished(); 15 | } 16 | -------------------------------------------------------------------------------- /docs/cli.md: -------------------------------------------------------------------------------- 1 | --- 2 | jupytext: 3 | text_representation: 4 
| extension: .md 5 | format_name: myst 6 | format_version: 0.12 7 | jupytext_version: 1.9.1 8 | kernelspec: 9 | display_name: Python 3 10 | language: python 11 | name: python3 12 | --- 13 | 14 | ```{currentmodule} tskit 15 | ``` 16 | 17 | (sec_cli)= 18 | 19 | # Command line interface 20 | 21 | ```{eval-rst} 22 | .. argparse:: 23 | :module: tskit.cli 24 | :func: get_tskit_parser 25 | :prog: python3 -m tskit 26 | ``` -------------------------------------------------------------------------------- /docs/substitutions/tree_array_warning.rst: -------------------------------------------------------------------------------- 1 | .. warning:: This is a high-performance interface which 2 | provides zero-copy access to memory used in the C library. 3 | As a consequence, the values stored in this array will change as 4 | the Tree state is modified as we move along the tree sequence. See the 5 | :class:`.Tree` documentation for more details. Therefore, if you want to 6 | compare arrays representing different trees along the sequence, you must 7 | take **copies** of the arrays. 8 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | Tskit is a free and open-source project that welcomes contributions from everyone. 4 | The [Developer documentation](https://tskit.dev/tskit/docs/latest/development.html) 5 | will help you get started. 6 | 7 | We have an active slack group where tskit and associated projects are discussed. 8 | If you wish to join email [admin@tskit.dev](mailto:admin@tskit.dev). 9 | 10 | We ask all users to follow our [code of conduct](https://github.com/tskit-dev/.github/blob/main/CODE_OF_CONDUCT.md) 11 | when interacting with the project. 
12 | -------------------------------------------------------------------------------- /docs/build.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Jupyter-build doesn't have an option to automatically show the 4 | # saved reports, which makes it difficult to debug the reasons for 5 | # build failures in CI. This is a simple wrapper to handle that. 6 | 7 | REPORTDIR=_build/html/reports 8 | 9 | jupyter-book build -Wn --keep-going . 10 | RETVAL=$? 11 | if [ $RETVAL -ne 0 ]; then 12 | if [ -e $REPORTDIR ]; then 13 | echo "Error occurred; showing saved reports" 14 | cat $REPORTDIR/* 15 | fi 16 | else 17 | # Clear out any old reports 18 | rm -f $REPORTDIR/* 19 | fi 20 | exit $RETVAL 21 | -------------------------------------------------------------------------------- /python/Makefile: -------------------------------------------------------------------------------- 1 | PYTHON := $(shell command -v python 2> /dev/null) 2 | ifndef PYTHON 3 | $(error "python is not available via the `python` executable. 
If you have python only via `python3` you may need to `apt install python-is-python3`") 4 | endif 5 | 6 | all: ext3 7 | 8 | allchecks: _tskitmodule.c 9 | CFLAGS="-std=c99 --coverage -Wall -Wextra -Werror -Wno-unused-parameter -Wno-cast-function-type" \ 10 | python setup.py build_ext --inplace 11 | 12 | ext3: _tskitmodule.c 13 | 14 | python setup.py build_ext --inplace 15 | 16 | ctags: 17 | ctags lib/*.c lib/*.h tskit/*.py 18 | 19 | clean: 20 | rm -f *.so *.o tags 21 | rm -fR build 22 | -------------------------------------------------------------------------------- /.github/workflows/docker/buildwheel.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | DOCKER_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" 3 | source "$DOCKER_DIR/shared.env" 4 | 5 | set -e -x 6 | 7 | ARCH=`uname -p` 8 | echo "arch=$ARCH" 9 | #yum -y install gsl-devel #For msprime 10 | 11 | cd python 12 | 13 | for V in "${PYTHON_VERSIONS[@]}"; do 14 | PYBIN=/opt/python/$V/bin 15 | rm -rf build/ # Avoid lib build by narrow Python is used by wide python 16 | $PYBIN/python -m pip install build 17 | $PYBIN/python -m build --wheel 18 | done 19 | 20 | cd dist 21 | for whl in *.whl; do 22 | auditwheel repair "$whl" 23 | rm "$whl" 24 | done 25 | -------------------------------------------------------------------------------- /.clang-format: -------------------------------------------------------------------------------- 1 | Language: Cpp 2 | BasedOnStyle: GNU 3 | SortIncludes: false 4 | AllowShortIfStatementsOnASingleLine: false 5 | BreakBeforeBraces: Linux 6 | TabWidth: 4 7 | IndentWidth: 4 8 | ColumnLimit: 89 9 | SpaceBeforeParens: 10 | ControlStatements 11 | SpacesInCStyleCastParentheses: false 12 | SpaceAfterCStyleCast: true 13 | IndentCaseLabels: true 14 | AlignAfterOpenBracket: DontAlign 15 | BinPackArguments: true 16 | BinPackParameters: true 17 | AlwaysBreakAfterReturnType: AllDefinitions 18 | 19 | # These are disabled for version 6 
compatibility 20 | # StatementMacros: ["PyObject_HEAD"] 21 | # AlignConsecutiveMacros: true 22 | -------------------------------------------------------------------------------- /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | ## Description 2 | 3 | Thanks for contributing to tskit! :heart: 4 | A guide to the PR process is [here](https://tskit.readthedocs.io/en/latest/development.html#development_workflow_git) 5 | Please replace this text with a summary of the change and which issue is fixed, if any. Please also include relevant motivation and context. 6 | 7 | Fixes #(issue) <- Putting the issue number here will auto-close the issue when this PR is merged 8 | 9 | # PR Checklist: 10 | 11 | - [ ] Tests that fully cover new/changed functionality. 12 | - [ ] Documentation including tutorial content if appropriate. 13 | - [ ] Changelogs, if there are API changes. 14 | -------------------------------------------------------------------------------- /python/tests/data/dict-encodings/generate_msprime.py: -------------------------------------------------------------------------------- 1 | import pathlib 2 | import pickle 3 | 4 | import _msprime 5 | import msprime 6 | 7 | pop_configs = [msprime.PopulationConfiguration(5) for _ in range(2)] 8 | migration_matrix = [[0, 1], [1, 0]] 9 | ts = msprime.simulate( 10 | population_configurations=pop_configs, 11 | migration_matrix=migration_matrix, 12 | mutation_rate=1, 13 | record_migrations=True, 14 | random_seed=1, 15 | ) 16 | lwt = _msprime.LightweightTableCollection() 17 | lwt.fromdict(ts.tables.asdict()) 18 | 19 | test_dir = pathlib.Path(__file__).parent 20 | with open(test_dir / f"msprime-{msprime.__version__}.pkl", "wb") as f: 21 | pickle.dump(lwt.asdict(), f) 22 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/03-edges.txt: 
-------------------------------------------------------------------------------- 1 | left right parent child 2 | 0.000000 10000.000000 50 29,31 3 | 0.000000 10000.000000 51 11,15 4 | 0.000000 1554.123401 52 1,51 5 | 1554.123401 10000.000000 52 1 6 | 0.000000 1736.203571 53 52 7 | 1736.203571 10000.000000 53 51,52 8 | 0.000000 10000.000000 54 4,12,27,38,39,40 9 | 0.000000 10000.000000 55 17,25,45,48,49,50 10 | 0.000000 10000.000000 56 24,55 11 | 0.000000 1554.123401 57 56 12 | 1554.123401 1736.203571 57 51,56 13 | 1736.203571 10000.000000 57 56 14 | 0.000000 10000.000000 58 0,13,22,57 15 | 0.000000 10000.000000 59 2,3,5,6,7,8,9,10,14,16,18,19,20,21,23,26,28,30,32,33,34,35,36,37,41,42,43,44,46,47,53,54,58 16 | 0.000000 10000.000000 60 59 17 | -------------------------------------------------------------------------------- /docs/Makefile: -------------------------------------------------------------------------------- 1 | 2 | # Need to set PYTHONPATH so that we pick up the local tskit 3 | PYPATH=$(shell pwd)/../python/ 4 | TSK_VERSION:=$(shell PYTHONPATH=${PYPATH} \ 5 | python -c 'import tskit; print(tskit.__version__.split("+")[0])') 6 | 7 | BUILDDIR = _build 8 | DOXYGEN_XML=doxygen/xml 9 | 10 | all: ${DOXYGEN_XML} dev 11 | 12 | ${DOXYGEN_XML}: ../c/tskit/*.h 13 | cd doxygen && doxygen 14 | 15 | dev: 16 | PYTHONPATH=${PYPATH} ./build.sh 17 | 18 | dist: 19 | @echo Building distribution for tskit version ${TSK_VERSION} 20 | cd doxygen && doxygen 21 | sed -i -e s/__TSKIT_VERSION__/${TSK_VERSION}/g _config.yml 22 | PYTHONPATH=${PYPATH} ./build.sh 23 | 24 | clean: 25 | rm -fR $(BUILDDIR) $(DOXYGEN_XML) 26 | -------------------------------------------------------------------------------- /docs/substitutions/table_keep_rows_main.rst: -------------------------------------------------------------------------------- 1 | Updates this table in-place according to the specified boolean 2 | array, and returns the resulting mapping from old to new row IDs. 
3 | For each row ``j``, if ``keep[j]`` is True, that row will be 4 | retained in the output; otherwise, the row will be deleted. 5 | Rows are retained in their original ordering. 6 | 7 | The returned ``id_map`` is an array of the same length as 8 | this table before the operation, such that ``id_map[j] = -1`` 9 | (:data:`tskit.NULL`) if row ``j`` was deleted, and ``id_map[j]`` 10 | is the new ID of that row, otherwise. 11 | 12 | .. todo:: 13 | This needs some examples to link to. See 14 | https://github.com/tskit-dev/tskit/issues/2708 15 | -------------------------------------------------------------------------------- /docs/_toc.yml: -------------------------------------------------------------------------------- 1 | format: jb-book 2 | root: introduction 3 | parts: 4 | - caption: Getting started 5 | chapters: 6 | - file: installation 7 | - file: quickstart 8 | - caption: Concepts 9 | chapters: 10 | - file: glossary 11 | - file: data-model 12 | - file: metadata 13 | - file: provenance 14 | - caption: Analysis 15 | chapters: 16 | - file: stats 17 | - file: topological-analysis 18 | - file: ibd 19 | - file: export 20 | - caption: Interfaces 21 | chapters: 22 | - file: python-api 23 | - file: numba 24 | - file: c-api 25 | - file: cli 26 | - file: file-formats 27 | - caption: For developers 28 | chapters: 29 | - file: development 30 | - file: changelogs 31 | - caption: Miscellaneous 32 | chapters: 33 | - file: citation 34 | 35 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/01-edges.txt: -------------------------------------------------------------------------------- 1 | left right parent child 2 | 0.000000 4.000000 5 2,3 3 | 4.000000 9.000000 5 3 4 | 22.000000 28.000000 5 3 5 | 0.000000 18.000000 6 0,1,4 6 | 18.000000 19.000000 6 0,1,4,5 7 | 19.000000 28.000000 6 0,1,5 8 | 0.000000 19.000000 7 6 9 | 19.000000 28.000000 7 2,6 10 | 0.000000 28.000000 8 7 11 | 0.000000 28.000000 9 8 12 | 0.000000 
18.000000 10 5,9 13 | 18.000000 28.000000 10 9 14 | 0.000000 19.000000 11 10 15 | 19.000000 28.000000 11 4,10 16 | 0.000000 9.000000 12 11 17 | 9.000000 22.000000 12 3,11 18 | 22.000000 28.000000 12 11 19 | 0.000000 28.000000 13 12 20 | 0.000000 28.000000 14 13 21 | 0.000000 28.000000 15 14 22 | 0.000000 4.000000 16 15 23 | 4.000000 19.000000 16 2,15 24 | 19.000000 28.000000 16 15 25 | -------------------------------------------------------------------------------- /docs/convert_changelog.py: -------------------------------------------------------------------------------- 1 | import re 2 | import sys 3 | 4 | SUBS = [ 5 | (r":user:`([A-Za-z0-9-]*)`", r"[@\1](https://github.com/\1)"), 6 | (r":pr:`([0-9]*)`", r"[#\1](https://github.com/tskit-dev/tskit/issues/\1)"), 7 | (r":issue:`([0-9]*)`", r"[#\1](https://github.com/tskit-dev/tskit/issues/\1)"), 8 | ] 9 | 10 | 11 | def process_log(log): 12 | delimiters_seen = 0 13 | for line in log: 14 | if line.startswith("-------"): 15 | delimiters_seen += 1 16 | continue 17 | if delimiters_seen == 3: 18 | return 19 | if delimiters_seen % 2 == 0: 20 | for pattern, replace in SUBS: 21 | line = re.sub(pattern, replace, line) 22 | yield line 23 | 24 | 25 | with open(sys.argv[1]) as f: 26 | print("".join(process_log(f.readlines()))) 27 | -------------------------------------------------------------------------------- /c/examples/error_handling.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <tskit.h> 4 | 5 | int 6 | main(int argc, char **argv) 7 | { 8 | int ret; 9 | tsk_treeseq_t ts; 10 | 11 | if (argc != 2) { 12 | fprintf(stderr, "usage: "); 13 | exit(EXIT_FAILURE); 14 | } 15 | ret = tsk_treeseq_load(&ts, argv[1], 0); 16 | if (ret < 0) { 17 | /* Error condition.
Free and exit */ 18 | tsk_treeseq_free(&ts); 19 | fprintf(stderr, "%s", tsk_strerror(ret)); 20 | exit(EXIT_FAILURE); 21 | } 22 | printf("Loaded tree sequence with %lld nodes and %lld edges from %s\n", 23 | (long long) tsk_treeseq_get_num_nodes(&ts), 24 | (long long) tsk_treeseq_get_num_edges(&ts), 25 | argv[1]); 26 | tsk_treeseq_free(&ts); 27 | 28 | return EXIT_SUCCESS; 29 | } 30 | -------------------------------------------------------------------------------- /.codecov.yml: -------------------------------------------------------------------------------- 1 | codecov: 2 | require_ci_to_pass: false 3 | comment: 4 | layout: "diff, flags, files" 5 | fixes: 6 | - "tskit/::python/tskit/" 7 | flag_management: 8 | individual_flags: 9 | - name: python-tests 10 | paths: 11 | - python/tskit/*.py 12 | statuses: 13 | - type: project 14 | target: 95% 15 | - name: python-c-tests 16 | paths: 17 | - python/_tskitmodule.c 18 | statuses: 19 | - type: project 20 | target: 85% 21 | - name: c-tests 22 | paths: 23 | - c/tskit/*.c 24 | - c/tskit/*.h 25 | statuses: 26 | - type: project 27 | target: 85% 28 | - name: lwt-tests 29 | paths: 30 | - python/lwt_interface/*.c 31 | - python/lwt_interface/*.h 32 | statuses: 33 | - type: project 34 | target: 80% 35 | -------------------------------------------------------------------------------- /python/lwt_interface/CHANGELOG.rst: -------------------------------------------------------------------------------- 1 | -------------------- 2 | [0.1.4] - 2021-09-02 3 | -------------------- 4 | 5 | - Offset columns are now 64 bit in tskit. For compatibility, offset arrays that fit into 6 | 32bits will be a 32bit array in the dict encoding. 
Large arrays that require 64 bit 7 | will fail to ``fromdict`` in previous versions with the error: 8 | ``TypeError: Cannot cast array data from dtype('uint64') to dtype('uint32') according 9 | to the rule 'safe'`` 10 | A ``force_offset_64`` option on ``asdict`` allows the easy creation of 64bit arrays for 11 | testing. 12 | 13 | -------------------- 14 | [0.1.3] - 2021-02-01 15 | -------------------- 16 | 17 | - Added optional ``parents`` to individual table. 18 | 19 | -------------------- 20 | [0.1.2] - 2020-10-22 21 | -------------------- 22 | 23 | - Added optional top-level key ``indexes`` which contains ``edge_insertion_order`` and 24 | ``edge_removal_order`` -------------------------------------------------------------------------------- /c/examples/api_structure.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <tskit.h> 4 | 5 | #define check_tsk_error(val) \ 6 | if (val < 0) { \ 7 | fprintf(stderr, "line %d: %s", __LINE__, tsk_strerror(val)); \ 8 | exit(EXIT_FAILURE); \ 9 | } 10 | 11 | int 12 | main(int argc, char **argv) 13 | { 14 | int j, ret; 15 | tsk_edge_table_t edges; 16 | 17 | ret = tsk_edge_table_init(&edges, 0); 18 | check_tsk_error(ret); 19 | for (j = 0; j < 5; j++) { 20 | ret = tsk_edge_table_add_row(&edges, 0, 1, j + 1, j, NULL, 0); 21 | check_tsk_error(ret); 22 | } 23 | tsk_edge_table_print_state(&edges, stdout); 24 | tsk_edge_table_free(&edges); 25 | 26 | return EXIT_SUCCESS; 27 | } 28 | -------------------------------------------------------------------------------- /python/setup.py: -------------------------------------------------------------------------------- 1 | import os 2 | import platform 3 | 4 | import numpy 5 | from setuptools import Extension 6 | from setuptools import setup 7 | 8 | IS_WINDOWS = platform.system() == "Windows" 9 | 10 | 11 | libdir = "lib" 12 | kastore_dir = os.path.join(libdir, "subprojects", "kastore") 13 | tsk_source_files = [ 14 | "core.c", 15 |
"tables.c", 16 | "trees.c", 17 | "genotypes.c", 18 | "stats.c", 19 | "convert.c", 20 | "haplotype_matching.c", 21 | ] 22 | sources = ( 23 | ["_tskitmodule.c"] 24 | + [os.path.join(libdir, "tskit", f) for f in tsk_source_files] 25 | + [os.path.join(kastore_dir, "kastore.c")] 26 | ) 27 | 28 | defines = [] 29 | libraries = [] 30 | if IS_WINDOWS: 31 | libraries.append("Advapi32") 32 | defines.append(("WIN32", None)) 33 | 34 | _tskit_module = Extension( 35 | "_tskit", 36 | sources=sources, 37 | extra_compile_args=["-std=c99"], 38 | libraries=libraries, 39 | define_macros=defines, 40 | include_dirs=["lwt_interface", libdir, kastore_dir, numpy.get_include()], 41 | ) 42 | 43 | setup( 44 | ext_modules=[_tskit_module], 45 | ) 46 | -------------------------------------------------------------------------------- /python/benchmark/run-for-all-releases.py: -------------------------------------------------------------------------------- 1 | import json 2 | import subprocess 3 | from urllib.request import urlopen 4 | 5 | import tqdm 6 | from distutils.version import StrictVersion 7 | 8 | 9 | def versions(package_name): 10 | url = f"https://pypi.org/pypi/{package_name}/json" 11 | data = json.load(urlopen(url)) 12 | return sorted(data["releases"].keys(), key=StrictVersion) 13 | 14 | 15 | def sh(command): 16 | subprocess.run(command, check=True, shell=True) 17 | 18 | 19 | if __name__ == "__main__": 20 | try: 21 | sh("python -m venv _bench-temp-venv") 22 | sh("_bench-temp-venv/bin/pip install -r ../requirements/development.txt") 23 | versions = [ 24 | v 25 | for v in versions("tskit") 26 | # We don't want alphas, betas or two broken versions: 27 | if "a" not in v and "b" not in v and v not in ("0.0.0", "0.1.0") 28 | ] 29 | for v in tqdm.tqdm(versions): 30 | sh(f"_bench-temp-venv/bin/pip install tskit=={v}") 31 | sh("_bench-temp-venv/bin/python run.py") 32 | finally: 33 | sh("rm -rf _bench-temp-venv") 34 | -------------------------------------------------------------------------------- 
/LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018-2019 Tskit Developers 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /python/lwt_interface/cython_example/setup.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import os 3 | 4 | import numpy as np 5 | from Cython.Build import cythonize 6 | from setuptools import setup 7 | from setuptools.extension import Extension 8 | 9 | TSKIT_BASE = os.path.join(os.path.dirname(__file__), "..", "..", "..") 10 | TSKIT_C_PATH = os.path.join(TSKIT_BASE, "c") 11 | TSKIT_PY_PATH = os.path.join(TSKIT_BASE, "python/lwt_interface") 12 | KASTORE_PATH = os.path.join(TSKIT_BASE, "c", "subprojects", "kastore") 13 | include_dirs = [TSKIT_C_PATH, TSKIT_PY_PATH, KASTORE_PATH, np.get_include()] 14 | 15 | tskit_sourcefiles = list(glob.glob(os.path.join(TSKIT_C_PATH, "tskit", "*.c"))) + [ 16 | os.path.join(KASTORE_PATH, "kastore.c") 17 | ] 18 | 19 | extensions = [ 20 | Extension( 21 | "_lwtc", 22 | ["_lwtc.c"] + tskit_sourcefiles, 23 | language="c", 24 | include_dirs=include_dirs, 25 | ), 26 | Extension( 27 | "example", 28 | ["example.pyx"] + tskit_sourcefiles, 29 | language="c", 30 | include_dirs=include_dirs, 31 | ), 32 | ] 33 | 34 | extensions = cythonize(extensions, language_level=3) 35 | 36 | setup( 37 | name="tskit_cython_example", 38 | version="0.0.1", 39 | ext_modules=extensions, 40 | ) 41 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/03-nodes.txt: -------------------------------------------------------------------------------- 1 | is_sample time population 2 | 1 0.000000 -1 3 | 1 0.000000 -1 4 | 1 0.000000 -1 5 | 1 0.000000 -1 6 | 1 0.000000 -1 7 | 1 0.000000 -1 8 | 1 0.000000 -1 9 | 1 0.000000 -1 10 | 1 0.000000 -1 11 | 1 0.000000 -1 12 | 1 0.000000 -1 13 | 1 0.000000 -1 14 | 1 0.000000 -1 15 | 1 0.000000 -1 16 | 1 0.000000 -1 17 | 1 0.000000 -1 18 | 1 0.000000 -1 19 | 1 0.000000 -1 20 | 1 0.000000 -1 21 | 1 0.000000 -1 22 | 1 0.000000 -1 23 | 1 
0.000000 -1 24 | 1 0.000000 -1 25 | 1 0.000000 -1 26 | 1 0.000000 -1 27 | 1 0.000000 -1 28 | 1 0.000000 -1 29 | 1 0.000000 -1 30 | 1 0.000000 -1 31 | 1 0.000000 -1 32 | 1 0.000000 -1 33 | 1 0.000000 -1 34 | 1 0.000000 -1 35 | 1 0.000000 -1 36 | 1 0.000000 -1 37 | 1 0.000000 -1 38 | 1 0.000000 -1 39 | 1 0.000000 -1 40 | 1 0.000000 -1 41 | 1 0.000000 -1 42 | 1 0.000000 -1 43 | 1 0.000000 -1 44 | 1 0.000000 -1 45 | 1 0.000000 -1 46 | 1 0.000000 -1 47 | 1 0.000000 -1 48 | 1 0.000000 -1 49 | 1 0.000000 -1 50 | 1 0.000000 -1 51 | 1 0.000000 -1 52 | 0 50.000000 -1 53 | 0 51.000000 -1 54 | 0 52.000000 -1 55 | 0 53.000000 -1 56 | 0 54.000000 -1 57 | 0 55.000000 -1 58 | 0 56.000000 -1 59 | 0 57.000000 -1 60 | 0 58.000000 -1 61 | 0 59.000000 -1 62 | 0 60.000000 -1 63 | -------------------------------------------------------------------------------- /c/examples/Makefile: -------------------------------------------------------------------------------- 1 | # Simple Makefile for building examples. 2 | # This will build the examples in the current directory by compiling in the 3 | # full tskit source into each of the examples. This is *not* recommended for 4 | # real projects! 5 | # 6 | # To use, type "make" in this directory. If you have GSL installed you 7 | # should then get two example programs built. 8 | # 9 | # **Note**: This repo uses git submodules, and these must be checked out 10 | # correctly for this makefile to work, e.g.: 11 | # 12 | # $ git clone git@github.com:tskit-dev/tskit.git --recurse-submodules 13 | # 14 | # See the documentation (https://tskit.dev/tskit/docs/stable/c-api.html) 15 | # for more details on how to use the C API, and the tskit build examples 16 | # repo (https://github.com/tskit-dev/tskit-build-examples) for examples 17 | # of how to set up a production-ready build with tskit.
18 | # 19 | 20 | CFLAGS=-I../ -I../subprojects/kastore 21 | TSKIT_SOURCE=../tskit/*.c ../subprojects/kastore/kastore.c 22 | 23 | targets = api_structure error_handling \ 24 | haploid_wright_fisher streaming \ 25 | tree_iteration tree_traversal \ 26 | take_ownership 27 | 28 | all: $(targets) 29 | 30 | $(targets): %: %.c 31 | ${CC} ${CFLAGS} -o $@ $< ${TSKIT_SOURCE} -lm 32 | 33 | clean: 34 | rm -f $(targets) 35 | 36 | -------------------------------------------------------------------------------- /c/examples/streaming.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <tskit.h> 4 | 5 | #define check_tsk_error(val) \ 6 | if (val < 0) { \ 7 | fprintf(stderr, "Error: line %d: %s\n", __LINE__, tsk_strerror(val)); \ 8 | exit(EXIT_FAILURE); \ 9 | } 10 | 11 | int 12 | main(int argc, char **argv) 13 | { 14 | int ret; 15 | int j = 0; 16 | tsk_table_collection_t tables; 17 | 18 | ret = tsk_table_collection_init(&tables, 0); 19 | check_tsk_error(ret); 20 | 21 | while (true) { 22 | ret = tsk_table_collection_loadf(&tables, stdin, TSK_NO_INIT); 23 | if (ret == TSK_ERR_EOF) { 24 | break; 25 | } 26 | check_tsk_error(ret); 27 | fprintf(stderr, "Tree sequence %d had %lld mutations\n", j, 28 | (long long) tables.mutations.num_rows); 29 | ret = tsk_mutation_table_truncate(&tables.mutations, 0); 30 | check_tsk_error(ret); 31 | ret = tsk_table_collection_dumpf(&tables, stdout, 0); 32 | check_tsk_error(ret); 33 | j++; 34 | } 35 | tsk_table_collection_free(&tables); 36 | return EXIT_SUCCESS; 37 | } 38 | -------------------------------------------------------------------------------- /c/examples/take_ownership.c: -------------------------------------------------------------------------------- 1 | #include <err.h> 2 | #include <stdio.h> 3 | #include <stdlib.h> 4 | #include <tskit.h> 5 | 6 | #define check_tsk_error(val) \ 7 | if (val < 0) { \ 8 | errx(EXIT_FAILURE, "line %d: %s", __LINE__, tsk_strerror(val)); \ 9 | } 10 | 11 | int 12 | main(int argc, char **argv) 13 |
{ 14 | tsk_table_collection_t *tables; 15 | tsk_treeseq_t treeseq; 16 | int rv; 17 | 18 | tables = malloc(sizeof(*tables)); 19 | rv = tsk_table_collection_init(tables, 0); 20 | check_tsk_error(rv); 21 | 22 | /* NOTE: you must set sequence length AFTER initialization */ 23 | tables->sequence_length = 1.0; 24 | 25 | /* Do your regular table operations */ 26 | rv = tsk_node_table_add_row(&tables->nodes, 0, 0.0, -1, -1, NULL, 0); 27 | check_tsk_error(rv); 28 | 29 | /* Initialize the tree sequence, transferring all responsibility 30 | * for the table collection's memory management 31 | */ 32 | rv = tsk_treeseq_init( 33 | &treeseq, tables, TSK_TS_INIT_BUILD_INDEXES | TSK_TAKE_OWNERSHIP); 34 | check_tsk_error(rv); 35 | 36 | /* WARNING: calling tsk_table_collection_free is now a memory error! */ 37 | tsk_treeseq_free(&treeseq); 38 | } 39 | -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- 1 | repos: 2 | - repo: https://github.com/pre-commit/pre-commit-hooks 3 | rev: v6.0.0 4 | hooks: 5 | - id: check-merge-conflict 6 | - id: debug-statements 7 | - id: mixed-line-ending 8 | - id: check-case-conflict 9 | - id: check-yaml 10 | - repo: https://github.com/benjeffery/pre-commit-clang-format 11 | rev: '1.0' 12 | hooks: 13 | - id: clang-format 14 | exclude: dev-tools|examples 15 | verbose: true 16 | - repo: https://github.com/asottile/reorder_python_imports 17 | rev: v3.15.0 18 | hooks: 19 | - id: reorder-python-imports 20 | args: [--application-directories=python, 21 | --unclassifiable-application-module=_tskit] 22 | - repo: https://github.com/asottile/pyupgrade 23 | rev: v3.20.0 24 | hooks: 25 | - id: pyupgrade 26 | args: [--py3-plus, --py38-plus] 27 | - repo: https://github.com/psf/black 28 | rev: 25.1.0 29 | hooks: 30 | - id: black 31 | language_version: python3 32 | - repo: https://github.com/pycqa/flake8 33 | rev: 7.3.0 34 | hooks: 35 | - id:
flake8 36 | args: [--config=python/.flake8] 37 | additional_dependencies: ["flake8-bugbear==23.9.16", "flake8-builtins==2.1.0"] 38 | - repo: https://github.com/asottile/blacken-docs 39 | rev: 1.20.0 40 | hooks: 41 | - id: blacken-docs 42 | args: [--skip-errors] 43 | additional_dependencies: [black==22.3.0] 44 | language_version: python3 45 | -------------------------------------------------------------------------------- /c/examples/tree_iteration.c: -------------------------------------------------------------------------------- 1 | #include <err.h> 2 | #include <stdio.h> 3 | #include <stdlib.h> 4 | 5 | #include <tskit.h> 6 | 7 | #define check_tsk_error(val) \ 8 | if (val < 0) { \ 9 | errx(EXIT_FAILURE, "line %d: %s", __LINE__, tsk_strerror(val)); \ 10 | } 11 | 12 | int 13 | main(int argc, char **argv) 14 | { 15 | int ret; 16 | tsk_treeseq_t ts; 17 | tsk_tree_t tree; 18 | 19 | if (argc != 2) { 20 | errx(EXIT_FAILURE, "usage: "); 21 | } 22 | ret = tsk_treeseq_load(&ts, argv[1], 0); 23 | check_tsk_error(ret); 24 | ret = tsk_tree_init(&tree, &ts, 0); 25 | check_tsk_error(ret); 26 | 27 | printf("Iterate forwards\n"); 28 | for (ret = tsk_tree_first(&tree); ret == TSK_TREE_OK; ret = tsk_tree_next(&tree)) { 29 | printf("\ttree %lld has %lld roots\n", 30 | (long long) tree.index, 31 | (long long) tsk_tree_get_num_roots(&tree)); 32 | } 33 | check_tsk_error(ret); 34 | 35 | printf("Iterate backwards\n"); 36 | for (ret = tsk_tree_last(&tree); ret == TSK_TREE_OK; ret = tsk_tree_prev(&tree)) { 37 | printf("\ttree %lld has %lld roots\n", 38 | (long long) tree.index, 39 | (long long) tsk_tree_get_num_roots(&tree)); 40 | } 41 | check_tsk_error(ret); 42 | 43 | tsk_tree_free(&tree); 44 | tsk_treeseq_free(&ts); 45 | return 0; 46 | } 47 | -------------------------------------------------------------------------------- /c/tskit.h: -------------------------------------------------------------------------------- 1 | /* 2 | * MIT License 3 | * 4 | * Copyright (c) 2019-2024 Tskit Developers 5 | * 6 | * Permission is hereby
granted, free of charge, to any person obtaining a copy 7 | * of this software and associated documentation files (the "Software"), to deal 8 | * in the Software without restriction, including without limitation the rights 9 | * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | * copies of the Software, and to permit persons to whom the Software is 11 | * furnished to do so, subject to the following conditions: 12 | * 13 | * The above copyright notice and this permission notice shall be included in all 14 | * copies or substantial portions of the Software. 15 | * 16 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | * SOFTWARE. 23 | */ 24 | 25 | /** 26 | * @file tskit.h 27 | * @brief Tskit API. 
28 | */ 29 | #ifndef __TSKIT_H__ 30 | #define __TSKIT_H__ 31 | 32 | #include 33 | #include 34 | #include 35 | #include 36 | #include 37 | #include 38 | 39 | #endif 40 | -------------------------------------------------------------------------------- /python/tests/test_version.py: -------------------------------------------------------------------------------- 1 | # MIT License 2 | # 3 | # Copyright (c) 2020-2024 Tskit Developers 4 | # 5 | # Permission is hereby granted, free of charge, to any person obtaining a copy 6 | # of this software and associated documentation files (the "Software"), to deal 7 | # in the Software without restriction, including without limitation the rights 8 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | # copies of the Software, and to permit persons to whom the Software is 10 | # furnished to do so, subject to the following conditions: 11 | # 12 | # The above copyright notice and this permission notice shall be included in all 13 | # copies or substantial portions of the Software. 14 | # 15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | # SOFTWARE. 
22 | """ 23 | Test python package versioning 24 | """ 25 | from packaging.version import Version 26 | 27 | from tskit import _version 28 | 29 | 30 | class TestPythonVersion: 31 | """ 32 | Test that the version is PEP440 compliant 33 | """ 34 | 35 | def test_version(self): 36 | assert str(Version(_version.tskit_version)) == _version.tskit_version 37 | -------------------------------------------------------------------------------- /c/tskit/convert.h: -------------------------------------------------------------------------------- 1 | /* 2 | * MIT License 3 | * 4 | * Copyright (c) 2018-2021 Tskit Developers 5 | * Copyright (c) 2015-2017 University of Oxford 6 | * 7 | * Permission is hereby granted, free of charge, to any person obtaining a copy 8 | * of this software and associated documentation files (the "Software"), to deal 9 | * in the Software without restriction, including without limitation the rights 10 | * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 11 | * copies of the Software, and to permit persons to whom the Software is 12 | * furnished to do so, subject to the following conditions: 13 | * 14 | * The above copyright notice and this permission notice shall be included in all 15 | * copies or substantial portions of the Software. 16 | * 17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 18 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 19 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 20 | * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 21 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 22 | * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 23 | * SOFTWARE. 
24 | */ 25 | 26 | #ifndef TSK_CONVERT_H 27 | #define TSK_CONVERT_H 28 | 29 | #ifdef __cplusplus 30 | extern "C" { 31 | #endif 32 | 33 | #include <tskit/trees.h> 34 | 35 | #define TSK_NEWICK_LEGACY_MS_LABELS (1 << 0) 36 | 37 | int tsk_convert_newick(const tsk_tree_t *tree, tsk_id_t root, unsigned int precision, 38 | tsk_flags_t options, size_t buffer_size, char *buffer); 39 | 40 | #ifdef __cplusplus 41 | } 42 | #endif 43 | #endif 44 | -------------------------------------------------------------------------------- /docs/introduction.md: -------------------------------------------------------------------------------- 1 | --- 2 | jupytext: 3 | text_representation: 4 | extension: .md 5 | format_name: myst 6 | format_version: 0.12 7 | jupytext_version: 1.9.1 8 | kernelspec: 9 | display_name: Python 3 10 | language: python 11 | name: python3 12 | --- 13 | 14 | ```{currentmodule} tskit 15 | ``` 16 | 17 | (sec_introduction)= 18 | 19 | # Introduction 20 | 21 | This is the documentation for `tskit`, the tree sequence toolkit. 22 | Succinct tree sequences are an efficient way of representing the 23 | genetic history - often technically referred to as an Ancestral 24 | Recombination Graph or ARG - of a set of DNA sequences. 25 | 26 | The tree sequence format is output by a number of external software libraries 27 | and programs (such as [msprime](https://tskit.dev/msprime/docs), 28 | [SLiM](https://github.com/MesserLab/SLiM), 29 | [fwdpp](https://fwdpp.readthedocs.io/en/), and 30 | [tsinfer](https://tskit.dev/tsinfer/docs/)) that either simulate or 31 | infer the evolutionary ancestry of genetic sequences. This library provides the 32 | underlying functionality that such software uses to load, examine, and 33 | manipulate ARGs in tree sequence format, including efficient access to the 34 | correlated sequence of trees along a genome and general methods to calculate 35 | {ref}`genetic statistics`.
36 | 37 | For a gentle introduction, you might like to read "{ref}`tutorials:sec_what_is`" 38 | on our {ref}`tutorials site`. There you can also find further 39 | tutorial material to introduce you to key `tskit` concepts. 40 | 41 | :::{important} 42 | If you use `tskit` in your work, please remember to cite it appropriately: see the {ref}`citations` page for details. 43 | ::: 44 | 45 | :::{note} 46 | This documentation is under active development and may be incomplete 47 | in some areas. If you would like to help improve it, please open an issue or 48 | pull request on [GitHub](https://github.com/tskit-dev/tskit). 49 | ::: 50 | -------------------------------------------------------------------------------- /python/lwt_interface/setup.py: -------------------------------------------------------------------------------- 1 | import os.path 2 | import platform 3 | 4 | from setuptools import Extension 5 | from setuptools import setup 6 | from setuptools.command.build_ext import build_ext 7 | 8 | 9 | IS_WINDOWS = platform.system() == "Windows" 10 | 11 | 12 | # Obscure magic required to allow numpy to be used as a 'setup_requires'. 13 | # Based on https://stackoverflow.com/questions/19919905 14 | class local_build_ext(build_ext): 15 | def finalize_options(self): 16 | build_ext.finalize_options(self) 17 | import builtins 18 | 19 | # Prevent numpy from thinking it is still in its setup process: 20 | builtins.__NUMPY_SETUP__ = False 21 | import numpy 22 | 23 | self.include_dirs.append(numpy.get_include()) 24 | 25 | 26 | libdir = "../lib" 27 | kastore_dir = os.path.join(libdir, "subprojects", "kastore") 28 | # TODO pathlib glob this.
29 | tsk_source_files = [ 30 | "core.c", 31 | "tables.c", 32 | "trees.c", 33 | "genotypes.c", 34 | "stats.c", 35 | "convert.c", 36 | "haplotype_matching.c", 37 | ] 38 | sources = ( 39 | ["example_c_module.c"] 40 | + [os.path.join(libdir, "tskit", f) for f in tsk_source_files] 41 | + [os.path.join(kastore_dir, "kastore.c")] 42 | ) 43 | 44 | defines = [] 45 | libraries = [] 46 | if IS_WINDOWS: 47 | # Needed for generating UUIDs in tskit 48 | libraries.append("Advapi32") 49 | defines.append(("WIN32", None)) 50 | 51 | extension_module = Extension( 52 | "example_c_module", 53 | sources=sources, 54 | extra_compile_args=["-std=c99"], 55 | libraries=libraries, 56 | define_macros=defines, 57 | include_dirs=[libdir, kastore_dir], 58 | ) 59 | 60 | numpy_ver = "numpy>=1.7" 61 | 62 | setup( 63 | name="example_c_module", 64 | description="Example usage of the LightweightTableCollection tskit interface", 65 | ext_modules=[extension_module], 66 | setup_requires=[numpy_ver], 67 | cmdclass={"build_ext": local_build_ext}, 68 | license="MIT", 69 | platforms=["POSIX", "Windows", "MacOS X"], 70 | ) 71 | -------------------------------------------------------------------------------- /c/subprojects/kastore/meson.build: -------------------------------------------------------------------------------- 1 | project('kastore', ['c', 'cpp'], 2 | version: files('VERSION.txt'), 3 | default_options: [ 4 | 'c_std=c99', 5 | 'cpp_std=c++11', 6 | 'warning_level=3', 7 | 'werror=true']) 8 | 9 | if not meson.is_subproject() 10 | add_global_arguments([ 11 | '-W', '-Wmissing-prototypes', '-Wstrict-prototypes', 12 | '-Wconversion', '-Wshadow', '-Wpointer-arith', '-Wcast-align', 13 | '-Wcast-qual', '-Wwrite-strings', '-Wnested-externs', 14 | '-fshort-enums', '-fno-common'], language : 'c') 15 | endif 16 | 17 | # Subprojects should compile in the static library for simplicity. 
18 | kastore_inc = include_directories('.') 19 | kastore = static_library('kastore', 'kastore.c') 20 | kastore_dep = declare_dependency(link_with : kastore, include_directories: kastore_inc) 21 | 22 | if not meson.is_subproject() 23 | 24 | # The shared library can be installed into the system. 25 | install_headers('kastore.h') 26 | shared_library('kastore', 'kastore.c', install: true) 27 | executable('example', ['example.c'], link_with: kastore) 28 | 29 | # Note: we don't declare these as meson tests because they depend on 30 | # being run from the current working directory because of the paths 31 | # to example files. 32 | cunit_dep = dependency('cunit') 33 | executable('tests', ['tests.c', 'kastore.c'], dependencies: cunit_dep, 34 | c_args: ['-DMESON_VERSION="@0@"'.format(meson.project_version())]) 35 | 36 | executable('cpp_tests', ['cpp_tests.cpp'], link_with: kastore) 37 | 38 | executable('malloc_tests', ['malloc_tests.c', 'kastore.c'], 39 | dependencies: cunit_dep, 40 | link_args:['-Wl,--wrap=malloc', '-Wl,--wrap=realloc', '-Wl,--wrap=calloc']) 41 | 42 | executable('io_tests', ['io_tests.c', 'kastore.c'], 43 | dependencies: cunit_dep, 44 | link_args:[ 45 | '-Wl,--wrap=fwrite', 46 | '-Wl,--wrap=fread', 47 | '-Wl,--wrap=fclose', 48 | '-Wl,--wrap=ftell', 49 | '-Wl,--wrap=fseek']) 50 | endif 51 | -------------------------------------------------------------------------------- /.github/workflows/release.yml: -------------------------------------------------------------------------------- 1 | name: Release 2 | 3 | on: 4 | push: 5 | branches: [main, test] 6 | tags: ['*'] 7 | 8 | jobs: 9 | build: 10 | runs-on: ubuntu-24.04 11 | steps: 12 | - name: Checkout 13 | uses: actions/checkout@v4.2.2 14 | - name: Set up Python 15 | uses: actions/setup-python@v5.4.0 16 | with: 17 | python-version: '3.12' 18 | - name: Install dependencies and set up venv 19 | run: | 20 | sudo apt-get update 21 | sudo apt-get install -y ninja-build libcunit1-dev 22 | python -m venv venv 23 | 
source venv/bin/activate 24 | pip install meson 25 | - name: Build tarball and changelogs 26 | run: | 27 | source venv/bin/activate 28 | git rm -rf c/tests/meson-subproject 29 | git config --global user.email "CI@CI.com" 30 | git config --global user.name "Mr Robot" 31 | git add -A 32 | git commit -m "dummy commit to make meson not add in the symlinked directory" 33 | meson c build-gcc 34 | meson dist -C build-gcc 35 | python docs/convert_changelog.py c/CHANGELOG.rst > C-CHANGELOG.txt 36 | python docs/convert_changelog.py python/CHANGELOG.rst > PYTHON-CHANGELOG.txt 37 | - name: Get the version 38 | id: get_version 39 | run: 40 | echo ::set-output name=VERSION::$(echo $GITHUB_REF | cut -d / -f 3) 41 | - name: C Release 42 | uses: softprops/action-gh-release@v2.2.1 43 | if: startsWith(github.ref, 'refs/tags/') && contains(github.event.ref, 'C_') 44 | with: 45 | name: C API ${{ steps.get_version.outputs.VERSION }} 46 | body_path: C-CHANGELOG.txt 47 | draft: True 48 | fail_on_unmatched_files: True 49 | files: build-gcc/meson-dist/* 50 | - name: Python Release 51 | uses: softprops/action-gh-release@v2.2.1 52 | if: startsWith(github.ref, 'refs/tags/') && !contains(github.event.ref, 'C_') 53 | with: 54 | name: Python ${{ steps.get_version.outputs.VERSION }} 55 | body_path: PYTHON-CHANGELOG.txt 56 | draft: True 57 | fail_on_unmatched_files: True -------------------------------------------------------------------------------- /python/tests/test_dict_encoding.py: -------------------------------------------------------------------------------- 1 | # MIT License 2 | # 3 | # Copyright (c) 2018-2020 Tskit Developers 4 | # 5 | # Permission is hereby granted, free of charge, to any person obtaining a copy 6 | # of this software and associated documentation files (the "Software"), to deal 7 | # in the Software without restriction, including without limitation the rights 8 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | # copies of the Software, and 
to permit persons to whom the Software is 10 | # furnished to do so, subject to the following conditions: 11 | # 12 | # The above copyright notice and this permission notice shall be included in all 13 | # copies or substantial portions of the Software. 14 | # 15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | # SOFTWARE. 22 | """ 23 | Test cases for the low-level dictionary encoding used to move 24 | data around in C. 25 | """ 26 | import pathlib 27 | import pickle 28 | 29 | import _tskit 30 | import lwt_interface.dict_encoding_testlib 31 | import tskit 32 | 33 | 34 | lwt_interface.dict_encoding_testlib.lwt_module = _tskit 35 | # Bring the tests defined in dict_encoding_testlib into the current namespace 36 | # so pytest will find and execute them. 
37 | from lwt_interface.dict_encoding_testlib import * # noqa 38 | 39 | 40 | def test_pickled_examples(): 41 | seen_msprime = False 42 | test_dir = pathlib.Path(__file__).parent / "data/dict-encodings" 43 | for filename in test_dir.glob("*.pkl"): 44 | if "msprime" in str(filename): 45 | seen_msprime = True 46 | with open(test_dir / filename, "rb") as f: 47 | d = pickle.load(f) 48 | lwt = _tskit.LightweightTableCollection() 49 | lwt.fromdict(d) 50 | tskit.TableCollection.fromdict(d) 51 | # Check we've done something 52 | assert seen_msprime 53 | -------------------------------------------------------------------------------- /c/tskit/stats.h: -------------------------------------------------------------------------------- 1 | /* 2 | * MIT License 3 | * 4 | * Copyright (c) 2019-2021 Tskit Developers 5 | * Copyright (c) 2016-2017 University of Oxford 6 | * 7 | * Permission is hereby granted, free of charge, to any person obtaining a copy 8 | * of this software and associated documentation files (the "Software"), to deal 9 | * in the Software without restriction, including without limitation the rights 10 | * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 11 | * copies of the Software, and to permit persons to whom the Software is 12 | * furnished to do so, subject to the following conditions: 13 | * 14 | * The above copyright notice and this permission notice shall be included in all 15 | * copies or substantial portions of the Software. 16 | * 17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 18 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 19 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 20 | * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 21 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 22 | * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 23 | * SOFTWARE. 24 | */ 25 | 26 | #ifndef TSK_STATS_H 27 | #define TSK_STATS_H 28 | 29 | #ifdef __cplusplus 30 | extern "C" { 31 | #endif 32 | 33 | #include <tskit/trees.h> 34 | 35 | typedef struct { 36 | const tsk_treeseq_t *tree_sequence; 37 | tsk_site_t focal_site; 38 | tsk_size_t total_samples; 39 | tsk_size_t focal_samples; 40 | double max_distance; 41 | tsk_size_t max_sites; 42 | tsk_tree_t tree; 43 | tsk_id_t *sample_buffer; 44 | double *result; 45 | tsk_size_t result_length; 46 | } tsk_ld_calc_t; 47 | 48 | int tsk_ld_calc_init(tsk_ld_calc_t *self, const tsk_treeseq_t *tree_sequence); 49 | int tsk_ld_calc_free(tsk_ld_calc_t *self); 50 | void tsk_ld_calc_print_state(const tsk_ld_calc_t *self, FILE *out); 51 | int tsk_ld_calc_get_r2(tsk_ld_calc_t *self, tsk_id_t a, tsk_id_t b, double *r2); 52 | int tsk_ld_calc_get_r2_array(tsk_ld_calc_t *self, tsk_id_t a, int direction, 53 | tsk_size_t max_sites, double max_distance, double *r2, tsk_size_t *num_r2_values); 54 | 55 | #ifdef __cplusplus 56 | } 57 | #endif 58 | #endif 59 | -------------------------------------------------------------------------------- /python/lwt_interface/test_example_c_module.py: -------------------------------------------------------------------------------- 1 | # flake8: noqa 2 | import os 3 | import sys 4 | 5 | import pytest 6 | 7 | # Make sure we use the local tskit version. 8 | 9 | sys.path.insert(0, os.path.abspath("../")) 10 | 11 | # An example of how to run the tests defined in the dict_encoding_testlib.py 12 | # file for a given compiled version of the code. 
13 | import dict_encoding_testlib 14 | import example_c_module 15 | import msprime 16 | import tskit 17 | 18 | # The test cases defined in dict_encoding_testlib all use the form 19 | # lwt_module.LightweightTableCollection() to create an instance 20 | # of LightweightTableCollection. So, by setting this variable in 21 | # the module here, we can control which definition of the 22 | # LightweightTableCollection gets used. 23 | dict_encoding_testlib.lwt_module = example_c_module 24 | 25 | from dict_encoding_testlib import * 26 | 27 | 28 | def test_example_receiving(): 29 | # The example_receiving function returns true if the first tree 30 | # has more than one root 31 | lwt = example_c_module.LightweightTableCollection() 32 | tables = tskit.TableCollection(1) 33 | lwt.fromdict(tables.asdict()) 34 | # Our example function raises an error for an empty (unindexed) table collection 35 | with pytest.raises(ValueError, match="Table collection must be indexed"): 36 | example_c_module.example_receiving(lwt) 37 | 38 | # This tree sequence has one root so we get false 39 | tables = msprime.simulate(10).dump_tables() 40 | lwt.fromdict(tables.asdict()) 41 | assert not example_c_module.example_receiving(lwt) 42 | 43 | # Add a root and we get true 44 | tables.nodes.add_row(flags=tskit.NODE_IS_SAMPLE) 45 | lwt.fromdict(tables.asdict()) 46 | assert example_c_module.example_receiving(lwt) 47 | 48 | 49 | def test_example_modifying(): 50 | lwt = example_c_module.LightweightTableCollection() 51 | # The example_modifying function clears out the tables and adds two node rows 52 | tables = msprime.simulate(10, random_seed=42).tables 53 | assert tables.edges.num_rows == 18 54 | assert tables.nodes.num_rows == 19 55 | lwt.fromdict(tables.asdict()) 56 | example_c_module.example_modifying(lwt) 57 | modified_tables = tskit.TableCollection.fromdict(lwt.asdict()) 58 | assert modified_tables.edges.num_rows == 0 59 | assert modified_tables.nodes.num_rows == 2 60 | 
-------------------------------------------------------------------------------- /python/lwt_interface/cython_example/_lwtc.c: -------------------------------------------------------------------------------- 1 | /* 2 | * MIT License 3 | * 4 | * Copyright (c) 2019-2020 Tskit Developers 5 | * Copyright (c) 2015-2018 University of Oxford 6 | * 7 | * Permission is hereby granted, free of charge, to any person obtaining a copy 8 | * of this software and associated documentation files (the "Software"), to deal 9 | * in the Software without restriction, including without limitation the rights 10 | * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 11 | * copies of the Software, and to permit persons to whom the Software is 12 | * furnished to do so, subject to the following conditions: 13 | * 14 | * The above copyright notice and this permission notice shall be included in all 15 | * copies or substantial portions of the Software. 16 | * 17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 18 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 19 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 20 | * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 21 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 22 | * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 23 | * SOFTWARE. 24 | */ 25 | // Turn off clang-formatting for this file as turning off formatting 26 | // for specific bits will make it more confusing. 
27 | // clang-format off 28 | 29 | #define PY_SSIZE_T_CLEAN 30 | #define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION 31 | 32 | #include <Python.h> 33 | #include <structmember.h> 34 | #include <numpy/arrayobject.h> 35 | 36 | #include "kastore.h" 37 | #include "tskit.h" 38 | 39 | #include "tskit_lwt_interface.h" 40 | 41 | static PyMethodDef lwt_methods[] = { 42 | { NULL, NULL, 0, NULL } /* sentinel */ 43 | }; 44 | 45 | static struct PyModuleDef lwt_module = { 46 | .m_base = PyModuleDef_HEAD_INIT, 47 | .m_name = "_lwt", 48 | .m_doc = "tskit LightweightTableCollection", 49 | .m_size = -1, 50 | .m_methods = lwt_methods }; 51 | 52 | PyMODINIT_FUNC 53 | PyInit__lwtc(void) 54 | { 55 | PyObject *module = PyModule_Create(&lwt_module); 56 | if (module == NULL) { 57 | return NULL; 58 | } 59 | import_array(); 60 | if (register_lwt_class(module) != 0) { 61 | return NULL; 62 | } 63 | return module; 64 | } 65 | -------------------------------------------------------------------------------- /python/tskit/provenance.schema.json: -------------------------------------------------------------------------------- 1 | { 2 | "schema": "http://json-schema.org/draft-07/schema#", 3 | "version": "1.1.0", 4 | "title": "tskit provenance", 5 | "description": "The combination of software, parameters and environment that produced a tree sequence", 6 | "type": "object", 7 | "required": ["schema_version", "software", "parameters", "environment"], 8 | "properties": { 9 | "schema_version": { 10 | "description": "The version of this schema used.", 11 | "type": "string", 12 | "minLength": 1 13 | }, 14 | "software": { 15 | "description": "The primary software used to produce the tree sequence.", 16 | "type": "object", 17 | "required": ["name", "version"], 18 | "properties": { 19 | "name": { 20 | "description": "The name of the primary software.", 21 | "type": "string", 22 | "minLength": 1 23 | }, 24 | "version": { 25 | "description": "The version of the primary software.", 26 | "type": "string", 27 | "minLength": 1 28 | } 29 | } 30 | }, 31 | "parameters": { 
32 | "description": "The parameters used to produce the tree sequence.", 33 | "type": "object" 34 | }, 35 | "environment": { 36 | "description": "The computational environment within which the primary software ran.", 37 | "type": "object", 38 | "properties": { 39 | "os": { 40 | "description": "Operating system.", 41 | "type": "object" 42 | }, 43 | "libraries": { 44 | "description": "Details of libraries the primary software linked against.", 45 | "type": "object" 46 | } 47 | } 48 | }, 49 | "resources": { 50 | "description": "Resources used by this operation.", 51 | "type": "object", 52 | "properties": { 53 | "elapsed_time": { 54 | "description": "Wall clock time used in seconds.", 55 | "type": "number" 56 | }, 57 | "user_time": { 58 | "description": "User time used in seconds.", 59 | "type": "number" 60 | }, 61 | "sys_time": { 62 | "description": "System time used in seconds.", 63 | "type": "number" 64 | }, 65 | "max_memory": { 66 | "description": "Maximum memory used in bytes.", 67 | "type": "number" 68 | } 69 | } 70 | } 71 | } 72 | } 73 | -------------------------------------------------------------------------------- /python/lwt_interface/cython_example/example.pyx: -------------------------------------------------------------------------------- 1 | from libc.stdint cimport uint32_t 2 | import _lwtc 3 | import tskit 4 | 5 | cdef extern from "tskit.h" nogil: 6 | ctypedef uint32_t tsk_flags_t 7 | ctypedef struct tsk_table_collection_t: 8 | pass 9 | ctypedef struct tsk_treeseq_t: 10 | pass 11 | int tsk_treeseq_init(tsk_treeseq_t *self, const tsk_table_collection_t *tables, tsk_flags_t options) 12 | int tsk_treeseq_free(tsk_treeseq_t *self) 13 | int tsk_table_collection_build_index(tsk_table_collection_t *self, tsk_flags_t options) 14 | ctypedef struct tsk_tree_t: 15 | pass 16 | int tsk_tree_init(tsk_tree_t *self, const tsk_treeseq_t *ts, tsk_flags_t options) 17 | int tsk_tree_first(tsk_tree_t *self) 18 | int tsk_tree_next(tsk_tree_t *self) 19 | int
tsk_tree_last(tsk_tree_t *self) 20 | int tsk_tree_prev(tsk_tree_t *self) 21 | int tsk_tree_get_num_roots(tsk_tree_t *self) 22 | int tsk_tree_free(tsk_tree_t *self) 23 | const char *tsk_strerror(int err) 24 | 25 | cdef extern: 26 | ctypedef class _lwtc.LightweightTableCollection [object LightweightTableCollection]: 27 | cdef tsk_table_collection_t *tables 28 | 29 | def check_tsk_error(val): 30 | if val < 0: 31 | raise RuntimeError(tsk_strerror(val)) 32 | 33 | def iterate_trees(pyts: tskit.TreeSequence): 34 | lwtc = LightweightTableCollection() 35 | lwtc.fromdict(pyts.dump_tables().asdict()) 36 | cdef tsk_treeseq_t ts 37 | err = tsk_treeseq_init(&ts, lwtc.tables, 0) 38 | check_tsk_error(err) 39 | cdef tsk_tree_t tree 40 | ret = tsk_tree_init(&tree, &ts, 0) 41 | check_tsk_error(ret) 42 | 43 | print("Iterate forwards") 44 | cdef int tree_iter = tsk_tree_first(&tree) 45 | while tree_iter == 1: 46 | print("\ttree has %d roots" % (tsk_tree_get_num_roots(&tree))) 47 | tree_iter = tsk_tree_next(&tree) 48 | check_tsk_error(tree_iter) 49 | 50 | print("Iterate backwards") 51 | tree_iter = tsk_tree_last(&tree) 52 | while tree_iter == 1: 53 | print("\ttree has %d roots" % (tsk_tree_get_num_roots(&tree))) 54 | tree_iter = tsk_tree_prev(&tree) 55 | check_tsk_error(tree_iter) 56 | 57 | tsk_tree_free(&tree) 58 | tsk_treeseq_free(&ts) 59 | 60 | def main(): 61 | import msprime as msp # (msprime could be compiled against a different version of tskit) 62 | ts = msp.simulate(sample_size=5, length=100, recombination_rate=.01) 63 | iterate_trees(ts) 64 | -------------------------------------------------------------------------------- /python/lwt_interface/README.md: -------------------------------------------------------------------------------- 1 | # LightweightTableCollection interface 2 | 3 | The files in this directory define the LightweightTableCollection 4 | interface used to safely interchange table collection data between 5 | different compiled instances of the tskit C library. 
This is a 6 | *very* specialised use-case, and unless you are using the tskit 7 | C API in your own compiled Python module (either via Cython 8 | or the Python C API), you almost certainly don't need to use 9 | this code. 10 | 11 | ## Overview 12 | 13 | To allow a tskit table collection to be transferred from one compiled Python 14 | extension module to another, the table collection is converted to a `dict` of 15 | basic Python types and NumPy arrays. This is then converted back in the receiving 16 | module. `tskit_lwt_interface.h` provides a function `register_lwt_class` that 17 | defines a Python class `LightweightTableCollection` that performs these conversions 18 | with methods `asdict` and `fromdict`. These methods mirror the `asdict` and `fromdict` 19 | methods on `tskit.TableCollection`. 20 | 21 | ## Usage 22 | An example C module skeleton `example_c_module.c` is provided, which shows passing tables 23 | to the C module. See `test_example_c_module.py` for example Python usage 24 | of the example module. 25 | 26 | To add the 27 | `LightweightTableCollection` type to your module you include `tskit_lwt_interface.h` 28 | and then call `register_lwt_class` on your C Python module object. You can then convert 29 | to and from the lightweight table collection in Python, for example to convert a tskit 30 | `TableCollection` to a `LightweightTableCollection`: 31 | ```python 32 | tables = tskit.TableCollection(1) 33 | lwt = example_c_module.LightweightTableCollection() 34 | lwt.fromdict(tables.asdict()) 35 | ``` 36 | and vice-versa: 37 | ```python 38 | tc = tskit.TableCollection.fromdict(lwt.asdict()) 39 | ``` 40 | In C you can access the tables in a `LightweightTableCollection` instance that is passed 41 | to your function, as shown in the `example_receiving` function in `example_c_module.c`. 42 | Note the requirement to check for errors from tskit functions and to call 43 | `handle_tskit_error` to set a Python error, returning `NULL` to Python to indicate an error.
44 | 45 | Tables can also be modified in the extension code as in `example_modifying`. We recommend 46 | creating table collections in Python and then passing them to C for modification rather than 47 | creating them in C and returning them. This avoids complex management of object lifecycles 48 | in C code. 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | -------------------------------------------------------------------------------- /docs/citation.md: -------------------------------------------------------------------------------- 1 | (sec_citation)= 2 | 3 | # Citing tskit 4 | 5 | If you use `tskit` in your work, we recommend citing the [2024 ARG Genetics paper](https://doi.org/10.1093/genetics/iyae100) and the [2016 msprime PLOS Computational Biology paper](https://doi.org/10.1371/journal.pcbi.1004842): 6 | > Yan Wong, Anastasia Ignatieva, Jere Koskela, Gregor Gorjanc, Anthony W 7 | > Wohns, Jerome Kelleher, *A general and efficient representation of ancestral 8 | > recombination graphs*, Genetics, Volume 228, Issue 1, September 2024, iyae100, 9 | > https://doi.org/10.1093/genetics/iyae100 10 | 11 | > Jerome Kelleher, Alison M Etheridge and Gilean McVean (2016), 12 | > *Efficient Coalescent Simulation and Genealogical Analysis for Large Sample Sizes*, 13 | > PLOS Comput Biol 12(5): e1004842.
doi: 10.1371/journal.pcbi.1004842 14 | 15 | If you use summary statistics, please cite the 16 | [2020 Genetics paper](https://doi.org/10.1534/genetics.120.303253): 17 | 18 | > Peter Ralph, Kevin Thornton, Jerome Kelleher, *Efficiently Summarizing 19 | > Relationships in Large Samples: A General Duality Between Statistics of 20 | > Genealogies and Genomes*, Genetics, Volume 215, Issue 3, 1 July 2020, 21 | > Pages 779–797, https://doi.org/10.1534/genetics.120.303253 22 | 23 | 24 | Bibtex records: 25 | 26 | ```bibtex 27 | @article{Wong2024ARGs, 28 | author = {Wong, Yan and Ignatieva, Anastasia and Koskela, Jere and Gorjanc, Gregor and 29 | Wohns, Anthony W and Kelleher, Jerome}, 30 | title = {A general and efficient representation of ancestral recombination graphs}, 31 | journal = {Genetics}, 32 | volume = {228}, 33 | number = {1}, 34 | pages = {iyae100}, 35 | year = {2024}, 36 | doi = {10.1093/genetics/iyae100} 37 | } 38 | 39 | @article{Kelleher2016msprime, 40 | author = {Kelleher, Jerome and Etheridge, Alison M and McVean, Gilean}, 41 | title = {Efficient coalescent simulation and genealogical analysis for large sample sizes}, 42 | journal = {PLoS Computational Biology}, 43 | volume = {12}, 44 | number = {5}, 45 | pages = {e1004842}, 46 | year = {2016}, 47 | publisher = {Public Library of Science} 48 | } 49 | 50 | @article{Ralph2020Stats, 51 | author = {Ralph, Peter and Thornton, Kevin and Kelleher, Jerome}, 52 | title = {Efficiently Summarizing Relationships in Large Samples: A General Duality Between Statistics of Genealogies and Genomes}, 53 | journal = {Genetics}, 54 | volume = {215}, 55 | number = {3}, 56 | pages = {779--797}, 57 | year = {2020}, 58 | doi = {10.1534/genetics.120.303253} 59 | } 60 | ``` -------------------------------------------------------------------------------- /python/tskit/exceptions.py: -------------------------------------------------------------------------------- 1 | # MIT License 2 | # 3 | # Copyright (c) 2018-2021 Tskit Developers 4 
| # Copyright (c) 2017 University of Oxford 5 | # 6 | # Permission is hereby granted, free of charge, to any person obtaining a copy 7 | # of this software and associated documentation files (the "Software"), to deal 8 | # in the Software without restriction, including without limitation the rights 9 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | # copies of the Software, and to permit persons to whom the Software is 11 | # furnished to do so, subject to the following conditions: 12 | # 13 | # The above copyright notice and this permission notice shall be included in all 14 | # copies or substantial portions of the Software. 15 | # 16 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | # SOFTWARE. 23 | """ 24 | Exceptions defined in tskit. 25 | """ 26 | from _tskit import FileFormatError # noqa: F401 27 | from _tskit import IdentityPairsNotStoredError # noqa: F401 28 | from _tskit import IdentitySegmentsNotStoredError # noqa: F401 29 | from _tskit import LibraryError # noqa: F401 30 | from _tskit import TskitException # noqa: F401 31 | from _tskit import VersionTooNewError # noqa: F401 32 | from _tskit import VersionTooOldError # noqa: F401 33 | 34 | 35 | class DuplicatePositionsError(TskitException): 36 | """ 37 | Duplicate positions in the list of sites. 38 | """ 39 | 40 | 41 | class ProvenanceValidationError(TskitException): 42 | """ 43 | A JSON document did not validate against the provenance schema. 
44 | """ 45 | 46 | 47 | class MetadataValidationError(TskitException): 48 | """ 49 | A metadata object did not validate against its metadata schema. 50 | """ 51 | 52 | 53 | class MetadataSchemaValidationError(TskitException): 54 | """ 55 | A metadata schema object did not validate against the metaschema. 56 | """ 57 | 58 | 59 | class MetadataEncodingError(TskitException): 60 | """ 61 | A metadata object was of a type that could not be encoded. 62 | """ 63 | 64 | 65 | class ImmutableTableError(ValueError): 66 | """ 67 | Raised when attempting to modify an immutable table view. 68 | 69 | Use TreeSequence.dump_tables() to get a mutable copy. 70 | """ 71 | -------------------------------------------------------------------------------- /c/tests/meson-subproject/example.c: -------------------------------------------------------------------------------- 1 | /* 2 | * MIT License 3 | * 4 | * Copyright (c) 2019-2022 Tskit Developers 5 | * 6 | * Permission is hereby granted, free of charge, to any person obtaining a copy 7 | * of this software and associated documentation files (the "Software"), to deal 8 | * in the Software without restriction, including without limitation the rights 9 | * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | * copies of the Software, and to permit persons to whom the Software is 11 | * furnished to do so, subject to the following conditions: 12 | * 13 | * The above copyright notice and this permission notice shall be included in all 14 | * copies or substantial portions of the Software. 15 | * 16 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 19 | * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | * SOFTWARE. 23 | */ 24 | 25 | /* Simple example testing that we compile and link in tskit and kastore 26 | * when we use meson subprojects. 27 | */ 28 | #include <stdio.h> 29 | #include <tskit.h> 30 | #include <assert.h> 31 | #include <string.h> 32 | 33 | void 34 | test_kas_strerror() 35 | { 36 | printf("test_kas_strerror\n"); 37 | const char *str = kas_strerror(KAS_ERR_NO_MEMORY); 38 | assert(strcmp(str, "Out of memory") == 0); 39 | } 40 | 41 | void 42 | test_strerror() 43 | { 44 | printf("test_strerror\n"); 45 | const char *str = tsk_strerror(TSK_ERR_NO_MEMORY); 46 | assert(strcmp(str, "Out of memory. (TSK_ERR_NO_MEMORY)") == 0); 47 | } 48 | 49 | void 50 | test_load_error() 51 | { 52 | printf("test_load_error\n"); 53 | tsk_treeseq_t ts; 54 | int ret = tsk_treeseq_load(&ts, "no such file", 0); 55 | assert(ret == TSK_ERR_IO); 56 | tsk_treeseq_free(&ts); 57 | } 58 | 59 | void 60 | test_table_basics() 61 | { 62 | printf("test_table_basics\n"); 63 | tsk_table_collection_t tables; 64 | int ret = tsk_table_collection_init(&tables, 0); 65 | assert(ret == 0); 66 | 67 | ret = tsk_node_table_add_row(&tables.nodes, 0, 1.0, TSK_NULL, TSK_NULL, NULL, 0); 68 | assert(ret == 0); 69 | ret = tsk_node_table_add_row(&tables.nodes, 0, 2.0, TSK_NULL, TSK_NULL, NULL, 0); 70 | assert(ret == 1); 71 | assert(tables.nodes.num_rows == 2); 72 | 73 | tsk_table_collection_free(&tables); 74 | } 75 | 76 | int 77 | main() 78 | { 79 | test_kas_strerror(); 80 | test_strerror(); 81 | test_load_error(); 82 | test_table_basics(); 83 | return 0; 84 | } 85 | -------------------------------------------------------------------------------- /docs/installation.md: -------------------------------------------------------------------------------- 1 | --- 2 | jupytext: 3 | 
text_representation: 4 | extension: .md 5 | format_name: myst 6 | format_version: 0.12 7 | jupytext_version: 1.9.1 8 | kernelspec: 9 | display_name: Python 3 10 | language: python 11 | name: python3 12 | --- 13 | 14 | ```{currentmodule} tskit 15 | ``` 16 | 17 | (sec_installation)= 18 | 19 | 20 | # Installation 21 | 22 | There are two basic options for installing `tskit`: either through 23 | pre-built binary packages using {ref}`sec_installation_conda` or 24 | by compiling locally using {ref}`sec_installation_pip`. We recommend `conda` 25 | for most users, although `pip` can be more convenient in certain cases. 26 | Tskit is often installed to provide succinct tree sequence functionality 27 | to other software (such as [msprime](https://github.com/tskit-dev/msprime)), 28 | so it may already be installed if you use such software. 29 | 30 | (sec_installation_requirements)= 31 | 32 | 33 | ## Requirements 34 | 35 | Tskit requires Python 3.8+. There are no external C library dependencies. Python 36 | dependencies are installed automatically by `pip` or `conda`. 37 | 38 | (sec_installation_conda)= 39 | 40 | 41 | ## Conda 42 | 43 | Pre-built binary packages for `tskit` are available through 44 | [conda](https://conda.io/docs/), and are built using [conda-forge](https://conda-forge.org/). 45 | Packages for recent versions of Python are available for Linux, macOS and Windows. Install 46 | using: 47 | 48 | ```bash 49 | $ conda install -c conda-forge tskit 50 | ``` 51 | 52 | ### Quick Start 53 | 54 | 1. Install `conda` using [miniconda](https://conda.io/miniconda.html). 55 | Make sure you follow the instructions to fully activate your `conda` 56 | installation! 57 | 2. Set up the [conda-forge channel](https://conda-forge.org/) using 58 | `conda config --add channels conda-forge`. 59 | 3. Install tskit: `conda install tskit`. 60 | 4. Try it out: `tskit --version`. 61 | 62 | 63 | There are several different ways to obtain `conda`.
Please see the 64 | [anaconda installation documentation](https://docs.anaconda.com/anaconda/install/) 65 | for full details. 66 | 67 | (sec_installation_pip)= 68 | 69 | 70 | ## Pip 71 | 72 | Installing using `pip` is somewhat more flexible than `conda` and 73 | may result in code that is (slightly) faster on your specific hardware. 74 | `Pip` is the recommended method when using the system-provided Python 75 | installation. Installation is straightforward: 76 | 77 | ```bash 78 | $ python3 -m pip install tskit 79 | ``` 80 | 81 | (sec_installation_development_versions)= 82 | 83 | 84 | ## Development versions 85 | 86 | For general use, we do not recommend installing development versions. 87 | Occasionally pre-release versions are made available, which can be 88 | installed using `python3 -m pip install --pre tskit`. If you really need to install a 89 | bleeding-edge version, see {ref}`sec_development_installing`. 90 | -------------------------------------------------------------------------------- /python/tests/data/svg/tree_subtree.svg: -------------------------------------------------------------------------------- 1 | [SVG markup not recoverable; surviving node labels: 0, 1, 8, 2, 3, 9, 10] 2 | -------------------------------------------------------------------------------- /python/tests/data/svg/tree.svg: -------------------------------------------------------------------------------- 1 | [SVG markup not recoverable; surviving node labels: 0, 1, 4, 2, 3, 5, 8] 2 | -------------------------------------------------------------------------------- /c/examples/cpp_sorting_example.cpp: -------------------------------------------------------------------------------- 1 | 
#include <iostream> 2 | #include <algorithm> 3 | #include <cstddef> 4 | #include <cstdlib> 5 | #include <sstream> 6 | #include <stdexcept> 7 | #include <vector> 8 | #include <tskit.h> 9 | 10 | static void 11 | handle_tskit_return_code(int code) 12 | { 13 | if (code != 0) { 14 | std::ostringstream o; 15 | o << tsk_strerror(code); 16 | throw std::runtime_error(o.str()); 17 | } 18 | } 19 | 20 | struct edge_plus_time { 21 | double time; 22 | tsk_id_t parent, child; 23 | double left, right; 24 | }; 25 | 26 | int 27 | sort_edges(tsk_table_sorter_t *sorter, tsk_size_t start) 28 | { 29 | if (sorter->tables->edges.metadata_length != 0) { 30 | throw std::invalid_argument( 31 | "the sorter does not currently handle edge metadata"); 32 | } 33 | if (start != 0) { 34 | throw std::invalid_argument("the sorter requires start==0"); 35 | } 36 | 37 | std::vector<edge_plus_time> temp; 38 | temp.reserve(static_cast<std::size_t>(sorter->tables->edges.num_rows)); 39 | 40 | auto edges = &sorter->tables->edges; 41 | auto nodes = &sorter->tables->nodes; 42 | 43 | for (tsk_size_t i = 0; i < sorter->tables->edges.num_rows; ++i) { 44 | temp.push_back(edge_plus_time{ nodes->time[edges->parent[i]], edges->parent[i], 45 | edges->child[i], edges->left[i], edges->right[i] }); 46 | } 47 | 48 | std::sort(begin(temp), end(temp), 49 | [](const edge_plus_time &lhs, const edge_plus_time &rhs) { 50 | if (lhs.time == rhs.time) { 51 | if (lhs.parent == rhs.parent) { 52 | if (lhs.child == rhs.child) { 53 | return lhs.left < rhs.left; 54 | } 55 | return lhs.child < rhs.child; 56 | } 57 | return lhs.parent < rhs.parent; 58 | } 59 | return lhs.time < rhs.time; 60 | }); 61 | 62 | for (std::size_t i = 0; i < temp.size(); ++i) { 63 | edges->left[i] = temp[i].left; 64 | edges->right[i] = temp[i].right; 65 | edges->parent[i] = temp[i].parent; 66 | edges->child[i] = temp[i].child; 67 | } 68 | 69 | return 0; 70 | } 71 | 72 | int 73 | main(int argc, char **argv) 74 | { 75 | if (argc != 3) { 76 | std::cerr << "Usage: " << argv[0] << " input.trees output.trees\n"; 77 | std::exit(0); 78 | } 79 | const char *infile = argv[1]; 80 | 
const char *outfile = argv[2]; 81 | 82 | tsk_table_collection_t tables; 83 | auto ret = tsk_table_collection_load(&tables, infile, 0); 84 | handle_tskit_return_code(ret); 85 | 86 | tsk_table_sorter_t sorter; 87 | ret = tsk_table_sorter_init(&sorter, &tables, 0); 88 | handle_tskit_return_code(ret); 89 | sorter.sort_edges = sort_edges; 90 | try { 91 | ret = tsk_table_sorter_run(&sorter, NULL); 92 | } catch (std::exception &e) { 93 | std::cerr << e.what() << '\n'; 94 | std::exit(1); 95 | } 96 | handle_tskit_return_code(ret); 97 | ret = tsk_table_collection_dump(&tables, outfile, 0); 98 | handle_tskit_return_code(ret); 99 | ret = tsk_table_collection_free(&tables); 100 | handle_tskit_return_code(ret); 101 | } 102 | 103 | -------------------------------------------------------------------------------- /python/tskit/__init__.py: -------------------------------------------------------------------------------- 1 | # MIT License 2 | # 3 | # Copyright (c) 2018-2025 Tskit Developers 4 | # 5 | # Permission is hereby granted, free of charge, to any person obtaining a copy 6 | # of this software and associated documentation files (the "Software"), to deal 7 | # in the Software without restriction, including without limitation the rights 8 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | # copies of the Software, and to permit persons to whom the Software is 10 | # furnished to do so, subject to the following conditions: 11 | # 12 | # The above copyright notice and this permission notice shall be included in all 13 | # copies or substantial portions of the Software. 14 | # 15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | # SOFTWARE. 22 | import _tskit 23 | 24 | #: Special reserved value representing a null ID. 25 | NULL = _tskit.NULL 26 | 27 | #: Special value representing missing data in a genotype array 28 | MISSING_DATA = _tskit.MISSING_DATA 29 | 30 | #: Node flag value indicating that it is a sample. 31 | NODE_IS_SAMPLE = _tskit.NODE_IS_SAMPLE 32 | 33 | #: Constant representing the forward direction of travel (i.e., 34 | #: increasing genomic coordinate values). 35 | FORWARD = _tskit.FORWARD 36 | 37 | #: Constant representing the reverse direction of travel (i.e., 38 | #: decreasing genomic coordinate values). 39 | REVERSE = _tskit.REVERSE 40 | 41 | #: The allele mapping where the strings "0" and "1" map to genotype 42 | #: values 0 and 1. 43 | ALLELES_01 = ("0", "1") 44 | 45 | #: The allele mapping where the four nucleotides A, C, G and T map to 46 | #: the genotype integers 0, 1, 2, and 3, respectively. 47 | ALLELES_ACGT = ("A", "C", "G", "T") 48 | 49 | #: Special NAN value used to indicate unknown mutation times. Since this is a 50 | #: NAN value, you cannot use `==` to test for it. Use :func:`is_unknown_time` instead. 51 | UNKNOWN_TIME = _tskit.UNKNOWN_TIME 52 | 53 | #: Default value of ts.time_units 54 | TIME_UNITS_UNKNOWN = _tskit.TIME_UNITS_UNKNOWN 55 | 56 | #: ts.time_units value when dimension is uncalibrated 57 | TIME_UNITS_UNCALIBRATED = _tskit.TIME_UNITS_UNCALIBRATED 58 | 59 | #: Options for printing to strings and HTML, modify with tskit.set_print_options. 
60 | _print_options = {"max_lines": 40} 61 | 62 | TABLE_NAMES = [ 63 | "individuals", 64 | "nodes", 65 | "edges", 66 | "migrations", 67 | "sites", 68 | "mutations", 69 | "populations", 70 | "provenances", 71 | ] 72 | 73 | 74 | from tskit.provenance import __version__ # NOQA 75 | from tskit.provenance import validate_provenance # NOQA 76 | from tskit.trees import * # NOQA 77 | from tskit.genotypes import Variant # NOQA 78 | from tskit.tables import * # NOQA 79 | from tskit.stats import * # NOQA 80 | from tskit.combinatorics import ( # NOQA 81 | all_trees, 82 | all_tree_shapes, 83 | all_tree_labellings, 84 | TopologyCounter, 85 | Rank, 86 | ) 87 | from tskit.drawing import SVGString # NOQA 88 | from tskit.exceptions import * # NOQA 89 | from tskit.util import * # NOQA 90 | from tskit.metadata import * # NOQA 91 | from tskit.text_formats import * # NOQA 92 | from tskit.intervals import RateMap # NOQA 93 | -------------------------------------------------------------------------------- /python/stress_lowlevel.py: -------------------------------------------------------------------------------- 1 | import curses 2 | import os 3 | import random 4 | import resource 5 | import sys 6 | import time 7 | import tracemalloc 8 | from contextlib import redirect_stdout 9 | 10 | import pytest 11 | 12 | """ 13 | Code to stress the low-level API as much as possible to expose 14 | any memory leaks or error handling issues. 
15 | """ 16 | 17 | 18 | def main(stdscr): 19 | if len(sys.argv) > 1: 20 | args = sys.argv[1:] 21 | else: 22 | args = ["-n0", "tests/test_lowlevel.py"] 23 | 24 | class StressPlugin: 25 | def __init__(self): 26 | self.max_rss = 0 27 | self.max_rss_iter = 0 28 | self.min_rss = 1e100 29 | self.iteration = 0 30 | self.last_print = time.time() 31 | self.memory_start = None 32 | 33 | def pytest_sessionstart(self): 34 | if self.memory_start is None: 35 | tracemalloc.start() 36 | self.memory_start = tracemalloc.take_snapshot() 37 | 38 | def pytest_sessionfinish(self): 39 | memory_current = tracemalloc.take_snapshot() 40 | rusage = resource.getrusage(resource.RUSAGE_SELF) 41 | if self.max_rss < rusage.ru_maxrss: 42 | self.max_rss = rusage.ru_maxrss 43 | self.max_rss_iter = self.iteration 44 | if self.min_rss > rusage.ru_maxrss: 45 | self.min_rss = rusage.ru_maxrss 46 | 47 | # We don't want to flood stdout, so we rate-limit to 1 per second. 48 | if time.time() - self.last_print > 1: 49 | stdscr.clear() 50 | rows, cols = stdscr.getmaxyx() 51 | stdscr.addstr( 52 | 0, 53 | 0, 54 | "iter\tRSS\tmin\tmax\tmax@iter"[: cols - 1], 55 | ) 56 | stdscr.addstr( 57 | 1, 58 | 0, 59 | "\t".join( 60 | map( 61 | str, 62 | [ 63 | self.iteration, 64 | rusage.ru_maxrss, 65 | self.min_rss, 66 | self.max_rss, 67 | self.max_rss_iter, 68 | ], 69 | ) 70 | )[: cols - 1], 71 | ) 72 | stats = memory_current.compare_to(self.memory_start, "traceback") 73 | for i, stat in enumerate(stats[: rows - 3], 1): 74 | stdscr.addstr(i + 2, 0, str(stat)[: cols - 1]) 75 | self.last_print = time.time() 76 | stdscr.refresh() 77 | self.iteration += 1 78 | 79 | plugin = StressPlugin() 80 | while True: 81 | # We don't want any random variation in the amount of memory 82 | # used from test-to-test. 
83 | random.seed(1) 84 | with open(os.devnull, "w") as devnull: 85 | with redirect_stdout(devnull): 86 | result = pytest.main(args, plugins=[plugin]) 87 | if result != 0: 88 | exit("TESTS FAILED") 89 | 90 | 91 | if __name__ == "__main__": 92 | stdscr = curses.initscr() 93 | curses.noecho() 94 | curses.cbreak() 95 | 96 | try: 97 | main(stdscr) 98 | finally: 99 | curses.echo() 100 | curses.nocbreak() 101 | curses.endwin() 102 | -------------------------------------------------------------------------------- /python/tests/data/simplify-bugs/02-nodes.txt: -------------------------------------------------------------------------------- 1 | is_sample time 2 | 1 0.000000 3 | 1 0.000000 4 | 1 0.000000 5 | 1 0.000000 6 | 1 0.000000 7 | 1 0.000000 8 | 1 0.000000 9 | 1 0.000000 10 | 1 0.000000 11 | 1 0.000000 12 | 1 0.000000 13 | 1 0.000000 14 | 1 0.000000 15 | 1 0.000000 16 | 1 0.000000 17 | 1 0.000000 18 | 1 0.000000 19 | 1 0.000000 20 | 1 0.000000 21 | 1 0.000000 22 | 1 0.000000 23 | 1 0.000000 24 | 1 0.000000 25 | 1 0.000000 26 | 1 0.000000 27 | 1 0.000000 28 | 1 0.000000 29 | 1 0.000000 30 | 1 0.000000 31 | 1 0.000000 32 | 1 0.000000 33 | 1 0.000000 34 | 1 0.000000 35 | 1 0.000000 36 | 1 0.000000 37 | 1 0.000000 38 | 1 0.000000 39 | 1 0.000000 40 | 1 0.000000 41 | 1 0.000000 42 | 1 0.000000 43 | 1 0.000000 44 | 1 0.000000 45 | 1 0.000000 46 | 1 0.000000 47 | 1 0.000000 48 | 1 0.000000 49 | 1 0.000000 50 | 1 0.000000 51 | 1 0.000000 52 | 1 0.000000 53 | 1 0.000000 54 | 1 0.000000 55 | 1 0.000000 56 | 1 0.000000 57 | 1 0.000000 58 | 1 0.000000 59 | 1 0.000000 60 | 1 0.000000 61 | 1 0.000000 62 | 1 0.000000 63 | 1 0.000000 64 | 1 0.000000 65 | 1 0.000000 66 | 1 0.000000 67 | 1 0.000000 68 | 1 0.000000 69 | 1 0.000000 70 | 1 0.000000 71 | 1 0.000000 72 | 1 0.000000 73 | 1 0.000000 74 | 1 0.000000 75 | 1 0.000000 76 | 1 0.000000 77 | 1 0.000000 78 | 1 0.000000 79 | 1 0.000000 80 | 1 0.000000 81 | 1 0.000000 82 | 1 0.000000 83 | 1 0.000000 84 | 1 0.000000 85 | 1 0.000000 86 | 
1 0.000000 87 | 1 0.000000 88 | 1 0.000000 89 | 1 0.000000 90 | 1 0.000000 91 | 1 0.000000 92 | 1 0.000000 93 | 1 0.000000 94 | 1 0.000000 95 | 1 0.000000 96 | 1 0.000000 97 | 1 0.000000 98 | 1 0.000000 99 | 1 0.000000 100 | 1 0.000000 101 | 1 0.000000 102 | 0 0.000194 103 | 0 0.000317 104 | 0 0.000403 105 | 0 0.000539 106 | 0 0.001031 107 | 0 0.001435 108 | 0 0.001762 109 | 0 0.001774 110 | 0 0.001809 111 | 0 0.002119 112 | 0 0.002788 113 | 0 0.002811 114 | 0 0.003626 115 | 0 0.003640 116 | 0 0.003920 117 | 0 0.003996 118 | 0 0.004180 119 | 0 0.004187 120 | 0 0.004326 121 | 0 0.004453 122 | 0 0.005014 123 | 0 0.005035 124 | 0 0.005512 125 | 0 0.005679 126 | 0 0.005842 127 | 0 0.006024 128 | 0 0.006182 129 | 0 0.006282 130 | 0 0.006540 131 | 0 0.006850 132 | 0 0.006989 133 | 0 0.007400 134 | 0 0.007440 135 | 0 0.007559 136 | 0 0.007880 137 | 0 0.008043 138 | 0 0.008337 139 | 0 0.008406 140 | 0 0.008968 141 | 0 0.009216 142 | 0 0.009236 143 | 0 0.009300 144 | 0 0.010000 145 | 0 0.010592 146 | 0 0.011448 147 | 0 0.011471 148 | 0 0.011991 149 | 0 0.012237 150 | 0 0.012290 151 | 0 0.012429 152 | 0 0.012484 153 | 0 0.013078 154 | 0 0.013189 155 | 0 0.014031 156 | 0 0.014208 157 | 0 0.014449 158 | 0 0.014731 159 | 0 0.015388 160 | 0 0.015556 161 | 0 0.015588 162 | 0 0.015727 163 | 0 0.015773 164 | 0 0.015945 165 | 0 0.016374 166 | 0 0.016542 167 | 0 0.016560 168 | 0 0.016713 169 | 0 0.017029 170 | 0 0.017180 171 | 0 0.017280 172 | 0 0.017546 173 | 0 0.017637 174 | 0 0.017806 175 | 0 0.017943 176 | 0 0.017983 177 | 0 0.018078 178 | 0 0.018319 179 | 0 0.018490 180 | 0 0.018598 181 | 0 0.018688 182 | 0 0.019008 183 | 0 0.019012 184 | 0 0.019112 185 | 0 0.019190 186 | 0 0.019191 187 | 0 0.019477 188 | 0 0.020000 189 | 0 0.020659 190 | 0 0.020952 191 | 0 0.021267 192 | 0 0.021289 193 | 0 0.021641 194 | 0 0.021823 195 | 0 0.022321 196 | 0 0.022553 197 | 0 0.022602 198 | 0 0.023120 199 | 0 0.023233 200 | 0 0.024210 201 | 0 0.024342 202 | 0 0.024893 203 | 0 0.024922 204 | 0 
0.024934 205 | 0 0.025736 206 | 0 0.025806 207 | 0 0.025938 208 | 0 0.026345 209 | 0 0.026486 210 | 0 0.026561 211 | 0 0.026877 212 | 0 0.027657 213 | 0 0.028587 214 | 0 0.029557 215 | 0 0.029563 216 | 0 0.029588 217 | 0 0.029963 218 | 0 0.030000 219 | -------------------------------------------------------------------------------- /python/tests/data/svg/tree_simple_collapsed.svg: -------------------------------------------------------------------------------- 1 | [SVG markup lost in extraction; surviving text: "+2" (A collapsed non-sample node with 2 descendant samples in this tree), node labels 8, 2, 3, 9, 10, "+4" (A collapsed non-sample node with 4 descendant samples in this tree), node labels 13, 14] -------------------------------------------------------------------------------- /python/pyproject.toml: -------------------------------------------------------------------------------- 1 | [build-system] 2 | requires = ["setuptools>=45", "wheel", "numpy>=2.0"] 3 | build-backend = "setuptools.build_meta" 4 | 5 | [project] 6 | name = "tskit" 7 | dynamic = ["version"] 8 | authors = [ 9 | {name = "Tskit Developers", email = "admin@tskit.dev"}, 10 | ] 11 | description = "The tree sequence toolkit." 
12 | readme = "README.rst" 13 | license = {text = "MIT"} 14 | classifiers = [ 15 | "Programming Language :: C", 16 | "Programming Language :: Python", 17 | "Programming Language :: Python :: 3", 18 | "Programming Language :: Python :: 3.10", 19 | "Programming Language :: Python :: 3.11", 20 | "Programming Language :: Python :: 3.12", 21 | "Programming Language :: Python :: 3.13", 22 | "Programming Language :: Python :: 3 :: Only", 23 | "Development Status :: 5 - Production/Stable", 24 | "Environment :: Other Environment", 25 | "Intended Audience :: Science/Research", 26 | "License :: OSI Approved :: MIT License", 27 | "Operating System :: POSIX", 28 | "Operating System :: MacOS :: MacOS X", 29 | "Operating System :: Microsoft :: Windows", 30 | "Topic :: Scientific/Engineering", 31 | "Topic :: Scientific/Engineering :: Bio-Informatics", 32 | ] 33 | keywords = [ 34 | "population genetics", 35 | "tree sequence", 36 | "ancestral recombination graph", 37 | "evolutionary tree", 38 | "statistical genetics", 39 | "phylogenetics", 40 | "tskit", 41 | ] 42 | requires-python = ">=3.10" 43 | dependencies = [ 44 | "jsonschema>=3.0.0", 45 | "numpy>=2", 46 | ] 47 | 48 | [project.urls] 49 | Homepage = "https://tskit.dev/tskit" 50 | Documentation = "https://tskit.dev/tskit/docs/stable" 51 | Changelog = "https://tskit.dev/tskit/docs/stable/changelogs.html" 52 | "Bug Tracker" = "https://github.com/tskit-dev/tskit/issues" 53 | GitHub = "https://github.com/tskit-dev/tskit/" 54 | 55 | [project.scripts] 56 | tskit = "tskit.cli:tskit_main" 57 | 58 | [tool.setuptools] 59 | packages = ["tskit", "tskit.jit"] 60 | 61 | [tool.setuptools.dynamic] 62 | version = {attr = "tskit._version.tskit_version"} 63 | 64 | [project.optional-dependencies] 65 | test = [ 66 | "biopython==1.85", 67 | "coverage==7.7.0", 68 | "dendropy==5.0.1", 69 | "kastore==0.3.3", 70 | "lshmm==0.0.8", 71 | "msgpack==1.1.0", 72 | "msprime==1.3.4", 73 | "networkx==3.2.1", 74 | "numba==0.61.2", 75 | "portion==2.6.0", 76 | 
"pytest==8.3.5", 77 | "pytest-cov==6.0.0", 78 | "pytest-xdist==3.6.1", 79 | "tszip==0.2.5", 80 | "xmlunittest==1.0.1", 81 | "svgwrite==1.4.3", 82 | "newick==1.10.0", 83 | "zarr<3", 84 | ] 85 | 86 | docs = [ 87 | "jupyter-book==1.0.4.post1", 88 | "breathe==4.35.0", 89 | "sphinx-autodoc-typehints==2.3.0", 90 | "sphinx-issues==5.0.0", 91 | "sphinx-argparse==0.5.2", 92 | "msprime==1.3.3", 93 | "numba==0.61.2", 94 | "sphinx-book-theme", 95 | "pandas==2.2.3", 96 | ] 97 | 98 | dev = [ 99 | "biopython>=1.70", 100 | "coverage", 101 | "dendropy", 102 | "flake8", 103 | "kastore", 104 | "lshmm", 105 | "msgpack", 106 | "msprime", 107 | "mypy", 108 | "networkx", 109 | "numba", 110 | "portion", 111 | "pre-commit", 112 | "pytest", 113 | "pytest-cov", 114 | "pytest-xdist", 115 | "setuptools_scm", 116 | "svgwrite", 117 | "tszip", 118 | "xmlunittest", 119 | "newick", 120 | "zarr<3", 121 | "jupyter-book", 122 | "breathe", 123 | "sphinx-autodoc-typehints", 124 | "sphinx-issues", 125 | "sphinx-argparse", 126 | "sphinx-book-theme", 127 | "pandas", 128 | ] 129 | 130 | [tool.pytest.ini_options] 131 | addopts = "-n 4" 132 | testpaths = ["tests"] 133 | -------------------------------------------------------------------------------- /c/examples/tree_traversal.c: -------------------------------------------------------------------------------- 1 | #include <err.h> 2 | #include <stdio.h> 3 | #include <stdlib.h> 4 | 5 | #include <tskit.h> 6 | 7 | #define check_tsk_error(val) \ 8 | if (val < 0) { \ 9 | errx(EXIT_FAILURE, "line %d: %s", __LINE__, tsk_strerror(val)); \ 10 | } 11 | 12 | static void 13 | traverse_standard(const tsk_tree_t *tree) 14 | { 15 | int ret; 16 | tsk_size_t num_nodes, j; 17 | tsk_id_t *nodes = malloc(tsk_tree_get_size_bound(tree) * sizeof(*nodes)); 18 | 19 | if (nodes == NULL) { 20 | errx(EXIT_FAILURE, "Out of memory"); 21 | } 22 | ret = tsk_tree_preorder(tree, nodes, &num_nodes); 23 | check_tsk_error(ret); 24 | for (j = 0; j < num_nodes; j++) { 25 | printf("Visit preorder %lld\n", (long long) nodes[j]); 26 | } 27 
| 28 | ret = tsk_tree_postorder(tree, nodes, &num_nodes); 29 | check_tsk_error(ret); 30 | for (j = 0; j < num_nodes; j++) { 31 | printf("Visit postorder %lld\n", (long long) nodes[j]); 32 | } 33 | 34 | free(nodes); 35 | } 36 | 37 | static void 38 | _traverse(const tsk_tree_t *tree, tsk_id_t u, int depth) 39 | { 40 | tsk_id_t v; 41 | int j; 42 | 43 | for (j = 0; j < depth; j++) { 44 | printf(" "); 45 | } 46 | printf("Visit recursive %lld\n", (long long) u); 47 | for (v = tree->left_child[u]; v != TSK_NULL; v = tree->right_sib[v]) { 48 | _traverse(tree, v, depth + 1); 49 | } 50 | } 51 | 52 | static void 53 | traverse_recursive(const tsk_tree_t *tree) 54 | { 55 | _traverse(tree, tree->virtual_root, -1); 56 | } 57 | 58 | static void 59 | traverse_stack(const tsk_tree_t *tree) 60 | { 61 | int stack_top; 62 | tsk_id_t u, v; 63 | tsk_id_t *stack = malloc(tsk_tree_get_size_bound(tree) * sizeof(*stack)); 64 | 65 | if (stack == NULL) { 66 | errx(EXIT_FAILURE, "Out of memory"); 67 | } 68 | stack_top = 0; 69 | stack[stack_top] = tree->virtual_root; 70 | while (stack_top >= 0) { 71 | u = stack[stack_top]; 72 | stack_top--; 73 | printf("Visit stack %lld\n", (long long) u); 74 | /* Put nodes on the stack right-to-left, so we visit in left-to-right */ 75 | for (v = tree->right_child[u]; v != TSK_NULL; v = tree->left_sib[v]) { 76 | stack_top++; 77 | stack[stack_top] = v; 78 | } 79 | } 80 | free(stack); 81 | } 82 | 83 | static void 84 | traverse_upwards(const tsk_tree_t *tree) 85 | { 86 | const tsk_id_t *samples = tsk_treeseq_get_samples(tree->tree_sequence); 87 | tsk_size_t num_samples = tsk_treeseq_get_num_samples(tree->tree_sequence); 88 | tsk_size_t j; 89 | tsk_id_t u; 90 | 91 | for (j = 0; j < num_samples; j++) { 92 | u = samples[j]; 93 | while (u != TSK_NULL) { 94 | printf("Visit upwards: %lld\n", (long long) u); 95 | u = tree->parent[u]; 96 | } 97 | } 98 | } 99 | 100 | int 101 | main(int argc, char **argv) 102 | { 103 | int ret; 104 | tsk_treeseq_t ts; 105 | tsk_tree_t tree; 
106 | 107 | if (argc != 2) { 108 | errx(EXIT_FAILURE, "usage: "); 109 | } 110 | ret = tsk_treeseq_load(&ts, argv[1], 0); 111 | check_tsk_error(ret); 112 | ret = tsk_tree_init(&tree, &ts, 0); 113 | check_tsk_error(ret); 114 | ret = tsk_tree_first(&tree); 115 | check_tsk_error(ret); 116 | 117 | traverse_standard(&tree); 118 | 119 | traverse_recursive(&tree); 120 | 121 | traverse_stack(&tree); 122 | 123 | traverse_upwards(&tree); 124 | 125 | tsk_tree_free(&tree); 126 | tsk_treeseq_free(&ts); 127 | return 0; 128 | } 129 | -------------------------------------------------------------------------------- /python/tests/data/svg/tree_muts.svg: -------------------------------------------------------------------------------- 1 | [SVG markup lost in extraction; surviving text labels: 0, 1, 4, 2, 3, 2, 5, 0, 1, 9] -------------------------------------------------------------------------------- /docs/_static/tskit_logo_pale.svg: -------------------------------------------------------------------------------- 1 | [SVG markup lost in extraction; surviving text: "influencersAsset 2"] -------------------------------------------------------------------------------- /docs/quickstart.md: -------------------------------------------------------------------------------- 1 | --- 2 | jupytext: 3 | text_representation: 4 | extension: .md 5 | format_name: myst 6 | format_version: 0.12 7 | jupytext_version: 1.9.1 8 | kernelspec: 9 | display_name: Python 3 10 | language: python 11 | name: python3 12 | --- 13 | 14 | :::{currentmodule} tskit 15 | ::: 16 | 17 | ```{code-cell} ipython3 18 | :tags: [remove-cell] 19 | import msprime 20 | 21 | def basic_sim(): 22 | ts = msprime.sim_ancestry( 23 | 3, 24 | 
population_size=1000, 25 | model="dtwf", 26 | sequence_length=1e4, 27 | recombination_rate=1e-7, 28 | random_seed=665) 29 | ts = msprime.sim_mutations(ts, rate=2e-7, random_seed=123) 30 | ts.dump("data/basic_tree_seq.trees") 31 | 32 | def create_notebook_data(): 33 | basic_sim() 34 | 35 | # create_notebook_data() # uncomment to recreate the tree seqs used in this notebook 36 | ``` 37 | 38 | # Quickstart 39 | 40 | Our {ref}`tutorials site` has a more extensive tutorial on 41 | {ref}`sec_tskit_getting_started`. Below we just give a quick flavour of the 42 | {ref}`sec_python_api` (note that 43 | APIs in {ref}`C ` and Rust exist, and it is also possible to 44 | {ref}`interface to the Python library in R `). 45 | 46 | ## Basic properties 47 | 48 | Any tree sequence, such as one generated by {ref}`msprime `, can be 49 | loaded, and a summary table printed. This example uses a small tree sequence, but the 50 | `tskit` library scales effectively to ones encoding millions of genomes and variable 51 | sites. 52 | 53 | ```{code-cell} 54 | import tskit 55 | 56 | ts = tskit.load("data/basic_tree_seq.trees") # Or generate using e.g. msprime.sim_ancestry() 57 | ts # In a Jupyter notebook this displays a summary table. Otherwise use print(ts) 58 | ``` 59 | 60 | ## Individual trees 61 | 62 | You can get e.g. the first tree in the tree sequence and analyse it. 63 | 64 | ```{code-cell} 65 | first_tree = ts.first() 66 | print("Total branch length in first tree is", first_tree.total_branch_length, ts.time_units) 67 | print("The first of", ts.num_trees, "trees is plotted below") 68 | first_tree.draw_svg(y_axis=True) # plot the tree: only useful for small trees 69 | ``` 70 | 71 | ## Extracting genetic data 72 | 73 | A tree sequence provides an extremely compact way to 74 | {ref}`store genetic variation data `. 
The trees allow 75 | this data to be {meth}`decoded ` at each site: 76 | 77 | ```{code-cell} 78 | for variant in ts.variants(): 79 | print( 80 | "Variable site", variant.site.id, 81 | "at genome position", variant.site.position, 82 | ":", [variant.alleles[g] for g in variant.genotypes], 83 | ) 84 | ``` 85 | 86 | ## Analysis 87 | 88 | Tree sequences enable {ref}`efficient analysis ` 89 | of genetic variation using a comprehensive range of built-in {ref}`sec_stats`: 90 | 91 | ```{code-cell} 92 | genetic_diversity = ts.diversity() 93 | print("Av. genetic diversity across the genome is", genetic_diversity) 94 | 95 | branch_diversity = ts.diversity(mode="branch") 96 | print("Av. genealogical dist. between pairs of tips is", branch_diversity, ts.time_units) 97 | ``` 98 | 99 | ## Plotting the whole tree sequence 100 | 101 | This can give you a visual feel for small genealogies: 102 | 103 | ```{code-cell} 104 | ts.draw_svg( 105 | size=(800, 300), 106 | y_axis=True, 107 | mutation_labels={m.id: m.derived_state for m in ts.mutations()}, 108 | ) 109 | ``` 110 | 111 | ## Underlying data structures 112 | 113 | The data that defines a tree sequence is stored in a set of tables. These tables 114 | can be viewed, and copies of the tables can be edited to create a new tree sequence. 115 | 116 | ```{code-cell} 117 | # The sites table is one of several tables that underlie a tree sequence 118 | ts.tables.sites 119 | ``` 120 | 121 | The rest of this documentation gives a comprehensive description of the entire `tskit` 122 | library, including {ref}`descriptions and definitions ` of all 123 | the tables. 
124 | 125 | -------------------------------------------------------------------------------- /c/examples/haploid_wright_fisher.c: -------------------------------------------------------------------------------- 1 | #include <assert.h> 2 | #include <err.h> 3 | #include <stdio.h> 4 | #include <stdlib.h> 5 | 6 | #include <tskit.h> 7 | 8 | #define check_tsk_error(val) \ 9 | if (val < 0) { \ 10 | errx(EXIT_FAILURE, "line %d: %s", __LINE__, tsk_strerror(val)); \ 11 | } 12 | 13 | void 14 | simulate( 15 | tsk_table_collection_t *tables, int N, int T, int simplify_interval) 16 | { 17 | tsk_id_t *buffer, *parents, *children, child, left_parent, right_parent; 18 | double breakpoint; 19 | int ret, j, t, b; 20 | 21 | assert(simplify_interval != 0); // leads to division by zero 22 | buffer = malloc(2 * N * sizeof(tsk_id_t)); 23 | if (buffer == NULL) { 24 | errx(EXIT_FAILURE, "Out of memory"); 25 | } 26 | tables->sequence_length = 1.0; 27 | parents = buffer; 28 | for (j = 0; j < N; j++) { 29 | parents[j] 30 | = tsk_node_table_add_row(&tables->nodes, 0, T, TSK_NULL, TSK_NULL, NULL, 0); 31 | check_tsk_error(parents[j]); 32 | } 33 | b = 0; 34 | for (t = T - 1; t >= 0; t--) { 35 | /* Alternate between using the first and last N values in the buffer */ 36 | parents = buffer + (b * N); 37 | b = (b + 1) % 2; 38 | children = buffer + (b * N); 39 | for (j = 0; j < N; j++) { 40 | child = tsk_node_table_add_row( 41 | &tables->nodes, 0, t, TSK_NULL, TSK_NULL, NULL, 0); 42 | check_tsk_error(child); 43 | /* NOTE: the use of rand() is discouraged for 44 | * research code and proper random number generator 45 | * libraries should be preferred. 
46 | */ 47 | left_parent = parents[(size_t)((rand()/(1.+RAND_MAX))*N)]; 48 | right_parent = parents[(size_t)((rand()/(1.+RAND_MAX))*N)]; 49 | do { 50 | breakpoint = rand()/(1.+RAND_MAX); 51 | } while (breakpoint == 0); /* tiny proba of breakpoint being 0 */ 52 | ret = tsk_edge_table_add_row( 53 | &tables->edges, 0, breakpoint, left_parent, child, NULL, 0); 54 | check_tsk_error(ret); 55 | ret = tsk_edge_table_add_row( 56 | &tables->edges, breakpoint, 1, right_parent, child, NULL, 0); 57 | check_tsk_error(ret); 58 | children[j] = child; 59 | } 60 | if (t % simplify_interval == 0) { 61 | printf("Simplify at generation %lld: (%lld nodes %lld edges)", 62 | (long long) t, 63 | (long long) tables->nodes.num_rows, 64 | (long long) tables->edges.num_rows); 65 | /* Note: Edges must be sorted for simplify to work, and we use a brute force 66 | * approach of sorting each time here for simplicity. This is inefficient. */ 67 | ret = tsk_table_collection_sort(tables, NULL, 0); 68 | check_tsk_error(ret); 69 | ret = tsk_table_collection_simplify(tables, children, N, 0, NULL); 70 | check_tsk_error(ret); 71 | printf(" -> (%lld nodes %lld edges)\n", 72 | (long long) tables->nodes.num_rows, 73 | (long long) tables->edges.num_rows); 74 | for (j = 0; j < N; j++) { 75 | children[j] = j; 76 | } 77 | } 78 | } 79 | free(buffer); 80 | } 81 | 82 | int 83 | main(int argc, char **argv) 84 | { 85 | int ret; 86 | tsk_table_collection_t tables; 87 | 88 | if (argc != 6) { 89 | errx(EXIT_FAILURE, "usage: N T simplify-interval output-file seed"); 90 | } 91 | ret = tsk_table_collection_init(&tables, 0); 92 | check_tsk_error(ret); 93 | srand((unsigned)atoi(argv[5])); 94 | simulate(&tables, atoi(argv[1]), atoi(argv[2]), atoi(argv[3])); 95 | 96 | /* Sort and index so that the result can be opened as a tree sequence */ 97 | ret = tsk_table_collection_sort(&tables, NULL, 0); 98 | check_tsk_error(ret); 99 | ret = tsk_table_collection_build_index(&tables, 0); 100 | check_tsk_error(ret); 101 | ret = 
tsk_table_collection_dump(&tables, argv[4], 0); 102 | check_tsk_error(ret); 103 | 104 | tsk_table_collection_free(&tables); 105 | return 0; 106 | } 107 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # tskit 2 | 3 | [![License](https://img.shields.io/github/license/tskit-dev/tskit)](https://github.com/tskit-dev/tskit/blob/main/LICENSE) 4 | [![Contributors](https://img.shields.io/github/contributors/tskit-dev/tskit)](https://github.com/tskit-dev/tskit/graphs/contributors) 5 | [![Commit activity](https://img.shields.io/github/commit-activity/m/tskit-dev/tskit)](https://github.com/tskit-dev/tskit/commits/main) 6 | [![Coverage](https://codecov.io/gh/tskit-dev/tskit/branch/main/graph/badge.svg)](https://codecov.io/gh/tskit-dev/tskit) 7 | ![OS](https://img.shields.io/badge/OS-linux%20%7C%20OSX%20%7C%20win--64-steelblue) 8 | 9 | [Documentation (stable)](https://tskit.dev/tskit/docs/stable/) • [Documentation (latest)](https://tskit.dev/tskit/docs/latest/) 10 | 11 | [![Docs Build](https://github.com/tskit-dev/tskit/actions/workflows/docs.yml/badge.svg)](https://github.com/tskit-dev/tskit/actions/workflows/docs.yml) [![Binary wheels](https://github.com/tskit-dev/tskit/actions/workflows/wheels.yml/badge.svg)](https://github.com/tskit-dev/tskit/actions/workflows/wheels.yml) [![Tests](https://github.com/tskit-dev/tskit/actions/workflows/tests.yml/badge.svg)](https://github.com/tskit-dev/tskit/actions/workflows/tests.yml) 12 | 13 | 14 | The succinct tree sequence (`tskit`) format is an efficient way of representing 15 | the genetic history - sometimes known as an 16 | [Ancestral Recombination Graph or ARG](https://doi.org/10.1093/genetics/iyae100) - 17 | of a set of related DNA sequences. 
`Tskit` is used 18 | by a number of software libraries and programs (such as 19 | [msprime](https://github.com/tskit-dev/msprime), 20 | [SLiM](https://github.com/MesserLab/SLiM), 21 | [fwdpp](http://molpopgen.github.io/fwdpp/), and 22 | [tsinfer](https://tskit.dev/tsinfer/docs/stable/)) that either simulate or infer 23 | the evolutionary ancestry of genetic sequences. 24 | 25 | The `tskit` library provides the underlying functionality used to load, examine, and 26 | manipulate ARGs in the tree sequence format, including efficient access to the 27 | sequence of correlated trees along a genome and general methods to calculate 28 | genetic statistics. `Tskit` often forms part of an installation of other 29 | software packages such as those listed above. Please see the 30 | [documentation](https://tskit.dev/tskit/docs/stable/) for further details, which 31 | includes 32 | [installation instructions](https://tskit.dev/tskit/docs/stable/installation.html). 33 | 34 | To get started with tskit, see the tutorials and other content at http://tskit.dev. For help 35 | and support from the community you can use 36 | [discussions](https://github.com/tskit-dev/tskit/discussions) here on GitHub, or raise an 37 | issue for a specific bug or feature request. 38 | 39 | We warmly welcome contributions from the community. Raise an issue if you have an 40 | idea you'd like to work on, or submit a PR for comments and help. 41 | 42 | The base `tskit` library provides both a [Python](https://tskit.dev/tskit/docs/stable/python-api.html) 43 | and [C](https://tskit.dev/tskit/docs/stable/c-api.html) API. A Rust API is provided in the 44 | [tskit-rust](https://github.com/tskit-dev/tskit-rust) repository. 
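
To make the idea of a "sequence of correlated trees along a genome" a little more concrete, here is a minimal pure-Python sketch of how a list of edges, each covering a half-open genomic interval `[left, right)`, determines the local tree at any position. This is not how `tskit` is implemented and it is far less efficient; the `Edge` tuple, the `parents_at` helper and the toy data are hypothetical names and values used only for illustration. The `left`, `right`, `parent` and `child` fields mirror the edge columns used in the C examples.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """One row of a (hypothetical, simplified) edge table."""
    left: float
    right: float
    parent: int
    child: int


def parents_at(edges, position):
    """Return {child: parent} for the local tree covering `position`."""
    return {e.child: e.parent for e in edges if e.left <= position < e.right}


# Toy data: samples 0, 1, 2 over a genome of length 10.
# Nodes 0 and 1 share parent 3 everywhere; the root changes at position 5.
edges = [
    Edge(0, 10, 3, 0),
    Edge(0, 10, 3, 1),
    Edge(0, 5, 4, 2),
    Edge(0, 5, 4, 3),
    Edge(5, 10, 5, 2),
    Edge(5, 10, 5, 3),
]

left_tree = parents_at(edges, 2.0)
right_tree = parents_at(edges, 7.0)
# Edges that span both positions are shared between the two local trees.
shared = {c: p for c, p in left_tree.items() if right_tree.get(c) == p}

print("Tree at 2.0:", left_tree)
print("Tree at 7.0:", right_tree)
print("Shared child->parent links:", shared)
```

Because adjacent trees share most of their edges, storing each edge once with its interval is much more compact than storing every tree separately.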
45 | 46 | 47 | #### Python API 48 | [![PyPI version](https://img.shields.io/pypi/v/tskit.svg)](https://pypi.org/project/tskit/) 49 | [![Supported Python Versions](https://img.shields.io/pypi/pyversions/tskit.svg)](https://pypi.org/project/tskit/) 50 | [![Wheel](https://img.shields.io/pypi/wheel/tskit)](https://pypi.org/project/tskit/) 51 | [![Black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) 52 | 53 | 54 | Most users of `tskit` will use the Python API as it provides a convenient, high-level API 55 | to access, analyse and create tree sequences. Full documentation is 56 | [here](https://tskit.dev/tskit/docs/stable/python-api.html). 57 | 58 | #### C API 59 | [![C99](https://img.shields.io/badge/Language-C99-steelblue.svg)](https://en.wikipedia.org/wiki/C99) 60 | 61 | 62 | The `tskit` C API provides comprehensive, low-level methods for manipulating and 63 | processing tree sequences. Written to the C99 standard and fully thread-safe, it can be 64 | used with either C or C++. Full documentation is 65 | [here](https://tskit.dev/tskit/docs/stable/c-api.html). 
66 | 67 | ## Installation 68 | 69 | ```bash 70 | python -m pip install tskit 71 | # or 72 | conda install -c conda-forge tskit 73 | ``` 74 | -------------------------------------------------------------------------------- /docs/_config.yml: -------------------------------------------------------------------------------- 1 | # Book settings 2 | # Learn more at https://jupyterbook.org/customize/config.html 3 | 4 | title: Tskit manual 5 | author: Tskit Developers 6 | copyright: "2022" 7 | only_build_toc_files: true 8 | logo: logo.svg 9 | favicon: favicon.ico 10 | 11 | execute: 12 | execute_notebooks: cache 13 | timeout: 120 14 | 15 | launch_buttons: 16 | binderhub_url: "" 17 | 18 | repository: 19 | url: https://github.com/tskit-dev/tskit 20 | branch: main 21 | path_to_book: docs 22 | 23 | html: 24 | use_issues_button: true 25 | use_repository_button: true 26 | use_edit_page_button: true 27 | 28 | sphinx: 29 | extra_extensions: 30 | - sphinx_copybutton 31 | - breathe 32 | - sphinx.ext.autodoc 33 | - sphinx_autodoc_typehints 34 | - sphinx.ext.autosummary 35 | - sphinx.ext.todo 36 | - sphinx.ext.viewcode 37 | - sphinx.ext.intersphinx 38 | - sphinx_issues 39 | - sphinxarg.ext 40 | - IPython.sphinxext.ipython_console_highlighting 41 | #- sphinxcontrib.prettyspecialmethods 42 | 43 | config: 44 | html_theme: sphinx_book_theme 45 | html_theme_options: 46 | pygments_dark_style: monokai 47 | navigation_with_keys: false 48 | logo: 49 | text: "Version __TSKIT_VERSION__" 50 | repository_url: https://github.com/tskit-dev/tskit 51 | repository_branch: main 52 | path_to_docs: docs 53 | use_repository_button: true 54 | use_edit_page_button: true 55 | use_issues_button: true 56 | pygments_style: monokai 57 | myst_enable_extensions: 58 | - colon_fence 59 | - deflist 60 | - dollarmath 61 | - substitution 62 | issues_github_path: tskit-dev/tskit 63 | todo_include_todos: true 64 | intersphinx_mapping: 65 | python: ["https://docs.python.org/3/", null] 66 | tutorials: 
["https://tskit.dev/tutorials/", null] 67 | stdpopsim: ["https://stdpopsim.readthedocs.io/en/stable", null] 68 | pyslim: ["https://tskit.dev/pyslim/docs/latest/", null] 69 | msprime: ["https://tskit.dev/msprime/docs/stable/", null] 70 | numpy: ["https://numpy.org/doc/stable/", null] 71 | 72 | breathe_projects: {"tskit": "doxygen/xml"} 73 | breathe_default_project: "tskit" 74 | breathe_domain_by_extension: {"h": "c"} 75 | breathe_show_define_initializer: True 76 | 77 | # Note we have to use the regex version here because of 78 | # https://github.com/sphinx-doc/sphinx/issues/9748 79 | nitpick_ignore_regex: [ 80 | ["c:identifier", "uint8_t"], 81 | ["c:identifier", "int32_t"], 82 | ["c:identifier", "uint32_t"], 83 | ["c:identifier", "uint64_t"], 84 | ["c:identifier", "FILE"], 85 | ["c:identifier", "bool"], 86 | # This is for the anonymous interval struct embedded in the tsk_tree_t. 87 | ["c:identifier", "tsk_tree_t.@4"], 88 | ["c:type", "int32_t"], 89 | ["c:type", "uint32_t"], 90 | ["c:type", "uint64_t"], 91 | ["c:type", "bool"], 92 | # TODO these have been triaged here to make the docs compile, but we should 93 | # sort them out properly. https://github.com/tskit-dev/tskit/issues/336 94 | ["py:class", "array_like"], 95 | ["py:class", "row-like"], 96 | ["py:class", "array-like"], 97 | ["py:class", "dtype=np.uint32"], 98 | ["py:class", "dtype=np.uint32."], 99 | ["py:class", "dtype=np.int32"], 100 | ["py:class", "dtype=np.int8"], 101 | ["py:class", "dtype=np.float64"], 102 | ["py:class", "dtype=np.int64"], 103 | ] 104 | 105 | # Added to allow "bool" to be used as a :ctype: - this list has to be 106 | # manually specified in order to remove "bool" from it. 
107 | c_extra_keywords: [ 108 | "alignas", 109 | "alignof", 110 | "complex", 111 | "imaginary", 112 | "noreturn", 113 | "static_assert", 114 | "thread_local" 115 | ] 116 | 117 | autodoc_member_order: bysource 118 | 119 | # Without this option, autodoc tries to put links for all return types 120 | # in terms of the fully-qualified classnames which we don't want, and also 121 | # leads to broken links and nitpick failures. So, until we tackle 122 | # typehints fully, this is the simplest approach. 123 | autodoc_typehints: none 124 | 125 | -------------------------------------------------------------------------------- /c/tests/test_convert.c: -------------------------------------------------------------------------------- 1 | /* 2 | * MIT License 3 | * 4 | * Copyright (c) 2019-2022 Tskit Developers 5 | * 6 | * Permission is hereby granted, free of charge, to any person obtaining a copy 7 | * of this software and associated documentation files (the "Software"), to deal 8 | * in the Software without restriction, including without limitation the rights 9 | * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | * copies of the Software, and to permit persons to whom the Software is 11 | * furnished to do so, subject to the following conditions: 12 | * 13 | * The above copyright notice and this permission notice shall be included in all 14 | * copies or substantial portions of the Software. 15 | * 16 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | * SOFTWARE. 
23 | */ 24 | 25 | #include "testlib.h" 26 | #include 27 | 28 | #include 29 | #include 30 | 31 | static void 32 | test_single_tree_newick(void) 33 | { 34 | int ret; 35 | tsk_treeseq_t ts; 36 | tsk_tree_t t; 37 | size_t buffer_size = 1024; 38 | char newick[buffer_size]; 39 | 40 | tsk_treeseq_from_text(&ts, 1, single_tree_ex_nodes, single_tree_ex_edges, NULL, NULL, 41 | NULL, NULL, NULL, 0); 42 | 43 | ret = tsk_tree_init(&t, &ts, 0); 44 | CU_ASSERT_EQUAL_FATAL(ret, 0) 45 | ret = tsk_tree_first(&t); 46 | CU_ASSERT_EQUAL_FATAL(ret, TSK_TREE_OK) 47 | 48 | ret = tsk_convert_newick(&t, 0, 0, TSK_NEWICK_LEGACY_MS_LABELS, buffer_size, newick); 49 | CU_ASSERT_EQUAL_FATAL(ret, 0); 50 | /* This looks odd, but it is what a single-node Newick tree looks like. 51 | * Newick parsers seem to accept it in any case. */ 52 | CU_ASSERT_STRING_EQUAL(newick, "1;"); 53 | 54 | ret = tsk_convert_newick(&t, 0, 0, 0, buffer_size, newick); 55 | CU_ASSERT_EQUAL_FATAL(ret, 0); 56 | CU_ASSERT_STRING_EQUAL(newick, "n0;"); 57 | 58 | ret = tsk_convert_newick(&t, 4, 0, TSK_NEWICK_LEGACY_MS_LABELS, buffer_size, newick); 59 | CU_ASSERT_EQUAL_FATAL(ret, 0); 60 | CU_ASSERT_STRING_EQUAL(newick, "(1:1,2:1);"); 61 | ret = tsk_convert_newick(&t, 4, 0, 0, buffer_size, newick); 62 | CU_ASSERT_EQUAL_FATAL(ret, 0); 63 | CU_ASSERT_STRING_EQUAL(newick, "(n0:1,n1:1);"); 64 | 65 | ret = tsk_convert_newick(&t, 6, 0, TSK_NEWICK_LEGACY_MS_LABELS, buffer_size, newick); 66 | CU_ASSERT_EQUAL_FATAL(ret, 0); 67 | CU_ASSERT_STRING_EQUAL(newick, "((1:1,2:1):2,(3:2,4:2):1);"); 68 | 69 | ret = tsk_convert_newick(&t, 6, 0, 0, buffer_size, newick); 70 | CU_ASSERT_EQUAL_FATAL(ret, 0); 71 | CU_ASSERT_STRING_EQUAL(newick, "((n0:1,n1:1):2,(n2:2,n3:2):1);"); 72 | 73 | tsk_tree_free(&t); 74 | tsk_treeseq_free(&ts); 75 | } 76 | 77 | static void 78 | test_single_tree_newick_errors(void) 79 | { 80 | int ret; 81 | tsk_treeseq_t ts; 82 | tsk_tree_t t; 83 | size_t j, len; 84 | size_t buffer_size = 1024; 85 | char newick[buffer_size]; 86 | 87 | 
tsk_treeseq_from_text(&ts, 1, single_tree_ex_nodes, single_tree_ex_edges, NULL, NULL, 88 | NULL, NULL, NULL, 0); 89 | 90 | ret = tsk_tree_init(&t, &ts, 0); 91 | CU_ASSERT_EQUAL_FATAL(ret, 0) 92 | ret = tsk_tree_first(&t); 93 | CU_ASSERT_EQUAL_FATAL(ret, TSK_TREE_OK) 94 | 95 | ret = tsk_convert_newick(&t, -1, 1, 0, buffer_size, newick); 96 | CU_ASSERT_EQUAL_FATAL(ret, TSK_ERR_NODE_OUT_OF_BOUNDS); 97 | ret = tsk_convert_newick(&t, 7, 1, 0, buffer_size, newick); 98 | CU_ASSERT_EQUAL_FATAL(ret, TSK_ERR_NODE_OUT_OF_BOUNDS); 99 | 100 | ret = tsk_convert_newick(&t, 6, 0, 0, buffer_size, NULL); 101 | CU_ASSERT_EQUAL_FATAL(ret, TSK_ERR_BAD_PARAM_VALUE); 102 | ret = tsk_convert_newick(&t, 6, 0, 0, buffer_size, newick); 103 | CU_ASSERT_EQUAL_FATAL(ret, 0); 104 | len = 1 + strlen(newick); 105 | for (j = 0; j < len; j++) { 106 | ret = tsk_convert_newick(&t, 6, 0, 0, j, newick); 107 | CU_ASSERT_EQUAL_FATAL(ret, TSK_ERR_BUFFER_OVERFLOW); 108 | } 109 | ret = tsk_convert_newick(&t, 6, 0, TSK_NEWICK_LEGACY_MS_LABELS, len, newick); 110 | 111 | CU_ASSERT_EQUAL_FATAL(ret, 0); 112 | CU_ASSERT_STRING_EQUAL(newick, "((1:1,2:1):2,(3:2,4:2):1);"); 113 | 114 | tsk_tree_free(&t); 115 | tsk_treeseq_free(&ts); 116 | } 117 | 118 | int 119 | main(int argc, char **argv) 120 | { 121 | CU_TestInfo tests[] = { 122 | { "test_single_tree_newick", test_single_tree_newick }, 123 | { "test_single_tree_newick_errors", test_single_tree_newick_errors }, 124 | { NULL, NULL }, 125 | }; 126 | return test_main(tests, argc, argv); 127 | } 128 | -------------------------------------------------------------------------------- /python/tests/conftest.py: -------------------------------------------------------------------------------- 1 | # MIT License 2 | # 3 | # Copyright (c) 2018-2022 Tskit Developers 4 | # 5 | # Permission is hereby granted, free of charge, to any person obtaining a copy 6 | # of this software and associated documentation files (the "Software"), to deal 7 | # in the Software without restriction, 
including without limitation the rights 8 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | # copies of the Software, and to permit persons to whom the Software is 10 | # furnished to do so, subject to the following conditions: 11 | # 12 | # The above copyright notice and this permission notice shall be included in all 13 | # copies or substantial portions of the Software. 14 | # 15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | # SOFTWARE. 22 | """ 23 | Configuration and fixtures for pytest. Only put test-suite wide fixtures in here. Module 24 | specific fixtures should live in their modules. 25 | 26 | To use a fixture in a test simply refer to it by name as an argument. This is called 27 | dependency injection. Note that all fixtures should have the suffix "_fixture" to make 28 | it clear in test code. 29 | 30 | For example to use the `ts` fixture (a tree sequence with data in all fields) in a test: 31 | 32 | class TestClass: 33 | def test_something(self, ts_fixture): 34 | assert ts_fixture.some_method() == expected 35 | 36 | Fixtures can be parameterised etc. see https://docs.pytest.org/en/stable/fixture.html 37 | 38 | Note that fixtures have a "scope" for example `ts_fixture` below is only created once 39 | per test session and re-used for subsequent tests. 40 | """ 41 | import msprime 42 | import pytest 43 | from pytest import fixture 44 | 45 | from . import tsutil 46 | 47 | 48 | def pytest_addoption(parser): 49 | """ 50 | Add options, e.g. 
to skip tests marked with `@pytest.mark.slow` 51 | """ 52 | parser.addoption( 53 | "--skip-slow", action="store_true", default=False, help="Skip slow tests" 54 | ) 55 | parser.addoption( 56 | "--skip-network", 57 | action="store_true", 58 | default=False, 59 | help="Skip network/FIFO tests", 60 | ) 61 | parser.addoption( 62 | "--overwrite-expected-visualizations", 63 | action="store_true", 64 | default=False, 65 | help="Overwrite the expected viz files in tests/data/svg/", 66 | ) 67 | parser.addoption( 68 | "--draw-svg-debug-box", 69 | action="store_true", 70 | default=False, 71 | help="To help debugging, draw lines around the plotboxes in SVG output files", 72 | ) 73 | 74 | 75 | def pytest_configure(config): 76 | """ 77 | Add docs on the "slow" marker 78 | """ 79 | config.addinivalue_line("markers", "slow: mark test as slow to run") 80 | config.addinivalue_line("markers", "network: mark test as using network/FIFO") 81 | 82 | 83 | def pytest_collection_modifyitems(config, items): 84 | if config.getoption("--skip-slow"): 85 | skip_slow = pytest.mark.skip(reason="--skip-slow specified") 86 | for item in items: 87 | if "slow" in item.keywords: 88 | item.add_marker(skip_slow) 89 | if config.getoption("--skip-network"): 90 | skip_network = pytest.mark.skip(reason="--skip-network specified") 91 | for item in items: 92 | if "network" in item.keywords: 93 | item.add_marker(skip_network) 94 | 95 | 96 | @fixture 97 | def overwrite_viz(request): 98 | return request.config.getoption("--overwrite-expected-visualizations") 99 | 100 | 101 | @fixture 102 | def draw_plotbox(request): 103 | return request.config.getoption("--draw-svg-debug-box") 104 | 105 | 106 | @fixture(scope="session") 107 | def simple_degree1_ts_fixture(): 108 | return msprime.simulate(10, random_seed=42) 109 | 110 | 111 | @fixture(scope="session") 112 | def simple_degree2_ts_fixture(): 113 | ts = msprime.simulate(10, recombination_rate=0.2, random_seed=42) 114 | assert ts.num_trees == 2 115 | return ts 116 | 
117 | 118 | @fixture(scope="session") 119 | def ts_fixture(): 120 | """ 121 | A tree sequence with data in all fields 122 | """ 123 | return tsutil.all_fields_ts() 124 | 125 | 126 | @fixture(scope="session") 127 | def ts_fixture_for_simplify(): 128 | """ 129 | A tree sequence with data in all fields except edge metadata and migrations 130 | """ 131 | return tsutil.all_fields_ts(edge_metadata=False, migrations=False) 132 | 133 | 134 | @fixture(scope="session") 135 | def replicate_ts_fixture(): 136 | """ 137 | A list of tree sequences 138 | """ 139 | return list(msprime.simulate(10, num_replicates=10, random_seed=42)) 140 | -------------------------------------------------------------------------------- /c/examples/multichrom_wright_fisher_singlethreaded.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | #include 6 | 7 | #include 8 | 9 | #define check_tsk_error(val) \ 10 | if (val < 0) { \ 11 | errx(EXIT_FAILURE, "line %d: %s\n", __LINE__, tsk_strerror(val)); \ 12 | } 13 | 14 | void 15 | simulate( 16 | tsk_table_collection_t *tables, int num_chroms, int N, int T, int simplify_interval) 17 | { 18 | tsk_id_t *buffer, *parents, *children, child, left_parent, right_parent; 19 | bool left_is_first; 20 | double chunk_left, chunk_right; 21 | int ret, j, t, b, k; 22 | 23 | assert(simplify_interval != 0); // a zero interval would cause division by zero below 24 | buffer = malloc(2 * N * sizeof(tsk_id_t)); 25 | if (buffer == NULL) { 26 | errx(EXIT_FAILURE, "Out of memory"); 27 | } 28 | tables->sequence_length = num_chroms; 29 | parents = buffer; 30 | for (j = 0; j < N; j++) { 31 | parents[j] 32 | = tsk_node_table_add_row(&tables->nodes, 0, T, TSK_NULL, TSK_NULL, NULL, 0); 33 | check_tsk_error(parents[j]); 34 | } 35 | b = 0; 36 | for (t = T - 1; t >= 0; t--) { 37 | /* Alternate between using the first and last N values in the buffer */ 38 | parents = buffer + (b * N); 39 | b = (b + 1) % 2; 40 | children = buffer + 
(b * N); 41 | for (j = 0; j < N; j++) { 42 | child = tsk_node_table_add_row( 43 | &tables->nodes, 0, t, TSK_NULL, TSK_NULL, NULL, 0); 44 | check_tsk_error(child); 45 | /* NOTE: the use of rand() is discouraged for 46 | * research code and proper random number generator 47 | * libraries should be preferred. 48 | */ 49 | left_parent = parents[(size_t)((rand() / (1. + RAND_MAX)) * N)]; 50 | right_parent = parents[(size_t)((rand() / (1. + RAND_MAX)) * N)]; 51 | left_is_first = (rand() / (1. + RAND_MAX)) < 0.5; /* rand() returns an int, so scale it to [0, 1) before comparing */ 52 | chunk_left = 0.0; 53 | for (k = 0; k < num_chroms; k++) { 54 | chunk_right = chunk_left + rand() / (1. + RAND_MAX); 55 | /* a very tiny chance that right and left are equal */ 56 | if (chunk_right > chunk_left) { 57 | ret = tsk_edge_table_add_row(&tables->edges, chunk_left, chunk_right, 58 | left_is_first ? left_parent : right_parent, child, NULL, 0); 59 | check_tsk_error(ret); 60 | } 61 | chunk_left += 1.0; 62 | if (chunk_right < chunk_left) { 63 | ret = tsk_edge_table_add_row(&tables->edges, chunk_right, chunk_left, 64 | left_is_first ? right_parent : left_parent, child, NULL, 0); 65 | check_tsk_error(ret); 66 | } 67 | } 68 | children[j] = child; 69 | } 70 | if (t % simplify_interval == 0) { 71 | printf("Simplify at generation %lld: (%lld nodes %lld edges)", 72 | (long long) t, 73 | (long long) tables->nodes.num_rows, 74 | (long long) tables->edges.num_rows); 75 | /* Note: Edges must be sorted for simplify to work, and we use a brute force 76 | * approach of sorting each time here for simplicity. This is inefficient. 
*/ 77 | ret = tsk_table_collection_sort(tables, NULL, 0); 78 | check_tsk_error(ret); 79 | ret = tsk_table_collection_simplify(tables, children, N, 0, NULL); 80 | check_tsk_error(ret); 81 | printf(" -> (%lld nodes %lld edges)\n", 82 | (long long) tables->nodes.num_rows, 83 | (long long) tables->edges.num_rows); 84 | for (j = 0; j < N; j++) { 85 | children[j] = j; 86 | } 87 | } 88 | } 89 | /* Set the sample flags for final generation */ 90 | for (j = 0; j < N; j++) { 91 | tables->nodes.flags[children[j]] = TSK_NODE_IS_SAMPLE; 92 | } 93 | free(buffer); 94 | } 95 | 96 | int 97 | main(int argc, char **argv) 98 | { 99 | int ret; 100 | tsk_table_collection_t tables; 101 | 102 | if (argc != 7) { 103 | errx(EXIT_FAILURE, "usage: N T simplify-interval output seed num-chroms"); 104 | } 105 | ret = tsk_table_collection_init(&tables, 0); 106 | check_tsk_error(ret); 107 | srand((unsigned)atoi(argv[5])); 108 | simulate(&tables, atoi(argv[6]), atoi(argv[1]), atoi(argv[2]), atoi(argv[3])); 109 | 110 | /* Sort and index so that the result can be opened as a tree sequence */ 111 | ret = tsk_table_collection_sort(&tables, NULL, 0); 112 | check_tsk_error(ret); 113 | ret = tsk_table_collection_build_index(&tables, 0); 114 | check_tsk_error(ret); 115 | ret = tsk_table_collection_dump(&tables, argv[4], 0); 116 | check_tsk_error(ret); 117 | 118 | tsk_table_collection_free(&tables); 119 | return 0; 120 | } 121 | -------------------------------------------------------------------------------- /python/tests/data/svg/tree_both_axes.svg: -------------------------------------------------------------------------------- [SVG markup lost in extraction. Surviving text: x-axis label "Genome position" (ticks 0.91, 1.00); y-axis label "Time ago" (ticks 0.00, 0.11, 1.11, 6.57); node labels 0, 1, 4, 2, 3, 5, 8.] -------------------------------------------------------------------------------- /python/tests/data/svg/tree_timed_muts.svg: -------------------------------------------------------------------------------- [SVG markup lost in extraction. Surviving text: y-axis label "Time ago" (ticks 0.00, 0.11, 1.11, 9.08); node and mutation labels 0, 1, 4, 2, 3, 2, 5, 0, 1, 9.] -------------------------------------------------------------------------------- /docs/logo.svg: -------------------------------------------------------------------------------- [SVG markup lost in extraction. Surviving text: image/svg+xml.] -------------------------------------------------------------------------------- /python/lwt_interface/example_c_module.c: -------------------------------------------------------------------------------- 1 | /* 2 | * MIT License 3 | * 4 | * Copyright (c) 2019-2020 Tskit Developers 5 | * Copyright (c) 2015-2018 University of Oxford 6 | * 7 | * Permission is hereby granted, free of charge, to any person obtaining a copy 8 | * of this software and associated documentation files (the "Software"), to deal 9 | * in the Software without restriction, including without limitation the rights 10 | * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 11 | * copies of the Software, and to permit persons to whom the Software is 12 | * furnished to do so, subject to the following conditions: 13 | * 14 | * The above copyright notice and this permission notice shall be included in all 15 
| * copies or substantial portions of the Software. 16 | * 17 | * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 18 | * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 19 | * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 20 | * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 21 | * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 22 | * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 23 | * SOFTWARE. 24 | */ 25 | // Turn off clang-formatting for this file as turning off formatting 26 | // for specific bits will make it more confusing. 27 | // clang-format off 28 | 29 | #define PY_SSIZE_T_CLEAN 30 | #define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION 31 | 32 | #include 33 | #include 34 | #include 35 | 36 | #include "kastore.h" 37 | #include "tskit.h" 38 | 39 | #include "tskit_lwt_interface.h" 40 | 41 | static PyObject * 42 | example_receiving(PyObject *self, PyObject *args) { 43 | int err = -1; 44 | PyObject* ret = NULL; 45 | LightweightTableCollection *tables = NULL; 46 | tsk_treeseq_t tree_seq; 47 | tsk_tree_t tree; 48 | 49 | memset(&tree, 0, sizeof(tsk_tree_t)); 50 | memset(&tree_seq, 0, sizeof(tsk_treeseq_t)); 51 | 52 | /* Get the tables from the args */ 53 | if (!PyArg_ParseTuple(args, "O!", &LightweightTableCollectionType, &tables)) { 54 | goto out; 55 | } 56 | 57 | /* Check that the tables are init'd to prevent seg faults */ 58 | if (LightweightTableCollection_check_state(tables) != 0) { 59 | goto out; 60 | } 61 | 62 | /* Build a tree sequence from the tables */ 63 | err = tsk_treeseq_init(&tree_seq, tables->tables, 0); 64 | if (err < 0) { 65 | handle_tskit_error(err); 66 | goto out; 67 | } 68 | 69 | /* Get the first tree */ 70 | err = tsk_tree_init(&tree, &tree_seq, 0); 71 | if (err < 0) { 72 | handle_tskit_error(err); 73 | goto out; 74 | } 75 | err = tsk_tree_first(&tree); 76 | if (err < 0) { 77 | 
handle_tskit_error(err); 78 | goto out; 79 | } 80 | 81 | /* Return true if the tree has more than one root */ 82 | ret = Py_BuildValue("O", tsk_tree_get_num_roots(&tree) > 1 ? Py_True: Py_False); 83 | 84 | out: 85 | tsk_tree_free(&tree); 86 | tsk_treeseq_free(&tree_seq); 87 | return ret; 88 | } 89 | 90 | static PyObject * example_modifying(PyObject *self, PyObject *args) { 91 | int err = -1; 92 | PyObject* ret = NULL; 93 | LightweightTableCollection *tables = NULL; 94 | 95 | if (!PyArg_ParseTuple(args, "O!", &LightweightTableCollectionType, &tables)) { 96 | goto out; 97 | } 98 | 99 | /* Check that the tables are init'd to prevent seg faults */ 100 | if (LightweightTableCollection_check_state(tables) != 0) { 101 | goto out; 102 | } 103 | 104 | /* Modify the tables, note the need to check for error states and handle them */ 105 | err = tsk_table_collection_clear(tables->tables, 0); 106 | if (err < 0) { 107 | handle_tskit_error(err); 108 | goto out; 109 | } 110 | err = tsk_node_table_add_row(&tables->tables->nodes, 0, 0, 0, 0, NULL, 0); 111 | if (err < 0) { 112 | handle_tskit_error(err); 113 | goto out; 114 | } 115 | err = tsk_node_table_add_row(&tables->tables->nodes, 0, 0, 0, 0, NULL, 0); 116 | if (err < 0) { 117 | handle_tskit_error(err); 118 | goto out; 119 | } 120 | 121 | /* Only set the return after no errors */ 122 | ret = Py_BuildValue(""); 123 | out: 124 | return ret; 125 | } 126 | 127 | 128 | static PyMethodDef example_c_module_methods[] = { 129 | {"example_receiving", (PyCFunction) example_receiving, METH_VARARGS, "Example of function receiving tables"}, 130 | {"example_modifying", (PyCFunction) example_modifying, METH_VARARGS, "Example of function modifying tables"}, 131 | { NULL, NULL, 0, NULL } /* sentinel */ 132 | }; 133 | 134 | static struct PyModuleDef example_c_module = { 135 | .m_base = PyModuleDef_HEAD_INIT, 136 | .m_name = "example_c_module", 137 | .m_doc = "Example C module using the tskit LightweightTableCollection.", 138 | .m_size = -1, 139 | 
.m_methods = example_c_module_methods }; 140 | 141 | PyMODINIT_FUNC 142 | PyInit_example_c_module(void) 143 | { 144 | PyObject *module = PyModule_Create(&example_c_module); 145 | if (module == NULL) { 146 | return NULL; 147 | } 148 | import_array(); 149 | if (register_lwt_class(module) != 0) { 150 | return NULL; 151 | } 152 | 153 | /* Put your own functions/class definitions here, as usual */ 154 | 155 | 156 | return module; 157 | } 158 | -------------------------------------------------------------------------------- /python/tskit/provenance.py: -------------------------------------------------------------------------------- 1 | # MIT License 2 | # 3 | # Copyright (c) 2018-2024 Tskit Developers 4 | # Copyright (c) 2016-2017 University of Oxford 5 | # 6 | # Permission is hereby granted, free of charge, to any person obtaining a copy 7 | # of this software and associated documentation files (the "Software"), to deal 8 | # in the Software without restriction, including without limitation the rights 9 | # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | # copies of the Software, and to permit persons to whom the Software is 11 | # furnished to do so, subject to the following conditions: 12 | # 13 | # The above copyright notice and this permission notice shall be included in all 14 | # copies or substantial portions of the Software. 15 | # 16 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | # SOFTWARE. 
23 | """ 24 | Common provenance methods used to determine the state and versions 25 | of various dependencies and the OS. 26 | """ 27 | import json 28 | import os.path 29 | import platform 30 | import sys 31 | import time 32 | 33 | try: 34 | import resource 35 | except ImportError: 36 | resource = None # resource.getrusage absent on windows 37 | 38 | import jsonschema 39 | 40 | import _tskit 41 | import tskit.exceptions as exceptions 42 | from . import _version 43 | 44 | __version__ = _version.tskit_version 45 | 46 | 47 | # NOTE: the APIs here are all preliminary. We should have a class that encapsulates 48 | # all of the required functionality, including parsing and printing out provenance 49 | # records. This will replace the current functions. 50 | 51 | 52 | def get_environment(extra_libs=None, include_tskit=True): 53 | """ 54 | Returns a dictionary describing the environment in which tskit 55 | is currently running. 56 | 57 | This API is tentative and will change in the future when a more 58 | comprehensive provenance API is implemented. 
59 | """ 60 | env = { 61 | "os": { 62 | "system": platform.system(), 63 | "node": platform.node(), 64 | "release": platform.release(), 65 | "version": platform.version(), 66 | "machine": platform.machine(), 67 | }, 68 | "python": { 69 | "implementation": platform.python_implementation(), 70 | "version": platform.python_version(), 71 | }, 72 | } 73 | libs = {"kastore": {"version": ".".join(map(str, _tskit.get_kastore_version()))}} 74 | if include_tskit: 75 | libs["tskit"] = {"version": __version__} 76 | if extra_libs is not None: 77 | libs.update(extra_libs) 78 | env["libraries"] = libs 79 | return env 80 | 81 | 82 | def get_resources(start_time): 83 | # Returns a dict describing the resources used by the current process 84 | times = os.times() 85 | ret = { 86 | "elapsed_time": time.time() - start_time, 87 | "user_time": times.user + times.children_user, 88 | "sys_time": times.system + times.children_system, 89 | } 90 | if resource is not None: 91 | # Don't report max memory on Windows, we would need an external dep like psutil 92 | ret["max_memory"] = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss 93 | if sys.platform != "darwin": 94 | ret["max_memory"] *= 1024 # Linux, freeBSD et al reports in KiB, not bytes 95 | 96 | return ret 97 | 98 | 99 | def get_provenance_dict(parameters=None): 100 | """ 101 | Returns a dictionary encoding an execution of tskit conforming to the 102 | provenance schema. 103 | """ 104 | document = { 105 | "schema_version": "1.0.0", 106 | "software": {"name": "tskit", "version": __version__}, 107 | "parameters": parameters, 108 | "environment": get_environment(include_tskit=False), 109 | } 110 | return document 111 | 112 | 113 | # Cache the schema 114 | _schema = None 115 | 116 | 117 | def get_schema(): 118 | """ 119 | Returns the tskit provenance :ref:`provenance schema ` as 120 | a dict. 121 | 122 | :return: The provenance schema. 
123 | :rtype: dict 124 | """ 125 | global _schema 126 | if _schema is None: 127 | base = os.path.dirname(__file__) 128 | schema_file = os.path.join(base, "provenance.schema.json") 129 | with open(schema_file) as f: 130 | _schema = json.load(f) 131 | # Return a shallow copy to avoid issues with modifying the cached schema 132 | return dict(_schema) 133 | 134 | 135 | def validate_provenance(provenance): 136 | """ 137 | Validates the specified dict-like object against the tskit 138 | :ref:`provenance schema `. If the input does 139 | not represent a valid instance of the schema an exception is 140 | raised. 141 | 142 | :param dict provenance: The dictionary representing a JSON document 143 | to be validated against the schema. 144 | :raises ProvenanceValidationError: if the provenance is not valid. 145 | """ 146 | schema = get_schema() 147 | try: 148 | jsonschema.validate(provenance, schema) 149 | except jsonschema.exceptions.ValidationError as ve: 150 | raise exceptions.ProvenanceValidationError from ve 151 | -------------------------------------------------------------------------------- /python/tests/data/svg/tree_y_axis_rank.svg: -------------------------------------------------------------------------------- [SVG markup lost in extraction. Surviving text: y-axis label "Time (relative steps)" (ticks 0.00, 0.11, 1.11, 5.31); node and mutation labels 0, 1, 3, 4, 4, 2, 6, 3, 5, 5, 7.] -------------------------------------------------------------------------------- /python/tests/data/svg/tree_x_axis.svg: -------------------------------------------------------------------------------- [SVG markup lost in extraction. Surviving text: x-axis label "pos on genome" (ticks 0.06, 0.79); node and mutation labels 0, 1, 3, 4, 4, 2, 6, 3, 5, 5, 7.] -------------------------------------------------------------------------------- /c/meson.build: -------------------------------------------------------------------------------- 1 | project('tskit', ['c', 'cpp'], 2 | version: files('VERSION.txt'), 3 | default_options: ['c_std=c99', 'cpp_std=c++11'] 4 | ) 5 | 6 | debug_c_args = [] 7 | if get_option('buildtype').startswith('debug') 8 | debug_c_args = ['-DTSK_TRACE_ERRORS'] 9 | endif 10 | 11 | kastore_proj = subproject('kastore') 12 | kastore_dep = kastore_proj.get_variable('kastore_dep') 13 | kastore_inc = kastore_proj.get_variable('kastore_inc') 14 | 15 | cc = meson.get_compiler('c') 16 | m_dep = cc.find_library('m', required: false) 17 | lib_deps = [m_dep, kastore_dep] 18 | 19 | extra_c_args = [ 20 | '-Wall', '-Wextra', '-Werror', '-Wpedantic', '-W', 21 | '-Wmissing-prototypes', '-Wstrict-prototypes', 22 | '-Wconversion', '-Wshadow', '-Wpointer-arith', '-Wcast-align', 23 | '-Wcast-qual', '-Wwrite-strings', '-Wnested-externs', 24 | '-fshort-enums', '-fno-common'] + debug_c_args 25 | 26 | lib_sources = [ 27 | 'tskit/core.c', 'tskit/tables.c', 'tskit/trees.c', 28 | 'tskit/genotypes.c', 'tskit/stats.c', 'tskit/convert.c', 'tskit/haplotype_matching.c'] 29 | lib_headers = [ 30 | 'tskit/core.h', 'tskit/tables.h', 'tskit/trees.h', 31 | 'tskit/genotypes.h', 'tskit/stats.h', 'tskit/convert.h', 'tskit/haplotype_matching.h'] 32 | 33 | # 
Subprojects use the static library for simplicity. 34 | tskit_inc = [kastore_inc, include_directories(['.'])] 35 | tskit_lib = static_library('tskit', 36 | sources: lib_sources, dependencies: lib_deps) 37 | tskit_dep = declare_dependency(include_directories:tskit_inc, link_with: tskit_lib) 38 | 39 | if not meson.is_subproject() 40 | 41 | # Shared library install target. 42 | shared_library('tskit', 43 | sources: lib_sources, dependencies: lib_deps, c_args: extra_c_args, install: true) 44 | install_headers('tskit.h') 45 | install_headers(lib_headers, subdir: 'tskit') 46 | 47 | cunit_dep = dependency('cunit') 48 | # We don't specify extra C args here as CUnit won't pass the checks. 49 | test_lib = static_library('testlib', 50 | sources: ['tests/testlib.c'], dependencies: [cunit_dep, kastore_dep, tskit_dep]) 51 | 52 | test_core = executable('test_core', 53 | sources: ['tests/test_core.c'], 54 | link_with: [tskit_lib, test_lib], 55 | c_args: extra_c_args+['-DMESON_PROJECT_VERSION="@0@"'.format(meson.project_version())], 56 | dependencies: kastore_dep, 57 | ) 58 | test('core', test_core) 59 | 60 | test_tables = executable('test_tables', 61 | sources: ['tests/test_tables.c'], 62 | link_with: [tskit_lib, test_lib], c_args: extra_c_args, dependencies: kastore_dep) 63 | test('tables', test_tables) 64 | 65 | test_trees = executable('test_trees', 66 | sources: ['tests/test_trees.c'], 67 | link_with: [tskit_lib, test_lib], c_args: extra_c_args, dependencies: kastore_dep) 68 | test('trees', test_trees) 69 | 70 | test_genotypes = executable('test_genotypes', 71 | sources: ['tests/test_genotypes.c'], 72 | link_with: [tskit_lib, test_lib], c_args: extra_c_args, dependencies: kastore_dep) 73 | test('genotypes', test_genotypes) 74 | 75 | test_convert = executable('test_convert', 76 | sources: ['tests/test_convert.c'], 77 | link_with: [tskit_lib, test_lib], c_args: extra_c_args, dependencies: kastore_dep) 78 | test('convert', test_convert) 79 | 80 | test_stats = 
executable('test_stats', 81 | sources: ['tests/test_stats.c'], 82 | link_with: [tskit_lib, test_lib], c_args: extra_c_args, dependencies: kastore_dep) 83 | test('stats', test_stats) 84 | 85 | test_haplotype_matching = executable('test_haplotype_matching', 86 | sources: ['tests/test_haplotype_matching.c'], 87 | link_with: [tskit_lib, test_lib], c_args: extra_c_args, dependencies: kastore_dep) 88 | test('haplotype_matching', test_haplotype_matching) 89 | 90 | test_file_format = executable('test_file_format', 91 | sources: ['tests/test_file_format.c'], 92 | link_with: [tskit_lib, test_lib], c_args: extra_c_args, dependencies: kastore_dep) 93 | test('file_format', test_file_format) 94 | 95 | test_minimal_cpp = executable('test_minimal_cpp', 96 | sources: ['tests/test_minimal_cpp.cpp'], link_with: [tskit_lib], 97 | dependencies: kastore_dep) 98 | test('minimal_cpp', test_minimal_cpp) 99 | 100 | if get_option('build_examples') 101 | # These example programs use less portable features, 102 | # and we don't want to always compile them. 
Use, e.g., 103 | # meson build -Dbuild_examples=false 104 | executable('api_structure', 105 | sources: ['examples/api_structure.c'], 106 | link_with: [tskit_lib], dependencies: lib_deps) 107 | executable('error_handling', 108 | sources: ['examples/error_handling.c'], 109 | link_with: [tskit_lib], dependencies: lib_deps) 110 | executable('tree_iteration', 111 | sources: ['examples/tree_iteration.c'], 112 | link_with: [tskit_lib], dependencies: lib_deps) 113 | executable('tree_traversal', 114 | sources: ['examples/tree_traversal.c'], 115 | link_with: [tskit_lib], dependencies: lib_deps) 116 | executable('streaming', 117 | sources: ['examples/streaming.c'], 118 | link_with: [tskit_lib], dependencies: lib_deps) 119 | executable('cpp_sorting_example', 120 | sources: ['examples/cpp_sorting_example.cpp'], 121 | link_with: [tskit_lib], dependencies: lib_deps) 122 | executable('haploid_wright_fisher', 123 | sources: ['examples/haploid_wright_fisher.c'], 124 | link_with: [tskit_lib], dependencies: lib_deps) 125 | executable('multichrom_wright_fisher_singlethreaded', 126 | sources: ['examples/multichrom_wright_fisher_singlethreaded.c'], 127 | link_with: [tskit_lib], dependencies: lib_deps) 128 | 129 | thread_dep = dependency('threads') 130 | executable('multichrom_wright_fisher', 131 | sources: ['examples/multichrom_wright_fisher.c'], 132 | link_with: [tskit_lib], dependencies: [m_dep, kastore_dep, thread_dep]) 133 | endif 134 | endif 135 | -------------------------------------------------------------------------------- /python/benchmark/config.yaml: -------------------------------------------------------------------------------- 1 | setup: | 2 | import tskit 3 | 4 | benchmarks: 5 | - code: ts = tskit.load("{filename}") 6 | parameters: 7 | filename: &files 8 | - "tiny.trees" 9 | - "bench.trees" 10 | 11 | - code: ts.dump("/dev/null");"{filename}" 12 | setup: | 13 | ts = tskit.load("{filename}") 14 | parameters: 15 | filename: *files 16 | 17 | - code: ts.write_vcf(null) 18 | #, 
site_mask=site_mask, sample_mask=sample_mask) 19 | setup: | 20 | import numpy 21 | ts = tskit.load("bench.trees") 22 | tables = ts.tables 23 | tables.migrations.clear() 24 | ts = tables.tree_sequence() 25 | ts = ts.simplify(samples=list(range(1000))) 26 | null = open("/dev/null", "w") 27 | 28 | - code: tree = ts.first();"{filename}" 29 | setup: ts = tskit.load("{filename}") 30 | parameters: 31 | filename: *files 32 | 33 | - name: tree.seek() 34 | # We can't just repeatedly seek to the same position as this will be a noop, 35 | # so we go back and forth. 36 | code: | 37 | tree.seek(pos) 38 | pos = 0 if pos == 500_000 else 500_000 39 | setup: | 40 | ts = tskit.load("bench.trees") 41 | tree = ts.first() 42 | pos = 500_000 43 | 44 | - code: "for _ in ts.trees(): pass;'{filename}'" 45 | setup: ts = tskit.load("{filename}") 46 | parameters: 47 | filename: *files 48 | 49 | - code: tree.{array} 50 | setup: | 51 | ts = tskit.load("bench.trees") 52 | tree = ts.first() 53 | parameters: 54 | array: &tree_arrays 55 | - parent_array 56 | - left_child_array 57 | - right_child_array 58 | - left_sib_array 59 | - right_sib_array 60 | - num_children_array 61 | - edge_array 62 | 63 | - code: tree.{array}(42); 64 | setup: | 65 | ts = tskit.load("bench.trees") 66 | tree = ts.first() 67 | parameters: 68 | array: 69 | - parent 70 | - left_child 71 | - right_child 72 | - left_sib 73 | - right_sib 74 | - num_children 75 | - edge 76 | 77 | - code: tree.{traversal_order}() 78 | setup: | 79 | ts = tskit.load("bench.trees") 80 | tree = ts.first() 81 | parameters: 82 | traversal_order: &traversal_orders 83 | - postorder 84 | - preorder 85 | - timeasc 86 | - timedesc 87 | 88 | - code: "for v in ts.variants(): pass;'{filename}'" 89 | setup: ts = tskit.load("{filename}") 90 | parameters: 91 | filename: *files 92 | 93 | - code: "ts.genotype_matrix();'{filename}'" 94 | setup: | 95 | ts = tskit.load("{filename}") 96 | if ts.num_samples > 10_000: 97 | tables = ts.tables 98 | tables.migrations.clear() 
99 | ts = tables.tree_sequence() 100 | ts = ts.simplify(samples=list(range(1000))) 101 | parameters: 102 | filename: *files 103 | 104 | - code: "for row in ts.{table}(): pass" 105 | setup: ts = tskit.load("bench.trees") 106 | parameters: 107 | table: &tables 108 | - nodes 109 | - edges 110 | - sites 111 | - mutations 112 | - populations 113 | - individuals 114 | - migrations 115 | - provenances 116 | 117 | - code: "for row in ts.populations(): {decode_metadata}" 118 | setup : | 119 | tc = tskit.TableCollection(1) 120 | tc.populations.metadata_schema = tskit.MetadataSchema({{'codec':'json'}}) 121 | for i in range(1000): 122 | tc.populations.add_row(metadata={{'a': i}}) 123 | ts = tc.tree_sequence() 124 | parameters: 125 | decode_metadata: 126 | - "pass" 127 | - "row.metadata" 128 | 129 | - code: ts.{table}(1) 130 | setup: | 131 | ts = tskit.load("bench.trees") 132 | parameters: 133 | table: 134 | - node 135 | - edge 136 | - site 137 | - mutation 138 | - population 139 | - individual 140 | - migration 141 | - provenance 142 | 143 | - code: ts.tables 144 | setup: ts = tskit.load("bench.trees") 145 | 146 | - code: tables.{table} 147 | setup: | 148 | ts = tskit.load("bench.trees") 149 | tables = ts.tables 150 | parameters: 151 | table: *tables 152 | 153 | - code: x = {table}.{column} 154 | setup: | 155 | ts = tskit.load("bench.trees") 156 | tables = ts.tables 157 | {table} = tables.{table} 158 | parameters: &table_columns 159 | table: 160 | nodes: 161 | column: 162 | - flags 163 | - time 164 | - population 165 | - individual 166 | - metadata 167 | - metadata_offset 168 | individuals: 169 | column: 170 | - flags 171 | - location 172 | - location_offset 173 | - parents 174 | - metadata 175 | edges: 176 | column: 177 | - left 178 | - right 179 | - parent 180 | - child 181 | - metadata 182 | - metadata_offset 183 | sites: 184 | column: 185 | - position 186 | - ancestral_state 187 | - ancestral_state_offset 188 | - metadata 189 | - metadata_offset 190 | mutations: 191 | 
column: 192 | - site 193 | - node 194 | - parent 195 | - time 196 | - derived_state 197 | - derived_state_offset 198 | - metadata 199 | - metadata_offset 200 | migrations: 201 | column: 202 | - left 203 | - right 204 | - node 205 | - source 206 | - dest 207 | - time 208 | - metadata 209 | - metadata_offset 210 | populations: 211 | column: 212 | - metadata 213 | - metadata_offset 214 | provenances: 215 | column: 216 | - timestamp 217 | - timestamp_offset 218 | - record 219 | - record_offset 220 | -------------------------------------------------------------------------------- /docs/ibd.md: -------------------------------------------------------------------------------- 1 | --- 2 | jupytext: 3 | text_representation: 4 | extension: .md 5 | format_name: myst 6 | format_version: 0.12 7 | jupytext_version: 1.9.1 8 | kernelspec: 9 | display_name: Python 3 10 | language: python 11 | name: python3 12 | --- 13 | 14 | ```{currentmodule} tskit 15 | ``` 16 | 17 | 18 | (sec_identity)= 19 | 20 | # Identity by descent 21 | 22 | The {meth}`.TreeSequence.ibd_segments` method allows us to compute 23 | segments of identity by descent. 24 | 25 | :::{note} 26 | This documentation page is preliminary 27 | ::: 28 | 29 | :::{todo} 30 | Relate the concept of identity by descent to the MRCA spans in the tree sequence. 
31 | ::: 32 | 33 | ## Examples 34 | 35 | Let's take a simple tree sequence to illustrate the {meth}`.TreeSequence.ibd_segments` 36 | method and associated {ref}`sec_python_api_reference_identity`: 37 | 38 | ```{code-cell} 39 | :tags: [hide-input] 40 | 41 | import tskit 42 | import io 43 | from IPython.display import SVG 44 | 45 | nodes = io.StringIO( 46 | """\ 47 | id is_sample time 48 | 0 1 0 49 | 1 1 0 50 | 2 1 0 51 | 3 0 1 52 | 4 0 2 53 | 5 0 3 54 | """ 55 | ) 56 | edges = io.StringIO( 57 | """\ 58 | left right parent child 59 | 2 10 3 0 60 | 2 10 3 2 61 | 0 10 4 1 62 | 0 2 4 2 63 | 2 10 4 3 64 | 0 2 5 0 65 | 0 2 5 4 66 | """ 67 | ) 68 | ts = tskit.load_text(nodes=nodes, edges=edges, strict=False) 69 | 70 | SVG(ts.draw_svg()) 71 | ``` 72 | 73 | ### Definition 74 | 75 | A pair of nodes ``(u, v)`` has an IBD segment with a left and right 76 | coordinate ``[left, right)`` and ancestral node ``a`` iff the most 77 | recent common ancestor of the segment ``[left, right)`` in nodes ``u`` 78 | and ``v`` is ``a``, and the segment has been inherited along the same 79 | genealogical path (i.e., it has not been broken by recombination). The 80 | segments returned are the longest possible ones. 81 | 82 | Consider the IBD segments that we get from our example tree sequence: 83 | 84 | ```{code-cell} 85 | segments = ts.ibd_segments(store_segments=True) 86 | for pair, segment_list in segments.items(): 87 | print(pair, list(segment_list)) 88 | ``` 89 | 90 | Each of the sample pairs (0, 1), (0, 2) and (1, 2) is associated with 91 | two IBD segments, representing the different paths from these sample 92 | pairs to their common ancestor. Note in particular that (1, 2) has 93 | **two** IBD segments rather than one: even though the MRCA is 94 | 4 in both cases, the paths from the samples to the MRCA are different 95 | in the left and right trees. 
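To make the definition concrete, here is a minimal brute-force sketch in plain Python. It is not the algorithm tskit uses and not part of its API: the names `EDGES`, `SAMPLES`, `path_to_root` and `naive_ibd_segments` are illustrative assumptions, and the edge table is copied by hand from the example above. For each tree interval it records the path from each sample to the pair's MRCA, and it extends a segment only while both paths stay unchanged.

```python
# Edge table from the example above: (left, right, parent, child).
EDGES = [
    (2, 10, 3, 0),
    (2, 10, 3, 2),
    (0, 10, 4, 1),
    (0, 2, 4, 2),
    (2, 10, 4, 3),
    (0, 2, 5, 0),
    (0, 2, 5, 4),
]
SAMPLES = [0, 1, 2]


def path_to_root(parent, u):
    """Return the nodes visited going from u up to the root of its tree."""
    path = [u]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def naive_ibd_segments(edges, samples):
    """Brute force: map each sample pair to a list of (left, right, mrca)."""
    breakpoints = sorted({x for left, right, _, _ in edges for x in (left, right)})
    result = {}
    last_paths = {}  # pair -> (path_u, path_v) seen in the previous interval
    for left, right in zip(breakpoints, breakpoints[1:]):
        # Parent of each node in the tree covering [left, right).
        parent = {c: p for l, r, p, c in edges if l <= left and right <= r}
        for i, u in enumerate(samples):
            for v in samples[i + 1:]:
                pu, pv = path_to_root(parent, u), path_to_root(parent, v)
                common = [n for n in pu if n in pv]
                if not common:
                    # No common ancestor in this tree, so any segment ends here.
                    last_paths.pop((u, v), None)
                    continue
                mrca = common[0]
                paths = (pu[: pu.index(mrca) + 1], pv[: pv.index(mrca) + 1])
                segs = result.setdefault((u, v), [])
                if segs and segs[-1][1] == left and last_paths.get((u, v)) == paths:
                    # Same genealogical path as before: extend the segment.
                    segs[-1] = (segs[-1][0], right, mrca)
                else:
                    segs.append((left, right, mrca))
                last_paths[(u, v)] = paths
    return result


for pair, segs in naive_ibd_segments(EDGES, SAMPLES).items():
    print(pair, segs)
```

On the example tables this sketch gives two segments per sample pair. For the pair (1, 2), both segments have MRCA 4, but they are kept separate because the path from sample 2 to node 4 differs between the two trees.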
96 | 97 | 98 | ### Data structures 99 | 100 | The result of calling {meth}`.TreeSequence.ibd_segments` is an 101 | {class}`.IdentitySegments` instance: 102 | 103 | ```{code-cell} 104 | segments = ts.ibd_segments() 105 | print(segments) 106 | ``` 107 | 108 | By default this class only stores the high-level summaries of the 109 | IBD segments discovered. As we can see in this example, we have a 110 | total of six segments and 111 | the total span (i.e., the sum of the lengths of the genomic intervals spanned 112 | by IBD segments) is 30. 113 | 114 | If required, we can get more detailed information about particular 115 | sample pairs and the actual segments using the ``store_pairs`` 116 | and ``store_segments`` arguments. 117 | 118 | :::{warning} 119 | Only use the ``store_pairs`` and ``store_segments`` arguments if you 120 | really need this information! The number of IBD segments can be 121 | very large and storing them all requires a lot of memory. It is 122 | also much faster to just compute the overall summaries, without 123 | needing to store the actual lists. 124 | ::: 125 | 126 | 127 | ```{code-cell} 128 | segments = ts.ibd_segments(store_pairs=True) 129 | for pair, value in segments.items(): 130 | print(pair, "::", value) 131 | ``` 132 | 133 | Now we can see the more detailed breakdown of how the identity segments 134 | are distributed among the sample pairs. 
The {class}`.IdentitySegments` 135 | class behaves like a dictionary, such that ``segments[(a, b)]`` will return 136 | the {class}`.IdentitySegmentList` instance for that pair of samples: 137 | 138 | ```{code-cell} 139 | seglist = segments[(0, 1)] 140 | print(seglist) 141 | ``` 142 | 143 | If we want to access the detailed information about the actual 144 | identity segments, we must use the ``store_segments`` argument: 145 | 146 | ```{code-cell} 147 | segments = ts.ibd_segments(store_pairs=True, store_segments=True) 148 | segments[(0, 1)] 149 | ``` 150 | 151 | The {class}`.IdentitySegmentList` behaves like a Python list, 152 | where each element is an instance of {class}`.IdentitySegment`. 153 | 154 | :::{warning} 155 | The order of segments in an {class}`.IdentitySegmentList` 156 | is arbitrary, and may change in future versions. 157 | ::: 158 | 159 | 160 | ```{eval-rst} 161 | .. todo:: More examples using the other bits of the IdentitySegments 162 | API here 163 | ``` 164 | 165 | ### Controlling the sample sets 166 | 167 | By default we get the IBD segments between all pairs of 168 | {ref}`sample` nodes. 169 | 170 | #### IBD within a sample set 171 | We can reduce this to pairs within a specific set using the 172 | ``within`` argument: 173 | 174 | 175 | ```{eval-rst} 176 | .. todo:: More detail and better examples here. 177 | ``` 178 | 179 | ```{code-cell} 180 | segments = ts.ibd_segments(within=[0, 2], store_pairs=True) 181 | print(list(segments.keys())) 182 | ``` 183 | 184 | #### IBD between sample sets 185 | 186 | We can also compute IBD **between** sample sets: 187 | 188 | ```{code-cell} 189 | segments = ts.ibd_segments(between=[[0,1], [2]], store_pairs=True) 190 | print(list(segments.keys())) 191 | ``` 192 | 193 | :::{seealso} 194 | See the {meth}`.TreeSequence.ibd_segments` documentation for 195 | more details. 
196 | ::: 197 | 198 | ### Constraints on the segments 199 | 200 | The ``max_time`` and ``min_span`` arguments allow us to constrain the 201 | segments that we consider. 202 | 203 | ```{eval-rst} 204 | .. todo:: Add examples for these arguments. 205 | ``` 206 | 
-------------------------------------------------------------------------------- /python/tests/data/svg/tree_poly_tracked_collapse.svg: --------------------------------------------------------------------------------
[SVG markup lost in extraction. Text that survives, in file order: 20; "+3/𝟑" (This polytomy has 3 additional branches, leading to a total of 3 descendant samples); 34; "+3/𝟑" (same tooltip); 35; "+2" (A collapsed non-sample node with 2 descendant samples in this tree); 36; 23; 24; 37; 25; 26; 38; "+3" (A collapsed non-sample node with 3 descendant samples in this tree); 39; 40; "+14/𝟐" (This polytomy has 2 additional branches, leading to a total of 14 descendant samples); 41.]
--------------------------------------------------------------------------------