2 |
3 | [Gitter chat](https://gitter.im/phylopandas/Lobby)
4 | [Documentation](http://phylopandas.readthedocs.io/en/latest/?badge=latest)
5 | [Build status](https://travis-ci.org/Zsailer/phylopandas)
6 | [Binder](https://mybinder.org/v2/gh/Zsailer/phylopandas/master?filepath=examples%2Fintro-notebook.ipynb)
7 |
8 | **Bringing the [Pandas](https://github.com/pandas-dev/pandas) `DataFrame` to phylogenetics.**
9 |
10 |
11 | PhyloPandas provides a Pandas-like interface for reading sequence and phylogenetic tree data into pandas DataFrames. This enables easy manipulation of phylogenetic data using familiar Python/Pandas functions. Finally, phylogenetics for humans!
12 |
13 |
14 |
15 | ## How does it work?
16 |
17 | Don't worry, we didn't reinvent the wheel. **PhyloPandas** is simply a [DataFrame](https://github.com/pandas-dev/pandas)
18 | (great for human-accessible data storage) interface on top of [Biopython](https://github.com/biopython/biopython) (great for parsing/writing sequence data) and [DendroPy](https://github.com/jeetsukumaran/DendroPy) (great for reading tree data).
19 |
20 | PhyloPandas does two things:
21 | 1. It offers new `read` functions to read sequence/tree data directly into a DataFrame.
22 | 2. It attaches a new `phylo` **accessor** to the Pandas DataFrame. This accessor provides reading and writing methods for sequence/tree data (powered by Biopython and DendroPy); see the short example below.
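
Because the accessor's read methods merge new data into the calling DataFrame, sequence and tree data can live in one table. A minimal sketch, assuming `sequences.fasta` and `tree.newick` describe the same taxa and that the tree's leaf ids match the FASTA record ids (file names are placeholders):

```python
import phylopandas as ph

# Start from a sequence DataFrame.
df = ph.read_fasta('sequences.fasta')

# Merge tree data into the same DataFrame, joining on the 'id' column.
df = df.phylo.read_newick('tree.newick', combine_on='id')
```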
23 |
24 | ## Basic Usage
25 |
26 | **Sequence data:**
27 |
28 | Read in a sequence file.
29 | ```python
30 | import phylopandas as ph
31 |
32 | df1 = ph.read_fasta('sequences.fasta')
33 | df2 = ph.read_phylip('sequences.phy')
34 | ```
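
Each reader returns an ordinary `pandas.DataFrame`. With the default settings the columns are `id`, `sequence`, `description`, `label`, and `uid` (a short sketch; `sequences.fasta` is a placeholder and column order may vary):

```python
import phylopandas as ph

df = ph.read_fasta('sequences.fasta')

# Inspect the columns produced by the reader.
print(df.columns.tolist())
# e.g. ['id', 'sequence', 'description', 'label', 'uid']
```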
35 |
36 | Write to various sequence file formats.
37 |
38 | ```python
39 | df1.phylo.to_clustal('sequences.clustal')
40 | ```
41 |
42 | Convert between formats.
43 |
44 | ```python
45 | # Read a format.
46 | df = ph.read_fasta('sequences.fasta')
47 |
48 | # Write to a different format.
49 | df.phylo.to_phylip('sequences.phy')
50 | ```
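
Because the result is a plain pandas DataFrame, familiar pandas operations apply directly. A small sketch (column names follow the reader's defaults; the length cutoff is arbitrary):

```python
import phylopandas as ph

df = ph.read_fasta('sequences.fasta')

# Standard pandas: compute sequence lengths and filter short records.
df['length'] = df['sequence'].str.len()
long_seqs = df[df['length'] > 100]
```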
51 |
52 | **Tree data:**
53 |
54 | Read Newick tree data.
55 | ```python
56 | df = ph.read_newick('tree.newick')
57 | ```
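
The tree DataFrame can be written back out through the same accessor (reusing `df` from the snippet above; the output file names are placeholders):

```python
# Write the tree back out in Newick or NEXUS format.
df.phylo.to_newick('tree-copy.newick')
df.phylo.to_nexus_tree('tree.nexus')
```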
58 |
59 | Visualize the phylogenetic data (powered by [phylovega](https://github.com/Zsailer/phylovega)).
60 | ```python
61 | df.phylo.display(
62 | height=500,
63 | )
64 | ```
65 |
66 |
67 |
68 | ## Contributing
69 |
70 | If you have ideas for the project, please share them on the project's [Gitter chat](https://gitter.im/phylopandas/Lobby).
71 |
72 | It's *easy* to create new read/write functions and methods for PhyloPandas. If you
73 | have a format you'd like to add, please submit PRs! There are many more formats
74 | in Biopython that I haven't had the time to add myself, so please don't be afraid
75 | to add them! I thank you ahead of time!
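
As a rough sketch of what such a pull request looks like (assuming the target format, GenBank here, is one that Biopython's `SeqIO` already understands), a new reader/writer is usually just another call to the existing factory functions:

```python
# In phylopandas/seqio/read.py
read_genbank = _read_function('genbank')

# In phylopandas/seqio/write.py
to_genbank = _write_function('genbank')

# In phylopandas/core.py, to expose the format on the `phylo` accessor:
read_genbank = seqio.read._read_method('genbank')
to_genbank = seqio.write._write_method('genbank')
```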
76 |
77 | ## Testing
78 |
79 | PhyloPandas includes a small [pytest](https://docs.pytest.org/en/latest/) suite. Run the tests from the base directory:
80 | ```
81 | $ cd phylopandas
82 | $ pytest
83 | ```
84 |
85 | ## Install
86 |
87 | Install from PyPI:
88 | ```
89 | pip install phylopandas
90 | ```
91 |
92 | Install from source:
93 | ```
94 | git clone https://github.com/Zsailer/phylopandas
95 | cd phylopandas
96 | pip install -e .
97 | ```
98 |
99 | ## Dependencies
100 |
101 | - [Biopython](https://github.com/biopython/biopython): Library for managing and manipulating biological data.
102 | - [DendroPy](https://github.com/jeetsukumaran/DendroPy): Library for phylogenetic scripting, simulation, data processing, and manipulation.
103 | - [Pandas](https://github.com/pandas-dev/pandas): Flexible and powerful data analysis/manipulation library for Python.
104 | - [pandas_flavor](https://github.com/Zsailer/pandas_flavor): Registers new accessors on pandas objects using pandas' register API (with backwards compatibility).
105 |
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 |
4 | # Note: To use the 'upload' functionality of this file, you must:
5 | # $ pip install twine
6 |
7 | import io
8 | import os
9 | import sys
10 | from shutil import rmtree
11 |
12 | from setuptools import find_packages, setup, Command
13 |
14 | # Package meta-data.
15 | NAME = 'phylopandas'
16 | DESCRIPTION = 'Pandas for phylogenetics'
17 | URL = 'https://github.com/Zsailer/phylopandas'
18 | EMAIL = 'zachsailer@gmail.com'
19 | AUTHOR = 'Zachary Sailer'
20 | REQUIRES_PYTHON = '>=3.0'
21 | VERSION = None
22 |
23 | # What packages are required for this module to be executed?
24 | REQUIRED = ["pandas>=0.22.0",
25 | "pandas_flavor>=0.1.0",
26 | "biopython",
27 | "dendropy"]
28 |
29 | # What packages are optional?
30 | EXTRAS = {}
31 |
32 | # The rest you shouldn't have to touch too much :)
33 | # ------------------------------------------------
34 | # Except, perhaps the License and Trove Classifiers!
35 | # If you do change the License, remember to change the Trove Classifier for that!
36 |
37 | here = os.path.abspath(os.path.dirname(__file__))
38 |
39 | # Import the README and use it as the long-description.
40 | # Note: this will only work if 'README.md' is present in your MANIFEST.in file!
41 | try:
42 | with io.open(os.path.join(here, 'README.md'), encoding='utf-8') as f:
43 | long_description = '\n' + f.read()
44 | except FileNotFoundError:
45 | long_description = DESCRIPTION
46 |
47 | # Load the package's __version__.py module as a dictionary.
48 | about = {}
49 | if not VERSION:
50 | with open(os.path.join(here, NAME, '__version__.py')) as f:
51 | exec(f.read(), about)
52 | else:
53 | about['__version__'] = VERSION
54 |
55 |
56 | class UploadCommand(Command):
57 | """Support setup.py upload."""
58 |
59 | description = 'Build and publish the package.'
60 | user_options = []
61 |
62 | @staticmethod
63 | def status(s):
64 | """Prints things in bold."""
65 | print('\033[1m{0}\033[0m'.format(s))
66 |
67 | def initialize_options(self):
68 | pass
69 |
70 | def finalize_options(self):
71 | pass
72 |
73 | def run(self):
74 | try:
75 | self.status('Removing previous builds…')
76 | rmtree(os.path.join(here, 'dist'))
77 | except OSError:
78 | pass
79 |
80 | self.status('Building Source and Wheel (universal) distribution…')
81 | os.system('{0} setup.py sdist bdist_wheel --universal'.format(sys.executable))
82 |
83 | self.status('Uploading the package to PyPI via Twine…')
84 | os.system('twine upload dist/*')
85 |
86 | self.status('Pushing git tags…')
87 | os.system('git tag v{0}'.format(about['__version__']))
88 | os.system('git push --tags')
89 |
90 | sys.exit()
91 |
92 | # Where the magic happens:
93 | setup(
94 | name=NAME,
95 | version=about['__version__'],
96 | description=DESCRIPTION,
97 | long_description=long_description,
98 | long_description_content_type='text/markdown',
99 | author=AUTHOR,
100 | author_email=EMAIL,
101 | python_requires=REQUIRES_PYTHON,
102 | url=URL,
103 | packages=find_packages(exclude=('tests',)),
104 | # If your package is a single module, use this instead of 'packages':
105 | # py_modules=['mypackage'],
106 |
107 | # entry_points={
108 | # 'console_scripts': ['mycli=mymodule:cli'],
109 | # },
110 | install_requires=REQUIRED,
111 | extras_require=EXTRAS,
112 | include_package_data=True,
113 | license='MIT',
114 | classifiers=[
115 | # Trove classifiers
116 | # Full list: https://pypi.python.org/pypi?%3Aaction=list_classifiers
117 | 'License :: OSI Approved :: MIT License',
118 | 'Programming Language :: Python',
119 | 'Programming Language :: Python :: 3',
120 | 'Programming Language :: Python :: 3.6',
121 | 'Programming Language :: Python :: Implementation :: CPython',
122 | 'Programming Language :: Python :: Implementation :: PyPy'
123 | ],
124 | # $ setup.py publish support.
125 | cmdclass={
126 | 'upload': UploadCommand,
127 | },
128 | )
129 |
--------------------------------------------------------------------------------
/phylopandas/seqio/read.py:
--------------------------------------------------------------------------------
1 | __doc__ = """
2 | Functions for reading sequence files into pandas DataFrame.
3 | """
4 |
5 | # Imports
6 | from Bio import SeqIO
7 | from Bio.Seq import Seq
8 | from Bio.SeqRecord import SeqRecord
9 | from Bio.Blast import NCBIXML
10 | import Bio.Alphabet
11 |
12 | # Import Phylopandas DataFrame
13 | import pandas as pd
14 | from ..utils import get_random_id
15 |
16 |
17 | def _read_doc_template(schema):
18 | s = """Read a {} file.
19 |
20 | Construct a PhyloPandas DataFrame with columns:
21 |     - id
22 |     - sequence
23 |     - description
24 |     - label
25 |
26 | Parameters
27 | ----------
28 | filename : str
29 | File name of {} file.
30 |
31 | seq_label : str (default='sequence')
32 | Sequence column name in DataFrame.
33 | """.format(schema, schema, schema)
34 | return s
35 |
36 |
37 | def _read(
38 | filename,
39 | schema,
40 | seq_label='sequence',
41 | alphabet=None,
42 | use_uids=True,
43 | **kwargs):
44 | """Use BioPython's sequence parsing module to convert any file format to
45 | a Pandas DataFrame.
46 |
47 | The resulting DataFrame has the following columns:
48 |     - id
49 |     - sequence
50 |     - description
51 |     - label
52 | """
53 | # Check Alphabet if given
54 | if alphabet is None:
55 | alphabet = Bio.Alphabet.Alphabet()
56 |
57 | elif alphabet in ['dna', 'rna', 'protein', 'nucleotide']:
58 | alphabet = getattr(Bio.Alphabet, 'generic_{}'.format(alphabet))
59 |
60 | else:
61 | raise Exception(
62 | "The alphabet is not recognized. Must be 'dna', 'rna', "
63 | "'nucleotide', or 'protein'.")
64 |
65 | kwargs.update(alphabet=alphabet)
66 |
67 | # Prepare DataFrame fields.
68 | data = {
69 | 'id': [],
70 | seq_label: [],
71 | 'description': [],
72 | 'label': []
73 | }
74 | if use_uids:
75 | data['uid'] = []
76 |
77 |     # Parse the file using the requested schema.
78 | for i, s in enumerate(SeqIO.parse(filename, format=schema, **kwargs)):
79 | data['id'].append(s.id)
80 | data[seq_label].append(str(s.seq))
81 | data['description'].append(s.description)
82 | data['label'].append(s.name)
83 |
84 | if use_uids:
85 | data['uid'].append(get_random_id(10))
86 |
87 | # Port to DataFrame.
88 | return pd.DataFrame(data)
89 |
90 |
91 | def _read_method(schema):
92 | """Add a write method for named schema to a class.
93 | """
94 | def func(
95 | self,
96 | filename,
97 | seq_label='sequence',
98 | alphabet=None,
99 | combine_on='uid',
100 | use_uids=True,
101 | **kwargs):
102 | # Use generic write class to write data.
103 | df0 = self._data
104 | df1 = _read(
105 | filename=filename,
106 | schema=schema,
107 | seq_label=seq_label,
108 | alphabet=alphabet,
109 | use_uids=use_uids,
110 | **kwargs
111 | )
112 | return df0.phylo.combine(df1, on=combine_on)
113 |
114 | # Update docs
115 | func.__doc__ = _read_doc_template(schema)
116 | return func
117 |
118 |
119 | def _read_function(schema):
120 | """Add a write method for named schema to a class.
121 | """
122 | def func(
123 | filename,
124 | seq_label='sequence',
125 | alphabet=None,
126 | use_uids=True,
127 | **kwargs):
128 | # Use generic write class to write data.
129 | return _read(
130 | filename=filename,
131 | schema=schema,
132 | seq_label=seq_label,
133 | alphabet=alphabet,
134 | use_uids=use_uids,
135 | **kwargs
136 | )
137 | # Update docs
138 | func.__doc__ = _read_doc_template(schema)
139 | return func
140 |
141 |
142 | # Various read functions to various formats.
143 | read_fasta = _read_function('fasta')
144 | read_phylip = _read_function('phylip')
145 | read_clustal = _read_function('clustal')
146 | read_embl = _read_function('embl')
147 | read_nexus_seq = _read_function('nexus')
148 | read_swiss = _read_function('swiss')
149 | read_fastq = _read_function('fastq')
150 | read_phylip_sequential = _read_function('phylip-sequential')
151 | read_phylip_relaxed = _read_function('phylip-relaxed')
152 |
153 |
154 | def read_blast_xml(filename, **kwargs):
155 | """Read BLAST XML format."""
156 | # Read file.
157 | with open(filename, 'r') as f:
158 | blast_record = NCBIXML.read(f)
159 |
160 | # Prepare DataFrame fields.
161 | data = {'accession': [],
162 | 'hit_def': [],
163 | 'hit_id': [],
164 | 'title': [],
165 | 'length': [],
166 | 'e_value': [],
167 | 'sequence': [],
168 | 'subject_start': [],
169 | 'subject_end':[],
170 | 'query_start':[],
171 | 'query_end':[],
172 | 'uid':[]}
173 |
174 | # Get alignments from blast result.
175 | for i, s in enumerate(blast_record.alignments):
176 | data['accession'].append(s.accession)
177 | data['hit_def'].append(s.hit_def)
178 | data['hit_id'].append(s.hit_id)
179 | data['title'].append(s.title)
180 | data['length'].append(s.length)
181 | data['e_value'].append(s.hsps[0].expect)
182 | data['sequence'].append(s.hsps[0].sbjct)
183 | data['subject_start'].append(s.hsps[0].sbjct_start)
184 | data['subject_end'].append(s.hsps[0].sbjct_end)
185 | data['query_start'].append(s.hsps[0].query_start)
186 | data['query_end'].append(s.hsps[0].query_end)
187 | data['uid'].append(get_random_id(10))
188 |
189 | # Port to DataFrame.
190 | return pd.DataFrame(data)
191 |
--------------------------------------------------------------------------------
/docs/conf.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | #
4 | # phylopandas documentation build configuration file, created by
5 | # sphinx-quickstart on Mon Oct 30 16:22:28 2017.
6 | #
7 | # This file is execfile()d with the current directory set to its
8 | # containing dir.
9 | #
10 | # Note that not all possible configuration values are present in this
11 | # autogenerated file.
12 | #
13 | # All configuration values have a default; values that are commented out
14 | # serve to show the default.
15 |
16 | # If extensions (or modules to document with autodoc) are in another directory,
17 | # add these directories to sys.path here. If the directory is relative to the
18 | # documentation root, use os.path.abspath to make it absolute, like shown here.
19 | #
20 | # import os
21 | # import sys
22 | # sys.path.insert(0, os.path.abspath('.'))
23 |
24 |
25 | # -- General configuration ------------------------------------------------
26 |
27 | # If your documentation needs a minimal Sphinx version, state it here.
28 | #
29 | # needs_sphinx = '1.0'
30 |
31 | # Add any Sphinx extension module names here, as strings. They can be
32 | # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
33 | # ones.
34 | extensions = ['sphinx.ext.mathjax']
35 |
36 | # Add any paths that contain templates here, relative to this directory.
37 | templates_path = ['_templates']
38 |
39 | # The suffix(es) of source filenames.
40 | # You can specify multiple suffix as a list of string:
41 | #
42 | # source_suffix = ['.rst', '.md']
43 | source_suffix = '.rst'
44 |
45 | # The master toctree document.
46 | master_doc = 'index'
47 |
48 | # General information about the project.
49 | project = 'phylopandas'
50 | copyright = '2017, Zach Sailer'
51 | author = 'Zach Sailer'
52 |
53 | # The version info for the project you're documenting, acts as replacement for
54 | # |version| and |release|, also used in various other places throughout the
55 | # built documents.
56 | #
57 | # The short X.Y version.
58 | version = '0.1.2'
59 | # The full version, including alpha/beta/rc tags.
60 | release = '0.1.2'
61 |
62 | # The language for content autogenerated by Sphinx. Refer to documentation
63 | # for a list of supported languages.
64 | #
65 | # This is also used if you do content translation via gettext catalogs.
66 | # Usually you set "language" from the command line for these cases.
67 | language = None
68 |
69 | # List of patterns, relative to source directory, that match files and
70 | # directories to ignore when looking for source files.
71 | # This patterns also effect to html_static_path and html_extra_path
72 | exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
73 |
74 | # The name of the Pygments (syntax highlighting) style to use.
75 | pygments_style = 'sphinx'
76 |
77 | # If true, `todo` and `todoList` produce output, else they produce nothing.
78 | todo_include_todos = False
79 |
80 |
81 | # -- Options for HTML output ----------------------------------------------
82 |
83 | # The theme to use for HTML and HTML Help pages. See the documentation for
84 | # a list of builtin themes.
85 | #
86 | html_theme = 'alabaster'
87 |
88 | # Theme options are theme-specific and customize the look and feel of a theme
89 | # further. For a list of options available for each theme, see the
90 | # documentation.
91 | #
92 | # html_theme_options = {}
93 |
94 | # Add any paths that contain custom static files (such as style sheets) here,
95 | # relative to this directory. They are copied after the builtin static files,
96 | # so a file named "default.css" will overwrite the builtin "default.css".
97 | html_static_path = ['_static']
98 |
99 | # Custom sidebar templates, must be a dictionary that maps document names
100 | # to template names.
101 | #
102 | # This is required for the alabaster theme
103 | # refs: http://alabaster.readthedocs.io/en/latest/installation.html#sidebars
104 | html_sidebars = {
105 | '**': [
106 | 'relations.html', # needs 'show_related': True theme option to display
107 | 'searchbox.html',
108 | ]
109 | }
110 |
111 |
112 | # -- Options for HTMLHelp output ------------------------------------------
113 |
114 | # Output file base name for HTML help builder.
115 | htmlhelp_basename = 'phylopandasdoc'
116 |
117 |
118 | # -- Options for LaTeX output ---------------------------------------------
119 |
120 | latex_elements = {
121 | # The paper size ('letterpaper' or 'a4paper').
122 | #
123 | # 'papersize': 'letterpaper',
124 |
125 | # The font size ('10pt', '11pt' or '12pt').
126 | #
127 | # 'pointsize': '10pt',
128 |
129 | # Additional stuff for the LaTeX preamble.
130 | #
131 | # 'preamble': '',
132 |
133 | # Latex figure (float) alignment
134 | #
135 | # 'figure_align': 'htbp',
136 | }
137 |
138 | # Grouping the document tree into LaTeX files. List of tuples
139 | # (source start file, target name, title,
140 | # author, documentclass [howto, manual, or own class]).
141 | latex_documents = [
142 | (master_doc, 'phylopandas.tex', 'phylopandas Documentation',
143 | 'Zach Sailer', 'manual'),
144 | ]
145 |
146 |
147 | # -- Options for manual page output ---------------------------------------
148 |
149 | # One entry per manual page. List of tuples
150 | # (source start file, name, description, authors, manual section).
151 | man_pages = [
152 | (master_doc, 'phylopandas', 'phylopandas Documentation',
153 | [author], 1)
154 | ]
155 |
156 |
157 | # -- Options for Texinfo output -------------------------------------------
158 |
159 | # Grouping the document tree into Texinfo files. List of tuples
160 | # (source start file, target name, title, author,
161 | # dir menu entry, description, category)
162 | texinfo_documents = [
163 | (master_doc, 'phylopandas', 'phylopandas Documentation',
164 | author, 'phylopandas', 'One line description of project.',
165 | 'Miscellaneous'),
166 | ]
167 |
--------------------------------------------------------------------------------
/phylopandas/treeio/read.py:
--------------------------------------------------------------------------------
1 | import pandas
2 | import dendropy
3 | from ..utils import get_random_id
4 |
5 | def _read_doc_template(schema):
6 | doc = """
7 | Read a {} tree into a phylopandas.DataFrame.
8 |
9 | The resulting DataFrame has the following columns:
10 |     - label: label for each taxon or node.
11 |     - id: unique id (created by phylopandas) given to each node.
12 |     - type: type of node (leaf, internal, or root).
13 |     - parent: parent id, necessary for constructing trees.
14 | - length: length of branch from parent to node.
15 | - distance: distance from root.
16 |
17 | Parameters
18 | ----------
19 | filename: str (default is None)
20 | {} file to read into DataFrame.
21 |
22 | data: str (default is None)
23 | {} string to parse and read into DataFrame.
24 |
25 | add_node_labels: bool
26 | If true, labels the internal nodes with numbers.
27 |
28 | Returns
29 | -------
30 | df: phylopandas.DataFrame
31 | """.format(schema, schema, schema)
32 | return doc
33 |
34 |
35 | def _dendropy_to_dataframe(
36 | tree,
37 | add_node_labels=True,
38 | use_uids=True):
39 | """Convert Dendropy tree to Pandas dataframe."""
40 | # Initialize the data object.
41 | idx = []
42 | data = {
43 | 'type': [],
44 | 'id': [],
45 | 'parent': [],
46 | 'length': [],
47 | 'label': [],
48 | 'distance': []}
49 |
50 | if use_uids:
51 | data['uid'] = []
52 |
53 | # Add labels to internal nodes if set to true.
54 | if add_node_labels:
55 | for i, node in enumerate(tree.internal_nodes()):
56 | node.label = str(i)
57 |
58 |     # Check if branch lengths were given.
59 | branch_lengths_given = tree.length() > 0
60 |
61 | for node in tree.nodes():
62 | # Get node type
63 | if node.is_leaf():
64 | type_ = 'leaf'
65 | # Check if node has taxon
66 | if hasattr(node.taxon, 'label'):
67 | label = str(node.taxon.label).replace(' ', '_')
68 | else:
69 | label = None
70 | elif node.is_internal():
71 | type_ = 'node'
72 | label = str(node.label)
73 |
74 | # Set node label and parent.
75 | id_ = label
76 | parent_node = node.parent_node
77 | length = node.edge_length
78 | if length is None:
79 | distance = None
80 | else:
81 | distance = node.distance_from_root()
82 |
83 | # Is this node a root?
84 | if parent_node is None:
85 | parent_label = None
86 | parent_node = None
87 | if length is None and branch_lengths_given:
88 | length = 0
89 | distance = 0
90 | type_ = 'root'
91 |
92 | # Set parent node label
93 | elif parent_node.is_internal():
94 | parent_label = str(parent_node.label)
95 |
96 | else:
97 | raise Exception("Subtree is not attached to tree?")
98 |
99 | # Add this node to the data.
100 | data['type'].append(type_)
101 | data['id'].append(id_)
102 | data['parent'].append(parent_label)
103 | data['length'].append(length)
104 | data['label'].append(label)
105 | data['distance'].append(distance)
106 |
107 | if use_uids:
108 | data['uid'].append(get_random_id(10))
109 |
110 | # Construct dataframe.
111 | df = pandas.DataFrame(data)
112 | return df
113 |
114 |
115 | def _read(
116 | filename=None,
117 | data=None,
118 | schema=None,
119 | add_node_labels=True,
120 | use_uids=True
121 | ):
122 | """Read a phylogenetic tree into a phylopandas.DataFrame.
123 |
124 | The resulting DataFrame has the following columns:
125 |     - label: label for each taxon or node.
126 |     - id: unique id (created by phylopandas) given to each node.
127 |     - type: type of node (leaf, internal, or root).
128 |     - parent: parent id, necessary for constructing trees.
129 | - length: length of branch from parent to node.
130 | - distance: distance from root.
131 |
132 | Parameters
133 | ----------
134 | filename: str (default is None)
135 | newick file to read into DataFrame.
136 |
137 | data: str (default is None)
138 | newick string to parse and read into DataFrame.
139 |
140 | add_node_labels: bool
141 | If true, labels the internal nodes with numbers.
142 |
143 | Returns
144 | -------
145 | df: phylopandas.DataFrame.
146 | """
147 | if filename is not None:
148 | # Use Dendropy to parse tree.
149 | tree = dendropy.Tree.get(
150 | path=filename,
151 | schema=schema,
152 | preserve_underscores=True)
153 | elif data is not None:
154 | tree = dendropy.Tree.get(
155 | data=data,
156 | schema=schema,
157 | preserve_underscores=True)
158 | else:
159 |         raise Exception('No tree given. Provide either `filename` or `data`.')
160 |
161 | df = _dendropy_to_dataframe(
162 | tree,
163 | add_node_labels=add_node_labels,
164 | use_uids=use_uids
165 | )
166 | return df
167 |
168 |
169 | def _read_method(schema):
170 | """Add a write method for named schema to a class.
171 | """
172 | def func(
173 | self,
174 | filename=None,
175 | data=None,
176 | add_node_labels=True,
177 | combine_on='index',
178 | use_uids=True,
179 | **kwargs):
180 | # Use generic write class to write data.
181 | df0 = self._data
182 | df1 = _read(
183 | filename=filename,
184 | data=data,
185 | schema=schema,
186 | add_node_labels=add_node_labels,
187 | use_uids=use_uids,
188 | **kwargs
189 | )
190 | return df0.phylo.combine(df1, on=combine_on)
191 |
192 | # Update docs
193 | func.__doc__ = _read_doc_template(schema)
194 | return func
195 |
196 |
197 | def _read_function(schema):
198 | """Add a write method for named schema to a class.
199 | """
200 | def func(
201 | filename=None,
202 | data=None,
203 | add_node_labels=True,
204 | use_uids=True,
205 | **kwargs):
206 | # Use generic write class to write data.
207 | return _read(
208 | filename=filename,
209 | data=data,
210 | schema=schema,
211 | add_node_labels=add_node_labels,
212 | use_uids=use_uids,
213 | **kwargs
214 | )
215 | # Update docs
216 | func.__doc__ = _read_doc_template(schema)
217 | return func
218 |
219 |
220 | def read_dendropy(
221 |     tree,
222 |     add_node_labels=True,
223 |     use_uids=True):
224 |     """Convert a DendroPy Tree object into a phylopandas DataFrame."""
225 | 
226 |     df = _dendropy_to_dataframe(
227 |         tree,
228 |         add_node_labels=add_node_labels,
229 |         use_uids=use_uids
230 |     )
231 |     return df
232 |
233 | read_newick = _read_function('newick')
234 | read_nexml = _read_function('nexml')
235 | read_nexus_tree = _read_function('nexus')
236 |
--------------------------------------------------------------------------------
/phylopandas/core.py:
--------------------------------------------------------------------------------
1 | # Import pandas
2 | import pandas as pd
3 | from pandas_flavor import register_dataframe_accessor, register_series_accessor
4 |
5 | from . import seqio
6 | from . import treeio
7 |
8 |
9 | try:
10 | from phylovega import TreeChart
11 | except ImportError:
12 | TreeChart = None
13 |
14 |
15 | @register_series_accessor('phylo')
16 | class PhyloPandasSeriesMethods(object):
17 | """
18 | """
19 | def __init__(self, data):
20 | self._data = data
21 |
22 | # -----------------------------------------------------------
23 | # Extra write methods.
24 | # -----------------------------------------------------------
25 |
26 | to_fasta = seqio.write._write_method('fasta')
27 | to_phylip = seqio.write._write_method('phylip')
28 | to_clustal = seqio.write._write_method('clustal')
29 | to_embl = seqio.write._write_method('embl')
30 | to_nexus = seqio.write._write_method('nexus')
31 | to_swiss = seqio.write._write_method('swiss')
32 | to_fastq = seqio.write._write_method('fastq')
33 | to_fasta_twoline = seqio.write._write_method('fasta-2line')
34 | to_phylip_sequential = seqio.write._write_method('phylip-sequential')
35 | to_phylip_relaxed = seqio.write._write_method('phylip-relaxed')
36 |
37 |
38 | @register_dataframe_accessor('phylo')
39 | class PhyloPandasDataFrameMethods(object):
40 | """PhyloPandas accessor to the Pandas DataFrame.
41 |
42 | This accessor adds reading/writing methods to the pandas DataFrame that
43 | are specific to phylogenetic data.
44 | """
45 | def __init__(self, data):
46 | self._data = data
47 |
48 | # -----------------------------------------------------------
49 | # Extra read methods.
50 | # -----------------------------------------------------------
51 |
52 | # Sequence file reading methods
53 | read_fasta = seqio.read._read_method('fasta')
54 | read_phylip = seqio.read._read_method('phylip')
55 | read_clustal = seqio.read._read_method('clustal')
56 | read_embl = seqio.read._read_method('embl')
57 | read_nexus_seq = seqio.read._read_method('nexus')
58 | read_swiss = seqio.read._read_method('swiss')
59 | read_fastq = seqio.read._read_method('fastq')
60 | read_fasta_twoline = seqio.read._read_method('fasta-2line')
61 | read_phylip_sequential = seqio.read._read_method('phylip-sequential')
62 | read_phylip_relaxed = seqio.read._read_method('phylip-relaxed')
63 |
64 | # Tree file reading methods.
65 | read_newick = treeio.read._read_method('newick')
66 | read_nexus_tree = treeio.read._read_method('nexus')
67 |
68 |     def read_dendropy(
69 |         self,
70 |         tree,
71 |         add_node_labels=True,
72 |         combine_on='index',
73 |         use_uids=True):
74 |         df0 = self._data
75 |         df1 = treeio.read.read_dendropy(
76 |             tree,
77 |             add_node_labels=add_node_labels,
78 |             use_uids=use_uids)
79 |         return df0.phylo.combine(df1, on=combine_on)
80 |
81 |
82 | # -----------------------------------------------------------
83 | # Extra write methods.
84 | # -----------------------------------------------------------
85 |
86 | to_fasta = seqio.write._write_method('fasta')
87 | to_phylip = seqio.write._write_method('phylip')
88 | to_clustal = seqio.write._write_method('clustal')
89 | to_embl = seqio.write._write_method('embl')
90 | to_nexus_seq = seqio.write._write_method('nexus')
91 | to_swiss = seqio.write._write_method('swiss')
92 | to_fastq = seqio.write._write_method('fastq')
93 | to_fasta_twoline = seqio.write._write_method('fasta-2line')
94 | to_phylip_sequential = seqio.write._write_method('phylip-sequential')
95 | to_phylip_relaxed = seqio.write._write_method('phylip-relaxed')
96 |
97 |     # Tree file writing methods.
98 | to_newick = treeio.write._write_method('newick')
99 | to_nexus_tree = treeio.write._write_method('nexus')
100 |
101 | def to_dendropy(
102 | self,
103 | taxon_col='uid',
104 | taxon_annotations=[],
105 | node_col='uid',
106 | node_annotations=[],
107 | branch_lengths=True):
108 | return treeio.write.to_dendropy(
109 | self._data,
110 | taxon_col=taxon_col,
111 | taxon_annotations=taxon_annotations,
112 | node_col=node_col,
113 | node_annotations=node_annotations,
114 | branch_lengths=branch_lengths,
115 | )
116 |
117 | # -----------------------------------------------------------
118 | # Useful dataframe methods specific to sequencing data.
119 | # -----------------------------------------------------------
120 |
121 | def match_value(self, column, value):
122 | """Return a subset dataframe that column values match the given value.
123 |
124 | Parameters
125 | ----------
126 | column : string
127 | column to search for matches
128 |
129 | value : float, int, list, etc.
130 | values to match.
131 | """
132 | # Get column
133 | col = self._data[column]
134 |
135 | # Get items in a list
136 | try:
137 | idx = col[col.isin(value)].index
138 |
139 | # Or value is a single item?
140 | except TypeError:
141 | idx = col[col == value].index
142 |
143 | return self._data.loc[idx]
144 |
145 |
146 | def combine(self, other, on='index'):
147 | """Combine two dataframes. Update the first dataframe with second.
148 | New columns are added to the right of the first dataframe. Overlapping
149 | columns update the values of the columns.
150 |
151 | Technical note: maintains order of columns, appending new dataframe to
152 | old.
153 |
154 | Parameters
155 | ----------
156 | other : DataFrame
157 | Index+Columns that match self will be updated with new values.
158 | New rows will be added separately.
159 |
160 | on : str
161 |             Column to use as the key when aligning and updating rows.
162 | """
163 | # Determine column labels for new dataframe (Maintain order of columns)
164 | column_idx = {k: None for k in self._data.columns}
165 | column_idx.update({k: None for k in other.columns})
166 | column_idx = list(column_idx.keys())
167 |
168 | df0 = self._data.copy()
169 | df1 = other.copy()
170 |
171 | # Set index to whatever column is given
172 | df0 = df0.set_index(on, inplace=False, drop=False)
173 | df1 = df1.set_index(on, inplace=False, drop=False)
174 |
175 | # Write out both dataframes to dictionaries
176 | data0 = df0.to_dict(orient="index")
177 | data1 = df1.to_dict(orient="index")
178 |
179 | # Update.
180 | for key in data1.keys():
181 | try:
182 | data0[key].update(data1[key])
183 | except KeyError:
184 | data0[key] = data1[key]
185 |
186 | # Build new dataframe
187 | df = pd.DataFrame(data0).T
188 |
189 | # Check for missing columns
190 | for key in column_idx:
191 | if key not in df.columns:
192 | df[key] = None
193 |
194 | # Reset the index.
195 | df.reset_index(inplace=True)
196 |
197 | # Return dataframe (maintaining original order)
198 | return df[column_idx]
199 |
200 |     def display(self, **kwargs):
201 |         """Display the tree using phylovega's TreeChart."""
202 |         # Show the tree using phylovega.
203 |         if TreeChart is None:
204 |             raise ImportError(
205 |                 "Looks like phylovega couldn't be imported. "
206 |                 "Is phylovega installed?")
207 |         return TreeChart(self._data.to_dict(orient='records'), **kwargs)
209 |
210 |
--------------------------------------------------------------------------------
/phylopandas/treeio/write.py:
--------------------------------------------------------------------------------
1 | import pandas
2 | import dendropy
3 |
4 | def _write_doc_template(schema):
5 | s = """Write to {} format.
6 |
7 | Parameters
8 | ----------
9 | filename : str
10 | File to write {} string to. If no filename is given, a {} string
11 | will be returned.
12 |
13 |     taxon_col : str (default='uid')
14 |         Column in the DataFrame used to label each taxon.
15 | 
16 |     taxon_annotations : list
17 |         List of columns to add as annotations on each taxon.
18 | 
19 |     node_col : str (default='uid')
20 |         Column in the DataFrame used to label each node.
21 | 
22 |     node_annotations : list
23 |         List of columns to add as annotations on each node.
24 | 
25 |     branch_lengths : bool (default=True)
26 |         If True, include branch lengths in the written tree.
27 | """.format(schema, schema, schema)
28 | return s
29 |
30 |
31 | def _pandas_df_to_dendropy_tree(
32 | df,
33 | taxon_col='uid',
34 | taxon_annotations=[],
35 | node_col='uid',
36 | node_annotations=[],
37 | branch_lengths=True,
38 | ):
39 | """Turn a phylopandas dataframe into a dendropy tree.
40 |
41 | Parameters
42 | ----------
43 | df : DataFrame
44 | DataFrame containing tree data.
45 |
46 |     taxon_col : str
47 |         Column in dataframe used to label each taxon.
48 | 
49 |     taxon_annotations : list
50 |         List of columns to add as annotations on each taxon.
51 | 
52 |     node_col : str
53 |         Column in dataframe used to label each node.
54 | 
55 |     node_annotations : list
56 |         List of columns to add as annotations on each node.
57 | 
58 |     branch_lengths : bool
59 |         If True, includes branch lengths.
60 | """
61 | if isinstance(taxon_col, str) is False:
62 | raise Exception("taxon_col must be a string.")
63 |
64 | if isinstance(node_col, str) is False:
65 | raise Exception("taxon_col must be a string.")
66 |
67 | # Construct a list of nodes from dataframe.
68 | taxon_namespace = dendropy.TaxonNamespace()
69 | nodes = {}
70 | for idx in df.index:
71 | # Get node data.
72 | data = df.loc[idx]
73 |
74 | # Get taxon for node (if leaf node).
75 | taxon = None
76 | if data['type'] == 'leaf':
77 | taxon = dendropy.Taxon(label=data[taxon_col])
78 | # Add annotations data.
79 | for ann in taxon_annotations:
80 | taxon.annotations.add_new(ann, data[ann])
81 | taxon_namespace.add_taxon(taxon)
82 |
83 | # Get label for node.
84 | label = data[node_col]
85 |
86 | # Get edge length.
87 | edge_length = None
88 | if branch_lengths is True:
89 | edge_length = data['length']
90 |
91 | # Build a node
92 | n = dendropy.Node(
93 | taxon=taxon,
94 | label=label,
95 | edge_length=edge_length
96 | )
97 |
98 | # Add node annotations
99 | for ann in node_annotations:
100 | n.annotations.add_new(ann, data[ann])
101 |
102 | nodes[idx] = n
103 |
104 | # Build branching pattern for nodes.
105 | root = None
106 | for idx, node in nodes.items():
107 | # Get node data.
108 | data = df.loc[idx]
109 |
110 | # Get children nodes
111 | children_idx = df[df['parent'] == data['id']].index
112 | children_nodes = [nodes[i] for i in children_idx]
113 |
114 | # Set child nodes
115 | nodes[idx].set_child_nodes(children_nodes)
116 |
117 | # Check if this is root.
118 | if data['parent'] is None:
119 | root = nodes[idx]
120 |
121 | # Build tree.
122 | tree = dendropy.Tree(
123 | seed_node=root,
124 | taxon_namespace=taxon_namespace
125 | )
126 | return tree
127 |
128 |
129 | def _write(
130 | df,
131 | filename=None,
132 | schema='newick',
133 | taxon_col='uid',
134 | taxon_annotations=[],
135 | node_col='uid',
136 | node_annotations=[],
137 | branch_lengths=True,
138 | **kwargs
139 | ):
140 | """Write a phylopandas tree DataFrame to various formats.
141 |
142 | Parameters
143 | ----------
144 | df : DataFrame
145 | DataFrame containing tree data.
146 |
147 | filename : str
148 | filepath to write out tree. If None, will return string.
149 |
150 | schema : str
151 | tree format to write out.
152 |
153 |     taxon_col : str
154 |         Column in dataframe used to label each taxon.
155 | 
156 |     taxon_annotations : list
157 |         List of columns to add as annotations on each taxon.
158 | 
159 |     node_col : str
160 |         Column in dataframe used to label each node.
161 | 
162 |     node_annotations : list
163 |         List of columns to add as annotations on each node.
164 | 
165 |     branch_lengths : bool
166 |         If True, includes branch lengths.
167 | """
168 | tree = _pandas_df_to_dendropy_tree(
169 | df,
170 | taxon_col=taxon_col,
171 | taxon_annotations=taxon_annotations,
172 | node_col=node_col,
173 | node_annotations=node_annotations,
174 | branch_lengths=branch_lengths,
175 | )
176 |
177 | # Write out format
178 | if filename is not None:
179 | tree.write(path=filename, schema=schema, suppress_annotations=False, **kwargs)
180 | else:
181 | return tree.as_string(schema=schema)
182 |
183 |
184 | def _write_method(schema):
185 | """Add a write method for named schema to a class.
186 | """
187 | def method(
188 | self,
189 | filename=None,
190 | schema=schema,
191 | taxon_col='uid',
192 | taxon_annotations=[],
193 | node_col='uid',
194 | node_annotations=[],
195 | branch_lengths=True,
196 | **kwargs):
197 | # Use generic write class to write data.
198 | return _write(
199 | self._data,
200 | filename=filename,
201 | schema=schema,
202 | taxon_col=taxon_col,
203 | taxon_annotations=taxon_annotations,
204 | node_col=node_col,
205 | node_annotations=node_annotations,
206 | branch_lengths=branch_lengths,
207 | **kwargs
208 | )
209 | # Update docs
210 | method.__doc__ = _write_doc_template(schema)
211 | return method
212 |
213 |
214 | def _write_function(schema):
215 | """Add a write method for named schema to a class.
216 | """
217 | def func(
218 | data,
219 | filename=None,
220 | schema=schema,
221 | taxon_col='uid',
222 | taxon_annotations=[],
223 | node_col='uid',
224 | node_annotations=[],
225 | branch_lengths=True,
226 | **kwargs):
227 | # Use generic write class to write data.
228 | return _write(
229 | data,
230 | filename=filename,
231 | schema=schema,
232 | taxon_col=taxon_col,
233 | taxon_annotations=taxon_annotations,
234 | node_col=node_col,
235 | node_annotations=node_annotations,
236 | branch_lengths=branch_lengths,
237 | **kwargs
238 | )
239 | # Update docs
240 | func.__doc__ = _write_doc_template(schema)
241 | return func
242 |
243 | def to_dendropy(
244 | data,
245 | taxon_col='uid',
246 | taxon_annotations=[],
247 | node_col='uid',
248 | node_annotations=[],
249 | branch_lengths=True):
250 | return _pandas_df_to_dendropy_tree(
251 | data,
252 | taxon_col=taxon_col,
253 | taxon_annotations=taxon_annotations,
254 | node_col=node_col,
255 | node_annotations=node_annotations,
256 | branch_lengths=branch_lengths,
257 | )
258 |
259 | to_newick = _write_function('newick')
260 | to_nexml = _write_function('nexml')
261 | to_nexus_tree = _write_function('nexus')
262 |
--------------------------------------------------------------------------------
/phylopandas/seqio/write.py:
--------------------------------------------------------------------------------
1 | __doc__ = """
2 | Functions for writing sequence data to sequence files.
3 | """
4 | import pandas as pd
5 |
6 | # Import Biopython
7 | from Bio import SeqIO
8 | from Bio.Seq import Seq
9 | from Bio.SeqRecord import SeqRecord
10 | import Bio.Alphabet
11 |
12 |
13 | def _write_doc_template(schema):
14 | s = """Write to {} format.
15 |
16 | Parameters
17 | ----------
18 | filename : str
19 | File to write {} string to. If no filename is given, a {} string
20 | will be returned.
21 |
22 | sequence_col : str (default='sequence')
23 | Sequence column name in DataFrame.
24 |
25 |     id_col : str (default='uid')
26 |         ID column name in DataFrame.
27 | 
28 |     extra_data : list (default=None)
29 |         Extra columns to include in the sequence description line.
30 | """.format(schema, schema, schema)
31 | return s
32 |
33 |
34 | def pandas_df_to_biopython_seqrecord(
35 | df,
36 | id_col='uid',
37 | sequence_col='sequence',
38 | extra_data=None,
39 | alphabet=None,
40 | ):
41 | """Convert pandas dataframe to biopython seqrecord for easy writing.
42 |
43 | Parameters
44 | ----------
45 | df : Dataframe
46 | Pandas dataframe to convert
47 |
48 | id_col : str
49 | column in dataframe to use as sequence label
50 |
51 | sequence_col str:
52 | column in dataframe to use as sequence data
53 |
54 | extra_data : list
55 | extra columns to use in sequence description line
56 |
57 | alphabet :
58 | biopython Alphabet object
59 |
60 | Returns
61 | -------
62 | seq_records :
63 | List of biopython seqrecords.
64 | """
65 | seq_records = []
66 |
67 | for i, row in df.iterrows():
68 | # Tries getting sequence data. If a TypeError at the seqrecord
69 | # creation is thrown, it is assumed that this row does not contain
70 | # sequence data and therefore the row is ignored.
71 | try:
72 | # Get sequence
73 | seq = Seq(row[sequence_col], alphabet=alphabet)
74 |
75 | # Get id
76 | id = row[id_col]
77 |
78 | # Build a description
79 | description = ""
80 | if extra_data is not None:
81 | description = " ".join([row[key] for key in extra_data])
82 |
83 | # Build a record
84 | record = SeqRecord(
85 | seq=seq,
86 | id=id,
87 | description=description,
88 | )
89 | seq_records.append(record)
90 | except TypeError:
91 | pass
92 |
93 | return seq_records
94 |
95 | def pandas_series_to_biopython_seqrecord(
96 | series,
97 | id_col='uid',
98 | sequence_col='sequence',
99 | extra_data=None,
100 | alphabet=None
101 | ):
102 | """Convert pandas series to biopython seqrecord for easy writing.
103 |
104 | Parameters
105 | ----------
106 | series : Series
107 | Pandas series to convert
108 |
109 | id_col : str
110 | column in dataframe to use as sequence label
111 |
112 | sequence_col : str
113 | column in dataframe to use as sequence data
114 |
115 | extra_data : list
116 | extra columns to use in sequence description line
117 |
118 | Returns
119 | -------
120 | seq_records :
121 | List of biopython seqrecords.
122 | """
123 | # Get sequence
124 | seq = Seq(series[sequence_col], alphabet=alphabet)
125 |
126 | # Get id
127 | id = series[id_col]
128 |
129 | # Build a description
130 | description = ""
131 | if extra_data is not None:
132 | description = " ".join([series[key] for key in extra_data])
133 |
134 | # Build a record
135 | record = SeqRecord(
136 | seq=seq,
137 | id=id,
138 | description=description,
139 | )
140 |
141 | seq_records = [record]
142 | return seq_records
143 |
144 | def _write(
145 | data,
146 | filename=None,
147 | schema='fasta',
148 | id_col='uid',
149 | sequence_col='sequence',
150 | extra_data=None,
151 | alphabet=None,
152 | **kwargs):
153 | """General write function. Write phylopanda data to biopython format.
154 |
155 | Parameters
156 | ----------
157 | filename : str
158 | File to write string to. If no filename is given, a string
159 | will be returned.
160 |
161 | sequence_col : str (default='sequence')
162 | Sequence column name in DataFrame.
163 |
164 |     id_col : str (default='uid')
165 |         ID column name in DataFrame.
166 | 
167 |     extra_data : list (default=None)
168 |         Extra columns to include in the sequence description line.
169 | """
170 | # Check Alphabet if given
171 | if alphabet is None:
172 | alphabet = Bio.Alphabet.Alphabet()
173 |
174 | elif alphabet in ['dna', 'rna', 'protein', 'nucleotide']:
175 | alphabet = getattr(Bio.Alphabet, 'generic_{}'.format(alphabet))
176 |
177 | else:
178 | raise Exception(
179 | "The alphabet is not recognized. Must be 'dna', 'rna', "
180 | "'nucleotide', or 'protein'.")
181 |
182 | # Build a list of records from a pandas DataFrame
183 | if type(data) is pd.DataFrame:
184 | seq_records = pandas_df_to_biopython_seqrecord(
185 | data,
186 | id_col=id_col,
187 | sequence_col=sequence_col,
188 | extra_data=extra_data,
189 | alphabet=alphabet,
190 | )
191 |
192 | # Build a record from a pandas Series
193 | elif type(data) is pd.Series:
194 | seq_records = pandas_series_to_biopython_seqrecord(
195 | data,
196 | id_col=id_col,
197 | sequence_col=sequence_col,
198 | extra_data=extra_data,
199 | alphabet=alphabet,
200 | )
201 |
202 | # Write to disk or return string
203 | if filename is not None:
204 | SeqIO.write(seq_records, filename, format=schema, **kwargs)
205 |
206 | else:
207 | return "".join([s.format(schema) for s in seq_records])
208 |
209 | def _write_method(schema):
210 | """Add a write method for named schema to a class.
211 | """
212 | def method(
213 | self,
214 | filename=None,
215 | schema=schema,
216 | id_col='uid',
217 | sequence_col='sequence',
218 | extra_data=None,
219 | alphabet=None,
220 | **kwargs):
221 | # Use generic write class to write data.
222 | return _write(
223 | self._data,
224 | filename=filename,
225 | schema=schema,
226 | id_col=id_col,
227 | sequence_col=sequence_col,
228 | extra_data=extra_data,
229 | alphabet=alphabet,
230 | **kwargs
231 | )
232 | # Update docs
233 | method.__doc__ = _write_doc_template(schema)
234 | return method
235 |
236 |
237 | def _write_function(schema):
238 | """Add a write method for named schema to a class.
239 | """
240 | def func(
241 | data,
242 | filename=None,
243 | schema=schema,
244 | id_col='uid',
245 | sequence_col='sequence',
246 | extra_data=None,
247 | alphabet=None,
248 | **kwargs):
249 | # Use generic write class to write data.
250 | return _write(
251 | data,
252 | filename=filename,
253 | schema=schema,
254 | id_col=id_col,
255 | sequence_col=sequence_col,
256 | extra_data=extra_data,
257 | alphabet=alphabet,
258 | **kwargs
259 | )
260 | # Update docs
261 | func.__doc__ = _write_doc_template(schema)
262 | return func
263 |
264 |
265 | # Write functions to various formats.
266 | to_fasta = _write_function('fasta')
267 | to_phylip = _write_function('phylip')
268 | to_clustal = _write_function('clustal')
269 | to_embl = _write_function('embl')
270 | to_nexus_seq = _write_function('nexus')
271 | to_swiss = _write_function('swiss')
272 | to_fastq = _write_function('fastq')
273 |
--------------------------------------------------------------------------------
/docs/_logo/logo-02.svg:
--------------------------------------------------------------------------------
1 |
2 |
196 |
--------------------------------------------------------------------------------
/docs/_logo/banner.svg:
--------------------------------------------------------------------------------
1 |
2 |
200 |
--------------------------------------------------------------------------------
/docs/_logo/logo-2.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
262 |
--------------------------------------------------------------------------------
/docs/_logo/logo.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
262 |
--------------------------------------------------------------------------------
/examples/intro-notebook.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Introduction to Phylopandas"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "Let me introduce you to PhyloPandas. A Pandas dataframe and interface for phylogenetics."
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": 1,
20 | "metadata": {},
21 | "outputs": [],
22 | "source": [
23 | "import pandas as pd"
24 | ]
25 | },
26 | {
27 | "cell_type": "code",
28 | "execution_count": 2,
29 | "metadata": {},
30 | "outputs": [],
31 | "source": [
32 | "import phylopandas as ph"
33 | ]
34 | },
35 | {
36 | "cell_type": "markdown",
37 | "metadata": {},
38 | "source": [
39 | "## Reading data"
40 | ]
41 | },
42 | {
43 | "cell_type": "markdown",
44 | "metadata": {},
45 | "source": [
46 | "Phylopandas comes with various `read_` methods to load phylogenetic data into a Pandas DataFrame.\n",
47 | "\n",
48 | "Check out the various formats by hitting `tab` after `read` in the cell below."
49 | ]
50 | },
51 | {
52 | "cell_type": "code",
53 | "execution_count": 3,
54 | "metadata": {},
55 | "outputs": [
56 | {
57 | "ename": "AttributeError",
58 | "evalue": "module 'phylopandas' has no attribute 'read_'",
59 | "output_type": "error",
60 | "traceback": [
61 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
62 | "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)",
63 | "\u001b[0;32m| \n", 297 | " | type | \n", 298 | "id | \n", 299 | "parent | \n", 300 | "length | \n", 301 | "label | \n", 302 | "distance | \n", 303 | "uid | \n", 304 | "
|---|---|---|---|---|---|---|---|
| 0 | \n", 309 | "root | \n", 310 | "0 | \n", 311 | "None | \n", 312 | "0.000000 | \n", 313 | "0 | \n", 314 | "0.000000 | \n", 315 | "kCIjFBZKXZ | \n", 316 | "
| 1 | \n", 319 | "leaf | \n", 320 | "Q8QUQ5_ISKNN/45-79 | \n", 321 | "0 | \n", 322 | "0.383764 | \n", 323 | "Q8QUQ5_ISKNN/45-79 | \n", 324 | "0.383764 | \n", 325 | "wKP5pcfIok | \n", 326 | "
| 2 | \n", 329 | "leaf | \n", 330 | "Q8QUQ6_ISKNN/37-75 | \n", 331 | "0 | \n", 332 | "0.934733 | \n", 333 | "Q8QUQ6_ISKNN/37-75 | \n", 334 | "0.934733 | \n", 335 | "Wi6ARQAOcw | \n", 336 | "
| 3 | \n", 339 | "node | \n", 340 | "1 | \n", 341 | "0 | \n", 342 | "0.489399 | \n", 343 | "1 | \n", 344 | "0.489399 | \n", 345 | "iKoRLPGtl6 | \n", 346 | "
| 4 | \n", 349 | "leaf | \n", 350 | "Q8QUQ5_ISKNN/123-157 | \n", 351 | "1 | \n", 352 | "1.145829 | \n", 353 | "Q8QUQ5_ISKNN/123-157 | \n", 354 | "1.635229 | \n", 355 | "RbLr5Hi2L9 | \n", 356 | "
| 5 | \n", 359 | "node | \n", 360 | "2 | \n", 361 | "1 | \n", 362 | "0.158686 | \n", 363 | "2 | \n", 364 | "0.648085 | \n", 365 | "pR3f9C8Ort | \n", 366 | "
| 6 | \n", 369 | "leaf | \n", 370 | "Q0E553_SFAVA/142-176 | \n", 371 | "2 | \n", 372 | "0.943087 | \n", 373 | "Q0E553_SFAVA/142-176 | \n", 374 | "1.591172 | \n", 375 | "8wbvqaG3jg | \n", 376 | "
| 7 | \n", 379 | "node | \n", 380 | "3 | \n", 381 | "2 | \n", 382 | "0.271689 | \n", 383 | "3 | \n", 384 | "0.919775 | \n", 385 | "sCUs3pJLK8 | \n", 386 | "
| 8 | \n", 389 | "leaf | \n", 390 | "Q0E553_SFAVA/184-218 | \n", 391 | "3 | \n", 392 | "0.989771 | \n", 393 | "Q0E553_SFAVA/184-218 | \n", 394 | "1.909546 | \n", 395 | "Lov4UJif6D | \n", 396 | "
| 9 | \n", 399 | "node | \n", 400 | "4 | \n", 401 | "3 | \n", 402 | "0.206426 | \n", 403 | "4 | \n", 404 | "1.126200 | \n", 405 | "5yDZXG1tyd | \n", 406 | "
| 10 | \n", 409 | "leaf | \n", 410 | "Q0E553_SFAVA/60-94 | \n", 411 | "4 | \n", 412 | "0.957061 | \n", 413 | "Q0E553_SFAVA/60-94 | \n", 414 | "2.083262 | \n", 415 | "5vdNmIFPXc | \n", 416 | "
| 11 | \n", 419 | "node | \n", 420 | "5 | \n", 421 | "4 | \n", 422 | "0.200542 | \n", 423 | "5 | \n", 424 | "1.326742 | \n", 425 | "WBQ9xPajWc | \n", 426 | "
| 12 | \n", 429 | "node | \n", 430 | "6 | \n", 431 | "5 | \n", 432 | "0.379082 | \n", 433 | "6 | \n", 434 | "1.705824 | \n", 435 | "ghTcY2ffpP | \n", 436 | "
| 13 | \n", 439 | "node | \n", 440 | "7 | \n", 441 | "6 | \n", 442 | "0.307241 | \n", 443 | "7 | \n", 444 | "2.013065 | \n", 445 | "Jppnt2XML0 | \n", 446 | "
| 14 | \n", 449 | "leaf | \n", 450 | "019R_FRG3G/5-39 | \n", 451 | "7 | \n", 452 | "0.067233 | \n", 453 | "019R_FRG3G/5-39 | \n", 454 | "2.080298 | \n", 455 | "TDxeDIcFpO | \n", 456 | "
| 15 | \n", 459 | "node | \n", 460 | "8 | \n", 461 | "7 | \n", 462 | "0.128148 | \n", 463 | "8 | \n", 464 | "2.141213 | \n", 465 | "6iGd5fwhap | \n", 466 | "
| 16 | \n", 469 | "leaf | \n", 470 | "019R_FRG3G/139-172 | \n", 471 | "8 | \n", 472 | "0.056904 | \n", 473 | "019R_FRG3G/139-172 | \n", 474 | "2.198117 | \n", 475 | "hyhVxB9tHU | \n", 476 | "
| 17 | \n", 479 | "node | \n", 480 | "9 | \n", 481 | "8 | \n", 482 | "0.619688 | \n", 483 | "9 | \n", 484 | "2.760901 | \n", 485 | "QnhYrilQej | \n", 486 | "
| 18 | \n", 489 | "leaf | \n", 490 | "019R_FRG3G/249-283 | \n", 491 | "9 | \n", 492 | "0.957730 | \n", 493 | "019R_FRG3G/249-283 | \n", 494 | "3.718631 | \n", 495 | "V4FBuc6aRj | \n", 496 | "
| 19 | \n", 499 | "leaf | \n", 500 | "019R_FRG3G/302-336 | \n", 501 | "9 | \n", 502 | "0.583613 | \n", 503 | "019R_FRG3G/302-336 | \n", 504 | "3.344514 | \n", 505 | "QKM2XtMt64 | \n", 506 | "
| 20 | \n", 509 | "node | \n", 510 | "10 | \n", 511 | "6 | \n", 512 | "0.000272 | \n", 513 | "10 | \n", 514 | "1.706096 | \n", 515 | "NIJbhPLOuF | \n", 516 | "
| 21 | \n", 519 | "node | \n", 520 | "11 | \n", 521 | "10 | \n", 522 | "0.205513 | \n", 523 | "11 | \n", 524 | "1.911610 | \n", 525 | "lk3EvVgXDt | \n", 526 | "
| 22 | \n", 529 | "leaf | \n", 530 | "VF232_IIV6/64-98 | \n", 531 | "11 | \n", 532 | "0.773389 | \n", 533 | "VF232_IIV6/64-98 | \n", 534 | "2.684999 | \n", 535 | "FAdWRQzPjL | \n", 536 | "
| 23 | \n", 539 | "node | \n", 540 | "12 | \n", 541 | "11 | \n", 542 | "0.093885 | \n", 543 | "12 | \n", 544 | "2.005495 | \n", 545 | "i7efTQ6vir | \n", 546 | "
| 24 | \n", 549 | "node | \n", 550 | "13 | \n", 551 | "12 | \n", 552 | "0.373670 | \n", 553 | "13 | \n", 554 | "2.379165 | \n", 555 | "1b9wK0j6kP | \n", 556 | "
| 25 | \n", 559 | "leaf | \n", 560 | "VF380_IIV6/7-45 | \n", 561 | "13 | \n", 562 | "0.561336 | \n", 563 | "VF380_IIV6/7-45 | \n", 564 | "2.940501 | \n", 565 | "M4uL7HMUUX | \n", 566 | "
| 26 | \n", 569 | "leaf | \n", 570 | "VF380_IIV3/8-47 | \n", 571 | "13 | \n", 572 | "0.643071 | \n", 573 | "VF380_IIV3/8-47 | \n", 574 | "3.022236 | \n", 575 | "3D2lkpDYdj | \n", 576 | "
| 27 | \n", 579 | "node | \n", 580 | "14 | \n", 581 | "12 | \n", 582 | "0.205226 | \n", 583 | "14 | \n", 584 | "2.210721 | \n", 585 | "MPQsyhpHRU | \n", 586 | "
| 28 | \n", 589 | "leaf | \n", 590 | "VF378_IIV6/4-38 | \n", 591 | "14 | \n", 592 | "0.315302 | \n", 593 | "VF378_IIV6/4-38 | \n", 594 | "2.526023 | \n", 595 | "PpUlPwYDu2 | \n", 596 | "
| 29 | \n", 599 | "leaf | \n", 600 | "O41158_PBCV1/63-96 | \n", 601 | "14 | \n", 602 | "0.460768 | \n", 603 | "O41158_PBCV1/63-96 | \n", 604 | "2.671490 | \n", 605 | "TdWSC3FiL6 | \n", 606 | "
| 30 | \n", 609 | "leaf | \n", 610 | "Q0E553_SFAVA/14-48 | \n", 611 | "10 | \n", 612 | "1.588348 | \n", 613 | "Q0E553_SFAVA/14-48 | \n", 614 | "3.294444 | \n", 615 | "99kd8VpIQk | \n", 616 | "
| 31 | \n", 619 | "node | \n", 620 | "15 | \n", 621 | "5 | \n", 622 | "0.362965 | \n", 623 | "15 | \n", 624 | "1.689707 | \n", 625 | "y1TJLF3YJa | \n", 626 | "
| 32 | \n", 629 | "leaf | \n", 630 | "Q8QUQ5_ISKNN/164-198 | \n", 631 | "15 | \n", 632 | "0.639072 | \n", 633 | "Q8QUQ5_ISKNN/164-198 | \n", 634 | "2.328779 | \n", 635 | "6MpoaU0KeN | \n", 636 | "
| 33 | \n", 639 | "leaf | \n", 640 | "Q8QUQ5_ISKNN/7-42 | \n", 641 | "15 | \n", 642 | "0.967432 | \n", 643 | "Q8QUQ5_ISKNN/7-42 | \n", 644 | "2.657139 | \n", 645 | "w84ibjT7xv | \n", 646 | "
| \n", 780 | " | type | \n", 781 | "id | \n", 782 | "parent | \n", 783 | "length | \n", 784 | "label | \n", 785 | "distance | \n", 786 | "uid | \n", 787 | "
|---|---|---|---|---|---|---|---|
| 0 | \n", 792 | "root | \n", 793 | "0 | \n", 794 | "None | \n", 795 | "0.000000 | \n", 796 | "0 | \n", 797 | "0.000000 | \n", 798 | "9x4F7nTLnY | \n", 799 | "
| 1 | \n", 802 | "leaf | \n", 803 | "Q8QUQ5_ISKNN/45-79 | \n", 804 | "0 | \n", 805 | "0.383764 | \n", 806 | "Q8QUQ5_ISKNN/45-79 | \n", 807 | "0.383764 | \n", 808 | "bhUZpMzqaw | \n", 809 | "
| 2 | \n", 812 | "leaf | \n", 813 | "Q8QUQ6_ISKNN/37-75 | \n", 814 | "0 | \n", 815 | "0.934733 | \n", 816 | "Q8QUQ6_ISKNN/37-75 | \n", 817 | "0.934733 | \n", 818 | "AGoLMJy4qb | \n", 819 | "
| 3 | \n", 822 | "node | \n", 823 | "1 | \n", 824 | "0 | \n", 825 | "0.489399 | \n", 826 | "1 | \n", 827 | "0.489399 | \n", 828 | "PEr58Pk7IB | \n", 829 | "
| 4 | \n", 832 | "leaf | \n", 833 | "Q8QUQ5_ISKNN/123-157 | \n", 834 | "1 | \n", 835 | "1.145829 | \n", 836 | "Q8QUQ5_ISKNN/123-157 | \n", 837 | "1.635229 | \n", 838 | "CQmpogxXrH | \n", 839 | "
| 5 | \n", 842 | "node | \n", 843 | "2 | \n", 844 | "1 | \n", 845 | "0.158686 | \n", 846 | "2 | \n", 847 | "0.648085 | \n", 848 | "4fGJ1yqAd6 | \n", 849 | "
| 6 | \n", 852 | "leaf | \n", 853 | "Q0E553_SFAVA/142-176 | \n", 854 | "2 | \n", 855 | "0.943087 | \n", 856 | "Q0E553_SFAVA/142-176 | \n", 857 | "1.591172 | \n", 858 | "W89uwOl3sK | \n", 859 | "
| 7 | \n", 862 | "node | \n", 863 | "3 | \n", 864 | "2 | \n", 865 | "0.271689 | \n", 866 | "3 | \n", 867 | "0.919775 | \n", 868 | "xCOwZZkfi5 | \n", 869 | "
| 8 | \n", 872 | "leaf | \n", 873 | "Q0E553_SFAVA/184-218 | \n", 874 | "3 | \n", 875 | "0.989771 | \n", 876 | "Q0E553_SFAVA/184-218 | \n", 877 | "1.909546 | \n", 878 | "gDFNACm9Vx | \n", 879 | "
| 9 | \n", 882 | "node | \n", 883 | "4 | \n", 884 | "3 | \n", 885 | "0.206426 | \n", 886 | "4 | \n", 887 | "1.126200 | \n", 888 | "BngfjtGSGI | \n", 889 | "
| 10 | \n", 892 | "leaf | \n", 893 | "Q0E553_SFAVA/60-94 | \n", 894 | "4 | \n", 895 | "0.957061 | \n", 896 | "Q0E553_SFAVA/60-94 | \n", 897 | "2.083262 | \n", 898 | "fRZuaBG9S3 | \n", 899 | "
| 11 | \n", 902 | "node | \n", 903 | "5 | \n", 904 | "4 | \n", 905 | "0.200542 | \n", 906 | "5 | \n", 907 | "1.326742 | \n", 908 | "NmlJLQbDRv | \n", 909 | "
| 12 | \n", 912 | "node | \n", 913 | "6 | \n", 914 | "5 | \n", 915 | "0.379082 | \n", 916 | "6 | \n", 917 | "1.705824 | \n", 918 | "lrdraHKZPu | \n", 919 | "
| 13 | \n", 922 | "node | \n", 923 | "7 | \n", 924 | "6 | \n", 925 | "0.307241 | \n", 926 | "7 | \n", 927 | "2.013065 | \n", 928 | "PFW37AvcYM | \n", 929 | "
| 14 | \n", 932 | "leaf | \n", 933 | "019R_FRG3G/5-39 | \n", 934 | "7 | \n", 935 | "0.067233 | \n", 936 | "019R_FRG3G/5-39 | \n", 937 | "2.080298 | \n", 938 | "mclkZI6LJJ | \n", 939 | "
| 15 | \n", 942 | "node | \n", 943 | "8 | \n", 944 | "7 | \n", 945 | "0.128148 | \n", 946 | "8 | \n", 947 | "2.141213 | \n", 948 | "L812ps7kEQ | \n", 949 | "
| 16 | \n", 952 | "leaf | \n", 953 | "019R_FRG3G/139-172 | \n", 954 | "8 | \n", 955 | "0.056904 | \n", 956 | "019R_FRG3G/139-172 | \n", 957 | "2.198117 | \n", 958 | "6qtDyUu3Xx | \n", 959 | "
| 17 | \n", 962 | "node | \n", 963 | "9 | \n", 964 | "8 | \n", 965 | "0.619688 | \n", 966 | "9 | \n", 967 | "2.760901 | \n", 968 | "jbaGKmQX58 | \n", 969 | "
| 18 | \n", 972 | "leaf | \n", 973 | "019R_FRG3G/249-283 | \n", 974 | "9 | \n", 975 | "0.957730 | \n", 976 | "019R_FRG3G/249-283 | \n", 977 | "3.718631 | \n", 978 | "ZM0EOpcIQT | \n", 979 | "
| 19 | \n", 982 | "leaf | \n", 983 | "019R_FRG3G/302-336 | \n", 984 | "9 | \n", 985 | "0.583613 | \n", 986 | "019R_FRG3G/302-336 | \n", 987 | "3.344514 | \n", 988 | "WQi85K0XJ9 | \n", 989 | "
| 20 | \n", 992 | "node | \n", 993 | "10 | \n", 994 | "6 | \n", 995 | "0.000272 | \n", 996 | "10 | \n", 997 | "1.706096 | \n", 998 | "m5gcndbJ8y | \n", 999 | "
| 21 | \n", 1002 | "node | \n", 1003 | "11 | \n", 1004 | "10 | \n", 1005 | "0.205513 | \n", 1006 | "11 | \n", 1007 | "1.911610 | \n", 1008 | "HTHRlWWbVk | \n", 1009 | "
| 22 | \n", 1012 | "leaf | \n", 1013 | "VF232_IIV6/64-98 | \n", 1014 | "11 | \n", 1015 | "0.773389 | \n", 1016 | "VF232_IIV6/64-98 | \n", 1017 | "2.684999 | \n", 1018 | "HKAm1CcD5f | \n", 1019 | "
| 23 | \n", 1022 | "node | \n", 1023 | "12 | \n", 1024 | "11 | \n", 1025 | "0.093885 | \n", 1026 | "12 | \n", 1027 | "2.005495 | \n", 1028 | "43NfhfKUkH | \n", 1029 | "
| 24 | \n", 1032 | "node | \n", 1033 | "13 | \n", 1034 | "12 | \n", 1035 | "0.373670 | \n", 1036 | "13 | \n", 1037 | "2.379165 | \n", 1038 | "pHJPwh7ew7 | \n", 1039 | "
| 25 | \n", 1042 | "leaf | \n", 1043 | "VF380_IIV6/7-45 | \n", 1044 | "13 | \n", 1045 | "0.561336 | \n", 1046 | "VF380_IIV6/7-45 | \n", 1047 | "2.940501 | \n", 1048 | "rrJHPnwZSf | \n", 1049 | "
| 26 | \n", 1052 | "leaf | \n", 1053 | "VF380_IIV3/8-47 | \n", 1054 | "13 | \n", 1055 | "0.643071 | \n", 1056 | "VF380_IIV3/8-47 | \n", 1057 | "3.022236 | \n", 1058 | "ZvZl8mCP8M | \n", 1059 | "
| 27 | \n", 1062 | "node | \n", 1063 | "14 | \n", 1064 | "12 | \n", 1065 | "0.205226 | \n", 1066 | "14 | \n", 1067 | "2.210721 | \n", 1068 | "uBIkldlUE1 | \n", 1069 | "
| 28 | \n", 1072 | "leaf | \n", 1073 | "VF378_IIV6/4-38 | \n", 1074 | "14 | \n", 1075 | "0.315302 | \n", 1076 | "VF378_IIV6/4-38 | \n", 1077 | "2.526023 | \n", 1078 | "OBk4WSlGu7 | \n", 1079 | "
| 29 | \n", 1082 | "leaf | \n", 1083 | "O41158_PBCV1/63-96 | \n", 1084 | "14 | \n", 1085 | "0.460768 | \n", 1086 | "O41158_PBCV1/63-96 | \n", 1087 | "2.671490 | \n", 1088 | "PO4PsryR5V | \n", 1089 | "
| 30 | \n", 1092 | "leaf | \n", 1093 | "Q0E553_SFAVA/14-48 | \n", 1094 | "10 | \n", 1095 | "1.588348 | \n", 1096 | "Q0E553_SFAVA/14-48 | \n", 1097 | "3.294444 | \n", 1098 | "P3GhB4vdqL | \n", 1099 | "
| 31 | \n", 1102 | "node | \n", 1103 | "15 | \n", 1104 | "5 | \n", 1105 | "0.362965 | \n", 1106 | "15 | \n", 1107 | "1.689707 | \n", 1108 | "HTXFsynsuZ | \n", 1109 | "
| 32 | \n", 1112 | "leaf | \n", 1113 | "Q8QUQ5_ISKNN/164-198 | \n", 1114 | "15 | \n", 1115 | "0.639072 | \n", 1116 | "Q8QUQ5_ISKNN/164-198 | \n", 1117 | "2.328779 | \n", 1118 | "56dIugXUfd | \n", 1119 | "
| 33 | \n", 1122 | "leaf | \n", 1123 | "Q8QUQ5_ISKNN/7-42 | \n", 1124 | "15 | \n", 1125 | "0.967432 | \n", 1126 | "Q8QUQ5_ISKNN/7-42 | \n", 1127 | "2.657139 | \n", 1128 | "G193mLw0d7 | \n", 1129 | "
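
The next table keeps only the rows whose `type` is `leaf`. A standard pandas boolean filter along these lines would produce it (an illustrative sketch; `df` is assumed to be the full tree DataFrame shown above):

```python
# Keep only the leaf rows of the tree DataFrame.
leaves = df[df['type'] == 'leaf']
leaves
```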
| \n", 1243 | " | type | \n", 1244 | "id | \n", 1245 | "parent | \n", 1246 | "length | \n", 1247 | "label | \n", 1248 | "distance | \n", 1249 | "uid | \n", 1250 | "
|---|---|---|---|---|---|---|---|
| 1 | \n", 1255 | "leaf | \n", 1256 | "Q8QUQ5_ISKNN/45-79 | \n", 1257 | "0 | \n", 1258 | "0.383764 | \n", 1259 | "Q8QUQ5_ISKNN/45-79 | \n", 1260 | "0.383764 | \n", 1261 | "bhUZpMzqaw | \n", 1262 | "
| 2 | \n", 1265 | "leaf | \n", 1266 | "Q8QUQ6_ISKNN/37-75 | \n", 1267 | "0 | \n", 1268 | "0.934733 | \n", 1269 | "Q8QUQ6_ISKNN/37-75 | \n", 1270 | "0.934733 | \n", 1271 | "AGoLMJy4qb | \n", 1272 | "
| 4 | \n", 1275 | "leaf | \n", 1276 | "Q8QUQ5_ISKNN/123-157 | \n", 1277 | "1 | \n", 1278 | "1.145829 | \n", 1279 | "Q8QUQ5_ISKNN/123-157 | \n", 1280 | "1.635229 | \n", 1281 | "CQmpogxXrH | \n", 1282 | "
| 6 | \n", 1285 | "leaf | \n", 1286 | "Q0E553_SFAVA/142-176 | \n", 1287 | "2 | \n", 1288 | "0.943087 | \n", 1289 | "Q0E553_SFAVA/142-176 | \n", 1290 | "1.591172 | \n", 1291 | "W89uwOl3sK | \n", 1292 | "
| 8 | \n", 1295 | "leaf | \n", 1296 | "Q0E553_SFAVA/184-218 | \n", 1297 | "3 | \n", 1298 | "0.989771 | \n", 1299 | "Q0E553_SFAVA/184-218 | \n", 1300 | "1.909546 | \n", 1301 | "gDFNACm9Vx | \n", 1302 | "
| 10 | \n", 1305 | "leaf | \n", 1306 | "Q0E553_SFAVA/60-94 | \n", 1307 | "4 | \n", 1308 | "0.957061 | \n", 1309 | "Q0E553_SFAVA/60-94 | \n", 1310 | "2.083262 | \n", 1311 | "fRZuaBG9S3 | \n", 1312 | "
| 14 | \n", 1315 | "leaf | \n", 1316 | "019R_FRG3G/5-39 | \n", 1317 | "7 | \n", 1318 | "0.067233 | \n", 1319 | "019R_FRG3G/5-39 | \n", 1320 | "2.080298 | \n", 1321 | "mclkZI6LJJ | \n", 1322 | "
| 16 | \n", 1325 | "leaf | \n", 1326 | "019R_FRG3G/139-172 | \n", 1327 | "8 | \n", 1328 | "0.056904 | \n", 1329 | "019R_FRG3G/139-172 | \n", 1330 | "2.198117 | \n", 1331 | "6qtDyUu3Xx | \n", 1332 | "
| 18 | \n", 1335 | "leaf | \n", 1336 | "019R_FRG3G/249-283 | \n", 1337 | "9 | \n", 1338 | "0.957730 | \n", 1339 | "019R_FRG3G/249-283 | \n", 1340 | "3.718631 | \n", 1341 | "ZM0EOpcIQT | \n", 1342 | "
| 19 | \n", 1345 | "leaf | \n", 1346 | "019R_FRG3G/302-336 | \n", 1347 | "9 | \n", 1348 | "0.583613 | \n", 1349 | "019R_FRG3G/302-336 | \n", 1350 | "3.344514 | \n", 1351 | "WQi85K0XJ9 | \n", 1352 | "
| 22 | \n", 1355 | "leaf | \n", 1356 | "VF232_IIV6/64-98 | \n", 1357 | "11 | \n", 1358 | "0.773389 | \n", 1359 | "VF232_IIV6/64-98 | \n", 1360 | "2.684999 | \n", 1361 | "HKAm1CcD5f | \n", 1362 | "
| 25 | \n", 1365 | "leaf | \n", 1366 | "VF380_IIV6/7-45 | \n", 1367 | "13 | \n", 1368 | "0.561336 | \n", 1369 | "VF380_IIV6/7-45 | \n", 1370 | "2.940501 | \n", 1371 | "rrJHPnwZSf | \n", 1372 | "
| 26 | \n", 1375 | "leaf | \n", 1376 | "VF380_IIV3/8-47 | \n", 1377 | "13 | \n", 1378 | "0.643071 | \n", 1379 | "VF380_IIV3/8-47 | \n", 1380 | "3.022236 | \n", 1381 | "ZvZl8mCP8M | \n", 1382 | "
| 28 | \n", 1385 | "leaf | \n", 1386 | "VF378_IIV6/4-38 | \n", 1387 | "14 | \n", 1388 | "0.315302 | \n", 1389 | "VF378_IIV6/4-38 | \n", 1390 | "2.526023 | \n", 1391 | "OBk4WSlGu7 | \n", 1392 | "
| 29 | \n", 1395 | "leaf | \n", 1396 | "O41158_PBCV1/63-96 | \n", 1397 | "14 | \n", 1398 | "0.460768 | \n", 1399 | "O41158_PBCV1/63-96 | \n", 1400 | "2.671490 | \n", 1401 | "PO4PsryR5V | \n", 1402 | "
| 30 | \n", 1405 | "leaf | \n", 1406 | "Q0E553_SFAVA/14-48 | \n", 1407 | "10 | \n", 1408 | "1.588348 | \n", 1409 | "Q0E553_SFAVA/14-48 | \n", 1410 | "3.294444 | \n", 1411 | "P3GhB4vdqL | \n", 1412 | "
| 32 | \n", 1415 | "leaf | \n", 1416 | "Q8QUQ5_ISKNN/164-198 | \n", 1417 | "15 | \n", 1418 | "0.639072 | \n", 1419 | "Q8QUQ5_ISKNN/164-198 | \n", 1420 | "2.328779 | \n", 1421 | "56dIugXUfd | \n", 1422 | "
| 33 | \n", 1425 | "leaf | \n", 1426 | "Q8QUQ5_ISKNN/7-42 | \n", 1427 | "15 | \n", 1428 | "0.967432 | \n", 1429 | "Q8QUQ5_ISKNN/7-42 | \n", 1430 | "2.657139 | \n", 1431 | "G193mLw0d7 | \n", 1432 | "
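
Because the result is an ordinary DataFrame, the usual pandas tools apply directly. For example, a sketch like the following (using only the `label` and `distance` columns shown above, with `leaves` as the filtered DataFrame) would report the leaf farthest from the root:

```python
# Illustrative: find the leaf with the largest root-to-tip distance.
farthest = leaves.loc[leaves['distance'].idxmax(), ['label', 'distance']]
print(farthest)
```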