├── LICENSE
└── README.md
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2019 Röst Lab
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # PythonProteomics
2 | This repository contains a list of open source Python tools for proteomics analysis. The list is very likely incomplete and we are happy to take pull requests with new tools.
3 |
4 | # Contents
5 |
6 | * [Introduction](#intro)
7 | * Packages
8 |   * Full Libraries
9 |     * [pyOpenMS](#pyOpenMS)
10 |     * [pyteomics](#pyteomics)
11 |     * [multiplierz](#multiplierz)
12 |     * [pyproteome](#pyproteome)
13 |   * Specialized packages
14 |     * [pyMzML](#pyMzML)
15 |     * [pyProphet](#pyProphet)
16 |     * [msproteomicstools](#msproteomicstools)
17 |     * [PaDuA](#PaDuA)
18 |     * [psims](#psims)
19 |     * [param-medic](#param-medic)
20 |
21 |
22 | # Intro
23 |
24 | Python is a versatile scripting language that is widely used in industry and academia. In bioinformatics, multiple packages support data analysis with Python, ranging from biological sequence analysis with Biopython, to structural modeling and visualization with packages like PyMOL and PyRosetta, to numerical computation and advanced plotting with NumPy/SciPy. In the proteomics community, Python began to be widely used around 2012, when several mature Python packages were published, including pymzML, Pyteomics and pyOpenMS. This has led to an ever-increasing interest in the Python programming language in the proteomics and mass spectrometry community. The number of publications referencing or using Python has risen eightfold since 2012 (compared with the same time period before 2012), with multiple open-source Python packages now supporting mass spectrometric data analysis and processing. Computing and data analysis in mass spectrometry is very diverse and in many cases must be tailored to a specific experiment. Often, multiple analysis steps (identification, quantification, post-translational modification analysis, filtering, FDR analysis, etc.) have to be performed in an analysis pipeline, which requires high flexibility in the analysis. This is where Python truly shines, due to its flexibility, visualization capabilities and the ability to extend computation with a large number of powerful libraries. Python can be used to quickly prototype software and to combine existing libraries into powerful analysis workflows, while avoiding the trap of reinventing the wheel for each new project.
25 |
26 | # Libraries
27 |
28 | ## Full libraries
29 |
30 | These packages contain full-fledged Python libraries that can process proteomics data.
31 |
32 |
33 | ### pyOpenMS
34 |
35 | [pyOpenMS](https://pyopenms.readthedocs.io/en/latest/) is an open-source Python library for mass spectrometry, specifically for the analysis of proteomics and metabolomics data. pyOpenMS implements a set of Python bindings to the OpenMS library for computational mass spectrometry and is available for Windows, Linux and macOS.
36 |
37 | * Code: https://github.com/OpenMS/OpenMS
38 | * Publication: https://www.ncbi.nlm.nih.gov/pubmed/24420968
39 |
40 |
41 | ### pyteomics
42 |
43 | [Pyteomics](https://pyteomics.readthedocs.io/en/latest/) is a collection of lightweight and handy tools for Python that help to handle various sorts of proteomics data. Pyteomics provides a growing set of modules to facilitate the most common tasks in proteomics data analysis.
44 |
45 | * Code: https://bitbucket.org/levitsky/pyteomics/src/default/
46 | * Publication: https://www.ncbi.nlm.nih.gov/pubmed/30576148
47 |
48 |
49 | ### multiplierz
50 | [multiplierz](https://github.com/BlaisProteomics/multiplierz) is a Python software library and associated GUI desktop environment for managing proteomic mass spectrometry workflows and data analysis. Using the mzAPI interface to native instrument data formats, multiplierz provides a complete toolset for a variety of methods for peptide identification, quantitation, and experimental reporting.
51 |
52 | * Code: https://github.com/BlaisProteomics/multiplierz
53 | * Publication: https://www.ncbi.nlm.nih.gov/pubmed/28686798
54 |
55 |
56 | ### pyproteome
57 |
58 | [pyproteome](https://github.com/white-lab/pyproteome): Python library for analyzing mass spectrometry proteomics data.
59 |
60 | * Code: https://github.com/white-lab/pyproteome
61 |
62 | ## Specialized packages
63 |
64 | These packages contain specialized functions that help with a specific aspect of proteomics data processing.
65 |
66 |
67 | ### pyMzML
68 |
69 | [pyMzML](https://pymzml.readthedocs.io/en/latest/): Module to parse mzML data in Python based on cElementTree.
70 |
71 | * Code: https://github.com/pymzml/pymzML
72 | * Publication: https://www.ncbi.nlm.nih.gov/pubmed/29394323
73 |
74 |
75 | ### pyProphet
76 |
77 | [PyProphet](https://github.com/PyProphet/pyprophet): Semi-supervised learning and scoring of OpenSWATH results.
78 |
79 | * Code: https://github.com/PyProphet/pyprophet
80 |
81 |
82 | ### msproteomicstools
83 |
84 | [msproteomicstools](http://msproteomicstools.roestlab.org/) is a Python library that can be used in LC-MS/MS-based proteomics. It features a core library called msproteomicstoolslib, several associated executable scripts that use the library, and a GUI for visualizing chromatograms, specifically output from OpenSWATH.
85 |
86 | * Code: https://github.com/msproteomicstools/msproteomicstools
87 |
88 |
89 | ### PaDuA
90 | [PaDuA](https://padua.readthedocs.io/en/latest/) is a Python package to simplify the processing and analysis of quantified proteomics data. Currently it supports processing and analysis of MaxQuant outputs, providing many of the features available in the GUI analysis tool Perseus. By scripting these processing and analysis steps you can get to your results more quickly and reproducibly.
91 |
92 | * Code: https://github.com/mfitzp/padua
93 | * Publication: https://www.ncbi.nlm.nih.gov/pubmed/30525654
94 |
95 |
96 | ### psims
97 |
98 | [psims](https://mobiusklein.github.io/psims/docs/build/html/): A declarative API for writing XML documents for HUPO PSI-MS mzML and mzIdentML.
99 |
100 | * Code: https://mobiusklein.github.io/psims/
101 | * Publication: https://www.ncbi.nlm.nih.gov/pubmed/30563850
102 |
103 |
104 | ### param-medic
105 |
106 | [param-medic](https://github.com/dhmay/param-medic): Param-Medic breathes new life into MS/MS database searches by optimizing search parameter settings for your data.
107 |
108 | * Code: https://github.com/dhmay/param-medic
109 | * Publication: https://www.ncbi.nlm.nih.gov/pubmed/30714740
110 |
--------------------------------------------------------------------------------