├── .gitignore
├── CODE_OF_CONDUCT.md
├── LICENSE
└── README.md


/.gitignore:
--------------------------------------------------------------------------------
 1 | 
 2 | # Created by https://www.gitignore.io/api/r,microsoftoffice
 3 | 
 4 | ### MicrosoftOffice ###
 5 | *.tmp
 6 | 
 7 | # Word temporary
 8 | ~$*.doc*
 9 | 
10 | # Excel temporary
11 | ~$*.xls*
12 | 
13 | # Excel Backup File
14 | *.xlk
15 | 
16 | # PowerPoint temporary
17 | ~$*.ppt*
18 | 
19 | # Visio autosave temporary files
20 | *.~vsd*
21 | 
22 | ### R ###
23 | # History files
24 | .Rhistory
25 | .Rapp.history
26 | 
27 | # Session Data files
28 | .RData
29 | 
30 | # Example code in package build process
31 | *-Ex.R
32 | 
33 | # Output files from R CMD build
34 | /*.tar.gz
35 | 
36 | # Output files from R CMD check
37 | /*.Rcheck/
38 | 
39 | # RStudio files
40 | .Rproj.user/
41 | 
42 | # produced vignettes
43 | vignettes/*.html
44 | vignettes/*.pdf
45 | 
46 | # OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
47 | .httr-oauth
48 | 
49 | # knitr and R markdown default cache directories
50 | /*_cache/
51 | /cache/
52 | 
53 | # Temporary files created by R markdown
54 | *.utf8.md
55 | *.knit.md
56 | 
57 | 
58 | # End of https://www.gitignore.io/api/r,microsoftoffice
59 | 
60 | 


--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
 1 | # Contributor Covenant Code of Conduct
 2 | 
 3 | ## Our Pledge
 4 | 
 5 | In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation.
 6 | 
 7 | ## Our Standards
 8 | 
 9 | Examples of behavior that contributes to creating a positive environment include:
10 | 
11 | * Using welcoming and inclusive language
12 | * Being respectful of differing viewpoints and experiences
13 | * Gracefully accepting constructive criticism
14 | * Focusing on what is best for the community
15 | * Showing empathy towards other community members
16 | 
17 | Examples of unacceptable behavior by participants include:
18 | 
19 | * The use of sexualized language or imagery and unwelcome sexual attention or advances
20 | * Trolling, insulting/derogatory comments, and personal or political attacks
21 | * Public or private harassment
22 | * Publishing others' private information, such as a physical or electronic address, without explicit permission
23 | * Other conduct which could reasonably be considered inappropriate in a professional setting
24 | 
25 | ## Our Responsibilities
26 | 
27 | Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
28 | 
29 | Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
30 | 
31 | ## Scope
32 | 
33 | This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.
34 | 
35 | ## Enforcement
36 | 
37 | Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at wilmarth@ohsu.edu. The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
38 | 
39 | Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.
40 | 
41 | ## Attribution
42 | 
43 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version]
44 | 
45 | [homepage]: http://contributor-covenant.org
46 | [version]: http://contributor-covenant.org/version/1/4/
47 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2018 Phillip Wilmarth
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # TMT_analysis_examples (and some spectral counting)
  2 | ## Examples of TMT data analyses using Jupyter notebooks and R
  3 | ## (also some spectral counting analyses)
  4 | #### Phillip Wilmarth
  5 | #### Oregon Health & Science University, PSR Core
  6 | #### 2018, 2019
  7 | 
  8 | ## Other repositories that may be helpful:
  9 | * [Multi-TMT experiments and IRS normalization](https://github.com/pwilmart/IRS_normalization.git)
 10 | * [Validation of the IRS method](https://github.com/pwilmart/IRS_validation.git)
 11 | 
 12 | ---
 13 | 
 14 | ## Repositories, notebooks, and descriptions:
 15 | ### (rendered HTML of [Jupyter notebooks](http://jupyter.org) are in the links below)
 16 | ### TMT data:
 17 | 
 18 | ### (1) [MaxQuant_and_PAW repository](https://github.com/pwilmart/MaxQuant_and_PAW.git)
 19 | #### [PAW analysis Notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/KUR1502_PAW.html)
 20 | #### [Compare to t-test Notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/KUR1502_PAW_t-test.html)
 21 | #### [Compare to limma Notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/KUR1502_PAW_limma.html)
 22 | #### [Compare to limma-voom Notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/KUR1502_PAW_limma-voom.html)
 23 | #### [MQ analysis Notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/KUR1502_MQ.html)
 24 | ##### [older Notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/KUR1502_PAW.html)
 25 | 
 26 | Comparison of 3 control versus 4 treatment samples from mouse cell cultures. The data are from SPS MS3 acquisition on a Thermo Fusion using TMT 10-plex. Data were processed with two pipelines: MaxQuant (v1.6.5.0) and an OHSU in-house pipeline (Comet/PAW). The output of each pipeline was analyzed separately in notebooks with similar layouts, so the two notebooks can be opened side by side for an easy head-to-head comparison.
 27 | 
 28 | There is also a comparison of the edgeR statistical testing to a two-sample t-test and to limma (with and without using `voom` variance modeling) to clearly demonstrate the improved performance of newer statistical tools.
 29 | 
 30 | Analyses started with protein reports from each pipeline (files are in the repository folders). R analysis scripts and Jupyter notebooks were used for analyses. The data is from the publication below.
 31 | 
 32 | > Huan, J., Hornick, N.I., Goloviznina, N.A., Kamimae-Lanning, A.N., David, L.L., Wilmarth, P.A., Mori, T., Chevillet, J.R., Narla, A., Roberts Jr, C.T. and Loriaux, M.M., 2015. Coordinate regulation of residual bone marrow function by paracrine trafficking of AML exosomes. Leukemia, 29(12), p.2285.
 33 | 
 34 | ### (2) [Dilution_series repository](https://github.com/pwilmart/Dilution_series)
 35 | #### [Notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/MAN1353_peptides_proteins.html)
 36 | 
 37 | **Updated January 1, 2019.** Analysis of a dilution series to compare the properties of reporter ions at the PSM, the peptide, and the protein levels. This looks at the advantages of aggregating TMT reporter ions into protein intensities. It also explores data normalization in edgeR with TMM. Updated with better R scripting and more use of ggplot.
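
As a rough illustration of what aggregating reporter ions into protein intensities can look like, here is a minimal Python sketch. It assumes aggregation by simple summation of every PSM assigned to a protein, which may differ from what the notebook does. The function name `aggregate_to_proteins`, the input layout, and the toy values are all hypothetical.

```python
def aggregate_to_proteins(psms):
    """Sum reporter ion intensities of all PSMs assigned to each protein.

    psms: iterable of (protein_id, [intensity per TMT channel]) tuples.
          This layout is a hypothetical one chosen for the sketch.
    Returns {protein_id: [summed intensity per channel]}.
    """
    totals = {}
    for protein, channels in psms:
        if protein not in totals:
            totals[protein] = [0.0] * len(channels)
        if len(channels) != len(totals[protein]):
            raise ValueError("all PSMs must have the same number of channels")
        for i, value in enumerate(channels):
            totals[protein][i] += value
    return totals


# Toy data (made up): two PSMs for protein P1, one PSM for protein P2.
psms = [
    ("P1", [100.0, 200.0, 400.0]),
    ("P1", [10.0, 25.0, 35.0]),
    ("P2", [50.0, 50.0, 100.0]),
]
protein_totals = aggregate_to_proteins(psms)
```

Summing lets the brighter PSMs, which usually carry less relative noise, dominate the protein total. That is one plausible reason aggregated protein intensities behave better than individual PSM values.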
 38 | 
 39 | ### (3) [Multiple_TMT_MQ repository](https://github.com/pwilmart/Multiple_TMT_MQ.git)
 40 | #### [Notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/multiple_TMT_MQ.html)
 41 | 
 42 | Analysis of the mouse lens development data with MaxQuant. The study used three TMT experiments, and the notebook demonstrates how to match the data between them. The focus is on normalization methods. Statistical testing is not explored since that was done in the other repository referenced above.
 43 | 
 44 | > Khan, S.Y., Ali, M., Kabir, F., Renuse, S., Na, C.H., Talbot, C.C., Hackett, S.F. and Riazuddin, S.A., 2018. Proteome Profiling of Developing Murine Lens Through Mass Spectrometry. Investigative Ophthalmology & Visual Science, 59(1), pp.100-107.
 45 | 
 46 | ### (4) [JPR-201712_MS2-MS3 repository](https://github.com/pwilmart/JPR-201712_MS2-MS3)
 47 | #### [First analysis notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/MS2MS3_peptides_proteins.html)
 48 | 
 49 | Looks at PSM, peptide, and protein level data. Analysis performed with MaxQuant (v1.5.7.4).
 50 | 
 51 | #### [Second analysis notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/JPR-2017_E-coli_MS2-MS3.html)
 52 | 
 53 | Compares MS2 reporter ion data to SPS MS3 reporter ion data. Analysis done with PAW pipeline.
 54 | 
 55 | #### [serum analysis notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/JPR-2017_serum.html)
 56 | 
 57 | Analysis of depleted serum labeled with 10-plex TMT and processed on a Q-Exactive. Analysis done with PAW pipeline.
 58 | 
 59 | Data is from this publication:
 60 | 
 61 | > D’Angelo, G., Chaerkady, R., Yu, W., Hizal, D.B., Hess, S., Zhao, W., Lekstrom, K., Guo, X., White, W.I., Roskos, L. and Bowen, M.A., 2017. Statistical models for the analysis of isobaric tags multiplexed quantitative proteomics. Journal of proteome research, 16(9), pp.3124-3136.
 62 | 
 63 | ### (5) [Gygi Lab Yeast triple knockout repository](https://github.com/pwilmart/Yeast_triple_KO_TMT)
 64 | #### [Triple_KO notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/Triple_KO.html)
 65 | 
 66 | Re-analysis of yeast triple knockout TMT data from the Gygi lab.
 67 | 
 68 | > Paulo, J.A., O’Connell, J.D. and Gygi, S.P., 2016. A triple knockout (TKO) proteomics standard for diagnosing ion interference in isobaric labeling experiments. Journal of the American Society for Mass Spectrometry, 27(10), pp.1620-1625.
 69 | 
 70 | ### (6) [Plubell_2017_PAW repository](https://github.com/pwilmart/Plubell_2017_PAW.git)
 71 | #### [Notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/auto_finder_PAW.html)
 72 | 
 73 | Re-analysis of the data in the original IRS paper using PAW/Comet, and other workflows. The original experiment was four 10-plex TMT labelings with the two internal reference pooled standards randomly assigned different channels in each plex. The first notebook explores how to make sure that those two channels in each plex are correctly determined before doing the IRS normalization.
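
For orientation, here is a minimal Python sketch of the general IRS idea once the pooled-standard channels in each plex are known. It is written under stated assumptions and is not the code used in the notebooks. It assumes that the per-plex reference for a protein is the average of its pooled-standard channels, that the target is the geometric mean of those references across plexes, and that only proteins seen in every plex are kept. The function name `irs_normalize`, the data layout, and the toy numbers are hypothetical.

```python
import math


def irs_normalize(plexes, ref_channels):
    """Scale each protein in each plex so the pooled-standard references agree.

    plexes: {plex_name: {protein_id: [intensity per channel]}} (hypothetical layout)
    ref_channels: {plex_name: [indices of the pooled-standard channels]}
    Returns the same structure, restricted to proteins present in every plex.
    """
    # Per-plex reference intensity: average of the pooled-standard channels.
    refs = {}
    for plex, table in plexes.items():
        idx = ref_channels[plex]
        refs[plex] = {
            prot: sum(vals[i] for i in idx) / len(idx)
            for prot, vals in table.items()
        }

    # Only proteins observed in every plex can be put on a common scale.
    shared = set.intersection(*(set(table) for table in plexes.values()))

    normalized = {plex: {} for plex in plexes}
    for prot in shared:
        ref_values = [refs[plex][prot] for plex in plexes]
        if any(v <= 0 for v in ref_values):
            continue  # a zero reference gives no usable scaling factor
        # Target value: geometric mean of the references across plexes.
        target = math.exp(sum(math.log(v) for v in ref_values) / len(ref_values))
        for plex, table in plexes.items():
            factor = target / refs[plex][prot]
            normalized[plex][prot] = [v * factor for v in table[prot]]
    return normalized


# Toy data (made up): the pooled standards sit in different channels per plex.
plexes = {
    "A": {"P1": [100.0, 100.0, 50.0], "P2": [5.0, 5.0, 5.0]},
    "B": {"P1": [20.0, 400.0, 400.0]},
}
ref_channels = {"A": [0, 1], "B": [1, 2]}
normalized = irs_normalize(plexes, ref_channels)
```

In the toy data, the references for P1 are 100 in plex A and 400 in plex B. Both plexes are scaled to their geometric mean of 200, and P2 is dropped because it is missing from plex B.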
 74 | 
 75 | > Plubell, D.L., Wilmarth, P.A., Zhao, Y., Fenton, A.M., Minnier, J., Reddy, A.P., Klimek, J., Yang, X., David, L.L. and Pamir, N., 2017. Extended multiplexing of tandem mass tags (TMT) labeling reveals age and high fat diet specific proteome changes in mouse epididymal adipose tissue. Molecular & Cellular Proteomics, 16(5), pp.873-890.
 76 | 
 77 | ### (7) [IRS_validation repository](https://github.com/pwilmart/IRS_validation.git)
 78 | #### [auto_finder notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/auto_finder_BIND-473.html)
 79 | #### [IRS_validation notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/IRS_validation.html)
 80 | 
 81 | Thorough testing and validation of the IRS method using reference channel data from a 77-channel TMT experiment.
 82 | 
 83 | ### (8) [Yeast_CarbonSources repository](https://github.com/pwilmart/Yeast_CarbonSources.git)
 84 | #### [CarbonSources_part-1 notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/CarbonSources_part-1.html)
 85 | #### [CarbonSources_MQ notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/CarbonSources_MQ.html)
 86 | 
 87 | This is an analysis of a public dataset ([PRIDE PXD002875](https://www.ebi.ac.uk/pride/archive/projects/PXD002875)) from Paulo, O'Connell, Gaun, and Gygi processed with the PAW pipeline using Comet. The part-1 notebook covers basic TMT data sanity checks, normalization, and basic statistical testing with edgeR. Part-2 will explore how much the numbers of differential candidates can vary with statistical test choices.
 88 | 
 89 | I added MaxQuant 1.6.3.3 processing of the same RAW files, worked up in a very similar notebook.
 90 | 
 91 | > Paulo, J.A., O’Connell, J.D., Gaun, A. and Gygi, S.P., 2015. Proteome-wide quantitative multiplexed profiling of protein expression: carbon-source dependency in Saccharomyces cerevisiae. Molecular biology of the cell, 26(22), pp.4063-4074.
 92 | 
 93 | ### (9) [BCP-ALL_QE-TMT_Nat-Comm-2019 repository](https://github.com/pwilmart/BCP-ALL_QE-TMT_Nat-Comm-2019.git)
 94 | #### HTML files:
 95 | ##### [balanced study averages notebook](https://pwilmart.github.io/TMT_analysis_examples/Nat-Comm-2019_TMT_QE_averages.html) - slightly better IRS using plex averages
 96 | 
 97 | ##### [single pooled standard notebook](https://pwilmart.github.io/TMT_analysis_examples/Nat-Comm-2019_TMT_QE_pools.html) - single pooled internal standards have a little more uncertainty
 98 | 
 99 | Re-analysis of data from a childhood acute lymphoblastic leukemia study published in Nat. Comm. in April 2019. Demonstrates an independent analysis of the 216 Q-Exactive RAW files where MS2 reporter ions are kept in their natural intensity scale instead of ratio transformations. Natural intensity scales are more informative than ratios and have more options for statistical testing. Data from the publication below.
100 | 
101 | > Yang, M., Vesterlund, M., Siavelis, I., Moura-Castro, L.H., Castor, A., Fioretos, T., Jafari, R., Lilljebjörn, H., Odom, D.T., Olsson, L. and Ravi, N., 2019. Proteogenomics and Hi-C reveal transcriptional dysregulation in high hyperdiploid childhood acute lymphoblastic leukemia. Nature communications, 10(1), p.1519.
102 | 
103 | ### (10) [SPS-MS3_vs_MS2 repository](https://github.com/pwilmart/SPS-MS3_vs_MS2_TMT.git)
104 | #### HTML files:
105 | #### [Combined notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/MORG-75_combined.html) - combined TMT data from both Fusion and Q-Exactive HF platforms
106 | 
107 | #### [Fusion notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/MORG-75_Fusion.html) - analysis of TMT data from SPS MS3 acquisition on a Fusion Tribrid
108 | 
109 | #### [Q-Exactive notebook HTML file](https://pwilmart.github.io/TMT_analysis_examples/MORG-75_QE.html) - analysis of MS2 TMT data from a Q-Exactive HF instrument
110 | 
111 | The same 10-plex TMT labeled samples were run on a Thermo Fusion Tribrid instrument using the SPS MS3 data acquisition method and on a Q-Exactive HF using MS2 reporter ions. Each platform's data are analyzed in separate notebooks and in a combined notebook (a more head-to-head workup). Each dataset used two TMT plexes to accommodate the 14 biological samples. [IRS normalization](https://github.com/pwilmart/IRS_normalization) ([also see this](https://github.com/pwilmart/IRS_validation)) was used to combine plexes for each platform. The [PAW pipeline](https://github.com/pwilmart/PAW_pipeline) was flexible enough to apply the IRS method to data from both platforms and allow a combined analysis. _(added 20200123)_
112 | 
113 | ---
114 | ### Spectral counting data:
115 | 
116 | ### (1) [ABRF_iPRG_2015_SpC repository](https://github.com/pwilmart/ABRF_iPRG_2015_SpC.git)
117 | #### [ABRF_2015_edgeR notebook file](https://pwilmart.github.io/TMT_analysis_examples/ABRF_2015_edgeR.html)
118 | 
119 | This is a re-analysis of a spectral counting public dataset ([MassIVE MSV000079843](https://massive.ucsd.edu/ProteoSAFe/dataset.jsp?task=eccf4bd3e86a4f79af468b0010eb80b0)). The data is from the ABRF iPRG 2015 study and is described in (Choi-2017). Six proteins were prepared in 4 different abundance mixes and spiked into a yeast cell lysate background. Each of the 4 different spike-in experiments was analyzed in triplicate on a Q-Exactive instrument. Each sample was analyzed using a single LC run.
120 | 
121 | > Choi, M., Eren-Dogu, Z.F., Colangelo, C., Cottrell, J., Hoopmann, M.R., Kapp, E.A., Kim, S., Lam, H., Neubert, T.A., Palmblad, M. and Phinney, B.S., 2017. ABRF Proteome Informatics Research Group (iPRG) 2015 Study: Detection of Differentially Abundant Proteins in Label-Free Quantitative LC–MS/MS Experiments. Journal of proteome research, 16(2), pp.945-957.
122 | 
123 | ### (2) [Smith_SpC_2018 repository](https://github.com/pwilmart/Smith_SpC_2018.git)
124 | #### [Smith_2018_edgeR notebook file](https://pwilmart.github.io/TMT_analysis_examples/Smith_2018_edgeR.html)
125 | 
126 | This is a re-analysis of a large spectral counting public dataset ([PXD005972](https://www.ebi.ac.uk/pride/archive/projects/PXD005972)). The data is from this [recent study](https://www.sciencedirect.com/science/article/pii/S0002939418301193) where human retinal and choroidal endothelial cells were compared. The study used 5 donor eyes, with retinal and choroidal cells collected and cultured from each eye in a paired design. The cell lysates from each of the 10 cell cultures were profiled using large-scale separations with a fast-scanning linear ion trap.
127 | 
128 | > Smith, J.R., David, L.L., Appukuttan, B. and Wilmarth, P.A., 2018. Angiogenic and Immunologic Proteins Identified by Deep Proteomic Profiling of Human Retinal and Choroidal Vascular Endothelial Cells: Potential Targets for New Biologic Drugs. American journal of ophthalmology, 193, pp.197-229.
129 | 
130 | ### (3) [Sea_lion_urine_SpC repository](https://github.com/pwilmart/Sea_lion_urine_SpC.git)
131 | #### [PXD009019_average_missing notebook file](https://pwilmart.github.io/TMT_analysis_examples/PXD009019_average_missing.html) - Determining the low SpC cutoff
132 | #### [PXD009019_QC_check notebook file](https://pwilmart.github.io/TMT_analysis_examples/PXD009019_QC_check.html) - Checking for outlier samples
133 | #### [PXD009019_SpC_DE notebook file](https://pwilmart.github.io/TMT_analysis_examples/PXD009019_SpC_DE.html) - Main differential expression testing
134 | 
135 | This is a re-analysis of a sea lion urine dataset ([PXD009019](https://www.ebi.ac.uk/pride/archive/projects/PXD009019)). The spectral counting analysis covers all processing in an end-to-end fashion. This shows how to approach data from non-model organisms and how to make sure that various analysis choices are double-checked. The README.md file for the repository is quite detailed and should be read in addition to the notebooks. The publication is listed below:
136 | 
137 | > Neely, B.A., Prager, K.C., Bland, A.M., Fontaine, C., Gulland, F.M. and Janech, M.G., 2018. Proteomic Analysis of Urine from California Sea Lions (Zalophus californianus): a Resource for Urinary Biomarker Discovery. Journal of proteome research, 17(9), pp.3281-3291.
138 | 


--------------------------------------------------------------------------------