├── CONTRIBUTE.md
├── LICENSE
├── TODO.md
└── README.md

/CONTRIBUTE.md:
--------------------------------------------------------------------------------
1 | Contributions and suggestions are welcome! Simply fork this repository, make a change, and submit a pull request. Please ensure your pull request adheres to the following guidelines:
2 |
3 | - This is an open source compilation; please check that the license of the software is suitable.
4 | - Please search previous suggestions before making a new one, as yours may be a duplicate.
5 | - Please make an individual pull request for *each* suggestion.
6 | - Use the following format: [RESOURCE](LINK) - [language(s)] - DESCRIPTION.
7 | - Keep descriptions short and simple.
8 | - End all descriptions with a full stop/period.
9 | - Order projects alphabetically within each category.
10 | - Check your spelling and grammar.
11 | - New categories, or improvements to the existing categorisation, are welcome.
12 |
13 | Thank you for your suggestions!
14 |

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | The MIT License (MIT)
2 |
3 | Copyright (c) 2016 Sean Davis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |

--------------------------------------------------------------------------------
/TODO.md:
--------------------------------------------------------------------------------
1 | # STUFF FOR ME TO ADD
2 |
3 |
4 | # Normalization
5 |
6 | http://michelebusby.tumblr.com/post/130202229486/the-ks-test-looks-pretty-good-for-single-cell
7 |
8 | Accounting for technical variation: http://www.nature.com/ncomms/2015/151022/ncomms9687/full/ncomms9687.html#supplementary-information
9 |
10 | ZIFA: Zero-inflated factor analysis https://github.com/epierson9/ZIFA
11 |
12 | http://biorxiv.org/content/early/2016/04/22/049734.full.pdf+html
13 |
14 | https://www.dropbox.com/s/pno78mmlj0exv7s/NODES_0.0.0.9010.tar.gz?dl=0
15 |
16 | scran: http://bioconductor.org/packages/devel/bioc/html/scran.html
17 |
18 | Correct for expression heterogeneity: https://github.com/PMBio/scLVM
19 |
20 | # Transcript counting
21 |
22 | Modified version of Kallisto: https://github.com/govinda-kamath/clustering_on_transcript_compatibility_counts
23 |
24 | DISCO: https://pbtech-vc.med.cornell.edu/git/mason-lab/disco/tree/master
25 |
26 | # Clustering
27 |
28 | Comparative analysis: http://biorxiv.org/content/early/2016/04/07/047613
29 |
30 | SC3: consensus clustering https://github.com/hemberg-lab/sc3
31 |
32 | destiny: diffusion maps for single-cell data http://bioconductor.org/packages/release/bioc/html/destiny.html
33 |
34 | https://github.com/govinda-kamath/clustering_on_transcript_compatibility_counts
35 |
36 | GiniClust: https://github.com/lanjiangboston/GiniClust
37 |
38 | pcaReduce: https://github.com/JustinaZ/pcaReduce
39 |
40 | https://github.com/BatzoglouLabSU/SIMLR
41 |
42 | Vortex: http://web.stanford.edu/~samusik/vortex/
43 |
44 | # Differential Expression
45 |
46 | Monocle: http://cole-trapnell-lab.github.io/monocle-release/
47 |
48 | scDD: https://github.com/kdkorthauer/scDD
49 |
50 | ISOP: comparison of isoform pairs in single cells https://github.com/nghiavtr/ISOP
51 |
52 | D3E: http://hemberg-lab.github.io/D3E/
53 |
54 | BASiCS: https://github.com/catavallejos/BASiCS
55 |
56 | Beta Poisson: https://github.com/nghiavtr/BPSC
57 |
58 | # Time-series/ordering/lineage prediction
59 |
60 | Monocle
61 |
62 | Analysis of pseudotime uncertainty: http://biorxiv.org/content/biorxiv/early/2016/04/05/047365.full.pdf
63 |
64 | ECLAIR: cell lineage prediction https://github.com/GGiecold/ECLAIR
65 |
66 | Identification of ordering effects: https://github.com/lengning/OEFinder
67 |
68 | SLICER: non-linear trajectories https://github.com/jw156605/SLICER
69 |
70 | Wishbone: identification of bifurcations in developmental trajectories http://www.c2b2.columbia.edu/danapeerlab/html/cyt-download.html
71 |
72 | SCOUP: https://github.com/hmatsu1226/SCOUP
73 |
74 | Ouija: https://github.com/kieranrcampbell/ouija
75 |
76 | # Pipelines
77 |
78 | Seurat: http://www.satijalab.org/seurat.html
79 |
80 | SINCERA: https://research.cchmc.org/pbge/sincera.html
81 |
82 | MAST: https://github.com/RGLab/MAST
83 |
84 | scde (differential expression + gene set over-dispersion): https://github.com/hms-dbmi/scde
85 |
86 | BASiCS: Bayesian analysis of single cell data: https://github.com/catavallejos/BASiCS
87 |
88 | FastProject: https://github.com/YosefLab/FastProject/wiki
89 |
90 | Citrus: http://chenmengjie.github.io/Citrus/
91 |
92 | Tools from Teichmann lab (cellity, celloline, scrnatb): https://github.com/Teichlab/
93 |
94 | SCell: https://github.com/diazlab/SCell
95 | # Other
96 |
97 | Ginkgo: analysis of CNVs in single-cell data: http://qb.cshl.edu/ginkgo/?q=/XWxZEerqqY477b9i4V8F
98 |
99 | CNV calling: http://genome.cshlp.org/content/early/2016/01/15/gr.198937.115.full.pdf
100 |
101 | Gene co-expression: http://journals.plos.org/ploscompbiol/article?id=10.1371%2Fjournal.pcbi.1004892
102 |
103 | DNA SNV calling: https://bitbucket.org/hamimzafar/monovar
104 |
105 | # Methylation
106 |
107 | Prediction of missing information: https://github.com/cangermueller/deepcpg
108 |

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # awesome-single-cell
2 |
3 | List of software packages (and the people developing these methods) for single-cell data analysis, including RNA-seq, ATAC-seq, etc. [Contributions welcome](https://github.com/seandavi/awesome-single-cell/blob/master/CONTRIBUTE.md)...
4 |
5 | ## Software packages
6 |
7 | ### RNA-seq
8 |
9 | - [anchor](https://github.com/yeolab/anchor) - [Python] - ⚓ Find bimodal, unimodal, and multimodal features in your data.
10 | - [BackSPIN](https://github.com/linnarsson-lab/BackSPIN) - [Python] - Biclustering algorithm developed taking into account intrinsic features of single-cell RNA-seq experiments.
11 | - [BASiCS](https://github.com/catavallejos/BASiCS) - [R] - Bayesian Analysis of single-cell RNA-seq data. Estimates cell-specific normalization constants. Technical variability is quantified based on spike-in genes. The total variability of the expression counts is decomposed into technical and biological components.
12 | - [bonvoyage](https://github.com/yeolab/bonvoyage) - [Python] - 📐 Transform percentage-based units into a 2d space to evaluate changes in distribution with both magnitude and direction.
13 | - [BPSC](https://github.com/nghiavtr/BPSC) - [R] - Beta-Poisson model for single-cell RNA-seq data analyses.
14 | - [Cellity](https://github.com/teichlab/cellity) - [R] - Classification of low quality cells in scRNA-seq data using R.
15 | - [cellTree](https://www.bioconductor.org/packages/3.3/bioc/html/cellTree.html) - [R] - Cell population analysis and visualization from single cell RNA-seq data using a Latent Dirichlet Allocation model.
16 | - [clusterExperiment](https://github.com/epurdom/clusterExperiment) - [R] - Functions for running and comparing many different clusterings of single-cell sequencing data. Meant to work with SCONE and slingshot.
17 | - [ECLAIR](https://github.com/GGiecold/ECLAIR) - [Python] - ECLAIR stands for Ensemble Clustering for Lineage Analysis, Inference and Robustness. Robust and scalable inference of cell lineages from gene expression data.
18 | - [Falco](https://github.com/VCCRI/Falco/) - [AWS cloud] - [Falco: A quick and flexible single-cell RNA-seq processing framework on the cloud](http://www.biorxiv.org/content/early/2016/07/15/064006.abstract).
19 | - [flotilla](https://github.com/yeolab/flotilla) - [Python] - Reproducible machine learning analysis of gene expression and alternative splicing data.
20 | - [HocusPocus](https://github.com/joeburns06/hocuspocus) - [R] - Basic PCA-based workflow for analysis and plotting of single cell RNA-seq data.
21 | - [MAST](https://github.com/RGLab/MAST) - [R] - Model-based Analysis of Single-cell Transcriptomics
22 | (MAST) fits two-part generalized linear models that are specially adapted for bimodal and/or zero-inflated single cell gene expression data.
23 | - [Monocle](http://cole-trapnell-lab.github.io/monocle-release/) - [R] - Differential expression and time-series analysis for single-cell RNA-Seq.
24 | - [OEFinder](https://github.com/lengning/OEFinder) - [R] - Identify ordering effect genes in single cell RNA-seq data. The OEFinder Shiny implementation depends on the packages shiny, shinyFiles, gdata, and EBSeq.
25 | - [Ouija](https://github.com/kieranrcampbell/ouija) - [R] - Incorporate prior information into single-cell trajectory (pseudotime) analyses using Bayesian nonlinear factor analysis.
26 | - [outrigger](https://github.com/YeoLab/outrigger) - [Python] - Outrigger is a program to calculate alternative splicing scores of RNA-Seq data based on junction reads and a *de novo*, custom annotation created with a graph database, especially made for single-cell analyses.
27 | - [SC3](https://github.com/hemberg-lab/sc3) - [R] - SC3 is an interactive tool for the unsupervised clustering of cells from single cell RNA-Seq experiments.
28 | - [scater](http://bioconductor.org/packages/scater) - [R] - Scater places an emphasis on tools for quality control, visualisation and pre-processing of data before further downstream analysis, filling a useful niche between raw RNA-sequencing count or transcripts-per-million data and more focused downstream modelling tools such as monocle, scLVM, SCDE, edgeR, limma and so on.
29 | - [scDD](https://github.com/kdkorthauer/scDD) - [R] - scDD (Single-Cell Differential Distributions) is a framework to identify genes with different expression patterns between biological groups of interest. In addition to traditional differential expression, it can detect differences that are more complex and subtle than a mean shift.
30 | - [SCDE](https://github.com/hms-dbmi/scde) - [R] - Differential expression using error models and overdispersion-based identification of important gene sets.
31 | - [SCell](https://github.com/diazlab/SCell) - [MATLAB] - SCell is an integrated software tool for quality filtering, normalization, feature selection, iterative dimensionality reduction, clustering and the estimation of gene-expression gradients from large ensembles of single-cell RNA-seq datasets. SCell is open source, and implemented with an intuitive graphical interface.
32 | - [scLVM](https://github.com/PMBio/scLVM) - [R] - scLVM is a modelling framework for single-cell RNA-seq data that can be used to dissect the observed heterogeneity into different sources, thereby allowing for the correction of confounding sources of variation. scLVM was primarily designed to account for cell-cycle induced variations in single-cell RNA-seq data where cell cycle is the primary source of variability.
33 | - [SCONE](https://github.com/YosefLab/scone) - [R] - SCONE (Single-Cell Overview of Normalized Expression) is a package for single-cell RNA-seq data quality control (QC) and normalization. This data-driven framework uses summaries of expression data to assess the efficacy of normalization workflows.
34 | - [SCOUP](https://github.com/hmatsu1226/SCOUP) - [C++] - Uses a probabilistic model based on the Ornstein-Uhlenbeck process to analyze single-cell expression data during differentiation.
35 | - [scran](http://bioconductor.org/packages/scran) - [R] - This package implements a variety of low-level analyses of single-cell RNA-seq data. Methods are provided for normalization of cell-specific biases, pool-based norms to estimate size factors, assignment of cell cycle phase, and detection of highly variable and significantly correlated genes.
36 | - [SCRAT](https://github.com/zji90/SCRAT) - [R] - SCRAT provides essential tools for users to read in single-cell regulome data (ChIP-seq, ATAC-seq, DNase-seq) and summarize into different types of features. It also allows users to visualize the features, cluster samples and identify key features.
37 | - [SEPA](https://github.com/zji90/SEPA) - [R] - SEPA provides convenient functions for users to assign genes into different gene expression patterns such as constant, monotone increasing and increasing then decreasing. SEPA then performs GO enrichment analysis to analyze the functional roles of genes with the same or similar patterns.
38 | - [scTCRseq](https://github.com/ElementoLab/scTCRseq) - [Python] - Map T-cell receptor (TCR) repertoires from single cell RNAseq.
39 | - [Seurat](http://www.satijalab.org/seurat.html) - [R] - Seurat contains easy-to-use implementations of commonly used analytical techniques, including the identification of highly variable genes, dimensionality reduction (PCA, ICA, t-SNE), standard unsupervised clustering algorithms (density clustering, hierarchical clustering, k-means), and the discovery of differentially expressed genes and markers.
40 | - [sincell](http://bioconductor.org/packages/sincell) - [R] - Existing computational approaches for the assessment of cell-state hierarchies from single-cell data might be formalized under a general workflow composed of i) a metric to assess cell-to-cell similarities (combined or not with a dimensionality reduction step), and ii) a graph-building algorithm (optionally making use of a cells-clustering step). The sincell R package implements a methodological toolbox allowing flexible workflows under such a framework.
41 | - [sincera](https://research.cchmc.org/pbge/sincera.html) - [R] - R-based pipeline for single-cell analysis including clustering and visualization.
42 | - [SinQC](http://www.morgridge.net/SinQC.html) - [R] - A Method and Tool to Control Single-cell RNA-seq Data Quality.
43 | - [SLICER](https://github.com/jw156605/SLICER) - [R] - Selective Locally linear Inference of Cellular Expression Relationships (SLICER) algorithm for inferring cell trajectories.
44 | - [slingshot](https://github.com/kstreet13/slingshot) - [R] - Functions for identifying and characterizing continuous developmental trajectories in single-cell sequencing data.
45 | - [SPADE](http://www.nature.com/nprot/journal/v11/n7/full/nprot.2016.066.html) - [R] - Visualization and cellular hierarchy inference of single-cell data using SPADE.
46 | - [switchde](http://github.com/kieranrcampbell/switchde) - [R] - Differential expression analysis across pseudotime. Identify genes that exhibit switch-like up- or down-regulation along single-cell trajectories, along with where in the trajectory the regulation occurs.
47 | - [TraCeR](http://github.com/teichlab/tracer) - [Python] - Reconstruction of T-cell receptor sequences from single-cell RNA-seq data.
48 | - [TSCAN](https://github.com/zji90/TSCAN) - [R] - Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis.
49 |
50 | ### Copy number analysis
51 |
52 | - [Ginkgo](https://github.com/robertaboukhalil/ginkgo) - [R, C] - Ginkgo is a cloud-based web app for single-cell copy-number variation analysis.
53 |
54 | ## Tutorials and workflows
55 |
56 | - [Aaron Lun's Single Cell workflow on Bioconductor](http://bioconductor.org/help/workflows/simpleSingleCell/) - [R] - This article describes a computational workflow for basic analysis of scRNA-seq data using software packages from the open-source Bioconductor project.
57 | - [Bioconductor2016 Single-cell-RNA-sequencing workshop by Sandrine Dudoit lab](https://github.com/drisso/bioc2016singlecell) - [R] - SCONE, clusterExperiment, and slingshot tutorial.
58 | - [BioMed Central Single Cell Omics collection](http://www.biomedcentral.com/collections/singlecellomics) - Collection of papers describing techniques for single-cell analysis and protocols.
59 | - [CSHL Single Cell Analysis - Bioinformatics](https://github.com/YeoLab/single-cell-bioinformatics/) course materials - Uses Shalek 2013 and Macaulay 2016 datasets to teach machine learning to biologists.
60 | - [Gilad Lab Single Cell Data Exploration](http://jdblischak.github.io/singleCellSeq/analysis/) - R-based exploration of single cell sequence data. Lots of experimentation.
61 | - [Harvard Stem Cell Institute Single Cell Workshop 2015](http://hms-dbmi.github.io/scw/) - Workshop on common computational analysis techniques for scRNA-seq data from differential expression to subpopulation identification and network analysis. [See course description for more information](http://scholar.harvard.edu/jeanfan/classes/single-cell-workshop-2015).
62 | - [Hemberg Lab scRNA-seq course materials](http://hemberg-lab.github.io/scRNA.seq.course/index.html)
63 | - [Using Seurat for unsupervised clustering and biomarker discovery](http://www.satijalab.org/seurat-intro.html) - 301 single cells across diverse tissues from (Pollen et al., Nature Biotechnology, 2014).
64 | - [Using Seurat for spatial inference in single-cell data](http://www.satijalab.org/seurat-intro.html) - 851 single cells from zebrafish embryogenesis (Satija\*, Farrell\* et al., Nature Biotechnology, 2015).
65 |
66 | ## Web portals and apps
67 |
68 | - [Neuro Single Cell Expression Portal at the Broad](https://portals.broadinstitute.org/single_cell) - The Single-Cell RNA-Seq Portal for Brain Research was developed to facilitate sharing scientific results, and disseminate datasets resulting from the NIH's BRAIN initiative.
69 |
70 | ## Journal articles of general interest
71 |
72 | ### Experimental design
73 |
74 | - [Design and computational analysis of single-cell RNA-sequencing experiments](http://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0927-y)
75 |
76 | ### Methods comparisons
77 |
78 | - [Comparative analysis of single-cell RNA sequencing methods](http://biorxiv.org/content/early/2016/06/29/035758) - A comparison of wet lab protocols for scRNA sequencing.
79 |
80 | ## Similar lists and collections
81 |
82 | - [CrazyHotTommy's RNA-seq analysis list](https://github.com/crazyhottommy/RNA-seq-analysis#single-cell-rna-seq) - Very broad list that includes some single cell RNA-seq packages and papers.
83 |
84 | ## People
85 |
86 | Gender bias at conferences is a well-known problem ([http://www.sciencemag.org/careers/2015/07/countering-gender-bias-conferences](http://www.sciencemag.org/careers/2015/07/countering-gender-bias-conferences)). Creating a list of potential speakers can help mitigate this bias, and a community of people developing and maintaining the list helps to diversify it beyond smaller networks.
87 |
88 | ### Female
89 |
90 | - [Christina Kendziorski (University of Wisconsin–Madison, USA)](https://www.biostat.wisc.edu/~kendzior/)
91 | - [Sandrine Dudoit (UC Berkeley, USA)](http://www.stat.berkeley.edu/~sandrine/)
92 | - [Keegan Korthauer (Dana Farber Cancer Institute, USA)](http://bcb.dfci.harvard.edu/~keegan/)
93 | - [Stephanie Hicks (Dana Farber Cancer Institute, USA)](http://www.stephaniehicks.com/)
94 | - [Dana Pe'er (Columbia University, USA)](http://www.c2b2.columbia.edu/danapeerlab/html/)
95 | - [Aviv Regev (Broad Institute, USA)](https://www.broadinstitute.org/scientific-community/science/core-faculty-labs/regev-lab/regev-lab-home)
96 | - [Catalina Vallejos (MRC Biostatistics Unit and EMBL-EBI, UK)](http://www.mrc-bsu.cam.ac.uk/people/in-alphabetical-order/t-to-z/catalina-vallejos-menesses/)
97 | - [Sarah Teichmann (Wellcome Trust Sanger Institute, UK)](http://www.teichlab.org/)
98 | - [Emma Pierson (Stanford University, USA)](http://cs.stanford.edu/people/emmap1/)
99 | - [Ning Leng (Morgridge Institute for Research, USA)](https://www.biostat.wisc.edu/~ningleng/)
100 | - [Laleh Haghverdi (Institute of Computational Biology, Germany)](https://www.helmholtz-muenchen.de/icb/institute/staff/staff/ma/2453/-Haghverdi/index.html)
101 |
102 | ### Male
103 |
104 | - [Raphael Gottardo (Fred Hutchinson Cancer Research Center, USA)](https://www.fredhutch.org/en/labs/profiles/gottardo-raphael.html)
105 | - [John Marioni (EBI, UK)](http://www.ebi.ac.uk/research/marioni)
106 | - [Oliver Stegle (EBI, UK)](http://www.ebi.ac.uk/research/stegle)
107 | - [Davis McCarthy (EBI, UK)](https://sites.google.com/site/davismcc/)
108 | - [Aaron Lun (Cancer Research UK, UK)](http://www.cruk.cam.ac.uk/users/aaron-lun)
109 |

--------------------------------------------------------------------------------