├── README.md
├── cooccurence_clustering_repo.zip
├── cooccurence_clustering_repo_v2.zip
├── example_MouseInnerEar.PNG
└── example_PBMC.PNG


/README.md:
--------------------------------------------------------------------------------
# Co-occurrence Clustering Algorithm

One primary reason the analysis of single-cell RNA-seq data is challenging is dropouts: the data capture only a small fraction of the transcriptome of each cell. Many computational algorithms have been developed to address dropouts. Here, the opposite view is explored. Instead of treating dropout as a problem to be fixed, we embrace it as a useful signal for defining cell types. We present an iterative co-occurrence clustering algorithm that works with binarized single-cell RNA-seq count data and can effectively identify cell populations, as well as cell-type-specific pathways and signatures.

# Paper

A manuscript describing the algorithm is available at https://www.biorxiv.org/content/early/2018/11/17/468025

# User Instructions

Download and unzip the **cooccurence_clustering_repo.zip** file. Unzipping produces three folders: the source code is in the "tools" folder, and the other two folders each contain an example dataset. To test the algorithm on the provided examples, open Matlab, change the working directory to one of the example folders, and run the step_01, 02, 03, ... scripts sequentially.

The raw data in the examples are in common formats (sparse matrix and GSE series matrix). To quickly test the algorithm on a new dataset, format the new data in the same way as one of the examples.

System requirements for running the code: Matlab 2017b, Windows 10, >=32GB of RAM

The run time of the algorithm depends on the computer and the dataset. For the example datasets below, run time should be around 10 minutes.
For the largest dataset we have tested (~70,000 cells), the run time was ~10 hours, and the algorithm produced roughly 100 clusters.

**Note: cooccurence_clustering_repo_v2.zip contains an updated version of the co-occurrence clustering algorithm. The concept is the same, but the implementation has been updated.**

# Examples

Each example folder contains a subfolder called html, which holds all the intermediate figures generated by the co-occurrence clustering algorithm. Quick summaries of the examples are shown here.

### Peripheral Blood Mononuclear Cells (PBMC)

Data Source: The PBMC dataset is available from 10X Genomics (https://s3-us-west-2.amazonaws.com/10x.files/samples/cell/pbmc3k/pbmc3k_filtered_gene_bc_matrices.tar.gz).

Co-occurrence clustering result: see example_PBMC.PNG

### Mouse inner ear sensory epithelia

Data Source: GSE71982

Co-occurrence clustering result: see example_MouseInnerEar.PNG

--------------------------------------------------------------------------------
/cooccurence_clustering_repo.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pqiu/cooccurrence_clustering/bafea56b9aaa84fa6d285028e0dfbd8bb2c52798/cooccurence_clustering_repo.zip

--------------------------------------------------------------------------------
/cooccurence_clustering_repo_v2.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pqiu/cooccurrence_clustering/bafea56b9aaa84fa6d285028e0dfbd8bb2c52798/cooccurence_clustering_repo_v2.zip

--------------------------------------------------------------------------------
/example_MouseInnerEar.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pqiu/cooccurrence_clustering/bafea56b9aaa84fa6d285028e0dfbd8bb2c52798/example_MouseInnerEar.PNG

--------------------------------------------------------------------------------
/example_PBMC.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pqiu/cooccurrence_clustering/bafea56b9aaa84fa6d285028e0dfbd8bb2c52798/example_PBMC.PNG
--------------------------------------------------------------------------------