├── .gitignore ├── DESCRIPTION ├── LICENSE ├── NAMESPACE ├── R ├── BuildTrajectory.R ├── CalculatePWProfiles.R ├── IdentifyCellTypes.R ├── IdentifyVaryingPWs.R ├── PlotVaryingPWs.R ├── TemporaObjClass.R ├── Visualize.R └── dataAccess.R ├── README.Rmd ├── README.md ├── Tempora.Rproj └── man ├── BuildTrajectory.Rd ├── CalculatePWProfiles.Rd ├── CreateTemporaObject.Rd ├── IdentifyCellTypes.Rd ├── IdentifyVaryingPWs.Rd ├── ImportSeuratObject.Rd ├── PlotTrajectory.Rd ├── PlotVaryingPWs.Rd ├── Tempora-class.Rd ├── getExpr.Rd └── getMD.Rd /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .RData 4 | .Ruserdata 5 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: Tempora 2 | Type: Package 3 | Title: Pathway-based Trajectory Inference for Time-series Single Cell RNA-Seq Data 4 | Version: 0.1.0 5 | Author: Thinh Tran 6 | Maintainer: Thinh Tran 7 | Description: Tempora is a trajectory inference method that infers cell lineage maps from time-series scRNAseq data using pathway enrichment profiles of single cells and experimental time information. Tempora uses an information theoretic approach to build a trajectory at the cluster level based on the clusters’ pathway enrichment profiles, effectively connecting cell types and states across multiple time points. Taking advantage of the available time information, Tempora can infer the directions of all connections in a trajectory that go from early to late clusters. 8 | License: MIT + file LICENSE 9 | Imports: 10 | GSVA, 11 | GSEABase, 12 | bnlearn, 13 | mgcv, 14 | RColorBrewer, 15 | igraph, 16 | methods, 17 | devtools, 18 | reshape2, 19 | scales 20 | Encoding: UTF-8 21 | LazyData: true 22 | RoxygenNote: 7.0.2 23 | Suggests: 24 | knitr, 25 | rmarkdown, 26 | Seurat 27 | VignetteBuilder: knitr 28 | Collate: 29 | 'BuildTrajectory.R' 30 | 'CalculatePWProfiles.R' 31 | 'IdentifyCellTypes.R' 32 | 'IdentifyVaryingPWs.R' 33 | 'PlotVaryingPWs.R' 34 | 'dataAccess.R' 35 | 'TemporaObjClass.R' 36 | 'Visualize.R' 37 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2020 Bader Lab, University of Toronto 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of 4 | this software and associated documentation files (the “Software”), to deal in 5 | the Software without restriction, including without limitation the rights to 6 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 7 | of the Software, and to permit persons to whom the Software is furnished to do 8 | so, subject to the following conditions: 9 | 10 | The above copyright notice and this permission notice shall be included in all 11 | copies or substantial portions of the Software. 12 | 13 | THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 14 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 15 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 16 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 17 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 18 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 19 | SOFTWARE. -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | export(BuildTrajectory) 4 | export(CalculatePWProfiles) 5 | export(CreateTemporaObject) 6 | export(IdentifyCellTypes) 7 | export(IdentifyVaryingPWs) 8 | export(ImportSeuratObject) 9 | export(PlotTrajectory) 10 | export(PlotVaryingPWs) 11 | export(getExpr) 12 | export(getMD) 13 | exportClasses(Tempora) 14 | exportMethods("cluster.metadata<-") 15 | exportMethods("cluster.pathways.dr<-") 16 | exportMethods("cluster.pathways<-") 17 | exportMethods("data<-") 18 | exportMethods("gams<-") 19 | exportMethods("layouts<-") 20 | exportMethods("meta.data<-") 21 | exportMethods("n.pcs<-") 22 | exportMethods("varying.pws<-") 23 | exportMethods(cluster.metadata) 24 | exportMethods(cluster.pathways) 25 | exportMethods(cluster.pathways.dr) 26 | exportMethods(data) 27 | exportMethods(gams) 28 | exportMethods(layouts) 29 | exportMethods(meta.data) 30 | exportMethods(n.pcs) 31 | exportMethods(trajectory) 32 | exportMethods(varying.pws) 33 | importFrom(GSEABase,getGmt) 34 | importFrom(GSVA,gsva) 35 | importFrom(RColorBrewer,brewer.pal) 36 | importFrom(bnlearn,aracne) 37 | importFrom(grDevices,colorRampPalette) 38 | importFrom(graphics,axis) 39 | importFrom(graphics,legend) 40 | importFrom(graphics,par) 41 | importFrom(graphics,points) 42 | importFrom(graphics,text) 43 | importFrom(igraph,E) 44 | importFrom(igraph,graph_from_data_frame) 45 | importFrom(igraph,layout_with_sugiyama) 46 | importFrom(igraph,plot.igraph) 47 | importFrom(methods,new) 48 | importFrom(methods,validObject) 49 | importFrom(mgcv,anova.gam) 50 | importFrom(mgcv,gam) 51 | importFrom(mgcv,plot.gam) 52 | importFrom(reshape2,dcast) 53 | importFrom(reshape2,melt) 54 | importFrom(scales,rescale) 55 | importFrom(stats,p.adjust) 56 | importFrom(stats,prcomp) 57 | importFrom(stats,screeplot) 58 | importFrom(tibble,rownames_to_column) 59 | -------------------------------------------------------------------------------- /R/BuildTrajectory.R: -------------------------------------------------------------------------------- 1 | #' Build mutual information-based network 2 | #' 3 | #' Build the information-based cluster-cluster network using reduced-dimensionality pathway enrichment profiles of all clusters. 4 | #' This network connects related cell types and states across multiple time points. Taking advantage of the available time information, Tempora also infers the directions of all connections in a trajectory that go from early to late clusters. 5 | #' @param object A Tempora object containing a gene expression matrix and metadata 6 | #' @param n_pcs Number of principal components to be used in building the network. 7 | #' @param difference_threshold Percent of permissible difference between the temporal scores of two clusters to determine the direction of their connections. The temporal scores are calculated based on based on the clusters' composition of cells from each timepoint. The directions of edges connecting pairs of clusters will only be determined for cluster pairs with difference in their time scores higher than the threshold. Other edges will remain undirected. Default at 0.01 8 | #' @param loadings Threshold of PCA loadings for pathways to be used in trajectory construction. The higher the loading, the more the pathway contributes to a principal component. Default at 0.4. 9 | #' @export 10 | #' @importFrom bnlearn aracne 11 | #' @importFrom methods new validObject 12 | #' @importFrom stats p.adjust prcomp screeplot 13 | #' @importFrom reshape2 dcast 14 | #' @examples \dontrun{tempora_data <- BuildTrajectory(tempora_data, n_pcs=10, difference_threshold=0.01, loadings=0.4)} 15 | #' BuildTrajectory 16 | BuildTrajectory <- function(object, n_pcs, difference_threshold=0.01, loadings = 0.4){ 17 | 18 | if (class(object)[1] != "Tempora"){ 19 | stop("Not a valid Tempora object") 20 | } 21 | 22 | 23 | if (n_pcs > ncol(object@cluster.pathways.dr$rotation)){ 24 | stop("Number of PCs selected exceeds number of PCs calculated") 25 | } 26 | 27 | significant_pathways_list <- gsva_pca <- list() 28 | for (i in 1:n_pcs){ 29 | genes_scaled <- scale(object@cluster.pathways.dr$rotation[,i]) 30 | significant_pathways_list[[i]] <- object@cluster.pathways[which(rownames(object@cluster.pathways) %in% names(which(genes_scaled[,1] > loadings | genes_scaled[,1] < (-1*loadings)))), ] 31 | gsva_pca[[i]] <- colMeans(significant_pathways_list[[i]]) 32 | } 33 | 34 | gsva_pca <- Reduce(rbind, gsva_pca) 35 | rownames(gsva_pca) <- paste0("PC", seq(1:nrow(gsva_pca))) 36 | 37 | mi_network <- bnlearn::aracne(as.data.frame(gsva_pca)) 38 | edges_df <- as.data.frame(mi_network$arcs) 39 | edges_df$to <- as.numeric(as.character(edges_df$to)) 40 | edges_df$from <- as.numeric(as.character(edges_df$from)) 41 | edges_df$from_clusterscore <- unlist(sapply(edges_df$from, function(x) object@cluster.metadata$Cluster_time_score[object@cluster.metadata$Id == x])) 42 | edges_df$to_clusterscore <- unlist(sapply(edges_df$to, function(x) object@cluster.metadata$Cluster_time_score[object@cluster.metadata$Id == x])) 43 | 44 | 45 | edges_df$direction <- ifelse((abs(edges_df$to_clusterscore - edges_df$from_clusterscore)/(0.5*(edges_df$to_clusterscore + edges_df$from_clusterscore))) < difference_threshold, "bidirectional", "unidirectional") 46 | edges_df <- edges_df[-which(edges_df$from_clusterscore > edges_df$to_clusterscore), ] 47 | edges_df$id <- ifelse(as.numeric(edges_df$from) > as.numeric(edges_df$to), paste0(edges_df$from, edges_df$to), paste0(edges_df$to, edges_df$from)) 48 | edges_df <- edges_df[!duplicated(edges_df$id), ] 49 | edges_df <- edges_df[, -6] 50 | edges_df$type <- ifelse(edges_df$direction == "bidirectional", 3, 1) 51 | 52 | object@trajectory <- edges_df 53 | object@n.pcs <- n_pcs 54 | return(object) 55 | } 56 | 57 | 58 | 59 | -------------------------------------------------------------------------------- /R/CalculatePWProfiles.R: -------------------------------------------------------------------------------- 1 | #' Calculate pathway enrichment profile 2 | #' 3 | #' Calculate cluster average gene expression profile and determine the pathway enrichment profile of each cluster 4 | #' @param object A Tempora object containing a gene expression matrix and metadata (cluster identity and ) 5 | #' @param gmt_path Local path to database of pathways or genesets organized as a .gmt file. Genesets files in GMT format can be downloaded at http://baderlab.org/GeneSets. Please ensure 6 | #' @param method Method used to estimate pathway enrichment profile per cluster. Can be "gsva", "ssgsea", "zscore" or "plage", default to "gsva". See ?gsva for more information. 7 | #' @param min.sz Minimum size of the genesets used in enrichment estimation, set to 5 genes by default. 8 | #' @param max.sz Maximum size of the genesets used in enrichment estimation, set to 200 genes by default. 9 | #' @param parallel.sz Type of cluster architecture when using \code{snow}. If 1, no parallelization will be used. If 0, all available cores will be used. 10 | #' @export 11 | #' @importFrom methods new validObject 12 | #' @importFrom stats p.adjust prcomp screeplot 13 | #' @importFrom GSVA gsva 14 | #' @importFrom GSEABase getGmt 15 | #' @importFrom reshape2 dcast 16 | #' @examples \dontrun{tempora_data <- CalculatePWProfiles(tempora_data, gmt_path="~/Human_AllPathways_September_01_2019_symbol.gmt", parallel.sz = detectCores()-2)} 17 | #' @return An updated Tempora object containing the pathway enrichment profiles of each cluster, which can be accessed at \code{object@cluster.pathways} 18 | #' CalculatePWProfiles 19 | CalculatePWProfiles <- function(object, gmt_path, method="gsva", min.sz=5, max.sz=200, parallel.sz=1){ 20 | if (class(object)[1] != "Tempora"){ 21 | stop("Not a valid Tempora object") 22 | } 23 | 24 | cat("Calculating cluster average gene expression profile...") 25 | exprMatrix <- object@data 26 | exprMatrix_bycluster <- list() 27 | pathwaygmt <- GSEABase::getGmt(gmt_path) 28 | for (i in sort(unique(object@meta.data$Clusters))){ 29 | exprMatrix_bycluster[[i]] <- rowMeans(exprMatrix[, which(colnames(exprMatrix) %in% rownames(object@meta.data)[which(object@meta.data$Clusters == i)])]) 30 | } 31 | names(exprMatrix_bycluster) <- sort(unique(object@meta.data$Clusters)) 32 | 33 | exprMatrix_bycluster <- do.call(cbind, exprMatrix_bycluster) 34 | colnames(exprMatrix_bycluster) <- sort(unique(object@meta.data$Clusters)) 35 | rownames(exprMatrix_bycluster) <- rownames(exprMatrix) 36 | 37 | cat("\nCalculating cluster pathway enrichment profiles...\n") 38 | gsva_bycluster <- GSVA::gsva(as.matrix(exprMatrix_bycluster), pathwaygmt, method=method, min.sz=min.sz, max.sz=max.sz, parallel.sz=parallel.sz) 39 | 40 | colnames(gsva_bycluster) <- colnames(exprMatrix_bycluster) 41 | object@cluster.pathways <- gsva_bycluster 42 | 43 | gsva_bycluster_pca <- prcomp(t(gsva_bycluster), scale = T, center = T) 44 | screeplot(gsva_bycluster_pca, npcs=25, type="lines", main="PCA on pathway enrichment analysis result") 45 | object@cluster.pathways.dr <- gsva_bycluster_pca 46 | 47 | validObject(object) 48 | return(object) 49 | } 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | -------------------------------------------------------------------------------- /R/IdentifyCellTypes.R: -------------------------------------------------------------------------------- 1 | #' Calculate pathway enrichment profile 2 | #' 3 | #' Calculate cluster average gene expression profile and determine the pathway enrichment profile of each cluster 4 | #' @param exprMatrix A gene expression matrix, with genes in rows and cells in columns. 5 | #' @param cell_markers A list of possible cell types found in the dataset and their marker genes. 6 | #' @param cluster_labels A named vector of cluster identifier for each cell in the gene expression matrix 7 | #' @param threshold Threshold of GSVA score quantile. Cell types with GSVA scores in this quantile or higher compared to all other cell type scores for the same cluster would be included in cluster label. Numeric between 0-1, default to 0.9. 8 | #' @export 9 | #' @importFrom GSVA gsva 10 | #' @importFrom reshape2 melt 11 | #' @importFrom tibble rownames_to_column 12 | #' @return A vector of cell types inferred from the expression of marker genes provided 13 | IdentifyCellTypes <- function(exprMatrix, cluster_labels, cell_markers, threshold=0.9){ 14 | exprMatrix_bycluster <- list() 15 | for (i in sort(as.numeric(unique(cluster_labels)))){ 16 | exprMatrix_bycluster[[i]] <- rowMeans(exprMatrix[, which(colnames(exprMatrix) %in% names(cluster_labels)[which(cluster_labels == i)])]) 17 | } 18 | exprMatrix_bycluster <- do.call(cbind, exprMatrix_bycluster) 19 | colnames(exprMatrix_bycluster) <- sort(as.numeric(unique(cluster_labels))) 20 | rownames(exprMatrix_bycluster) <- rownames(exprMatrix) 21 | 22 | cell_type_classifier <- GSVA::gsva(exprMatrix_bycluster, cell_markers, parallel.sz=1) 23 | 24 | cell_types <- apply(cell_type_classifier, 2, function(x) paste0(rownames(cell_type_classifier)[which(x > quantile(x, threshold))], collapse="/")) 25 | 26 | if (any(cell_types == "")){ 27 | cell_types[cell_types==""] <- "Unclassified" 28 | } 29 | # if (any(apply(cell_type_classifier, 2, max) < 0.5)){ 30 | # message(paste("Cluster(s)", names(which(apply(cell_type_classifier, 2, max) < 0.5)), "cannot be classified with confidence using the provided markers and received the label 'Other'. Please revise marker sets and re-run if possible, or 31 | # manually alter cluster labels by setting object@clusterlabel")) 32 | # cell_types[which(apply(cell_type_classifier, 2, max) < 0.5)] <- "Other" 33 | # } 34 | return(cell_types) 35 | } 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | -------------------------------------------------------------------------------- /R/IdentifyVaryingPWs.R: -------------------------------------------------------------------------------- 1 | #' Calculate temporally changing pathways 2 | #' 3 | #' Identify the pathways that change over time by fitting a generalized additive model 4 | #' @param object A Tempora object 5 | #' @param pval_threshold P-value threshold to determine the significance of pathway enrichment over time. Default to 0.05. 6 | #' @export 7 | #' @importFrom mgcv gam anova.gam plot.gam 8 | #' @importFrom methods new validObject 9 | #' @importFrom stats p.adjust prcomp screeplot 10 | #' @importFrom grDevices colorRampPalette 11 | #' @importFrom graphics axis legend par points text 12 | #' @importFrom reshape2 dcast 13 | #' @examples \dontrun{tempora_data <- IdentifyVaryingPWs(tempora_data, pval_threshold = 0.05)} 14 | 15 | IdentifyVaryingPWs <- function(object, pval_threshold=0.05){ 16 | 17 | if (class(object)[1] != "Tempora"){ 18 | stop("Not a valid Tempora object") 19 | } 20 | if (is.null(object@n.pcs)){ 21 | stop("BuildTrajectory has not been run. See ?Tempora::BuildTrajectory for details") 22 | } 23 | if (is.null(object@cluster.pathways)){ 24 | stop("CalculatePWProfiles has not been run. See ?Tempora::CalculatePWProfiles for details") 25 | } 26 | gsva_bycluster <- object@cluster.pathways 27 | 28 | significant_pathways <- c() 29 | for (i in 1:object@n.pcs){ 30 | genes_scaled <- scale(object@cluster.pathways.dr$rotation[,i]) 31 | significant_pathways <- c(names(which(genes_scaled[,1] > 1.5 | genes_scaled[,1] < -1.5)), significant_pathways) 32 | } 33 | 34 | pca_pathways <- sub("%.*", "", significant_pathways) 35 | pca_pathways <- gsub("\\s*\\([^\\)]+\\)","",pca_pathways) 36 | pca_pathways_cleaned <- gsub("[[:punct:]]", "", pca_pathways) 37 | themes <- pca_pathways_cleaned 38 | 39 | cat("Fitting GAM models...") 40 | 41 | p_vals <- gams <- list() 42 | for (i in 1:length(themes)){ 43 | print(i) 44 | if(length(grep(themes[i], rownames(gsva_bycluster))) == 0) { 45 | p_vals[[i]] <- 1 46 | gams[[i]] <- NA 47 | next 48 | } 49 | if (length(grep(themes[i], rownames(gsva_bycluster))) > 1){ 50 | plot_df <- data.frame(cluster=colnames(gsva_bycluster[grep(themes[i], rownames(gsva_bycluster)), ]), value=colMeans(gsva_bycluster[grep(themes[i], rownames(gsva_bycluster)), ], na.rm=T)) 51 | } else if (length(grep(themes[i], rownames(gsva_bycluster))) == 1){ 52 | plot_df <- data.frame(cluster=names(gsva_bycluster[grep(themes[i], rownames(gsva_bycluster)), ]), value=gsva_bycluster[grep(themes[i], rownames(gsva_bycluster)), ]) } 53 | plot_df$time <- object@cluster.metadata$Cluster_time_score 54 | gams[[i]] <- mgcv::gam(value ~ s(time, k=3, bs='cr'), data=plot_df) 55 | temp_anova <- mgcv::anova.gam(gams[[i]]) 56 | p_vals[[i]] <- temp_anova$s.pv 57 | } 58 | 59 | names(p_vals) <- names(gams) <- themes 60 | 61 | pval_threshold = pval_threshold 62 | p_vals_adj <- p.adjust(unlist(p_vals[which(unlist(p_vals) > 0)]), method = "BH") 63 | varying_pathways <- p_vals_adj[which(p_vals_adj < pval_threshold)] 64 | varying_pathways <- varying_pathways[!duplicated(names(varying_pathways))] 65 | 66 | if (length(varying_pathways)==0){ 67 | cat("No temporally varying pathways detected. Please try running IdentifyVaryingPWs with a more relaxed p-value cutoff.") 68 | #eventhough the function was not successful return the object because in the vignette 69 | # this function call sets the original object to what is returned and if it is null 70 | # you loose all the processing you have done until now. 71 | return(object) 72 | } else { 73 | object@varying.pws <- varying_pathways 74 | object@gams <- gams 75 | return(object) 76 | } 77 | } 78 | 79 | 80 | -------------------------------------------------------------------------------- /R/PlotVaryingPWs.R: -------------------------------------------------------------------------------- 1 | #' Plot temporally changing pathways 2 | #' 3 | #' Plot the expression of temporally changing pathways as identified by IdentifyVaryingPWs() 4 | #' @param object A Tempora object 5 | #' @export 6 | #' @importFrom mgcv gam anova.gam plot.gam 7 | #' @importFrom methods new validObject 8 | #' @importFrom stats p.adjust prcomp screeplot 9 | #' @importFrom grDevices colorRampPalette 10 | #' @importFrom graphics axis legend par points text 11 | #' @importFrom reshape2 dcast 12 | #' @examples \dontrun{tempora_data <- IdentifyVaryingPWs(tempora_data, pval_threshold = 0.05)} 13 | PlotVaryingPWs <- function(object){ 14 | 15 | if (class(object)[1] != "Tempora"){ 16 | stop("Not a valid Tempora object") 17 | } 18 | if (is.null(object@varying.pws)){ 19 | stop("IdentifyVaryingPWs has not been run or no temporally varying pathways were detected. Please run IdentifyVaryingPWs or re-run with a more relaxed p-value cutoff See ?Tempora::IdentifyVaryingPWs for details") 20 | } 21 | 22 | varying_pathways <- object@varying.pws 23 | gsva_bycluster <- object@cluster.pathways 24 | gams <- object@gams 25 | 26 | cat("\nPlotting time-dependent pathways...") 27 | 28 | for (i in 1:length(varying_pathways)){ 29 | if (length(grep(names(varying_pathways)[i], rownames(gsva_bycluster))) > 1){ 30 | plot_df <- data.frame(cluster=colnames(gsva_bycluster[grep(names(varying_pathways)[i], rownames(gsva_bycluster)), ]), value=colMeans(gsva_bycluster[grep(names(varying_pathways)[i], rownames(gsva_bycluster)), ])) 31 | plot_df$time <- object@cluster.metadata$Cluster_time_score 32 | } 33 | else if (length(grep(names(varying_pathways)[i], rownames(gsva_bycluster))) == 1) { 34 | plot_df <- data.frame(cluster=names(gsva_bycluster[grep(names(varying_pathways)[i], rownames(gsva_bycluster)), ]), value=gsva_bycluster[grep(names(varying_pathways)[i], rownames(gsva_bycluster)), ]) 35 | plot_df$time <- object@cluster.metadata$Cluster_time_score 36 | } 37 | id <- which(names(gams)==names(varying_pathways)[i]) 38 | mgcv::plot.gam(gams[[id[1]]], main = paste0(names(varying_pathways)[i]), xlab = "Inferred time", ylab="Pathway expression level", bty="l", 39 | cex.main = 1, xaxt = "n", shade= F, se=3, scheme=1) 40 | xmin <- par("usr")[1] 41 | xmax <- par("usr")[2] 42 | points(x=plot_df$time, y=plot_df$value, pch=20, col="navy", cex=0.9) 43 | text(x=plot_df$time, y=plot_df$value, labels=plot_df[,1], pos = 4, cex = 1, col="navy") 44 | legend("topright", legend = "Cluster", pch = 20, col = "navy", bty="n", text.col="navy", cex=0.9) 45 | legend("topright", legend=paste0("\nAdjusted p-value = ", round(varying_pathways[[i]], 5)), bty="n", cex=0.9) 46 | axis(side=1, at=c(xmin, xmax), labels = c("Early", "Late"), tick=T) 47 | } 48 | 49 | } 50 | -------------------------------------------------------------------------------- /R/TemporaObjClass.R: -------------------------------------------------------------------------------- 1 | 2 | ###define S4 class 3 | #' Define a class of Tempora object 4 | #' 5 | #' A Tempora object contains the input gene expression matrix and metadata, as well as stores the meta data of each cluster, 6 | #' the clusters' pathway enrichment profiles, the constructed trajectory as well as the Sugiyama layout for the trajectory plot 7 | #' 8 | #' @name Tempora-class 9 | #' @rdname Tempora-class 10 | #' @exportClass Tempora 11 | #' 12 | #' @slot data A gene expression matrix (genes x cells), often aggregated from multiple time points 13 | #' @slot meta.data A dataframe containing the metadata for the cells in the gene expression matrix, which at minimum includes the 14 | #' collection timepoint and cluster identity of each cell 15 | #' @slot cluster.metadata A dataframe containing the metadata for each cell cluster 16 | #' @slot cluster.pathways A dataframe containing the pathway enrichment profile of each cluster as calculated by \code{\link{CalculatePWProfiles}} 17 | #' @slot cluster.pathways.dr A prcomp object containing the PCA of the clusters' pathway enrichment profiles 18 | #' @slot n.pcs The number of principal components to be used in trajectory construction 19 | #' @slot trajectory A dataframe describing the inferred trajectory as inferred by \code{\link{BuildTrajectory}} 20 | #' @slot layouts A matrix containing the Sugiyama layout of the trajectory to be used in \code{\link{PlotTrajectory}} 21 | #' @slot varying.pws A list of temporally varying pathways identified by \code{\link{IdentifyVaryingPWs}} 22 | Tempora <- setClass( 23 | "Tempora", 24 | slots = c( 25 | data = "matrix", 26 | meta.data = "data.frame", 27 | cluster.metadata = "data.frame", 28 | cluster.pathways = "matrix", 29 | cluster.pathways.dr = "ANY", 30 | n.pcs = "numeric", 31 | trajectory = "data.frame", 32 | layouts = "matrix", 33 | varying.pws = "ANY", 34 | gams = "ANY" 35 | )) 36 | setValidity("Tempora", function(object) 37 | { 38 | if(nrow(object@meta.data) != ncol(object@data)){ 39 | return("The numbers of cells in the expression matrix and metadata are different") 40 | } 41 | if (any(!rownames(object@meta.data) %in% colnames(object@data))){ 42 | return("Cell names in the expression matrix and metadata are different") 43 | } 44 | return(TRUE) 45 | } 46 | ) 47 | 48 | 49 | ###accessors 50 | 51 | #' Data method 52 | #' @name Tempora-class 53 | #' @rdname Tempora-class 54 | #' @exportMethod data 55 | setGeneric("data", function(x) standardGeneric("data")) 56 | #' @name Tempora-class 57 | #' @rdname Tempora-class 58 | #' @exportMethod data<- 59 | setGeneric("data<-", function(x, value) standardGeneric("data<-")) 60 | #' Extract data from Tempora object 61 | #' 62 | #' @rdname Tempora-class 63 | #' @aliases data 64 | #' @param x Tempora object 65 | setMethod("data", "Tempora", function(x) x@data) 66 | 67 | #' @rdname Tempora-class 68 | #' @aliases data<- 69 | #' @param value New value 70 | setMethod("data<-", "Tempora", function(x, value) { 71 | x@data <- value 72 | validObject(x) 73 | return(x) 74 | }) 75 | 76 | #' Metadata method 77 | #' @name Tempora-class 78 | #' @rdname Tempora-class 79 | #' @exportMethod meta.data 80 | setGeneric("meta.data", function(x) standardGeneric("meta.data")) 81 | #' @name Tempora-class 82 | #' @rdname Tempora-class 83 | #' @exportMethod meta.data<- 84 | setGeneric("meta.data<-", function(x, value) standardGeneric("meta.data<-")) 85 | #' Extract metadata from Tempora object 86 | #' 87 | #' @rdname Tempora-class 88 | #' @aliases meta.data 89 | setMethod("meta.data", "Tempora", function(x) x@meta.data) 90 | #' Extract metadata from Tempora object 91 | #' 92 | #' @rdname Tempora-class 93 | #' @aliases meta.data<- 94 | setMethod("meta.data<-", "Tempora", function(x, value) { 95 | x@meta.data <- value 96 | validObject(x) 97 | return(x) 98 | }) 99 | 100 | #' Cluster metadata method 101 | #' @name Tempora-class 102 | #' @rdname Tempora-class 103 | #' @exportMethod cluster.metadata 104 | setGeneric("cluster.metadata", function(x) standardGeneric("cluster.metadata")) 105 | #' @name Tempora-class 106 | #' @rdname Tempora-class 107 | #' @exportMethod cluster.metadata<- 108 | setGeneric("cluster.metadata<-", function(x, value) standardGeneric("cluster.metadata<-")) 109 | #' Extract cluster metadata from Tempora object 110 | #' 111 | #' @rdname Tempora-class 112 | #' @aliases cluster.metadata 113 | setMethod("cluster.metadata", "Tempora", function(x) x@cluster.metadata) 114 | #' @rdname Tempora-class 115 | #' @aliases cluster.metadata<- 116 | setMethod("cluster.metadata<-", "Tempora", function(x, value) { 117 | x@cluster.metadata <- data.frame 118 | validObject(x) 119 | return(x) 120 | }) 121 | 122 | 123 | #' Cluster pathway enrichment profile method 124 | #' @name Tempora-class 125 | #' @rdname Tempora-class 126 | #' @exportMethod cluster.pathways 127 | setGeneric("cluster.pathways", function(x) standardGeneric("cluster.pathways")) 128 | #' @name Tempora-class 129 | #' @rdname Tempora-class 130 | #' @exportMethod cluster.pathways<- 131 | setGeneric("cluster.pathways<-", function(x, value) standardGeneric("cluster.pathways<-")) 132 | #' Extract cluster pathway enrichment profiles from Tempora object 133 | #' 134 | #' @rdname Tempora-class 135 | #' @aliases cluster.pathways 136 | setMethod("cluster.pathways", "Tempora", function(x) x@cluster.pathways) 137 | #' @rdname Tempora-class 138 | #' @aliases cluster.pathways<- 139 | setMethod("cluster.pathways<-", "Tempora", function(x, value) { 140 | x@cluster.pathway <- value 141 | validObject(x) 142 | return(x) 143 | }) 144 | 145 | #' Dimension reduction of cluster pathway enrichment profile method 146 | #' @name Tempora-class 147 | #' @rdname Tempora-class 148 | #' @exportMethod cluster.pathways.dr 149 | setGeneric("cluster.pathways.dr", function(x) standardGeneric("cluster.pathways.dr")) 150 | #' @name Tempora-class 151 | #' @rdname Tempora-class 152 | #' @exportMethod cluster.pathways.dr<- 153 | setGeneric("cluster.pathways.dr<-", function(x, value) standardGeneric("cluster.pathways.dr<-")) 154 | #' Extract PCA of cluster pathway enrichment profiles from Tempora object 155 | #' 156 | #' @rdname Tempora-class 157 | #' @aliases cluster.pathways.dr 158 | setMethod("cluster.pathways.dr", "Tempora", function(x) x@cluster.pathways.dr) 159 | #' @rdname Tempora-class 160 | #' @aliases cluster.pathways.dr<- 161 | setMethod("cluster.pathways.dr<-", "Tempora", function(x, value) { 162 | x@cluster.pathways.dr <- value 163 | validObject(x) 164 | return(x) 165 | }) 166 | 167 | #' Varying pathways method 168 | #' @name Tempora-class 169 | #' @rdname Tempora-class 170 | #' @exportMethod varying.pws 171 | setGeneric("varying.pws", function(x) standardGeneric("varying.pws")) 172 | #' @name Tempora-class 173 | #' @rdname Tempora-class 174 | #' @exportMethod varying.pws<- 175 | setGeneric("varying.pws<-", function(x, value) standardGeneric("varying.pws<-")) 176 | #' Extract varying pathways from Tempora object 177 | #' 178 | #' @rdname Tempora-class 179 | #' @aliases varying.pws 180 | setMethod("varying.pws", "Tempora", function(x) x@varying.pws) 181 | #' @rdname Tempora-class 182 | #' @aliases varying.pws<- 183 | setMethod("varying.pws<-", "Tempora", function(x, value) { 184 | x@varying.pws <- value 185 | validObject(x) 186 | return(x) 187 | }) 188 | 189 | 190 | #' GAMs method 191 | #' @name Tempora-class 192 | #' @rdname Tempora-class 193 | #' @exportMethod gams 194 | setGeneric("gams", function(x) standardGeneric("gams")) 195 | #' @name Tempora-class 196 | #' @rdname Tempora-class 197 | #' @exportMethod gams<- 198 | setGeneric("gams<-", function(x, value) standardGeneric("gams<-")) 199 | #' Extract calculated GAMs from Tempora object 200 | #' 201 | #' @rdname Tempora-class 202 | #' @aliases gams 203 | setMethod("gams", "Tempora", function(x) x@gams) 204 | #' @rdname Tempora-class 205 | #' @aliases gams<- 206 | setMethod("gams<-", "Tempora", function(x, value) { 207 | x@gams <- value 208 | validObject(x) 209 | return(x) 210 | }) 211 | 212 | 213 | #' Number of PCs to use in trajectory construction method 214 | #' @name Tempora-class 215 | #' @rdname Tempora-class 216 | #' @exportMethod n.pcs 217 | setGeneric("n.pcs", function(x) standardGeneric("n.pcs")) 218 | #' @name Tempora-class 219 | #' @rdname Tempora-class 220 | #' @exportMethod n.pcs<- 221 | setGeneric("n.pcs<-", function(x, value) standardGeneric("n.pcs<-")) 222 | #' Extract number of PCA of cluster pathway enrichment profiles to use from Tempora object 223 | #' 224 | #' @rdname Tempora-class 225 | #' @aliases n.pcs 226 | setMethod("n.pcs", "Tempora", function(x) x@cluster.pathways.dr) 227 | #' @rdname Tempora-class 228 | #' @aliases n.pcs<- 229 | setMethod("n.pcs<-", "Tempora", function(x, value) { 230 | x@n.pcs <- value 231 | validObject(x) 232 | return(x) 233 | }) 234 | 235 | #' Trajectory method 236 | #' @name Tempora-class 237 | #' @rdname Tempora-class 238 | #' @exportMethod trajectory 239 | setGeneric("trajectory", function(x) standardGeneric("trajectory")) 240 | #' @name Tempora-class 241 | #' @rdname Tempora-class 242 | #' @exportMethod trajectory 243 | setGeneric("trajectory<-", function(x, value) standardGeneric("trajectory<-")) 244 | #' Extract the constructed trajectory from Tempora object 245 | #' 246 | #' @rdname Tempora-class 247 | #' @aliases trajectory 248 | setMethod("trajectory", "Tempora", function(x) x@trajectory) 249 | #' @rdname Tempora-class 250 | #' @aliases trajectory<- 251 | setMethod("trajectory<-", "Tempora", function(x, value) { 252 | x@trajectory <- value 253 | validObject(x) 254 | return(x) 255 | }) 256 | 257 | #' Trajectory layout method 258 | #' @name Tempora-class 259 | #' @rdname Tempora-class 260 | #' @exportMethod layouts 261 | setGeneric("layouts", function(x) standardGeneric("layouts")) 262 | #' @name Tempora-class 263 | #' @rdname Tempora-class 264 | #' @exportMethod layouts<- 265 | setGeneric("layouts<-", function(x, value) standardGeneric("layouts<-")) 266 | #' Extract the layout of the trajectory from Tempora object 267 | #' 268 | #' @rdname Tempora-class 269 | #' @aliases layouts 270 | setMethod("layouts", "Tempora", function(x) x@layouts) 271 | #' @rdname Tempora-class 272 | #' @aliases layouts<- 273 | setMethod("layouts<-", "Tempora", function(x, value) { 274 | x@layouts <- value 275 | validObject(x) 276 | return(x) 277 | }) 278 | 279 | 280 | ###Create new Tempora object from expression matrix 281 | #' Create new Tempora object from a processed gene expression matrix 282 | #' @param exprMatrix A normalized gene expression matrix containing cells from all timepoints of the time-series study. Batch effect correction is highly recommended before normalization. 283 | #' @param meta.data A dataframe of meta data for all cells in the expression matrix. Each column is a feature and each row stores single cell information. At minimum, this dataframe should contain two columns: a "Clusters" column storing the clustering identity and a "Timepoints" column storing the timepoint when each cell comes from 284 | #' @param timepoint_order An ordered vector of timepoint names from early to late 285 | #' @param cluster_labels (Optional) A vector of cluster annotations (cell types, cell states, cell cycles, etc.), ordered alphanumerically by cluster names. If NULL and \code{cell_markers} is given, automatic cluster annotation using GSVA will be performed. If both are NULL, cluster numbers will be used to label the trajectory plot. 286 | #' @param cell_markers (Optional) A list of possible cell types found in the dataset and their marker genes to be used for automatic cell type identification. If NULL and no \code{cluster_labels} is given, cluster numbers will be used to label the trajectory plot. 287 | #' @param threshold (Optional) Threshold of GSVA score quantile. Cell types with GSVA scores in this quantile or higher compared to all other cell type scores for the same cluster would be included in cluster label. Numeric between 0-1, default to 0.9. 288 | #' @export 289 | #' @importFrom methods new validObject 290 | #' @importFrom stats p.adjust prcomp screeplot 291 | #' @importFrom reshape2 dcast 292 | #' @return A Tempora object containing the expression matrix and metadata 293 | #' @examples \dontrun{tempora_dara <- CreateTemporaObject(exprMatrix, meta.data)} 294 | #' 295 | CreateTemporaObject <- function(exprMatrix, meta.data, timepoint_order, cluster_labels=NULL, cell_markers=NULL, threshold=0.9){ 296 | 297 | if (!'Timepoints' %in% colnames(meta.data)){ 298 | stop("meta.data needs to contain a column named 'Timepoints' for downstream analyses") 299 | } 300 | if (!'Clusters' %in% colnames(meta.data)){ 301 | stop("meta.data needs to contain a column named 'Clusters' for downstream analyses") 302 | } 303 | if (!is.numeric(exprMatrix)) { 304 | stop("Expression matrix is not numeric") 305 | } 306 | if (any(!meta.data$Timepoints %in% timepoint_order)){ 307 | stop("List of timepoints does not match the timepoints in the data") 308 | } 309 | 310 | meta.data$Timepoints <- factor(meta.data$Timepoints, levels = timepoint_order) 311 | meta.data$Timescore <- as.integer(meta.data$Timepoints) 312 | 313 | clustmd <- meta.data[, c("Timepoints", "Clusters")] 314 | clustmd <- dcast(clustmd, Clusters~Timepoints, value.var = "Clusters", fun.aggregate = length) 315 | clustmd[, 2:ncol(clustmd)] <- t(apply(clustmd[, 2:ncol(clustmd)], 1, function(x) x/sum(x))) 316 | clustmd$Cluster_time_score <- apply(clustmd[, 2:ncol(clustmd)], 1, 317 | function(x) sum(mapply(function(t, y) t*y, as.numeric(x), sort(unique(meta.data$Timescore), decreasing = F)))) 318 | colnames(clustmd)[1] <- "Id" 319 | if (!is.null(cluster_labels) & !is.null(cell_markers)){ 320 | clustmd$label <- paste0("Cluster ", paste(rownames(clustmd), cluster_labels, sep="-")) 321 | } else if (!is.null(cluster_labels)){ 322 | clustmd$label <- paste0("Cluster ", paste(rownames(clustmd), cluster_labels, sep="-")) 323 | } else if (!is.null(cell_markers)) { 324 | cat("\nPerforming automated cluster labeling with GSVA...") 325 | cluster_number <- as.numeric(meta.data$Clusters) 326 | names(cluster_number) <- rownames(meta.data) 327 | cluster_labels <- IdentifyCellTypes(exprMatrix, cluster_labels=cluster_number, cell_markers=cell_markers, threshold=threshold) 328 | clustmd$label <- paste0("Cluster ", paste(rownames(clustmd), cluster_labels, sep="-")) 329 | } else { 330 | clustmd$label <- paste("Cluster ", rownames(clustmd)) 331 | } 332 | 333 | tempora <- new("Tempora", 334 | data = exprMatrix, 335 | meta.data = meta.data, 336 | cluster.metadata = clustmd) 337 | 338 | validObject(tempora) 339 | return(tempora) 340 | } 341 | 342 | 343 | #' Import data from a Seurat object 344 | #' 345 | #' Imports gene expression matrix and other metadata from a seurat object 346 | #' @param seuratobj A Seurat or SingleCellExperiment object containing the normalized gene expression matrix and clustering result 347 | #' @param assayType A length-one character vector representing the assay object 348 | #' in which the expression data is stored in the input object. For Seurat v1 349 | #' or v2 objects, set this to "". For Seurat v3 objects, this is often "RNA". 350 | #' For SingleCellExperiment objects, this is often "logcounts". 351 | #' @param assaySlot An optional length-one character vector representing the 352 | #' slot of the Seurat v3 \code{\link[Seurat]{Assay}} object to use. In Seurat 353 | #' v3, normalized data is stored in the "data" slot, and counts in the 354 | #' "counts" slot. 355 | #' @param clusters Name of the column in the meta.data dataframe containing the cluster identity of all cells in the dataset 356 | #' @param timepoints Name of the column in the meta.data dataframe containing the collection time of all cells in the dataset 357 | #' @param timepoint_order An ordered vector of timepoint names from early to late 358 | #' @param cluster_labels (Optional) A vector of cluster annotations (cell types, cell states, cell cycles, etc.), ordered alphanumerically by cluster names. If NULL and \code{cell_markers} is given, automatic cluster annotation using GSVA will be performed. If both are NULL, cluster numbers will be used to label the trajectory plot. 359 | #' @param cell_markers (Optional) A list of possible cell types found in the dataset and their marker genes to be used for automatic cell type identification. If NULL and no \code{cluster_labels} is given, cluster numbers will be used to label the trajectory plot. 360 | #' @param threshold (Optional) Threshold of GSVA score quantile. Cell types with GSVA scores in this quantile or higher compared to all other cell type scores for the same cluster would be included in cluster label. Numeric between 0-1, default to 0.9. 361 | #' @include dataAccess.R 362 | 363 | #' @export 364 | #' @importFrom methods new validObject 365 | #' @importFrom stats p.adjust prcomp screeplot 366 | #' @importFrom reshape2 dcast 367 | #' @return A Tempora object containing the expression matrix and metadata 368 | #' @examples \dontrun{tempora_data <- ImportSeuratObject(seurat_object, assayType="", clusters = "res.0.3", timepoints = "collection_time", 369 | #' timepoint_order = c("0H", "24H", "48H", "72H"), cluster_labels = c("Stem cells", "Differentiated cells"))} 370 | 371 | 372 | ImportSeuratObject <- function(seuratobj, assayType = "", assaySlot = NA, 373 | clusters, timepoints, timepoint_order, cluster_labels=NULL, cell_markers=NULL, threshold=0.9){ 374 | # if(class(seuratobj)[1]=='seurat'){ 375 | # requireNamespace("Seurat") 376 | # } else { 377 | # stop("Not a Seurat object. Tempora only supports importing Seurat objects at the moment. See ?Tempora::CreateTemporaObject to manually create a Tempora object from an expression matrix") 378 | # } 379 | data <- getExpr(seuratobj,assayType,assaySlot) 380 | if (! is.numeric(data)) { 381 | data <- as.matrix(data) 382 | # often Seurat / SingleCellExperiment data matrices are stored as sparse 383 | # Matrix::dgCMatrix objects. Since the S4 class requires this object to be 384 | # numeric, we must convert them to traditional numeric R matrices, despite 385 | # the increased memory costs this entails. Or allow sparse matrices in the 386 | # S4 class, but that would involve some further debugging... 387 | 388 | } 389 | cat("Extracting data...") 390 | metadata <- getMD(seuratobj) 391 | cat("\nExtracting metadata...") 392 | colnames(metadata)[which(colnames(metadata)==clusters)] <- "Clusters" 393 | colnames(metadata)[which(colnames(metadata)==timepoints)] <- "Timepoints" 394 | cat("\nCreating Tempora object...") 395 | tempora_obj <- CreateTemporaObject(data, meta.data = metadata, timepoint_order = timepoint_order, cluster_labels = cluster_labels, cell_markers = cell_markers, threshold=threshold) 396 | validObject(tempora_obj) 397 | return(tempora_obj) 398 | } 399 | 400 | 401 | 402 | -------------------------------------------------------------------------------- /R/Visualize.R: -------------------------------------------------------------------------------- 1 | #' Visualize the trajectory 2 | #' 3 | #' Reduce the dimensionality of the pathway enrichment matrix and build the information-based cluster-cluster network 4 | #' @param object A Tempora object 5 | #' @param layout A 2-column matrix containing the x- and y-coordinates of all vertices in the graph. If NULL, the layout will be obtained using Sugiyama layout 6 | #' @param ... Any additional arguments to plot.igraph 7 | #' @examples \dontrun{tempora_data <- PlotTrajectory(tempora_data)} 8 | #' @export 9 | #' @importFrom RColorBrewer brewer.pal 10 | #' @importFrom igraph layout_with_sugiyama graph_from_data_frame plot.igraph E 11 | #' @importFrom grDevices colorRampPalette 12 | #' @importFrom graphics axis legend par points text 13 | #' @importFrom methods new validObject 14 | #' @importFrom stats p.adjust prcomp screeplot 15 | #' @importFrom scales rescale 16 | #' @importFrom reshape2 dcast 17 | #' 18 | #' 19 | PlotTrajectory <- function(object, layout=NULL, ...){ 20 | 21 | if (class(object)[1] != "Tempora"){ 22 | stop("Not a valid Tempora object") 23 | } 24 | 25 | if (is.null(object@trajectory)){ 26 | stop("BuildTrajectory has not been run. See ?Tempora::BuildTrajectory for details") 27 | } 28 | 29 | edge_graph <- igraph::graph_from_data_frame(d=object@trajectory, vertices = object@cluster.metadata, directed = T) 30 | 31 | if (is.null(layout)){ 32 | l <- igraph::layout_with_sugiyama(edge_graph, layers = object@cluster.metadata$Cluster_time_score, maxiter = 1000) 33 | #l$layout[,2] <- 3-(rescale(object@cluster.metadata$Cluster_time_score, to=c(0,3))) 34 | if (length(levels(object@meta.data$Timepoints)) > 9){ 35 | colours <- colorRampPalette(RColorBrewer::brewer.pal(7, "YlOrRd")) 36 | plot.igraph(edge_graph, ylim=c(-1,1), layout = l$layout, ylab = "Inferred time", vertex.shape = "pie", vertex.pie = lapply(1:nrow(object@cluster.metadata), function(x) as.numeric(object@cluster.metadata[x,2:((length(levels(object@meta.data$Timepoints)))+1)])), 37 | vertex.pie.color=list(colours(length(levels(object@meta.data$Timepoints)))), pie.border=list(rep("white", 4)), vertex.frame.color="white", edge.arrow.size = 0.5, edge.width = 1.5, vertex.label.family="Arial", 38 | vertex.label.color="black", edge.lty = E(edge_graph)$type, ...) 39 | axis(side=2, at=c(-1,1), labels=c("Late","Early"), las=1) 40 | legend("topleft", legend = levels(object@meta.data$Timepoints), fill=colours, bty = "n", border="black") 41 | } else { 42 | colours <- brewer.pal(length(levels(object@meta.data$Timepoints)), "YlOrRd") 43 | plot.igraph(edge_graph, ylim=c(-1,1), ylab = "Inferred time", layout = l$layout, vertex.shape = "pie", vertex.pie = lapply(1:nrow(object@cluster.metadata), function(x) as.numeric(object@cluster.metadata[x,2:((length(levels(object@meta.data$Timepoints)))+1)])), 44 | vertex.pie.color=list(colours), pie.border=list(rep("white", length(levels(object@meta.data$Timepoints)))), vertex.frame.color="white", 45 | vertex.label.family="Arial", vertex.label.color="black", edge.lty = E(edge_graph)$type,...) 46 | legend("topleft", legend = levels(object@meta.data$Timepoints), fill=colours, bty = "n", border = "black") 47 | axis(side=2, at=c(-1,1), labels=c("Late","Early"), las=1) 48 | } 49 | object@layouts <- l$layout 50 | 51 | } else { 52 | if (length(levels(object@meta.data$Timepoints)) > 9){ 53 | colours <- colorRampPalette(RColorBrewer::brewer.pal(7, "YlOrRd")) 54 | plot.igraph(edge_graph, ylim=c(-1,1), layout = layout, ylab = "Inferred time", vertex.shape = "pie", vertex.pie = lapply(1:nrow(object@cluster.metadata), function(x) as.numeric(object@cluster.metadata[x,2:((length(levels(object@meta.data$Timepoints)))+1)])), 55 | vertex.pie.color=list(colours(length(levels(object@meta.data$Timepoints)))), pie.border=list(rep("white", 4)), vertex.frame.color="white", edge.arrow.size = 0.5, edge.width = 1.5, vertex.label.family="Arial", 56 | vertex.label.color="black", edge.lty = E(edge_graph)$type, ...) 57 | axis(side=2, at=c(-1,1), labels=c("Late","Early"), las=1) 58 | legend("topleft", legend = levels(object@meta.data$Timepoints), fill=colours, bty = "n", border="black") 59 | } else { 60 | colours <- brewer.pal(length(levels(object@meta.data$Timepoints)), "YlOrRd") 61 | plot.igraph(edge_graph, ylim=c(-1,1), ylab = "Inferred time", layout = layout, vertex.shape = "pie", vertex.pie = lapply(1:nrow(object@cluster.metadata), function(x) as.numeric(object@cluster.metadata[x,2:((length(levels(object@meta.data$Timepoints)))+1)])), 62 | vertex.pie.color=list(colours), pie.border=list(rep("white", length(levels(object@meta.data$Timepoints)))), vertex.frame.color="white", 63 | vertex.label.family="Arial", vertex.label.color="black", edge.lty = E(edge_graph)$type,...) 64 | legend("topleft", legend = levels(object@meta.data$Timepoints), fill=colours, bty = "n", border = "black") 65 | axis(side=2, at=c(-1,1), labels=c("Late","Early"), las=1) 66 | } 67 | } 68 | 69 | validObject(object) 70 | return(object) 71 | } 72 | 73 | 74 | 75 | -------------------------------------------------------------------------------- /R/dataAccess.R: -------------------------------------------------------------------------------- 1 | # Generics & methods for loading data from various single-cell data classes. 2 | 3 | # Generics ---- 4 | 5 | # ^ getExpr ---- 6 | 7 | #' Get gene expression matrix from input data object 8 | #' 9 | #' Extract the gene expression matrix from a single-cell data object containing 10 | #' the input data for scClustViz visualization. 11 | #' 12 | #' This is a wrapper function to the relevant class's normalized data slot 13 | #' accessor method. Currently supported input object classes: \itemize{ \item 14 | #' Class \code{\link[Seurat]{seurat}/\link[Seurat]{Seurat}} stored in 15 | #' \code{x@data} or \code{x@assays[[assayType]]@assaySlot}, depending on Seurat 16 | #' object version. \item Class 17 | #' \code{\link[SingleCellExperiment]{SingleCellExperiment}} accessed by 18 | #' \code{\link[SummarizedExperiment]{assay}(x,assayType)}. } 19 | #' \href{https://github.com/BaderLab/scClustViz/issues}{Please submit requests 20 | #' for other data objects here!} 21 | #' 22 | #' @param x The single-cell data object. 23 | #' @param assayType A length-one character vector representing the assay object 24 | #' in which the expression data is stored in the input object. For Seurat v1 25 | #' or v2 objects, set this to "". For Seurat v3 objects, this is often "RNA". 26 | #' For SingleCellExperiment objects, this is often "logcounts". See Details 27 | #' for how this argument is used in the accessor functions for each class. 28 | #' @param assaySlot An optional length-one character vector representing the 29 | #' slot of the Seurat v3 \code{\link[Seurat]{Assay}} object to use. In Seurat 30 | #' v3, normalized data is stored in the "data" slot, and counts in the 31 | #' "counts" slot. See Details for how this argument is used in the accessor 32 | #' functions for each class. 33 | #' @name getExpr 34 | #' @author Contributed by Brendan Innes from the BaderLab/scClustViz package 35 | #' @export 36 | #' 37 | setGeneric("getExpr",function(x,assayType,assaySlot) standardGeneric("getExpr")) 38 | 39 | 40 | # ^ getMD ---- 41 | 42 | #' Get metadata from input data object 43 | #' 44 | #' Extract the cell metadata data frame from a single-cell data object 45 | #' containing the input data for scClustViz visualization. 46 | #' 47 | #' This is a wrapper function to the relevant class's cell metadata slot 48 | #' accessor / assignment method. Currently supported input object classes: 49 | #' \itemize{ 50 | #' \item Class \code{\link[Seurat]{seurat}/\link[Seurat]{Seurat}} accessed by 51 | #' \code{x@data.info} or \code{x@meta.data}, 52 | #' depending on Seurat object version. 53 | #' \item Class \code{\link[SingleCellExperiment]{SingleCellExperiment}} 54 | #' accessed by \code{\link[SingleCellExperiment]{colData}(x)}. 55 | #' } 56 | #' \href{https://github.com/BaderLab/scClustViz/issues}{Please submit requests 57 | #' for other data objects here!} 58 | #' 59 | #' @param x The single-cell data object. 60 | #' @name getMD 61 | #' @author Contributed by Brendan Innes from the BaderLab/scClustViz package 62 | #' @export 63 | #' 64 | setGeneric("getMD",function(x) standardGeneric("getMD")) 65 | 66 | 67 | # Methods ---- 68 | 69 | # ^ seurat (v1/2) ---- 70 | suppressMessages( 71 | setMethod("getExpr","seurat",function(x) { 72 | slot(x,"data") 73 | }) 74 | ) 75 | 76 | 77 | suppressMessages( 78 | setMethod("getMD","seurat",function(x) { 79 | if (.hasSlot(x,"meta.data")) { 80 | slot(x,"meta.data") 81 | } else { 82 | slot(x,"data.info") #Seurat v1 83 | } 84 | }) 85 | ) 86 | 87 | 88 | # ^ Seurat (v3) ---- 89 | suppressMessages( 90 | setMethod("getExpr","Seurat",function(x,assayType,assaySlot) { 91 | if (missing(assayType)) { 92 | stop(paste(paste0("assayType must be specified."), 93 | "The following assay data are available in this object:", 94 | paste0(names(slot(x,"assays")),collapse=", "),sep="\n ")) 95 | } 96 | if (assayType %in% names(slot(x,"assays"))) { 97 | if (!missing(assaySlot)) { 98 | if (is.na(assaySlot) | assaySlot == "") { 99 | return(x@assays[[assayType]]@data) 100 | } else { 101 | return(slot(x@assays[[assayType]],assaySlot)) 102 | } 103 | } else { 104 | return(x@assays[[assayType]]@data) 105 | } 106 | } else { 107 | stop(paste(paste0("assayType '",assayType,"' not found."), 108 | "The following assay data are available in this object:", 109 | paste0(names(slot(x,"assays")),collapse=", "),sep="\n ")) 110 | } 111 | }) 112 | ) 113 | 114 | 115 | suppressMessages( 116 | setMethod("getMD","Seurat",function(x) { 117 | return(slot(x,"meta.data")) 118 | }) 119 | ) 120 | 121 | 122 | # ^ SingleCellExperiment ---- 123 | suppressMessages( 124 | setMethod("getExpr","SingleCellExperiment",function(x,assayType) { 125 | if (missing(assayType)) { 126 | stop(paste(paste0("assayType must be specified."), 127 | "The following assay data are available in this object:", 128 | paste0(SummarizedExperiment::assayNames(x),collapse=", "), 129 | sep="\n ")) 130 | } 131 | if (assayType %in% SummarizedExperiment::assayNames(x)) { 132 | return(SummarizedExperiment::assay(x,assayType)) 133 | } else { 134 | stop(paste(paste0("assayType '",assayType,"' not found."), 135 | "The following assay data are available in this object:", 136 | paste0(SummarizedExperiment::assayNames(x),collapse=", "), 137 | sep="\n ")) 138 | } 139 | }) 140 | ) 141 | 142 | 143 | suppressMessages( 144 | setMethod("getMD","SingleCellExperiment", 145 | function(x) SingleCellExperiment::colData(x)) 146 | ) 147 | -------------------------------------------------------------------------------- /README.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Tempora: cell trajectory inference using time-series single-cell RNA sequencing data" 3 | output: 4 | github_document: 5 | toc: true 6 | toc_depth: 3 7 | --- 8 | 9 | ```{r, include = FALSE} 10 | knitr::opts_chunk$set( 11 | collapse = TRUE, 12 | comment = "#>" 13 | ) 14 | ``` 15 | 16 | ## Introduction 17 | 18 | Tempora is a novel cell trajectory inference method that orders cells using time information from time-series scRNAseq data. Tempora uses biological pathway information to help identify cell type relationships and can identify important time-dependent pathways to help interpret the inferred trajectory. 19 | 20 | ## Usage 21 | 22 | ### Installation 23 | 24 | You can install Tempora using devtools: 25 | 26 | ```{r eval=FALSE} 27 | # install devtools 28 | install.packages("devtools") 29 | 30 | # install Tempora 31 | devtools::install_github("BaderLab/Tempora") 32 | 33 | library(Tempora) 34 | 35 | ``` 36 | 37 | ### Sample data 38 | 39 | The Tempora package was validated using three datasets: an _in vitro_ differentiation of human skeletal muscle myoblasts, _in vivo_ early development of murine cerebral cortex and _in vivo_ embryonic and postnatal development of murine cerebellum. These processed datasets can be accessed on the Bader Lab website at https://www.baderlab.org/Software/Tempora. 40 | 41 | The MouseCortex dataset will be used in this vignette as an example. 42 | 43 | ### Input data 44 | 45 | Tempora takes processed scRNAseq data as input, either as a gene expression matrix with separate time and cluster labels for all cells, or a Seurat or SingleCellExperiment object containing gene expression data and a clustering result. Tempora does not implement clustering or batch effect correction as part of its pipeline and assumes that the user has input a well-annotated cluster solution free of batch effect into the method. 46 | 47 | ```{r eval=F} 48 | #Load MouseCortex sample data 49 | load("MouseCortex.RData") 50 | ``` 51 | 52 | We can the import the Seurat object containing the murine cerebral cortex development data into a Tempora object to start the analysis. Here, as the clusters have been manually annotated prior to running Tempora, a vector of cluster label is given to the function. If you have yet to annotate your clusters but have a list of marker genes for expected cell types in the data, you can input the list of marker genes to this function to run automated cluster labeling with GSVA. 53 | 54 | ```{r eval=FALSE, tidy=F} 55 | #Import MouseCortex data 56 | #As this is a Seurat v2 object, set assayType to "" 57 | #See ?Tempora::ImportSeuratObject for additional arguments to import Seurat v3 or SingleCellExperiment obbjects 58 | cortex_tempora <- ImportSeuratObject(MouseCortex, clusters = "res.0.6", 59 | timepoints = "Time_points", 60 | assayType = "", 61 | cluster_labels = c("Neurons","Young neurons","APs/RPs", 62 | "IPs","APs/RPs", "Young neurons", "IPs"), 63 | timepoint_order = c("e11", "e13", "e15", "e17")) 64 | ``` 65 | 66 | From the specified clustering result, Tempora will automatically calculate the temporal score of each cluster, which is based on its composition of cells from each timepoint. This information will be stored in the _cluster.metadata_ slot of the Tempora object. 67 | 68 | ### Calculate clusters' pathway enrichment profiles 69 | 70 | Next, the pathway enrichment profiles of the clusters are calculated using [GSVA](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-14-7) and stored in the _cluster.pathways_ slot of the Tempora object. The default pathway gene set database Tempora uses is the Bader Lab pathway gene set database without electronic annotation Gene Ontology terms, which can be accessed on the [Bader Lab website](http://download.baderlab.org/EM_Genesets/current_release/). 71 | 72 | This function also performs principal component analysis (PCA) on the clusters pathway enrichment profiles to remove redundancy due to overrepresentation of certain pathways in the database. The PCA result is stored in the _cluster.pathways.dr_ slot. Tempora also outputs a scree plot to help users identify the number of principal components (PCs) to be used in downstream trajectory construction. 73 | 74 | ```{r eval=F, tidy=F} 75 | #Estimate pathway enrichment profiles of clusters 76 | cortex_tempora <- CalculatePWProfiles(cortex_tempora, 77 | gmt_path = "Mouse_GOBP_AllPathways_no_GO_iea_September_01_2019_symbol.gmt", 78 | method="gsva", min.sz = 5, max.sz = 200, parallel.sz = 1) 79 | ``` 80 | 81 | ### Build and visualize trajectory 82 | 83 | We can now build the trajectory based on the clusters' pathway enrichment profiles. Tempora employs the mutual information (MI) rank and data processing inequality approach implemented in [ARACNE](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-7-S1-S7) to calculate MI between all cluster pairs present in the data as well as remove edges with weak MIs. The trajectory is stored as a dataframe of edge lists in the _trajectory_ slot. Tempora then assigns directions to all edges in the network so that edges point from clusters with low temporal scores to clusters with high temporal scores. 84 | 85 | ```{r eval=F, tidy=F} 86 | #Build trajectory with 6 PCs 87 | cortex_tempora <- BuildTrajectory(cortex_tempora, n_pcs = 6, difference_threshold = 0.01) 88 | 89 | ``` 90 | 91 | After building the trajectory, we can visualize it as a network, with the piechart at each node representing the composition of cells collected at different time points in the experiment and the arrow connecting each pair of nodes representing lineage relationship between them. 92 | 93 | ```{r eval=F} 94 | #Visualize the trajectory 95 | cortex_tempora <- PlotTrajectory(cortex_tempora) 96 | ``` 97 | 98 | This function will add a slot _layouts_ containing the x and y coordinates of all nodes, determined using the Sugiyama layered graph drawing algorithm. 99 | 100 | ### Identify temporally dependent pathways 101 | 102 | Finally, we can use Tempora to investigate time-dependent pathways. Tempora fits a generalized additive model to the data to identify pathways whose expressions change over the temporal axis. The results of this analysis is stored in the _varying.pws_ slot of the Tempora object. 103 | 104 | ```{r eval=F} 105 | #Fit GAMs on pathway enrichment profile 106 | cortex_tempora <- IdentifyVaryingPWs(cortex_tempora, pval_threshold = 0.05) 107 | 108 | #Plot expression trends of significant time-varying pathways 109 | PlotVaryingPWs(cortex_tempora) 110 | ``` 111 | 112 | 113 | 114 | 115 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Tempora: cell trajectory inference using time-series single-cell RNA 2 | sequencing data 3 | ================ 4 | 5 | - [Introduction](#introduction) 6 | - [Usage](#usage) 7 | - [Installation](#installation) 8 | - [Sample data](#sample-data) 9 | - [Input data](#input-data) 10 | - [Calculate clusters’ pathway enrichment 11 | profiles](#calculate-clusters-pathway-enrichment-profiles) 12 | - [Build and visualize 13 | trajectory](#build-and-visualize-trajectory) 14 | - [Identify temporally dependent 15 | pathways](#identify-temporally-dependent-pathways) 16 | 17 | ## Introduction 18 | 19 | Tempora is a novel cell trajectory inference method that orders cells 20 | using time information from time-series scRNAseq data. Tempora uses 21 | biological pathway information to help identify cell type relationships 22 | and can identify important time-dependent pathways to help interpret the 23 | inferred trajectory. 24 | 25 | ## Usage 26 | 27 | ### Installation 28 | 29 | You can install Tempora using devtools: 30 | 31 | ``` r 32 | if (!require('devtools')) { 33 | # install devtools 34 | install.packages("devtools") 35 | } 36 | 37 | # install Tempora 38 | devtools::install_github("BaderLab/Tempora") 39 | 40 | library(Tempora) 41 | ``` 42 | 43 | ### Sample data 44 | 45 | The Tempora package was validated using three datasets: an *in vitro* 46 | differentiation of human skeletal muscle myoblasts, *in vivo* early 47 | development of murine cerebral cortex and *in vivo* embryonic and 48 | postnatal development of murine cerebellum. These processed datasets can 49 | be accessed on the Bader Lab website at 50 | . 51 | 52 | The MouseCortex dataset will be used in this vignette as an example. 53 | 54 | ### Downlaod the vignette example data. 55 | 56 | Manually download the data from . 57 | 58 | Or execute the following code to automatically download it. 59 | 60 | ``` r 61 | if (!require('RCurl')) { 62 | install.package('RCurl') 63 | } 64 | 65 | data_url = "https://www.baderlab.org/Software/Tempora?action=AttachFile&do=get&target=" 66 | data_file = "MouseCortex.RData" 67 | dest_data_file <- file.path(getwd(),data_file ) 68 | download.file( 69 | paste(data_url,data_file,sep=""), 70 | destfile=dest_data_file 71 | ) 72 | ``` 73 | 74 | ### Input data 75 | 76 | Tempora takes processed scRNAseq data as input, either as a gene 77 | expression matrix with separate time and cluster labels for all cells, 78 | or a Seurat or SingleCellExperiment object containing gene expression 79 | data and a clustering result. Tempora does not implement clustering or 80 | batch effect correction as part of its pipeline and assumes that the 81 | user has input a well-annotated cluster solution free of batch effect 82 | into the method. 83 | 84 | ``` r 85 | #install Seurat package when using the MouseCortex data. 86 | if (!require('Seurat')) { 87 | install.packages('Seurat') 88 | library('Seurat') 89 | } 90 | 91 | #Load MouseCortex sample data 92 | load("MouseCortex.RData") 93 | ``` 94 | 95 | We can the import the Seurat object containing the murine cerebral 96 | cortex development data into a Tempora object to start the analysis. 97 | Here, as the clusters have been manually annotated prior to running 98 | Tempora, a vector of cluster label is given to the function. If you have 99 | yet to annotate your clusters but have a list of marker genes for 100 | expected cell types in the data, you can input the list of marker genes 101 | to this function to run automated cluster labeling with GSVA. 102 | 103 | ``` r 104 | #Import MouseCortex data 105 | #As this is a Seurat v2 object, set assayType to "" 106 | #See ?Tempora::ImportSeuratObject for additional arguments to import Seurat v3 or SingleCellExperiment obbjects 107 | cortex_tempora <- ImportSeuratObject(MouseCortex, clusters = "res.0.6", 108 | timepoints = "Time_points", 109 | assayType = "", 110 | cluster_labels = c("Neurons","Young neurons","APs/RPs", 111 | "IPs","APs/RPs", "Young neurons", "IPs"), 112 | timepoint_order = c("e11", "e13", "e15", "e17")) 113 | ``` 114 | 115 | From the specified clustering result, Tempora will automatically 116 | calculate the temporal score of each cluster, which is based on its 117 | composition of cells from each timepoint. This information will be 118 | stored in the *cluster.metadata* slot of the Tempora object. 119 | 120 | ### Calculate clusters’ pathway enrichment profiles 121 | 122 | Next, the pathway enrichment profiles of the clusters are calculated 123 | using 124 | [GSVA](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-14-7) 125 | and stored in the *cluster.pathways* slot of the Tempora object. The 126 | default pathway gene set database Tempora uses is the Bader Lab pathway 127 | gene set database without electronic annotation Gene Ontology terms, 128 | which can be accessed on the [Bader Lab 129 | website](http://download.baderlab.org/EM_Genesets/current_release/). 130 | 131 | To automatically pull the latest version of the gmt file: 132 | ```{r download baderlab gmt file, message=FALSE, warning=FALSE} 133 | if (!require('RCurl')) { 134 | install.package('RCurl') 135 | } 136 | gmt_url = "http://download.baderlab.org/EM_Genesets/current_release/Mouse/symbol/" 137 | 138 | #list all the files on the server 139 | filenames = getURL(gmt_url) 140 | tc = textConnection(filenames) 141 | contents = readLines(tc) 142 | close(tc) 143 | 144 | #get the gmt that has all the pathways and does not include terms inferred from electronic annotations(IEA) 145 | #start with gmt file that has pathways only 146 | rx = gregexpr("(?<=)", 147 | contents, perl = TRUE) 148 | 149 | gmt_file = unlist(regmatches(contents, rx)) 150 | dest_gmt_file <- file.path(getwd(),gmt_file ) 151 | download.file( 152 | paste(gmt_url,gmt_file,sep=""), 153 | destfile=dest_gmt_file 154 | ) 155 | ``` 156 | 157 | 158 | This function also performs principal component analysis (PCA) on the 159 | clusters pathway enrichment profiles to remove redundancy due to 160 | overrepresentation of certain pathways in the database. The PCA result 161 | is stored in the *cluster.pathways.dr* slot. Tempora also outputs a 162 | scree plot to help users identify the number of principal components 163 | (PCs) to be used in downstream trajectory construction. 164 | 165 | ``` r 166 | #Estimate pathway enrichment profiles of clusters 167 | cortex_tempora <- CalculatePWProfiles(cortex_tempora, 168 | gmt_path = gmt_file, 169 | method="gsva", min.sz = 5, max.sz = 200, parallel.sz = 1) 170 | ``` 171 | 172 | ### Build and visualize trajectory 173 | 174 | We can now build the trajectory based on the clusters’ pathway 175 | enrichment profiles. Tempora employs the mutual information (MI) rank 176 | and data processing inequality approach implemented in 177 | [ARACNE](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-7-S1-S7) 178 | to calculate MI between all cluster pairs present in the data as well as 179 | remove edges with weak MIs. The trajectory is stored as a dataframe of 180 | edge lists in the *trajectory* slot. Tempora then assigns directions to 181 | all edges in the network so that edges point from clusters with low 182 | temporal scores to clusters with high temporal scores. 183 | 184 | ``` r 185 | #Build trajectory with 6 PCs 186 | cortex_tempora <- BuildTrajectory(cortex_tempora, n_pcs = 6, difference_threshold = 0.01) 187 | ``` 188 | 189 | After building the trajectory, we can visualize it as a network, with 190 | the piechart at each node representing the composition of cells 191 | collected at different time points in the experiment and the arrow 192 | connecting each pair of nodes representing lineage relationship between 193 | them. 194 | 195 | ``` r 196 | #Visualize the trajectory 197 | cortex_tempora <- PlotTrajectory(cortex_tempora) 198 | ``` 199 | 200 | This function will add a slot *layouts* containing the x and y 201 | coordinates of all nodes, determined using the Sugiyama layered graph 202 | drawing algorithm. 203 | 204 | ### Identify temporally dependent pathways 205 | 206 | Finally, we can use Tempora to investigate time-dependent pathways. 207 | Tempora fits a generalized additive model to the data to identify 208 | pathways whose expressions change over the temporal axis. The results of 209 | this analysis is stored in the *varying.pws* slot of the Tempora object. 210 | 211 | ``` r 212 | #Fit GAMs on pathway enrichment profile 213 | cortex_tempora <- IdentifyVaryingPWs(cortex_tempora, pval_threshold = 0.05) 214 | 215 | #Plot expression trends of significant time-varying pathways 216 | PlotVaryingPWs(cortex_tempora) 217 | ``` 218 | -------------------------------------------------------------------------------- /Tempora.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | AutoAppendNewline: Yes 16 | StripTrailingWhitespace: Yes 17 | 18 | BuildType: Package 19 | PackageUseDevtools: Yes 20 | PackageInstallArgs: --no-multiarch --with-keep.source 21 | -------------------------------------------------------------------------------- /man/BuildTrajectory.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/BuildTrajectory.R 3 | \name{BuildTrajectory} 4 | \alias{BuildTrajectory} 5 | \title{Build mutual information-based network} 6 | \usage{ 7 | BuildTrajectory(object, n_pcs, difference_threshold = 0.01, loadings = 0.4) 8 | } 9 | \arguments{ 10 | \item{object}{A Tempora object containing a gene expression matrix and metadata} 11 | 12 | \item{n_pcs}{Number of principal components to be used in building the network.} 13 | 14 | \item{difference_threshold}{Percent of permissible difference between the temporal scores of two clusters to determine the direction of their connections. The temporal scores are calculated based on based on the clusters' composition of cells from each timepoint. The directions of edges connecting pairs of clusters will only be determined for cluster pairs with difference in their time scores higher than the threshold. Other edges will remain undirected. Default at 0.01} 15 | 16 | \item{loadings}{Threshold of PCA loadings for pathways to be used in trajectory construction. The higher the loading, the more the pathway contributes to a principal component. Default at 0.4.} 17 | } 18 | 19 | \description{ 20 | Build the information-based cluster-cluster network using reduced-dimensionality pathway enrichment profiles of all clusters. 21 | This network connects related cell types and states across multiple time points. Taking advantage of the available time information, Tempora also infers the directions of all connections in a trajectory that go from early to late clusters. 22 | } 23 | \examples{ 24 | \dontrun{tempora_data <- BuildTrajectory(tempora_data, n_pcs=10, difference_threshold=0.01, loadings=0.4)} 25 | BuildTrajectory 26 | } 27 | -------------------------------------------------------------------------------- /man/CalculatePWProfiles.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/CalculatePWProfiles.R 3 | \name{CalculatePWProfiles} 4 | \alias{CalculatePWProfiles} 5 | \title{Calculate pathway enrichment profile} 6 | \usage{ 7 | CalculatePWProfiles( 8 | object, 9 | gmt_path, 10 | method = "gsva", 11 | min.sz = 5, 12 | max.sz = 200, 13 | parallel.sz = 1 14 | ) 15 | } 16 | \arguments{ 17 | \item{object}{A Tempora object containing a gene expression matrix and metadata (cluster identity and )} 18 | 19 | \item{gmt_path}{Local path to database of pathways or genesets organized as a .gmt file. Genesets files in GMT format can be downloaded at http://baderlab.org/GeneSets. Please ensure} 20 | 21 | \item{method}{Method used to estimate pathway enrichment profile per cluster. Can be "gsva", "ssgsea", "zscore" or "plage", default to "gsva". See ?gsva for more information.} 22 | 23 | \item{min.sz}{Minimum size of the genesets used in enrichment estimation, set to 5 genes by default.} 24 | 25 | \item{max.sz}{Maximum size of the genesets used in enrichment estimation, set to 200 genes by default.} 26 | 27 | \item{parallel.sz}{Type of cluster architecture when using \code{snow}. If 1, no parallelization will be used. If 0, all available cores will be used.} 28 | } 29 | \value{ 30 | An updated Tempora object containing the pathway enrichment profiles of each cluster, which can be accessed at \code{object@cluster.pathways} 31 | CalculatePWProfiles 32 | } 33 | \description{ 34 | Calculate cluster average gene expression profile and determine the pathway enrichment profile of each cluster 35 | } 36 | \examples{ 37 | \dontrun{tempora_data <- CalculatePWProfiles(tempora_data, gmt_path="~/Human_AllPathways_September_01_2019_symbol.gmt", parallel.sz = detectCores()-2)} 38 | } 39 | -------------------------------------------------------------------------------- /man/CreateTemporaObject.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/TemporaObjClass.R 3 | \name{CreateTemporaObject} 4 | \alias{CreateTemporaObject} 5 | \title{Create new Tempora object from a processed gene expression matrix} 6 | \usage{ 7 | CreateTemporaObject( 8 | exprMatrix, 9 | meta.data, 10 | timepoint_order, 11 | cluster_labels = NULL, 12 | cell_markers = NULL, 13 | threshold = 0.9 14 | ) 15 | } 16 | \arguments{ 17 | \item{exprMatrix}{A normalized gene expression matrix containing cells from all timepoints of the time-series study. Batch effect correction is highly recommended before normalization.} 18 | 19 | \item{meta.data}{A dataframe of meta data for all cells in the expression matrix. Each column is a feature and each row stores single cell information. At minimum, this dataframe should contain two columns: a "Clusters" column storing the clustering identity and a "Timepoints" column storing the timepoint when each cell comes from} 20 | 21 | \item{timepoint_order}{An ordered vector of timepoint names from early to late} 22 | 23 | \item{cluster_labels}{(Optional) A vector of cluster annotations (cell types, cell states, cell cycles, etc.), ordered alphanumerically by cluster names. If NULL and \code{cell_markers} is given, automatic cluster annotation using GSVA will be performed. If both are NULL, cluster numbers will be used to label the trajectory plot.} 24 | 25 | \item{cell_markers}{(Optional) A list of possible cell types found in the dataset and their marker genes to be used for automatic cell type identification. If NULL and no \code{cluster_labels} is given, cluster numbers will be used to label the trajectory plot.} 26 | 27 | \item{threshold}{(Optional) Threshold of GSVA score quantile. Cell types with GSVA scores in this quantile or higher compared to all other cell type scores for the same cluster would be included in cluster label. Numeric between 0-1, default to 0.9.} 28 | } 29 | \value{ 30 | A Tempora object containing the expression matrix and metadata 31 | } 32 | \description{ 33 | Create new Tempora object from a processed gene expression matrix 34 | } 35 | \examples{ 36 | \dontrun{tempora_dara <- CreateTemporaObject(exprMatrix, meta.data)} 37 | 38 | } 39 | -------------------------------------------------------------------------------- /man/IdentifyCellTypes.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/IdentifyCellTypes.R 3 | \name{IdentifyCellTypes} 4 | \alias{IdentifyCellTypes} 5 | \title{Calculate pathway enrichment profile} 6 | \usage{ 7 | IdentifyCellTypes(exprMatrix, cluster_labels, cell_markers, threshold = 0.9) 8 | } 9 | \arguments{ 10 | \item{exprMatrix}{A gene expression matrix, with genes in rows and cells in columns.} 11 | 12 | \item{cluster_labels}{A named vector of cluster identifier for each cell in the gene expression matrix} 13 | 14 | \item{cell_markers}{A list of possible cell types found in the dataset and their marker genes.} 15 | 16 | \item{threshold}{Threshold of GSVA score quantile. Cell types with GSVA scores in this quantile or higher compared to all other cell type scores for the same cluster would be included in cluster label. Numeric between 0-1, default to 0.9.} 17 | } 18 | \value{ 19 | A vector of cell types inferred from the expression of marker genes provided 20 | } 21 | \description{ 22 | Calculate cluster average gene expression profile and determine the pathway enrichment profile of each cluster 23 | } 24 | -------------------------------------------------------------------------------- /man/IdentifyVaryingPWs.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/IdentifyVaryingPWs.R 3 | \name{IdentifyVaryingPWs} 4 | \alias{IdentifyVaryingPWs} 5 | \title{Calculate temporally changing pathways} 6 | \usage{ 7 | IdentifyVaryingPWs(object, pval_threshold = 0.05) 8 | } 9 | \arguments{ 10 | \item{object}{A Tempora object} 11 | 12 | \item{pval_threshold}{P-value threshold to determine the significance of pathway enrichment over time. Default to 0.05.} 13 | } 14 | \description{ 15 | Identify the pathways that change over time by fitting a generalized additive model 16 | } 17 | \examples{ 18 | \dontrun{tempora_data <- IdentifyVaryingPWs(tempora_data, pval_threshold = 0.05)} 19 | } 20 | -------------------------------------------------------------------------------- /man/ImportSeuratObject.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/TemporaObjClass.R 3 | \name{ImportSeuratObject} 4 | \alias{ImportSeuratObject} 5 | \title{Import data from a Seurat object} 6 | \usage{ 7 | ImportSeuratObject( 8 | seuratobj, 9 | assayType = "", 10 | assaySlot = NA, 11 | clusters, 12 | timepoints, 13 | timepoint_order, 14 | cluster_labels = NULL, 15 | cell_markers = NULL, 16 | threshold = 0.9 17 | ) 18 | } 19 | \arguments{ 20 | \item{seuratobj}{A Seurat or SingleCellExperiment object containing the normalized gene expression matrix and clustering result} 21 | 22 | \item{assayType}{A length-one character vector representing the assay object 23 | in which the expression data is stored in the input object. For Seurat v1 24 | or v2 objects, set this to "". For Seurat v3 objects, this is often "RNA". 25 | For SingleCellExperiment objects, this is often "logcounts".} 26 | 27 | \item{assaySlot}{An optional length-one character vector representing the 28 | slot of the Seurat v3 \code{\link[Seurat]{Assay}} object to use. In Seurat 29 | v3, normalized data is stored in the "data" slot, and counts in the 30 | "counts" slot.} 31 | 32 | \item{clusters}{Name of the column in the meta.data dataframe containing the cluster identity of all cells in the dataset} 33 | 34 | \item{timepoints}{Name of the column in the meta.data dataframe containing the collection time of all cells in the dataset} 35 | 36 | \item{timepoint_order}{An ordered vector of timepoint names from early to late} 37 | 38 | \item{cluster_labels}{(Optional) A vector of cluster annotations (cell types, cell states, cell cycles, etc.), ordered alphanumerically by cluster names. If NULL and \code{cell_markers} is given, automatic cluster annotation using GSVA will be performed. If both are NULL, cluster numbers will be used to label the trajectory plot.} 39 | 40 | \item{cell_markers}{(Optional) A list of possible cell types found in the dataset and their marker genes to be used for automatic cell type identification. If NULL and no \code{cluster_labels} is given, cluster numbers will be used to label the trajectory plot.} 41 | 42 | \item{threshold}{(Optional) Threshold of GSVA score quantile. Cell types with GSVA scores in this quantile or higher compared to all other cell type scores for the same cluster would be included in cluster label. Numeric between 0-1, default to 0.9.} 43 | } 44 | \value{ 45 | A Tempora object containing the expression matrix and metadata 46 | } 47 | \description{ 48 | Imports gene expression matrix and other metadata from a seurat object 49 | } 50 | \examples{ 51 | \dontrun{tempora_data <- ImportSeuratObject(seurat_object, assayType="", clusters = "res.0.3", timepoints = "collection_time", 52 | timepoint_order = c("0H", "24H", "48H", "72H"), cluster_labels = c("Stem cells", "Differentiated cells"))} 53 | } 54 | -------------------------------------------------------------------------------- /man/PlotTrajectory.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/Visualize.R 3 | \name{PlotTrajectory} 4 | \alias{PlotTrajectory} 5 | \title{Visualize the trajectory} 6 | \usage{ 7 | PlotTrajectory(object, layout = NULL, ...) 8 | } 9 | \arguments{ 10 | \item{object}{A Tempora object} 11 | 12 | \item{layout}{A 2-column matrix containing the x- and y-coordinates of all vertices in the graph. If NULL, the layout will be obtained using Sugiyama layout} 13 | 14 | \item{...}{Any additional arguments to plot.igraph} 15 | } 16 | \description{ 17 | Reduce the dimensionality of the pathway enrichment matrix and build the information-based cluster-cluster network 18 | } 19 | \examples{ 20 | \dontrun{tempora_data <- PlotTrajectory(tempora_data)} 21 | } 22 | -------------------------------------------------------------------------------- /man/PlotVaryingPWs.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/PlotVaryingPWs.R 3 | \name{PlotVaryingPWs} 4 | \alias{PlotVaryingPWs} 5 | \title{Plot temporally changing pathways} 6 | \usage{ 7 | PlotVaryingPWs(object) 8 | } 9 | \arguments{ 10 | \item{object}{A Tempora object} 11 | } 12 | \description{ 13 | Plot the expression of temporally changing pathways as identified by IdentifyVaryingPWs() 14 | } 15 | \examples{ 16 | \dontrun{tempora_data <- IdentifyVaryingPWs(tempora_data, pval_threshold = 0.05)} 17 | } 18 | -------------------------------------------------------------------------------- /man/Tempora-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/TemporaObjClass.R 3 | \docType{class} 4 | \name{Tempora-class} 5 | \alias{Tempora-class} 6 | \alias{Tempora} 7 | \alias{data,Tempora-method} 8 | \alias{data} 9 | \alias{data<-,Tempora-method} 10 | \alias{data<-} 11 | \alias{meta.data,Tempora-method} 12 | \alias{meta.data} 13 | \alias{meta.data<-,Tempora-method} 14 | \alias{meta.data<-} 15 | \alias{cluster.metadata,Tempora-method} 16 | \alias{cluster.metadata} 17 | \alias{cluster.metadata<-,Tempora-method} 18 | \alias{cluster.metadata<-} 19 | \alias{cluster.pathways,Tempora-method} 20 | \alias{cluster.pathways} 21 | \alias{cluster.pathways<-,Tempora-method} 22 | \alias{cluster.pathways<-} 23 | \alias{cluster.pathways.dr,Tempora-method} 24 | \alias{cluster.pathways.dr} 25 | \alias{cluster.pathways.dr<-,Tempora-method} 26 | \alias{cluster.pathways.dr<-} 27 | \alias{varying.pws,Tempora-method} 28 | \alias{varying.pws} 29 | \alias{varying.pws<-,Tempora-method} 30 | \alias{varying.pws<-} 31 | \alias{gams,Tempora-method} 32 | \alias{gams} 33 | \alias{gams<-,Tempora-method} 34 | \alias{gams<-} 35 | \alias{n.pcs,Tempora-method} 36 | \alias{n.pcs} 37 | \alias{n.pcs<-,Tempora-method} 38 | \alias{n.pcs<-} 39 | \alias{trajectory,Tempora-method} 40 | \alias{trajectory} 41 | \alias{trajectory<-,Tempora-method} 42 | \alias{trajectory<-} 43 | \alias{layouts,Tempora-method} 44 | \alias{layouts} 45 | \alias{layouts<-,Tempora-method} 46 | \alias{layouts<-} 47 | \title{Define a class of Tempora object} 48 | \usage{ 49 | data(x) 50 | 51 | data(x) <- value 52 | 53 | \S4method{data}{Tempora}(x) 54 | 55 | \S4method{data}{Tempora}(x) <- value 56 | 57 | meta.data(x) 58 | 59 | meta.data(x) <- value 60 | 61 | \S4method{meta.data}{Tempora}(x) 62 | 63 | \S4method{meta.data}{Tempora}(x) <- value 64 | 65 | cluster.metadata(x) 66 | 67 | cluster.metadata(x) <- value 68 | 69 | \S4method{cluster.metadata}{Tempora}(x) 70 | 71 | \S4method{cluster.metadata}{Tempora}(x) <- value 72 | 73 | cluster.pathways(x) 74 | 75 | cluster.pathways(x) <- value 76 | 77 | \S4method{cluster.pathways}{Tempora}(x) 78 | 79 | \S4method{cluster.pathways}{Tempora}(x) <- value 80 | 81 | cluster.pathways.dr(x) 82 | 83 | cluster.pathways.dr(x) <- value 84 | 85 | \S4method{cluster.pathways.dr}{Tempora}(x) 86 | 87 | \S4method{cluster.pathways.dr}{Tempora}(x) <- value 88 | 89 | varying.pws(x) 90 | 91 | varying.pws(x) <- value 92 | 93 | \S4method{varying.pws}{Tempora}(x) 94 | 95 | \S4method{varying.pws}{Tempora}(x) <- value 96 | 97 | gams(x) 98 | 99 | gams(x) <- value 100 | 101 | \S4method{gams}{Tempora}(x) 102 | 103 | \S4method{gams}{Tempora}(x) <- value 104 | 105 | n.pcs(x) 106 | 107 | n.pcs(x) <- value 108 | 109 | \S4method{n.pcs}{Tempora}(x) 110 | 111 | \S4method{n.pcs}{Tempora}(x) <- value 112 | 113 | trajectory(x) 114 | 115 | trajectory(x) <- value 116 | 117 | \S4method{trajectory}{Tempora}(x) 118 | 119 | \S4method{trajectory}{Tempora}(x) <- value 120 | 121 | layouts(x) 122 | 123 | layouts(x) <- value 124 | 125 | \S4method{layouts}{Tempora}(x) 126 | 127 | \S4method{layouts}{Tempora}(x) <- value 128 | } 129 | \arguments{ 130 | \item{x}{Tempora object} 131 | 132 | \item{value}{New value} 133 | } 134 | \description{ 135 | A Tempora object contains the input gene expression matrix and metadata, as well as stores the meta data of each cluster, 136 | the clusters' pathway enrichment profiles, the constructed trajectory as well as the Sugiyama layout for the trajectory plot 137 | } 138 | \section{Slots}{ 139 | 140 | \describe{ 141 | \item{\code{data}}{A gene expression matrix (genes x cells), often aggregated from multiple time points} 142 | 143 | \item{\code{meta.data}}{A dataframe containing the metadata for the cells in the gene expression matrix, which at minimum includes the 144 | collection timepoint and cluster identity of each cell} 145 | 146 | \item{\code{cluster.metadata}}{A dataframe containing the metadata for each cell cluster} 147 | 148 | \item{\code{cluster.pathways}}{A dataframe containing the pathway enrichment profile of each cluster as calculated by \code{\link{CalculatePWProfiles}}} 149 | 150 | \item{\code{cluster.pathways.dr}}{A prcomp object containing the PCA of the clusters' pathway enrichment profiles} 151 | 152 | \item{\code{n.pcs}}{The number of principal components to be used in trajectory construction} 153 | 154 | \item{\code{trajectory}}{A dataframe describing the inferred trajectory as inferred by \code{\link{BuildTrajectory}}} 155 | 156 | \item{\code{layouts}}{A matrix containing the Sugiyama layout of the trajectory to be used in \code{\link{PlotTrajectory}}} 157 | 158 | \item{\code{varying.pws}}{A list of temporally varying pathways identified by \code{\link{IdentifyVaryingPWs}}} 159 | }} 160 | 161 | -------------------------------------------------------------------------------- /man/getExpr.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/dataAccess.R 3 | \name{getExpr} 4 | \alias{getExpr} 5 | \title{Get gene expression matrix from input data object} 6 | \usage{ 7 | getExpr(x, assayType, assaySlot) 8 | } 9 | \arguments{ 10 | \item{x}{The single-cell data object.} 11 | 12 | \item{assayType}{A length-one character vector representing the assay object 13 | in which the expression data is stored in the input object. For Seurat v1 14 | or v2 objects, set this to "". For Seurat v3 objects, this is often "RNA". 15 | For SingleCellExperiment objects, this is often "logcounts". See Details 16 | for how this argument is used in the accessor functions for each class.} 17 | 18 | \item{assaySlot}{An optional length-one character vector representing the 19 | slot of the Seurat v3 \code{\link[Seurat]{Assay}} object to use. In Seurat 20 | v3, normalized data is stored in the "data" slot, and counts in the 21 | "counts" slot. See Details for how this argument is used in the accessor 22 | functions for each class.} 23 | } 24 | \description{ 25 | Extract the gene expression matrix from a single-cell data object containing 26 | the input data for scClustViz visualization. 27 | } 28 | \details{ 29 | This is a wrapper function to the relevant class's normalized data slot 30 | accessor method. Currently supported input object classes: \itemize{ \item 31 | Class \code{\link[Seurat]{seurat}/\link[Seurat]{Seurat}} stored in 32 | \code{x@data} or \code{x@assays[[assayType]]@assaySlot}, depending on Seurat 33 | object version. \item Class 34 | \code{\link[SingleCellExperiment]{SingleCellExperiment}} accessed by 35 | \code{\link[SummarizedExperiment]{assay}(x,assayType)}. } 36 | \href{https://github.com/BaderLab/scClustViz/issues}{Please submit requests 37 | for other data objects here!} 38 | } 39 | \author{ 40 | Contributed by Brendan Innes from the BaderLab/scClustViz package 41 | } 42 | -------------------------------------------------------------------------------- /man/getMD.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/dataAccess.R 3 | \name{getMD} 4 | \alias{getMD} 5 | \title{Get metadata from input data object} 6 | \usage{ 7 | getMD(x) 8 | } 9 | \arguments{ 10 | \item{x}{The single-cell data object.} 11 | } 12 | \description{ 13 | Extract the cell metadata data frame from a single-cell data object 14 | containing the input data for scClustViz visualization. 15 | } 16 | \details{ 17 | This is a wrapper function to the relevant class's cell metadata slot 18 | accessor / assignment method. Currently supported input object classes: 19 | \itemize{ 20 | \item Class \code{\link[Seurat]{seurat}/\link[Seurat]{Seurat}} accessed by 21 | \code{x@data.info} or \code{x@meta.data}, 22 | depending on Seurat object version. 23 | \item Class \code{\link[SingleCellExperiment]{SingleCellExperiment}} 24 | accessed by \code{\link[SingleCellExperiment]{colData}(x)}. 25 | } 26 | \href{https://github.com/BaderLab/scClustViz/issues}{Please submit requests 27 | for other data objects here!} 28 | } 29 | \author{ 30 | Contributed by Brendan Innes from the BaderLab/scClustViz package 31 | } 32 | --------------------------------------------------------------------------------