├── src ├── .gitignore ├── Makevars ├── Makevars.my ├── time_code.h ├── functions.cpp ├── Rmytsne.cpp ├── mytsne.h ├── kissrandom.h ├── RcppExports.cpp └── mytsne.cpp ├── .Rbuildignore ├── R ├── CSOmapR-package.R ├── RcppExports.R ├── facilitatetSNE.R └── utils.R ├── man ├── calc_d_rcpp.Rd ├── update_grads_rcpp.Rd ├── getDensity3D.Rd ├── getSignificance.Rd ├── calcNormalizedConnection.Rd ├── getAffinityMat.Rd ├── getContribution.Rd ├── CSOmapR-package.Rd ├── getCoordinates.Rd ├── plot3D.Rd ├── runExactTSNE_R.Rd ├── optimization.Rd └── run_tSNE.Rd ├── CSOmapR.Rproj ├── NAMESPACE ├── DESCRIPTION ├── .gitignore ├── utils └── ref.R ├── README.md ├── CSOmap.R └── LICENSE /src/.gitignore: -------------------------------------------------------------------------------- 1 | *.o 2 | *.so 3 | *.dll 4 | -------------------------------------------------------------------------------- /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^data-raw$ 2 | ^.*\.Rproj$ 3 | ^\.Rproj\.user$ 4 | -------------------------------------------------------------------------------- /src/Makevars: -------------------------------------------------------------------------------- 1 | CXXFLAGS+=-Wno-ignored-attributes 2 | CXX11FLAGS+=-Wno-ignored-attributes 3 | CXX14FLAGS+=-Wno-ignored-attributes -------------------------------------------------------------------------------- /src/Makevars.my: -------------------------------------------------------------------------------- 1 | CXXFLAGS+=-Wno-ignored-attributes 2 | CXX11FLAGS+=-Wno-ignored-attributes 3 | CXX14FLAGS+=-Wno-ignored-attributes -------------------------------------------------------------------------------- /R/CSOmapR-package.R: -------------------------------------------------------------------------------- 1 | ## usethis namespace: start 2 | #' @useDynLib CSOmapR, .registration = TRUE 3 | #' @importFrom Rcpp sourceCpp 4 | #' @import RcppEigen 5 | ## usethis namespace: end 6 | NULL 7 | 
-------------------------------------------------------------------------------- /man/calc_d_rcpp.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/RcppExports.R 3 | \name{calc_d_rcpp} 4 | \alias{calc_d_rcpp} 5 | \title{Calculate d using Rcpp} 6 | \usage{ 7 | calc_d_rcpp(ydata) 8 | } 9 | \arguments{ 10 | \item{ydata}{a matrix of coordinates} 11 | } 12 | \value{ 13 | d 14 | } 15 | \description{ 16 | Calculate d using Rcpp 17 | } 18 | -------------------------------------------------------------------------------- /CSOmapR.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | BuildType: Package 16 | PackageUseDevtools: Yes 17 | PackageInstallArgs: --no-multiarch --with-keep.source 18 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | export(calcNormalizedConnection) 4 | export(getAffinityMat) 5 | export(getContribution) 6 | export(getCoordinates) 7 | export(getDensity3D) 8 | export(getSignificance) 9 | export(optimization) 10 | export(plot3D) 11 | export(runExactTSNE_R) 12 | export(run_tSNE) 13 | import(Rcpp) 14 | import(RcppEigen) 15 | importFrom(Rcpp,sourceCpp) 16 | useDynLib(CSOmapR) 17 | useDynLib(CSOmapR, .registration = TRUE) 18 | -------------------------------------------------------------------------------- /man/update_grads_rcpp.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | 
% Please edit documentation in R/RcppExports.R 3 | \name{update_grads_rcpp} 4 | \alias{update_grads_rcpp} 5 | \title{Update gradients using Rcpp} 6 | \usage{ 7 | update_grads_rcpp(grads, ydata, stiffnesses) 8 | } 9 | \arguments{ 10 | \item{grads}{a matrix of gradients} 11 | 12 | \item{ydata}{a matrix of coordinates} 13 | 14 | \item{stiffnesses}{a matrix of stiffnesses} 15 | } 16 | \value{ 17 | updated grads 18 | } 19 | \description{ 20 | Update gradients using Rcpp 21 | } 22 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: CSOmapR 2 | Type: Package 3 | Title: CSOmap in R 4 | Version: 1.0 5 | Date: 2020-09-24 6 | Authors@R: 7 | person(given = "Jason", 8 | family = "Li", 9 | role = c("aut", "cre", "cph", "ctb"), 10 | email = "lijxug@gmail.com", 11 | comment = c(ORCID = "0000-0002-8720-1344")) 12 | Description: 13 | A more powerful version of CSOmap in R 14 | Encoding: UTF-8 15 | License: MIT 16 | Imports: Rcpp (>= 1.0.3) 17 | LinkingTo: Rcpp, RcppEigen 18 | RoxygenNote: 7.1.1 19 | LazyData: true 20 | Suggests: plotly, htmlwidgets 21 | -------------------------------------------------------------------------------- /man/getDensity3D.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utils.R 3 | \name{getDensity3D} 4 | \alias{getDensity3D} 5 | \title{get 3D density 6 | Estimate 3D density around each data point based on coordinates.} 7 | \usage{ 8 | getDensity3D(x, y, z, n = 100, ...) 9 | } 10 | \arguments{ 11 | \item{x, y, z}{coordinates} 12 | 13 | \item{n}{number of grid points to use for each dimension; recycled if length is less than 3. 
Default 100.} 14 | 15 | \item{...}{other parameters passed to kde3d} 16 | } 17 | \description{ 18 | get 3D density 19 | Estimate 3D density around each data point based on coordinates. 20 | } 21 | -------------------------------------------------------------------------------- /man/getSignificance.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utils.R 3 | \name{getSignificance} 4 | \alias{getSignificance} 5 | \title{Test for significance level} 6 | \usage{ 7 | getSignificance( 8 | coordinates, 9 | labels, 10 | k = 3, 11 | adjusted.method = "fdr", 12 | verbose = F 13 | ) 14 | } 15 | \arguments{ 16 | \item{labels}{a vector of cell type labels, corresponding to the rows of the coordinates matrix} 17 | 18 | \item{k}{an integer for top k connections} 19 | 20 | \item{coordinates}{a matrix of 3D coordinates} 21 | } 22 | \value{ 23 | A list containing coordinates, counts, p values and q values 24 | } 25 | \description{ 26 | Test for significance level 27 | } 28 | -------------------------------------------------------------------------------- /man/calcNormalizedConnection.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utils.R 3 | \name{calcNormalizedConnection} 4 | \alias{calcNormalizedConnection} 5 | \title{Calculate normalized connection based on connection matrix and cell counts} 6 | \usage{ 7 | calcNormalizedConnection(connection_mt, cell_count_table) 8 | } 9 | \arguments{ 10 | \item{connection_mt}{Named matrix} 11 | 12 | \item{cell_count_table}{Cell count table generated by `table()` or a named vector recording the cell counts} 13 | } 14 | \value{ 15 | Named matrix of normalized connection 16 | } 17 | \description{ 18 | Calculate normalized connection based on connection matrix and cell counts 19 | } 20 | 
-------------------------------------------------------------------------------- /man/getAffinityMat.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utils.R 3 | \name{getAffinityMat} 4 | \alias{getAffinityMat} 5 | \title{Calculate affinity matrix} 6 | \usage{ 7 | getAffinityMat(TPM, LR, denoise = 50, eps = 2.2251e-308, verbose = F, ...) 8 | } 9 | \arguments{ 10 | \item{TPM}{a TPM matrix with gene names as rownames and cell names as colnames} 11 | 12 | \item{LR}{a data frame/tibble recording the ligand-receptor pairs; 13 | it must have the columns "ligand" and "receptor", plus an optional third column with weights} 14 | 15 | \item{denoise}{numeric value,} 16 | 17 | \item{eps}{Minimum distance between cells} 18 | 19 | \item{verbose}{logical. If TRUE, print out the progress information} 20 | } 21 | \description{ 22 | Calculate affinity matrix 23 | } 24 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # History files 2 | .Rhistory 3 | .Rapp.history 4 | 5 | # Session Data files 6 | .RData 7 | 8 | # User-specific files 9 | .Ruserdata 10 | 11 | # Example code in package build process 12 | *-Ex.R 13 | 14 | # Output files from R CMD build 15 | /*.tar.gz 16 | 17 | # Output files from R CMD check 18 | /*.Rcheck/ 19 | 20 | # RStudio files 21 | .Rproj.user/ 22 | 23 | # produced vignettes 24 | vignettes/*.html 25 | vignettes/*.pdf 26 | 27 | # OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3 28 | .httr-oauth 29 | 30 | # knitr and R markdown default cache directories 31 | *_cache/ 32 | /cache/ 33 | 34 | # Temporary files created by R markdown 35 | *.utf8.md 36 | *.knit.md 37 | 38 | # R Environment Variables 39 | .Renviron 40 | 41 | # Ignore notebook & html 42 | **.html 43 | **.rmd 44 | 45 | # Ignore results 
46 | /results/ 47 | 48 | # Ignore data-raw 49 | /data-raw/ 50 | 51 | # Ignore tmp data .dat 52 | **.dat 53 | 54 | -------------------------------------------------------------------------------- /man/getContribution.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utils.R 3 | \name{getContribution} 4 | \alias{getContribution} 5 | \title{get LR contribution for all the listed cluster pairs} 6 | \usage{ 7 | getContribution(TPM, LR, detailed_connections, verbose = T) 8 | } 9 | \arguments{ 10 | \item{TPM}{a TPM matrix with gene names as rownames and cell names as colnames} 11 | 12 | \item{LR}{a data frame/tibble recording the ligand-receptor pairs; 13 | it must have the columns "ligand" and "receptor", plus an optional third column with weights} 14 | 15 | \item{detailed_connections}{a list generated by `getSignificance`, 16 | which stores the connected cell pairs for each cluster pair} 17 | 18 | \item{verbose}{logical; whether to print progress} 19 | } 20 | \value{ 21 | a named list with sorted LR contributions 22 | } 23 | \description{ 24 | get LR contribution for all the listed cluster pairs 25 | } 26 | -------------------------------------------------------------------------------- /man/CSOmapR-package.Rd: -------------------------------------------------------------------------------- 1 | \name{CSOmapR-package} 2 | \alias{CSOmapR-package} 3 | \alias{CSOmapR} 4 | \docType{package} 5 | \title{ 6 | A short title line describing what the package does 7 | } 8 | \description{ 9 | A more detailed description of what the package does. A length 10 | of about one to five lines is recommended. 11 | } 12 | \details{ 13 | This section should provide a more detailed overview of how to use the 14 | package, including the most important functions. 15 | } 16 | \author{ 17 | Your Name, email optional. 
18 | 19 | Maintainer: Your Name 20 | } 21 | \references{ 22 | This optional section can contain literature or other references for 23 | background information. 24 | } 25 | \keyword{ package } 26 | \seealso{ 27 | Optional links to other man pages 28 | } 29 | \examples{ 30 | \dontrun{ 31 | ## Optional simple examples of the most important functions 32 | ## These can be in \dontrun{} and \donttest{} blocks. 33 | } 34 | } 35 | -------------------------------------------------------------------------------- /man/getCoordinates.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utils.R 3 | \name{getCoordinates} 4 | \alias{getCoordinates} 5 | \title{Calculate 3D coordinates from expression 6 | A wrapper function to get 3D coordinates directly from expression} 7 | \usage{ 8 | getCoordinates(TPM, LR, method = "tSNE", verbose = F, ...) 9 | } 10 | \arguments{ 11 | \item{TPM}{TPM matrix, with gene names as rownames and cell names as colnames} 12 | 13 | \item{LR}{data frame/tibble; records the ligand-receptor pairs and must have the columns "ligand" and "receptor"} 14 | 15 | \item{method}{string; specifies the optimization method to use. Can be one of 'Rcpp', 'tSNE' or 'BHtSNE'.} 16 | 17 | \item{verbose}{logical. 
If TRUE, print out the progress information} 18 | 19 | \item{...}{arguments passed to the different optimization methods} 20 | } 21 | \description{ 22 | Calculate 3D coordinates from expression 23 | A wrapper function to get 3D coordinates directly from expression 24 | } 25 | -------------------------------------------------------------------------------- /src/time_code.h: -------------------------------------------------------------------------------- 1 | #ifndef TIME_CODE_H 2 | #define TIME_CODE_H 3 | #include <chrono> 4 | #if defined(TIME_CODE) 5 | #pragma message "Timing code" 6 | #define INITIALIZE_TIME std::chrono::steady_clock::time_point STARTVAR; 7 | #define START_TIME \ 8 | STARTVAR = std::chrono::steady_clock::now(); 9 | 10 | #define END_TIME(LABEL) { \ 11 | std::chrono::steady_clock::time_point ENDVAR = std::chrono::steady_clock::now(); \ 12 | printf("%s: %ld ms\n",LABEL, std::chrono::duration_cast<std::chrono::milliseconds>(ENDVAR-STARTVAR).count()); \ 13 | } 14 | #else 15 | #define INITIALIZE_TIME 16 | #define START_TIME 17 | #define END_TIME(LABEL) {} 18 | 19 | #endif 20 | #endif 21 | -------------------------------------------------------------------------------- /R/RcppExports.R: -------------------------------------------------------------------------------- 1 | # Generated by using Rcpp::compileAttributes() -> do not edit by hand 2 | # Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393 3 | 4 | runExactTSNE_wrapper <- function(X, no_dims, verbose, max_iter, Y_in, init, rand_seed, skip_random_init, max_step_norm, mom_switch_iter, momentum, final_momentum, df) { 5 | .Call(`_CSOmapR_runExactTSNE_wrapper`, X, no_dims, verbose, max_iter, Y_in, init, rand_seed, skip_random_init, max_step_norm, mom_switch_iter, momentum, final_momentum, df) 6 | } 7 | 8 | #' Calculate d using Rcpp 9 | #' 10 | #' @param ydata a matrix of coordinates 11 | #' @return d 12 | #' 13 | calc_d_rcpp <- function(ydata) { 14 | .Call(`_CSOmapR_calc_d_rcpp`, ydata) 15 | } 16 | 17 | #' Update gradients using Rcpp 18 | #' 
19 | #' @param grads a matrix of gradients 20 | #' @param ydata a matrix of coordinates 21 | #' @param stiffnesses a matrix of stiffnesses 22 | #' @return updated grads 23 | #' 24 | update_grads_rcpp <- function(grads, ydata, stiffnesses) { 25 | .Call(`_CSOmapR_update_grads_rcpp`, grads, ydata, stiffnesses) 26 | } 27 | 28 | -------------------------------------------------------------------------------- /man/plot3D.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utils.R 3 | \name{plot3D} 4 | \alias{plot3D} 5 | \title{Plot 3D figure using plotly 6 | A wrapper function to plot 3D plots using plotly. Not necessary for CSOmap's core functions.} 7 | \usage{ 8 | plot3D( 9 | plt_tbl, 10 | color_by = "density", 11 | title = "3D density", 12 | alpha = 0.8, 13 | save_path = NULL, 14 | ... 15 | ) 16 | } 17 | \arguments{ 18 | \item{plt_tbl}{data.frame/tibble; Must provide the coordinate columns x, y and z.} 19 | 20 | \item{color_by}{string; Name of the column by which the data points are colored} 21 | 22 | \item{title}{string; Title} 23 | 24 | \item{alpha}{numeric; opacity of the dots, between 0 and 1} 25 | 26 | \item{save_path}{string; Path at which to save the output 3D plot. A `/lib/` directory is also generated alongside the output HTML. Default = NULL.} 27 | 28 | \item{...}{Other arguments that will be passed to htmlwidgets::saveWidget} 29 | } 30 | \value{ 31 | a plotly object 32 | } 33 | \description{ 34 | Plot 3D figure using plotly 35 | A wrapper function to plot 3D plots using plotly. Not necessary for CSOmap's core functions. 
36 | } 37 | -------------------------------------------------------------------------------- /utils/ref.R: -------------------------------------------------------------------------------- 1 | CSOmap <- function(DataSetName) { 2 | library(plotly) 3 | 4 | # load data ---- 5 | TPMpath <- paste0("./data/", DataSetName, "/TPM.txt") 6 | LRpath <- paste0("./data/", DataSetName, "/LR_pairs.txt") 7 | Labelpath <- paste0("./data/", DataSetName, "/label.txt") 8 | TPM_tbl <- read.table(TPMpath, header = TRUE, sep = "\t") 9 | LR <- read.table(LRpath, header = FALSE, sep = "\t") 10 | labelData <- read.table(Labelpath, header = TRUE, sep = "\t") 11 | # create output path 12 | dir.create(paste0("./results/", DataSetName)) 13 | 14 | 15 | # set variables properly ---- 16 | TPM = as.matrix(TPM_tbl[,-1]) 17 | TPM[is.na(TPM)] = 0 18 | 19 | rownames(TPM) = TPM_tbl$X 20 | # genenames <- TPM$X 21 | # cellnames <- colnames(TPM) 22 | labels <- labelData$labels[match(colnames(TPM), labelData$cells)] 23 | labels[is.na(labels)] = "unlabeled" 24 | standards <- unique(labels) 25 | labelIx <- match(labels, standards) 26 | cellCounts <- table(labelIx) 27 | 28 | coords = getCoordinates(TPM, LR) 29 | 30 | # coordsPath <- 31 | # paste0("./results/", DataSetName, "/coordinates.txt") 32 | # write coords 33 | # write.table(coords, coordsPath, quote = FALSE, sep = "\t") 34 | 35 | # test for significance level---- 36 | significance_result = getSignificance(coords, labels, 3) 37 | 38 | return(significance_result) 39 | } 40 | 41 | -------------------------------------------------------------------------------- /man/runExactTSNE_R.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/facilitatetSNE.R 3 | \name{runExactTSNE_R} 4 | \alias{runExactTSNE_R} 5 | \title{Run exact tsne, wrapper for integrated Exact TSNE calculation cpp} 6 | \usage{ 7 | runExactTSNE_R(X, no_dims = 3, ...) 
8 | } 9 | \arguments{ 10 | \item{X}{affinity matrix to input} 11 | 12 | \item{no_dims}{integer; Output dimensionality; Default = 3} 13 | 14 | \item{verbose}{logical; Whether to print out debug information; Default = TRUE} 15 | 16 | \item{max_iter}{integer; maximum number of iterations; Default = 1000} 17 | 18 | \item{Y_in}{user-defined initial coordinates; Default = NULL} 19 | 20 | \item{init}{TRUE if Y_in is specified} 21 | 22 | \item{rand_seed}{integer; random seed; Default = -1 (seed set by time).} 23 | 24 | \item{max_step_norm}{Maximum distance that a point is allowed to move on one iteration. Larger steps are clipped to this value. 25 | This prevents possible instabilities during gradient descent. Set to -1 to switch it off. (Default: 5)} 26 | 27 | \item{mom_switch_iter}{Numeric; (Default: 250)} 28 | 29 | \item{momentum}{numeric; (Default 0.5)} 30 | 31 | \item{final_momentum}{numeric; (Default 0.8)} 32 | 33 | \item{df}{Degrees of freedom of the t-distribution, must be greater than 0. Values smaller than 1 correspond to heavier tails, 34 | which can often resolve substructure in the embedding. See Kobak et al. (2019) for details. 
Default is 1.0} 35 | } 36 | \description{ 37 | Run exact tsne, wrapper for integrated Exact TSNE calculation cpp 38 | } 39 | -------------------------------------------------------------------------------- /src/functions.cpp: -------------------------------------------------------------------------------- 1 | #define EIGEN_USE_BLAS 2 | // [[Rcpp::depends(RcppEigen)]] 3 | #include 4 | #include 5 | 6 | 7 | // using arma::mat; 8 | using namespace Rcpp; 9 | using Eigen::MatrixXd; 10 | using Eigen::VectorXd; 11 | 12 | 13 | //' Calculate d using Rcpp 14 | //' 15 | //' @param ydata a matrix of coordinates 16 | //' @return d 17 | //' 18 | // [[Rcpp::export]] 19 | Eigen::MatrixXd calc_d_rcpp(Eigen::MatrixXd ydata){ 20 | Eigen::MatrixXd y_square = ydata.array().square(); 21 | Eigen::VectorXd sum_ydata = y_square.rowwise().sum(); 22 | 23 | Eigen::MatrixXd YYt = ydata * ydata.transpose(); 24 | 25 | Eigen::MatrixXd d = -2 * YYt; 26 | d.rowwise() += sum_ydata.transpose(); 27 | d.colwise() += sum_ydata; 28 | 29 | return(d); 30 | } 31 | 32 | //' Update gradients using Rcpp 33 | //' 34 | //' @param grads a matrix of gradients 35 | //' @param ydata a matrix of coordinates 36 | //' @param stiffnesses a matrix of stiffnesses 37 | //' @return updated grads 38 | //' 39 | // [[Rcpp::export]] 40 | Eigen::MatrixXd update_grads_rcpp(Eigen::MatrixXd grads, Eigen::MatrixXd ydata, Eigen::MatrixXd stiffnesses){ 41 | for(int i = 0; i 2 | #include "mytsne.h" 3 | #include 4 | using namespace Rcpp; 5 | 6 | // Function that runs the exact t-SNE 7 | // [[Rcpp::export]] 8 | Rcpp::List runExactTSNE_wrapper( 9 | NumericMatrix X, int no_dims, 10 | bool verbose, int max_iter, 11 | NumericMatrix Y_in, bool init, 12 | int rand_seed, bool skip_random_init, double max_step_norm, 13 | int mom_switch_iter, double momentum, double final_momentum, double df ) { 14 | 15 | if (verbose) Rprintf("Wrapper started\n"); 16 | int N = X.ncol(), D = X.nrow(); 17 | double * P =X.begin(); 18 | 19 | // NumericVector 
costs_vec(max_iter); 20 | // double* costs = costs_vec.begin(); 21 | 22 | if (verbose) Rprintf("Read the %i x %i data matrix successfully!\n", N, D); 23 | // std::vector Y(N * no_dims), costs(N), itercosts(static_cast(std::ceil(max_iter/50.0))); 24 | std::vector<double> Y(N * no_dims), costs(max_iter+1); 25 | 26 | // NumericMatrix Y_mt(N, no_dims); 27 | // double *Y = Y_mt.begin(); 28 | 29 | // Providing user-supplied solution. 30 | if (init) { 31 | // Copy into the buffer passed to runExactTSNE; a local `double * Y` declared here would shadow it and be discarded. 32 | for (int i = 0; i < Y_in.size(); i++) Y[i] = Y_in[i]; 33 | if (verbose) Rprintf("Using user supplied starting positions\n"); 34 | } 35 | 36 | 37 | // Run tsne 38 | int exit_code = runExactTSNE(P, N, D, Y.data(), no_dims, rand_seed, skip_random_init, max_iter, 39 | mom_switch_iter, momentum, final_momentum, costs.data(), df, max_step_norm, verbose); 40 | 41 | if(verbose){ 42 | if(exit_code < 0){ 43 | Rprintf("Error occurred, exit_code = %i\n", exit_code); 44 | } else { 45 | Rprintf("Run successful! Returning values now.\n"); 46 | } 47 | } 48 | 49 | return Rcpp::List::create(Rcpp::_["Y"]=Rcpp::NumericMatrix(no_dims, N, Y.data()), 50 | Rcpp::_["costs"]=Rcpp::NumericVector(costs.begin(), costs.end()), 51 | Rcpp::_["N"]=N, 52 | Rcpp::_["D"]=D); 53 | } 54 | 55 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # CSOmapR 2 | [![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://github.com/lijxug/CSOmapR/blob/master/LICENSE) 3 | 4 | --- 5 | 6 | R package for CSOmap (under development) 7 | 8 | Right now, CSOmapR is only supported on Linux-based machines. Installation on Windows may encounter errors. 
9 | 10 | # Installation 11 | 12 | ``` r 13 | # install.packages("devtools") 14 | devtools::install_github("lijxug/CSOmapR") 15 | 16 | # optional: install the CSOmapR.demo package, which provides the demo data used below 17 | # devtools::install_github("lijxug/CSOmapR.demo") 18 | ``` 19 | 20 | # Usage 21 | 22 | ## Load demo dataset 23 | ``` r 24 | library(CSOmapR) 25 | library(CSOmapR.demo) 26 | invisible(TPM) 27 | invisible(LR) 28 | invisible(labelData) 29 | ``` 30 | 31 | ## Calculate optimized 3D coordinates 32 | ``` r 33 | affinityMat = getAffinityMat(TPM, LR, verbose = T) 34 | 35 | coords_res = runExactTSNE_R( 36 | X = affinityMat, 37 | no_dims = 3, 38 | max_iter = 1000, 39 | verbose = T 40 | ) 41 | coords = coords_res$Y 42 | rownames(coords) <- colnames(TPM) 43 | colnames(coords) <- c('x', 'y', 'z') 44 | 45 | ``` 46 | 47 | ## Visualization (by 3D density) 48 | ``` r 49 | require(dplyr) 50 | # arrange data 51 | coords_tbl = bind_cols(cellName = rownames(coords), as.data.frame(coords)) 52 | 53 | join_vec = setNames(colnames(labelData)[1], nm = colnames(coords_tbl)[1]) 54 | cellinfo_tbl = left_join(coords_tbl, labelData, by = join_vec) 55 | 56 | density_obj = getDensity3D(cellinfo_tbl$x, cellinfo_tbl$y, cellinfo_tbl$z) 57 | cellinfo_tbl = cellinfo_tbl %>% mutate(density = density_obj) 58 | 59 | p_3Ddensity = plot3D(cellinfo_tbl, color_by = "density", title = "3D density") 60 | 61 | ``` 62 | 63 | ## Get significance 64 | ``` r 65 | signif_results = getSignificance(coords, labels = cellinfo_tbl$labels, verbose = T) 66 | contribution_list = getContribution(TPM, LR, signif_results$detailed_connections) 67 | ``` 68 | 69 | # When dealing with large datasets 70 | We provide two options: optimize the coordinates through t-SNE (Barnes-Hut algorithm), or downsample the original dataset first. 71 | 72 | ``` r 73 | # under development 74 | ``` 75 | 76 | # Citation 77 | Ren, X., Zhong, G., Zhang, Q., Zhang, L., Sun, Y., and Zhang, Z. (2020). 
Reconstruction of cell spatial organization from single-cell RNA sequencing data based on ligand-receptor mediated self-assembly. Cell Res. 78 | 79 | 80 | -------------------------------------------------------------------------------- /src/mytsne.h: -------------------------------------------------------------------------------- 1 | /* 2 | * 3 | * Copyright (c) 2014, Laurens van der Maaten (Delft University of Technology) 4 | * All rights reserved. 5 | * 6 | * Redistribution and use in source and binary forms, with or without 7 | * modification, are permitted provided that the following conditions are met: 8 | * 1. Redistributions of source code must retain the above copyright 9 | * notice, this list of conditions and the following disclaimer. 10 | * 2. Redistributions in binary form must reproduce the above copyright 11 | * notice, this list of conditions and the following disclaimer in the 12 | * documentation and/or other materials provided with the distribution. 13 | * 3. All advertising materials mentioning features or use of this software 14 | * must display the following acknowledgement: 15 | * This product includes software developed by the Delft University of Technology. 16 | * 4. Neither the name of the Delft University of Technology nor the names of 17 | * its contributors may be used to endorse or promote products derived from 18 | * this software without specific prior written permission. 19 | * 20 | * THIS SOFTWARE IS PROVIDED BY LAURENS VAN DER MAATEN ''AS IS'' AND ANY EXPRESS 21 | * OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES 22 | * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO 23 | * EVENT SHALL LAURENS VAN DER MAATEN BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 24 | * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 25 | * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR 26 | * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 27 | * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING 28 | * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY 29 | * OF SUCH DAMAGE. 30 | * 31 | */ 32 | #include 33 | 34 | #ifndef TSNE_H 35 | #define TSNE_H 36 | 37 | static inline double sign(double x) { return (x == .0 ? .0 : (x < .0 ? -1.0 : 1.0)); } 38 | 39 | int runExactTSNE(double* P, int N, int D, double* Y, int no_dims, int rand_seed, 40 | bool skip_random_init, int max_iter, int mom_switch_iter, 41 | double momentum, double final_momentum, 42 | double* costs, double df, double max_step_norm, bool verbose); 43 | 44 | 45 | void computeExactGradient(double* P, double* Y, int N, int D, double* dC, double df); 46 | void computeExactGradientTest(double* Y, int N, int D, double df); 47 | double evaluateError(double* P, double* Y, int N, int D, double df, bool verbose); 48 | void zeroMean(double* X, int N, int D); 49 | void computeSquaredEuclideanDistance(double* X, int N, int D, double *DD); 50 | 51 | double randn(); 52 | 53 | #endif 54 | -------------------------------------------------------------------------------- /src/kissrandom.h: -------------------------------------------------------------------------------- 1 | #ifndef KISSRANDOM_H 2 | #define KISSRANDOM_H 3 | 4 | #if defined(_MSC_VER) && _MSC_VER == 1500 5 | typedef unsigned __int32 uint32_t; 6 | typedef unsigned __int64 uint64_t; 7 | #else 8 | #include 9 | #endif 10 | 11 | // KISS = "keep it simple, stupid", but high quality random number generator 12 | // http://www0.cs.ucl.ac.uk/staff/d.jones/GoodPracticeRNG.pdf -> "Use a good RNG and build 
it into your code" 13 | // http://mathforum.org/kb/message.jspa?messageID=6627731 14 | // https://de.wikipedia.org/wiki/KISS_(Zufallszahlengenerator) 15 | 16 | // 32 bit KISS 17 | struct Kiss32Random { 18 | uint32_t x; 19 | uint32_t y; 20 | uint32_t z; 21 | uint32_t c; 22 | 23 | // seed must be != 0 24 | Kiss32Random(uint32_t seed = 123456789) { 25 | x = seed; 26 | y = 362436000; 27 | z = 521288629; 28 | c = 7654321; 29 | } 30 | 31 | uint32_t kiss() { 32 | // Linear congruence generator 33 | x = 69069 * x + 12345; 34 | 35 | // Xor shift 36 | y ^= y << 13; 37 | y ^= y >> 17; 38 | y ^= y << 5; 39 | 40 | // Multiply-with-carry 41 | uint64_t t = 698769069ULL * z + c; 42 | c = t >> 32; 43 | z = (uint32_t) t; 44 | 45 | return x + y + z; 46 | } 47 | inline int flip() { 48 | // Draw random 0 or 1 49 | return kiss() & 1; 50 | } 51 | inline size_t index(size_t n) { 52 | // Draw random integer between 0 and n-1 where n is at most the number of data points you have 53 | return kiss() % n; 54 | } 55 | inline void set_seed(uint32_t seed) { 56 | x = seed; 57 | } 58 | }; 59 | 60 | // 64 bit KISS. 
Use this if you have more than about 2^24 data points ("big data" ;) ) 61 | struct Kiss64Random { 62 | uint64_t x; 63 | uint64_t y; 64 | uint64_t z; 65 | uint64_t c; 66 | 67 | // seed must be != 0 68 | Kiss64Random(uint64_t seed = 1234567890987654321ULL) { 69 | x = seed; 70 | y = 362436362436362436ULL; 71 | z = 1066149217761810ULL; 72 | c = 123456123456123456ULL; 73 | } 74 | 75 | uint64_t kiss() { 76 | // Linear congruence generator 77 | z = 6906969069LL*z+1234567; 78 | 79 | // Xor shift 80 | y ^= (y<<13); 81 | y ^= (y>>17); 82 | y ^= (y<<43); 83 | 84 | // Multiply-with-carry (uint128_t t = (2^58 + 1) * x + c; c = t >> 64; x = (uint64_t) t) 85 | uint64_t t = (x<<58)+c; 86 | c = (x>>6); 87 | x += t; 88 | c += (x do not edit by hand 2 | // Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393 3 | 4 | #include 5 | #include 6 | 7 | using namespace Rcpp; 8 | 9 | // runExactTSNE_wrapper 10 | Rcpp::List runExactTSNE_wrapper(NumericMatrix X, int no_dims, bool verbose, int max_iter, NumericMatrix Y_in, bool init, int rand_seed, bool skip_random_init, double max_step_norm, int mom_switch_iter, double momentum, double final_momentum, double df); 11 | RcppExport SEXP _CSOmapR_runExactTSNE_wrapper(SEXP XSEXP, SEXP no_dimsSEXP, SEXP verboseSEXP, SEXP max_iterSEXP, SEXP Y_inSEXP, SEXP initSEXP, SEXP rand_seedSEXP, SEXP skip_random_initSEXP, SEXP max_step_normSEXP, SEXP mom_switch_iterSEXP, SEXP momentumSEXP, SEXP final_momentumSEXP, SEXP dfSEXP) { 12 | BEGIN_RCPP 13 | Rcpp::RObject rcpp_result_gen; 14 | Rcpp::RNGScope rcpp_rngScope_gen; 15 | Rcpp::traits::input_parameter< NumericMatrix >::type X(XSEXP); 16 | Rcpp::traits::input_parameter< int >::type no_dims(no_dimsSEXP); 17 | Rcpp::traits::input_parameter< bool >::type verbose(verboseSEXP); 18 | Rcpp::traits::input_parameter< int >::type max_iter(max_iterSEXP); 19 | Rcpp::traits::input_parameter< NumericMatrix >::type Y_in(Y_inSEXP); 20 | Rcpp::traits::input_parameter< bool >::type init(initSEXP); 21 | 
Rcpp::traits::input_parameter< int >::type rand_seed(rand_seedSEXP); 22 | Rcpp::traits::input_parameter< bool >::type skip_random_init(skip_random_initSEXP); 23 | Rcpp::traits::input_parameter< double >::type max_step_norm(max_step_normSEXP); 24 | Rcpp::traits::input_parameter< int >::type mom_switch_iter(mom_switch_iterSEXP); 25 | Rcpp::traits::input_parameter< double >::type momentum(momentumSEXP); 26 | Rcpp::traits::input_parameter< double >::type final_momentum(final_momentumSEXP); 27 | Rcpp::traits::input_parameter< double >::type df(dfSEXP); 28 | rcpp_result_gen = Rcpp::wrap(runExactTSNE_wrapper(X, no_dims, verbose, max_iter, Y_in, init, rand_seed, skip_random_init, max_step_norm, mom_switch_iter, momentum, final_momentum, df)); 29 | return rcpp_result_gen; 30 | END_RCPP 31 | } 32 | // calc_d_rcpp 33 | Eigen::MatrixXd calc_d_rcpp(Eigen::MatrixXd ydata); 34 | RcppExport SEXP _CSOmapR_calc_d_rcpp(SEXP ydataSEXP) { 35 | BEGIN_RCPP 36 | Rcpp::RObject rcpp_result_gen; 37 | Rcpp::RNGScope rcpp_rngScope_gen; 38 | Rcpp::traits::input_parameter< Eigen::MatrixXd >::type ydata(ydataSEXP); 39 | rcpp_result_gen = Rcpp::wrap(calc_d_rcpp(ydata)); 40 | return rcpp_result_gen; 41 | END_RCPP 42 | } 43 | // update_grads_rcpp 44 | Eigen::MatrixXd update_grads_rcpp(Eigen::MatrixXd grads, Eigen::MatrixXd ydata, Eigen::MatrixXd stiffnesses); 45 | RcppExport SEXP _CSOmapR_update_grads_rcpp(SEXP gradsSEXP, SEXP ydataSEXP, SEXP stiffnessesSEXP) { 46 | BEGIN_RCPP 47 | Rcpp::RObject rcpp_result_gen; 48 | Rcpp::RNGScope rcpp_rngScope_gen; 49 | Rcpp::traits::input_parameter< Eigen::MatrixXd >::type grads(gradsSEXP); 50 | Rcpp::traits::input_parameter< Eigen::MatrixXd >::type ydata(ydataSEXP); 51 | Rcpp::traits::input_parameter< Eigen::MatrixXd >::type stiffnesses(stiffnessesSEXP); 52 | rcpp_result_gen = Rcpp::wrap(update_grads_rcpp(grads, ydata, stiffnesses)); 53 | return rcpp_result_gen; 54 | END_RCPP 55 | } 56 | 57 | static const R_CallMethodDef CallEntries[] = { 58 | 
{"_CSOmapR_runExactTSNE_wrapper", (DL_FUNC) &_CSOmapR_runExactTSNE_wrapper, 13}, 59 | {"_CSOmapR_calc_d_rcpp", (DL_FUNC) &_CSOmapR_calc_d_rcpp, 1}, 60 | {"_CSOmapR_update_grads_rcpp", (DL_FUNC) &_CSOmapR_update_grads_rcpp, 3}, 61 | {NULL, NULL, 0} 62 | }; 63 | 64 | RcppExport void R_init_CSOmapR(DllInfo *dll) { 65 | R_registerRoutines(dll, NULL, CallEntries, NULL, NULL); 66 | R_useDynamicSymbols(dll, FALSE); 67 | } 68 | -------------------------------------------------------------------------------- /CSOmap.R: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env Rscript 2 | suppressPackageStartupMessages(library("methods")) # Rscript for CMD doesn't load this automatically 3 | 4 | # ---- inputs ---- 5 | suppressWarnings(library("optparse")) 6 | 7 | parser <- OptionParser() 8 | option_list <- list( 9 | make_option(c("-v", "--verbose"), action="store_true", default=TRUE, 10 | help="Print extra output [default]"), 11 | make_option(c("-q", "--quietly"), action="store_false", 12 | dest="verbose", help="Print little output"), 13 | make_option(c("-n", "--nCore"), type="integer", default=1, 14 | help="Number of cores to use. Under development. [default %default]", 15 | metavar="number"), 16 | make_option(c("-e", "--version"), type="character", default="cpp", 17 | help="version of optimization to use. origin or cpp [default %default]", 18 | metavar="number") 19 | ) 20 | 21 | inputs = commandArgs(trailingOnly=FALSE) 22 | script_dir = dirname(sub("--file=", "", inputs[grep("--file=",inputs)])) 23 | if(!length(script_dir)){ 24 | script_dir = getwd() 25 | } 26 | # Loading dependencies & functions ---- 27 | 28 | 29 | # get real args 30 | parser = OptionParser( 31 | usage = 32 | "%prog [options] 33 | Description: 34 | Labeltblpath: a path pointed to a label table, with the first column being the corresponding cellID of TPM matrix. 
35 | Every following column will be parsed as a label vector and passed to `getSignificance` one by one.
36 | This script is designed for running CSOmapR in console.",
37 |   option_list = option_list
38 | ) # do not change the format of this doc.
39 |
40 | itfs = parse_args(parser, positional_arguments = TRUE)
41 | opts = itfs$options
42 | args = list(
43 |   TPMpath = itfs$args[1],
44 |   LRpath = itfs$args[2],
45 |   Labeltblpath = itfs$args[3],
46 |   output_dir = itfs$args[4]
47 | )
48 |
49 | # if any positional argument is missing, print the help and quit before touching the file system
50 | if(anyNA(args)){
51 |   warning("Missing arguments!")
52 |   parse_args(parser, args = c("--help"))
53 |   q("no", status = 2)
54 | }
55 |
56 | if(!dir.exists(args$output_dir)) {
57 |   dir.create(args$output_dir)
58 | }
59 |
60 |
61 | # ---- loading dependencies ----
62 | suppressPackageStartupMessages(library("Seurat"))
63 | suppressPackageStartupMessages(library("tidyverse"))
64 | suppressPackageStartupMessages(library("reshape2"))
65 | suppressPackageStartupMessages(library("tictoc"))
66 | suppressPackageStartupMessages(library("Rcpp"))
67 | source("/lustre1/zeminz_pkuhpc/lijiesheng/00.data/05.COVID19/01.codes/CSOmapR/utils/utils.R") # hardcoded, should be modified in the future
68 |
69 | # ---- main ----
70 | loginfo("Loading input file ...")
71 | tic()
72 | TPM_tbl <- read_tsv(args$TPMpath)
73 | LR <- read.table(args$LRpath, header = FALSE, sep = "\t")
74 | labelData <- read_tsv(args$Labeltblpath)
75 | toc()
76 |
77 | # set variables & validation ----
78 | TPM = as.matrix(TPM_tbl[,-1])
79 | rownames(TPM) = TPM_tbl[, 1, drop = T]
80 | TPM[is.na(TPM)] = 0
81 |
82 | # genenames <- TPM$X
83 | # cellnames <- colnames(TPM)
84 |
85 | # check labels
86 | # stopifnot(all(colnames(TPM) %in% labelData[, 1, drop = T]))
87 |
88 | loginfo("CSOmap started ...")
89 | # get coordinates ----
90 | coords = getCoordinates(TPM, LR, version = opts$version)
91 | coords_tbl = bind_cols(cellName = rownames(coords), as.data.frame(coords))
92 | coords_outdir =
  paste0(args$output_dir, "/coordinates.txt")
93 | loginfo("Writing coordinates to ", coords_outdir)
94 | write_tsv(coords_tbl, path = coords_outdir)
95 |
96 | join_vec = setNames(colnames(labelData)[1], nm = colnames(coords_tbl)[1])
97 | cellinfo_tbl = left_join(coords_tbl, labelData, by = join_vec)
98 | cellinfo_outdir = paste0(args$output_dir, "/cell_infos.tsv")
99 | write_tsv(cellinfo_tbl, path = cellinfo_outdir)
100 |
101 | # get significance ----
102 | # for(iCol in 2:ncol(labelData)) {
103 | iCol = 2 # ignore the extra columns
104 | labels <-
105 |   labelData[, iCol, drop = T][match(colnames(TPM), labelData[, 1, drop = T])]
106 | labels[is.na(labels)] = "unlabeled"
107 | standards <- unique(labels)
108 | labelIx <- match(labels, standards)
109 | cellCounts <- table(labelIx)
110 |
111 | # Get significance
112 | signif_res = getSignificance(coords, labels = labels)
113 | write_csv(
114 |   signif_res$pvalue_tbl,
115 |   path = paste0(
116 |     args$output_dir,
117 |     "/signif_interacting_clusters_",
118 |     colnames(labelData)[iCol],
119 |     ".csv"
120 |   )
121 | )
122 | # }
123 |
124 | loginfo("Work Done!")
125 |
--------------------------------------------------------------------------------
/man/run_tSNE.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/facilitatetSNE.R
3 | \name{run_tSNE}
4 | \alias{run_tSNE}
5 | \title{Wrapper function for FItSNE: fast_tsne.R}
6 | \usage{
7 | run_tSNE(path2fast_tsneR = NULL, fast_tsne_path = NULL, verbose = T, ...)
8 | }
9 | \arguments{
10 | \item{fast_tsne_path}{a string specifying the path to the fast_tsne executable}
11 |
12 | \item{verbose}{Print run information for debugging.}
13 |
14 | \item{...}{include all the following fields that will be passed to fast_tsne}
15 |
16 | \item{path2fast_tsneR}{a string specifying the path to fast_tsne.R from FIt-SNE}
17 |
18 | \item{data_path}{a string specifying the data_path passed to FIt-SNE}
19 |
20 | \item{load_affinities}{If 'precomputed', input data X will be regarded as precomputed similarities and passed to fast_tsne
21 | If 'load', input similarities will be loaded from the file.
22 | If 'save', input similarities are saved into a file.
23 | If 0 or NULL (default), affinities are neither saved nor loaded}
24 |
25 | \item{X}{data matrix}
26 |
27 | \item{dims}{dimensionality of the embedding. Default 2.}
28 |
29 | \item{perplexity}{perplexity is used to determine the bandwidth of the Gaussian kernel in the input space. Default 30.}
30 |
31 | \item{theta}{Set to 0 for exact. If non-zero, then will use either Barnes Hut or FIt-SNE based on nbody_algo. If Barnes Hut, then
32 | this determines the accuracy of BH approximation. Default 0.5.}
33 |
34 | \item{max_iter}{Number of iterations of t-SNE to run. Default 1000.}
35 |
36 | \item{ann_not_vptree}{use vp-trees (as in bhtsne) or approximate nearest neighbors (default).
37 | set to be True for approximate nearest neighbors}
38 |
39 | \item{exaggeration_factor}{coefficient for early exaggeration (>1). Default = 1, not used.}
40 |
41 | \item{no_momentum_during_exag}{Set to 0 to use momentum and other optimization tricks. 1 to do plain, vanilla
42 | gradient descent (useful for testing large exaggeration coefficients)}
43 |
44 | \item{stop_early_exag_iter}{When to switch off early exaggeration. Default 250.}
45 |
46 | \item{start_late_exag_iter}{When to start late exaggeration. 'auto' means that late exaggeration is not used, unless late_exag_coeff>0.
In that
47 | case, start_late_exag_iter is set to stop_early_exag_iter. Otherwise, set to equal the iteration at which late exaggeration should begin. Default 'auto'.}
48 |
49 | \item{late_exag_coeff}{Late exaggeration coefficient. Set to -1 to not use late exaggeration. Default -1}
50 |
51 | \item{learning_rate}{Set to desired learning rate or 'auto', which sets learning rate to N/exaggeration_factor where N is the sample size, or to 200 if
52 | N/exaggeration_factor < 200. Default 'auto'}
53 |
54 | \item{max_step_norm}{Maximum distance that a point is allowed to move on
55 | one iteration. Larger steps are clipped to this value. This prevents
56 | possible instabilities during gradient descent. Set to -1 to switch it
57 | off. Default: 5}
58 |
59 | \item{mom_switch_iter}{Default 250}
60 |
61 | \item{momentum}{Default 0.5}
62 |
63 | \item{final_momentum}{Default 0.8}
64 |
65 | \item{nterms}{If using FIt-SNE, this is the number of interpolation points per sub-interval}
66 |
67 | \item{intervals_per_integer}{See min_num_intervals}
68 |
69 | \item{min_num_intervals}{Let maxloc = ceil(max(max(X))) and minloc = floor(min(min(X))). i.e. the points are in a [minloc]^no_dims by [maxloc]^no_dims interval/square.
70 | The number of intervals in each dimension is either min_num_intervals or ceil((maxloc - minloc)/intervals_per_integer), whichever is
71 | larger. min_num_intervals must be an integer >0, and intervals_per_integer must be >0. Default: min_num_intervals=50, intervals_per_integer = 1}
72 |
73 | \item{sigma}{Fixed sigma value to use when perplexity==-1 Default -1 (None)}
74 |
75 | \item{K}{Number of nearest neighbours to get when using fixed sigma Default -30 (None)}
76 |
77 | \item{initialization}{'random', or N x no_dims array to initialize the solution. Default: 'random'.}
78 |
79 | \item{perplexity_list}{if perplexity==0 then perplexity combination will
80 | be used with values taken from perplexity_list.
Default: NULL}
81 |
82 | \item{df}{Degrees of freedom of t-distribution, must be greater than 0.
83 | Values smaller than 1 correspond to heavier tails, which can often
84 | resolve substructure in the embedding. See Kobak et al. (2019) for
85 | details. Default is 1.0}
86 |
87 | \item{fft_not_bh}{logical, default FALSE, we only use tSNE for interpreting 3D coordinates}
88 | }
89 | \description{
90 | Wrapper function for FItSNE: fast_tsne.R
91 | }
92 |
--------------------------------------------------------------------------------
/R/facilitatetSNE.R:
--------------------------------------------------------------------------------
1 |
2 | load_FItSNE = function(path2fast_tsneR = NULL){
3 |   source(path2fast_tsneR)
4 | }
5 |
6 | #' Wrapper function for FItSNE: fast_tsne.R
7 | #'
8 | #' @param path2fast_tsneR a string specifying the path to fast_tsne.R from FIt-SNE
9 | #' @param data_path a string specifying the data_path passed to FIt-SNE
10 | #' @param fast_tsne_path a string specifying the path to the fast_tsne executable
11 | #' @param load_affinities
12 | #' If 'precomputed', input data X will be regarded as precomputed similarities and passed to fast_tsne
13 | #' If 'load', input similarities will be loaded from the file.
14 | #' If 'save', input similarities are saved into a file.
15 | #' If 0 or NULL (default), affinities are neither saved nor loaded
16 | #' @param ... include all the following fields that will be passed to fast_tsne
17 | #' @param X data matrix
18 | #' @param dims dimensionality of the embedding. Default 2.
19 | #' @param perplexity perplexity is used to determine the bandwidth of the Gaussian kernel in the input space. Default 30.
20 | #' @param theta Set to 0 for exact. If non-zero, then will use either Barnes Hut or FIt-SNE based on nbody_algo. If Barnes Hut, then
21 | #' this determines the accuracy of BH approximation. Default 0.5.
22 | #' @param max_iter Number of iterations of t-SNE to run. Default 1000.
23 | #' @param ann_not_vptree use vp-trees (as in bhtsne) or approximate nearest neighbors (default). 24 | #' set to be True for approximate nearest neighbors 25 | #' @param exaggeration_factor coefficient for early exaggeration (>1). Default = 1, not used. 26 | #' @param no_momentum_during_exag Set to 0 to use momentum and other optimization tricks. 1 to do plain,vanilla 27 | #' gradient descent (useful for testing large exaggeration coefficients) 28 | #' @param stop_early_exag_iter When to switch off early exaggeration. Default 250. 29 | #' @param start_late_exag_iter When to start late exaggeration. 'auto' means that late exaggeration is not used, unless late_exag_coeff>0. In that 30 | #' case, start_late_exag_iter is set to stop_early_exag_iter. Otherwise, set to equal the iteration at which late exaggeration should begin. Default 'auto'. 31 | #' @param late_exag_coeff Late exaggeration coefficient. Set to -1 to not use late exaggeration. Default -1 32 | #' @param learning_rate Set to desired learning rate or 'auto', which sets learning rate to N/exaggeration_factor where N is the sample size, or to 200 if 33 | #' N/exaggeration_factor < 200. Default 'auto' 34 | #' @param max_step_norm Maximum distance that a point is allowed to move on 35 | #' one iteration. Larger steps are clipped to this value. This prevents 36 | #' possible instabilities during gradient descent. Set to -1 to switch it 37 | #' off. Default: 5 38 | #' @param mom_switch_iter Default 250 39 | #' @param momentum Default 0.5 40 | #' @param final_momentum Default 0.8 41 | #' @param nterms If using FIt-SNE, this is the number of interpolation points per sub-interval 42 | #' @param intervals_per_integer See min_num_intervals 43 | #' @param min_num_intervals Let maxloc = ceil(max(max(X))) and minloc = floor(min(min(X))). i.e. the points are in a [minloc]^no_dims by [maxloc]^no_dims interval/square. 
44 | #' The number of intervals in each dimension is either min_num_intervals or ceil((maxloc - minloc)/intervals_per_integer), whichever is
45 | #' larger. min_num_intervals must be an integer >0, and intervals_per_integer must be >0. Default: min_num_intervals=50, intervals_per_integer = 1
46 | #' @param sigma Fixed sigma value to use when perplexity==-1 Default -1 (None)
47 | #' @param K Number of nearest neighbours to get when using fixed sigma Default -30 (None)
48 | #' @param initialization 'random', or N x no_dims array to initialize the solution. Default: 'random'.
49 | #' @param perplexity_list if perplexity==0 then perplexity combination will
50 | #' be used with values taken from perplexity_list. Default: NULL
51 | #' @param df Degrees of freedom of t-distribution, must be greater than 0.
52 | #' Values smaller than 1 correspond to heavier tails, which can often
53 | #' resolve substructure in the embedding. See Kobak et al. (2019) for
54 | #' details. Default is 1.0
55 | #' @param fft_not_bh logical, default FALSE, we only use tSNE for interpreting 3D coordinates
56 | #' @param verbose Print run information for debugging.
57 | #' @export
58 | #'
59 | run_tSNE = function(path2fast_tsneR = NULL,
60 |                     fast_tsne_path = NULL,
61 |                     verbose = T, ...) {
62 |   load_FItSNE(path2fast_tsneR)
63 |   args = list(...)
64 | 65 | args$dims = ifelse(is.null(args$dims), 2, args$dims) 66 | args$perplexity = ifelse(is.null(args$perplexity), 30, args$perplexity) 67 | args$theta = ifelse(is.null(args$theta), 0.5, args$theta) 68 | args$max_iter = ifelse(is.null(args$max_iter), 1000, args$max_iter) 69 | args$fft_not_bh = ifelse(is.null(args$fft_not_bh), FALSE, args$fft_not_bh) 70 | args$ann_not_vptree = ifelse(is.null(args$ann_not_vptree), TRUE, args$ann_not_vptree) 71 | args$stop_early_exag_iter = ifelse(is.null(args$stop_early_exag_iter), 72 | 250, 73 | args$stop_early_exag_iter) 74 | args$exaggeration_factor = ifelse(is.null(args$exaggeration_factor), 75 | 1, 76 | args$exaggeration_factor) 77 | args$no_momentum_during_exag = ifelse(is.null(args$no_momentum_during_exag), 78 | FALSE, 79 | args$no_momentum_during_exag) 80 | args$start_late_exag_iter = ifelse(is.null(args$start_late_exag_iter), 81 | -1 , 82 | args$start_late_exag_iter) 83 | args$late_exag_coeff = ifelse(is.null(args$late_exag_coeff), -1, args$late_exag_coeff) 84 | args$mom_switch_iter = ifelse(is.null(args$mom_switch_iter), 250 , args$mom_switch_iter) 85 | args$momentum = ifelse(is.null(args$momentum), 0.5 , args$momentum) 86 | args$final_momentum = ifelse(is.null(args$final_momentum), 0.8 , args$final_momentum) 87 | args$learning_rate = ifelse(is.null(args$learning_rate), 'auto', args$learning_rate) 88 | args$n_trees = ifelse(is.null(args$n_trees), 50 , args$n_trees) 89 | args$search_k = ifelse(is.null(args$search_k),-1 , args$search_k) 90 | args$rand_seed = ifelse(is.null(args$rand_seed),-1, args$rand_seed) 91 | args$nterms = ifelse(is.null(args$nterms), 3 , args$nterms) 92 | args$intervals_per_integer = ifelse(is.null(args$intervals_per_integer), 93 | 1 , 94 | args$intervals_per_integer) 95 | args$min_num_intervals = ifelse(is.null(args$min_num_intervals), 50 , args$min_num_intervals) 96 | args$K = ifelse(is.null(args$K),-1 , args$K) 97 | args$sigma = ifelse(is.null(args$sigma),-30 , args$sigma) 98 | args$initialization 
= ifelse(is.null(args$initialization), 'random', args$initialization) 99 | args$max_step_norm = ifelse(is.null(args$max_step_norm), 5, args$max_step_norm) 100 | # args$data_path = ifelse(is.null(args$data_path), NULL , args$data_path) 101 | # args$result_path = ifelse(is.null(args$result_path), NULL, args$result_path) 102 | # args$load_affinities = ifelse(is.null(args$load_affinities), NULL, args$load_affinities) 103 | # args$fast_tsne_path = ifelse(is.null(args$fast_tsne_path), NULL , args$fast_tsne_path) 104 | args$nthreads = ifelse(is.null(args$nthreads), 0 , args$nthreads) 105 | # args$perplexity_list = ifelse(is.null(args$perplexity_list), NULL , args$perplexity_list) 106 | args$get_costs = ifelse(is.null(args$get_costs), FALSE , args$get_costs) 107 | args$df = ifelse(is.null(args$df), 1.0, args$df) 108 | 109 | if (!is.null(args$load_affinities)) { 110 | # regarded input X as precomputed affinity matrix and output as the fast_tsne requested 111 | if (args$load_affinities == "precomputed") { 112 | if (is.null(args$X)) { 113 | stop("Empty data slot.") 114 | } 115 | if (args$theta == 0) { 116 | # only support exact tSNE 117 | if (verbose) { 118 | message(paste0("Saving affinity matrix to:", "P.dat")) 119 | } 120 | f <- file("P.dat", "wb") 121 | writeBin(as.numeric(args$X), f) 122 | close(f) 123 | args$load_affinities = "load" # rewrite for passing to fast_tsne 124 | } else { 125 | 126 | # row_P = ones([size(P,1)+1,1],'uint32'); 127 | # col_P = ones([1,1],'uint32'); 128 | # val_P = zeros([1,1], 'double'); 129 | # k = 0; 130 | # for i = 1:size(P,1) 131 | # row_P(i, 1) = k; 132 | # for j = 1:size(P,1) 133 | # if P(i,j) ~= 0 134 | # col_P(k+1, 1) = j; 135 | # val_P(k+1, 1) = P(i, j); 136 | # k = k+1; 137 | # end 138 | # end 139 | # end 140 | # row_P(size(P,1)+1,1) = k; 141 | 142 | if (verbose) { 143 | message(paste0("Saving affinity matrix to:", "val/row/col_P.dat")) 144 | } 145 | P = args$X 146 | 147 | tic() 148 | min_P = min(P) # min_P can be specified in the 
# future, to be modified
149 |         num_notZero = sum(P > min_P) # the minimum values were ignored and recognized as zero
150 |         row_P = rep(0L, nrow(P) + 1)
151 |         col_P = rep(0L, num_notZero)
152 |         val_P = rep(0, num_notZero)
153 |
154 |         k = 0 # number of non-minimum entries written so far (0-based offset)
155 |         for(i in 1:nrow(P)){
156 |           row_P[i] = k
157 |           for(j in 1:ncol(P)){
158 |             if(P[i,j] != min_P){
159 |               col_P[k + 1] = j # R vectors are 1-indexed, so entry k goes to slot k + 1
160 |               val_P[k + 1] = P[i, j]
161 |               k = k+1
162 |             }
163 |           }
164 |         }
165 |         row_P[i+1] = k
166 |         toc()
167 |
168 |         row_P = as.integer(row_P)
169 |         col_P = as.integer(col_P)
170 |
171 |         # write .dat
172 |         f <- file("P_row.dat", "wb")
173 |         writeBin(row_P, f)
174 |         close(f)
175 |
176 |         f <- file("P_col.dat", "wb")
177 |         writeBin(col_P, f)
178 |         close(f)
179 |
180 |         f <- file("P_val.dat", "wb")
181 |         writeBin(val_P, f)
182 |         close(f)
183 |
184 |         args$load_affinities = "load"
185 |       }
186 |     }
187 |   }
188 |   args$fast_tsne_path = fast_tsne_path
189 |
190 |   out = do.call(fftRtsne, args)
191 | }
192 |
193 |
194 | #' Run exact tsne, wrapper for integrated Exact TSNE calculation cpp
195 | #'
196 | #' @param X affinity matrix to input
197 | #' @param no_dims integer; Output dimensionality; Default = 3
198 | #' @param verbose logical; Whether to print out debug information; Default = TRUE
199 | #' @param max_iter integer; maximum iteration; Default = 1000
200 | #' @param Y_in user-defined initial coordinates; Default = NULL
201 | #' @param init TRUE if Y_in was specified
202 | #' @param rand_seed integer; random seed default = -1, set by time.
203 | #' @param max_step_norm Maximum distance that a point is allowed to move on one iteration. Larger steps are clipped to this value.
204 | #' This prevents possible instabilities during gradient descent. Set to -1 to switch it off. (Default: 5)
205 | #' @param mom_switch_iter Numeric; (Default: 250)
206 | #' @param momentum numeric; (Default 0.5)
207 | #' @param final_momentum numeric; (Default 0.8)
208 | #' @param df Degrees of freedom of t-distribution, must be greater than 0.
Values smaller than 1 correspond to heavier tails, 209 | #' which can often resolve substructure in the embedding. See Kobak et al. (2019) for details. Default is 1.0 210 | #' @useDynLib CSOmapR 211 | #' @import Rcpp 212 | #' @export 213 | #' 214 | runExactTSNE_R = function(X, no_dims = 3, ...){ 215 | # NumericMatrix X, int no_dims, 216 | # bool verbose, int max_iter, 217 | # NumericMatrix Y_in, bool init, 218 | # int rand_seed, bool skip_random_init, double max_step_norm, 219 | # int mom_switch_iter, double momentum, double final_momentum, double df 220 | args = list(...) 221 | args$verbose = ifelse(is.null(args$verbose), T, args$verbose) 222 | args$max_iter = ifelse(is.null(args$max_iter), 1000, args$max_iter) 223 | args$rand_seed = ifelse(is.null(args$rand_seed), -1, args$rand_seed) 224 | args$init = ifelse(is.null(args$Y_in), F, T) 225 | if(!args$init){args$Y_in = matrix(0, 1, 1)} # default for input 226 | args$skip_random_init = args$init 227 | args$max_step_norm = ifelse(is.null(args$max_step_norm), 5, args$max_step_norm) 228 | args$mom_switch_iter = ifelse(is.null(args$mom_switch_iter), 250 , args$mom_switch_iter) 229 | args$momentum = ifelse(is.null(args$momentum), 0.5 , args$momentum) 230 | args$final_momentum = ifelse(is.null(args$final_momentum), 0.8 , args$final_momentum) 231 | args$df = ifelse(is.null(args$df), 1, args$df) 232 | if(args$verbose) loginfo("Now calculating exact TSNE") 233 | out = do.call(runExactTSNE_wrapper, c(list(X = X, no_dims = no_dims), args)) 234 | if(args$verbose) loginfo("Calculation done!") 235 | out$Y = t(out$Y) 236 | return(out) 237 | } 238 | -------------------------------------------------------------------------------- /src/mytsne.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | * 3 | * Copyright (c) 2014, Laurens van der Maaten (Delft University of Technology) 4 | * All rights reserved. 
5 | * 6 | * Redistribution and use in source and binary forms, with or without 7 | * modification, are permitted provided that the following conditions are met: 8 | * 1. Redistributions of source code must retain the above copyright 9 | * notice, this list of conditions and the following disclaimer. 10 | * 2. Redistributions in binary form must reproduce the above copyright 11 | * notice, this list of conditions and the following disclaimer in the 12 | * documentation and/or other materials provided with the distribution. 13 | * 3. All advertising materials mentioning features or use of this software 14 | * must display the following acknowledgement: 15 | * This product includes software developed by the Delft University of Technology. 16 | * 4. Neither the name of the Delft University of Technology nor the names of 17 | * its contributors may be used to endorse or promote products derived from 18 | * this software without specific prior written permission. 19 | * 20 | * THIS SOFTWARE IS PROVIDED BY LAURENS VAN DER MAATEN ''AS IS'' AND ANY EXPRESS 21 | * OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES 22 | * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO 23 | * EVENT SHALL LAURENS VAN DER MAATEN BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 24 | * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 25 | * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR 26 | * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 27 | * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING 28 | * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY 29 | * OF SUCH DAMAGE. 
30 | * 31 | */ 32 | 33 | // #include "winlibs/stdafx.h" 34 | #ifdef _WIN32 35 | #define _CRT_SECURE_NO_DEPRECATE 36 | #endif 37 | 38 | #include 39 | #include 40 | #include 41 | #include 42 | // #include "nbodyfft.h" 43 | #include 44 | // #include "annoylib.h" 45 | #include "kissrandom.h" 46 | #include 47 | #include 48 | #include 49 | // #include "vptree.h" 50 | // #include "sptree.h" 51 | #include "mytsne.h" 52 | // #include "progress_bar/ProgressBar.hpp" 53 | // #include "parallel_for.h" 54 | #include "time_code.h" 55 | 56 | #include 57 | 58 | using namespace std::chrono; 59 | 60 | // #ifdef _WIN32 61 | // #include "winlibs/unistd.h" 62 | // #else 63 | // #include 64 | // #endif 65 | 66 | #include 67 | 68 | #define _CRT_SECURE_NO_WARNINGS 69 | 70 | 71 | int itTest = 0; 72 | 73 | using namespace std; 74 | 75 | //Helper function for printing Y at each iteration. Useful for debugging 76 | void print_progress(int iter, double *Y, int N, int no_dims) { 77 | 78 | ofstream myfile; 79 | std::ostringstream stringStream; 80 | stringStream << "dat/intermediate" << iter << ".txt"; 81 | std::string copyOfStr = stringStream.str(); 82 | myfile.open(stringStream.str().c_str()); 83 | for (int j = 0; j < N; j++) { 84 | for (int i = 0; i < no_dims; i++) { 85 | myfile << Y[j * no_dims + i] << " "; 86 | } 87 | myfile << "\n"; 88 | } 89 | myfile.close(); 90 | } 91 | 92 | 93 | // Perform t-SNE 94 | // int TSNE::run(double *X, int N, int D, double *Y, int no_dims, double perplexity, double theta, int rand_seed, 95 | // bool skip_random_init, int max_iter, int stop_lying_iter, int mom_switch_iter, 96 | // double momentum, double final_momentum, double learning_rate, int K, double sigma, 97 | // int nbody_algorithm, int knn_algo, double early_exag_coeff, double *costs, 98 | // bool no_momentum_during_exag, int start_late_exag_iter, double late_exag_coeff, int n_trees, int search_k, 99 | // int nterms, double intervals_per_integer, int min_num_intervals, unsigned int nthreads, 100 | // int 
load_affinities, int perplexity_list_length, double *perplexity_list, double df, 101 | // double max_step_norm) { 102 | int runExactTSNE(double* P, int N, int D, double* Y, int no_dims, int rand_seed, 103 | bool skip_random_init, int max_iter, int mom_switch_iter, 104 | double momentum, double final_momentum, 105 | double* costs, double df, double max_step_norm, bool verbose) { 106 | 107 | 108 | // Determine whether we are using an exact algorithm 109 | // Allocate some memory 110 | auto *dY = (double *) malloc(N * no_dims * sizeof(double)); 111 | auto *uY = (double *) malloc(N * no_dims * sizeof(double)); 112 | auto *gains = (double *) malloc(N * no_dims * sizeof(double)); 113 | if (dY == nullptr || uY == nullptr || gains == nullptr) throw std::bad_alloc(); 114 | // Initialize gradient to zeros and gains to ones. 115 | for (int i = 0; i < N * no_dims; i++) uY[i] = .0; 116 | for (int i = 0; i < N * no_dims; i++) gains[i] = 1.0; 117 | 118 | // Set random seed 119 | if (skip_random_init != true) { 120 | if (rand_seed >= 0) { 121 | if(verbose){Rprintf("Using random seed: %d\n", rand_seed);} 122 | srand((unsigned int) rand_seed); 123 | } else { 124 | if(verbose){Rprintf("Using current time as random seed...\n");} 125 | srand(time(NULL)); 126 | } 127 | } 128 | 129 | // Initialize solution (randomly) 130 | if (skip_random_init != true) { 131 | if(verbose) {Rprintf("Randomly initializing the solution.\n");} 132 | for (int i = 0; i < N * no_dims; i++) Y[i] = randn() * .0001; 133 | if(verbose) {Rprintf("Y[0] = %lf\n", Y[0]);} 134 | } else { 135 | if(verbose) {Rprintf("Using the given initialization.\n");} 136 | } 137 | 138 | if(verbose) {print_progress(0, Y, N, no_dims);} 139 | 140 | // Perform main training loop 141 | if(verbose) {Rprintf("Similarities loaded \nLearning embedding...\n");} 142 | 143 | std::chrono::steady_clock::time_point start_time = std::chrono::steady_clock::now(); 144 | 145 | if(verbose) { Rprintf("Running iterations: %d\n", max_iter); } 146 | for (int 
iter = 0; iter < max_iter; iter++) { 147 | itTest = iter; 148 | computeExactGradient(P, Y, N, no_dims, dY,df); 149 | // no_mementum_during_exag was = FALSE in .R 150 | for (int i = 0; i < N * no_dims; i++) 151 | gains[i] = (sign(dY[i]) != sign(uY[i])) ? (gains[i] + .2) : (gains[i] * .8); 152 | for (int i = 0; i < N * no_dims; i++) if (gains[i] < .01) gains[i] = .01; 153 | // for (int i = 0; i < N * no_dims; i++) uY[i] = momentum * uY[i] - learning_rate * gains[i] * dY[i]; 154 | for (int i = 0; i < N * no_dims; i++) uY[i] = momentum * uY[i] - gains[i] * dY[i]; // try remove learning rate 155 | // Clip the step sizes if max_step_norm is provided 156 | if (max_step_norm > 0) { 157 | for (int i=0; i max_step_norm) { 164 | for (int j=0; j(now - start_time).count()/(float)1000.0, C); 191 | } 192 | start_time = std::chrono::steady_clock::now(); 193 | } 194 | } 195 | 196 | if(verbose) {Rprintf("All iterations done, cleaning now ...\n");} 197 | usleep(100000); // pause a little bit to print out information 198 | 199 | // Clean up memory 200 | free(dY); 201 | free(uY); 202 | free(gains); 203 | // free(P); 204 | if(verbose) {Rprintf("Cleanup done ...\n");} 205 | usleep(100000); // pause a little bit to print out information 206 | return 0; 207 | } 208 | 209 | 210 | void computeExactGradientTest(double* Y, int N, int D, double df ) { 211 | // Compute the squared Euclidean distance matrix 212 | double *DD = (double *) malloc(N * N * sizeof(double)); 213 | if (DD == NULL) { 214 | Rprintf("Memory allocation failed!\n"); 215 | exit(1); 216 | } 217 | computeSquaredEuclideanDistance(Y, N, D, DD); 218 | 219 | // Compute Q-matrix and normalization sum 220 | double *Q = (double *) malloc(N * N * sizeof(double)); 221 | if (Q == NULL) { 222 | Rprintf("Memory allocation failed!\n"); 223 | exit(1); 224 | } 225 | double sum_Q = .0; 226 | int nN = 0; 227 | for (int n = 0; n < N; n++) { 228 | for (int m = 0; m < N; m++) { 229 | if (n != m) { 230 | Q[nN + m] = 1.0 / pow(1.0 + DD[nN + 
m]/(double)df, (df)); 231 | sum_Q += Q[nN + m]; 232 | } 233 | } 234 | nN += N; 235 | } 236 | 237 | // Perform the computation of the gradient 238 | char buffer[500]; 239 | sprintf(buffer, "temp/exact_gradient%d.txt", itTest); 240 | FILE *fp = fopen(buffer, "w"); // Open file for writing 241 | nN = 0; 242 | int nD = 0; 243 | for (int n = 0; n < N; n++) { 244 | double testQij = 0; 245 | double testPos = 0; 246 | double testNeg1 = 0; 247 | double testNeg2 = 0; 248 | double testdC = 0; 249 | int mD = 0; 250 | for (int m = 0; m < N; m++) { 251 | if (n != m) { 252 | testNeg1 += pow(Q[nN + m],(df +1.0)/df) * (Y[nD + 0] - Y[mD + 0]) / sum_Q; 253 | testNeg2 += pow(Q[nN + m],(df +1.0)/df) * (Y[nD + 1] - Y[mD + 1]) / sum_Q; 254 | } 255 | mD += D; 256 | } 257 | fprintf(fp, "%d, %.12e, %.12e\n", n, testNeg1,testNeg2); 258 | 259 | nN += N; 260 | nD += D; 261 | } 262 | fclose(fp); 263 | free(DD); 264 | free(Q); 265 | 266 | } 267 | 268 | 269 | // Compute the exact gradient of the t-SNE cost function 270 | void computeExactGradient(double* P, double* Y, int N, int D, double* dC, double df) { 271 | // Make sure the current gradient contains zeros 272 | for (int i = 0; i < N * D; i++) dC[i] = 0.0; 273 | 274 | // Compute the squared Euclidean distance matrix 275 | auto *DD = (double *) malloc(N * N * sizeof(double)); 276 | if (DD == nullptr) throw std::bad_alloc(); 277 | computeSquaredEuclideanDistance(Y, N, D, DD); 278 | 279 | // Compute Q-matrix and normalization sum 280 | auto *Q = (double *) malloc(N * N * sizeof(double)); 281 | if (Q == nullptr) throw std::bad_alloc(); 282 | 283 | auto *Qpow = (double *) malloc(N * N * sizeof(double)); 284 | if (Qpow == nullptr) throw std::bad_alloc(); 285 | 286 | double sum_Q = .0; 287 | int nN = 0; 288 | for (int n = 0; n < N; n++) { 289 | for (int m = 0; m < N; m++) { 290 | if (n != m) { 291 | //Q[nN + m] = 1.0 / pow(1.0 + DD[nN + m]/(double)df, df); 292 | Q[nN + m] = 1.0 / (1.0 + DD[nN + m]/(double)df); 293 | Qpow[nN + m] = pow(Q[nN + m], 
df); 294 | sum_Q += Qpow[nN + m]; 295 | } 296 | } 297 | nN += N; 298 | } 299 | 300 | // Perform the computation of the gradient 301 | nN = 0; 302 | int nD = 0; 303 | for (int n = 0; n < N; n++) { 304 | int mD = 0; 305 | for (int m = 0; m < N; m++) { 306 | if (n != m) { 307 | double mult = (P[nN + m] - (Qpow[nN + m] / sum_Q)) * (Q[nN + m]); 308 | for (int d = 0; d < D; d++) { 309 | dC[nD + d] += (Y[nD + d] - Y[mD + d]) * mult; 310 | } 311 | } 312 | mD += D; 313 | } 314 | nN += N; 315 | nD += D; 316 | } 317 | free(Q); 318 | free(Qpow); 319 | free(DD); 320 | } 321 | 322 | 323 | // Evaluate t-SNE cost function (exactly) 324 | double evaluateError(double* P, double* Y, int N, int D, double df, bool verbose) { 325 | // Compute the squared Euclidean distance matrix 326 | double *DD = (double *) malloc(N * N * sizeof(double)); 327 | double *Q = (double *) malloc(N * N * sizeof(double)); 328 | if (DD == NULL || Q == NULL) { 329 | Rprintf("Memory allocation failed!\n"); 330 | exit(1); 331 | } 332 | if(verbose){ Rprintf("computeSquared\n"); } 333 | computeSquaredEuclideanDistance(Y, N, D, DD); 334 | 335 | // Compute Q-matrix and normalization sum 336 | if(verbose){ Rprintf("calculate Q-matrix\n"); } 337 | int nN = 0; 338 | double sum_Q = DBL_MIN; 339 | for (int n = 0; n < N; n++) { 340 | for (int m = 0; m < N; m++) { 341 | if (n != m) { 342 | //Q[nN + m] = 1.0 / pow(1.0 + DD[nN + m]/(double)df, df); 343 | Q[nN + m] = 1.0 / (1.0 + DD[nN + m]/(double)df); 344 | Q[nN +m ] = pow(Q[nN +m ], df); 345 | sum_Q += Q[nN + m]; 346 | } else Q[nN + m] = DBL_MIN; 347 | } 348 | nN += N; 349 | } 350 | //Rprintf("sum_Q: %e", sum_Q); 351 | if(verbose){ Rprintf("normalize Q-matrix\n"); } 352 | for (int i = 0; i < N * N; i++) Q[i] /= sum_Q; 353 | // for (int i = 0; i < N; i++) Rprintf("Q[%d]: %e\n", i, Q[i]); 354 | 355 | //Rprintf("Q[N*N/2+1]: %e, Q[N*N-1]: %e\n", Q[N*N/2+1], Q[N*N/2+2]); 356 | 357 | // Sum t-SNE error 358 | if(verbose){ Rprintf("sum error, to %i\n", N*N); } 359 | double C = .0; 
360 | for (int n = 0; n < N * N; n++) { 361 | C += P[n] * log((P[n] + FLT_MIN) / (Q[n] + FLT_MIN)); 362 | } 363 | 364 | // Clean up memory 365 | free(DD); 366 | free(Q); 367 | return C; 368 | } 369 | 370 | 371 | // Compute squared Euclidean distance matrix 372 | void computeSquaredEuclideanDistance(double* X, int N, int D, double* DD) { 373 | const double *XnD = X; 374 | for (int n = 0; n < N; ++n, XnD += D) { 375 | const double *XmD = XnD + D; 376 | double *curr_elem = &DD[n * N + n]; 377 | *curr_elem = 0.0; 378 | double *curr_elem_sym = curr_elem + N; 379 | for (int m = n + 1; m < N; ++m, XmD += D, curr_elem_sym += N) { 380 | *(++curr_elem) = 0.0; 381 | for (int d = 0; d < D; ++d) { 382 | *curr_elem += (XnD[d] - XmD[d]) * (XnD[d] - XmD[d]); 383 | } 384 | *curr_elem_sym = *curr_elem; 385 | } 386 | } 387 | } 388 | 389 | 390 | // Makes data zero-mean 391 | void zeroMean(double* X, int N, int D) { 392 | // Compute data mean 393 | double *mean = (double *) calloc(D, sizeof(double)); 394 | if (mean == NULL) throw std::bad_alloc(); 395 | 396 | int nD = 0; 397 | for (int n = 0; n < N; n++) { 398 | for (int d = 0; d < D; d++) { 399 | mean[d] += X[nD + d]; 400 | } 401 | nD += D; 402 | } 403 | for (int d = 0; d < D; d++) { 404 | mean[d] /= (double) N; 405 | } 406 | 407 | // Subtract data mean 408 | nD = 0; 409 | for (int n = 0; n < N; n++) { 410 | for (int d = 0; d < D; d++) { 411 | X[nD + d] -= mean[d]; 412 | } 413 | nD += D; 414 | } 415 | free(mean); 416 | } 417 | 418 | 419 | // Generates a Gaussian random number 420 | double randn() { 421 | double x, y, radius; 422 | do { 423 | x = 2 * (rand() / ((double) RAND_MAX + 1)) - 1; 424 | y = 2 * (rand() / ((double) RAND_MAX + 1)) - 1; 425 | radius = (x * x) + (y * y); 426 | } while ((radius >= 1.0) || (radius == 0.0)); 427 | radius = sqrt(-2 * log(radius) / radius); 428 | x *= radius; 429 | return x; 430 | } 431 | 432 | -------------------------------------------------------------------------------- /R/utils.R: 
-------------------------------------------------------------------------------- 1 | # suppressPackageStartupMessages(library("Rcpp")) 2 | # suppressPackageStartupMessages(library("RcppEigen")) 3 | # sourceCpp("utils/functions.cpp") # hard coded, should be modified in the future 4 | 5 | # Main interface ---- 6 | 7 | #' Test for significance level 8 | #' 9 | #' @param coordinates a matrix of 2D or 3D cell coordinates, one row per cell 10 | #' @param labels a vector of celltype labels, corresponding to the rows of the coordinates matrix 11 | #' @param k an integer for top k connections 12 | #' @export 13 | #' @return A list containing connection counts, p values and q values 14 | #' 15 | getSignificance = function(coordinates, labels, k = 3, adjusted.method = "fdr", verbose = F) { 16 | stopifnot(is.matrix(coordinates)) 17 | if(ncol(coordinates) < 2 || ncol(coordinates) > 3) 18 | warning("Abnormal number of dimensions of the coordinates") 19 | # preprocess 20 | labels = setNames(labels, nm = rownames(coordinates)) 21 | standards <- unique(labels) 22 | # labelIx <- match(labels, standards) 23 | cellCounts <- table(labels) 24 | 25 | # Calc distance 26 | dist_mt <- as.matrix(dist(coordinates)) 27 | 28 | # identify topK as a cutoff 29 | if(verbose) loginfo("identify topK") 30 | topKs <- c() 31 | diag(dist_mt) <- Inf 32 | topKs = apply(dist_mt, 1, function(dist_mt_row_i){ 33 | dist_mtSorted <- sort(dist_mt_row_i) 34 | dist_mtSorted[k] 35 | }) 36 | topK <- median(topKs) 37 | 38 | # initiate counts 39 | counts <- 40 | matrix(0, nrow = length(standards), ncol = length(standards)) 41 | colnames(counts) <- standards 42 | rownames(counts) <- standards 43 | 44 | # Cells within topK range are recognized as connected 45 | 46 | # if(verbose) loginfo("calculate connection") 47 | # for (i in 1:nrow(dist_mt)) { # explore cells one by one 48 | # connects <- which(dist_mt[i,] <= topK) 49 | # for (j in connects) { 50 | # counts[labelIx[i], labelIx[j]] = counts[labelIx[i], labelIx[j]] + 1 51 | # } 52 | # } 53 | 54 | # an alternative way 55 | 
if(verbose) loginfo("calculate detailed connections") 56 | connects_mt = dist_mt <= topK 57 | counts = 58 | apply(apply(connects_mt, 1, function(x) { 59 | tapply(x, labels, sum) 60 | }), 1, function(x) { 61 | tapply(x, labels, sum) 62 | }) 63 | 64 | diag(counts) <- diag(counts) / 2 # diag elements were counted twice 65 | 66 | # Store detailed connected cell pairs 67 | detailed_connections = list() 68 | clusterPairs2run = rbind(t(combn(standards, 2)), matrix(rep(standards, 2), ncol = 2)) 69 | for(i_row in 1:nrow(clusterPairs2run)) { 70 | cluster1 = clusterPairs2run[i_row, 1] 71 | cluster2 = clusterPairs2run[i_row, 2] 72 | 73 | cellsfrom1 = names(labels)[labels == cluster1] 74 | cellsfrom2 = names(labels)[labels == cluster2] 75 | 76 | sub_connects_mt = connects_mt[cellsfrom1, cellsfrom2, drop = F] 77 | 78 | if(cluster1 == cluster2){ # only count once if self2self 79 | sub_connects_mt[lower.tri(sub_connects_mt)] = F 80 | } 81 | sub_connects_df = data.frame( 82 | cell1 = cellsfrom1[row(sub_connects_mt)][sub_connects_mt], 83 | cell2 = cellsfrom2[col(sub_connects_mt)][sub_connects_mt], stringsAsFactors = F 84 | ) 85 | detailed_connections[[paste0(cluster1, "---", cluster2)]] = sub_connects_df 86 | } 87 | 88 | # calculate pvalue using hypergeometric distribution 89 | if(verbose) loginfo("calculate pvalues") 90 | K <- (sum(counts) + sum(diag(counts))) / 2 91 | p_value <- counts 92 | 93 | assertthat::assert_that(all(rownames(counts) == names(cellCounts))) 94 | assertthat::assert_that(all(colnames(counts) == names(cellCounts))) 95 | for (i in 1:nrow(counts)) { 96 | for (j in 1:ncol(counts)) { 97 | if (i == j) { 98 | M <- as.numeric(cellCounts[rownames(counts)[i]]) * (as.numeric(cellCounts[colnames(counts)[j]]) - 1) / 2 99 | } else { 100 | M <- as.numeric(cellCounts[rownames(counts)[i]]) * (as.numeric(cellCounts[colnames(counts)[j]])) 101 | } 102 | N <- sum(cellCounts) * (sum(cellCounts) - 1) / 2 - M 103 | p_value[i, j] <- 104 | phyper(counts[i, j], M, N, K, lower.tail = 
FALSE) 105 | } 106 | } 107 | 108 | # p adjust 109 | clusters = colnames(p_value) 110 | cluster_pair = paste(clusters[row(p_value)], clusters[col(p_value)], sep="---") 111 | low_idx = lower.tri(p_value, diag = T) 112 | p_value_df = 113 | data.frame(cluster_pair = cluster_pair[low_idx], 114 | p.value = p_value[low_idx], 115 | q.value = p.adjust(p_value[low_idx], method = adjusted.method), 116 | stringsAsFactors = F) 117 | 118 | q_value = p_value 119 | q_value[low_idx] = p_value_df$q.value 120 | q_value = t(q_value) 121 | q_value[low_idx] = p_value_df$q.value 122 | 123 | result = list() 124 | result$connections = counts 125 | result$pvalue = p_value 126 | result$qvalue = q_value 127 | result$pvalue_tbl = p_value_df 128 | result$detailed_connections = detailed_connections 129 | result$topK = topK 130 | return(result) 131 | } 132 | 133 | 134 | #' Optimize the 3D coordinates (Rcpp) 135 | #' This function is inspired by the t-SNE algorithm. We use a similar gradient descent 136 | #' method to optimize the target function specified in our paper. 137 | #' condition can be "loose" or "tight"; we suggest using the "loose" condition 138 | #' for datasets with over 10000 cells 139 | #' @param affinityMat affinity matrix 140 | #' @param initial_config initial configuration 141 | #' @param k number of dimensions of the output coordinates, default = 3 142 | #' @param max_iter Maximum number of iterations 143 | #' @param min_cost Minimum cost 144 | #' @param condition A string, either 'loose' or 'tight' 145 | #' @param momentum initial momentum, default = 0.5 146 | #' @param final_momentum final momentum, default = 0.8 147 | #' @param mom_switch_iter iteration at which momentum is switched to final_momentum, default = 250 148 | #' @param epsilon initial learning rate, default = 1000 149 | #' @param min_gain minimum gain for delta-bar-delta, default = 0.01 150 | #' @param eps small positive constant used as a lower bound on Q and to avoid log(0) in the cost 151 | #' @param epoch numeric, print out the loss function cost after every *epoch* iterations 152 | #' @param verbose logical. 
If TRUE, print out the progress information 153 | #' @return a matrix of optimized 3D coordinates 154 | #' @export 155 | #' 156 | optimization <- 157 | function (affinityMat, 158 | initial_config = NULL, 159 | k = 3, 160 | max_iter = 1000, 161 | min_cost = 0, 162 | condition = "tight", 163 | momentum = 0.5, 164 | final_momentum = 0.8, 165 | mom_switch_iter = 250, 166 | epsilon = 1000, 167 | min_gain = 0.01, 168 | eps = 2.2251e-308, 169 | epoch = 100, 170 | verbose = F) { 171 | n = nrow(affinityMat) 172 | 173 | if (!is.null(initial_config) && is.matrix(initial_config)) { 174 | if (nrow(initial_config) != n | ncol(initial_config) != 175 | k) { 176 | stop("initial_config argument does not match necessary configuration for X") 177 | } 178 | ydata = initial_config 179 | } 180 | else { 181 | # ydata = matrix(rnorm(k * n), n) 182 | ydata = (matrix(runif(k * n), n) - 0.5) * 50 183 | } 184 | P = affinityMat 185 | # P = 0.5 * (affinityMat + t(affinityMat)) 186 | # P[P < eps] <- eps 187 | # P = P / sum(P) 188 | grads = matrix(0, nrow(ydata), ncol(ydata)) 189 | incs = matrix(0, nrow(ydata), ncol(ydata)) 190 | gains = matrix(1, nrow(ydata), ncol(ydata)) 191 | if(verbose) loginfo("Iteration started") 192 | iter01 = ".I1_fun" 193 | initiatePB(iter01) 194 | 195 | for (iter in 1:max_iter) { 196 | d = calc_d_rcpp(ydata) 197 | num = 1/(1+d) 198 | diag(num) = 0 199 | Q = num/sum(num) 200 | Q[Q < eps] = eps 201 | P_Q = P - Q 202 | P_Q[P_Q > 0 & d <= 0.01] = -0.01 203 | 204 | # stiffnesses = 4 * P_Q * num 205 | stiffnesses = 4 * (P-Q) * num 206 | 207 | grads = update_grads_rcpp(grads, ydata, stiffnesses) 208 | 209 | gains = ((gains + 0.2) * abs(sign(grads) != sign(incs)) + 210 | gains * 0.8 * abs(sign(grads) == sign(incs))) 211 | gains[gains < min_gain] = min_gain 212 | incs = momentum * incs - epsilon * (gains * grads) 213 | ydata = ydata + incs 214 | ydata = sweep(ydata, 2, apply(ydata, 2, mean)) 215 | if (iter == mom_switch_iter) 216 | momentum = final_momentum 217 | if (iter %% 
epoch == 0) { 218 | cost = sum(apply(P * log((P + eps) / (Q + eps)), 1, 219 | sum)) 220 | message("Iteration #", iter, " loss function cost is: ", 221 | cost) 222 | if (cost < min_cost) 223 | break 224 | } 225 | range = max(abs(ydata)) 226 | if (condition == "tight") { 227 | if (range > 50 && iter %% 10 == 0) { 228 | ydata = ydata * 50 / range 229 | } 230 | } else { 231 | if (range > 50 && iter %% max_iter == 0) { 232 | ydata = ydata * 50 / range 233 | } 234 | } 235 | } 236 | ydata 237 | } 238 | 239 | #' Calculate affinity matrix 240 | #' @param TPM a TPM matrix with gene names as rownames and cell names as colnames 241 | #' @param LR a dataframe/tibble recording the ligand-receptor pairs; 242 | #' must have colnames "ligand", "receptor" and an optional third column with weights 243 | #' 244 | #' @param denoise numeric; rank cutoff for denoising: in each row, affinities at or below the denoise-th largest value are set to 0. Default = 50 245 | #' @param eps lower bound applied to the symmetrized affinities before normalization 246 | #' @param verbose logical. If TRUE, print out the progress information 247 | #' @export 248 | #' 249 | getAffinityMat = function(TPM, 250 | LR, 251 | denoise = 50, 252 | eps = 2.2251e-308, 253 | verbose = F, 254 | ...) 
{ 255 | 256 | genenames = rownames(TPM) 257 | cellnames = colnames(TPM) 258 | 259 | # get the TPM of ligands and receptors 260 | if(verbose) loginfo("Extracting affinity matrix") 261 | # ligandsIndex <- match(LR[, 1, drop = T], genenames) 262 | # receptorIndex <- match(LR[, 2, drop = T], genenames) 263 | 264 | flt_LR = LR[(LR[, 1] %in% rownames(TPM)) & (LR[, 2] %in% rownames(TPM)), ] 265 | 266 | reverse_flag = flt_LR[, 1] != flt_LR[, 2] 267 | reverse_LR = flt_LR 268 | reverse_LR[, 1] = flt_LR[,2] 269 | reverse_LR[, 2] = flt_LR[,1] 270 | reverse_LR = reverse_LR[reverse_flag, ] 271 | 272 | combn_LR = rbind( 273 | flt_LR, 274 | reverse_LR 275 | ) 276 | 277 | ligandsTPM <- 278 | as.matrix(TPM[combn_LR[, 1], ]) 279 | receptorTPM <- 280 | as.matrix(TPM[combn_LR[, 2], ]) 281 | 282 | # determine weight scores (the optional third column of LR holds the weights) 283 | if(ncol(combn_LR) > 2){ 284 | LRscores = combn_LR[, 3] 285 | } else { 286 | LRscores <- rep(1, nrow(combn_LR)) 287 | } 288 | 289 | if(verbose) loginfo("Extracting coordinates affinity matrix") 290 | affinityMat <- t(ligandsTPM) %*% diag(LRscores) %*% receptorTPM 291 | 292 | if(verbose) loginfo("Denoising ...") 293 | # in each row, zero out affinities at or below the denoise-th largest value 294 | for (i in 1:nrow(affinityMat)) { 295 | affinityArray <- affinityMat[i, ] 296 | affinityArraySorted <- sort(affinityArray, decreasing = TRUE) 297 | affinityArray[affinityArray <= affinityArraySorted[denoise]] = 0 298 | affinityMat[i, ] = affinityArray 299 | } 300 | 301 | # symmetrize the affinity matrix 302 | P = 0.5 * (affinityMat + t(affinityMat)) 303 | P[P < eps] <- eps 304 | # normalize 305 | P = P/sum(P) 306 | return(P) 307 | } 308 | 309 | #' Calculate 3D coordinates from expression 310 | #' A wrapper function to get 3D coordinates directly from expression 311 | #' @param TPM TPM matrix, with gene names as rownames and cell names as colnames 312 | #' @param LR dataframe/tibble; records the ligand-receptor pairs, must have colnames "ligand" and "receptor" 313 | #' @param method string; specify 
the optimization method to use. Can be 'Rcpp' or 'tSNE'. 314 | #' @param verbose logical. If TRUE, print out the progress information 315 | #' @param ... arguments passed to the chosen optimization method 316 | #' @export 317 | #' 318 | #' 319 | getCoordinates = function(TPM, LR, method = 'tSNE', verbose = F, ...) { 320 | # Get affinity 321 | affinityMat = getAffinityMat(TPM, LR) 322 | # optimization 323 | if(verbose) loginfo("Optimizing coordinates") 324 | if(method == 'Rcpp'){ 325 | coords <- optimization(affinityMat, verbose = verbose, ...) 326 | } else if(method == 'tSNE'){ 327 | coords_res = runExactTSNE_R( 328 | X = affinityMat, 329 | no_dims = 3, 330 | max_iter = 1000, 331 | verbose = verbose, ... 332 | ) 333 | coords = coords_res$Y 334 | } 335 | rownames(coords) <- colnames(TPM) 336 | colnames(coords) <- c('x', 'y', 'z') 337 | coords 338 | } 339 | 340 | 341 | #' get LR contribution for all the listed cluster pairs 342 | #' 343 | #' @param TPM a TPM matrix with gene names as rownames and cell names as colnames 344 | #' @param LR a dataframe/tibble recording the ligand-receptor pairs; 345 | #' must have colnames "ligand", "receptor" and an optional third column with weights 346 | #' @param detailed_connections a list generated by `getSignificance`, 347 | #' which stores the connected cell pairs for each cluster pair 348 | #' @param verbose logical; whether to print progress 349 | #' @return a named list with sorted LR contributions 350 | #' @export 351 | #' 352 | getContribution = function(TPM, LR, detailed_connections, verbose = T){ 353 | LR[, 1] = as.character(LR[, 1]) 354 | LR[, 2] = as.character(LR[, 2]) 355 | if(verbose) {loginfo("Extracting data matrix")} 356 | ligands_existed = intersect(rownames(TPM), LR[, 1]) 357 | receptors_existed = intersect(rownames(TPM), LR[, 2]) 358 | 359 | flt_LR = LR[(LR[, 1] %in% ligands_existed) & (LR[, 2]%in% receptors_existed), ] 360 | 361 | if(ncol(LR) > 2){ 362 | LRscores = flt_LR[, 3] 
363 | } else { 364 | LRscores = rep(1, nrow(flt_LR)) # one weight per filtered LR pair 365 | } 366 | 367 | if(verbose) {loginfo("Calculate contribution of ", length(detailed_connections), " cluster pairs.")} 368 | LR_contri_lst = list() 369 | # calculate contribution cell-pair by cell-pair 370 | for(target_clusterPair in names(detailed_connections)){ 371 | if(nrow(detailed_connections[[target_clusterPair]]) < 3){ 372 | warning("Number of connected cells in ", target_clusterPair, " is lower than 3.\n") 373 | } 374 | L1S = TPM[flt_LR[, 1], detailed_connections[[target_clusterPair]][, 1], drop = F] 375 | R1S = TPM[flt_LR[, 2], detailed_connections[[target_clusterPair]][, 1], drop = F] 376 | L1R = TPM[flt_LR[, 1], detailed_connections[[target_clusterPair]][, 2], drop = F] 377 | R1R = TPM[flt_LR[, 2], detailed_connections[[target_clusterPair]][, 2], drop = F] 378 | 379 | all_intensity = L1S * LRscores * R1R + R1S * LRscores * L1R 380 | rownames(all_intensity) = paste0(flt_LR[, 1], "---", flt_LR[, 2]) 381 | contribution_mt = t(t(all_intensity) / (colSums(all_intensity))) 382 | 383 | contribution_forCluster = sort(rowSums(contribution_mt) / ncol(contribution_mt), decreasing = T) 384 | # head(contribution_forCluster) 385 | LR_contri_lst[[target_clusterPair]] = contribution_forCluster 386 | if(which(names(detailed_connections) == target_clusterPair) %% 100 ==0 && verbose){ 387 | loginfo(sprintf("%d/%d cluster pairs calculated\n", which(names(detailed_connections) == target_clusterPair), length(names(detailed_connections)))) 388 | } 389 | } 390 | # clear temporary objects once the loop has finished 391 | rm(L1S, L1R, R1S, R1R, all_intensity, contribution_mt, contribution_forCluster) 392 | invisible(gc()) 393 | 394 | return(LR_contri_lst) 395 | } 396 | 397 | #' Calculate normalized connection based on connection matrix and cell counts 398 | #' 399 | #' @param connection_mt Named matrix 400 | #' @param cell_count_table Cell count table generated by `table()` or a named vector recording the cell counts 401 | #' @return Named matrix of 
normalized connection 402 | #' @export 403 | calcNormalizedConnection = function(connection_mt, cell_count_table){ 404 | stopifnot(all(rownames(connection_mt) == colnames(connection_mt))) 405 | cell_count_table = cell_count_table[rownames(connection_mt)] 406 | count_mt = do.call(rbind, lapply(rownames(connection_mt), function(x){ 407 | cell_count_table[x] * cell_count_table 408 | })) # base R only: NAMESPACE does not import the magrittr pipe 409 | rownames(count_mt) = rownames(connection_mt) 410 | count_mt = count_mt[, colnames(connection_mt)] 411 | diag(count_mt) = cell_count_table * (cell_count_table-1) 412 | normalized_connection = connection_mt/count_mt 413 | return(normalized_connection) 414 | } 415 | 416 | # Density ---- 417 | #' get 3D density 418 | #' Estimate 3D density around each data point based on coordinates. 419 | #' @param x,y,z coordinates 420 | #' @param n number of grid points to use for each dimension; recycled if length is less than 3. Default 100. 421 | #' @param ... other parameters passed to kde3d 422 | #' @export 423 | getDensity3D = function(x, y, z, n = 100, ...) { 424 | tryCatch({ 425 | dens <- kde3d(x = x, y = y, z =z, n = n, ...) 
426 | }, error = function(e) { 427 | print(e) 428 | warning("Switch bandwidth to h = 1") 429 | dens <<- kde3d(x = x, # retry in 3D with a fixed bandwidth so that dens$z and dens$d exist below 430 | y = y, z = z, 431 | n = n, 432 | h = 1) 433 | }) 434 | ix <- findInterval(x, dens$x) 435 | iy <- findInterval(y, dens$y) 436 | iz <- findInterval(z, dens$z) 437 | ii <- cbind(ix, iy, iz) 438 | return(dens$d[ii]) 439 | } 440 | 441 | 442 | # function adapted from R package misc3d 443 | # https://cran.r-project.org/web/packages/misc3d/index.html 444 | kde3d = function (x, y, z, h, n = 20, lims = c(range(x), range(y), range(z))) 445 | { 446 | nx <- length(x) 447 | if (length(y) != nx || length(z) != nx) 448 | stop("data vectors must be the same length") 449 | if (missing(h)) 450 | h <- c(MASS::bandwidth.nrd(x), MASS::bandwidth.nrd(y), 451 | MASS::bandwidth.nrd(z))/6 452 | else if (length(h) != 3) 453 | h <- rep(h, length = 3) 454 | if (length(n) != 3) 455 | n <- rep(n, length = 3) 456 | if (length(lims) == 2) 457 | lims <- rep(lims, length = 6) 458 | gx <- seq(lims[1], lims[2], length = n[1]) 459 | gy <- seq(lims[3], lims[4], length = n[2]) 460 | gz <- seq(lims[5], lims[6], length = n[3]) 461 | mx <- matrix(outer(gx, x, dnorm, h[1]), n[1], nx) 462 | my <- matrix(outer(gy, y, dnorm, h[2]), n[2], nx) 463 | mz <- matrix(outer(gz, z, dnorm, h[3]), n[3], nx) 464 | v <- array(0, n) 465 | tmy.nx <- t(my)/nx 466 | for (k in 1:n[3]) { 467 | tmy.nz.zk <- tmy.nx * mz[k, ] 468 | v[, , k] <- mx %*% tmy.nz.zk 469 | } 470 | return(list(x = gx, y = gy, z = gz, d = v)) 471 | } 472 | 473 | # function adapted from R package MASS 474 | # https://cran.r-project.org/web/packages/MASS/index.html 475 | kde2d = function (x, y, h, n = 25, lims = c(range(x), range(y))) 476 | { 477 | nx <- length(x) 478 | if (length(y) != nx) 479 | stop("data vectors must be the same length") 480 | if (any(!is.finite(x)) || any(!is.finite(y))) 481 | stop("missing or infinite values in the data are not allowed") 482 | if (any(!is.finite(lims))) 483 | stop("only finite values are allowed in 'lims'") 
484 | n <- rep(n, length.out = 2L) 485 | gx <- seq.int(lims[1L], lims[2L], length.out = n[1L]) 486 | gy <- seq.int(lims[3L], lims[4L], length.out = n[2L]) 487 | h <- if (missing(h)) 488 | c(MASS::bandwidth.nrd(x), MASS::bandwidth.nrd(y)) 489 | else rep(h, length.out = 2L) 490 | if (any(h <= 0)) 491 | stop("bandwidths must be strictly positive") 492 | h <- h/4 493 | ax <- outer(gx, x, "-")/h[1L] 494 | ay <- outer(gy, y, "-")/h[2L] 495 | z <- tcrossprod(matrix(dnorm(ax), , nx), matrix(dnorm(ay), , nx))/(nx * h[1L] * h[2L]) 496 | list(x = gx, y = gy, z = z) 497 | } 498 | 499 | 500 | # Inform functions ---- 501 | loginfo <- function(..., printnow = T) { 502 | msg = paste0(list(...), collapse = "") 503 | msg <- paste0("[",format(Sys.time()), "] ", msg,"\n") 504 | if(printnow) 505 | cat(msg) 506 | invisible(msg) 507 | } 508 | 509 | # Others ---- 510 | paste2columns = function(x, y, delim = "---") { 511 | stopifnot(length(x) == length(y)) 512 | x = as.character(x) 513 | y = as.character(y) 514 | z = c() 515 | for (i in seq_along(x)) { 516 | z = c(z, paste0(sort(c(x[i], y[i])), collapse = delim)) 517 | } 518 | z 519 | } 520 | 521 | # 3DPlot ---- 522 | #' Plot 3D figure using plotly 523 | #' A wrapper function to plot 3D plots using plotly. Not necessary for CSOmap's core functions. 524 | #' 525 | #' @param plt_tbl data.frame/tibble; Should provide coordinates x,y,z. 526 | #' @param color_by string; name of the column by which the data points are colored 527 | #' @param title string; Title 528 | #' @param alpha numeric; value between 0 and 1 specifying the alpha of the dots 529 | #' @param save_path string; Specify the saving path of the output 3D plot. A `/lib/` directory will also be generated with the output html. Default = NULL. 530 | #' @param ... Other arguments that will be passed to htmlwidgets::saveWidget 531 | #' @return a plotly object 532 | #' @export 533 | plot3D = function(plt_tbl, 534 | color_by = "density", 535 | title = "3D density", 536 | alpha = 0.8, 537 | save_path = NULL, 538 | ...) 
{ 539 | 540 | if (!requireNamespace("plotly", quietly = TRUE) || !requireNamespace("htmlwidgets")) { 541 | stop("Package \"plotly\" and \"htmlwidgets\" are needed for this function to work. Please install it.") 542 | } 543 | fig_density = plotly::plot_ly( 544 | plt_tbl, 545 | x = ~ x, 546 | y = ~ y, 547 | z = ~ z, 548 | alpha = alpha 549 | ) 550 | fig_density = plotly::add_markers(fig_density, color = eval(parse(text = sprintf("~%s", color_by)))) 551 | fig_density = plotly::layout(fig_density, title = title) 552 | 553 | if(!is.null(save_path)){ 554 | htmlwidgets::saveWidget( 555 | fig_density, 556 | file = save_path, 557 | selfcontained = F, 558 | libdir = paste0(dirname(save_path), "/lib/") 559 | ) 560 | } 561 | invisible(fig_density) 562 | } -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | GNU GENERAL PUBLIC LICENSE 2 | Version 3, 29 June 2007 3 | 4 | Copyright (C) 2007 Free Software Foundation, Inc. 5 | Everyone is permitted to copy and distribute verbatim copies 6 | of this license document, but changing it is not allowed. 7 | 8 | Preamble 9 | 10 | The GNU General Public License is a free, copyleft license for 11 | software and other kinds of works. 12 | 13 | The licenses for most software and other practical works are designed 14 | to take away your freedom to share and change the works. By contrast, 15 | the GNU General Public License is intended to guarantee your freedom to 16 | share and change all versions of a program--to make sure it remains free 17 | software for all its users. We, the Free Software Foundation, use the 18 | GNU General Public License for most of our software; it applies also to 19 | any other work released this way by its authors. You can apply it to 20 | your programs, too. 21 | 22 | When we speak of free software, we are referring to freedom, not 23 | price. 
Our General Public Licenses are designed to make sure that you 24 | have the freedom to distribute copies of free software (and charge for 25 | them if you wish), that you receive source code or can get it if you 26 | want it, that you can change the software or use pieces of it in new 27 | free programs, and that you know you can do these things. 28 | 29 | To protect your rights, we need to prevent others from denying you 30 | these rights or asking you to surrender the rights. Therefore, you have 31 | certain responsibilities if you distribute copies of the software, or if 32 | you modify it: responsibilities to respect the freedom of others. 33 | 34 | For example, if you distribute copies of such a program, whether 35 | gratis or for a fee, you must pass on to the recipients the same 36 | freedoms that you received. You must make sure that they, too, receive 37 | or can get the source code. And you must show them these terms so they 38 | know their rights. 39 | 40 | Developers that use the GNU GPL protect your rights with two steps: 41 | (1) assert copyright on the software, and (2) offer you this License 42 | giving you legal permission to copy, distribute and/or modify it. 43 | 44 | For the developers' and authors' protection, the GPL clearly explains 45 | that there is no warranty for this free software. For both users' and 46 | authors' sake, the GPL requires that modified versions be marked as 47 | changed, so that their problems will not be attributed erroneously to 48 | authors of previous versions. 49 | 50 | Some devices are designed to deny users access to install or run 51 | modified versions of the software inside them, although the manufacturer 52 | can do so. This is fundamentally incompatible with the aim of 53 | protecting users' freedom to change the software. The systematic 54 | pattern of such abuse occurs in the area of products for individuals to 55 | use, which is precisely where it is most unacceptable. 
Therefore, we 56 | have designed this version of the GPL to prohibit the practice for those 57 | products. If such problems arise substantially in other domains, we 58 | stand ready to extend this provision to those domains in future versions 59 | of the GPL, as needed to protect the freedom of users. 60 | 61 | Finally, every program is threatened constantly by software patents. 62 | States should not allow patents to restrict development and use of 63 | software on general-purpose computers, but in those that do, we wish to 64 | avoid the special danger that patents applied to a free program could 65 | make it effectively proprietary. To prevent this, the GPL assures that 66 | patents cannot be used to render the program non-free. 67 | 68 | The precise terms and conditions for copying, distribution and 69 | modification follow. 70 | 71 | TERMS AND CONDITIONS 72 | 73 | 0. Definitions. 74 | 75 | "This License" refers to version 3 of the GNU General Public License. 76 | 77 | "Copyright" also means copyright-like laws that apply to other kinds of 78 | works, such as semiconductor masks. 79 | 80 | "The Program" refers to any copyrightable work licensed under this 81 | License. Each licensee is addressed as "you". "Licensees" and 82 | "recipients" may be individuals or organizations. 83 | 84 | To "modify" a work means to copy from or adapt all or part of the work 85 | in a fashion requiring copyright permission, other than the making of an 86 | exact copy. The resulting work is called a "modified version" of the 87 | earlier work or a work "based on" the earlier work. 88 | 89 | A "covered work" means either the unmodified Program or a work based 90 | on the Program. 91 | 92 | To "propagate" a work means to do anything with it that, without 93 | permission, would make you directly or secondarily liable for 94 | infringement under applicable copyright law, except executing it on a 95 | computer or modifying a private copy. 
Propagation includes copying, 96 | distribution (with or without modification), making available to the 97 | public, and in some countries other activities as well. 98 | 99 | To "convey" a work means any kind of propagation that enables other 100 | parties to make or receive copies. Mere interaction with a user through 101 | a computer network, with no transfer of a copy, is not conveying. 102 | 103 | An interactive user interface displays "Appropriate Legal Notices" 104 | to the extent that it includes a convenient and prominently visible 105 | feature that (1) displays an appropriate copyright notice, and (2) 106 | tells the user that there is no warranty for the work (except to the 107 | extent that warranties are provided), that licensees may convey the 108 | work under this License, and how to view a copy of this License. If 109 | the interface presents a list of user commands or options, such as a 110 | menu, a prominent item in the list meets this criterion. 111 | 112 | 1. Source Code. 113 | 114 | The "source code" for a work means the preferred form of the work 115 | for making modifications to it. "Object code" means any non-source 116 | form of a work. 117 | 118 | A "Standard Interface" means an interface that either is an official 119 | standard defined by a recognized standards body, or, in the case of 120 | interfaces specified for a particular programming language, one that 121 | is widely used among developers working in that language. 122 | 123 | The "System Libraries" of an executable work include anything, other 124 | than the work as a whole, that (a) is included in the normal form of 125 | packaging a Major Component, but which is not part of that Major 126 | Component, and (b) serves only to enable use of the work with that 127 | Major Component, or to implement a Standard Interface for which an 128 | implementation is available to the public in source code form. 
A 129 | "Major Component", in this context, means a major essential component 130 | (kernel, window system, and so on) of the specific operating system 131 | (if any) on which the executable work runs, or a compiler used to 132 | produce the work, or an object code interpreter used to run it. 133 | 134 | The "Corresponding Source" for a work in object code form means all 135 | the source code needed to generate, install, and (for an executable 136 | work) run the object code and to modify the work, including scripts to 137 | control those activities. However, it does not include the work's 138 | System Libraries, or general-purpose tools or generally available free 139 | programs which are used unmodified in performing those activities but 140 | which are not part of the work. For example, Corresponding Source 141 | includes interface definition files associated with source files for 142 | the work, and the source code for shared libraries and dynamically 143 | linked subprograms that the work is specifically designed to require, 144 | such as by intimate data communication or control flow between those 145 | subprograms and other parts of the work. 146 | 147 | The Corresponding Source need not include anything that users 148 | can regenerate automatically from other parts of the Corresponding 149 | Source. 150 | 151 | The Corresponding Source for a work in source code form is that 152 | same work. 153 | 154 | 2. Basic Permissions. 155 | 156 | All rights granted under this License are granted for the term of 157 | copyright on the Program, and are irrevocable provided the stated 158 | conditions are met. This License explicitly affirms your unlimited 159 | permission to run the unmodified Program. The output from running a 160 | covered work is covered by this License only if the output, given its 161 | content, constitutes a covered work. This License acknowledges your 162 | rights of fair use or other equivalent, as provided by copyright law. 
163 | 164 | You may make, run and propagate covered works that you do not 165 | convey, without conditions so long as your license otherwise remains 166 | in force. You may convey covered works to others for the sole purpose 167 | of having them make modifications exclusively for you, or provide you 168 | with facilities for running those works, provided that you comply with 169 | the terms of this License in conveying all material for which you do 170 | not control copyright. Those thus making or running the covered works 171 | for you must do so exclusively on your behalf, under your direction 172 | and control, on terms that prohibit them from making any copies of 173 | your copyrighted material outside their relationship with you. 174 | 175 | Conveying under any other circumstances is permitted solely under 176 | the conditions stated below. Sublicensing is not allowed; section 10 177 | makes it unnecessary. 178 | 179 | 3. Protecting Users' Legal Rights From Anti-Circumvention Law. 180 | 181 | No covered work shall be deemed part of an effective technological 182 | measure under any applicable law fulfilling obligations under article 183 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or 184 | similar laws prohibiting or restricting circumvention of such 185 | measures. 186 | 187 | When you convey a covered work, you waive any legal power to forbid 188 | circumvention of technological measures to the extent such circumvention 189 | is effected by exercising rights under this License with respect to 190 | the covered work, and you disclaim any intention to limit operation or 191 | modification of the work as a means of enforcing, against the work's 192 | users, your or third parties' legal rights to forbid circumvention of 193 | technological measures. 194 | 195 | 4. Conveying Verbatim Copies. 
196 | 197 | You may convey verbatim copies of the Program's source code as you 198 | receive it, in any medium, provided that you conspicuously and 199 | appropriately publish on each copy an appropriate copyright notice; 200 | keep intact all notices stating that this License and any 201 | non-permissive terms added in accord with section 7 apply to the code; 202 | keep intact all notices of the absence of any warranty; and give all 203 | recipients a copy of this License along with the Program. 204 | 205 | You may charge any price or no price for each copy that you convey, 206 | and you may offer support or warranty protection for a fee. 207 | 208 | 5. Conveying Modified Source Versions. 209 | 210 | You may convey a work based on the Program, or the modifications to 211 | produce it from the Program, in the form of source code under the 212 | terms of section 4, provided that you also meet all of these conditions: 213 | 214 | a) The work must carry prominent notices stating that you modified 215 | it, and giving a relevant date. 216 | 217 | b) The work must carry prominent notices stating that it is 218 | released under this License and any conditions added under section 219 | 7. This requirement modifies the requirement in section 4 to 220 | "keep intact all notices". 221 | 222 | c) You must license the entire work, as a whole, under this 223 | License to anyone who comes into possession of a copy. This 224 | License will therefore apply, along with any applicable section 7 225 | additional terms, to the whole of the work, and all its parts, 226 | regardless of how they are packaged. This License gives no 227 | permission to license the work in any other way, but it does not 228 | invalidate such permission if you have separately received it. 
229 | 230 | d) If the work has interactive user interfaces, each must display 231 | Appropriate Legal Notices; however, if the Program has interactive 232 | interfaces that do not display Appropriate Legal Notices, your 233 | work need not make them do so. 234 | 235 | A compilation of a covered work with other separate and independent 236 | works, which are not by their nature extensions of the covered work, 237 | and which are not combined with it such as to form a larger program, 238 | in or on a volume of a storage or distribution medium, is called an 239 | "aggregate" if the compilation and its resulting copyright are not 240 | used to limit the access or legal rights of the compilation's users 241 | beyond what the individual works permit. Inclusion of a covered work 242 | in an aggregate does not cause this License to apply to the other 243 | parts of the aggregate. 244 | 245 | 6. Conveying Non-Source Forms. 246 | 247 | You may convey a covered work in object code form under the terms 248 | of sections 4 and 5, provided that you also convey the 249 | machine-readable Corresponding Source under the terms of this License, 250 | in one of these ways: 251 | 252 | a) Convey the object code in, or embodied in, a physical product 253 | (including a physical distribution medium), accompanied by the 254 | Corresponding Source fixed on a durable physical medium 255 | customarily used for software interchange. 
256 | 257 | b) Convey the object code in, or embodied in, a physical product 258 | (including a physical distribution medium), accompanied by a 259 | written offer, valid for at least three years and valid for as 260 | long as you offer spare parts or customer support for that product 261 | model, to give anyone who possesses the object code either (1) a 262 | copy of the Corresponding Source for all the software in the 263 | product that is covered by this License, on a durable physical 264 | medium customarily used for software interchange, for a price no 265 | more than your reasonable cost of physically performing this 266 | conveying of source, or (2) access to copy the 267 | Corresponding Source from a network server at no charge. 268 | 269 | c) Convey individual copies of the object code with a copy of the 270 | written offer to provide the Corresponding Source. This 271 | alternative is allowed only occasionally and noncommercially, and 272 | only if you received the object code with such an offer, in accord 273 | with subsection 6b. 274 | 275 | d) Convey the object code by offering access from a designated 276 | place (gratis or for a charge), and offer equivalent access to the 277 | Corresponding Source in the same way through the same place at no 278 | further charge. You need not require recipients to copy the 279 | Corresponding Source along with the object code. If the place to 280 | copy the object code is a network server, the Corresponding Source 281 | may be on a different server (operated by you or a third party) 282 | that supports equivalent copying facilities, provided you maintain 283 | clear directions next to the object code saying where to find the 284 | Corresponding Source. Regardless of what server hosts the 285 | Corresponding Source, you remain obligated to ensure that it is 286 | available for as long as needed to satisfy these requirements. 
287 | 288 | e) Convey the object code using peer-to-peer transmission, provided 289 | you inform other peers where the object code and Corresponding 290 | Source of the work are being offered to the general public at no 291 | charge under subsection 6d. 292 | 293 | A separable portion of the object code, whose source code is excluded 294 | from the Corresponding Source as a System Library, need not be 295 | included in conveying the object code work. 296 | 297 | A "User Product" is either (1) a "consumer product", which means any 298 | tangible personal property which is normally used for personal, family, 299 | or household purposes, or (2) anything designed or sold for incorporation 300 | into a dwelling. In determining whether a product is a consumer product, 301 | doubtful cases shall be resolved in favor of coverage. For a particular 302 | product received by a particular user, "normally used" refers to a 303 | typical or common use of that class of product, regardless of the status 304 | of the particular user or of the way in which the particular user 305 | actually uses, or expects or is expected to use, the product. A product 306 | is a consumer product regardless of whether the product has substantial 307 | commercial, industrial or non-consumer uses, unless such uses represent 308 | the only significant mode of use of the product. 309 | 310 | "Installation Information" for a User Product means any methods, 311 | procedures, authorization keys, or other information required to install 312 | and execute modified versions of a covered work in that User Product from 313 | a modified version of its Corresponding Source. The information must 314 | suffice to ensure that the continued functioning of the modified object 315 | code is in no case prevented or interfered with solely because 316 | modification has been made. 
317 | 318 | If you convey an object code work under this section in, or with, or 319 | specifically for use in, a User Product, and the conveying occurs as 320 | part of a transaction in which the right of possession and use of the 321 | User Product is transferred to the recipient in perpetuity or for a 322 | fixed term (regardless of how the transaction is characterized), the 323 | Corresponding Source conveyed under this section must be accompanied 324 | by the Installation Information. But this requirement does not apply 325 | if neither you nor any third party retains the ability to install 326 | modified object code on the User Product (for example, the work has 327 | been installed in ROM). 328 | 329 | The requirement to provide Installation Information does not include a 330 | requirement to continue to provide support service, warranty, or updates 331 | for a work that has been modified or installed by the recipient, or for 332 | the User Product in which it has been modified or installed. Access to a 333 | network may be denied when the modification itself materially and 334 | adversely affects the operation of the network or violates the rules and 335 | protocols for communication across the network. 336 | 337 | Corresponding Source conveyed, and Installation Information provided, 338 | in accord with this section must be in a format that is publicly 339 | documented (and with an implementation available to the public in 340 | source code form), and must require no special password or key for 341 | unpacking, reading or copying. 342 | 343 | 7. Additional Terms. 344 | 345 | "Additional permissions" are terms that supplement the terms of this 346 | License by making exceptions from one or more of its conditions. 347 | Additional permissions that are applicable to the entire Program shall 348 | be treated as though they were included in this License, to the extent 349 | that they are valid under applicable law. 
If additional permissions 350 | apply only to part of the Program, that part may be used separately 351 | under those permissions, but the entire Program remains governed by 352 | this License without regard to the additional permissions. 353 | 354 | When you convey a copy of a covered work, you may at your option 355 | remove any additional permissions from that copy, or from any part of 356 | it. (Additional permissions may be written to require their own 357 | removal in certain cases when you modify the work.) You may place 358 | additional permissions on material, added by you to a covered work, 359 | for which you have or can give appropriate copyright permission. 360 | 361 | Notwithstanding any other provision of this License, for material you 362 | add to a covered work, you may (if authorized by the copyright holders of 363 | that material) supplement the terms of this License with terms: 364 | 365 | a) Disclaiming warranty or limiting liability differently from the 366 | terms of sections 15 and 16 of this License; or 367 | 368 | b) Requiring preservation of specified reasonable legal notices or 369 | author attributions in that material or in the Appropriate Legal 370 | Notices displayed by works containing it; or 371 | 372 | c) Prohibiting misrepresentation of the origin of that material, or 373 | requiring that modified versions of such material be marked in 374 | reasonable ways as different from the original version; or 375 | 376 | d) Limiting the use for publicity purposes of names of licensors or 377 | authors of the material; or 378 | 379 | e) Declining to grant rights under trademark law for use of some 380 | trade names, trademarks, or service marks; or 381 | 382 | f) Requiring indemnification of licensors and authors of that 383 | material by anyone who conveys the material (or modified versions of 384 | it) with contractual assumptions of liability to the recipient, for 385 | any liability that these contractual assumptions directly impose on 
386 | those licensors and authors. 387 | 388 | All other non-permissive additional terms are considered "further 389 | restrictions" within the meaning of section 10. If the Program as you 390 | received it, or any part of it, contains a notice stating that it is 391 | governed by this License along with a term that is a further 392 | restriction, you may remove that term. If a license document contains 393 | a further restriction but permits relicensing or conveying under this 394 | License, you may add to a covered work material governed by the terms 395 | of that license document, provided that the further restriction does 396 | not survive such relicensing or conveying. 397 | 398 | If you add terms to a covered work in accord with this section, you 399 | must place, in the relevant source files, a statement of the 400 | additional terms that apply to those files, or a notice indicating 401 | where to find the applicable terms. 402 | 403 | Additional terms, permissive or non-permissive, may be stated in the 404 | form of a separately written license, or stated as exceptions; 405 | the above requirements apply either way. 406 | 407 | 8. Termination. 408 | 409 | You may not propagate or modify a covered work except as expressly 410 | provided under this License. Any attempt otherwise to propagate or 411 | modify it is void, and will automatically terminate your rights under 412 | this License (including any patent licenses granted under the third 413 | paragraph of section 11). 414 | 415 | However, if you cease all violation of this License, then your 416 | license from a particular copyright holder is reinstated (a) 417 | provisionally, unless and until the copyright holder explicitly and 418 | finally terminates your license, and (b) permanently, if the copyright 419 | holder fails to notify you of the violation by some reasonable means 420 | prior to 60 days after the cessation. 
421 | 422 | Moreover, your license from a particular copyright holder is 423 | reinstated permanently if the copyright holder notifies you of the 424 | violation by some reasonable means, this is the first time you have 425 | received notice of violation of this License (for any work) from that 426 | copyright holder, and you cure the violation prior to 30 days after 427 | your receipt of the notice. 428 | 429 | Termination of your rights under this section does not terminate the 430 | licenses of parties who have received copies or rights from you under 431 | this License. If your rights have been terminated and not permanently 432 | reinstated, you do not qualify to receive new licenses for the same 433 | material under section 10. 434 | 435 | 9. Acceptance Not Required for Having Copies. 436 | 437 | You are not required to accept this License in order to receive or 438 | run a copy of the Program. Ancillary propagation of a covered work 439 | occurring solely as a consequence of using peer-to-peer transmission 440 | to receive a copy likewise does not require acceptance. However, 441 | nothing other than this License grants you permission to propagate or 442 | modify any covered work. These actions infringe copyright if you do 443 | not accept this License. Therefore, by modifying or propagating a 444 | covered work, you indicate your acceptance of this License to do so. 445 | 446 | 10. Automatic Licensing of Downstream Recipients. 447 | 448 | Each time you convey a covered work, the recipient automatically 449 | receives a license from the original licensors, to run, modify and 450 | propagate that work, subject to this License. You are not responsible 451 | for enforcing compliance by third parties with this License. 452 | 453 | An "entity transaction" is a transaction transferring control of an 454 | organization, or substantially all assets of one, or subdividing an 455 | organization, or merging organizations. 
If propagation of a covered 456 | work results from an entity transaction, each party to that 457 | transaction who receives a copy of the work also receives whatever 458 | licenses to the work the party's predecessor in interest had or could 459 | give under the previous paragraph, plus a right to possession of the 460 | Corresponding Source of the work from the predecessor in interest, if 461 | the predecessor has it or can get it with reasonable efforts. 462 | 463 | You may not impose any further restrictions on the exercise of the 464 | rights granted or affirmed under this License. For example, you may 465 | not impose a license fee, royalty, or other charge for exercise of 466 | rights granted under this License, and you may not initiate litigation 467 | (including a cross-claim or counterclaim in a lawsuit) alleging that 468 | any patent claim is infringed by making, using, selling, offering for 469 | sale, or importing the Program or any portion of it. 470 | 471 | 11. Patents. 472 | 473 | A "contributor" is a copyright holder who authorizes use under this 474 | License of the Program or a work on which the Program is based. The 475 | work thus licensed is called the contributor's "contributor version". 476 | 477 | A contributor's "essential patent claims" are all patent claims 478 | owned or controlled by the contributor, whether already acquired or 479 | hereafter acquired, that would be infringed by some manner, permitted 480 | by this License, of making, using, or selling its contributor version, 481 | but do not include claims that would be infringed only as a 482 | consequence of further modification of the contributor version. For 483 | purposes of this definition, "control" includes the right to grant 484 | patent sublicenses in a manner consistent with the requirements of 485 | this License. 
486 | 487 | Each contributor grants you a non-exclusive, worldwide, royalty-free 488 | patent license under the contributor's essential patent claims, to 489 | make, use, sell, offer for sale, import and otherwise run, modify and 490 | propagate the contents of its contributor version. 491 | 492 | In the following three paragraphs, a "patent license" is any express 493 | agreement or commitment, however denominated, not to enforce a patent 494 | (such as an express permission to practice a patent or covenant not to 495 | sue for patent infringement). To "grant" such a patent license to a 496 | party means to make such an agreement or commitment not to enforce a 497 | patent against the party. 498 | 499 | If you convey a covered work, knowingly relying on a patent license, 500 | and the Corresponding Source of the work is not available for anyone 501 | to copy, free of charge and under the terms of this License, through a 502 | publicly available network server or other readily accessible means, 503 | then you must either (1) cause the Corresponding Source to be so 504 | available, or (2) arrange to deprive yourself of the benefit of the 505 | patent license for this particular work, or (3) arrange, in a manner 506 | consistent with the requirements of this License, to extend the patent 507 | license to downstream recipients. "Knowingly relying" means you have 508 | actual knowledge that, but for the patent license, your conveying the 509 | covered work in a country, or your recipient's use of the covered work 510 | in a country, would infringe one or more identifiable patents in that 511 | country that you have reason to believe are valid. 
512 | 513 | If, pursuant to or in connection with a single transaction or 514 | arrangement, you convey, or propagate by procuring conveyance of, a 515 | covered work, and grant a patent license to some of the parties 516 | receiving the covered work authorizing them to use, propagate, modify 517 | or convey a specific copy of the covered work, then the patent license 518 | you grant is automatically extended to all recipients of the covered 519 | work and works based on it. 520 | 521 | A patent license is "discriminatory" if it does not include within 522 | the scope of its coverage, prohibits the exercise of, or is 523 | conditioned on the non-exercise of one or more of the rights that are 524 | specifically granted under this License. You may not convey a covered 525 | work if you are a party to an arrangement with a third party that is 526 | in the business of distributing software, under which you make payment 527 | to the third party based on the extent of your activity of conveying 528 | the work, and under which the third party grants, to any of the 529 | parties who would receive the covered work from you, a discriminatory 530 | patent license (a) in connection with copies of the covered work 531 | conveyed by you (or copies made from those copies), or (b) primarily 532 | for and in connection with specific products or compilations that 533 | contain the covered work, unless you entered into that arrangement, 534 | or that patent license was granted, prior to 28 March 2007. 535 | 536 | Nothing in this License shall be construed as excluding or limiting 537 | any implied license or other defenses to infringement that may 538 | otherwise be available to you under applicable patent law. 539 | 540 | 12. No Surrender of Others' Freedom. 541 | 542 | If conditions are imposed on you (whether by court order, agreement or 543 | otherwise) that contradict the conditions of this License, they do not 544 | excuse you from the conditions of this License. 
If you cannot convey a 545 | covered work so as to satisfy simultaneously your obligations under this 546 | License and any other pertinent obligations, then as a consequence you may 547 | not convey it at all. For example, if you agree to terms that obligate you 548 | to collect a royalty for further conveying from those to whom you convey 549 | the Program, the only way you could satisfy both those terms and this 550 | License would be to refrain entirely from conveying the Program. 551 | 552 | 13. Use with the GNU Affero General Public License. 553 | 554 | Notwithstanding any other provision of this License, you have 555 | permission to link or combine any covered work with a work licensed 556 | under version 3 of the GNU Affero General Public License into a single 557 | combined work, and to convey the resulting work. The terms of this 558 | License will continue to apply to the part which is the covered work, 559 | but the special requirements of the GNU Affero General Public License, 560 | section 13, concerning interaction through a network will apply to the 561 | combination as such. 562 | 563 | 14. Revised Versions of this License. 564 | 565 | The Free Software Foundation may publish revised and/or new versions of 566 | the GNU General Public License from time to time. Such new versions will 567 | be similar in spirit to the present version, but may differ in detail to 568 | address new problems or concerns. 569 | 570 | Each version is given a distinguishing version number. If the 571 | Program specifies that a certain numbered version of the GNU General 572 | Public License "or any later version" applies to it, you have the 573 | option of following the terms and conditions either of that numbered 574 | version or of any later version published by the Free Software 575 | Foundation. If the Program does not specify a version number of the 576 | GNU General Public License, you may choose any version ever published 577 | by the Free Software Foundation. 
578 | 579 | If the Program specifies that a proxy can decide which future 580 | versions of the GNU General Public License can be used, that proxy's 581 | public statement of acceptance of a version permanently authorizes you 582 | to choose that version for the Program. 583 | 584 | Later license versions may give you additional or different 585 | permissions. However, no additional obligations are imposed on any 586 | author or copyright holder as a result of your choosing to follow a 587 | later version. 588 | 589 | 15. Disclaimer of Warranty. 590 | 591 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY 592 | APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT 593 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY 594 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, 595 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 596 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM 597 | IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF 598 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 599 | 600 | 16. Limitation of Liability. 601 | 602 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 603 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS 604 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY 605 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE 606 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF 607 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD 608 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), 609 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF 610 | SUCH DAMAGES. 611 | 612 | 17. Interpretation of Sections 15 and 16. 
613 | 614 | If the disclaimer of warranty and limitation of liability provided 615 | above cannot be given local legal effect according to their terms, 616 | reviewing courts shall apply local law that most closely approximates 617 | an absolute waiver of all civil liability in connection with the 618 | Program, unless a warranty or assumption of liability accompanies a 619 | copy of the Program in return for a fee. 620 | 621 | END OF TERMS AND CONDITIONS 622 | 623 | How to Apply These Terms to Your New Programs 624 | 625 | If you develop a new program, and you want it to be of the greatest 626 | possible use to the public, the best way to achieve this is to make it 627 | free software which everyone can redistribute and change under these terms. 628 | 629 | To do so, attach the following notices to the program. It is safest 630 | to attach them to the start of each source file to most effectively 631 | state the exclusion of warranty; and each file should have at least 632 | the "copyright" line and a pointer to where the full notice is found. 633 | 634 | 635 | Copyright (C) 636 | 637 | This program is free software: you can redistribute it and/or modify 638 | it under the terms of the GNU General Public License as published by 639 | the Free Software Foundation, either version 3 of the License, or 640 | (at your option) any later version. 641 | 642 | This program is distributed in the hope that it will be useful, 643 | but WITHOUT ANY WARRANTY; without even the implied warranty of 644 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 645 | GNU General Public License for more details. 646 | 647 | You should have received a copy of the GNU General Public License 648 | along with this program. If not, see . 649 | 650 | Also add information on how to contact you by electronic and paper mail. 
651 | 652 | If the program does terminal interaction, make it output a short 653 | notice like this when it starts in an interactive mode: 654 | 655 | Copyright (C) 656 | This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. 657 | This is free software, and you are welcome to redistribute it 658 | under certain conditions; type `show c' for details. 659 | 660 | The hypothetical commands `show w' and `show c' should show the appropriate 661 | parts of the General Public License. Of course, your program's commands 662 | might be different; for a GUI interface, you would use an "about box". 663 | 664 | You should also get your employer (if you work as a programmer) or school, 665 | if any, to sign a "copyright disclaimer" for the program, if necessary. 666 | For more information on this, and how to apply and follow the GNU GPL, see 667 | . 668 | 669 | The GNU General Public License does not permit incorporating your program 670 | into proprietary programs. If your program is a subroutine library, you 671 | may consider it more useful to permit linking proprietary applications with 672 | the library. If this is what you want to do, use the GNU Lesser General 673 | Public License instead of this License. But first, please read 674 | . 675 | --------------------------------------------------------------------------------