├── .gitignore ├── Dockerfile ├── README.md ├── config ├── sing_emb.yaml └── sing_emb_mgraph.yaml └── scripts ├── additional_network_functions.r ├── download_data.r ├── foxc12 ├── chimeras_compare_frequency.r ├── control_chim_timing.r ├── definition_cell_types_endoderm_ectoderm_embryonic_mesoderm.r ├── differential_expression_analysis.r ├── foxc_chim_timing.r ├── generate_chimera_tetraploid_data_analysis.r ├── preprocessing │ ├── foxc_control_gating.r │ ├── foxc_control_remove_exe_ectoderm_and_parietal_endo_cls.r │ ├── foxc_control_split_into_experiment_type.r │ ├── merge_umi_mat_with_wt10_umi_mat.r │ └── summary_preprocessing.r ├── transfer_cell_type_annotation.r └── transfer_time_annotation.r ├── generate_mc_mgraph_network ├── annot_mc_by_flows.r ├── gen_mc.r ├── gen_mc2d_umap.r ├── gen_mgraph.r ├── gen_network.r └── generic_mc.r ├── generate_paper_figures ├── fig_1.r ├── fig_2.r ├── fig_3.r ├── fig_4.r ├── fig_5.r ├── fig_6.r ├── fig_7.r ├── fig_s1.r ├── fig_s2.r ├── fig_s3.r ├── fig_s4.r ├── fig_s5.r ├── fig_s6.r ├── fig_s7.r ├── generate_all_figures.r ├── plot_3d_vein.r ├── plot_network.r └── plot_vein.r ├── initialize_scripts.r ├── parameter_stability_analysis.r ├── single_embryo_timing.r └── test_cmp_network_flow_model.r /.gitignore: -------------------------------------------------------------------------------- 1 | # History files 2 | .Rhistory 3 | .Rapp.history 4 | 5 | # Session Data files 6 | .RData 7 | 8 | # User-specific files 9 | .Ruserdata 10 | 11 | # Example code in package build process 12 | *-Ex.R 13 | 14 | # Output files from R CMD build 15 | /*.tar.gz 16 | 17 | # Output files from R CMD check 18 | /*.Rcheck/ 19 | 20 | # RStudio files 21 | .Rproj.user/ 22 | 23 | # produced vignettes 24 | vignettes/*.html 25 | vignettes/*.pdf 26 | 27 | # OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3 28 | .httr-oauth 29 | 30 | # knitr and R markdown default cache directories 31 | *_cache/ 32 | /cache/ 33 | 34 | # Temporary files created by R markdown 35 | *.utf8.md 36 | *.knit.md 37 | 38 | # R Environment Variables 39 | .Renviron 40 | -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- 1 | FROM bioconductor/bioconductor_docker:RELEASE_3_12 2 | 3 | # Install rpm dependencies 4 | RUN apt-get update && apt-get install -y git-core libcurl4-openssl-dev libgit2-dev libicu-dev libssl-dev libxml2-dev make pandoc pandoc-citeproc zlib1g-dev libgtk2.0-dev libcairo2-dev libxt-dev xvfb xauth xfonts-base vim && rm -rf /var/lib/apt/lists/* 5 | 6 | 7 | RUN R -e 'remotes::install_github("tanaylab/tgconfig")' 8 | RUN R -e 'remotes::install_github("tanaylab/tglkmeans")' 9 | RUN R -e 'install.packages("tidyverse")' 10 | RUN R -e 'install.packages("pheatmap")' 11 | RUN R -e 'install.packages("gridExtra")' 12 | RUN R -e 'install.packages("umap")' 13 | RUN R -e 'install.packages("Matrix")' 14 | RUN R -e 'install.packages("shape")' 15 | RUN R -e 'install.packages("qlcMatrix")' 16 | RUN R -e 'install.packages("ggrepel")' 17 | RUN R -e 'BiocManager::install("tanaylab/metacell")' 18 | RUN R -e 'BiocManager::install("lpsymphony")' 19 | 20 | RUN git clone https://github.com/tanaylab/embflow.git 21 | 22 | WORKDIR /embflow 23 | 24 | RUN R -e 'source("scripts/download_data.r")' 25 | 26 | # Run R 27 | CMD ["R"] 28 | 29 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [![DOI](https://zenodo.org/badge/346700550.svg)](https://zenodo.org/badge/latestdoi/346700550) 2 | 3 | A single embryo, single cell time-resolved model for mouse gastrulation 4 | ======================================================================= 5 | 6 | This repository contains all the code for reproducing the analysis from the gastrulation flow paper Mittnenzweig et al. (2021). The analysis is done with the [metacell](https://github.com/tanaylab/metacell) R package, that also contains the code for generating the network flow model. 7 | 8 | ### Quick links 9 | 10 | - Metacell paper: Baran et al. 2019 [Genome Biology](https://doi.org/10.1186/s13059-019-1812-2) 11 | 12 | - [Metacell R package](https://github.com/tanaylab/metacell) 13 | 14 | - Raw FASTQ files and processed UMI tables are available under GEO accession [GSE169210](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE169210) 15 | 16 | ### Requirements (R packages) 17 | 18 | - metacell 19 | - lpsymphony 20 | - pheatmap 21 | - gridExtra 22 | - Matrix 23 | - tidyverse 24 | - shape 25 | - umap 26 | - qlcMatrix 27 | - ggrepel 28 | 29 | ### Usage 30 | 31 | After cloning the github repository, users should open an R session in the repository root directory and download/initialize the scRNA database (~ 4.7 GB): 32 | ``` r 33 | # Loading code and downloading required data files 34 | source("scripts/download_data.r") 35 | 36 | ``` 37 | The repository root directory should now contain the subfolders *scripts/* containing all the R scripts, *scrna_db* containing the metacell R objects, *config/*, *data/* containing additional data generated by the scripts and *figs/paper_figs/*. 38 | 39 | ### Generating all figures 40 | All figures of the paper can be regenerated by running: 41 | ``` r 42 | # load all scripts 43 | source("scripts/initialize_scripts.r") 44 | generate_all_figures() 45 | 46 | ``` 47 | Please note that this will take some time. If you are interested in regenerating a specific figure, see the paragraph below. 48 | 49 | 50 | #### Regenerating plots for a specific figure 51 | For each figure (Figures 1-7 and S1-7), there is a corresponding script in *scripts/generate_paper_figures/*. Each script contains a function *gen_fig_xyz_plots()* at the top, that contains further subfunctions and explanations related to the analysis of that figure. E.g., for regenerating the plots of figure 1, users should run the following code: 52 | 53 | ``` r 54 | # load metacell package 55 | library("metacell") 56 | # initializing the metacell scrna database 57 | scdb_init("scrna_db") 58 | 59 | # Generating plots of Figure 1 60 | source("scripts/generate_paper_figures/fig_1.r") 61 | gen_fig_1_plots() 62 | 63 | ``` 64 | The content of *gen_fig_1_plots()* looks as follwos: 65 | ``` r 66 | gen_fig_1_plots = function() { 67 | 68 | if(!dir.exists("figs/paper_figs")) { 69 | dir.create("figs/paper_figs") 70 | } 71 | dir_name = "figs/paper_figs/fig1" 72 | 73 | if(!dir.exists(dir_name)) { 74 | dir.create(dir_name) 75 | } 76 | 77 | fig1_b() 78 | fig1_cde() 79 | fig1_f() 80 | fig1_g_mc_time_distributions() 81 | fig1_g_heatmap(plot_pdf = T) 82 | fig1_h() 83 | 84 | } 85 | ``` 86 | Figure plots are saved in *figs/paper_figs/fig1/* 87 | 88 | 89 | #### Computing metacell object, manifold graph and network flow model for wildtype manifold 90 | Standard metacell analysis is performed as described in [Baran et al. 2019](https://doi.org/10.1186/s13059-019-1812-2). To recompute the metacell object, please run 91 | ``` r 92 | source("scripts/generate_mc_mgraph_network/gen_mc.r") 93 | ``` 94 | This will generate a metacell object with id *sing_emb_wt10_bs500f*. Note, that because of random seeding of the boostrap procedure involved in calculating the metacell cover, 95 | the computed metacell cover will slightly deviate from *sing_emb_wt10_recolored* used in the paper. Manifold graphs and 2D projections can be recomputed through 96 | ``` r 97 | source("scripts/generate_mc_mgraph_network/gen_mgraph.r") 98 | source("scripts/generate_mc_mgraph_network/gen_mgraph_umap.r") 99 | generate_mgraph_wt10() 100 | gen_mc2d_umap_wt10() 101 | ``` 102 | The network flow model can be generated using 103 | ``` r 104 | source("scripts/generate_mc_mgraph_network/gen_network.r") 105 | build_sing_emb_wt10_network() 106 | ``` 107 | Metacells were clustered and annotated using the network flow model. 108 | ``` r 109 | source("scripts/generate_mc_mgraph_network/annot_mc_by_flows.r") 110 | cluster_metacells_by_flow(mct_id = "sing_emb_wt10",K = 65) 111 | ``` 112 | 113 | #### Single-embryo timing 114 | To regenerate the single-embryo timing data underlying Figure 1, please run 115 | ``` r 116 | # load metacell package 117 | library("metacell") 118 | # initializing the metacell scrna database 119 | scdb_init("scrna_db") 120 | 121 | source("scripts/single_embryo_timing.r") 122 | gen_fig_1_plots() 123 | embryo_ranks = gen_single_embryo_timing() 124 | 125 | # subfunctions calculating intrinsic_rank and reference_rank of each embryo 126 | # are contained in gen_single_embryo_timing() 127 | ``` 128 | The output data frame *embryo_ranks* was added to the single-cell metadata information of the metacell matrix object. All subsequent functions using single-embryo time information, are extracting it from the *cell_metadata* entry of the WT metacell single-cell matrix object *sing_emb_wt10*. 129 | ``` r 130 | # load metacell package 131 | library("metacell") 132 | # initializing the metacell scrna database 133 | scdb_init("scrna_db") 134 | 135 | mat = scdb_mat("sing_emb_wt10") 136 | md = mat@cell_metadata 137 | ``` 138 | 139 | #### Parameter stability analysis of network flow model 140 | The parameter stability analysis of network flows underlying Figure S2A can be regenerated using 141 | ``` r 142 | # regnerate data - this might take some time 143 | source("scripts/parameter_stability_analysis.r") 144 | gen_parameter_stability_analysis() 145 | 146 | # replotting Figure S2A 147 | source("scripts/generate_paper_figures/fig_s2.r") 148 | fig_s2a() 149 | ``` 150 | 151 | #### Foxc12 chimera and tetraploid analysis 152 | To generate specific plots of Figures 6, S6 and S7, please run the corresponding functions from *fig_6.r*, *fig_s6.r* or *fig_s7.r*. Users interested in recomputing parts of the Foxc12 chimera and tetraploid embryo analysis (not needed for regenerating the plots), should run the following functions: 153 | ``` r 154 | library("metacell") 155 | scdb_init("scrna_db/") 156 | 157 | source("scripts/foxc12/generate_chimera_tetraploid_data_analysis.r") 158 | 159 | # Chimera embryos injected with Foxc12 DKO cells 160 | foxc_chimera_generate_time_and_cell_type_annotation() 161 | 162 | # Chimera embryos injected with control cells 163 | control_chimera_generate_time_and_cell_type_annotation() 164 | 165 | # Tetraploid embryos injected with Foxc12 DKO cells 166 | foxc_tetraploid_generate_time_and_cell_type_annotation() 167 | 168 | # Tetraploid embryos injected with control cells 169 | control_tetraploid_generate_time_and_cell_type_annotation() 170 | ``` 171 | This will transfer cell-type and time annotation from the wt atlas to chimera/tetraploid embryos. Output is saved in *data/chimera_tetraploid_analysis/*. Scripts involved in preprocessing plates from the chimera and tetraploid embryo analyis are saved in the *scripts/foxc12/preprocessing/*. This includes 172 | - Gating of single cells using the FACS GFP channel 173 | - Removing cells from extraembryonic ectoderm and parietal endoderm 174 | - Merging each single-cell matrix with the wt single-cell matrix and creating a joint single-cell graph (metacell *cgraph* object). 175 | See summary_preprocessing.r and the corresponding scripts for more details. 176 | 177 | ### Using Docker 178 | 179 | To generate all the figures of the paper using the docker image, please run the following commands: 180 | 181 | ```bash 182 | docker pull tanaylab/embflow:latest 183 | mkdir figs 184 | docker run -ti --user $(id -u):$(id -g) -v $(pwd)/figs:/embflow/figs tanaylab/embflow:latest 185 | ``` 186 | 187 | And then run within the R session: 188 | 189 | ```R 190 | source("scripts/initialize_scripts.r") 191 | generate_all_figures() 192 | ``` 193 | 194 | The figures would be then generated in the mounted directory "figs". 195 | 196 | 197 | ### Contact 198 | For help, please contact 199 | -------------------------------------------------------------------------------- /config/sing_emb.yaml: -------------------------------------------------------------------------------- 1 | mcell_mc2d_K: 25 2 | mcell_mc2d_T_edge: 0.019 3 | mcell_mc2d_proj_blur: 0.02 4 | mcell_mc2d_max_confu_deg: 4 5 | mcell_mc2d_edge_asym: FALSE 6 | mcell_mc2d_k_expand_inout_factor: NULL 7 | mcell_mc2d_gene_cell_cex: 1.2 8 | mcell_mc2d_cex: 1.6 9 | 10 | mcell_mc2d_height: 1000 11 | mcell_mc2d_width: 1000 12 | 13 | mcp_heatmap_width: 2000 14 | mcell_mc2d_plot_key: FALSE 15 | mc_plot_device: "png" 16 | 17 | 18 | -------------------------------------------------------------------------------- /config/sing_emb_mgraph.yaml: -------------------------------------------------------------------------------- 1 | mcell_mc2d_K: 25 2 | mcell_mc2d_T_edge: 0.0002 3 | mcell_mc2d_proj_blur: 0.02 4 | mcell_mc2d_max_confu_deg: 4 5 | mcell_mc2d_edge_asym: FALSE 6 | mcell_mc2d_k_expand_inout_factor: NULL 7 | mcell_mc2d_gene_cell_cex: 1.2 8 | mcell_mc2d_cex: 1.2 9 | 10 | mcell_mc2d_height: 1500 11 | mcell_mc2d_width: 1500 12 | 13 | mcell_mgraph_T_edge: 0.001 14 | mcell_mgraph_max_confu_deg: 4 15 | 16 | mcp_heatmap_width: 2000 17 | mcell_mc2d_plot_key: FALSE 18 | mc_plot_device: "png" 19 | 20 | -------------------------------------------------------------------------------- /scripts/additional_network_functions.r: -------------------------------------------------------------------------------- 1 | 2 | mctnetwork_get_egc_on_cluster_transition = function(mct, min_time, max_time, type1, type2, mc_type=NULL) { 3 | mc = scdb_mc(mct@mc_id) 4 | e_gc = mc@e_gc 5 | net = mct@network 6 | 7 | if(is.null(mc_type)) { 8 | mc_type = mc@colors 9 | names(mc_type) = as.character(1:length(mc_type)) 10 | } 11 | 12 | # flow_mm = mctnetwork_get_flow_mat(mct, time, max_time=time) 13 | 14 | f_t = net$time1 >= min_time & net$time2 <= max_time & 15 | net$time1 == net$time2-1 & 16 | net$type1 != "growth" & net$type2!="growth" 17 | 18 | net = net[f_t,] 19 | f_types = mc_type[as.numeric(net$mc1)]==type1 & mc_type[as.numeric(net$mc2)]==type2 20 | 21 | net = net[f_types,] 22 | 23 | src_mc_wgt = tapply(net$flow, net$mc1, sum) 24 | targ_mc_wgt = tapply(net$flow, net$mc2, sum) 25 | src_mc_wgt_n = as.vector(src_mc_wgt/sum(src_mc_wgt)) 26 | names(src_mc_wgt_n) = names(src_mc_wgt) 27 | targ_mc_wgt_n = as.vector(targ_mc_wgt/sum(targ_mc_wgt)) 28 | names(targ_mc_wgt_n) = names(targ_mc_wgt) 29 | 30 | src_e_gc = colSums(t(e_gc[,names(src_mc_wgt_n)]) * src_mc_wgt_n) 31 | targ_e_gc = colSums(t(e_gc[,names(targ_mc_wgt_n)]) * targ_mc_wgt_n) 32 | 33 | return(data.frame(src = src_e_gc, targ = targ_e_gc, lf = log2(1e-5+targ_e_gc)-log2(1e-5+src_e_gc))) 34 | } 35 | -------------------------------------------------------------------------------- /scripts/download_data.r: -------------------------------------------------------------------------------- 1 | # script for downloading and initializing the single-cell RNA-seq database for the metacell package 2 | 3 | download.file("https://embflow.s3.eu-west-1.amazonaws.com/data_embflow.tar.gz", "data_embflow.tar.gz") 4 | 5 | system("tar -xvf data_embflow.tar.gz data") 6 | 7 | download.file("https://embflow.s3.eu-west-1.amazonaws.com/scrna_db_embflow.tar.gz","scrna_db_embflow.tar.gz") 8 | 9 | system("tar -xvf scrna_db_embflow.tar.gz scrna_db") 10 | 11 | file.remove("data_embflow.tar.gz") 12 | file.remove("scrna_db_embflow.tar.gz") 13 | 14 | 15 | if(!dir.exists("figs")) { 16 | dir.create("figs") 17 | } 18 | if(!dir.exists("figs/paper_figs")) { 19 | dir.create("figs/paper_figs") 20 | } 21 | 22 | -------------------------------------------------------------------------------- /scripts/foxc12/chimeras_compare_frequency.r: -------------------------------------------------------------------------------- 1 | library("zoo") 2 | 3 | 4 | chim_plot_ct_frequency_per_ct = function(plot_pdf = F) { 5 | 6 | mat_nm1 = "foxc_chim_wt10" 7 | mat_nm2 = "control_chim_wt10" 8 | 9 | rank_to_time = read.table(file = "data/revision/wt10_transcriptional_rank_developmental_time.txt",stringsAsFactors = F,h = T,sep = "\t") 10 | dev_time = rank_to_time$developmental_time 11 | dev_time = c(1:length(dev_time)) 12 | 13 | age_field_host = "best_rank_host" 14 | age_field_ko = "best_rank_ko" 15 | age_field_control = "best_rank_control" 16 | 17 | 18 | fig_dir = sprintf("%s/fig6/ct_freq",.scfigs_base) 19 | if(!dir.exists(fig_dir)) { 20 | dir.create(fig_dir) 21 | } 22 | 23 | roll_width = 4 24 | 25 | load(file = sprintf("data/revision/%s/color_annotation/cmp_annot.Rda",mat_nm1)) 26 | cmp_annot1 = cmp_annot 27 | load(file = sprintf("data/revision/%s/color_annotation/cmp_annot.Rda",mat_nm2)) 28 | cmp_annot2 = cmp_annot 29 | 30 | query_cls_col1 = cmp_annot1$query_cls_col 31 | query_cls1 = names(query_cls_col1) 32 | query_cls_col2 = cmp_annot2$query_cls_col 33 | query_cls2 = names(query_cls_col2) 34 | 35 | 36 | mat1 = scdb_mat(mat_nm1) 37 | mat2 = scdb_mat(mat_nm2) 38 | mc_wt = scdb_mc("sing_emb_wt10_recolored") 39 | cgraph = scdb_cgraph(mat_nm1) 40 | col_to_ct = mc_wt@color_key$group 41 | names(col_to_ct) = mc_wt@color_key$color 42 | excluded_colors = c("#F6BFCB","#7F6874") 43 | included_colors = setdiff(unique(mc_wt@colors),excluded_colors) 44 | 45 | df_chim1 = read.table(sprintf("data/revision/%s/time_match/time_match_summary.txt",mat_nm1),sep = "\t",h = T,stringsAsFactors = F) 46 | rownames(df_chim1) = df_chim1$embryo 47 | df_chim2 = read.table(sprintf("data/revision/%s/time_match/time_match_summary.txt",mat_nm2),sep = "\t",h = T,stringsAsFactors = F) 48 | rownames(df_chim2) = df_chim2$embryo 49 | 50 | f = !(df_chim1$embryo %in% c("6D_m2e3","6D_m3e1","6D_m3e7")) 51 | chim_embryos1 = df_chim1$embryo[f] 52 | chim_embryos1 = chim_embryos1[order(df_chim1[chim_embryos1,"best_rank_host"])] 53 | chim_embryos2 = df_chim2$embryo[order(df_chim2[,"best_rank_host"])] 54 | 55 | ko_cls = query_cls1[( mat1@cell_metadata[query_cls1,"cell_type"] == "KO" ) & ( query_cls_col1[query_cls1] %in% included_colors )] 56 | control_cls = query_cls2[( mat2@cell_metadata[query_cls2,"cell_type"] == "control" ) & ( query_cls_col2[query_cls2] %in% included_colors )] 57 | host_cls1 = query_cls1[( mat1@cell_metadata[query_cls1,"cell_type"] %in% c("host") ) & ( query_cls_col1[query_cls1] %in% included_colors )] 58 | host_cls2 = query_cls2[( mat1@cell_metadata[query_cls2,"cell_type"] %in% c("host") ) & ( query_cls_col2[query_cls2] %in% included_colors )] 59 | wt10_cls = names(mc_wt@mc)[mc_wt@colors[mc_wt@mc] %in% included_colors] 60 | 61 | emb_wt_age = unique(mat1@cell_metadata[wt10_cls,c("transcriptional_rank","age_group")]) 62 | emb_wt_age = emb_wt_age[order(emb_wt_age$transcriptional_rank),] 63 | 64 | wt10_emb_vs_ct = table(mat1@cell_metadata[wt10_cls,"transcriptional_rank"],mc_wt@colors[mc_wt@mc[wt10_cls]]) 65 | ko_emb_vs_ct = table(mat1@cell_metadata[ko_cls,"embryo"],query_cls_col1[ko_cls]) 66 | control_emb_vs_ct = table(mat2@cell_metadata[control_cls,"embryo"],query_cls_col2[control_cls]) 67 | host_emb_vs_ct1 = table(mat1@cell_metadata[host_cls1,"embryo"],query_cls_col1[host_cls1]) 68 | host_emb_vs_ct2 = table(mat2@cell_metadata[host_cls2,"embryo"],query_cls_col2[host_cls2]) 69 | ko_emb_vs_ct = ko_emb_vs_ct[chim_embryos1,] 70 | control_emb_vs_ct = control_emb_vs_ct[chim_embryos2,] 71 | host_emb_vs_ct1 = host_emb_vs_ct1[chim_embryos1,] 72 | host_emb_vs_ct2 = host_emb_vs_ct2[chim_embryos2,] 73 | 74 | wt10_emb_vs_ct_n = wt10_emb_vs_ct/rowSums(wt10_emb_vs_ct[,included_colors]) 75 | 76 | # next calculate moving 90% moving average window for every cell type 77 | 78 | mov_mean = apply(wt10_emb_vs_ct_n,2,function(freq_ct) { 79 | 80 | n = length(freq_ct) 81 | 82 | freq_ct = c(freq_ct,freq_ct[(n - roll_width + 1):n]) 83 | 84 | freq_mean = rollmean(x = freq_ct,k = 2*roll_width+1) 85 | return(freq_mean) 86 | }) 87 | 88 | mov_sd = apply(wt10_emb_vs_ct_n,2,function(freq_ct) { 89 | 90 | freq_ct = c(freq_ct,freq_ct[(length(freq_ct) - roll_width + 1):length(freq_ct)]) 91 | 92 | freq_sd = rollapply(data = freq_ct,width = 2*roll_width+1,sd) 93 | 94 | return(freq_sd) 95 | }) 96 | 97 | upper_sd = mov_mean + mov_sd 98 | lower_sd = mov_mean - mov_sd 99 | 100 | 101 | upper_lim = apply(wt10_emb_vs_ct_n,2,function(freq_ct) { 102 | 103 | freq_ct = c(freq_ct,freq_ct[(length(freq_ct) - roll_width + 1):length(freq_ct)]) 104 | 105 | upper_freq = rollapply(data = freq_ct,width = 2*roll_width+1,function(v) { 106 | return(quantile(v,0.95)) 107 | }) 108 | 109 | return(upper_freq) 110 | }) 111 | 112 | lower_lim = apply(wt10_emb_vs_ct_n,2,function(freq_ct) { 113 | 114 | freq_ct = c(freq_ct,freq_ct[(length(freq_ct) - roll_width + 1):length(freq_ct)]) 115 | 116 | lower_freq = rollapply(data = freq_ct,width = 2*roll_width+1,function(v) { 117 | return(quantile(v,0.05)) 118 | }) 119 | 120 | return(lower_freq) 121 | }) 122 | 123 | #rownames(upper_lim) = c((roll_width+1):(153 - roll_width)) 124 | #rownames(lower_lim) = c((roll_width+1):(153 - roll_width)) 125 | 126 | rownames(upper_lim) = c((roll_width+1):153) 127 | rownames(lower_lim) = c((roll_width+1):153) 128 | rownames(upper_sd) = c((roll_width+1):153) 129 | rownames(lower_sd) = c((roll_width+1):153) 130 | rownames(mov_mean) = c((roll_width+1):153) 131 | 132 | ko_emb_vs_ct_n = ko_emb_vs_ct/rowSums(ko_emb_vs_ct[,intersect(included_colors,colnames(ko_emb_vs_ct))]) 133 | control_emb_vs_ct_n = control_emb_vs_ct/rowSums(control_emb_vs_ct[,intersect(included_colors,colnames(control_emb_vs_ct))]) 134 | host_emb_vs_ct1_n = host_emb_vs_ct1/rowSums(host_emb_vs_ct1[,intersect(included_colors,colnames(host_emb_vs_ct1))]) 135 | host_emb_vs_ct2_n = host_emb_vs_ct2/rowSums(host_emb_vs_ct2[,intersect(included_colors,colnames(host_emb_vs_ct2))]) 136 | 137 | emb_vs_ct_ls = list(ko = ko_emb_vs_ct_n,host1 = host_emb_vs_ct1_n,wt10 = wt10_emb_vs_ct_n,host2 = host_emb_vs_ct2_n,control = control_emb_vs_ct_n) 138 | 139 | 140 | 141 | min_rank_plot = 105 142 | 143 | xlim_min = dev_time[min_rank_plot] 144 | #xlim_min = 7.6 145 | xlim_max = dev_time[153] 146 | 147 | for(cell_type in included_colors) { 148 | print(cell_type) 149 | 150 | 151 | ct_freq_ls = lapply(emb_vs_ct_ls,FUN = function(emb_vs_ct) { 152 | if(cell_type %in% colnames(emb_vs_ct)) { 153 | ct_freq = emb_vs_ct[,cell_type] 154 | } else { 155 | ct_freq = rep(0,nrow(emb_vs_ct)) 156 | names(ct_freq) = rownames(emb_vs_ct) 157 | } 158 | return(ct_freq) 159 | }) 160 | 161 | #calculate 90% moving average window 162 | wt10_freq = ct_freq_ls$wt10 163 | 164 | #x_ranks = c(min_rank_plot:(153-roll_width)) 165 | x_ranks = c(min_rank_plot:(153)) 166 | #upper_freq = upper_lim[as.character(c(min_rank_plot:(153-roll_width))),cell_type] 167 | #lower_freq = lower_lim[as.character(c(min_rank_plot:(153-roll_width))),cell_type] 168 | upper_freq = upper_sd[as.character(c(min_rank_plot:(153))),cell_type] 169 | lower_freq = lower_sd[as.character(c(min_rank_plot:(153))),cell_type] 170 | 171 | 172 | ylim_max = max(ct_freq_ls$ko,ct_freq_ls$host1,ct_freq_ls$host2,ct_freq_ls$control,ct_freq_ls$wt10) 173 | 174 | cell_type_nm = gsub("/","_",col_to_ct[cell_type]) 175 | 176 | if(plot_pdf) { 177 | pdf(sprintf("%s/%s.pdf",fig_dir,cell_type_nm),w = 7,h = 5.6,useDingbats = F) 178 | par(mar = c(6,6,6,4)) 179 | plot(x = dev_time[emb_wt_age$transcriptional_rank],y = ct_freq_ls$wt10,ylim = c(0,1.2*ylim_max), 180 | pch = 19,cex = 1,col = "gray80",main = col_to_ct[cell_type],ylab = "Fraction",xlab = "Transcriptional rank",cex.main = 3, 181 | cex.lab = 2,cex.axis = 2,xaxs = 'i',yaxs= 'i',xlim = c(xlim_min,xlim_max)) 182 | 183 | 184 | polygon(x = c(dev_time[x_ranks],dev_time[rev(x_ranks)]),y = c(upper_freq,rev(lower_freq)),col = "gray80",border = NA) 185 | 186 | lines(x = dev_time[x_ranks],y = mov_mean[as.character(x_ranks),cell_type]) 187 | 188 | points(x = dev_time[emb_wt_age$transcriptional_rank],y = ct_freq_ls$wt10, 189 | pch = 19,cex = 1,col = "gray50") 190 | 191 | segments(x0 = dev_time[df_chim1[names(ct_freq_ls$host1),age_field_host]],y0 = ct_freq_ls$host1,x1 = dev_time[df_chim1[names(ct_freq_ls$ko),age_field_host]],y1 = ct_freq_ls$ko,lwd = 1) 192 | 193 | segments(x0 = dev_time[df_chim2[names(ct_freq_ls$host2),age_field_host]],y0 = ct_freq_ls$host2,x1 = dev_time[df_chim2[names(ct_freq_ls$control),age_field_host]],y1 = ct_freq_ls$control,lwd = 1) 194 | 195 | 196 | points(x = dev_time[df_chim1[names(ct_freq_ls$host1),age_field_host]],y = ct_freq_ls$host1, pch = 19,cex = 3,col = "black") 197 | points(x = dev_time[df_chim1[names(ct_freq_ls$ko),age_field_host]],y = ct_freq_ls$ko, pch = 19,cex = 3,col = "#83c26d") 198 | 199 | points(x = dev_time[df_chim2[names(ct_freq_ls$host2),age_field_host]],y = ct_freq_ls$host2, pch = 17,cex = 3,col = "black") 200 | points(x = dev_time[df_chim2[names(ct_freq_ls$control),age_field_host]],y = ct_freq_ls$control, pch = 17,cex = 3,col = "cornflowerblue") 201 | 202 | dev.off() 203 | 204 | } else { 205 | 206 | png(sprintf("%s/%s.png",fig_dir,cell_type_nm),w = 800,h = 600) 207 | par(mar = c(6,6,6,4)) 208 | plot(x = dev_time[emb_wt_age$transcriptional_rank],y = ct_freq_ls$wt10,ylim = c(0,1.2*ylim_max), 209 | pch = 19,cex = 1,col = "gray80",main = col_to_ct[cell_type],ylab = "Fraction",xlab = "Transcriptional rank",cex.main = 3, 210 | cex.lab = 3,xaxs = 'i',yaxs= 'i',xlim = c(xlim_min,xlim_max)) 211 | 212 | 213 | polygon(x = c(dev_time[x_ranks],dev_time[rev(x_ranks)]),y = c(upper_freq,rev(lower_freq)),col = "gray80",border = NA) 214 | 215 | lines(x = dev_time[x_ranks],y = mov_mean[as.character(x_ranks),cell_type]) 216 | 217 | points(x = dev_time[emb_wt_age$transcriptional_rank],y = ct_freq_ls$wt10, 218 | pch = 19,cex = 1,col = "black") 219 | 220 | segments(x0 = dev_time[df_chim1[names(ct_freq_ls$host1),age_field_host]],y0 = ct_freq_ls$host1,x1 = dev_time[df_chim1[names(ct_freq_ls$ko),age_field_host]],y1 = ct_freq_ls$ko,lwd = 1) 221 | 222 | segments(x0 = dev_time[df_chim2[names(ct_freq_ls$host2),age_field_host]],y0 = ct_freq_ls$host2,x1 = dev_time[df_chim2[names(ct_freq_ls$control),age_field_host]],y1 = ct_freq_ls$control,lwd = 1) 223 | 224 | 225 | points(x = dev_time[df_chim1[names(ct_freq_ls$host1),age_field_host]],y = ct_freq_ls$host1, pch = 19,cex = 3,col = "black") 226 | points(x = dev_time[df_chim1[names(ct_freq_ls$ko),age_field_host]],y = ct_freq_ls$ko, pch = 19,cex = 3,col = "#83c26d") 227 | 228 | points(x = dev_time[df_chim2[names(ct_freq_ls$host2),age_field_host]],y = ct_freq_ls$host2, pch = 17,cex = 3,col = "black") 229 | points(x = dev_time[df_chim2[names(ct_freq_ls$control),age_field_host]],y = ct_freq_ls$control, pch = 17,cex = 3,col = "cornflowerblue") 230 | 231 | legend(x = "topleft",legend = c("wt","host KO","host control","KO","control"),pch = c(19,19,17,19,17),col = c("black","black","black","#83c26d","cornflowerblue")) 232 | 233 | dev.off() 234 | } 235 | 236 | 237 | } 238 | 239 | } 240 | 241 | 242 | chim_plot_legend = function() { 243 | 244 | pdf("figs_revision/chimera_foxc_control/ct_freq/legend_cell_types.pdf",useDingbats = F) 245 | plot(NULL ,xaxt='n',yaxt='n',bty='n',ylab='',xlab='', xlim=0:1, ylim=0:1) 246 | legend(x = "center",legend = c("WT","Host KO","Host Control","KO","Control"),pch = c(19,19,17,19,17),col = c("gray50","black","black","#83c26d","cornflowerblue")) 247 | dev.off() 248 | 249 | 250 | } -------------------------------------------------------------------------------- /scripts/foxc12/control_chim_timing.r: -------------------------------------------------------------------------------- 1 | source("scripts/foxc12/transfer_time_annotation.r") 2 | library("Matrix") 3 | library("dplyr") 4 | 5 | 6 | 7 | control_chim_timing = function(tag = "all",included_colors = NULL) { 8 | 9 | mat_nm = "control_chim_wt10" 10 | mc_id = "control_chim_wt10_recolored" 11 | cgraph_id = mat_nm 12 | mat = scdb_mat(mat_nm) 13 | mc = scdb_mc(mc_id) 14 | mc_wt = scdb_mc("sing_emb_wt10_recolored") 15 | 16 | excluded_colors = c("#F6BFCB","#7F6874") 17 | 18 | if(is.null(included_colors)) { 19 | included_colors = setdiff(mc_wt@color_key$color,excluded_colors) 20 | } 21 | 22 | load(sprintf("data/chimera_tetraploid_analysis/%s/color_annotation/cmp_annot.Rda",mat_nm)) 23 | query_cls_col = cmp_annot$query_cls_col 24 | 25 | f = mat@cell_metadata[colnames(mat@mat),"cell_type"] %in% c("host","control","KO") 26 | chim_cls = colnames(mat@mat)[f] 27 | chim_cls = intersect(chim_cls,names(cmp_annot$query_cls_col)) 28 | 29 | excluded_cls = chim_cls[cmp_annot$query_cls_col[chim_cls] %in% excluded_colors] 30 | chim_cls = setdiff(chim_cls,excluded_cls) 31 | 32 | chim_cls = chim_cls[query_cls_col[chim_cls] %in% included_colors] 33 | 34 | chim_embryos = unique(mat@cell_metadata[chim_cls,"embryo"]) 35 | 36 | wt10_cls = intersect(names(mc_wt@mc)[ !(mc_wt@colors[mc_wt@mc] %in% excluded_colors) ],colnames(mat@mat)) 37 | atlas_time = mat@cell_metadata[wt10_cls,"transcriptional_rank"] 38 | names(atlas_time) = wt10_cls 39 | 40 | data_dir = sprintf("data/chimera_tetraploid_analysis/%s",mat_nm) 41 | if(!dir.exists(data_dir)) { 42 | dir.create(data_dir) 43 | } 44 | 45 | data_dir = sprintf("data/chimera_tetraploid_analysis/%s/time_match",mat_nm) 46 | if(!dir.exists(data_dir)) { 47 | dir.create(data_dir) 48 | } 49 | 50 | # first get atlas time distribution 51 | 52 | atlas_time_dist = get_atlas_time_dist(atlas_time = atlas_time,graph_id = cgraph_id) 53 | 54 | # first timing using host cells only 55 | host_cls = chim_cls[(mat@cell_metadata[chim_cls,"cell_type"] == "host") & ( mat@cell_metadata[chim_cls,"embryo"] %in% chim_embryos )] 56 | query_cls_md = mat@cell_metadata[host_cls,"embryo"] 57 | names(query_cls_md) = host_cls 58 | 59 | query_time_dist_host = get_query_time_dist(query_cls_md = query_cls_md,atlas_time = atlas_time,graph_id = cgraph_id) 60 | 61 | # timing using control cells 62 | 63 | control_cls = chim_cls[( mat@cell_metadata[chim_cls,"cell_type"] == "control" ) & ( mat@cell_metadata[chim_cls,"embryo"] %in% chim_embryos )] 64 | query_cls_md = mat@cell_metadata[control_cls,"embryo"] 65 | names(query_cls_md) = control_cls 66 | 67 | query_time_dist_control = get_query_time_dist(query_cls_md = query_cls_md,atlas_time = atlas_time,graph_id = cgraph_id) 68 | 69 | 70 | time_dist_host_control = list(atlas_time_dist = atlas_time_dist$atlas_time_dist, 71 | host = query_time_dist_host$query_time_dist, 72 | control = query_time_dist_control$query_time_dist) 73 | 74 | if(tag != "all") { 75 | file_nm = sprintf("%s/time_dist_host_control_%s.Rda",data_dir,tag) 76 | } else { 77 | file_nm = sprintf("%s/time_dist_host_control.Rda",data_dir) 78 | } 79 | 80 | save(time_dist_host_control,file = file_nm) 81 | 82 | chim_emb_summary = as.data.frame.matrix(table(mat@cell_metadata[chim_cls,"embryo"],mat@cell_metadata[chim_cls,"cell_type"])) 83 | chim_emb_summary$embryo = rownames(chim_emb_summary) 84 | chim_emb_summary$best_rank_host = NA 85 | chim_emb_summary[rownames(time_dist_host_control$host),"best_rank_host"] = time_dist_best_match(atlas_time_dist = time_dist_host_control$atlas_time_dist, 86 | query_time_dist = time_dist_host_control$host) 87 | chim_emb_summary$best_rank_control = NA 88 | chim_emb_summary[rownames(time_dist_host_control$control),"best_rank_control"] = time_dist_best_match(atlas_time_dist = time_dist_host_control$atlas_time_dist, 89 | query_time_dist = time_dist_host_control$control) 90 | 91 | if(tag != "all") { 92 | file_nm = sprintf("%s/time_match_summary_%s.txt",data_dir,tag) 93 | } else { 94 | file_nm = sprintf("%s/time_match_summary.txt",data_dir) 95 | } 96 | 97 | write.table(chim_emb_summary,file = file_nm,sep ="\t",row.names = F) 98 | } 99 | 100 | time_dist_best_match = function(query_time_dist,atlas_time_dist) { 101 | 102 | query_ref_cor = tgs_cor(t(as.matrix(as.data.frame.matrix(query_time_dist))), 103 | t(as.matrix(as.data.frame.matrix(atlas_time_dist)))) 104 | 105 | best_fit = apply(query_ref_cor,1,FUN = function(x) { 106 | return(mean(as.numeric(colnames(query_ref_cor))[order(x,decreasing = T)][1:5])) 107 | }) 108 | 109 | return(best_fit) 110 | } 111 | 112 | control_chimera_plot_cumulative_distribution_control_host = function(mat_nm,fig_dir = NULL,tag = "all",plot_pdf = T,chim_col = "cornflowerblue") { 113 | 114 | fig_scale = 1.4 115 | lwd = 15 116 | x_pos_text = 85 117 | xlim_min = 75 118 | xlim_max = 153 119 | 120 | data_dir = sprintf("data/chimera_tetraploid_analysis/%s/time_match",mat_nm) 121 | if(tag == "all") { 122 | load(sprintf("%s/time_dist_host_control.Rda",data_dir)) 123 | } else { 124 | load(sprintf("%s/time_dist_host_control_%s.Rda",data_dir,tag)) 125 | } 126 | 127 | host_dist_all = time_dist_host_control$host 128 | control_dist_all = time_dist_host_control$control 129 | 130 | if(tag == "all") { 131 | fn = sprintf("%s/time_match_summary.txt",data_dir) 132 | } else { 133 | fn = sprintf("%s/time_match_summary_%s.txt",data_dir,tag) 134 | } 135 | chim_time_summary = read.table(file = fn,sep ="\t",h = T,stringsAsFactors = F) 136 | chim_embryos = chim_time_summary$embryo[order(chim_time_summary$best_rank_host)] 137 | 138 | ks_statistic = rep(0,length(chim_embryos)) 139 | p_value = rep(0,length(chim_embryos)) 140 | delta_median = p_value 141 | 142 | for (i in 1:length(chim_embryos)) { 143 | chimera = chim_embryos[i] 144 | host_dens = cumsum(host_dist_all[chimera,])/sum(host_dist_all[chimera,]) 145 | control_dens = cumsum(control_dist_all[chimera,])/sum(control_dist_all[chimera,]) 146 | 147 | host_median = median(rep(c(1:153),host_dist_all[chimera,])) 148 | control_median = median(rep(c(1:153),control_dist_all[chimera,])) 149 | 150 | delta_median[i] = host_median - control_median 151 | 152 | ks_control_host_chim = ks.test(x = rep(c(1:153),host_dist_all[chimera,]),y = rep(c(1:153),control_dist_all[chimera,])) 153 | ks_statistic[i] = ks_control_host_chim$statistic 154 | p_value[i] = ks_control_host_chim$p.value 155 | 156 | 157 | N_host = sum(host_dist_all[chimera,]) 158 | N_control = sum(control_dist_all[chimera,]) 159 | 160 | if(!is.null(fig_dir)) { 161 | if(!dir.exists(fig_dir)) { 162 | dir.create(fig_dir) 163 | } 164 | 165 | if (plot_pdf) { 166 | if(tag == "all") { 167 | fn = sprintf("%s/cumulative_time_dist_%d.pdf",fig_dir,i) 168 | } else { 169 | fn = sprintf("%s/cumulative_time_dist_%d_%s.pdf",fig_dir,i,tag) 170 | } 171 | pdf(fn,w = 5*fig_scale,h = 4*fig_scale,useDingbats = F) 172 | plot(c(xlim_min:xlim_max),host_dens[xlim_min:xlim_max],type = "l", main = paste0("Control ",i),lwd = lwd, 173 | xlab = "Transcriptional rank",ylab = "",ylim = c(0,1),col = "gray30",cex.main = 3,cex.lab = 2,cex.axis = 2) 174 | 175 | lines(c(xlim_min:xlim_max),control_dens[xlim_min:xlim_max],type = "l",lwd = lwd,col = "cornflowerblue") 176 | 177 | text(x = rep(x_pos_text,4), y = c(0.9,0.75,0.6,0.45), 178 | labels = c(sprintf("D = %1.1e",ks_control_host_chim$statistic),sprintf("p < %1.1e",ks_control_host_chim$p.value), 179 | sprintf("N host = %d",N_host),sprintf("N Control = %d",N_control)),cex = 1.2) 180 | 181 | dev.off() 182 | } else { 183 | png(sprintf("%s/cumulative_time_dist_%s.png",fig_dir,chimera),w = 500*fig_scale,h = 400*fig_scale) 184 | plot(dev_time,host_dens,type = "l", main = chimera,lwd = lwd, 185 | xlab = "Developmental time",ylab = "",ylim = c(0,1),col = "gray30",cex.main = 3,cex.lab = 2,cex.axis = 2) 186 | 187 | lines(dev_time,control_dens,type = "l",lwd = lwd,col = "cornflowerblue") 188 | 189 | text(x = rep(x_pos_text,4), y = c(0.9,0.75,0.6,0.45), 190 | labels = c(sprintf("D = %1.1e",ks_control_host_chim$statistic),sprintf("p < %1.1e",ks_control_host_chim$p.value), 191 | sprintf("N host = %d",N_host),sprintf("N control = %d",N_control)),cex = 1.2) 192 | dev.off() 193 | } 194 | 195 | 196 | } 197 | 198 | } 199 | 200 | return(list(d_ks = ks_statistic,p_value = p_value,delta_median = delta_median,chimera = chim_embryos)) 201 | } 202 | -------------------------------------------------------------------------------- /scripts/foxc12/definition_cell_types_endoderm_ectoderm_embryonic_mesoderm.r: -------------------------------------------------------------------------------- 1 | # definition of cell types that belong to endoderm, ectoderm or embryonic mesoderm respectively 2 | 3 | embryonic_meso_ct_colors = function() { 4 | mc_wt = scdb_mc("sing_emb_wt10_recolored") 5 | ct_to_col = mc_wt@color_key$color 6 | names(ct_to_col) = mc_wt@color_key$group 7 | 8 | included_ct = c("Caudal mesoderm","Early nascent mesoderm","Late nascent mesoderm","Paraxial mesoderm","Rostral mesoderm","Lateral & intermediate mesoderm", 9 | "Cardiac mesoderm","Cardiomyocytes") 10 | 11 | if(sum(is.na(ct_to_col[included_ct])) > 0) { 12 | stop(sprintf("one of the cell types is not correctly defined \n %s",included_ct[is.na(ct_to_col[included_ct])])) 13 | } 14 | 15 | return(ct_to_col[included_ct]) 16 | } 17 | 18 | endo_ct_colors = function() { 19 | mc_wt = scdb_mc("sing_emb_wt10_recolored") 20 | ct_to_col = mc_wt@color_key$color 21 | names(ct_to_col) = mc_wt@color_key$group 22 | 23 | included_ct = c("Definitive endoderm","Foregut","Node/Notochord") 24 | 25 | 26 | if(sum(is.na(ct_to_col[included_ct])) > 0) { 27 | stop("one of the cell types is not correctly defined") 28 | } 29 | 30 | return(ct_to_col[included_ct]) 31 | } 32 | 33 | ectoderm_ct_colors = function() { 34 | mc_wt = scdb_mc("sing_emb_wt10_recolored") 35 | ct_to_col = mc_wt@color_key$color 36 | names(ct_to_col) = mc_wt@color_key$group 37 | 38 | included_ct = c("Surface ectoderm","Rostral neural plate","Definitive ectoderm","Caudal neural plate","Caudal epiblast") 39 | 40 | if(sum(is.na(ct_to_col[included_ct])) > 0) { 41 | stop("one of the cell types is not correctly defined") 42 | } 43 | 44 | return(ct_to_col[included_ct]) 45 | } 46 | 47 | 48 | -------------------------------------------------------------------------------- /scripts/foxc12/foxc_chim_timing.r: -------------------------------------------------------------------------------- 1 | source("scripts/foxc12/transfer_time_annotation.r") 2 | library("Matrix") 3 | library("dplyr") 4 | 5 | 6 | 7 | foxc_chim_timing = function(tag = "all",included_colors = NULL) { 8 | 9 | mat_nm = "foxc_chim_wt10" 10 | mc_id = "foxc_chim_wt10_recolored" 11 | cgraph_id = mat_nm 12 | mat = scdb_mat(mat_nm) 13 | mc = scdb_mc(mc_id) 14 | mc_wt = scdb_mc("sing_emb_wt10_recolored") 15 | 16 | excluded_colors = c("#F6BFCB","#7F6874") 17 | 18 | if(is.null(included_colors)) { 19 | included_colors = setdiff(mc_wt@color_key$color,excluded_colors) 20 | } 21 | 22 | load(sprintf("data/chimera_tetraploid_analysis/%s/color_annotation/cmp_annot.Rda",mat_nm)) 23 | query_cls_col = cmp_annot$query_cls_col 24 | 25 | f = mat@cell_metadata[colnames(mat@mat),"cell_type"] %in% c("host","control","KO") 26 | chim_cls = colnames(mat@mat)[f] 27 | chim_cls = intersect(chim_cls,names(cmp_annot$query_cls_col)) 28 | 29 | excluded_cls = chim_cls[cmp_annot$query_cls_col[chim_cls] %in% excluded_colors] 30 | chim_cls = setdiff(chim_cls,excluded_cls) 31 | 32 | chim_cls = chim_cls[query_cls_col[chim_cls] %in% included_colors] 33 | 34 | chim_embryos = unique(mat@cell_metadata[chim_cls,"embryo"]) 35 | 36 | wt10_cls = intersect(names(mc_wt@mc)[ !(mc_wt@colors[mc_wt@mc] %in% excluded_colors) ],colnames(mat@mat)) 37 | atlas_time = mat@cell_metadata[wt10_cls,"transcriptional_rank"] 38 | names(atlas_time) = wt10_cls 39 | 40 | data_dir = sprintf("data/chimera_tetraploid_analysis/%s",mat_nm) 41 | if(!dir.exists(data_dir)) { 42 | dir.create(data_dir) 43 | } 44 | 45 | data_dir = sprintf("data/chimera_tetraploid_analysis/%s/time_match",mat_nm) 46 | if(!dir.exists(data_dir)) { 47 | dir.create(data_dir) 48 | } 49 | 50 | # first get atlas time distribution 51 | 52 | atlas_time_dist = get_atlas_time_dist(atlas_time = atlas_time,graph_id = cgraph_id) 53 | 54 | # first timing using host cells only 55 | host_cls = chim_cls[(mat@cell_metadata[chim_cls,"cell_type"] == "host") & ( mat@cell_metadata[chim_cls,"embryo"] %in% chim_embryos )] 56 | query_cls_md = mat@cell_metadata[host_cls,"embryo"] 57 | names(query_cls_md) = host_cls 58 | 59 | query_time_dist_host = get_query_time_dist(query_cls_md = query_cls_md,atlas_time = atlas_time,graph_id = cgraph_id) 60 | 61 | # timing using KO cells 62 | 63 | ko_cls = chim_cls[( mat@cell_metadata[chim_cls,"cell_type"] == "KO" ) & ( mat@cell_metadata[chim_cls,"embryo"] %in% chim_embryos )] 64 | query_cls_md = mat@cell_metadata[ko_cls,"embryo"] 65 | names(query_cls_md) = ko_cls 66 | 67 | query_time_dist_ko = get_query_time_dist(query_cls_md = query_cls_md,atlas_time = atlas_time,graph_id = cgraph_id) 68 | 69 | 70 | time_dist_host_ko = list(atlas_time_dist = atlas_time_dist$atlas_time_dist, 71 | host = query_time_dist_host$query_time_dist, 72 | ko = query_time_dist_ko$query_time_dist) 73 | 74 | if(tag != "all") { 75 | file_nm = sprintf("%s/time_dist_host_ko_%s.Rda",data_dir,tag) 76 | } else { 77 | file_nm = sprintf("%s/time_dist_host_ko.Rda",data_dir) 78 | } 79 | 80 | save(time_dist_host_ko,file = file_nm) 81 | 82 | chim_emb_summary = as.data.frame.matrix(table(mat@cell_metadata[chim_cls,"embryo"],mat@cell_metadata[chim_cls,"cell_type"])) 83 | chim_emb_summary$embryo = rownames(chim_emb_summary) 84 | chim_emb_summary$best_rank_host = NA 85 | chim_emb_summary[rownames(time_dist_host_ko$host),"best_rank_host"] = time_dist_best_match(atlas_time_dist = time_dist_host_ko$atlas_time_dist, 86 | query_time_dist = time_dist_host_ko$host) 87 | chim_emb_summary$best_rank_ko = NA 88 | chim_emb_summary[rownames(time_dist_host_ko$ko),"best_rank_ko"] = time_dist_best_match(atlas_time_dist = time_dist_host_ko$atlas_time_dist, 89 | query_time_dist = time_dist_host_ko$ko) 90 | 91 | if(tag != "all") { 92 | file_nm = sprintf("%s/time_match_summary_%s.txt",data_dir,tag) 93 | } else { 94 | file_nm = sprintf("%s/time_match_summary.txt",data_dir) 95 | } 96 | 97 | write.table(chim_emb_summary,file = file_nm,sep ="\t",row.names = F) 98 | } 99 | 100 | 101 | foxc_chimera_plot_cumulative_distribution_ko_host = function(mat_nm,fig_dir = NULL,tag = "all",plot_pdf = F,chim_col = "#83c26d") { 102 | 103 | fig_scale = 1.4 104 | lwd = 15 105 | x_pos_text = 85 106 | xlim_min = 75 107 | xlim_max = 153 108 | 109 | 110 | data_dir = sprintf("data/chimera_tetraploid_analysis/%s/time_match",mat_nm) 111 | if(tag == "all") { 112 | load(sprintf("%s/time_dist_host_ko.Rda",data_dir)) 113 | } else { 114 | load(sprintf("%s/time_dist_host_ko_%s.Rda",data_dir,tag)) 115 | } 116 | 117 | 118 | host_dist_all = time_dist_host_ko$host 119 | ko_dist_all = time_dist_host_ko$ko 120 | 121 | if(tag == "all") { 122 | fn = sprintf("%s/time_match_summary.txt",data_dir) 123 | } else { 124 | fn = sprintf("%s/time_match_summary_%s.txt",data_dir,tag) 125 | } 126 | chim_time_summary = read.table(file = fn,sep ="\t",h = T,stringsAsFactors = F) 127 | chim_embryos = chim_time_summary$embryo[order(chim_time_summary$best_rank_host)] 128 | 129 | ks_statistic = rep(0,length(chim_embryos)) 130 | p_value = rep(0,length(chim_embryos)) 131 | delta_median = p_value 132 | 133 | for (i in 1:length(chim_embryos)) { 134 | chimera = chim_embryos[i] 135 | print(chimera) 136 | host_dens = cumsum(host_dist_all[chimera,])/sum(host_dist_all[chimera,]) 137 | ko_dens = cumsum(ko_dist_all[chimera,])/sum(ko_dist_all[chimera,]) 138 | 139 | host_median = median(rep(c(1:153),host_dist_all[chimera,])) 140 | ko_median = median(rep(c(1:153),ko_dist_all[chimera,])) 141 | 142 | delta_median[i] = host_median - ko_median 143 | 144 | ks_ko_host_chim = ks.test(x = rep(c(1:153),host_dist_all[chimera,]),y = rep(c(1:153),ko_dist_all[chimera,])) 145 | ks_statistic[i] = ks_ko_host_chim$statistic 146 | p_value[i] = ks_ko_host_chim$p.value 147 | 148 | N_host = sum(host_dist_all[chimera,]) 149 | N_ko = sum(ko_dist_all[chimera,]) 150 | 151 | if(!is.null(fig_dir)) { 152 | 153 | if(!dir.exists(fig_dir)) { 154 | dir.create(fig_dir) 155 | } 156 | 157 | if (plot_pdf) { 158 | if(tag == "all") { 159 | fn = sprintf("%s/cumulative_time_dist_%d.pdf",fig_dir,i) 160 | } else { 161 | fn = sprintf("%s/cumulative_time_dist_%d_%s.pdf",fig_dir,i,tag) 162 | } 163 | pdf(fn,w = 5*fig_scale,h = 4*fig_scale,useDingbats = F) 164 | plot(c(xlim_min:xlim_max),host_dens[xlim_min:xlim_max],type = "l", main = paste0("FoxC DKO ",i),lwd = lwd, 165 | xlab = "Transcriptional rank",ylab = "",ylim = c(0,1),col = "gray30",cex.main = 3,cex.lab = 2,cex.axis = 2) 166 | 167 | lines(c(xlim_min:xlim_max),ko_dens[xlim_min:xlim_max],type = "l",lwd = lwd,col = "#83c26d") 168 | 169 | text(x = rep(x_pos_text,4), y = c(0.9,0.75,0.6,0.45), 170 | labels = c(sprintf("D = %1.1e",ks_ko_host_chim$statistic),sprintf("p < %1.1e",ks_ko_host_chim$p.value), 171 | sprintf("N host = %d",N_host),sprintf("N KO = %d",N_ko)),cex = 1.2) 172 | 173 | dev.off() 174 | 175 | } else { 176 | 177 | if(tag == "all") { 178 | fn = sprintf("%s/cumulative_time_dist_%s.png",fig_dir,chimera) 179 | } else { 180 | fn = sprintf("%s/cumulative_time_dist_%s_%s.png",fig_dir,chimera,tag) 181 | } 182 | png(fn,w = 500*fig_scale,h = 400*fig_scale) 183 | plot(c(xlim_min:xlim_max),host_dens[xlim_min:xlim_max],type = "l", main = paste0("FoxC DKO ",i),lwd = lwd, 184 | xlab = "Transcriptional rank",ylab = "",ylim = c(0,1),col = "gray30",cex.main = 3,cex.lab = 2,cex.axis = 2) 185 | 186 | lines(c(xlim_min:xlim_max),ko_dens[xlim_min:xlim_max],type = "l",lwd = lwd,col = chim_col) 187 | 188 | text(x = rep(x_pos_text,4), y = c(0.9,0.75,0.6,0.45), 189 | labels = c(sprintf("D = %1.1e",ks_ko_host_chim$statistic),sprintf("p < %1.1e",ks_ko_host_chim$p.value), 190 | sprintf("N host = %d",N_host),sprintf("N KO = %d",N_ko)),cex = 1.2) 191 | dev.off() 192 | } 193 | 194 | } 195 | 196 | 197 | } 198 | 199 | return(list(d_ks = ks_statistic,p_value = p_value,delta_median = delta_median,chimera = chim_embryos)) 200 | } 201 | 202 | 203 | time_dist_best_match = function(query_time_dist,atlas_time_dist) { 204 | 205 | query_ref_cor = tgs_cor(t(as.matrix(as.data.frame.matrix(query_time_dist))), 206 | t(as.matrix(as.data.frame.matrix(atlas_time_dist)))) 207 | 208 | best_fit = apply(query_ref_cor,1,FUN = function(x) { 209 | return(mean(as.numeric(colnames(query_ref_cor))[order(x,decreasing = T)][1:5])) 210 | }) 211 | 212 | return(best_fit) 213 | } 214 | -------------------------------------------------------------------------------- /scripts/foxc12/generate_chimera_tetraploid_data_analysis.r: -------------------------------------------------------------------------------- 1 | library("metacell") 2 | scdb_init("scrna_db/",force_reinit = T) 3 | tgconfig::override_params(config_file = "config/sing_emb.yaml",package = "metacell") 4 | source("scripts/foxc12/transfer_cell_type_annotation.r") 5 | source("scripts/foxc12/transfer_time_annotation.r") 6 | source("scripts/foxc12/foxc_chim_timing.r") 7 | source("scripts/foxc12/control_chim_timing.r") 8 | source("scripts/foxc12/definition_cell_types_endoderm_ectoderm_embryonic_mesoderm.r") 9 | 10 | foxc_chimera_generate_time_and_cell_type_annotation = function() { 11 | 12 | if(!dir.exists("data/chimera_tetraploid_analysis")) { 13 | dir.create("data/chimera_tetraploid_analysis") 14 | } 15 | # generate single-cell graph 16 | gen_cgraph(mat_nm = "foxc_chim_wt10") 17 | message("generated cgraph") 18 | transfer_color_chimera_tetraploid(mat_nm = "foxc_chim_wt10",tag = "KO") 19 | message("transfered cell type annotation") 20 | foxc_chim_timing() 21 | message("tranfered time annotation") 22 | 23 | # generate time distributions per embryo based on endo- and ectoderm cells 24 | # endo_ct_colors() and ectoderm_ct_colors just return the colors of ecto- and endodermal cell types 25 | endo_colors = endo_ct_colors() 26 | ecto_colors = ectoderm_ct_colors() 27 | included_colors = c(endo_colors,ecto_colors) 28 | foxc_chim_timing(tag = "endo_ecto",included_colors = included_colors) 29 | 30 | # Embryonic mesoderm 31 | included_colors = embryonic_meso_ct_colors() 32 | foxc_chim_timing(tag = "emb_meso",included_colors = included_colors) 33 | 34 | } 35 | 36 | control_chimera_generate_time_and_cell_type_annotation = function() { 37 | 38 | if(!dir.exists("data/chimera_tetraploid_analysis")) { 39 | dir.create("data/chimera_tetraploid_analysis") 40 | } 41 | # generate single-cell graph 42 | gen_cgraph(mat_nm = "control_chim_wt10") 43 | message("generated cgraph") 44 | transfer_color_chimera_tetraploid(mat_nm = "control_chim_wt10",tag = "control") 45 | message("transfered cell type annotation") 46 | control_chim_timing() 47 | message("tranfered time annotation") 48 | 49 | # generate time distributions per embryo based on endo- and ectoderm cells 50 | # endo_ct_colors() and ectoderm_ct_colors just return the colors of ecto- and endodermal cell types 51 | endo_colors = endo_ct_colors() 52 | ecto_colors = ectoderm_ct_colors() 53 | included_colors = c(endo_colors,ecto_colors) 54 | control_chim_timing(tag = "endo_ecto",included_colors = included_colors) 55 | 56 | # Embryonic mesoderm 57 | included_colors = embryonic_meso_ct_colors() 58 | control_chim_timing(tag = "emb_meso",included_colors = included_colors) 59 | 60 | } 61 | 62 | foxc_tetraploid_generate_time_and_cell_type_annotation = function() { 63 | 64 | if(!dir.exists("data/chimera_tetraploid_analysis")) { 65 | dir.create("data/chimera_tetraploid_analysis") 66 | } 67 | # generate single-cell graph 68 | gen_cgraph(mat_nm = "foxc_tetra_wt10") 69 | message("generated cgraph") 70 | transfer_color_chimera_tetraploid(mat_nm = "foxc_tetra_wt10",tag = "KO") 71 | 72 | foxc_tetra_timing() 73 | 74 | 75 | } 76 | 77 | control_tetraploid_generate_time_and_cell_type_annotation = function() { 78 | 79 | if(!dir.exists("data/chimera_tetraploid_analysis")) { 80 | dir.create("data/chimera_tetraploid_analysis") 81 | } 82 | # generate single-cell graph 83 | gen_cgraph(mat_nm = "control_tetra_wt10") 84 | message("generated cgraph") 85 | transfer_color_chimera_tetraploid(mat_nm = "control_tetra_wt10",tag = "control") 86 | 87 | control_tetra_timing() 88 | 89 | 90 | } 91 | 92 | 93 | gen_cgraph = function(mat_nm,T_vm = 0.1,Knn = 100) { 94 | 95 | bad_genes = read.table(file = "data/external_data/sing_emb_wt10.bad_genes.txt",sep = "\t",stringsAsFactors = F)$x 96 | bad_genes = c(bad_genes,c("Igf2","AK145379;H19","Polg","Slc25a4","Peg10","Igf2as","AK086477;Sh3glb1")) 97 | 98 | mcell_add_gene_stat(mat_nm, mat_nm, force=T) 99 | 100 | mcell_gset_filter_varmean(mat_nm, mat_nm, T_vm=T_vm, force_new=T) 101 | mcell_gset_filter_cov(mat_nm, mat_nm, T_tot=50, T_top3=3) 102 | 103 | gset = scdb_gset(mat_nm) 104 | nms = names(gset@gene_set) 105 | #bad gene that will be removed from list of genes that helps to mark metacell 106 | bad_g = c(grep("^Rpl",nms,v=T),grep("^Gm",nms,v=T),grep("Rps",nms,v=T)) 107 | 108 | bad_g = c(bad_g, bad_genes) 109 | 110 | gset_f = gset_new_restrict_nms(gset=gset, bad_g, inverse=T, "feat filt") 111 | scdb_add_gset(mat_nm, gset_f) 112 | 113 | mcell_add_cgraph_from_mat_bknn(mat_id=mat_nm, 114 | gset_id = mat_nm, 115 | graph_id=mat_nm, 116 | K=Knn, 117 | dsamp=T) 118 | 119 | } 120 | 121 | 122 | foxc_tetra_timing = function() { 123 | 124 | mat_nm = "foxc_tetra_wt10" 125 | mc_id = "foxc_tetra_wt10_recolored" 126 | cgraph_id = mat_nm 127 | mat = scdb_mat(mat_nm) 128 | mc = scdb_mc(mc_id) 129 | mc_wt = scdb_mc("sing_emb_wt10_recolored") 130 | 131 | excluded_colors = c("#F6BFCB","#7F6874") 132 | 133 | load(sprintf("data/chimera_tetraploid_analysis/%s/color_annotation/cmp_annot.Rda",mat_nm)) 134 | 135 | f = mat@cell_metadata[colnames(mat@mat),"cell_type"] %in% c("KO") 136 | tetra_cls = colnames(mat@mat)[f] 137 | tetra_cls = intersect(tetra_cls,names(cmp_annot$query_cls_col)) 138 | 139 | excluded_cls = tetra_cls[cmp_annot$query_cls_col[tetra_cls] %in% excluded_colors] 140 | tetra_cls = setdiff(tetra_cls,excluded_cls) 141 | 142 | ko_embryos = unique(mat@cell_metadata[tetra_cls,"embryo"]) 143 | 144 | wt10_cls = intersect(names(mc_wt@mc)[ !(mc_wt@colors[mc_wt@mc] %in% excluded_colors) ],colnames(mat@mat)) 145 | atlas_time = mat@cell_metadata[wt10_cls,"transcriptional_rank"] 146 | names(atlas_time) = wt10_cls 147 | 148 | data_dir = sprintf("data/chimera_tetraploid_analysis/%s",mat_nm) 149 | if(!dir.exists(data_dir)) { 150 | dir.create(data_dir) 151 | } 152 | 153 | data_dir = sprintf("data/chimera_tetraploid_analysis/%s/time_match",mat_nm) 154 | if(!dir.exists(data_dir)) { 155 | dir.create(data_dir) 156 | } 157 | 158 | 159 | 160 | # first get atlas time distribution 161 | 162 | atlas_time_dist = get_atlas_time_dist(atlas_time = atlas_time,graph_id = cgraph_id) 163 | 164 | # timing using KO cells 165 | 166 | ko_cls = tetra_cls[( mat@cell_metadata[tetra_cls,"cell_type"] == "KO" ) & ( mat@cell_metadata[tetra_cls,"embryo"] %in% ko_embryos )] 167 | query_cls_md = mat@cell_metadata[ko_cls,"embryo"] 168 | names(query_cls_md) = ko_cls 169 | 170 | query_time_dist_ko = get_query_time_dist(query_cls_md = query_cls_md,atlas_time = atlas_time,graph_id = cgraph_id) 171 | 172 | 173 | time_dist_ko = list(atlas_time_dist = atlas_time_dist$atlas_time_dist, 174 | ko = query_time_dist_ko$query_time_dist) 175 | 176 | save(time_dist_ko,file = sprintf("%s/time_dist_ko.Rda",data_dir)) 177 | 178 | chim_emb_summary = as.data.frame.matrix(table(mat@cell_metadata[tetra_cls,"embryo"],mat@cell_metadata[tetra_cls,"cell_type"])) 179 | chim_emb_summary$embryo = rownames(chim_emb_summary) 180 | chim_emb_summary$best_rank_ko = NA 181 | chim_emb_summary[rownames(time_dist_ko$ko),"best_rank_ko"] = time_dist_best_match(atlas_time_dist = time_dist_ko$atlas_time_dist, 182 | query_time_dist = time_dist_ko$ko) 183 | 184 | write.table(chim_emb_summary,file = sprintf("%s/time_match_summary.txt",data_dir),sep ="\t",row.names = F) 185 | } 186 | 187 | 188 | control_tetra_timing = function() { 189 | 190 | mat_nm = "control_tetra_wt10" 191 | mc_id = "control_tetra_wt10_recolored" 192 | cgraph_id = mat_nm 193 | mat = scdb_mat(mat_nm) 194 | mc = scdb_mc(mc_id) 195 | mc_wt = scdb_mc("sing_emb_wt10_recolored") 196 | 197 | excluded_colors = c("#F6BFCB","#7F6874") 198 | 199 | load(sprintf("data/chimera_tetraploid_analysis/%s/color_annotation/cmp_annot.Rda",mat_nm)) 200 | 201 | f = mat@cell_metadata[colnames(mat@mat),"cell_type"] %in% c("control") 202 | tetra_cls = colnames(mat@mat)[f] 203 | tetra_cls = intersect(tetra_cls,names(cmp_annot$query_cls_col)) 204 | 205 | excluded_cls = tetra_cls[cmp_annot$query_cls_col[tetra_cls] %in% excluded_colors] 206 | tetra_cls = setdiff(tetra_cls,excluded_cls) 207 | 208 | control_embryos = unique(mat@cell_metadata[tetra_cls,"embryo"]) 209 | 210 | wt10_cls = intersect(names(mc_wt@mc)[ !(mc_wt@colors[mc_wt@mc] %in% excluded_colors) ],colnames(mat@mat)) 211 | atlas_time = mat@cell_metadata[wt10_cls,"transcriptional_rank"] 212 | names(atlas_time) = wt10_cls 213 | 214 | data_dir = sprintf("data/chimera_tetraploid_analysis/%s",mat_nm) 215 | if(!dir.exists(data_dir)) { 216 | dir.create(data_dir) 217 | } 218 | 219 | data_dir = sprintf("data/chimera_tetraploid_analysis/%s/time_match",mat_nm) 220 | if(!dir.exists(data_dir)) { 221 | dir.create(data_dir) 222 | } 223 | 224 | 225 | 226 | # first get atlas time distribution 227 | 228 | atlas_time_dist = get_atlas_time_dist(atlas_time = atlas_time,graph_id = cgraph_id) 229 | 230 | # timing using control cells 231 | 232 | control_cls = tetra_cls[( mat@cell_metadata[tetra_cls,"cell_type"] == "control" ) & ( mat@cell_metadata[tetra_cls,"embryo"] %in% control_embryos )] 233 | query_cls_md = mat@cell_metadata[control_cls,"embryo"] 234 | names(query_cls_md) = control_cls 235 | 236 | query_time_dist_control = get_query_time_dist(query_cls_md = query_cls_md,atlas_time = atlas_time,graph_id = cgraph_id) 237 | 238 | 239 | time_dist_control = list(atlas_time_dist = atlas_time_dist$atlas_time_dist, 240 | control = query_time_dist_control$query_time_dist) 241 | 242 | save(time_dist_control,file = sprintf("%s/time_dist_control.Rda",data_dir)) 243 | 244 | chim_emb_summary = as.data.frame.matrix(table(mat@cell_metadata[tetra_cls,"embryo"],mat@cell_metadata[tetra_cls,"cell_type"])) 245 | chim_emb_summary$embryo = rownames(chim_emb_summary) 246 | chim_emb_summary$best_rank_control = NA 247 | chim_emb_summary[rownames(time_dist_control$control),"best_rank_control"] = time_dist_best_match(atlas_time_dist = time_dist_control$atlas_time_dist, 248 | query_time_dist = time_dist_control$control) 249 | 250 | write.table(chim_emb_summary,file = sprintf("%s/time_match_summary.txt",data_dir),sep ="\t",row.names = F) 251 | } 252 | 253 | 254 | 255 | -------------------------------------------------------------------------------- /scripts/foxc12/preprocessing/foxc_control_gating.r: -------------------------------------------------------------------------------- 1 | library("dplyr") 2 | 3 | arsinh = function(x,loc = 0,w = 1) { 4 | 5 | return(log((x - loc)/w + sqrt(((x - loc)/w)^2 + 1))) 6 | } 7 | 8 | 9 | calc_a_b_from_x_y = function(x1,y1,x2,y2) { 10 | 11 | b = (y2 - y1)/(x2 - x1) 12 | a = (y1*x2 - x1*y2)/(x2 - x1) 13 | return(c(a,b)) 14 | } 15 | 16 | gating_of_foxc_control = function(fig_dir) { 17 | 18 | mat_nm = "foxc_control" 19 | mat = scdb_mat(mat_nm) 20 | 21 | if(!dir.exists(fig_dir)) { 22 | dir.create(fig_dir) 23 | } 24 | 25 | cls = mat@cell_metadata$cell[mat@cell_metadata$embryo != "empty"] 26 | 27 | cls_type = rep("empty",nrow(mat@cell_metadata)) 28 | names(cls_type) = rownames(mat@cell_metadata) 29 | 30 | cls_type[cls] = "unclear" 31 | 32 | # parameters needed for the transformation 33 | loc_x = -600 34 | w_x = 400 35 | 36 | gfp_tr = arsinh(mat@cell_metadata[cls,"gfp_a"], loc = loc_x,w = w_x) 37 | ssc_w = mat@cell_metadata[cls,"ssc_w"] 38 | fsc_a = mat@cell_metadata[cls,"fsc_a"] 39 | xlim = c(min(gfp_tr),max(gfp_tr)) 40 | ylim = c(min(fsc_a),max(fsc_a)) 41 | 42 | col_fsc_ssc = densCols(x = gfp_tr,y = fsc_a) 43 | 44 | names(gfp_tr) = cls 45 | names(ssc_w) = cls 46 | names(fsc_a) = cls 47 | 48 | pdf(sprintf("%s/gfp_vs_fsc_all_cls.pdf",fig_dir),useDingbats = F) 49 | plot(gfp_tr,fsc_a,pch = 19,cex = 0.4,ylab = "FSC-A",xlab = "GFP",main = "FoxC and control chimera and tetraploid cells", 50 | xlim = xlim,ylim = ylim) 51 | dev.off() 52 | 53 | # Gating of control chimera 54 | 55 | loc_x = -300 56 | w_x = 400 57 | 58 | a_l = -180000 59 | b_l = 150000 60 | a_h = -290000 61 | b_h = 150000 62 | 63 | 64 | cls_f = cls[mat@cell_metadata[cls,"Experiment_type"] %in% c("Control chimera")] 65 | 66 | ssc_w = mat@cell_metadata[cls_f,"ssc_w"] 67 | fsc_a = mat@cell_metadata[cls_f,"fsc_a"] 68 | gfp_tr = arsinh(mat@cell_metadata[cls_f,"gfp_a"], loc = loc_x,w = w_x) 69 | names(gfp_tr) = cls_f 70 | names(fsc_a) = cls_f 71 | xlim = c(min(gfp_tr),max(gfp_tr)) 72 | #ylim = c(min(fsc_a),max(fsc_a)) 73 | 74 | col_dens = densCols(gfp_tr,fsc_a) 75 | pdf(sprintf("%s/control_chimera_gfp_vs_fsc.pdf",fig_dir),useDingbats = F) 76 | plot(gfp_tr,fsc_a,pch = 19,cex = 0.4,ylab = "FSC-A",xlab = "gfp",col= col_dens,main = "Control chimera cells", 77 | xlim = xlim,ylim = ylim) 78 | 79 | abline(a = a_l,b = b_l,lty = "dashed") 80 | abline(a = a_h,b = b_h,lty = "dashed") 81 | dev.off() 82 | 83 | 84 | f_control = fsc_a < a_h + gfp_tr*b_h 85 | f_host = fsc_a > a_l + gfp_tr*b_l 86 | 87 | col_cls= rep("gray",length(cls_f)) 88 | col_cls[f_control] = "cornflowerblue" 89 | col_cls[f_host] = "black" 90 | 91 | pdf(sprintf("%s/control_chimera_gfp_vs_fsc_gated.pdf",fig_dir),useDingbats = F) 92 | plot(gfp_tr,fsc_a,pch = 19,cex = 0.4,ylab = "FSC-A",xlab = "GFP",main = "Control chimera cells",col = col_cls, 93 | xlim = xlim,ylim = ylim) 94 | 95 | abline(a = a_l,b = b_l,lty = "dashed") 96 | abline(a = a_h,b = b_h,lty = "dashed") 97 | 98 | legend(x = "topleft",legend = c("unclear","control","Host"),col = c("gray","cornflowerblue","black"),pch = 19) 99 | dev.off() 100 | 101 | 102 | cls_type[cls_f[f_control]] = "control" 103 | cls_type[cls_f[f_host]] = "host" 104 | 105 | embryos = unique(mat@cell_metadata[cls_f,"embryo"]) 106 | 107 | 108 | # Gating of control tetraploids 109 | 110 | loc_x = -300 111 | w_x = 400 112 | 113 | a_l = -180000 114 | b_l = 150000 115 | a_h = -290000 116 | b_h = 150000 117 | 118 | 119 | cls_f = cls[mat@cell_metadata[cls,"Experiment_type"] %in% c("Control tetraploid")] 120 | 121 | ssc_w = mat@cell_metadata[cls_f,"ssc_w"] 122 | fsc_a = mat@cell_metadata[cls_f,"fsc_a"] 123 | gfp_tr = arsinh(mat@cell_metadata[cls_f,"gfp_a"], loc = loc_x,w = w_x) 124 | names(gfp_tr) = cls_f 125 | names(fsc_a) = cls_f 126 | xlim = c(min(gfp_tr),max(gfp_tr)) 127 | #ylim = c(min(fsc_a),max(fsc_a)) 128 | 129 | col_dens = densCols(gfp_tr,fsc_a) 130 | pdf(sprintf("%s/control_tetra_gfp_vs_fsc.pdf",fig_dir),useDingbats = F) 131 | plot(gfp_tr,fsc_a,pch = 19,cex = 0.4,ylab = "FSC-A",xlab = "gfp",col= col_dens,main = "Control tetraploid cells", 132 | xlim = xlim,ylim = ylim) 133 | 134 | abline(a = a_l,b = b_l,lty = "dashed") 135 | abline(a = a_h,b = b_h,lty = "dashed") 136 | dev.off() 137 | 138 | 139 | f_control = fsc_a < a_h + gfp_tr*b_h 140 | f_host = fsc_a > a_l + gfp_tr*b_l 141 | 142 | col_cls= rep("gray",length(cls_f)) 143 | col_cls[f_control] = "cornflowerblue" 144 | col_cls[f_host] = "black" 145 | 146 | pdf(sprintf("%s/control_tetra_gfp_vs_fsc_gated.pdf",fig_dir),useDingbats = F) 147 | plot(gfp_tr,fsc_a,pch = 19,cex = 0.4,ylab = "FSC-A",xlab = "GFP",main = "Control tetraploid cells",col = col_cls, 148 | xlim = xlim,ylim = ylim) 149 | 150 | abline(a = a_l,b = b_l,lty = "dashed") 151 | abline(a = a_h,b = b_h,lty = "dashed") 152 | 153 | legend(x = "topleft",legend = c("unclear","control","Host"),col = c("gray","cornflowerblue","black"),pch = 19) 154 | dev.off() 155 | 156 | 157 | cls_type[cls_f[f_control]] = "control" 158 | cls_type[cls_f[f_host]] = "host" 159 | 160 | embryos = unique(mat@cell_metadata[cls_f,"embryo"]) 161 | 162 | 163 | # Gating of Foxc chimera 164 | 165 | loc_x = 0 166 | w_x = 50 167 | 168 | a_l = -60000 169 | b_l = 60000 170 | a_h = -110000 171 | b_h = 60000 172 | 173 | 174 | cls_f = cls[mat@cell_metadata[cls,"Experiment_type"] %in% c("Foxc chimera")] 175 | 176 | fsc_a = mat@cell_metadata[cls_f,"fsc_a"] 177 | ssc_w = mat@cell_metadata[cls_f,"ssc_w"] 178 | gfp_tr = arsinh(mat@cell_metadata[cls_f,"gfp_a"], loc = loc_x,w = w_x) 179 | names(gfp_tr) = cls_f 180 | names(fsc_a) = cls_f 181 | xlim = c(min(gfp_tr),max(gfp_tr)) 182 | #ylim = c(min(fsc_a),max(fsc_a)) 183 | 184 | col_dens = densCols(gfp_tr,fsc_a) 185 | pdf(sprintf("%s/foxc_chimera_gfp_vs_fsc.pdf",fig_dir),useDingbats = F) 186 | plot(gfp_tr,fsc_a,pch = 19,cex = 0.4,ylab = "FSC-A",xlab = "GFP",col= col_dens,main = "FoxC chimera cells", 187 | xlim = xlim,ylim = ylim) 188 | 189 | abline(a = a_l,b = b_l,lty = "dashed") 190 | abline(a = a_h,b = b_h,lty = "dashed") 191 | dev.off() 192 | 193 | 194 | f_ko = fsc_a < a_h + gfp_tr*b_h 195 | f_host = fsc_a > a_l + gfp_tr*b_l 196 | 197 | col_cls= rep("gray",length(cls_f)) 198 | col_cls[f_ko] = "#83c26d" 199 | col_cls[f_host] = "black" 200 | 201 | pdf(sprintf("%s/foxc_chimera_gfp_vs_fsc_gated.pdf",fig_dir),useDingbats = F) 202 | plot(gfp_tr,fsc_a,pch = 19,cex = 0.4,ylab = "FSC-A",xlab = "GFP",main = "FoxC chimera cells",col = col_cls, 203 | xlim = xlim,ylim = ylim) 204 | 205 | abline(a = a_l,b = b_l,lty = "dashed") 206 | abline(a = a_h,b = b_h,lty = "dashed") 207 | 208 | legend(x = "topleft",legend = c("unclear","KO","Host"),col = c("gray","#83c26d","black"),pch = 19) 209 | dev.off() 210 | 211 | 212 | cls_type[cls_f[f_ko]] = "KO" 213 | cls_type[cls_f[f_host]] = "host" 214 | 215 | embryos = unique(mat@cell_metadata[cls_f,"embryo"]) 216 | 217 | 218 | 219 | # Gating of Foxc tetraploid 220 | 221 | loc_x = -700 222 | w_x = 400 223 | 224 | a_l = -200000 225 | b_l = 100000 226 | a_h = -260000 227 | b_h = 100000 228 | 229 | cls_f = cls[mat@cell_metadata[cls,"Experiment_type"] %in% c("Foxc tetraploid")] 230 | 231 | fsc_a = mat@cell_metadata[cls_f,"fsc_a"] 232 | ssc_w = mat@cell_metadata[cls_f,"ssc_w"] 233 | gfp_tr = arsinh(mat@cell_metadata[cls_f,"gfp_a"], loc = loc_x,w = w_x) 234 | names(gfp_tr) = cls_f 235 | names(fsc_a) = cls_f 236 | xlim = c(min(gfp_tr),max(gfp_tr)) 237 | #ylim = c(min(fsc_a),max(fsc_a)) 238 | 239 | col_dens = densCols(gfp_tr,fsc_a) 240 | pdf(sprintf("%s/foxc_tetraploid_gfp_vs_fsc.pdf",fig_dir),useDingbats = F) 241 | plot(gfp_tr,fsc_a,pch = 19,cex = 0.4,ylab = "FSC-A",xlab = "gfp",col= col_dens,main = "FoxC tetraploid cells", 242 | xlim = xlim,ylim = ylim) 243 | 244 | abline(a = a_l,b = b_l,lty = "dashed") 245 | abline(a = a_h,b = b_h,lty = "dashed") 246 | dev.off() 247 | 248 | 249 | f_ko = fsc_a < a_h + gfp_tr*b_h 250 | f_host = fsc_a > a_l + gfp_tr*b_l 251 | 252 | col_cls= rep("gray",length(cls_f)) 253 | col_cls[f_ko] = "#83c26d" 254 | col_cls[f_host] = "black" 255 | 256 | pdf(sprintf("%s/foxc_tetraploid_gfp_vs_fsc_gated.pdf",fig_dir),useDingbats = F) 257 | plot(gfp_tr,fsc_a,pch = 19,cex = 0.4,ylab = "FSC-A",xlab = "GFP",main = "FoxC tetraploid cells",col = col_cls, 258 | xlim = xlim,ylim = ylim) 259 | 260 | abline(a = a_l,b = b_l,lty = "dashed") 261 | abline(a = a_h,b = b_h,lty = "dashed") 262 | 263 | legend(x = "topleft",legend = c("unclear","KO","Host"),col = c("gray","#83c26d","black"),pch = 19) 264 | dev.off() 265 | 266 | 267 | cls_type[cls_f[f_ko]] = "KO" 268 | cls_type[cls_f[f_host]] = "host" 269 | 270 | embryos = unique(mat@cell_metadata[cls_f,"embryo"]) 271 | 272 | 273 | df_gating = data.frame(cell = rownames(mat@cell_metadata),cell_type = cls_type,stringsAsFactors = F) 274 | 275 | if(0) { 276 | 277 | df_gating = data.frame(cell = rownames(mat@cell_metadata),cell_type = cls_type,stringsAsFactors = F) 278 | mat = scdb_mat(mat_nm) 279 | md = mat@cell_metadata 280 | md = left_join(md,df_gating,by = "cell") 281 | rownames(md) = md$cell 282 | mat@cell_metadata = md 283 | scdb_add_mat(id = mat_nm,mat = mat) 284 | } 285 | 286 | return(df_gating) 287 | } 288 | 289 | 290 | -------------------------------------------------------------------------------- /scripts/foxc12/preprocessing/foxc_control_remove_exe_ectoderm_and_parietal_endo_cls.r: -------------------------------------------------------------------------------- 1 | # removes extraembryonic ectoderm and parietal endoderm cells 2 | 3 | if(0) { 4 | mat_nm = "foxc_control" 5 | 6 | mc_wt = scdb_mc("sing_emb_wt9_bs500f") 7 | 8 | mat_query = scdb_mat(mat_nm) 9 | scdb_add_mat(paste0(mat_nm,"_w_exe_ecto"),mat_query) 10 | gset = scdb_gset("sing_emb_wt9") 11 | feat_genes = names(gset@gene_set) 12 | 13 | egc_type = t(tgs_matrix_tapply(mc_wt@e_gc[feat_genes,],mc_wt@colors,mean)) 14 | egc_type = egc_type[,colnames(egc_type) != "gray"] 15 | rownames(egc_type) = feat_genes 16 | 17 | legc = log2(egc_type + 1e-5) 18 | 19 | query_ref_cor = tgs_cor(as.matrix(mat_query@mat[feat_genes,]),egc_type) 20 | 21 | best_ref_type = colnames(query_ref_cor)[apply(query_ref_cor,1,which.max)] 22 | 23 | f_exe_ecto_parietal_endo = best_ref_type %in% c("#1A1A1A","#989898") 24 | 25 | cls_f = rownames(query_ref_cor)[f_exe_ecto_parietal_endo] 26 | 27 | mat_new = scm_ignore_cells(scmat = mat_query,ig_cells = union(cls_f,mat_query@ignore_cells)) 28 | 29 | scdb_add_mat(id = mat_nm,mat = mat_new) 30 | } 31 | -------------------------------------------------------------------------------- /scripts/foxc12/preprocessing/foxc_control_split_into_experiment_type.r: -------------------------------------------------------------------------------- 1 | source("scripts/foxc12/preprocessing/merge_umi_mat_with_wt10_umi_mat.r") 2 | 3 | if(0) { 4 | 5 | ignore_embryos = c("6D_m2e3","6D_m3e1","6D_m3e7") 6 | 7 | mat_nm = "foxc_control" 8 | mat = scdb_mat(mat_nm) 9 | 10 | 11 | df_nm = data.frame(type = c("Control chimera","Control tetraploid","Foxc chimera","Foxc tetraploid"), 12 | mat_nm = c("control_chim","control_tetra","foxc_chim","foxc_tetra"),stringsAsFactors = F) 13 | 14 | for (i in 1:4) { 15 | 16 | exp_nm = df_nm[i,"type"] 17 | new_mat_nm = df_nm[i,"mat_nm"] 18 | 19 | cls = colnames(mat@mat)[mat@cell_metadata[colnames(mat@mat),"Experiment_type"] == exp_nm] 20 | cls = cls[!(mat@cell_metadata[cls,"embryo"] %in% ignore_embryos)] 21 | 22 | mat_new = scm_ignore_cells(scmat = mat,ig_cells = cls,reverse = T) 23 | 24 | scdb_add_mat(id = new_mat_nm,mat = mat_new) 25 | 26 | merge_umi_mat_with_wt10(mat_nm = new_mat_nm,new_mat_nm = paste0(new_mat_nm,"_wt10")) 27 | } 28 | 29 | 30 | } -------------------------------------------------------------------------------- /scripts/foxc12/preprocessing/merge_umi_mat_with_wt10_umi_mat.r: -------------------------------------------------------------------------------- 1 | library("tidyverse") 2 | 3 | merge_umi_mat_with_wt10 = function(mat_nm,new_mat_nm) { 4 | mat_nm1 = mat_nm 5 | mat_nm2 = "sing_emb_wt10" 6 | 7 | new_mat_id = new_mat_nm 8 | 9 | mat1 = scdb_mat(mat_nm1) 10 | mat2 = scdb_mat(mat_nm2) 11 | 12 | mat1@cell_metadata$cell_type = as.character(mat1@cell_metadata$cell_type) 13 | 14 | ignored_cls = union(mat1@ignore_cells,mat2@ignore_cells) 15 | ignored_genes = union(mat1@ignore_genes,mat2@ignore_genes) 16 | 17 | mat_ls = c(mat1,mat2) 18 | 19 | mat_all = rbind(cbind(mat1@mat,mat1@ignore_cmat),cbind(mat1@ignore_gmat,mat1@ignore_gcmat)) 20 | md_all = mat1@cell_metadata 21 | 22 | 23 | i = 2 24 | mat = mat_ls[[i]] 25 | mat_tmp = rbind(cbind(mat@mat,mat@ignore_cmat),cbind(mat@ignore_gmat,mat@ignore_gcmat)) 26 | mat_tmp = mat_tmp[rownames(mat_all),] 27 | mat_all = cbind(mat_all,mat_tmp) 28 | md_all = bind_rows(md_all,mat@cell_metadata) 29 | 30 | rownames(md_all) = md_all$cell 31 | md_all[colnames(mat2@mat),"cell_type"] = "wt10" 32 | 33 | 34 | 35 | mat_new = scm_new_matrix(mat = mat_all, stat_type = "umi",cell_metadata = md_all) 36 | mat_new = scm_ignore_cells(scmat = mat_new,ig_cells = ignored_cls) 37 | mat_new = scm_ignore_genes(scmat = mat_new,ig_genes = ignored_genes) 38 | 39 | 40 | scdb_add_mat(id = new_mat_id,mat = mat_new) 41 | 42 | } 43 | -------------------------------------------------------------------------------- /scripts/foxc12/preprocessing/summary_preprocessing.r: -------------------------------------------------------------------------------- 1 | # summary of preprocessing of Foxc Control plates 2 | 3 | 4 | # 1. cells with #UMIs < 1000 were removed 5 | # 2. the following genes were removed from the single cell matrix 6 | # mat = scdb_mat(mat_name) 7 | # nms = c(rownames(mat@mat), rownames(mat@ignore_gmat)) 8 | # bad_genes = c(grep("^mt\\-", nms, v=T), "Neat1",grep("ERCC", nms,v=T), "Atpase6", "Xist", "Malat1", "Cytb","AK018753","AK140265","AK163440","DQ539915") 9 | # mcell_mat_ignore_genes(mat_name, mat_name,bad_genes,reverse=F) 10 | # mcell_mat_ignore_small_cells(mat_name, mat_name, 1000) 11 | 12 | # 3. duplicate cells (from FACS sorting) were removed 13 | # 4. cells from empty wells were removed (should anyway have less than 1000 UMIs) 14 | 15 | # 5. cells were gated based on FACS GFP fluorescent signal using the script foxc_control_gating.r 16 | # source("scripts/foxc12/preprocessing/foxc_control_gating.r") 17 | # gating_of_foxc_control("figs/paper_figs/fig_s6/fig_s6b") 18 | 19 | # 6. Extraembryonic ectoderm and parietal endoderm cells were removed from analysis 20 | # see code in scripts/foxc12/preprocessing/foxc_control_remove_exe_ectoderm_and_parietal_endo_cls.r 21 | 22 | # 7. Single cell matrix was split into four submatrices, one for Foxc/Control Chimera/Tetraploid 23 | # Each matrix was merged with the wildtype sing_emb_wt10 matrix 24 | # see scripts/foxc12/preprocessing/foxc_control_split_into_experiment_type.r -------------------------------------------------------------------------------- /scripts/foxc12/transfer_cell_type_annotation.r: -------------------------------------------------------------------------------- 1 | 2 | transfer_color_chimera_tetraploid = function(mat_nm,tag = "query") { 3 | 4 | cgraph_id = mat_nm 5 | mat = scdb_mat(mat_nm) 6 | mc_wt = scdb_mc("sing_emb_wt10_recolored") 7 | 8 | host_cls = colnames(mat@mat)[( mat@cell_metadata[colnames(mat@mat),"cell_type"] == "host" )] 9 | injected_cls = colnames(mat@mat)[( mat@cell_metadata[colnames(mat@mat),"cell_type"] %in% c("KO","control") )] 10 | wt10_cls = names(mc_wt@mc) 11 | 12 | 13 | query_cls = c(host_cls,injected_cls) 14 | query_cls_md = c(rep("host",length(host_cls)),rep(tag,length(injected_cls))) 15 | names(query_cls_md) = query_cls 16 | 17 | ref_cls_color = mc_wt@colors[mc_wt@mc[wt10_cls]] 18 | names(ref_cls_color) = wt10_cls 19 | 20 | cmp_annot = color_query_by_reference(cgraph_id = cgraph_id,query_cls_md = query_cls_md,ref_cls_color = ref_cls_color) 21 | 22 | data_dir = sprintf("data/chimera_tetraploid_analysis/%s",mat_nm) 23 | if(!dir.exists(data_dir)) { 24 | dir.create(data_dir) 25 | } 26 | data_dir = sprintf("data/chimera_tetraploid_analysis/%s/color_annotation",mat_nm) 27 | if(!dir.exists(data_dir)) { 28 | dir.create(data_dir) 29 | } 30 | print(data_dir) 31 | save(cmp_annot,file = sprintf("%s/cmp_annot.Rda",data_dir)) 32 | 33 | } 34 | 35 | 36 | 37 | 38 | color_query_by_reference = function(cgraph_id, 39 | query_cls_md, 40 | ref_cls_color, 41 | max_knn_cgraph = 50, 42 | knn_color_space = 1, 43 | color_query_cls_with_few_ref_neighbors = T, 44 | threshold_few_ref_neighbors = 0.5) { 45 | 46 | cgraph = scdb_cgraph(cgraph_id) 47 | 48 | query_cls_f = intersect(names(query_cls_md),cgraph@nodes) 49 | 50 | ref_cls = names(ref_cls_color) 51 | ref_cls_f = intersect(ref_cls,cgraph@nodes) 52 | all_cls = c(query_cls_f,ref_cls_f) 53 | 54 | cls1 = levels(cgraph@edges$mc1) 55 | cls2 = levels(cgraph@edges$mc2) 56 | 57 | # add color and type to 58 | cl_to_type = c(query_cls_md[query_cls_f],rep("ref",length(ref_cls_f))) 59 | names(cl_to_type) = all_cls 60 | 61 | type_cls1 = cl_to_type[cls1] 62 | names(type_cls1) = cls1 63 | type_cls2 = cl_to_type[cls2] 64 | names(type_cls2) = cls2 65 | 66 | color_cls1 = ref_cls_color[cls1] 67 | names(color_cls1) = cls1 68 | color_cls2 = ref_cls_color[cls2] 69 | names(color_cls2) = cls2 70 | 71 | cgraph@edges$type1 = type_cls1[cgraph@edges$mc1] 72 | cgraph@edges$type2 = type_cls2[cgraph@edges$mc2] 73 | cgraph@edges$color1 = color_cls1[cgraph@edges$mc1] 74 | cgraph@edges$color2 = color_cls2[cgraph@edges$mc2] 75 | 76 | 77 | # report on cells which have few reference neighbors 78 | cl_vs_type = table(cgraph@edges$mc1,cgraph@edges$type2) 79 | cl_vs_type = cl_vs_type[all_cls,] 80 | cls_with_few_neighbors = rownames(cl_vs_type)[rowSums(cl_vs_type) <10] 81 | cl_vs_type_n = cl_vs_type[rowSums(cl_vs_type) > 0,] 82 | cl_vs_type_n = cl_vs_type_n/rowSums(cl_vs_type_n) 83 | 84 | cls_with_low_fraction_of_ref_cell_neighbors = rownames(cl_vs_type_n)[1 - cl_vs_type_n[,"ref"] > threshold_few_ref_neighbors] 85 | 86 | if(!color_query_cls_with_few_ref_neighbors) { 87 | query_cls_f = setdiff(query_cls_f,cls_with_low_fraction_of_ref_cell_neighbors) 88 | } 89 | 90 | # color annotation of query_cls - restrict to max_knn_cgraph neighbors in the cgraph 91 | 92 | f = cgraph@edges$w > (1 - max_knn_cgraph/100) 93 | graph_f = cgraph@edges[f,] 94 | 95 | 96 | # calculate color contingency table for query vs ref and ref vs ref 97 | cl_vs_color = table(graph_f$mc1,graph_f$color2) 98 | cl_vs_color_ref = cl_vs_color[ref_cls_f,] 99 | cl_vs_color_query = cl_vs_color[query_cls_f,] 100 | cl_vs_color_ref = cl_vs_color_ref[rowSums(cl_vs_color_ref) > 0,] 101 | cl_vs_color_query = cl_vs_color_query[rowSums(cl_vs_color_query) > 0,] 102 | 103 | ref_ref_col_cor = tgs_cor_knn(x = t(cl_vs_color_ref),y = t(cl_vs_color_ref),knn = (knn_color_space + 1)) 104 | query_ref_col_cor = tgs_cor_knn(x = t(cl_vs_color_query),y = t(cl_vs_color_ref),knn = knn_color_space) 105 | 106 | 107 | sample_nn = sample(2:(knn_color_space + 1),size = nrow(cl_vs_color_ref),replace = T) 108 | sample_ind = c(0:(nrow(cl_vs_color_ref)-1))*(knn_color_space + 1) + sample_nn 109 | 110 | original_cl = ref_ref_col_cor$col1[sample_ind] 111 | ref_sampled_color = ref_cls_color[as.character(ref_ref_col_cor$col2[sample_ind])] 112 | ref_original_color = ref_cls_color[as.character(original_cl)] 113 | names(ref_original_color) = original_cl 114 | names(ref_sampled_color) = original_cl 115 | 116 | 117 | # sample color for query cls 118 | sample_nn = sample(1:knn_color_space,size = nrow(cl_vs_color_query),replace = T) 119 | sample_ind = c(0:(nrow(cl_vs_color_query)-1))*knn_color_space + sample_nn 120 | 121 | query_cls_col = ref_cls_color[as.character(query_ref_col_cor$col2[sample_ind])] 122 | names(query_cls_col) = as.character(query_ref_col_cor$col1[sample_ind]) 123 | 124 | cmp_col_annot = list(query_cls_col = query_cls_col, 125 | cls_with_low_fraction_of_ref_cell_neighbors = cls_with_low_fraction_of_ref_cell_neighbors, 126 | ref_cls_orig_color = ref_original_color, 127 | ref_cls_sampled_color = ref_sampled_color, 128 | cls_with_few_neighbors = cls_with_few_neighbors, 129 | max_knn_cgraph = max_knn_cgraph, 130 | knn_color_space = knn_color_space, 131 | query_cls_col_dist = cl_vs_color_query, 132 | ref_cls_col_dist = cl_vs_color_ref, 133 | cl_type_dist = cl_vs_type) 134 | 135 | return(cmp_col_annot) 136 | } 137 | 138 | 139 | 140 | 141 | 142 | 143 | 144 | -------------------------------------------------------------------------------- /scripts/foxc12/transfer_time_annotation.r: -------------------------------------------------------------------------------- 1 | library("Matrix") 2 | library("qlcMatrix") 3 | 4 | get_query_time_dist = function(query_cls_md,atlas_time,graph_id) { 5 | 6 | cgraph = scdb_cgraph(graph_id) 7 | 8 | query_cls = intersect(names(query_cls_md),cgraph@nodes) 9 | atlas_cls = intersect(names(atlas_time),cgraph@nodes) 10 | atlas_time = atlas_time[atlas_cls] 11 | query_cls_md = query_cls_md[query_cls] 12 | 13 | 14 | cell_names = c(1:length(cgraph@nodes)) 15 | names(cell_names) = cgraph@nodes 16 | 17 | graph = cgraph@edges 18 | graph$mc1 = as.factor(graph$mc1) 19 | graph$mc2 = as.factor(graph$mc2) 20 | levels(graph$mc1) = cell_names[levels(graph$mc1)] 21 | levels(graph$mc2) = cell_names[levels(graph$mc2)] 22 | 23 | # adjacency matrix 24 | knn_mat = sparseMatrix(as.numeric(graph$mc1),as.numeric(graph$mc2),x = graph$w) 25 | colnames(knn_mat) = cgraph@nodes 26 | rownames(knn_mat) = cgraph@nodes 27 | 28 | knn_mat_f = knn_mat[query_cls,atlas_cls] 29 | 30 | a = rowMax(X = knn_mat_f,which = T) 31 | time_match_ind = summary(a$which) 32 | 33 | query_time_match = atlas_time[time_match_ind$j] 34 | names(query_time_match) = query_cls[time_match_ind$i] 35 | 36 | query_time_dist = table(query_cls_md[time_match_ind$i],atlas_time[time_match_ind$j]) 37 | tmp = matrix(0,nrow = nrow(query_time_dist),ncol = length(unique(atlas_time))) 38 | rownames(tmp) = rownames(query_time_dist) 39 | colnames(tmp) = sort(unique(atlas_time)) 40 | tmp[rownames(query_time_dist),colnames(query_time_dist)] = query_time_dist 41 | query_time_dist = tmp 42 | 43 | 44 | return(list(query_time_dist = query_time_dist,query_time_match = query_time_match)) 45 | } 46 | 47 | get_atlas_time_dist = function(query_cls_md,atlas_time,graph_id) { 48 | 49 | 50 | # query_cls_md is a named vector with query_cls as names and the embryo annotation or sth similar as value 51 | # atlas_time is a named vector giving the atlas_time for each atlas cell 52 | 53 | cgraph = scdb_cgraph(graph_id) 54 | 55 | atlas_cls = intersect(names(atlas_time),cgraph@nodes) 56 | atlas_time = atlas_time[atlas_cls] 57 | 58 | 59 | cell_names = c(1:length(cgraph@nodes)) 60 | names(cell_names) = cgraph@nodes 61 | 62 | graph = cgraph@edges 63 | graph$mc1 = as.factor(graph$mc1) 64 | graph$mc2 = as.factor(graph$mc2) 65 | levels(graph$mc1) = cell_names[levels(graph$mc1)] 66 | levels(graph$mc2) = cell_names[levels(graph$mc2)] 67 | 68 | knn_mat = sparseMatrix(as.numeric(graph$mc1),as.numeric(graph$mc2),x = graph$w) 69 | colnames(knn_mat) = cgraph@nodes 70 | rownames(knn_mat) = cgraph@nodes 71 | 72 | knn_mat_f = knn_mat[atlas_cls,atlas_cls] 73 | for(i in unique(atlas_time)) { 74 | f = atlas_time == i 75 | knn_mat_f[f,f] = 0 76 | } 77 | 78 | a = rowMax(X = knn_mat_f,which = T) 79 | time_match_ind = summary(a$which) 80 | 81 | atlas_time_match = atlas_time[time_match_ind$j] 82 | names(atlas_time_match) = atlas_cls[time_match_ind$i] 83 | 84 | q_time_match_tmp = table(atlas_time[time_match_ind$i], atlas_time_match) 85 | q_time_match = matrix(0,nrow = length(unique(atlas_time)),ncol = length(unique(atlas_time))) 86 | rownames(q_time_match) = sort(unique(atlas_time)) 87 | colnames(q_time_match) = sort(unique(atlas_time)) 88 | q_time_match[rownames(q_time_match_tmp),colnames(q_time_match_tmp)] = q_time_match_tmp 89 | 90 | return(list(atlas_time_dist = q_time_match,atlas_time_match = atlas_time_match)) 91 | } 92 | 93 | get_query_and_atlas_time_dist = function(query_cls_md,atlas_time,graph_id) { 94 | 95 | # query_cls_md is a named vector with query_cls as names and the embryo annotation or sth similar as value 96 | # atlas_time is a named vector giving the atlas_time per for each atlas cell 97 | 98 | cgraph = scdb_cgraph(graph_id) 99 | 100 | query_cls = intersect(names(query_cls_md),cgraph@nodes) 101 | atlas_cls = intersect(names(atlas_time),cgraph@nodes) 102 | atlas_time = atlas_time[atlas_cls] 103 | query_cls_md = query_cls_md[query_cls] 104 | 105 | 106 | 107 | cell_names = c(1:length(cgraph@nodes)) 108 | names(cell_names) = cgraph@nodes 109 | 110 | graph = cgraph@edges 111 | graph$mc1 = as.factor(graph$mc1) 112 | graph$mc2 = as.factor(graph$mc2) 113 | levels(graph$mc1) = cell_names[levels(graph$mc1)] 114 | levels(graph$mc2) = cell_names[levels(graph$mc2)] 115 | 116 | knn_mat = sparseMatrix(as.numeric(graph$mc1),as.numeric(graph$mc2),x = graph$w) 117 | colnames(knn_mat) = cgraph@nodes 118 | rownames(knn_mat) = cgraph@nodes 119 | 120 | knn_mat_f = knn_mat[atlas_cls,atlas_cls] 121 | for(i in unique(atlas_time)) { 122 | f = atlas_time == i 123 | knn_mat_f[f,f] = 0 124 | } 125 | 126 | time_match = apply(knn_mat_f,1,which.max) 127 | time_match = atlas_time[time_match] 128 | 129 | q_time_match_tmp = table(atlas_time, time_match) 130 | q_time_match = matrix(0,nrow = length(unique(atlas_time)),ncol = length(unique(atlas_time))) 131 | rownames(q_time_match) = sort(unique(atlas_time)) 132 | colnames(q_time_match) = sort(unique(atlas_time)) 133 | q_time_match[rownames(q_time_match_tmp),colnames(q_time_match_tmp)] = q_time_match_tmp 134 | 135 | 136 | query_time_match = apply(knn_mat[query_cls,atlas_cls],1,which.max) 137 | query_time_dist = table(query_cls_md,atlas_time[query_time_match]) 138 | tmp = matrix(0,nrow = nrow(query_time_dist),ncol = length(unique(atlas_time))) 139 | rownames(tmp) = rownames(query_time_dist) 140 | colnames(tmp) = sort(unique(atlas_time)) 141 | tmp[rownames(query_time_dist),colnames(query_time_dist)] = query_time_dist 142 | query_time_dist = tmp 143 | 144 | 145 | return(list(atlas_time_dist = q_time_match,query_time_dist = query_time_dist)) 146 | } 147 | 148 | 149 | get_best_time_match = function(query_time_dist,atlas_time_dist) { 150 | 151 | query_time_dist_cum = apply(query_time_dist,1,cumsum) 152 | atlas_time_dist_cum = apply(atlas_time_dist,1,cumsum) 153 | 154 | query_to_atlas = tgs_cor(query_time_dist_cum,atlas_time_dist_cum) 155 | 156 | atlas_best_match = apply(query_to_atlas,1,which.max) 157 | return(query_to_atlas) 158 | } 159 | 160 | 161 | 162 | -------------------------------------------------------------------------------- /scripts/generate_mc_mgraph_network/annot_mc_by_flows.r: -------------------------------------------------------------------------------- 1 | 2 | cluster_metacells_by_flow = function(mct_id = "sing_emb_wt10",K = 65) { 3 | 4 | clst_flows = mctnetwork_clust_flows(mct_id, K) 5 | 6 | fclst = clst_flows$clust 7 | mc_ord = clst_flows$hc$order 8 | 9 | mc_cluster = data.frame(mc = mc_ord,cluster = fclst[mc_ord],mc_rank = c(1:length(mc_ord))) 10 | return(mc_cluster) 11 | } -------------------------------------------------------------------------------- /scripts/generate_mc_mgraph_network/gen_mc.r: -------------------------------------------------------------------------------- 1 | library("metacell") 2 | source("scripts/generate_mc_mgraph_network/generic_mc.r") 3 | 4 | generate_mc_wt10 = function() { 5 | 6 | tgconfig::override_params(config_file = "config/sing_emb.yaml",package = "metacell") 7 | 8 | scdb_init("scrna_db",force_reinit = T) 9 | scfigs_init("figs") 10 | 11 | # first iteration without out filtered genes 12 | # generate_mc(mat_nm, color_key=NA,recompute = T) 13 | # then remove genes from list of feature genes by filter bad gene modules 14 | # remove bad genes in second iteration 15 | bad_genes = read.table("data/external_data/sing_emb_wt10.bad_genes.txt",sep = "\t",stringsAsFactors = F) 16 | bad_genes = bad_genes[,1] 17 | 18 | mat_nm = "sing_emb_wt10" 19 | 20 | generate_mc(mat_nm, color_key=NA,add_bad_genes = bad_genes,recompute = T) 21 | 22 | } 23 | -------------------------------------------------------------------------------- /scripts/generate_mc_mgraph_network/gen_mc2d_umap.r: -------------------------------------------------------------------------------- 1 | library("umap") 2 | # generate mgraph and mc2d object 3 | 4 | gen_mc2d_umap_wt10 = function() { 5 | 6 | mc_id = "sing_emb_wt10_recolored" 7 | mgraph = scdb_mgraph("sing_emb_wt10_recolored_logist") 8 | mc = scdb_mc(mc_id) 9 | gset = scdb_gset("sing_emb_wt10") 10 | graph_id = "sing_emb_wt10" 11 | feat_genes = names(gset@gene_set) 12 | # next generate 2d projection using umap 13 | 14 | tgconfig::set_param(param = "mcell_mc2d_max_confu_deg",value = 4,package = "metacell") 15 | 16 | mc2d_id = "sing_emb_wt10_recolored_umap" 17 | symmetrize = F 18 | umap_mgraph = F 19 | 20 | uconf = umap.defaults 21 | #uconf$n_neighbors=6 22 | #uconf$min_dist=0.9 23 | uconf$n_neighbors=4 24 | uconf$min_dist =0.9 25 | uconf$bandwidth=1.3 26 | 27 | mc_xy = mc2d_comp_graph_coord_umap(mc, feat_genes, mgraph@mgraph, uconf, umap_mgraph) 28 | xy = mc2d_comp_cell_coord(mc_id = mc_id,graph_id = graph_id, mgraph = mgraph@mgraph, cl_xy = mc_xy, symmetrize=symmetrize) 29 | scdb_add_mc2d(mc2d_id, tgMC2D(mc_id, mc_xy$mc_x, mc_xy$mc_y, xy$x, xy$y, mgraph@mgraph)) 30 | 31 | mcell_mc2d_plot(mc2d_id) 32 | } 33 | 34 | 35 | 36 | 37 | -------------------------------------------------------------------------------- /scripts/generate_mc_mgraph_network/gen_mgraph.r: -------------------------------------------------------------------------------- 1 | 2 | 3 | generate_mgraph_wt10 = function() { 4 | tgconfig::override_params(config_file = "config/sing_emb_mgraph.yaml",package = "metacell") 5 | # generate mgraph and mc2d object 6 | 7 | # generate mgraph using logistic distance 8 | 9 | feat_gset = "sing_emb_wt10" 10 | mc_id = "sing_emb_wt10_recolored" 11 | #mgraph_id = paste0(mc_id,"_logist") 12 | mgraph_id = "sing_emb_wt10_recolored_logist" 13 | logist_loc = 1 14 | logist_scale = 0.2 15 | logist_eps = 4e-5 16 | max_d_fold = 3 17 | tgconfig::set_param(param = "mcell_mgraph_max_confu_deg",value = 4,package = "metacell") 18 | 19 | 20 | mc = scdb_mc(mc_id) 21 | gset = scdb_gset(feat_gset) 22 | feat_genes = names(gset@gene_set) 23 | 24 | mgraph = mgraph_comp_logist(mc = mc, genes = feat_genes, loc = logist_loc,scale = logist_scale, eps = logist_eps, max_d_fold = max_d_fold) 25 | 26 | scdb_add_mgraph(id = mgraph_id,mgraph = tgMCManifGraph(mc_id = mc_id,mgraph = mgraph)) 27 | 28 | } 29 | 30 | -------------------------------------------------------------------------------- /scripts/generate_mc_mgraph_network/gen_network.r: -------------------------------------------------------------------------------- 1 | 2 | library("metacell") 3 | scdb_init("scrna_db/", force_reinit=T) 4 | source("scripts/generate_paper_figures/plot_network.r") 5 | #tgconfig::override_params("config/sing_emb.yaml","metacell") 6 | 7 | 8 | build_sing_emb_wt10_network = function(net_id = "sing_emb_wt10") { 9 | 10 | mat_id = "sing_emb_wt10" 11 | mc_id = "sing_emb_wt10_recolored" 12 | mgraph_id = "sing_emb_wt10_recolored_logist" 13 | #net_id = "sing_emb_wt10" 14 | fig_dir = "figs/sing_emb_wt10.net" 15 | if(!dir.exists(fig_dir)) { 16 | dir.create(fig_dir) 17 | } 18 | 19 | mc = scdb_mc(mc_id) 20 | #capacity_var_factor = rep(0.25,ncol(mc@e_gc)) 21 | capacity_var_factor = rep(0.4,ncol(mc@e_gc)) 22 | 23 | # next define the mc_leak parameter 24 | 25 | mc_leak = get_mc_leak_parameter_endo(mc_id = mc_id,leak_emb_endo = 0.12,leak_exe_endo = 0.17) 26 | 27 | build_net(mat_id = mat_id, 28 | mc_id = mc_id, 29 | mgraph_id = mgraph_id, 30 | net_id = net_id, 31 | fig_dir = fig_dir, 32 | age_field = "age_group",mc_leak = mc_leak, 33 | capacity_var_factor = capacity_var_factor, 34 | k_norm_ext_cost = 1, 35 | k_ext_norm_cost = 1, 36 | k_ext_ext_cost = 1) 37 | 38 | 39 | 40 | } 41 | 42 | 43 | 44 | build_net = function(mat_id,mc_id,mgraph_id,net_id,fig_dir, 45 | age_field = "age_group", 46 | mc_leak = NULL, 47 | capacity_var_factor = NULL, 48 | t_exp = 1,T_cost = 1e+5, 49 | flow_tolerance = 0.01, 50 | network_color_ord = NULL, 51 | off_capacity_cost1 = 1, 52 | off_capacity_cost2 = 1000, 53 | k_norm_ext_cost = 2, 54 | k_ext_norm_cost = 2, 55 | k_ext_ext_cost = 100) { 56 | 57 | mat = scdb_mat(mat_id) 58 | mc = scdb_mc(mc_id) 59 | mgraph = scdb_mgraph(mgraph_id) 60 | md = mat@cell_metadata 61 | cell_time = md[,age_field] 62 | names(cell_time) = rownames(md) 63 | 64 | if(is.null(mc_leak)) { 65 | leak = rep(0,max(mc@mc)) 66 | } 67 | if(is.null(capacity_var_factor)) { 68 | capacity_var_factor = rep(0.4,max(mc@mc)) 69 | } 70 | 71 | if(is.null(mc_leak)) { 72 | f_extra = mc@colors == "#F6BFCB" | mc@colors == "#7F6874" 73 | f_endo = mc@colors == "#0F4A9C" | mc@colors == "#EF5A9D" | mc@colors == "#F397C0" | mc@colors == "#c19f70" 74 | f_pgc = mc@colors == "#FACB12" 75 | leak = rep(0, max(mc@mc)) 76 | leak[f_extra] = 0.17 77 | leak[f_endo] = 0.12 78 | } else { 79 | leak = mc_leak 80 | } 81 | 82 | 83 | mcell_new_mctnetwork(net_id = net_id, 84 | mc_id = mc_id, 85 | mgraph_id = mgraph_id, 86 | cell_time = cell_time) 87 | mct = scdb_mctnetwork(net_id) 88 | 89 | #computing manifold costs (based on mgraph distances 90 | mct = mctnetwork_comp_manifold_costs(mct,t_exp=t_exp, T_cost=T_cost) 91 | message("computed manifold costs") 92 | 93 | #generating network structure 94 | mct = mctnetwork_gen_network(mct, mc_leak = leak,capacity_var_factor = capacity_var_factor, 95 | k_norm_ext_cost = k_norm_ext_cost,k_ext_norm_cost = k_ext_norm_cost,k_ext_ext_cost = k_ext_ext_cost, 96 | off_capacity_cost1 = off_capacity_cost1,off_capacity_cost2 = off_capacity_cost2) 97 | message("generated network") 98 | #solving the flow problem 99 | mct = mctnetwork_gen_mincost_flows(mct, flow_tolerance = flow_tolerance) 100 | message("solved network flow problem") 101 | 102 | #compute propagatation forward and background 103 | mct = mctnetwork_comp_propagation(mct) 104 | 105 | #adding back the object with the network and flows 106 | scdb_add_mctnetwork(net_id, mct) 107 | 108 | mct = scdb_mctnetwork(net_id) 109 | 110 | #to plot the "big" network diagram 111 | if(is.null(network_color_ord)) { 112 | network_color_ord = mc@color_key$color 113 | } 114 | 115 | 116 | if(!dir.exists(fig_dir)) { 117 | dir.create(fig_dir) 118 | } 119 | 120 | mm_mctnetwork_plot_net(mct_id = net_id, 121 | fn = sprintf("%s/%s_net.png",fig_dir,net_id),w = 3500, h = 4900, 122 | dx_back = 0, 123 | colors_ordered=network_color_ord,plot_pdf = F, 124 | show_axes = F, 125 | show_over_under_flow = F,mc_cex = 1,max_lwd = 15,edge_w_scale = 2e-4, 126 | plot_mc_ids = F) 127 | 128 | 129 | message("plotted the network") 130 | } 131 | 132 | 133 | 134 | get_mc_leak_parameter_endo = function(mc_id,leak_emb_endo,leak_exe_endo) { 135 | 136 | mc = scdb_mc(mc_id) 137 | 138 | legc = log2(mc@e_gc + 1e-5) 139 | 140 | mc_leak = rep(0,ncol(legc)) 141 | 142 | # first separation embryonic endoderm (including node/notochord) from meso/-ectoderm 143 | x1 = -16 144 | y1 = -12 145 | x2 = -12 146 | y2 = -16 147 | 148 | b_emb_endo = (y2 - y1)/(x2 - x1) 149 | a_emb_endo = (y1*x2 - y2*x1)/(x2 - x1) 150 | 151 | f_endo = (legc["Foxa1",] > a_emb_endo + b_emb_endo*legc["Foxa2",]) 152 | 153 | mc_leak[f_endo] = leak_emb_endo 154 | 155 | # second separation extraembryonic from embryonic endoderm uses Apoe 156 | x1 = -8.4 157 | y1 = -14 158 | x2 = -11 159 | y2 = -8.4 160 | 161 | b_exe_endo = (y2 - y1)/(x2 - x1) 162 | a_exe_endo = (y1*x2 - y2*x1)/(x2 - x1) 163 | 164 | f_exe = (legc["Ttr",] > a_exe_endo + b_exe_endo*legc["Apoe",]) 165 | 166 | mc_leak[f_exe] = leak_exe_endo 167 | 168 | return(mc_leak) 169 | } 170 | 171 | 172 | 173 | 174 | 175 | 176 | 177 | -------------------------------------------------------------------------------- /scripts/generate_mc_mgraph_network/generic_mc.r: -------------------------------------------------------------------------------- 1 | 2 | 3 | generate_mc = function(name, 4 | color_key, 5 | recompute=F, Knn = 100, Knn_core=30, 6 | min_mc_sz=20, 7 | T_vm = 0.1, 8 | add_bad_genes = NULL) 9 | { 10 | mat_nm = name 11 | mc_nm = sprintf("%s_bs500",name) 12 | mcf_nm = sprintf("%s_bs500f", name) 13 | coc_nm = sprintf("%s_500",name) 14 | 15 | db_fn = .scdb_base 16 | if(recompute | !file.exists(sprintf("%s/mc.%s.Rda", db_fn, mcf_nm))) { 17 | mcell_add_gene_stat(mat_nm, mat_nm, force=T) 18 | 19 | mcell_gset_filter_varmean(mat_nm, mat_nm, T_vm=T_vm, force_new=T) 20 | mcell_gset_filter_cov(mat_nm, mat_nm, T_tot=50, T_top3=3) 21 | 22 | gset = scdb_gset(mat_nm) 23 | nms = names(gset@gene_set) 24 | #bad gene that will be removed from list of genes that helps to mark metacell 25 | bad_g = c(grep("^Rpl",nms,v=T),grep("^Gm",nms,v=T),grep("Rps",nms,v=T)) 26 | if(!is.null(add_bad_genes)) { 27 | bad_g = c(bad_g, add_bad_genes) 28 | } 29 | gset_f = gset_new_restrict_nms(gset=gset, bad_g, inverse=T, "feat filt") 30 | scdb_add_gset(mat_nm, gset_f) 31 | 32 | mcell_add_cgraph_from_mat_bknn(mat_id=mat_nm, 33 | gset_id = mat_nm, 34 | graph_id=mat_nm, 35 | K=Knn, 36 | dsamp=T) 37 | 38 | mcell_coclust_from_graph_resamp(coc_id = coc_nm, graph_id = mat_nm,min_mc_size=15, p_resamp=0.75, n_resamp=500) 39 | 40 | mcell_mc_from_coclust_balanced(mc_nm, coc_nm, mat_nm, K=Knn_core, min_mc_size=min_mc_sz, alpha=2) 41 | 42 | mcell_plot_outlier_heatmap(mc_nm, mat_nm, 3) 43 | mcell_mc_split_filt(new_mc_id=mcf_nm, mc_nm, mat_nm,T_lfc=3, plot_mats=F) 44 | } 45 | 46 | # marks_colors = read.table("/home/zoharmuk/proj/EarlyDevMeth/config/e7_mc_colorize.txt", sep="\t", h=T, stringsAsFactors=F) 47 | if(!is.na(color_key)) { 48 | marks_colors = read.table(color_key, sep="\t", h=T, stringsAsFactors=F) 49 | mc_colorize(mc_nm, marker_colors=marks_colors) 50 | mc_colorize(mcf_nm, marker_colors=marks_colors) 51 | } 52 | 53 | mcell_gset_from_mc_markers(gset_id=mc_nm, mc_id=mc_nm) 54 | #list of genes that help to simply color the metacells and to lable them at the bottom of the heatmap 55 | mcell_mc_plot_marks(mc_id=mc_nm, gset_id=mc_nm, mat_id=mat_nm) 56 | 57 | mcell_mc2d_force_knn(mc_nm,mc_nm, mat_nm) 58 | mcell_mc2d_plot(mc2d_id=mc_nm) 59 | 60 | mcell_gset_from_mc_markers(gset_id=mcf_nm, mc_id=mcf_nm) 61 | 62 | mcell_mc_plot_marks(mc_id=mcf_nm, gset_id=mcf_nm, mat_id=mat_nm) 63 | 64 | mcell_mc2d_force_knn(mcf_nm,mcf_nm, mat_nm) 65 | mcell_mc2d_plot(mc2d_id=mcf_nm) 66 | 67 | mcell_mc_export_tab(mc_id=mcf_nm,mat_id=mat_nm, gstat_id=mat_nm) 68 | 69 | #mcell_mc_plot_by_factor(mcf_nm, "batch_set_id", mat_id=mat_nm) 70 | #add and erase from here## 71 | mcell_mc2d_plot_by_factor(mcf_nm, mat_nm, "batch_set_id", single_plot = T) 72 | } 73 | 74 | -------------------------------------------------------------------------------- /scripts/generate_paper_figures/fig_2.r: -------------------------------------------------------------------------------- 1 | source("scripts/generate_paper_figures/plot_network.r") 2 | 3 | gen_fig_2_plots = function() { 4 | tgconfig::override_params("config/sing_emb.yaml","metacell") 5 | 6 | if(!dir.exists("figs/paper_figs/fig2")) { 7 | dir.create("figs/paper_figs/fig2") 8 | } 9 | 10 | fig_2a() 11 | fig_2b() 12 | fig_2c() 13 | fig_2d() 14 | fig_2d_legend() 15 | fig_2e() 16 | 17 | } 18 | 19 | 20 | 21 | fig_2a = function(plot_pdf = F) { 22 | 23 | 24 | edge_w = 1 25 | 26 | mc = scdb_mc("sing_emb_wt10_recolored") 27 | mc2d = scdb_mc2d("sing_emb_wt10_recolored") 28 | 29 | if(plot_pdf) { 30 | 31 | fn= "figs/paper_figs/fig2/fig_2a.pdf" 32 | 33 | res = 72 34 | mcp_2d_height = 1000 35 | mcp_2d_width = 1000 36 | mcp_2d_cex = 0.8 37 | 38 | pdf(file = fn,width = mcp_2d_width/res,height = mcp_2d_height/res,useDingbats = F) 39 | cols = mc@colors 40 | cols[is.na(cols)] = "gray" 41 | plot(mc2d@sc_x, mc2d@sc_y, pch=19, col=cols[mc@mc[names(mc2d@sc_x)]],axes = F,xlab = "",ylab = "") 42 | fr = mc2d@graph$mc1 43 | to = mc2d@graph$mc2 44 | 45 | dx = mc2d@mc_x[fr]-mc2d@mc_x[to] 46 | dy = mc2d@mc_y[fr]-mc2d@mc_y[to] 47 | segments(mc2d@mc_x[fr], mc2d@mc_y[fr], mc2d@mc_x[to], mc2d@mc_y[to], 48 | lwd=edge_w) 49 | 50 | points(mc2d@mc_x, mc2d@mc_y, cex= 3*mcp_2d_cex, col="black", pch=21, bg=cols) 51 | text(mc2d@mc_x, mc2d@mc_y, 1:length(mc2d@mc_x), cex=mcp_2d_cex) 52 | 53 | dev.off() 54 | } else { 55 | 56 | fn= "figs/paper_figs/fig2/fig_2a.png" 57 | 58 | scl = 3 59 | mcp_2d_height = 1000 60 | mcp_2d_width = 1000 61 | mcp_2d_cex = 0.8 62 | 63 | png(file = fn,width = mcp_2d_width*scl,height = mcp_2d_height*scl) 64 | cols = mc@colors 65 | cols[is.na(cols)] = "gray" 66 | plot(mc2d@sc_x, mc2d@sc_y, pch=19, col=cols[mc@mc[names(mc2d@sc_x)]],axes = F,xlab = "",ylab = "",cex = scl) 67 | fr = mc2d@graph$mc1 68 | to = mc2d@graph$mc2 69 | 70 | dx = mc2d@mc_x[fr]-mc2d@mc_x[to] 71 | dy = mc2d@mc_y[fr]-mc2d@mc_y[to] 72 | segments(mc2d@mc_x[fr], mc2d@mc_y[fr], mc2d@mc_x[to], mc2d@mc_y[to], 73 | lwd=edge_w*scl) 74 | 75 | points(mc2d@mc_x, mc2d@mc_y, cex= 3*mcp_2d_cex*scl, col="black", pch=21, bg=cols,lwd = scl) 76 | text(mc2d@mc_x, mc2d@mc_y, 1:length(mc2d@mc_x), cex=mcp_2d_cex*scl) 77 | 78 | dev.off() 79 | 80 | 81 | } 82 | 83 | 84 | 85 | 86 | } 87 | 88 | 89 | fig_2b = function(plot_pdf = T) { 90 | 91 | mat = scdb_mat("sing_emb_wt10") 92 | mc = scdb_mc("sing_emb_wt10_recolored") 93 | emb_counts = read.table(file = "data/external_data/counted_embryos/wt10.cell_counts.txt",sep = "\t",stringsAsFactors = F) 94 | emb_time = unique(mat@cell_metadata[names(mc@mc),c("embryo","developmental_time")]) 95 | 96 | emb_to_time = emb_time$developmental_time 97 | names(emb_to_time) = emb_time$embryo 98 | fit_count = lm(log2(emb_counts$cell_count) ~ emb_to_time[emb_counts$embryo]) 99 | 100 | y_fit = 2^fit_count$fitted.values 101 | 102 | if(plot_pdf) { 103 | pdf(file = "figs/paper_figs/fig2/fig_2b.pdf",useDingbats = F) 104 | } else { 105 | png(file = "figs/paper_figs/fig2/fig_2b.png") 106 | } 107 | plot(emb_to_time[emb_counts$embryo],emb_counts$cell_count,pch = 19,log = "y",cex = 2,xlab = "Time",ylab = "Number of cells") 108 | lines(x = emb_to_time[emb_counts$embryo],y = y_fit) 109 | dev.off() 110 | 111 | } 112 | 113 | 114 | fig_2c = function(plot_pdf = FALSE) { 115 | 116 | mat_id = "sing_emb_wt10" 117 | mc_id = "sing_emb_wt10_recolored" 118 | dir_name = "figs/paper_figs/fig2" 119 | m_0 = 0.01 120 | s_0 = 0.005 121 | 122 | m = scdb_mat(mat_id) 123 | mc = scdb_mc(mc_id) 124 | mc2d = scdb_mc2d("sing_emb_wt10_recolored") 125 | 126 | col_to_rank = c(1:nrow(mc@color_key)) 127 | names(col_to_rank) = mc@color_key$color 128 | 129 | if(!dir.exists(dir_name)) { 130 | dir.create(dir_name) 131 | } 132 | 133 | m_genes = c("Mki67","Cenpf","Top2a","Smc4;SMC4","Ube2c","Ccnb1","Cdk1","Arl6ip1","Ankrd11","Hmmr;IHABP","Cenpa;Cenp-a","Tpx2","Aurka","Kif4", "Kif2c","Bub1b","Ccna2", "Kif23","Kif20a","Sgol2","Smc2", "Kif11", "Cdca2","Incenp","Cenpe") 134 | s_genes = c("Pcna", "Rrm2", "Mcm5", "Mcm6", "Mcm4", "Ung", "Mcm7", "Mcm2","Uhrf1", "Orc6", "Tipin") # Npm1 135 | 136 | mc_mean_age = tapply(m@cell_metadata[names(mc@mc),"age_group"],mc@mc,mean) 137 | 138 | s_genes = intersect(rownames(mc@mc_fp), s_genes) 139 | m_genes = intersect(rownames(mc@mc_fp), m_genes) 140 | 141 | tot = colSums(m@mat) 142 | s_tot = colSums(m@mat[s_genes,]) 143 | m_tot = colSums(m@mat[m_genes,]) 144 | 145 | s_score = s_tot/tot 146 | m_score = m_tot/tot 147 | 148 | f = (m_score < m_0 * (1- s_score/s_0)) 149 | 150 | mc_cc_tab = table(mc@mc, f[names(mc@mc)]) 151 | mc_cc = 1+floor(99*mc_cc_tab[,2]/rowSums(mc_cc_tab)) 152 | 153 | 154 | # 2d projection and barplot 155 | shades = colorRampPalette(c("white","lightblue", "blue", "purple"))(100) 156 | 157 | if(plot_pdf) { 158 | pdf(sprintf("%s/fig_2c_2d_projection.pdf", dir_name), w=16, h=16) 159 | plot(mc2d@sc_x, mc2d@sc_y, pch=19, cex=0.8, col=ifelse(f[names(mc2d@sc_x)], "black", "lightgray"),axes = F,xlab = "",ylab = "") 160 | points(mc2d@mc_x, mc2d@mc_y, pch=21, cex=3, bg=shades[mc_cc]) 161 | dev.off() 162 | } else { 163 | png(sprintf("%s/fig_2c_2d_projection.png", dir_name), w=1600, h=1600) 164 | plot(mc2d@sc_x, mc2d@sc_y, pch=19, cex=1, col=ifelse(f[names(mc2d@sc_x)], "black", "lightgray"),axes= F,xlab = "",ylab = "") 165 | points(mc2d@mc_x, mc2d@mc_y, pch=21, cex=5, bg=shades[mc_cc]) 166 | dev.off() 167 | } 168 | 169 | # next the color legend 170 | 171 | shades = colorRampPalette(c("white","lightblue", "blue", "purple"))(101) 172 | 173 | vals = seq(0,1,length.out = 101) 174 | cols = shades 175 | show_vals_ind = c(1,51,101) 176 | 177 | 178 | pdf(file = "figs/paper_figs/fig2/fig_2c_color_legend.pdf",useDingbats = F) 179 | plot.new() 180 | plot.window(xlim=c(0,100), ylim=c(0, length(cols) + 3)) 181 | rect(7, 1:length(cols), 17, 1:length(cols) + 1, border=NA, col=cols) 182 | rect(7, 1, 17, length(cols)+1, col=NA, border = 'black') 183 | 184 | text(19, (1:length(cols))[show_vals_ind] + 0.5, labels=vals[show_vals_ind], pos=4) 185 | dev.off() 186 | 187 | # boxplot per cell type of the cell cycle score 188 | 189 | cc_score_all = m_score + s_score 190 | 191 | all_cls = intersect(colnames(m@mat),names(mc@mc)) 192 | 193 | mc_col_m = rep("gray",length(mc@colors)) 194 | f = mc@colors %in% c("#F6BFCB","#7F6874","#0F4A9C") 195 | mc_col_m[f] = mc@colors[f] 196 | 197 | col_order = c("gray","#F6BFCB","#7F6874","#0F4A9C") 198 | 199 | 200 | 201 | cc_score = split(cc_score_all[all_cls],f = mc_col_m[mc@mc[all_cls]]) 202 | cc_score = cc_score[col_order] 203 | 204 | pdf(sprintf("%s/fig_2c_boxplot.pdf",dir_name),useDingbats = F) 205 | boxplot(cc_score,pch = 19,cex = 0.5,col = names(cc_score),xaxt = 'n') 206 | dev.off() 207 | 208 | } 209 | 210 | fig_2d = function() { 211 | 212 | mc = scdb_mc("sing_emb_wt10_recolored") 213 | network_color_ord = mc@color_key$color 214 | 215 | net_id = "sing_emb_wt10" 216 | 217 | # this function is sourced from scripts/plot_network.r 218 | mm_mctnetwork_plot_net(mct_id = net_id,fn = "figs/paper_figs/fig2/fig_2d_flow_chart.png",w = 3500, h = 4900, 219 | dx_back = 0,colors_ordered=network_color_ord,plot_pdf = F,show_axes = F,show_over_under_flow = F,mc_cex = 1,max_lwd = 15,edge_w_scale = 2e-4, 220 | plot_mc_ids = F) 221 | } 222 | 223 | fig_2d_legend = function() { 224 | mc = scdb_mc("sing_emb_wt10_recolored") 225 | 226 | 227 | pdf("figs/paper_figs/fig2/fig_2d_cell_type_legend.pdf",h = 8,w = 4,useDingbats = F) 228 | plot.new() 229 | legend(x = "topleft",legend = mc@color_key$group,pch = 19, col = mc@color_key$color,pt.cex = 2,cex = 1,bty = 'n') 230 | dev.off() 231 | 232 | 233 | } 234 | 235 | fig_2e = function() { 236 | 237 | fig_dir = "figs/paper_figs/fig2" 238 | mc = scdb_mc("sing_emb_wt10_recolored") 239 | mat = scdb_mat("sing_emb_wt10") 240 | 241 | shades = colorRampPalette(RColorBrewer::brewer.pal(9,"BuPu"))(1000) 242 | 243 | network_color_ord = mc@color_key$color 244 | 245 | mc_rank = mctnetwork_mc_rank_from_color_ord(mct_id = "sing_emb_wt10",colors_ordered = network_color_ord) 246 | mc_ord = order(mc_rank) 247 | 248 | lfp = log2(mc@e_gc + 1e-5) 249 | 250 | marker_genes = c("Utf1","Pou3f1","Sox3","Irx3","Tcfap2a","Six3","Dppa3","Grsf1","Aldh1a2","Tbx6","Eomes","Snai1","Mesp1","Tcf15","Twist1","Smarcd3","Nkx2-5","Tagln","Hand1","Tbx4","Etv2", 251 | "Lmo2","Tal1","Gata1","Noto","Foxa2","Cer1","Sox17","Ttr") 252 | 253 | lfp = lfp- rowMeans(lfp) 254 | lfp = lfp[marker_genes,] 255 | lfp = lfp[,rev(mc_ord)] 256 | 257 | lfp = pmax(lfp,0) 258 | lfp = pmin(lfp,6) 259 | 260 | lfp_n = lfp/rowSums(lfp) 261 | 262 | cluster_gap_position = c(2,3,4,5,6,7,10,13,15,17,20,24,25,28) 263 | 264 | annotation_row = data.frame(ct = mc@colors) 265 | rownames(annotation_row) = c(1:length(mc@colors)) 266 | col_to_col = unique(mc@colors) 267 | names(col_to_col) = col_to_col 268 | annotation_color = list(ct = col_to_col) 269 | 270 | fn = sprintf("%s/fig_2e.pdf",fig_dir) 271 | 272 | pheatmap::pheatmap(mat = t(lfp),cluster_cols = F,cluster_rows = F,color = shades,filename = fn,show_rownames = F, 273 | gaps_col = cluster_gap_position, 274 | column_gap = unit(100,"mm"), 275 | annotation_row = annotation_row,annotation_colors = annotation_color, 276 | annotation_legend = F,annotation_names_row = F,w = 6.4, h = 10.4) 277 | 278 | 279 | 280 | 281 | } 282 | 283 | -------------------------------------------------------------------------------- /scripts/generate_paper_figures/fig_7.r: -------------------------------------------------------------------------------- 1 | # fig 7 plots 2 | source("scripts/generate_paper_figures/plot_3d_vein.r") 3 | 4 | gen_fig_7_plots = function() { 5 | 6 | if(!dir.exists("figs/paper_figs/fig7")) { 7 | dir.create("figs/paper_figs/fig7") 8 | } 9 | 10 | fig_7a() 11 | fig_7b() 12 | fig_7c() 13 | 14 | } 15 | 16 | 17 | fig_7a = function(plot_pdf = T) { 18 | 19 | fig_dir = "figs/paper_figs/fig7" 20 | if(!dir.exists(fig_dir)) { 21 | dir.create(fig_dir) 22 | } 23 | 24 | theiler_stage_order = c("7","8","9","10","10b","10c","11","11b","11c","11d","11e") 25 | 26 | mat = scdb_mat("sing_emb_wt10") 27 | mc = scdb_mc("sing_emb_wt10_recolored") 28 | 29 | emb_age_time = unique(mat@cell_metadata[names(mc@mc),c("embryo","Theiler_Stage","developmental_time","age_group")]) 30 | f = !is.na(emb_age_time$Theiler_Stage) 31 | emb_age_time = emb_age_time[f,] 32 | 33 | emb_age = split(emb_age_time$developmental_time,f = emb_age_time$Theiler_Stage) 34 | 35 | 36 | th_stage_median = round(tapply(emb_age_time$developmental_time,emb_age_time$Theiler_Stage,median),2) 37 | 38 | n_1 = (length(th_stage_median) + 1) %/% 2 39 | n_2 = length(th_stage_median) %/% 2 40 | odd_n = 2*c(1:n_1) - 1 41 | even_n = 2*c(1:n_2) 42 | 43 | if(plot_pdf) { 44 | pdf(sprintf("%s/fig_7a.pdf",fig_dir),w = 7,h = 5) 45 | } else { 46 | png(sprintf("%s/fig_7a.png",fig_dir),w = 1000,h = 500) 47 | } 48 | 49 | out = boxplot(emb_age[theiler_stage_order],horizontal = T,pch = 19,las = 2,xaxt = 'n',ylim = c(6.4,8.1)) 50 | #axis(side = 1,at = c(6.5,6.75,7,7.25,7.5,7.75,8),labels = c(6.5,6.75,7,7.25,7.5,7.75,8) 51 | axis(side = 1,at = th_stage_median[odd_n],labels = th_stage_median[odd_n],cex.axis = 0.5) 52 | axis(side = 1,at = th_stage_median[even_n],labels = th_stage_median[even_n],cex.axis = 0.5) 53 | dev.off() 54 | } 55 | 56 | 57 | fig_7b = function(plot_pdf = T) { 58 | 59 | fig_dir = "figs/paper_figs/fig7/fig_7b" 60 | 61 | w = 300 62 | h = 600 63 | 64 | mat_id = "sing_emb_wt10" 65 | mc_id = "sing_emb_wt10_recolored" 66 | 67 | mat = scdb_mat(mat_id) 68 | mc = scdb_mc(mc_id) 69 | 70 | col_to_rank = c(1:nrow(mc@color_key)) 71 | names(col_to_rank) = mc@color_key$color 72 | 73 | if(!dir.exists(fig_dir)) { 74 | dir.create(fig_dir) 75 | } 76 | 77 | excluded_colors = c("#7F6874","#F6BFCB") 78 | #excluded_colors = c("#7F6874") 79 | 80 | 81 | cls = names(mc@mc)[!(mc@colors[mc@mc] %in% excluded_colors)] 82 | 83 | ct_vs_emb = table(mc@colors[mc@mc[cls]],mat@cell_metadata[cls,"transcriptional_rank"]) 84 | ct_vs_emb = t(t(ct_vs_emb)/colSums(ct_vs_emb)) 85 | 86 | col_ord = order(col_to_rank[rownames(ct_vs_emb)]) 87 | 88 | ct_vs_emb = ct_vs_emb[col_ord,] 89 | 90 | emb_to_age_group = unique(mat@cell_metadata[names(mc@mc),c("age_group","transcriptional_rank")]) 91 | emb_to_age_group = emb_to_age_group[order(emb_to_age_group$transcriptional_rank),] 92 | 93 | 94 | for (age in 1:max(emb_to_age_group$age_group)) { 95 | f = emb_to_age_group$transcriptional_rank[emb_to_age_group$age_group == age] 96 | if(plot_pdf) { 97 | pdf(sprintf("%s/age_%d.pdf",fig_dir,age),w= w/100,h= h/100) 98 | } else { 99 | png(sprintf("%s/age_%d.png",fig_dir,age),w= w,h= h) 100 | } 101 | par(mar = c(0.1,0.1,0.1,0.5)) 102 | barplot(ct_vs_emb[,f], col = rownames(ct_vs_emb),las = 2,horiz = F,axes = F,axisnames = F) 103 | dev.off() 104 | 105 | } 106 | 107 | 108 | 109 | if(plot_pdf) { 110 | pdf(sprintf("%s/fig_7b.pdf",fig_dir),w= 13*w/100,h= h/100) 111 | } else { 112 | png(sprintf("%s/fig_7b.png",fig_dir),w= 13*w,h= h) 113 | } 114 | 115 | layout(matrix(c(1:13),nrow = 1,ncol = 13),w = rep(300,13),h = c(600)) 116 | for (age in 1:max(emb_to_age_group$age_group)) { 117 | f = emb_to_age_group$transcriptional_rank[emb_to_age_group$age_group == age] 118 | par(mar = c(0.1,0.1,0.1,0.3)) 119 | barplot(ct_vs_emb[,f], col = rownames(ct_vs_emb),las = 2,horiz = F,axes = F,axisnames = F) 120 | } 121 | dev.off() 122 | 123 | 124 | 125 | } 126 | 127 | 128 | 129 | 130 | fig_7c = function() { 131 | 132 | # next follows the vein plot 133 | vein_par = read.table("data/external_data/fig7_vein_parameters.txt",sep = "\t",stringsAsFactors = F) 134 | ordered_cols = vein_par$color 135 | 136 | col_persp = vein_par$level 137 | names(col_persp) = vein_par$color 138 | 139 | # the following function is contained in scripts/generate_paper_figures/fig7_vein_all.r 140 | plot_all_veins(ordered_cols = ordered_cols,fig_dir = "figs/paper_figs/fig7",plot_pdf = T,col_persp = col_persp,fn = "fig_7c") 141 | 142 | } 143 | 144 | if(0) { 145 | size_infered_time = function() { 146 | 147 | fig_dir = "figs/paper_figs/fig7/size infered time" 148 | 149 | theiler_stage_order = c("7","8","9","10","10b","10c","11","11b","11c","11d","11e") 150 | 151 | mat = scdb_mat("sing_emb_wt10") 152 | mc = scdb_mc("sing_emb_wt10_recolored") 153 | 154 | emb_age_time = unique(mat@cell_metadata[names(mc@mc),c("embryo","Theiler_Stage","embryo_infered_time","age_group","morphology_rank","area","mouse_type")]) 155 | 156 | f = (emb_age_time$mouse_type != "ICR") & !(is.na(emb_age_time$area)) 157 | emb_age_time = emb_age_time[f,] 158 | f2 = emb_age_time$morphology_rank > 100 & log2(emb_age_time$area) < 16.3 159 | 160 | tmp = smooth.spline(x = emb_age_time[!f2,"morphology_rank"],log2(emb_age_time[!f2,"area"]),spar = 0.8) 161 | png(sprintf("%s/fit_morph_ranks_to_log_area.png",fig_dir)) 162 | plot(emb_age_time[,"morphology_rank"],log2(emb_age_time[,"area"]),pch = 19,ylab = "log2(area)",xlab = "Morphology rank") 163 | lines(x = tmp$x,y = tmp$y) 164 | dev.off() 165 | 166 | emb_age_time2 = unique(mat@cell_metadata[names(mc@mc),c("embryo","Theiler_Stage","embryo_infered_time","age_group","morphology_rank","area","mouse_type")]) 167 | f = !is.na(emb_age_time2$morphology_rank) 168 | emb_age_time2 = emb_age_time2[f,] 169 | emb_age_time2 = emb_age_time2[order(emb_age_time2$morphology_rank),] 170 | 171 | infered_time = (tmp$y - min(tmp$y))/(max(tmp$y) - min(tmp$y))*(8.1 - 6.4) + 6.4 172 | names(infered_time) = tmp$x 173 | 174 | morph_time = c(1:nrow(emb_age_time2)) 175 | for(i in 1:max(tmp$x)) { 176 | if(i %in% tmp$x) { 177 | morph_time[i] = infered_time[as.character(i)] 178 | } else { 179 | last_ind = max(tmp$x[tmp$x < i]) 180 | next_ind = min(tmp$x[tmp$x > i]) 181 | morph_time[i] = ((i - last_ind)*infered_time[as.character(next_ind)] + (next_ind - i)*infered_time[as.character(last_ind)])/(next_ind - last_ind) 182 | } 183 | } 184 | 185 | for(i in (max(tmp$x) + 1):nrow(emb_age_time2)) { 186 | last_ind = max(tmp$x) 187 | scnd_last_ind = max(tmp$x[tmp$x < last_ind]) 188 | 189 | morph_time[i] = morph_time[scnd_last_ind] + (morph_time[last_ind] - morph_time[scnd_last_ind])/(last_ind - scnd_last_ind)*(i - scnd_last_ind) 190 | } 191 | 192 | emb_age_time2$size_infered_time = morph_time 193 | 194 | tmp2 = smooth.spline(x = emb_age_time2[,"morphology_rank"],emb_age_time2[,"embryo_infered_time"],spar = 0.8) 195 | png(sprintf("%s/fit_morph_rank_to_transcriptional_infered_time.png",fig_dir)) 196 | plot(emb_age_time2[,"morphology_rank"],emb_age_time2[,"embryo_infered_time"],pch = 19,ylab = "Transcriptional time",xlab = "Morphology rank") 197 | lines(x = tmp2$x,y = tmp2$y) 198 | dev.off() 199 | 200 | transcriptional_time = tmp$y 201 | 202 | png(sprintf("%s/compare_size_time_to_transcripional_time.png",fig_dir)) 203 | plot(x = emb_age_time2$morphology_rank,y = tmp2$y,type = "l",ylab = "Infered time",xlab = "Morphology rank", col = "red") 204 | lines(x = emb_age_time2$morphology_rank,y = morph_time,col = "blue") 205 | legend(x = "topleft",legend = c("by size","by transcription"),col = c("blue","red"),lty = 1) 206 | dev.off() 207 | 208 | png(sprintf("%s/compare_size_time_to_transcripional_time2.png",fig_dir)) 209 | plot(x = emb_age_time2$size_infered_time,y = emb_age_time2$embryo_infered_time,pch = 19,xlab = "Time infered from size", ylab = "Transcriptional time") 210 | dev.off() 211 | 212 | 213 | 214 | 215 | th_stage = split(emb_age_time2$size_infered_time,f = emb_age_time2$Theiler_Stage) 216 | th_stage_median = round(tapply(emb_age_time2$size_infered_time,emb_age_time2$Theiler_Stage,median),2) 217 | n_1 = (length(th_stage_median) + 1) %/% 2 218 | n_2 = length(th_stage_median) %/% 2 219 | odd_n = 2*c(1:n_1) - 1 220 | even_n = 2*c(1:n_2) 221 | 222 | png(sprintf("%s/size_infered_time.png",fig_dir),w = 1000,h = 500) 223 | boxplot(th_stage[theiler_stage_order],horizontal = T,pch = 19,las = 2,xaxt = 'n',ylim = c(6.4,8.1)) 224 | axis(side = 1,at = th_stage_median[odd_n],labels = th_stage_median[odd_n],cex.axis = 0.8) 225 | axis(side = 1,at = th_stage_median[even_n],labels = th_stage_median[even_n],cex.axis = 0.8) 226 | dev.off() 227 | 228 | th_stage = split(tmp2$y,f = emb_age_time2$Theiler_Stage) 229 | th_stage_median = round(tapply(tmp2$y,emb_age_time2$Theiler_Stage,median),2) 230 | n_1 = (length(th_stage_median) + 1) %/% 2 231 | n_2 = length(th_stage_median) %/% 2 232 | odd_n = 2*c(1:n_1) - 1 233 | even_n = 2*c(1:n_2) 234 | 235 | png(sprintf("%s/transcription_infered_time.png",fig_dir),w = 1000,h = 500) 236 | boxplot(th_stage[theiler_stage_order],horizontal = T,pch = 19,las = 2,xaxt = 'n',ylim = c(6.4,8.1)) 237 | axis(side = 1,at = th_stage_median[odd_n],labels = th_stage_median[odd_n],cex.axis = 0.8) 238 | axis(side = 1,at = th_stage_median[even_n],labels = th_stage_median[even_n],cex.axis = 0.8) 239 | dev.off() 240 | 241 | 242 | 243 | } 244 | 245 | } 246 | -------------------------------------------------------------------------------- /scripts/generate_paper_figures/fig_s1.r: -------------------------------------------------------------------------------- 1 | # s1 panels 2 | #source("paper_scripts/calculate_embryo_time.r") 3 | library("Matrix") 4 | 5 | gen_fig_s1_plots = function() { 6 | 7 | if(!dir.exists("figs/paper_figs/fig_s1")) { 8 | dir.create("figs/paper_figs/fig_s1") 9 | } 10 | 11 | fig_s1_a_number_of_umi_per_cell(T) 12 | fig_s1_a_number_of_cells_per_embryo() 13 | fig_s1_a_number_of_cls_per_age_group() 14 | 15 | fig_s1c() 16 | fig_s1d(T) 17 | fig_s1e() 18 | fig_s1f(T) 19 | 20 | fig_s1g() 21 | fig_s1h() 22 | 23 | fig_s1i() 24 | 25 | } 26 | 27 | 28 | fig_s1d = function(plot_pdf = T) { 29 | 30 | fig_dir = "figs/paper_figs/fig_s1" 31 | if(!dir.exists(fig_dir)) { 32 | dir.create(fig_dir) 33 | } 34 | 35 | mat_id = "sing_emb_wt10" 36 | mc_id = "sing_emb_wt10_recolored" 37 | 38 | mat = scdb_mat(mat_id) 39 | mc = scdb_mc(mc_id) 40 | gene_intervals = read.table(file = "data/external_data/gene_intervals_mm9.txt",sep = "\t",h = T,stringsAsFactors = F) 41 | chrom_gene_table = unique(gene_intervals[,c("chrom","gene_name")]) 42 | genes_y = chrom_gene_table$gene_name[chrom_gene_table$chrom == "chrY"] 43 | 44 | mat_n = t(t(mat@mat)/colSums(mat@mat)) 45 | mat_ig_n = t(t(mat@ignore_gmat)/colSums(mat@mat[,colnames(mat@ignore_gmat)])) 46 | 47 | y_genes_per_embryo = colSums(mat_n[genes_y,names(mc@mc)]) 48 | y_genes_per_embryo = tapply(X = y_genes_per_embryo, 49 | INDEX = mat@cell_metadata[names(mc@mc),"embryo"], 50 | FUN = mean) 51 | 52 | Xist_per_embryo = tapply(X = mat_ig_n["Xist",names(mc@mc)], 53 | INDEX = mat@cell_metadata[names(mc@mc),"embryo"], 54 | FUN = mean) 55 | 56 | y_umis_per_cell = colSums(mat@mat[genes_y,names(mc@mc)]) 57 | xist_per_cell = mat@ignore_gmat["Xist",names(mc@mc)] 58 | 59 | if(!dir.exists(fig_dir)) { 60 | dir.create(fig_dir) 61 | } 62 | 63 | embryo_sex = rep("female",length(Xist_per_embryo)) 64 | embryo_sex[y_genes_per_embryo > 5e-5] = "male" 65 | df_emb_sex = data.frame(embryo = names(y_genes_per_embryo), sex = embryo_sex) 66 | 67 | df_emb_sex$color = as.character(ifelse(df_emb_sex$sex == "female","#CB181D","cornflowerblue")) 68 | 69 | # Next plot per embryo 70 | if(plot_pdf) { 71 | pdf(sprintf("%s/fig_s1d.pdf",fig_dir),useDingbats = F) 72 | } else { 73 | png(sprintf("%s/fig_s1d.png",fig_dir)) 74 | } 75 | plot(x = Xist_per_embryo,y = y_genes_per_embryo[names(Xist_per_embryo)],pch = 19, 76 | xlab = "Xist UMIs per embryo", ylab = "Y chromosome UMIs per embryo",col = df_emb_sex$color,cex = 2,cex.lab = 1.5) 77 | dev.off() 78 | 79 | 80 | } 81 | 82 | 83 | fig_s1_a_number_of_umi_per_cell = function(plot_pdf = T) { 84 | 85 | fig_dir = "figs/paper_figs/fig_s1/fig_s1a" 86 | if(!dir.exists(fig_dir)) { 87 | dir.create(fig_dir) 88 | } 89 | 90 | mat = scdb_mat("sing_emb_wt10") 91 | mc = scdb_mc("sing_emb_wt10_recolored") 92 | 93 | n_umi = colSums(mat@mat) 94 | 95 | if(plot_pdf) { 96 | pdf("figs/paper_figs/fig_s1/fig_s1a/n_umi_per_cell.pdf",useDingbats = F) 97 | } else { 98 | png("figs/paper_figs/fig_s1/fig_s1/fig_s1a/n_umi_per_cell.png") 99 | } 100 | hist(n_umi,40,main = "UMI per cell",col = "grey80",xlab = "#UMI") 101 | dev.off() 102 | 103 | } 104 | 105 | 106 | fig_s1_a_number_of_cells_per_embryo = function() { 107 | 108 | fig_dir = "figs/paper_figs/fig_s1/fig_s1a" 109 | if(!dir.exists(fig_dir)) { 110 | dir.create(fig_dir) 111 | } 112 | 113 | mc = scdb_mc("sing_emb_wt10_recolored") 114 | mat = scdb_mat("sing_emb_wt10") 115 | 116 | n_cls_per_emb = table(mat@cell_metadata[names(mc@mc),"transcriptional_rank"]) 117 | 118 | pdf(sprintf("%s/number_of_cells_per_emb_gray.pdf",fig_dir),useDingbats = F) 119 | barplot(n_cls_per_emb,log = "y",col = "grey80") 120 | dev.off() 121 | } 122 | 123 | fig_s1_a_number_of_cls_per_age_group = function() { 124 | 125 | fig_dir = "figs/paper_figs/fig_s1/fig_s1a" 126 | if(!dir.exists(fig_dir)) { 127 | dir.create(fig_dir) 128 | } 129 | 130 | mc = scdb_mc("sing_emb_wt10_recolored") 131 | mat = scdb_mat("sing_emb_wt10") 132 | 133 | emb_age = unique(mat@cell_metadata[names(mc@mc),c("transcriptional_rank","age_group","developmental_time")]) 134 | 135 | time_age_group = tapply(emb_age$developmental_time,emb_age$age_group,median) 136 | 137 | n_cls_per_age_group = table(mat@cell_metadata[names(mc@mc),"age_group"]) 138 | 139 | 140 | # estimated number of cells per age group 141 | emb_counts = read.table(file = "data/external_data/counted_embryos/wt10.cell_counts.txt",sep = "\t",stringsAsFactors = F) 142 | emb_time = unique(mat@cell_metadata[names(mc@mc),c("embryo","developmental_time")]) 143 | 144 | emb_to_time = emb_time$developmental_time 145 | names(emb_to_time) = emb_time$embryo 146 | fit_count = lm(log2(emb_counts$cell_count) ~ emb_to_time[emb_counts$embryo]) 147 | 148 | a = fit_count$coefficients[1] 149 | b = fit_count$coefficients[2] 150 | 151 | 152 | log_n_est_per_emb = a + b*time_age_group 153 | 154 | n_est_per_age_group = 2^log_n_est_per_emb 155 | 156 | mat_n_cls = cbind(n_cls_per_age_group,n_est_per_age_group) 157 | 158 | pdf(sprintf("%s/number_of_cells_per_age_group.pdf",fig_dir),useDingbats = F) 159 | barplot(t(mat_n_cls),beside = T,col = c("grey30","grey80")) 160 | legend(x = "topleft",legend = c("estimated in embryo from \n this age group","sampled in age group"),pch = 15, col = c("grey80","grey30")) 161 | dev.off() 162 | } 163 | 164 | 165 | fig_s1c = function() { 166 | 167 | 168 | mat = scdb_mat("sing_emb_wt10") 169 | mc = scdb_mc("sing_emb_wt10_recolored") 170 | 171 | emb_ranks = unique(mat@cell_metadata[names(mc@mc),c("embryo","transcriptional_rank","morphology_rank","mouse_type")]) 172 | emb_ranks$color = as.character(ifelse(emb_ranks$mouse_type == "ICR","cornflowerblue","black")) 173 | 174 | 175 | dir_name = "figs/paper_figs/fig_s1" 176 | 177 | if(!dir.exists(dir_name)) { 178 | dir.create(dir_name) 179 | } 180 | 181 | cex.lab = 2 182 | cex.axis = 2 183 | cex.main = 2 184 | margins = c(5,5,5,5) 185 | cex = 3 186 | 187 | 188 | xlabel = "Transcriptional rank" 189 | ylabel = "Morphological rank" 190 | main_label = "" 191 | #xlabel = "intrinsic rank" 192 | #ylabel = "atlas rank" 193 | #main_label = "Atlas vs Intrinsic Rank" 194 | 195 | pdf(sprintf("%s/fig_s1c.pdf",dir_name),useDingbats = F) 196 | #png(sprintf("%s/atlas_rank_vs_intrinsic_rank.png",dir_name)) 197 | par(mar = margins) 198 | plot(x = emb_ranks$transcriptional_rank, y = emb_ranks$morphology_rank,pch = 19,cex.lab = cex.lab,cex.axis = cex.axis, 199 | xlab = xlabel,ylab = ylabel,col = emb_ranks$color, 200 | main = main_label,cex.main = cex.main,cex = cex) 201 | legend(x = "topleft",legend = c("ICR","C57BL/6"),pch = 19,col = c("cornflowerblue","black"),cex = 2) 202 | dev.off() 203 | 204 | } 205 | 206 | fig_s1e = function(plot_pdf = T) { 207 | 208 | mat = scdb_mat("sing_emb_wt10") 209 | mc = scdb_mc("sing_emb_wt10_recolored") 210 | 211 | emb_ranks = unique(mat@cell_metadata[names(mc@mc),c("embryo","transcriptional_rank","developmental_time","age_group")]) 212 | 213 | age_group_cols = c(RColorBrewer::brewer.pal(n = 12,name = "Paired"),"cornflowerblue") 214 | 215 | emb_ranks$color = age_group_cols[emb_ranks$age_group] 216 | 217 | dir_name = "figs/paper_figs/fig_s1" 218 | 219 | if(!dir.exists(dir_name)) { 220 | dir.create(dir_name) 221 | } 222 | 223 | cex.lab = 2 224 | cex.axis = 2 225 | cex.main = 2 226 | margins = c(5,6,5,4) 227 | cex = 2 228 | 229 | 230 | xlabel = "Transcriptional rank" 231 | ylabel = "Developmental time" 232 | main_label = "" 233 | #xlabel = "intrinsic rank" 234 | #ylabel = "atlas rank" 235 | #main_label = "Atlas vs Intrinsic Rank" 236 | if (plot_pdf) { 237 | pdf(sprintf("%s/fig_s1e.pdf",dir_name),useDingbats = F) 238 | #png(sprintf("%s/atlas_rank_vs_intrinsic_rank.png",dir_name)) 239 | par(mar = margins) 240 | plot(x = emb_ranks$transcriptional_rank, y = emb_ranks$developmental_time,pch = 19,cex.lab = cex.lab,cex.axis = cex.axis, 241 | xlab = xlabel,ylab = ylabel,col = emb_ranks$color, 242 | main = main_label,cex.main = cex.main,cex = cex) 243 | legend(x = "topleft",legend = c(1:13),title = "Age group",pch = 19,col = age_group_cols, cex = 1) 244 | dev.off() 245 | 246 | } else { 247 | png(sprintf("%s/fig_s1e.png",dir_name)) 248 | #png(sprintf("%s/atlas_rank_vs_intrinsic_rank.png",dir_name)) 249 | par(mar = margins) 250 | plot(x = emb_ranks$transcriptional_rank, y = emb_ranks$developmental_time,pch = 19,cex.lab = cex.lab,cex.axis = cex.axis, 251 | xlab = xlabel,ylab = ylabel,col = emb_ranks$color, 252 | main = main_label,cex.main = cex.main,cex = cex) 253 | legend(x = "topleft",legend = c(1:13),title = "Age group",pch = 19,col = age_group_cols, cex = 1) 254 | dev.off() 255 | 256 | } 257 | 258 | 259 | } 260 | 261 | 262 | 263 | fig_s1f = function(plot_pdf = T) { 264 | mat_id = "sing_emb_wt10" 265 | mc_id = "sing_emb_wt10_recolored" 266 | dir_name = "figs/paper_figs/fig_s1" 267 | m_0 = 0.01 268 | s_0 = 0.005 269 | 270 | m = scdb_mat(mat_id) 271 | mc = scdb_mc(mc_id) 272 | mc2d = scdb_mc2d("sing_emb_wt10_recolored") 273 | 274 | col_to_rank = c(1:nrow(mc@color_key)) 275 | names(col_to_rank) = mc@color_key$color 276 | #mc_exe = c(which(mc@mc_fp["Ttr",] > 4),379) 277 | mc_exe = which(mc@colors %in% c("#7F6874","#F6BFCB","#EF5A9D","#F397C0","#0F4A9C")) 278 | 279 | 280 | cls_exe_all = names(mc@mc)[mc@mc %in% mc_exe] 281 | 282 | cls_exe = sample(x = cls_exe_all,500) 283 | 284 | 285 | if(!dir.exists(dir_name)) { 286 | dir.create(dir_name) 287 | } 288 | 289 | m_genes = c("Mki67","Cenpf","Top2a","Smc4;SMC4","Ube2c","Ccnb1","Cdk1","Arl6ip1","Ankrd11","Hmmr;IHABP","Cenpa;Cenp-a","Tpx2","Aurka","Kif4", "Kif2c","Bub1b","Ccna2", "Kif23","Kif20a","Sgol2","Smc2", "Kif11", "Cdca2","Incenp","Cenpe") 290 | 291 | s_genes = c("Pcna", "Rrm2", "Mcm5", "Mcm6", "Mcm4", "Ung", "Mcm7", "Mcm2","Uhrf1", "Orc6", "Tipin") # Npm1 292 | 293 | mc_mean_age = tapply(m@cell_metadata[names(mc@mc),"age_group"],mc@mc,mean) 294 | 295 | s_genes = intersect(rownames(mc@mc_fp), s_genes) 296 | m_genes = intersect(rownames(mc@mc_fp), m_genes) 297 | 298 | tot = colSums(m@mat) 299 | s_tot = colSums(m@mat[s_genes,]) 300 | m_tot = colSums(m@mat[m_genes,]) 301 | 302 | s_score = s_tot/tot 303 | m_score = m_tot/tot 304 | 305 | max_s_score = quantile(s_score,0.9995) 306 | max_m_score = quantile(m_score,0.9995) 307 | 308 | f = s_score < max_s_score & m_score < max_m_score 309 | 310 | p_coldens =densCols(x = s_score,y = m_score,colramp = colorRampPalette(RColorBrewer::brewer.pal(9,"Blues"),bias = 1)) 311 | 312 | if(plot_pdf) { 313 | pdf(sprintf("%s/fig_s1f.pdf", dir_name), w=6, h=6,useDingbats = F) 314 | plot(s_score[f], m_score[f], pch=19, main = "S phase vs M phase UMIs",cex=1, 315 | xlab = "S phase score",ylab = "M phase score", 316 | col = p_coldens[f]) 317 | points(s_score[cls_exe], m_score[cls_exe], pch=19, cex=0.8, col="black") 318 | abline(a = m_0,b = -m_0/s_0) 319 | dev.off() 320 | } else { 321 | png(sprintf("%s/fig_s1f.png", dir_name), w=2000, h=2000) 322 | plot(s_score[f], m_score[f], pch=19, main = "",cex=3, 323 | xlab = "",ylab = "",cex.axis = 2.5, 324 | col = p_coldens[f]) 325 | points(s_score[cls_exe], m_score[cls_exe], pch=19, cex=2.4, col="black") 326 | abline(a = m_0,b = -m_0/s_0) 327 | dev.off() 328 | } 329 | 330 | } 331 | 332 | 333 | 334 | fig_s1g = function(plot_pdf = T) { 335 | 336 | mc = scdb_mc("sing_emb_wt10_recolored") 337 | 338 | legc = log2(mc@e_gc + 1e-5) 339 | 340 | 341 | # first separation embryonic endoderm (including node/notochord) from meso/-ectoderm 342 | x1 = -16 343 | y1 = -12 344 | x2 = -12 345 | y2 = -16 346 | 347 | b_emb_endo = (y2 - y1)/(x2 - x1) 348 | a_emb_endo = (y1*x2 - y2*x1)/(x2 - x1) 349 | 350 | if(plot_pdf) { 351 | pdf(file = "figs/paper_figs/fig_s1/fig_s1g_Foxa2_Foxa1.pdf",useDingbats = F) 352 | } else { 353 | png(filename = "figs/paper_figs/fig_s1/fig_s1g_Foxa2_Foxa1.png") 354 | } 355 | par(mar = c(4,5,2,2)) 356 | plot(x = legc["Foxa2",], y= legc["Foxa1",],pch = 19,col = mc@colors,cex = 2,xlab = "Foxa2", 357 | ylab = "Foxa1",cex.lab = 1) 358 | abline(a = a_emb_endo, b = b_emb_endo,lty = "dashed") 359 | dev.off() 360 | 361 | # second separation extraembryonic from embryonic endoderm 362 | x1 = -8.4 363 | y1 = -14 364 | x2 = -11 365 | y2 = -8.4 366 | 367 | b_exe_endo = (y2 - y1)/(x2 - x1) 368 | a_exe_endo = (y1*x2 - y2*x1)/(x2 - x1) 369 | 370 | if(plot_pdf) { 371 | pdf(file = "figs/paper_figs/fig_s1/fig_s1g_Apoe_Ttr.pdf",useDingbats = F) 372 | } else { 373 | png(filename = "figs/paper_figs/fig_s1/fig_s1g_Apoe_Ttr.png") 374 | } 375 | par(mar = c(4,5,2,1)) 376 | plot(x = legc["Apoe",], y= legc["Ttr",],pch = 19,col = mc@colors,cex = 2,xlab = "Apoe", 377 | ylab = "Ttr",cex.lab = 1) 378 | abline(a = a_exe_endo, b = b_exe_endo,lty = "dashed") 379 | dev.off() 380 | 381 | } 382 | 383 | 384 | 385 | fig_s1h = function() { 386 | mat_id = "sing_emb_wt10" 387 | mc_id = "sing_emb_wt10_recolored" 388 | dir_name = "figs/paper_figs/fig_s1" 389 | m_0 = 0.01 390 | s_0 = 0.005 391 | 392 | m = scdb_mat(mat_id) 393 | mc = scdb_mc(mc_id) 394 | mc2d = scdb_mc2d("sing_emb_wt10_recolored") 395 | 396 | col_to_rank = c(1:nrow(mc@color_key)) 397 | names(col_to_rank) = mc@color_key$color 398 | 399 | if(!dir.exists(dir_name)) { 400 | dir.create(dir_name) 401 | } 402 | 403 | m_genes = c("Mki67","Cenpf","Top2a","Smc4;SMC4","Ube2c","Ccnb1","Cdk1","Arl6ip1","Ankrd11","Hmmr;IHABP","Cenpa;Cenp-a","Tpx2","Aurka","Kif4", "Kif2c","Bub1b","Ccna2", "Kif23","Kif20a","Sgol2","Smc2", "Kif11", "Cdca2","Incenp","Cenpe") 404 | s_genes = c("Pcna", "Rrm2", "Mcm5", "Mcm6", "Mcm4", "Ung", "Mcm7", "Mcm2","Uhrf1", "Orc6", "Tipin") # Npm1 405 | 406 | s_genes = intersect(rownames(mc@mc_fp), s_genes) 407 | m_genes = intersect(rownames(mc@mc_fp), m_genes) 408 | 409 | tot = colSums(m@mat) 410 | s_tot = colSums(m@mat[s_genes,]) 411 | m_tot = colSums(m@mat[m_genes,]) 412 | 413 | s_score = s_tot/tot 414 | m_score = m_tot/tot 415 | 416 | # boxplot per cell type of the cell cycle score 417 | 418 | cc_score_all = m_score + s_score 419 | 420 | all_cls = intersect(colnames(m@mat),names(mc@mc)) 421 | 422 | cc_score = split(cc_score_all[all_cls],f = mc@colors[mc@mc[all_cls]]) 423 | cc_score = cc_score[order(col_to_rank[names(cc_score)])] 424 | 425 | pdf(sprintf("%s/fig_s1h.pdf",dir_name),useDingbats = F) 426 | boxplot(cc_score,pch = 19,cex = 0.5,col = names(cc_score),xaxt = 'n') 427 | dev.off() 428 | 429 | } 430 | 431 | 432 | 433 | fig_s1i = function(plot_pdf = T) { 434 | 435 | dir_name = "figs/paper_figs/fig_s1" 436 | fn = sprintf("%s/fig_s1i_fraction_of_exe_endo_cells.png",dir_name) 437 | fn2 = sprintf("%s/fig_s1i_fraction_of_endoderm_cells.png",dir_name) 438 | 439 | mc = scdb_mc("sing_emb_wt10_recolored") 440 | mat = scdb_mat("sing_emb_wt10") 441 | 442 | mc_exe = which(mc@colors %in% c("#7F6874","#F6BFCB")) 443 | 444 | mc_ag = table(mc@mc,mat@cell_metadata[names(mc@mc),"age_group"]) 445 | mc_ag_n = t(t(mc_ag)/colSums(mc_ag)) 446 | 447 | fr_exe_endo = colSums(mc_ag_n[mc_exe,]) 448 | 449 | fit_y_exe_endo = 0.2*(0.83^c(0:12)) 450 | 451 | if(plot_pdf) { 452 | pdf(file= gsub(".png",".pdf",fn),w = 8, h = 4,useDingbats = F) 453 | } else { 454 | png(filename = fn,w = 800, h = 400) 455 | } 456 | plot(x = c(1:13),y = fr_exe_endo,pch = 19,log = "y",xlab = "",ylab = "",cex = 4,cex.axis = 2) 457 | lines(x = c(1:13),fit_y_exe_endo) 458 | dev.off() 459 | 460 | 461 | mc_endo = which(mc@colors %in% c("#0F4A9C","#F397C0","#EF5A9D")) 462 | 463 | fr_endo = pmax(colSums(mc_ag_n[mc_endo,]),0.02) 464 | fit_y_endo = 0.10*(0.88^c(0:6)) 465 | 466 | if(plot_pdf) { 467 | pdf(file = gsub(".png",".pdf",fn2),w = 8, h = 4,useDingbats = F) 468 | } else { 469 | png(filename = fn2,w = 800, h = 400) 470 | } 471 | plot(x = c(1:13),y = fr_endo,pch = 19,log = "y",xlab = "",ylab = "",cex = 4,cex.axis = 2) 472 | lines(x = c(7:13),fit_y_endo) 473 | dev.off() 474 | 475 | 476 | 477 | } 478 | 479 | 480 | 481 | 482 | 483 | -------------------------------------------------------------------------------- /scripts/generate_paper_figures/fig_s3.r: -------------------------------------------------------------------------------- 1 | library("metacell") 2 | scdb_init("scrna_db/",force_reinit = T) 3 | 4 | gen_fig_s3_plots = function() { 5 | 6 | if(!dir.exists("figs/paper_figs/fig_s3")) { 7 | dir.create("figs/paper_figs/fig_s3") 8 | } 9 | 10 | # both fig_s3a() and fig_s3b() require the file "data/fig_s3/mc_cluster_order.txt" 11 | # It contains a clustering of metacells based on the flows 12 | # Clustering of metacells is based on the network flow model 13 | # Users who want to redo this analysis should run the function cluster_metacells_by_flow(mct_id,K_cluster) 14 | # and save the output in data/fig_s3/mc_cluster_order.txt 15 | fig_s3a() 16 | fig_s3a_color_scale() 17 | fig_s3b(T) 18 | 19 | } 20 | 21 | fig_s3b = function(plot_pdf = T) { 22 | fig_dir = "figs/paper_figs/fig_s3/fig_s3b" 23 | if(!dir.exists(fig_dir)) { 24 | dir.create(fig_dir) 25 | } 26 | 27 | mc = scdb_mc("sing_emb_wt10_recolored") 28 | 29 | mct_id = "sing_emb_wt10" 30 | K = 65 31 | marks = c("Foxc2","Lefty2","Tcf15","Nkx2-5","Myl4","Tagln","Pim2","Tcfap2a","Sox1","Grsf1","Dppa3","Etv2","Tal1","Cited4","Lefty1", 32 | "Noto","Foxa1","Ttr") 33 | 34 | mc_clust = read.table("data/fig_s3/mc_cluster_order.txt",sep = "\t",stringsAsFactors = F) 35 | 36 | plot_marks_along_clust(genes = marks,mc_clust = mc_clust,mc = mc,fig_dir = fig_dir,plot_pdf = plot_pdf) 37 | } 38 | 39 | 40 | fig_s3a = function(plot_pdf = T) { 41 | 42 | fig_dir = "figs/paper_figs/fig_s3" 43 | text_cex=1 44 | 45 | mct_id= "sing_emb_wt10" 46 | mct = scdb_mctnetwork(mct_id) 47 | 48 | cls_ord = read.table("data/fig_s3/mc_cluster_order.txt",sep = "\t",stringsAsFactors = F) 49 | mc_ord = cls_ord$mc 50 | 51 | K = length(unique(cls_ord$cluster)) 52 | 53 | clst_flows = mctnetwork_clust_flows(mct_id = mct_id,K = K) 54 | 55 | cmat = clst_flows$cmat 56 | fclst = cls_ord$cluster[order(cls_ord$mc)] 57 | 58 | mc = scdb_mc(mct@mc_id) 59 | 60 | 61 | shades = colorRampPalette(c("darkblue", "blue","white", "red", "yellow"))(1000) 62 | if(plot_pdf) { 63 | pdf(sprintf("%s/fig_s3a.pdf",fig_dir), w=16, h=16) 64 | layout(matrix(c(1,2),nrow=2),heights=c(nrow(cmat)*0.09+0.50, 3)) 65 | fig_scl = 1 66 | } else { 67 | fig_scl = 2 68 | png(sprintf("%s/fig_s3a.png",fig_dir), w=1600*fig_scl, h=1600*fig_scl) 69 | layout(matrix(c(1,2),nrow=2),heights=c(nrow(cmat)*9+50, 300)*fig_scl) 70 | 71 | 72 | } 73 | 74 | n_mc = nrow(cmat) 75 | 76 | 77 | par(mar=c(0,5,4,5)) 78 | image(cmat[mc_ord, mc_ord], zlim=c(-1,1), col=shades, yaxt='n', xaxt='n') 79 | N = length(mc_ord) 80 | 81 | mc_x = 1:length(mc_ord) 82 | names(mc_x) = 1:N 83 | mc_x[mc_ord] = 1:N 84 | for(i in 1:K) { 85 | abline(h=max(-0.5+mc_x[fclst == i])/(N-1),lwd = fig_scl) 86 | abline(v=max(-0.5+mc_x[fclst == i])/(N-1),lwd = fig_scl) 87 | } 88 | cl_x = tapply(mc_x, fclst, mean) 89 | cl_max = tapply(mc_x, fclst, max) 90 | 91 | mtext(1:K, side = 3, at=cl_x/N, las=1, cex=fig_scl) 92 | par(mar=c(3,5,0,5)) 93 | image(as.matrix(mc_ord,nrow=1), col=mc@colors, yaxt='n', xaxt='n') 94 | dev.off() 95 | 96 | } 97 | 98 | fig_s3a_color_scale = function() { 99 | 100 | # plot color scale 101 | shades = colorRampPalette(c("darkblue", "blue","white", "red", "yellow"))(101) 102 | 103 | vals = seq(-1,1,length.out = 101) 104 | cols = shades 105 | show_vals_ind = c(1,51,101) 106 | 107 | 108 | pdf(file = "figs/paper_figs/fig_s3/fig_s3a_color_scale.pdf",useDingbats = F) 109 | plot.new() 110 | plot.window(xlim=c(0,100), ylim=c(0, length(cols) + 3)) 111 | rect(7, 1:length(cols), 17, 1:length(cols) + 1, border=NA, col=cols) 112 | rect(7, 1, 17, length(cols)+1, col=NA, border = 'black') 113 | 114 | text(19, (1:length(cols))[show_vals_ind] + 0.5, labels=vals[show_vals_ind], pos=4) 115 | dev.off() 116 | 117 | 118 | 119 | } 120 | 121 | 122 | 123 | plot_marks_along_clust = function(genes,mc_clust,mc,fig_dir,plot_pdf = F,additional_horizontal_line = NULL) { 124 | 125 | fig_scl = 3 126 | 127 | mc_clust = mc_clust[order(mc_clust$mc_rank),] 128 | 129 | n_clust = max(mc_clust$cluster) 130 | 131 | mc_ord = mc_clust$mc 132 | 133 | fclst = mc_clust$cluster[order(mc_clust$mc)] 134 | 135 | legc = log2(mc@e_gc+1e-5) 136 | 137 | mc_x = 1:length(mc_ord) 138 | names(mc_x) = 1:length(mc_ord) 139 | mc_x[mc_ord] = 1:length(mc_ord) 140 | 141 | cl_x = tapply(mc_x, fclst, mean) 142 | cl_max = tapply(mc_x, fclst, max) 143 | 144 | 145 | if(!dir.exists(fig_dir)) { 146 | dir.create(fig_dir) 147 | } 148 | 149 | for(g in genes) { 150 | 151 | if(plot_pdf) { 152 | pdf(sprintf("%s/%s.pdf", fig_dir, g),w=10,h=3,useDingbats = F) 153 | #svg(sprintf("%s/%s.svg", fig_dir, g),w=10,h=3) 154 | fig_scl = 1 155 | } else { 156 | png(sprintf("%s/%s.png", fig_dir, g),w=1000*fig_scl,h=300*fig_scl) 157 | } 158 | plot(1:length(mc_ord), legc[g, mc_ord], pch=19, col=mc@colors[mc_ord], ylab="",xaxt='n',xlab = "",cex = fig_scl,cex.axis = fig_scl) 159 | mtext(1:n_clust, at=cl_x,side=1, las=2,cex = 0.6*fig_scl) 160 | 161 | if(!is.null(additional_horizontal_line)) { 162 | abline(h = additional_horizontal_line,lwd = fig_scl*3,lty = "dashed") 163 | } 164 | abline(v=cl_max+0.5,lwd = fig_scl) 165 | 166 | grid(lwd = fig_scl) 167 | dev.off() 168 | } 169 | 170 | } 171 | 172 | 173 | 174 | 175 | 176 | 177 | 178 | -------------------------------------------------------------------------------- /scripts/generate_paper_figures/fig_s4.r: -------------------------------------------------------------------------------- 1 | # fig S4 2 | #source("paper_scripts/fig3_plots.r") 3 | 4 | library("metacell") 5 | source("scripts/generate_paper_figures/fig_4.r") 6 | 7 | gen_fig_s4_plots = function() { 8 | 9 | if(!dir.exists("figs/paper_figs/fig_s4")) { 10 | dir.create("figs/paper_figs/fig_s4") 11 | } 12 | 13 | # Figure S4a is generated by the functions of fig3_plots.r 14 | fig_s4b() 15 | fig_s4c() 16 | fig_s4de() 17 | fig_s4f() 18 | fig_s4g() 19 | } 20 | 21 | fig_s4b = function() { 22 | 23 | 24 | mct = scdb_mctnetwork("sing_emb_wt10") 25 | mc = scdb_mc("sing_emb_wt10_recolored") 26 | confu = mctnetwork_get_flow_mat(mct, -2) 27 | diag(confu) = 0 28 | 29 | mat = scdb_mat("sing_emb_wt10") 30 | mc_ag = table(mc@mc, mat@cell_metadata[names(mc@mc), "age_group"]) 31 | mc_t = apply(mc_ag,1, function(x) sum((1:13)*x)/sum(x)) 32 | 33 | genes = c("Otx2","Sox9","Gbx2","Fst") 34 | 35 | incl_colors = c("#635547","#DABE99","#9e6762","#65A83E","#647a4f","#354E23") 36 | 37 | for(gene in genes) { 38 | plot_time_gene_mc_flow(confu = confu,mc = mc,mc_t = mc_t,gene = gene,incl_colors = incl_colors,fig_dir = "figs/paper_figs/fig_s4/fig_s4b",max_t = 13,fig_pref = "epiblast",plot_pdf = T) 39 | } 40 | 41 | 42 | } 43 | 44 | fig_s4c = function() { 45 | 46 | 47 | mct = scdb_mctnetwork("sing_emb_wt10") 48 | mc = scdb_mc("sing_emb_wt10_recolored") 49 | confu = mctnetwork_get_flow_mat(mct, -2) 50 | diag(confu) = 0 51 | 52 | mat = scdb_mat("sing_emb_wt10") 53 | mc_ag = table(mc@mc, mat@cell_metadata[names(mc@mc), "age_group"]) 54 | mc_t = apply(mc_ag,1, function(x) sum((1:13)*x)/sum(x)) 55 | 56 | genes = c("Eomes","Foxa2","Mixl1","T") 57 | 58 | incl_colors = c("#635547","#DABE99","#9e6762","#65A83E","#647a4f","#354E23") 59 | 60 | for(gene in genes) { 61 | plot_time_gene_mc_flow(confu = confu,mc = mc,mc_t = mc_t,gene = gene,incl_colors = incl_colors,fig_dir = "figs/paper_figs/fig_s4/fig_s4c",max_t = 13,fig_pref = "epiblast",plot_pdf = T) 62 | } 63 | 64 | 65 | } 66 | 67 | fig_s4de = function() { 68 | 69 | fig_dir = "figs/paper_figs/fig_s4" 70 | 71 | endo_cts = c("#0F4A9C","#F397C0","#EF5A9D","#F6BFCB","#7F6874") 72 | ecto_cts = c("#f7f79e","#647a4f","#354E23","#9e6762") 73 | meso_cts = c("#FACB12","#1a3f52","#DFCDE4","#408DA1","#8DB5CE","#45d1c5","#53f1fc","#B51D8D","#cc7818","#8870ad", 74 | "#532C8A","#FBBE92","#c9a997","#C72228","#EF4E22") 75 | ct_all = list(endo = endo_cts,meso = meso_cts,ecto = ecto_cts) 76 | 77 | mc = scdb_mc("sing_emb_wt10_recolored") 78 | mct = scdb_mctnetwork("sing_emb_wt10") 79 | mc_ag = mct@mc_t_infer 80 | mc_mean_age = ((mc_ag/rowSums(mc_ag)) %*% c(1:13))[,1] 81 | 82 | legc = log2(mc@e_gc + 1e-5) 83 | 84 | col_to_rank = c(1:nrow(mc@color_key)) 85 | names(col_to_rank) = mc@color_key$color 86 | 87 | mc_ls = lapply(ct_all,function(cts) {return(which(mc@colors %in% cts))}) 88 | p_0_ls = lapply(mc_ls,function(mcs) { 89 | p_0 = rep(0,ncol(mc@e_gc)) 90 | p_0[mcs] = mc_ag[mcs,13] 91 | return(p_0) 92 | }) 93 | 94 | prop_ls = lapply(p_0_ls,function(p_0) { 95 | 96 | prop = mctnetwork_propogate_from_t(mct = mct,t = 13,mc_p = p_0) 97 | 98 | return(prop) 99 | }) 100 | 101 | mc_ag_inf = mctnetwork_propogate_from_t(mct = mct,t = 13,mc_p = mc_ag[,13]) 102 | 103 | mc_commit_ls = lapply(prop_ls,function(prop) { 104 | mc_commit = rowSums(prop$probs)/rowSums(mc_ag_inf$probs) 105 | return(mc_commit) 106 | }) 107 | 108 | mc_ord = order(1000*col_to_rank[mc@colors] + mc_mean_age) 109 | 110 | w = 8 111 | h = 4 112 | 113 | pdf(sprintf("%s/fig_s4d_endoderm_commitment.pdf",fig_dir),width = w, height = h,useDingbats = F) 114 | plot(x = 1:ncol(mc@e_gc),y = mc_commit_ls$endo[mc_ord],pch = 19,col = mc@colors[mc_ord],xlab = "Metacells",xaxt = 'n', 115 | main = "Endoderm commitment",ylab = "") 116 | dev.off() 117 | 118 | pdf(sprintf("%s/fig_s4d_mesoderm_commitment.pdf",fig_dir),width = w, height = h,useDingbats = F) 119 | plot(x = 1:ncol(mc@e_gc),y = mc_commit_ls$meso[mc_ord],pch = 19,col = mc@colors[mc_ord],xlab = "Metacells",xaxt = 'n', 120 | main = "Mesoderm commitment",ylab = "") 121 | dev.off() 122 | 123 | #pdf(sprintf("%s/ectoderm_commitment.pdf",fig_dir),width = w, height = h,useDingbats = F) 124 | #plot(x = 1:ncol(mc@e_gc),y = mc_commit_ls$ecto[mc_ord],pch = 19,col = mc@colors[mc_ord],xlab = "Metacells",xaxt = 'n', 125 | # main = "Ectoderm commitment",ylab = "") 126 | #dev.off() 127 | 128 | pdf(sprintf("%s/fig_s4e_endo_vs_mesoderm_commitment.pdf",fig_dir),useDingbats = F) 129 | plot(mc_commit_ls$endo,mc_commit_ls$meso,pch = 19, col = mc@colors,cex = 1.5,xlab = c("Endoderm commitment"),ylab = "Mesoderm commitment") 130 | dev.off() 131 | 132 | pdf(sprintf("%s/fig_s4e_Mesp1_vs_Foxa2.pdf",fig_dir),useDingbats = F,w = 4.5,h = 5) 133 | plot(x = legc["Foxa2",],y = legc["Mesp1",],pch = 19,col = mc@colors, 134 | xlab = "Foxa2 expression",ylab = "Mesp1 expression",cex = 1.5) 135 | dev.off() 136 | 137 | 138 | 139 | } 140 | 141 | fig_s4f = function() { 142 | 143 | included_cell_types = c("#65A83E","#635547","#DABE99","#C594BF","#DFCDE4","#c19f70","#F397C0","#c9a997","#C72228") 144 | 145 | mat = scdb_mat("sing_emb_wt10") 146 | mc = scdb_mc("sing_emb_wt10_recolored") 147 | col_to_rank = c(1:nrow(mc@color_key)) 148 | names(col_to_rank) = mc@color_key$color 149 | col_to_ct = mc@color_key$group 150 | names(col_to_ct) = mc@color_key$color 151 | 152 | age_group_time = unique(mat@cell_metadata[names(mc@mc),c("transcriptional_rank","developmental_time","age_group")]) 153 | dev_time = round(tapply(age_group_time$developmental_time,age_group_time$age_group,mean),2) 154 | 155 | mct_id = "sing_emb_wt10" 156 | mct = scdb_mctnetwork(mct_id) 157 | 158 | load(file = "data/fig_s4/retention_time_replica.Rda") 159 | 160 | # next original retention time 161 | 162 | 163 | type_flows = mctnetwork_get_type_flows(mct,1,13) 164 | 165 | in_flows = sapply(type_flows,FUN = function(flow_mat) { 166 | 167 | diag(flow_mat) = 0 168 | in_flow = colSums(flow_mat) 169 | 170 | return(in_flow) 171 | }) 172 | 173 | in_flows = cbind(rowSums(type_flows[[1]]),in_flows) 174 | tot_in_flow = rowSums(in_flows) 175 | 176 | type_ag = sapply(type_flows,FUN = function(flow_mat) { 177 | 178 | return(colSums(flow_mat)) 179 | }) 180 | type_ag = cbind(rowSums(type_flows[[1]]),type_ag) 181 | type_ag_n = type_ag/tot_in_flow 182 | 183 | diff_time = diff(dev_time) 184 | 185 | av_ret_time = 0.5*(diff_time[2:length(diff_time)] + diff_time[1:(length(diff_time) - 1)]) 186 | av_ret_time = c(diff_time[1]/2,av_ret_time,diff_time[length(diff_time)]/2) 187 | 188 | ret_time_type = (type_ag_n %*% av_ret_time)[,1] 189 | 190 | col_ord = order(col_to_rank[names(ret_time_type)]) 191 | ret_time_type = ret_time_type[col_ord] 192 | 193 | 194 | # boxplot of retention times 195 | ret_time_type = ret_time_type[included_cell_types] 196 | 197 | ret_time_replica = ret_time_replica[,included_cell_types] 198 | 199 | ret_time_ls = split(24*ret_time_replica,rep(colnames(ret_time_replica),each = nrow(ret_time_replica))) 200 | ret_time_ls = ret_time_ls[included_cell_types] 201 | #names(ret_time_ls) = col_to_ct[included_cell_types] 202 | names(ret_time_ls) = c("DEc","Epi","PS","NM","LNM","APS","DE","BP","E1") 203 | 204 | pdf(file = "figs/paper_figs/fig_s4/fig_s4f.pdf",useDingbats = F) 205 | boxplot(ret_time_ls,col = included_cell_types,las = 2,names = names(ret_time_ls),ylab = "Retention time in hours",xlab = "Cell types",w = 8,h = 3,horizontal = F) 206 | points(x = c(1:9),y= 24*ret_time_type,pch = 19) 207 | dev.off() 208 | 209 | } 210 | 211 | fig_s4g = function() { 212 | 213 | if(!dir.exists("figs/paper_figs/fig_s4")) { 214 | dir.create("figs/paper_figs/fig_s4") 215 | } 216 | if(!dir.exists("figs/paper_figs/fig_s4/fig_s4g")) { 217 | dir.create("figs/paper_figs/fig_s4/fig_s4g") 218 | } 219 | 220 | mct = scdb_mctnetwork("sing_emb_wt10") 221 | mc = scdb_mc("sing_emb_wt10_recolored") 222 | confu = mctnetwork_get_flow_mat(mct, -2) 223 | diag(confu) = 0 224 | colnames(confu)[1] = "-1" 225 | 226 | mat = scdb_mat("sing_emb_wt10") 227 | mc_ag = table(mc@mc, mat@cell_metadata[names(mc@mc), "age_group"]) 228 | mc_t = apply(mc_ag,1, function(x) sum((1:13)*x)/sum(x)) 229 | 230 | 231 | # blood vs amnion chorion rostral mesoderm 232 | 233 | genes = c("T","Snai1","Foxc2","Cyp26a1","Frzb","Sp5","Igf2","Msx1","Smarcd3","Myl7") 234 | 235 | incl_colors = c("#DABE99","#C594BF","#FBBE92","#DFCDE4","#cc7818","#8DB5CE","#c9a997","#8870ad","#53f1fc","#1a3f52") 236 | 237 | for(gene in genes) { 238 | 239 | plot_time_gene_mc_flow(confu = confu,mc = mc,mc_t = mc_t,gene = gene,incl_colors = incl_colors,fig_dir = "figs/paper_figs/fig_s4/fig_s4g",max_t = 13,fig_pref = "nascent_mesoderm",plot_pdf = T) 240 | } 241 | 242 | 243 | 244 | } 245 | 246 | 247 | 248 | if(0) { 249 | 250 | # this part needs all the replica mctnetwork objects from the statistical stability analysis of 251 | # the network flows. 252 | 253 | generate_replica_retention_time_matrix = function() { 254 | 255 | scdb_init("scrna_db/",force_reinit = T) 256 | mat = scdb_mat("sing_emb_wt10") 257 | mc = scdb_mc("sing_emb_wt10_recolored") 258 | 259 | scdb_init("scrna_db/bootstrap/",force_reinit = T) 260 | 261 | included_cell_types = c("#65A83E","#635547","#DABE99","#C594BF","#DFCDE4","#c19f70","#F397C0","#c9a997","#C72228") 262 | 263 | col_to_rank = c(1:nrow(mc@color_key)) 264 | names(col_to_rank) = mc@color_key$color 265 | col_to_ct = mc@color_key$group 266 | names(col_to_ct) = mc@color_key$color 267 | 268 | age_group_time = unique(mat@cell_metadata[names(mc@mc),c("transcriptional_rank","developmental_time","age_group")]) 269 | dev_time = round(tapply(age_group_time$developmental_time,age_group_time$age_group,mean),2) 270 | 271 | mct_id = "sing_emb_wt10" 272 | mct = scdb_mctnetwork(mct_id) 273 | 274 | ret_time_replica = c() 275 | 276 | 277 | for (i in 1:153) { 278 | 279 | #load(sprintf("data/paper_data/bootstrap/color_transitions/col_trans_per_t_%d.Rda",i)) 280 | 281 | mct_bs_id = paste0("sing_emb_wt10_",i) 282 | mct_bs = scdb_mctnetwork(mct_bs_id) 283 | 284 | 285 | type_flows = mctnetwork_get_type_flows(mct_bs,1,13) 286 | 287 | in_flows = sapply(type_flows,FUN = function(flow_mat) { 288 | 289 | diag(flow_mat) = 0 290 | in_flow = colSums(flow_mat) 291 | 292 | return(in_flow) 293 | }) 294 | 295 | in_flows = cbind(rowSums(type_flows[[1]]),in_flows) 296 | tot_in_flow = rowSums(in_flows) 297 | 298 | type_ag = sapply(type_flows,FUN = function(flow_mat) { 299 | 300 | return(colSums(flow_mat)) 301 | }) 302 | type_ag = cbind(rowSums(type_flows[[1]]),type_ag) 303 | type_ag_n = type_ag/tot_in_flow 304 | 305 | diff_time = diff(dev_time) 306 | 307 | av_ret_time = 0.5*(diff_time[2:length(diff_time)] + diff_time[1:(length(diff_time) - 1)]) 308 | av_ret_time = c(diff_time[1]/2,av_ret_time,diff_time[length(diff_time)]/2) 309 | 310 | ret_time_type = (type_ag_n %*% av_ret_time)[,1] 311 | 312 | col_ord = order(col_to_rank[names(ret_time_type)]) 313 | ret_time_type = ret_time_type[col_ord] 314 | 315 | ret_time_replica = rbind(ret_time_replica,ret_time_type) 316 | 317 | } 318 | 319 | save(ret_time_replica,file = "data/fig_s4/retention_time_replica.Rda") 320 | 321 | scdb_init("scrna_db/",force_reinit = T) 322 | 323 | 324 | } 325 | 326 | } 327 | -------------------------------------------------------------------------------- /scripts/generate_paper_figures/fig_s5.r: -------------------------------------------------------------------------------- 1 | library("metacell") 2 | scdb_init("scrna_db/",force_reinit = T) 3 | 4 | # permutation test for 2 tfs 5 | 6 | 7 | # for every feat gene compute the two most predictive TFs 8 | # among them, select the one that has the highest single predictive power and fix it 9 | # next randomize the rows of tfs * metacell matrix excluding the selected TFS 10 | # find best predictive pair from randomized data. 11 | # repeat a 500 times 12 | # create boxplot of randomized R squared vs real R squared 13 | 14 | gen_fig_s5_plots = function() { 15 | 16 | if(!dir.exists("figs/paper_figs/fig_s5")) { 17 | dir.create("figs/paper_figs/fig_s5") 18 | } 19 | 20 | fig_s5a() 21 | # If the following files 22 | # 23 | # r_sq_permut_test_tmp_1.Rda 24 | # r_sq_permut_test_tmp_2.Rda 25 | # r_sq_permut_test_tmp_3.Rda 26 | # 27 | # are not available in the subfolder data/fig5 please rerun 28 | # permutation_test_two_tf_pred() with the input values 1,2,3: 29 | # 30 | # permutation_test_two_tf_pred(1) 31 | # permutation_test_two_tf_pred(2) 32 | # permutation_test_two_tf_pred(3) 33 | 34 | fig_s5b() 35 | fig_s5c() 36 | } 37 | 38 | fig_s5b = function() { 39 | 40 | best_predict = read.table(file = "data/fig5/best_pred_2tfs.txt",sep = "\t",stringsAsFactors = F) 41 | best_predict_1tf = read.table(file = "data/fig5/best_pred_1_tf.txt",sep = "\t",stringsAsFactors = F) 42 | colnames(best_predict_1tf) = c("targ","best_tf_1tf","rsq_one_tf") 43 | best_predict = left_join(best_predict,best_predict_1tf, by = "targ") 44 | 45 | load("data/fig5/r_sq_permut_test_tmp_1.Rda") 46 | r_sq_ls_all = list() 47 | r_sq_ls_all[c(1:70)] = r_sq_ls[1:70] 48 | load("data/fig5/r_sq_permut_test_tmp_2.Rda") 49 | r_sq_ls_all[c(71:140)] = r_sq_ls[71:140] 50 | load("data/fig5/r_sq_permut_test_tmp_3.Rda") 51 | r_sq_ls_all[c(141:nrow(best_predict))] = r_sq_ls[141:nrow(best_predict)] 52 | 53 | names(r_sq_ls_all) = best_predict$targ 54 | 55 | delta_r_sq = best_predict$m_rsq - best_predict$rsq_one_tf 56 | names(delta_r_sq) = best_predict$targ 57 | delta_r_sq = delta_r_sq[order(delta_r_sq,decreasing = T)] 58 | r_sq_ls_all = r_sq_ls_all[names(delta_r_sq)] 59 | 60 | 61 | names(r_sq_ls_all) = substr(names(r_sq_ls_all),1,10) 62 | 63 | 64 | ylim_box = c(0,max(unlist(r_sq_ls_all),delta_r_sq)) 65 | n_min = 1 66 | n_max = 100 67 | pdf("figs/paper_figs/fig_s5/fig_s5b_top100.pdf",useDingbats = F,w = 20, h = 5) 68 | par(mar = c(6,6,4,4)) 69 | boxplot(r_sq_ls_all[n_min:n_max],ylim = ylim_box,las = 2,ylab = expression(paste(Delta,paste(" R"^"2"))),main = expression(paste("Difference in ",paste("R"^"2")," values using 2 TFs vs 1 TF in linear model regression"))) 70 | points(delta_r_sq[n_min:n_max],col = "red",pch = 19) 71 | dev.off() 72 | 73 | n_min = 101 74 | n_max = nrow(best_predict) 75 | pdf("figs/paper_figs/fig_s5/fig_s5b_top200.pdf",useDingbats = F,w = 20, h = 5) 76 | par(mar = c(6,6,4,4)) 77 | boxplot(r_sq_ls_all[n_min:n_max],ylim = ylim_box,las = 2,ylab = expression(paste(Delta,paste(" R"^"2")))) 78 | points(delta_r_sq[n_min:n_max],col = "red",pch = 19) 79 | dev.off() 80 | 81 | 82 | 83 | 84 | } 85 | 86 | 87 | permutation_test_two_tf_pred = function(n_ind) { 88 | 89 | 90 | 91 | best_fit_1_tf_1_fixed = function(targ,tf_list,tf_fixed,legc,legc_shuff,best_one_tf) { 92 | r_sq = 0 93 | best_fit = c(0,0,0) 94 | tf_list_f = setdiff(tf_list,tf_fixed) 95 | for(i in 1:length(tf_list_f)) { 96 | tf1 = tf_list[i] 97 | m = summary(lm(legc[targ,] ~ legc_shuff[tf1,] + legc[tf_fixed,])) 98 | if(m$r.squared > r_sq) { 99 | best_fit = c(targ,tf1, m$r.squared) 100 | r_sq = m$r.squared 101 | } 102 | m2 = summary(lm(legc[targ,] ~ legc_shuff[tf1,] + legc[best_one_tf,])) 103 | if(m2$r.squared > r_sq) { 104 | best_fit = c(targ,tf1, m2$r.squared) 105 | r_sq = m2$r.squared 106 | } 107 | } 108 | return(r_sq) 109 | } 110 | 111 | mc = scdb_mc("sing_emb_wt10_recolored") 112 | col_to_ct = mc@color_key$group 113 | names(col_to_ct) = mc@color_key$color 114 | 115 | included_cell_types = c("Early nascent mesoderm","Late nascent mesoderm","Caudal mesoderm", 116 | "Paraxial mesoderm","Cardiac mesoderm","Rostral mesoderm","Lateral & intermediate mesoderm","ExE mesoderm", 117 | "Allantois","Haematoendothelial progenitors","Amnion/Chorion") 118 | 119 | mc_f = which(col_to_ct[mc@colors] %in% included_cell_types) 120 | 121 | legc = log2(mc@e_gc[,mc_f] + 3e-5) 122 | 123 | best_predict = read.table(file = "data/fig5/best_pred_2tfs.txt",sep = "\t",stringsAsFactors = F) 124 | best_predict_1tf = read.table(file = "data/fig5/best_pred_1_tf.txt",sep = "\t",stringsAsFactors = F) 125 | colnames(best_predict_1tf) = c("targ","best_tf_1tf","rsq_one_tf") 126 | best_predict = left_join(best_predict,best_predict_1tf, by = "targ") 127 | 128 | best_predict$rsq_tf_fix = ifelse(best_predict$tf1_r_sq > best_predict$tf2_r_sq,best_predict$tf1_r_sq,best_predict$tf2_r_sq) 129 | 130 | markers = best_predict$targ 131 | 132 | tf_list = read.table("data/fig5/tf_list_mesoderm.txt",sep = "\t",stringsAsFactors = F)$x 133 | 134 | set.seed(112) 135 | r_sq_ls = list() 136 | 137 | if (n_ind == 1) { 138 | ind_ls = c(1:70) 139 | } else if (n_ind == 2) { 140 | ind_ls = c(71:140) 141 | } else { 142 | ind_ls = c(141:nrow(best_predict)) 143 | } 144 | 145 | for (i in ind_ls) { 146 | print(i) 147 | targ = best_predict$targ[i] 148 | r_sq_tf1 = best_predict$tf1_r_sq[i] 149 | r_sq_tf2 = best_predict$tf2_r_sq[i] 150 | 151 | tf_fix = best_predict$tf1[i] 152 | if(r_sq_tf2 > r_sq_tf1) { 153 | tf_fix = best_predict$tf2[i] 154 | } 155 | rsq_tf_fix = best_predict$rsq_tf_fix[i] 156 | best_one_tf = best_predict$best_tf_1tf[i] 157 | 158 | 159 | r_sq_v = sapply(c(1:100),function(n) { 160 | 161 | mc_shuff = sample(ncol(legc)) 162 | legc_shuff = legc[tf_list,mc_shuff] 163 | 164 | r_sq = best_fit_1_tf_1_fixed(targ = targ,tf_list = tf_list,tf_fixed = tf_fix,legc = legc,legc_shuff = legc_shuff,best_one_tf = best_one_tf) 165 | return(r_sq) 166 | }) 167 | 168 | r_sq_ls[[i]] = r_sq_v - best_predict$rsq_one_tf[i] 169 | 170 | } 171 | 172 | delta_r_sq = best_predict$m_rsq[ind_ls] - best_predict$rsq_one_tf[ind_ls] 173 | 174 | 175 | save(r_sq_ls,file = sprintf("data/fig5/r_sq_permut_test_tmp_%d.Rda",n_ind)) 176 | 177 | } 178 | 179 | 180 | 181 | 182 | fig_s5a = function() { 183 | 184 | tf_list = read.table("data/fig5/tf_list_mesoderm.txt",sep = "\t",stringsAsFactors = F)$x 185 | 186 | mc = scdb_mc("sing_emb_wt10_recolored") 187 | col_to_ct = mc@color_key$group 188 | names(col_to_ct) = mc@color_key$color 189 | 190 | included_cell_types = c("Early nascent mesoderm","Late nascent mesoderm","Caudal mesoderm", 191 | "Paraxial mesoderm","Cardiac mesoderm","Rostral mesoderm","Lateral & intermediate mesoderm","ExE mesoderm", 192 | "Allantois","Haematoendothelial progenitors","Amnion/Chorion") 193 | 194 | mc_f = which(col_to_ct[mc@colors] %in% included_cell_types) 195 | 196 | legc = log2(1e-5+mc@e_gc[tf_list,mc_f]) 197 | 198 | tf_max = apply(legc,1,max) - log2(1e-5) 199 | 200 | tf_max = sort(tf_max,decreasing = T) 201 | 202 | pdf("figs/paper_figs/fig_s5/fig_s5a.pdf",useDingbats = F,width = 12,h = 4.5) 203 | barplot(tf_max,las = 2,yaxt = 'n',ylim = c(0,-8-log2(1e-5))) 204 | axis(side = 2,labels = c(-16.6,-14,-12,-10,-8),at = c(-16.6,-14,-12,-10,-8) - log2(1e-5)) 205 | dev.off() 206 | 207 | } 208 | 209 | fig_s5c = function(plot_pdf = T) { 210 | 211 | fig_dir = "figs/paper_figs/fig_s5/fig_s5c" 212 | if(!dir.exists(fig_dir)) { 213 | dir.create(fig_dir) 214 | } 215 | 216 | mc = scdb_mc("sing_emb_wt10_recolored") 217 | col_to_ct = mc@color_key$group 218 | names(col_to_ct) = mc@color_key$color 219 | 220 | included_cell_types = c("Early nascent mesoderm","Late nascent mesoderm","Caudal mesoderm", 221 | "Paraxial mesoderm","Cardiac mesoderm","Rostral mesoderm","Lateral & intermediate mesoderm","ExE mesoderm", 222 | "Allantois","Haematoendothelial progenitors","Amnion/Chorion") 223 | 224 | mc_f = which(col_to_ct[mc@colors] %in% included_cell_types) 225 | 226 | legc = log2(mc@e_gc[,mc_f] + 3e-5) 227 | 228 | # these external tables are generated by the function predict_variable_genes_lm() 229 | best_pred_1_tf = read.table(file = "data/fig5/best_pred_1_tf.txt",sep = "\t",stringsAsFactors = F) 230 | best_pred_2_tfs = read.table(file = "data/fig5/best_pred_2tfs.txt",sep = "\t",stringsAsFactors = F) 231 | 232 | rownames(best_pred_1_tf) = best_pred_1_tf$targ 233 | rownames(best_pred_2_tfs) = best_pred_2_tfs$targ 234 | 235 | target_genes = c("Col23a1","Dll3","Hsd11b2","Tmem88","Vim") 236 | 237 | for (targ in target_genes) { 238 | 239 | tf1 = best_pred_2_tfs[targ,"tf1"] 240 | tf2= best_pred_2_tfs[targ,"tf2"] 241 | 242 | if(plot_pdf) { 243 | pdf(sprintf("%s/%s_vs_%s.pdf",fig_dir,targ,tf1)) 244 | } else { 245 | png(sprintf("%s/%s_vs_%s.png",fig_dir,targ,tf1),w = 500,h = 500) 246 | } 247 | par(mar = c(6,6,2,4)) 248 | plot(x = legc[tf1,],y = legc[targ,],pch = 19,cex = 4,col = mc@colors[mc_f],xlab = tf1,ylab = targ,cex.lab = 2) 249 | dev.off() 250 | 251 | if (plot_pdf) { 252 | pdf(sprintf("%s/%s_vs_%s.pdf",fig_dir,targ,tf2)) 253 | } else { 254 | png(sprintf("%s/%s_vs_%s.png",fig_dir,targ,tf2),w = 500,h = 500) 255 | } 256 | par(mar = c(6,6,2,4)) 257 | plot(x = legc[tf2,],y = legc[targ,],pch = 19,cex = 4,col = mc@colors[mc_f],xlab = tf2,ylab = targ,cex.lab = 2) 258 | dev.off() 259 | 260 | out = lm(legc[targ,] ~ legc[tf1,]+legc[tf2,]) 261 | 262 | if(plot_pdf) { 263 | pdf(sprintf("%s/%s_vs_%s_and_%s.pdf",fig_dir,targ,tf1,tf2)) 264 | } else { 265 | png(sprintf("%s/%s_vs_%s_and_%s.png",fig_dir,targ,tf1,tf2),w = 500,h = 500) 266 | } 267 | par(mar = c(6,6,2,4)) 268 | plot(x = out$fitted.values,y = legc[targ,],pch = 19,cex = 4,col = mc@colors[mc_f],xlab = sprintf("Prediction from %s and %s",tf1,tf2),ylab = targ,cex.lab = 2) 269 | dev.off() 270 | 271 | } 272 | 273 | 274 | } 275 | 276 | -------------------------------------------------------------------------------- /scripts/generate_paper_figures/fig_s6.r: -------------------------------------------------------------------------------- 1 | library("Matrix") 2 | source("scripts/generate_paper_figures/plot_network.r") 3 | source("scripts/foxc12/preprocessing/foxc_control_gating.r") 4 | 5 | gen_fig_s6_plots = function() { 6 | 7 | if(!dir.exists("figs/paper_figs/fig_s6")) { 8 | dir.create("figs/paper_figs/fig_s6") 9 | } 10 | 11 | fig_s6a() 12 | fig_s6b() 13 | fig_s6c() 14 | } 15 | 16 | fig_s6b = function() { 17 | 18 | fig_dir = "figs/paper_figs/fig_s6/fig_s6b" 19 | if(!dir.exists(fig_dir)) { 20 | dir.create(fig_dir) 21 | } 22 | 23 | # from scripts/foxc12/foxc_control_gating 24 | # this script was used to gate the single cells and assign each cell to one of 25 | # the groups KO,control,host,wt or unclear 26 | # this information was added to the metadata entry of the metacell mat object. 27 | # See end of the function gating_of_foxc_control() 28 | df_gating = gating_of_foxc_control(fig_dir) 29 | 30 | } 31 | 32 | fig_s6a = function() { 33 | 34 | mct_id = "sing_emb_wt10" 35 | mat_id = "sing_emb_wt10" 36 | genes = c("Foxc1","Foxc2") 37 | mc = scdb_mc("sing_emb_wt10_recolored") 38 | mat = scdb_mat("sing_emb_wt10") 39 | mat_n = t(t(mat@mat[genes,names(mc@mc)])/colSums(mat@mat[,names(mc@mc)])) 40 | colors_ordered = mc@color_key$color 41 | 42 | cls_spl = split(names(mc@mc),mat@cell_metadata[names(mc@mc),"age_group"]) 43 | 44 | reg = 3e-5 45 | 46 | legc_ls = lapply(genes,function(gene) { 47 | e_gc_mat = sapply(cls_spl,function(cls) { 48 | e_gc_tmp = tapply(mat_n[gene,cls],mc@mc[cls],mean) 49 | e_gc_t = rep(0,ncol(mc@e_gc)) 50 | e_gc_t[as.numeric(names(e_gc_tmp))] = e_gc_tmp 51 | return(e_gc_t) 52 | }) 53 | legc = log2(reg + e_gc_mat) 54 | return(legc) 55 | }) 56 | 57 | 58 | mm_mctnetwork_plot_net(mct_id = mct_id,fn = "figs/paper_figs/fig_s6/fig_s6a_foxc1_over_flows.png",colors_ordered = colors_ordered,mc_t_score = legc_ls[[1]],dy_ext = 0,dx_back = 0,w = 2200,h = 1200) 59 | mm_mctnetwork_plot_net(mct_id = mct_id,fn = "figs/paper_figs/fig_s6/fig_s6a_foxc2_over_flows.png",colors_ordered = colors_ordered,mc_t_score = legc_ls[[2]],dy_ext = 0,dx_back = 0,w = 2200,h = 1200) 60 | } 61 | 62 | fig_s6a_color_scale = function() { 63 | 64 | cols = colorRampPalette(c("lightgray", "gray", "darkgray", "lightpink", "pink", "red", "darkred"))(101) 65 | vals = seq(0,1,length.out = 101) 66 | show_vals_ind = c(1,51,101) 67 | 68 | pdf(file = "figs/paper_figs/fig_s6/fig_s6a_color_scale.pdf",useDingbats = F) 69 | plot.new() 70 | plot.window(xlim=c(0,100), ylim=c(0, length(cols) + 3)) 71 | rect(7, 1:length(cols), 17, 1:length(cols) + 1, border=NA, col=cols) 72 | rect(7, 1, 17, length(cols)+1, col=NA, border = 'black') 73 | 74 | text(19, (1:length(cols))[show_vals_ind] + 0.5, labels=vals[show_vals_ind], pos=4) 75 | dev.off() 76 | 77 | } 78 | 79 | 80 | 81 | fig_s6c = function(plot_pdf = T) { 82 | 83 | fig_dir = "figs/paper_figs/fig_s6/fig_s6c" 84 | if(!dir.exists(fig_dir)) { 85 | dir.create(fig_dir) 86 | } 87 | 88 | genes_mm9 = read.table("data/external_data/gene_intervals_mm9.txt",sep = "\t",stringsAsFactors = F,header = T) 89 | mat= scdb_mat("foxc_chim_wt10") 90 | mc_wt = scdb_mc("sing_emb_wt10_recolored") 91 | cgraph = scdb_cgraph("foxc_chim_wt10") 92 | col_to_ct = mc_wt@color_key$group 93 | names(col_to_ct) = mc_wt@color_key$color 94 | col_to_ct = c(col_to_ct,"Mixed") 95 | names(col_to_ct)[length(col_to_ct)] = "gray" 96 | included_colors = setdiff(unique(mc_wt@colors),c("#F6BFCB","#7F6874")) 97 | 98 | df_chim = read.table("data/chimera_tetraploid_analysis/foxc_chim_wt10/time_match/time_match_summary.txt",h= T,sep = "\t",stringsAsFactors = F) 99 | chim_embryos = df_chim$embryo 100 | rownames(df_chim) = df_chim$embryo 101 | chim_embryos = chim_embryos[order(df_chim[chim_embryos,"best_rank_host"])] 102 | ref_ranks = c() 103 | for (i in 1:length(chim_embryos)) { 104 | host_match = round(df_chim[chim_embryos[i],"best_rank_host"]) 105 | ref_ranks = c(ref_ranks,c((host_match-2):(host_match+2))) 106 | } 107 | ref_ranks = unique(ref_ranks) 108 | 109 | load("data/chimera_tetraploid_analysis/foxc_chim_wt10/color_annotation/cmp_annot.Rda") 110 | query_cls_col = cmp_annot$query_cls_col 111 | 112 | all_cells = colnames(mat@mat) 113 | 114 | host_cls = all_cells[(mat@cell_metadata[all_cells,"cell_type"] == "host") & (mat@cell_metadata[all_cells,"embryo"] %in% chim_embryos)] 115 | host_cls = host_cls[query_cls_col[host_cls] %in% included_colors] 116 | 117 | fox_ko_cls = all_cells[mat@cell_metadata[all_cells,"cell_type"] == "KO" & (mat@cell_metadata[all_cells,"embryo"] %in% chim_embryos)] 118 | fox_ko_cls = fox_ko_cls[query_cls_col[fox_ko_cls] %in% included_colors] 119 | 120 | wt10_cls = names(mc_wt@mc)[(mat@cell_metadata[names(mc_wt@mc),"transcriptional_rank"] %in% ref_ranks) & mc_wt@colors[mc_wt@mc] %in% included_colors] 121 | wt10_cls = intersect(wt10_cls,colnames(mat@mat)) 122 | 123 | egc_ko = rowSums(mat@mat[,fox_ko_cls])/sum(mat@mat[,fox_ko_cls]) 124 | egc_host = rowSums(mat@mat[,host_cls])/sum(mat@mat[,host_cls]) 125 | egc_ref = rowSums(mat@mat[,wt10_cls])/sum(mat@mat[,wt10_cls]) 126 | 127 | 128 | lfp_ko_host = (egc_ko + 1e-4)/(egc_host + 1e-4) 129 | lfp_ko_ref = (egc_ko + 1e-4)/(egc_ref + 1e-4) 130 | lfp_host_ref = (egc_host + 1e-4)/(egc_ref + 1e-4) 131 | lfp_ko_host = (egc_ko + 1e-5)/(egc_host + 1e-5) 132 | lfp_ko_ref = (egc_ko + 1e-5)/(egc_ref + 1e-5) 133 | lfp_host_ref = (egc_host + 1e-5)/(egc_ref + 1e-5) 134 | 135 | included_chromosomes = grep("chr",unique(genes_mm9$chrom),v = T) 136 | included_chromosomes = setdiff(included_chromosomes,"chrM") 137 | 138 | mm9_f = genes_mm9[genes_mm9$chrom %in% included_chromosomes,] 139 | 140 | gene_to_chr = unique(mm9_f[,c("chrom","gene_name")]) 141 | 142 | gene_freq = table(gene_to_chr$gene_name) 143 | genes_f = names(gene_freq)[gene_freq == 1] 144 | 145 | mm9_f = mm9_f[mm9_f$gene_name %in% genes_f,] 146 | 147 | gene_pos = tapply(mm9_f$start,mm9_f$gene_name,min) 148 | 149 | gene_to_chr = gene_to_chr[gene_to_chr$gene_name %in% genes_f,] 150 | 151 | gene_ls = split(gene_to_chr$gene_name,gene_to_chr$chrom) 152 | chromosomes = paste0("chr",as.character(c(c(1:19),c("X","Y")))) 153 | gene_ls =gene_ls[chromosomes] 154 | 155 | ymin = 0 156 | ymax = 2.5 157 | abline_h = 1 158 | if(plot_pdf) { 159 | h_all = 10.5 160 | h_one = 0.45 161 | h_first = 0.8 162 | w_all = 15 163 | } else { 164 | 165 | h_all = 1050 166 | h_one = 45 167 | h_first = 80 168 | w_all = 1500 169 | } 170 | 171 | if(plot_pdf) { 172 | pdf(sprintf("%s/karyogramm_ko_vs_host.pdf",fig_dir),h = h_all, w = w_all,useDingbats = F) 173 | } else { 174 | png(sprintf("%s/karyogramm_ko_vs_host.png",fig_dir),h = h_all, w = w_all) 175 | } 176 | 177 | layout(mat = matrix(c(1:21),nrow = 21,ncol = 1),heights = c(h_first,rep(h_one,20))) 178 | par(mar= c(0.5,6,4,4)) 179 | for(chrom in chromosomes) { 180 | genes = gene_ls[[chrom]] 181 | if(chrom == "chr1") { 182 | main_plot = "Chimera KO vs host" 183 | } else { 184 | main_plot = "" 185 | } 186 | plot(x = gene_pos[genes],y = pmax(pmin(lfp_ko_host[genes],ymax),ymin),pch =19,ylim = c(ymin,ymax),xlab = "",ylab = chrom,cex.lab = 2,main = main_plot, 187 | cex.main = 4) 188 | abline(h = abline_h) 189 | par(mar= c(0.5,6,0.5,4)) 190 | } 191 | dev.off() 192 | 193 | for (chrom in c("chr1","chr7","chr8")) { 194 | genes = gene_ls[[chrom]] 195 | if(plot_pdf) { 196 | pdf(sprintf("%s/%s.pdf",fig_dir,chrom), w = 10,h = 3) 197 | } else { 198 | png(sprintf("%s/%s.png",fig_dir,chrom), w = 1000,h = 300) 199 | } 200 | 201 | main_plot = "" 202 | par(mar = c(4,6,2,2)) 203 | plot(x = gene_pos[genes],y = pmax(pmin(lfp_ko_host[genes],ymax),ymin),pch =19,ylim = c(ymin,ymax),xlab = "",ylab = chrom,cex.lab = 2,main = main_plot, 204 | cex.main = 4) 205 | abline(h = abline_h) 206 | dev.off() 207 | 208 | } 209 | 210 | } -------------------------------------------------------------------------------- /scripts/generate_paper_figures/generate_all_figures.r: -------------------------------------------------------------------------------- 1 | # generate all figures 2 | 3 | generate_all_figures = function() { 4 | 5 | gen_fig_1_plots() 6 | message("generated Figure 1 plots") 7 | gen_fig_2_plots() 8 | message("generated Figure 2 plots") 9 | gen_fig_3_plots() 10 | message("generated Figure 3 plots") 11 | gen_fig_4_plots() 12 | message("generated Figure 4 plots") 13 | gen_fig_5_plots() 14 | message("generated Figure 5 plots") 15 | gen_fig_6_plots() 16 | message("generated Figure 6 plots") 17 | gen_fig_7_plots() 18 | message("generated Figure 7 plots") 19 | gen_fig_s1_plots() 20 | message("generated Figure S1 plots") 21 | gen_fig_s2_plots() 22 | message("generated Figure S2 plots") 23 | gen_fig_s3_plots() 24 | message("generated Figure S3 plots") 25 | gen_fig_s4_plots() 26 | message("generated Figure S4 plots") 27 | gen_fig_s5_plots() 28 | message("generated Figure S5 plots") 29 | gen_fig_s6_plots() 30 | message("generated Figure S6 plots") 31 | gen_fig_s7_plots() 32 | message("generated Figure S7 plots") 33 | 34 | } -------------------------------------------------------------------------------- /scripts/generate_paper_figures/plot_3d_vein.r: -------------------------------------------------------------------------------- 1 | add_alpha = function(col, alpha) { 2 | return(rgb(t(col2rgb(col))/256, alpha=alpha)) 3 | } 4 | 5 | persp_polygon = function(x, y, col, z, k_persp=0.5) 6 | { 7 | polygon(y, 14-(x+z*k_persp), col=col, border=NA) 8 | } 9 | 10 | plot_persp_levels = function(smoo_y1, smoo_y2, col_persp, cols, lims, k_persp=0.5, level_step=20) 11 | { 12 | max_xi = length(smoo_y1[[1]]) 13 | max_y = lims[2] 14 | min_y = lims[1] 15 | yrange = floor((max_y-min_y)*100) 16 | zs = matrix(ncol = yrange, nrow=max_xi,0) 17 | for(col in cols) { 18 | z = col_persp[col] 19 | y1 = smoo_y1[[col]] 20 | y2 = smoo_y2[[col]] 21 | if(sum(y1)>sum(y2)) { 22 | y2 = smoo_y1[[col]] 23 | y1 = smoo_y2[[col]] 24 | } 25 | 26 | idx = apply(cbind(1:max_xi,(y1-min_y)*100,(y2-min_y)*100), 1, function(y) { 27 | xi = y[1] 28 | y1bin = max(0,floor(y[2])-1) 29 | y2bin = floor(y[3])-1 30 | if((y2bin-y1bin)<2) { 31 | return(c()) 32 | } else { 33 | return(xi + max_xi*((y1bin-2):(y2bin+2))) 34 | } 35 | }) 36 | zs[unlist(idx)] = z 37 | } 38 | ys = seq(min_y, max_y, l=yrange) 39 | shades = colorRampPalette(c("cornsilk3", "darkgoldenrod2", "darkgoldenrod4", "black"))(200) 40 | for(xi in seq(1,max_xi,level_step)) { 41 | z = zs[xi,] 42 | z_lo = loess(z ~ ys, span=0.03)$fitted 43 | lev_cols = shades[pmax(pmin(floor(z_lo*100)+1,200),1)] 44 | sx = ys 45 | sy = 13-xi/100-z_lo*k_persp 46 | n = length(sx) 47 | segments(sx[-n],sy[-n],sx[-1],sy[-1], lty=1, lwd=0.3, col=lev_cols[-1]) 48 | } 49 | 50 | } 51 | 52 | draw_sig_edge_3d = function(x1, x2, x2t, y1, y2, y2t, flow, col1, col2, col_alpha=0.8, clip_top, clip_bot, z1, z2) 53 | { 54 | x1 = x1 55 | y1 = y1 56 | dx = x2t - x1 57 | dy = y2t - y1 58 | 59 | y1t = y1+flow 60 | dxt = x2 - x1 61 | dyt = y2 - y1t 62 | 63 | dz = z2-z1 64 | 65 | col1 = col2rgb(col1)[,1] 66 | names(col1) = c("red","green","blue") 67 | col2 = col2rgb(col2)[,1] 68 | names(col2) = c("red","green","blue") 69 | res = 0.01 70 | # message("segs ", "x1 ", x1, " y1 ", y1, " dx ", dx, " dy ", dy, " y1t ", y1t, " dxt ", dxt, " dyt ", dyt) 71 | beta0 = plogis(0,loc=0.5,scale=0.2) 72 | beta_f = plogis(1,loc=0.5,scale=0.2)-plogis(0,loc=0.5, scale=0.2) 73 | for(r in seq(0,1,res)) { 74 | beta = (plogis(r,loc=0.5,scale=0.2)-beta0)/beta_f 75 | beta5 = (plogis(r+res,loc=0.5,scale=0.2)-beta0)/beta_f 76 | 77 | sx1 = x1+r*dx 78 | sy1 = y1+beta*dy 79 | sx2 = x1+(r+res)*dx 80 | sy2 = y1+beta5*dy 81 | 82 | sx1t = x1+r*dxt 83 | sy1t = y1t+beta*dyt 84 | sx2t = x1+(r+res)*dxt 85 | sy2t = y1t+beta5*dyt 86 | 87 | sz = z1+r*dz 88 | sz2 = z1+(r+res)*dz 89 | 90 | r_col = r 91 | 92 | sy1 = pmin(pmax(sy1, clip_bot[as.character(round(sx1,2))]), 93 | clip_top[as.character(round(sx1,2))]) 94 | sy2 = pmin(pmax(sy2, clip_bot[as.character(round(sx2,2))]), 95 | clip_top[as.character(round(sx2,2))]) 96 | sy1t = pmin(pmax(sy1t, clip_bot[as.character(round(sx1t,2))]), 97 | clip_top[as.character(round(sx1t,2))]) 98 | sy2t = pmin(pmax(sy2t, clip_bot[as.character(round(sx2t,2))]), 99 | clip_top[as.character(round(sx2t,2))]) 100 | rgb_r = col2["red"]*r_col+col1["red"]*(1-r_col) 101 | rgb_g = col2["green"]*r_col+col1["green"]*(1-r_col) 102 | rgb_b = col2["blue"]*r_col+col1["blue"]*(1-r_col) 103 | col = rgb(rgb_r/256, rgb_g/256, rgb_b/256, col_alpha) 104 | persp_polygon(c(sx1, sx2, sx2t,sx1t), 105 | c(sy1, sy2, sy2t, sy1t), 106 | col=col, 107 | z = c(sz,sz2,sz2,sz)) 108 | } 109 | # segments(x1,y1,x2,y2, 110 | } 111 | 112 | plot_all_veins = function(ordered_cols,fig_dir="figs", plot_pdf = F,first_col="#635547", xlim=c(-4,12), col_persp=NULL, add_levels=T, fn="all") { 113 | 114 | mc = scdb_mc("sing_emb_wt10_recolored") 115 | 116 | mat = scdb_mat("sing_emb_wt10") 117 | 118 | md = mat@cell_metadata 119 | 120 | type_ag= table(mc@colors[mc@mc], md[names(mc@mc),"age_group"]) 121 | 122 | type_agn = t(t(type_ag)/colSums(type_ag)) 123 | 124 | mct = scdb_mctnetwork("sing_emb_wt10") 125 | type_flow = mctnetwork_get_type_flows(mct, 1,13) 126 | 127 | key = mc@color_key 128 | rownames(key) = key$group 129 | 130 | t1 = 1 131 | t2 = 12 132 | T_minflow_for_type = 0.005 133 | T_minflow = 6e-3 134 | k_space_z = 0.2 135 | 136 | cols = ordered_cols 137 | 138 | center_i = which(cols == first_col) 139 | 140 | top_front = rep(0,1+(t2-t1)*100) 141 | 142 | foc_agn = type_agn[cols,t1:t2] 143 | foc_agn[foc_agn < 1e-3] = 0 144 | 145 | smoo_y1 = list() 146 | smoo_y2 = list() 147 | x = t1:t2 148 | if(plot_pdf) { 149 | pdf(sprintf("%s/%s.pdf",fig_dir, fn),w=20,h=12,useDingbats = F) 150 | } else { 151 | png(sprintf("%s/%s.png",fig_dir, fn),w=3000,h=1600,bg = "transparent") 152 | } 153 | 154 | plot(0, ylim=c(t1 - 2,t2 + 2), xlim=xlim) 155 | 156 | y_expand = seq(1,2,l=1+(t2-t1)*100)**1 157 | ry_expand = rev(y_expand) 158 | prev_z = 0 159 | for(i in center_i:length(cols)) { 160 | cur_col = cols[i] 161 | y = foc_agn[cur_col,t1:t2] 162 | ys = approx(x,y, seq(t1,t2,l=1+(t2-t1)*100)) 163 | lo = loess(ys$y ~ ys$x,span=0.3)$fitted 164 | 165 | calpha = add_alpha(cur_col,0.8) 166 | if(i == center_i) { 167 | smoo_y2[[cur_col]] = (lo+top_front)*y_expand 168 | smoo_y1[[cur_col]] = (-lo+top_front)*y_expand 169 | names(smoo_y1[[cur_col]]) = ys$x 170 | names(smoo_y2[[cur_col]]) = ys$x 171 | persp_polygon(c(ys$x,rev(ys$x)), 172 | c((lo+top_front)*y_expand, (rev(top_front)+rev(-lo))*ry_expand), 173 | col = calpha, 174 | z = rep(col_persp[cur_col],length(top_front)*2)) 175 | } else { 176 | smoo_y2[[cur_col]] = (2*lo+top_front)*y_expand 177 | smoo_y1[[cur_col]] = top_front*y_expand 178 | names(smoo_y1[[cur_col]]) = ys$x 179 | names(smoo_y2[[cur_col]]) = ys$x 180 | persp_polygon(c(ys$x,rev(ys$x)), 181 | c((2*lo+top_front)*y_expand, rev(top_front)*ry_expand), 182 | col = calpha, 183 | z = rep(col_persp[cur_col], length(top_front)*2)) 184 | } 185 | 186 | dz = max(0,col_persp[cur_col] - prev_z) 187 | prev_z = col_persp[cur_col] 188 | top_front = top_front + dz*k_space_z + loess(ifelse(lo>0, 0.05+0.5*max(lo)+2*lo, 0) ~ ys$x, span=0.7)$fitted 189 | if(i == center_i) { 190 | top_front = top_front / 2 191 | bot_front = -top_front 192 | } 193 | } 194 | 195 | prev_z = 0 196 | if(center_i != 1) { 197 | for(i in (center_i-1):1) { 198 | cur_col = cols[i] 199 | y = foc_agn[cur_col,t1:t2] 200 | ys = approx(x,y, seq(t1,t2,l=1+(t2-t1)*100)) 201 | lo = loess(ys$y ~ ys$x,span=0.3)$fitted 202 | 203 | calpha = add_alpha(cur_col,0.8) 204 | smoo_y2[[cur_col]] = bot_front*y_expand 205 | smoo_y1[[cur_col]] = (-2*lo+bot_front)*y_expand 206 | names(smoo_y1[[cur_col]]) = ys$x 207 | names(smoo_y2[[cur_col]]) = ys$x 208 | #message("cur_col ", cur_col, " persp " , col_persp[cur_col]) 209 | persp_polygon(c(ys$x,rev(ys$x)), 210 | c(bot_front*y_expand, (rev(bot_front)+rev(-2*lo))*ry_expand), 211 | col = calpha, 212 | z = rep(col_persp[cur_col], length(top_front)*2)) 213 | 214 | dz = max(0,col_persp[cur_col] - prev_z) 215 | prev_z = col_persp[cur_col] 216 | bot_front = bot_front - dz*k_space_z - loess(ifelse(lo>0, 0.05+0.5*max(lo)+2*lo, 0) ~ ys$x, span=0.7)$fitted 217 | } 218 | } 219 | if(add_levels) { 220 | plot_persp_levels(smoo_y1, smoo_y2, col_persp, cols, xlim) 221 | } 222 | 223 | for(foc_i in 1:length(cols)) { 224 | foc_color = cols[foc_i] 225 | for(t in t1:(t2-1)) { 226 | flow = type_flow[[t]] 227 | max_i = length(cols) 228 | cum_y = smoo_y2[[foc_color]][as.character(t)] 229 | if(foc_i > 1) { 230 | for(i in 1:(foc_i-1)) { 231 | col_i = cols[i] 232 | fl = flow[foc_color,col_i]*2 233 | if(!is.null(fl) & length(fl) > 0 & fl > T_minflow) { 234 | calpha = add_alpha(col_i, 0.8) 235 | draw_sig_edge_3d(x1=t, 236 | x2 = t+1, 237 | x2t = t+1-2*fl-0.05, 238 | y1 = smoo_y1[[foc_color]][as.character(t)], 239 | y2 = smoo_y2[[col_i]][as.character(t+1)], 240 | y2t =smoo_y2[[col_i]][as.character(round(t+1-2*fl-0.05,2))], 241 | flow = fl, 242 | col1=foc_color, col2=col_i, 243 | clip_top = smoo_y1[[foc_color]], 244 | clip_bot = smoo_y2[[col_i]], 245 | z1=col_persp[foc_color], z2=col_persp[col_i]) 246 | } 247 | } 248 | } 249 | if(foc_i < max_i) { 250 | for(i in max_i:(foc_i+1)) { 251 | col_i = cols[i] 252 | fl = flow[foc_color,col_i]*2 253 | if(fl > T_minflow) { 254 | calpha = add_alpha(col_i, 0.8) 255 | draw_sig_edge_3d(x1=t, 256 | x2t = t+1, 257 | x2 = t+1-2*fl-0.05, 258 | y1 = smoo_y2[[foc_color]][as.character(t)]-fl, 259 | y2t = smoo_y1[[col_i]][as.character(t+1)], 260 | y2 = smoo_y1[[col_i]][as.character(round(t+1-2*fl-0.05,2))], 261 | flow = fl, 262 | col1=foc_color, col2=col_i, 263 | clip_bot = smoo_y2[[foc_color]], 264 | clip_top = smoo_y1[[col_i]], 265 | z1=col_persp[foc_color], z2=col_persp[col_i]) 266 | } 267 | } 268 | } 269 | #incoming 270 | } 271 | } 272 | dev.off() 273 | } 274 | 275 | 276 | -------------------------------------------------------------------------------- /scripts/generate_paper_figures/plot_network.r: -------------------------------------------------------------------------------- 1 | 2 | 3 | mm_mctnetwork_plot_net = function(mct_id, fn, 4 | mc_ord = NULL, colors_ordered = NULL, 5 | propogate=NULL, 6 | mc_t_score = NULL, 7 | edge_w_scale=5e-4, 8 | w = 2000,h = 2000, 9 | mc_cex = 0.5, 10 | dx_back = 0.15, dy_ext = 0.4, 11 | sigmoid_edge = F, grad_col_edge = F, 12 | plot_mc_ids = F,miss_color_thresh = 0.5, 13 | func_deform=NULL, 14 | plot_background_as_grey = F, 15 | bg_col = "gray90", 16 | score_shades = colorRampPalette(c("lightgray", "gray", "darkgray", "lightpink", "pink", "red", "darkred"))(1000), 17 | bg_scale = 1, 18 | fr_scale = 2, 19 | max_lwd = 10, 20 | plot_pdf = FALSE, 21 | show_over_under_flow = T, 22 | show_axes = T, 23 | bg = "white") 24 | { 25 | if(!is.null(propogate) | !is.null(mc_t_score)) { 26 | dx_back = 0 27 | dy_ext = 0 28 | } 29 | 30 | mct = scdb_mctnetwork(mct_id) 31 | if(is.null(mct)) { 32 | stop("cannot find mctnet object ", mct_id, " when trying to plot net flows") 33 | } 34 | net= mct@network 35 | mc = scdb_mc(mct@mc_id) 36 | if(is.null(mc)) { 37 | stop("cannot find mc object ", mct@mc_id, " matching the mc id in the mctnetwork object! db mismatch? recreate objects?") 38 | } 39 | if(is.null(mct@edge_flows) | sum(mct@edge_flows)==0) { 40 | stop("flows seems not to be initialized in mct id ", mct_id, " maybe rerun the mincost algorithm?") 41 | } 42 | names(mc@colors) = as.character(1:length(mc@colors)) 43 | 44 | #color_ord = read.table("config/atlas_type_order.txt", h=T, sep="\t") 45 | #order MCs by type, mean age 46 | if(is.null(mc_ord)) { 47 | if(is.null(colors_ordered)) { 48 | stop("specify either mc_ord or color ord when plotting mctnet network") 49 | } 50 | mc_rank = mctnetwork_mc_rank_from_color_ord(mct_id, colors_ordered) 51 | } else { 52 | mc_rank = rep(-1,length(mc_ord)) 53 | mc_rank[mc_ord] = c(1:length(mc_ord)) 54 | names(mc_rank) = as.character(1:length(mc_rank)) 55 | } 56 | 57 | mc_rank["-2"] = 0 58 | mc_rank["-1"] = length(mc_rank)/2 59 | 60 | #add growth mc 61 | 62 | f= net$flow > 1e-4 63 | nn = net[f,] 64 | x1 = nn$time1 65 | x2 = nn$time2 66 | y1 = as.numeric(mc_rank[as.character(nn$mc1)]) 67 | y2 = as.numeric(mc_rank[as.character(nn$mc2)]) 68 | 69 | x1 = ifelse(nn$type1 == "growth", x1 + 0.3, x1) 70 | x2 = ifelse(nn$type2 == "growth", x2 + 0.3, x2) 71 | x1 = ifelse(nn$type1 == "norm_b" | nn$type1 == "extend_b",x1-dx_back,x1) 72 | x2 = ifelse(nn$type2 == "norm_b" | nn$type2 == "extend_b",x2-dx_back,x2) 73 | y1 = ifelse(nn$type1 == "src", max(y1)/2, y1) 74 | y2 = ifelse(nn$type2 == "sink", max(y2)/2, y2) 75 | y2 = ifelse(nn$type2 == "sink", NA, y2) 76 | y1 = ifelse(nn$type1 == "growth", y2+2.5, y1) 77 | y2 = ifelse(nn$type2 == "growth", y1+2.5, y2) 78 | y1 = ifelse(nn$type1 == "extend_b" | nn$type1 == "extend_f",y1+dy_ext, y1) 79 | y2 = ifelse(nn$type2 == "extend_f" | nn$type2 == "extend_b",y2+dy_ext, y2) 80 | 81 | if(!is.null(func_deform)) { 82 | min_x = min(c(x1,x2),na.rm=T) 83 | min_y = min(c(y1,y2),na.rm=T) 84 | range_x = (max(c(x1,x2),na.rm=T)-min_x) 85 | range_y = (max(c(y1,y2),na.rm=T)-min_y) 86 | xy1 = func_deform((x1-min_x)/range_x, (y1-min_y)/range_y) 87 | xy2 = func_deform((x2-min_x)/range_x, (y2-min_y)/range_y) 88 | x1 = xy1[[1]] 89 | x2 = xy2[[1]] 90 | y1 = xy1[[2]] 91 | y2 = xy2[[2]] 92 | } 93 | 94 | nn$mc1 = ifelse(nn$type1 == "src", nn$mc2, nn$mc1) 95 | 96 | if(plot_pdf) { 97 | pdf(fn,width = w,height =h,useDingbats = F) 98 | } else { 99 | png(fn, width = w,height = h,bg = bg) 100 | } 101 | 102 | f_overflow = nn$type2=="extend_f" & nn$cost > 100 103 | f_underflow = nn$type2 == "norm_f" & nn$cost < -100 104 | # nn$flow/(1e-8 + nn$capacity) < miss_color_thresh 105 | if(is.null(propogate) & is.null(mc_t_score)) { 106 | 107 | if(show_axes) { 108 | plot(c(x1,x2), c(y1,y2), pch=19, col=mc@colors[c(nn$mc1,nn$mc2)],cex=mc_cex) 109 | } else { 110 | plot(c(x1,x2), c(y1,y2), pch=19, col=mc@colors[c(nn$mc1,nn$mc2)],cex=mc_cex,axes = F,xlab = "",ylab = "") 111 | } 112 | 113 | mc_rgb = col2rgb(mc@colors)/256 114 | f = nn$mc1>0 & nn$mc2 > 0 115 | m1 = as.numeric(nn$mc1[f]) 116 | m2 = as.numeric(nn$mc2[f]) 117 | seg_df = data.frame(x1 = x1[f], y1=y1[f], dx=x2[f]-x1[f], dy=y2[f]-y1[f], 118 | r1 = mc_rgb["red",m1], 119 | r2 = mc_rgb["red",m2], 120 | g1 = mc_rgb["green",m1], 121 | g2 = mc_rgb["green",m2], 122 | b1 = mc_rgb["blue",m1], 123 | b2 = mc_rgb["blue",m2]) 124 | 125 | for(alpha in seq(0,0.98,0.02)) { 126 | beta = alpha 127 | beta5 = alpha+0.02 128 | if(sigmoid_edge) { 129 | beta = plogis(alpha,loc=0.5,scale=0.1) 130 | beta5 = plogis(alpha+0.02,loc=0.5,scale=0.1) 131 | } 132 | sx1 = seg_df$x1+alpha*seg_df$dx 133 | sx2 = seg_df$x1+(alpha+0.02)*seg_df$dx 134 | sy1 = seg_df$y1+beta*seg_df$dy 135 | sy2 = seg_df$y1+beta5*seg_df$dy 136 | alpha_col = ifelse(grad_col_edge, alpha,0) 137 | rgb_r = seg_df$r2*alpha_col+seg_df$r1*(1-alpha_col) 138 | rgb_g = seg_df$g2*alpha_col+seg_df$g1*(1-alpha_col) 139 | rgb_b = seg_df$b2*alpha_col+seg_df$b1*(1-alpha_col) 140 | cols = rgb(rgb_r, rgb_g, rgb_b) 141 | segments(sx1, sy1, sx2, sy2, 142 | col=ifelse(nn$type2=="growth" | nn$type1=="source" | nn$type2=="sink", "gray", cols), 143 | lwd=pmin(nn$flow/edge_w_scale, max_lwd)) 144 | } 145 | # segments(x1,y1,x2,y2, 146 | # col=ifelse(nn$type2=="growth", "black", mc@colors[nn$mc1]), 147 | # lwd=pmin(nn$flow/edge_w_scale, 10)) 148 | if(show_over_under_flow) { 149 | f = f_overflow; segments(x1[f],y1[f],x2[f],y2[f], col="red", 150 | lwd=pmin(nn$flow[f]/edge_w_scale, max_lwd)) 151 | f = f_underflow; segments(x1[f],y1[f],x2[f],y2[f], col="blue", 152 | lwd=pmin((nn$capacity[f] - nn$flow[f])/edge_w_scale,max_lwd)) 153 | } 154 | 155 | points(c(x1,x2), c(y1,y2), pch=19, col=mc@colors[c(nn$mc1,nn$mc2)],cex=1) 156 | } else if(!is.null(mc_t_score)) { 157 | plot(c(x1,x2), c(y1,y2), pch=19, col=mc@colors[c(nn$mc1,nn$mc2)],cex=mc_cex) 158 | mc_t_score = pmax(mc_t_score, quantile(mc_t_score,0.03)) 159 | mc_t_score = pmin(mc_t_score, quantile(mc_t_score,0.97)) 160 | mc_t_score = mc_t_score-min(mc_t_score) 161 | mc_t_score = mc_t_score/max(mc_t_score) 162 | mc_t_score = floor(1+999*mc_t_score) 163 | f = nn$mc1>0 & nn$mc2>0 & nn$time1>0 164 | max_mc = nrow(mc_t_score) 165 | m1 = as.numeric(nn$mc1[f]) 166 | m2 = as.numeric(nn$mc2[f]) 167 | score1 = rep(1, nrow(nn)) 168 | score2 = rep(1, nrow(nn)) 169 | score1[f] = mc_t_score[m1+(nn[f,"time1"]-1)*max_mc] 170 | score2[f] = mc_t_score[m2+(nn[f,"time2"]-1)*max_mc] 171 | 172 | seg_df = data.frame(x1 = x1, y1=y1, dx=x2-x1, dy=y2-y1, 173 | score1= score1, dscore=score2-score1) 174 | 175 | for(alpha in seq(0,0.95,0.05)) { 176 | x1 = seg_df$x1+alpha*seg_df$dx 177 | x2 = seg_df$x1+(alpha+0.05)*seg_df$dx 178 | y1 = seg_df$y1+alpha*seg_df$dy 179 | y2 = seg_df$y1+(alpha+0.05)*seg_df$dy 180 | cols = score_shades[floor(seg_df$score1 + (alpha+0.025)*seg_df$dscore)] 181 | segments(x1, y1, x2, y2, 182 | col=ifelse(nn$type2=="growth" | nn$type1=="source" | nn$type2=="sink", "gray", cols), 183 | lwd=pmin(nn$flow/edge_w_scale, max_lwd)) 184 | } 185 | rect(seg_df$x1-0.12,seg_df$y1-0.5, seg_df$x1+0.12, seg_df$y1+0.5, 186 | col = mc@colors[nn$mc1], border=NA) 187 | rect(seg_df$x2-0.12,seg_df$y2-0.5, seg_df$x2+0.12, seg_df$y2+0.5, 188 | col = mc@colors[nn$mc2], border=NA) 189 | } else { 190 | 191 | #mc_col_bg = alpha(c(c("gray","gray"),mc@colors),0.01) 192 | #names(mc_col_bg) = c(c("-2","-1"),as.character(c(1:length(mc@colors)))) 193 | 194 | 195 | plot(c(x1,x2), c(y1,y2), pch=19, col=bg_col,cex=mc_cex*bg_scale*0.2,axes = F,xlab = "",ylab = "") 196 | 197 | segments(x1,y1,x2,y2, 198 | col=bg_col, 199 | lwd=pmin(nn$flow/edge_w_scale, max_lwd)*bg_scale) 200 | 201 | #plot(c(x1,x2), c(y1,y2), pch=19, col=mc@colors[c(nn$mc1,nn$mc2)],cex=mc_cex) 202 | max_time = length(propogate) 203 | m1 = as.numeric(nn$mc1) 204 | m2 = as.numeric(nn$mc2) 205 | max_m = ncol(propogate[[1]]) 206 | prop_flow = rep(0, nrow(nn)) 207 | for(t in 1:max_time) { 208 | f = (nn$time1 == t) & nn$mc1>0 & nn$mc2>0 209 | prop_flow[f] = propogate[[t]][m1[f]+max_m*(m2[f]-1)] 210 | } 211 | 212 | prop_flow[prop_flow/edge_w_scale < 1] = 0 213 | 214 | segments(x1,y1,x2,y2, 215 | col=ifelse(nn$type2=="growth", "black", mc@colors[nn$mc1]), 216 | lwd=pmin(prop_flow/edge_w_scale, max_lwd)*fr_scale) 217 | points(c(x1,x2), c(y1,y2), pch=19, col=mc@colors[c(nn$mc1,nn$mc2)],cex=mc_cex*rep(pmin(prop_flow/edge_w_scale, max_lwd),2)*0.1*fr_scale) 218 | } 219 | 220 | if(plot_mc_ids) { 221 | f1 = nn$type1!="growth" 222 | text(x1[f1]-0.2,y1[f1], labels = nn$mc1[f1], cex=1) 223 | # text(c(x1[f1],x2[f2]),c(y1[f1],y2[f2]),labels = c(nn$mc1[f1],nn$mc2[f2]), cex=1) 224 | } 225 | 226 | dev.off() 227 | } 228 | 229 | 230 | 231 | 232 | 233 | 234 | mctnetwork_mc_rank_from_color_ord = function(mct_id, colors_ordered = NULL) 235 | { 236 | mct = scdb_mctnetwork(mct_id) 237 | if(is.null(mct)) { 238 | stop("cannot find mctnet object ", mct_id, " when trying to plot net flows") 239 | } 240 | mc = scdb_mc(mct@mc_id) 241 | mc_mean_age = apply(mct@mc_t, 1, function(x) {return(mean(x*(1:length(x))/sum(x))) }) 242 | col_to_rank = c(1:length(colors_ordered)) 243 | names(col_to_rank) = colors_ordered 244 | 245 | mc_rank = 1000*col_to_rank[mc@colors] + mc_mean_age 246 | mc_rank = rank(mc_rank) 247 | names(mc_rank) = as.character(1:length(mc_rank)) 248 | 249 | mc_flows = mctnetwork_get_flow_mat(mct, -1) 250 | diag(mc_flows) = 0 251 | mc_flow_fore = mc_flows/rowSums(mc_flows) 252 | mc_flow_back = t(mc_flows)/colSums(mc_flows) 253 | 254 | mc_rank1 = mc_rank 255 | 256 | #cluster flow graph 257 | #rank clusters by mean rank 258 | #rank mc by membership + time 259 | if(0) { 260 | for(i in 1:20) { 261 | rank_fore = mc_flow_fore %*% mc_rank1 262 | rank_back = mc_flow_back %*% mc_rank1 263 | 264 | dfore = rank_fore-mc_rank1 265 | dback = rank_back-mc_rank1 266 | 267 | f = sign(dfore) == sign(dback) 268 | f[is.na(f)] = F 269 | # message("iter ", i, " sign consist on " , sum(f)) 270 | dlt = sign(dfore) * pmin(abs(dfore), abs(dback)) 271 | dlt[!f] = 0 272 | # message("max dlt ", paste(tail(sort(dlt),5)," ")) 273 | mc_rank1 = rank(mc_rank1 + dlt*0.5) 274 | names(mc_rank1) = as.character(1:length(mc_rank)) 275 | } 276 | } 277 | #plot(sign(dfore)*log2(abs(dfore)), sign(dback)*log2(abs(dback)), pch=19, col=mc@colors, cex=1.5) 278 | return(mc_rank1) 279 | } 280 | 281 | -------------------------------------------------------------------------------- /scripts/generate_paper_figures/plot_vein.r: -------------------------------------------------------------------------------- 1 | add_alpha = function(col, alpha) { 2 | return(rgb(t(col2rgb(col))/256, alpha=alpha)) 3 | } 4 | 5 | draw_sig_edge = function(x1, x2, x2t, y1, y2, y2t, flow, col1, col2, col_alpha=0.8) { 6 | 7 | x1 = x1 8 | y1 = y1 9 | dx = x2t - x1 10 | dy = y2t - y1 11 | 12 | y1t = y1+flow 13 | dxt = x2 - x1 14 | dyt = y2 - y1t 15 | 16 | col1 = col2rgb(col1)[,1] 17 | names(col1) = c("red","green","blue") 18 | col2 = col2rgb(col2)[,1] 19 | names(col2) = c("red","green","blue") 20 | res = 0.05 21 | # message("segs ", "x1 ", x1, " y1 ", y1, " dx ", dx, " dy ", dy, " y1t ", y1t, " dxt ", dxt, " dyt ", dyt) 22 | beta0 = plogis(0,loc=0.5,scale=0.2) 23 | beta_f = plogis(1,loc=0.5,scale=0.2)-plogis(0,loc=0.5, scale=0.2) 24 | for(r in seq(0,0.98,res)) { 25 | beta = (plogis(r,loc=0.5,scale=0.2)-beta0)/beta_f 26 | beta5 = (plogis(r+res,loc=0.5,scale=0.2)-beta0)/beta_f 27 | 28 | sx1 = x1+r*dx 29 | sy1 = y1+beta*dy 30 | sx2 = x1+(r+res)*dx 31 | sy2 = y1+beta5*dy 32 | 33 | sx1t = x1+r*dxt 34 | sy1t = y1t+beta*dyt 35 | sx2t = x1+(r+res)*dxt 36 | sy2t = y1t+beta5*dyt 37 | # r_col = ifelse(grad_col_edge, r,0) 38 | r_col = r 39 | rgb_r = col2["red"]*r_col+col1["red"]*(1-r_col) 40 | rgb_g = col2["green"]*r_col+col1["green"]*(1-r_col) 41 | rgb_b = col2["blue"]*r_col+col1["blue"]*(1-r_col) 42 | col = rgb(rgb_r/256, rgb_g/256, rgb_b/256, col_alpha) 43 | polygon(c(sx1, sx2, sx2t,sx1t), c(sy1, sy2, sy2t, sy1t), col=col, border=NA) 44 | } 45 | # segments(x1,y1,x2,y2, 46 | } 47 | 48 | plot_focals = function(fig_dir,plot_pdf = F,bg = "white",focals = c("Epiblast", "Early nascent mesoderm", "Late nascent mesoderm", "Primitive streak")) { 49 | 50 | #fig_dir = "figs/paper_figs/fig4/focal_flows" 51 | 52 | if(!dir.exists(fig_dir)) { 53 | dir.create(fig_dir) 54 | } 55 | 56 | mc = scdb_mc("sing_emb_wt10_recolored") 57 | 58 | mat = scdb_mat("sing_emb_wt10") 59 | 60 | md = mat@cell_metadata 61 | 62 | type_ag= table(mc@colors[mc@mc], md[names(mc@mc),"age_group"]) 63 | 64 | type_agn = t(t(type_ag)/colSums(type_ag)) 65 | 66 | mct = scdb_mctnetwork("sing_emb_wt10") 67 | type_flow = mctnetwork_get_type_flows(mct, 1,13) 68 | 69 | 70 | key = mc@color_key 71 | rownames(key) = key$group 72 | 73 | all_cols = key[c( 74 | "Blood progenitors", 75 | "Haematoendothelial progenitors", 76 | "Amnion/Chorion", 77 | "Allantois", 78 | "ExE mesoderm", 79 | "Caudal mesoderm", 80 | "Early nascent mesoderm", 81 | "Paraxial mesoderm", 82 | "Cardiac mesoderm", 83 | "Late nascent mesoderm", 84 | "Cardiomyocytes", 85 | "Rostral mesoderm", 86 | "Lateral & intermediate mesoderm" 87 | ), "color"] 88 | 89 | t1 = 1 90 | t2 = 12 91 | T_minflow_for_type = 0.005 92 | 93 | #focals = c("Epiblast", "Early nascent mesoderm", "Late nascent mesoderm", "Primitive streak") 94 | #focals = c("Late nascent mesoderm") 95 | 96 | 97 | foc_colors = key[focals,"color"] 98 | for(foc_type in focals) { 99 | 100 | foc_color = key[foc_type,"color"] 101 | cols = names(which(colSums(do.call("rbind", lapply(type_flow, function(x) x[foc_color,])))>T_minflow_for_type)) 102 | foc_agn = type_agn[cols,t1:t2] 103 | 104 | base_y = c(1) 105 | 106 | for(i in 2:length(cols)) { 107 | base_y[i] = base_y[i-1] + 0.2 + max(foc_agn[cols[i],]+foc_agn[cols[i-1],]) 108 | } 109 | names(base_y) = cols 110 | 111 | if(plot_pdf) { 112 | pdf(sprintf("%s/%s.pdf",fig_dir, foc_type),w=30,h=20,useDingbats = F) 113 | } else { 114 | png(sprintf("%s/%s.png",fig_dir, foc_type),w=3000,h=2000,bg = bg) 115 | } 116 | 117 | plot(0, xlim=c(t1 - 0.5,t2), ylim=c(min(base_y)-0.5, max(base_y)+0.5),yaxt = 'n',bg = "transparent") 118 | 119 | x_pos = 0.5 120 | y_c = base_y[foc_color] 121 | 122 | segments(x0 = x_pos,x1 = x_pos, y0 = y_c - 0.25,y1 = y_c + 0.25,lwd = 4) 123 | segments(x0 = x_pos - 0.05,x1 = x_pos + 0.05, y0 = y_c - 0.25,y1 = y_c - 0.25,lwd = 4) 124 | segments(x0 = x_pos - 0.05,x1 = x_pos + 0.05, y0 = y_c + 0.25,y1 = y_c + 0.25,lwd = 4) 125 | 126 | text(x = x_pos - 0.2,y = y_c, labels = "25%",cex = 2) 127 | smoo_y = list() 128 | for(c in cols) { 129 | x = t1:t2 130 | y = foc_agn[c,t1:t2] 131 | ys = approx(x,y, seq(t1,t2,l=1+(t2-t1)*100)) 132 | lo = loess(ys$y ~ ys$x,span=0.3)$fitted 133 | base = base_y[c] 134 | smoo_y[[c]] = lo 135 | names(smoo_y[[c]]) = ys$x 136 | calpha = add_alpha(c,0.8) 137 | polygon(c(ys$x,rev(ys$x)), base+c(lo, rev(-lo)), border=NA, col=ifelse(c==foc_color, c, calpha)) 138 | } 139 | 140 | foc_i = which(cols == foc_color) 141 | base_foc = base_y[foc_color] 142 | for(t in t1:(t2-1)) { 143 | flow = type_flow[[t]] 144 | max_i = length(cols) 145 | cum_y = -smoo_y[[foc_color]][as.character(t)] 146 | for(i in 1:(foc_i-1)) { 147 | col_i = cols[i] 148 | fl = flow[foc_color,col_i]*2 149 | if(fl > 0) { 150 | calpha = add_alpha(col_i, 0.8) 151 | draw_sig_edge(x1=t, x2 = t+1, x2t = t+1-2*fl-0.05, 152 | y1 = base_foc+cum_y, 153 | y2 = base_y[col_i]+smoo_y[[col_i]][as.character(t+1)], 154 | y2t =base_y[col_i]+smoo_y[[col_i]][as.character(round(t+1-2*fl-0.05,2))], 155 | flow = fl, 156 | col1=foc_color, col2=col_i) 157 | cum_y = cum_y+fl 158 | } 159 | } 160 | cum_y = smoo_y[[foc_color]][as.character(t)] 161 | for(i in max_i:(foc_i+1)) { 162 | col_i = cols[i] 163 | fl = flow[foc_color,col_i]*2 164 | if(fl > 0) { 165 | calpha = add_alpha(col_i, 0.8) 166 | #message("at ", i, " fl ", fl) 167 | draw_sig_edge(x1=t, x2t = t+1, x2 = t+1-2*fl-0.05, 168 | y1 = base_foc+cum_y-fl, 169 | y2t = base_y[col_i]-smoo_y[[col_i]][as.character(t+1)], 170 | y2 =base_y[col_i]-smoo_y[[col_i]][as.character(round(t+1-2*fl-0.05,2))], 171 | flow = fl, 172 | col1=foc_color, col2=col_i) 173 | cum_y = cum_y-fl 174 | } 175 | } 176 | #incoming 177 | } 178 | dev.off() 179 | } 180 | 181 | 182 | } 183 | 184 | 185 | -------------------------------------------------------------------------------- /scripts/initialize_scripts.r: -------------------------------------------------------------------------------- 1 | library("metacell") 2 | scdb_init("scrna_db/",force_reinit = T) 3 | scfigs_init("figs") 4 | 5 | source("scripts/generate_paper_figures/fig_1.r") 6 | source("scripts/generate_paper_figures/fig_2.r") 7 | source("scripts/generate_paper_figures/fig_3.r") 8 | source("scripts/generate_paper_figures/fig_4.r") 9 | source("scripts/generate_paper_figures/fig_5.r") 10 | source("scripts/generate_paper_figures/fig_6.r") 11 | source("scripts/generate_paper_figures/fig_7.r") 12 | source("scripts/generate_paper_figures/fig_s1.r") 13 | source("scripts/generate_paper_figures/fig_s2.r") 14 | source("scripts/generate_paper_figures/fig_s3.r") 15 | source("scripts/generate_paper_figures/fig_s4.r") 16 | source("scripts/generate_paper_figures/fig_s5.r") 17 | source("scripts/generate_paper_figures/fig_s6.r") 18 | source("scripts/generate_paper_figures/fig_s7.r") 19 | source("scripts/generate_paper_figures/generate_all_figures.r") 20 | source("scripts/additional_network_functions.r") 21 | source("scripts/generate_paper_figures/plot_vein.r") 22 | source("scripts/generate_paper_figures/plot_network.r") 23 | source("scripts/generate_paper_figures/plot_3d_vein.r") 24 | source("scripts/foxc12/chimeras_compare_frequency.r") 25 | source("scripts/foxc12/control_chim_timing.r") 26 | source("scripts/foxc12/definition_cell_types_endoderm_ectoderm_embryonic_mesoderm.r") 27 | source("scripts/foxc12/differential_expression_analysis.r") 28 | source("scripts/foxc12/foxc_chim_timing.r") 29 | source("scripts/foxc12/generate_chimera_tetraploid_data_analysis.r") 30 | source("scripts/foxc12/transfer_cell_type_annotation.r") 31 | source("scripts/foxc12/transfer_time_annotation.r") 32 | -------------------------------------------------------------------------------- /scripts/parameter_stability_analysis.r: -------------------------------------------------------------------------------- 1 | library("metacell") 2 | tgconfig::override_params("config/sing_emb.yaml","metacell") 3 | source("scripts/generate_mc_mgraph_network/gen_network.r") 4 | scdb_init("scrna_db/stability_analysis/",force_reinit = T) 5 | 6 | gen_parameter_stability_analysis = function() { 7 | # first generate networks for all the parameters 8 | 9 | if(0) { 10 | generate_mgraph_and_network_for_param_values(1) 11 | generate_mgraph_and_network_for_param_values(2) 12 | generate_mgraph_and_network_for_param_values(3) 13 | } 14 | 15 | # generate a summary list of the flows and plot them 16 | gen_net_id_ls() 17 | 18 | # next generate color summary 19 | gen_col_summary_all() 20 | } 21 | 22 | generate_mgraph_and_network_for_param_values = function(n_split = 1) { 23 | fig_dir = "figs/paper_figs/fig_s_parameter_stability" 24 | if(!dir.exists(fig_dir)) { 25 | dir.create(fig_dir) 26 | } 27 | 28 | 29 | 30 | mat_id = "sing_emb_wt10" 31 | mc_id = "sing_emb_wt10_recolored" 32 | gset = scdb_gset("sing_emb_wt10") 33 | feat_genes = names(gset@gene_set) 34 | mgraph_id_orig = "sing_emb_wt10_recolored_logist" 35 | mc = scdb_mc(mc_id) 36 | mc_leak = get_mc_leak_parameter_endo(mc_id = mc_id,leak_emb_endo = 0.12,leak_exe_endo = 0.17) 37 | 38 | a = 1 39 | # parameters of interest assigned to their current standard value 40 | 41 | logist_loc = 1 42 | logist_scale = 0.2 43 | logist_eps = 4e-5 44 | max_d_fold = 3 45 | t_exp = 1 46 | cap_var_factor = 0.4 47 | capacity_var_factor = rep(cap_var_factor,ncol(mc@e_gc)) 48 | off_capacity_cost1 = 1 49 | off_capacity_cost2 = 1000 50 | max_degree_mc = 4 51 | tgconfig::set_param("mcell_mc2d_max_confu_deg",max_degree_mc,"metacell") 52 | 53 | if(n_split == 1) { 54 | # start with t_exp 55 | # sample ten values 56 | param_ls = 2^(c((-3:6))) 57 | 58 | for(i in 1:length(param_ls)) { 59 | print(i) 60 | t_exp = param_ls[i] 61 | net_id = sprintf("t_exp_%d",i) 62 | 63 | build_net(mat_id = mat_id, 64 | mc_id = mc_id, 65 | mgraph_id = mgraph_id_orig, 66 | net_id = net_id, 67 | fig_dir = fig_dir, 68 | age_field = "age_group",mc_leak = mc_leak, 69 | capacity_var_factor = capacity_var_factor, 70 | k_norm_ext_cost = 1, 71 | k_ext_norm_cost = 1, 72 | k_ext_ext_cost = 1, 73 | t_exp = t_exp) 74 | } 75 | t_exp = 1 76 | 77 | # next max degree mc 78 | param_ls = c(3,4,6,8,10,15,20,50) 79 | 80 | for(i in 1:length(param_ls)) { 81 | 82 | print(i) 83 | max_degree_mc = param_ls[i] 84 | tgconfig::set_param("mcell_mc2d_max_confu_deg",max_degree_mc,"metacell") 85 | mgraph_id = sprintf("max_degree_mc_%d",i) 86 | net_id = mgraph_id 87 | 88 | mgraph = mc2d_comp_mgraph_param(mc, feat_genes, logist_loc, logist_scale, logist_eps, max_d_fold) 89 | scdb_add_mgraph(id = mgraph_id,mgraph = tgMCManifGraph(mc_id = mc_id,mgraph = mgraph)) 90 | 91 | build_singemb_net(mat_id = mat_id, 92 | mc_id = mc_id, 93 | mgraph_id = mgraph_id, 94 | net_id = net_id, 95 | fig_dir = fig_dir, 96 | age_field = "age_group",mc_leak = mc_leak, 97 | capacity_var_factor = capacity_var_factor, 98 | k_norm_ext_cost = 1, 99 | k_ext_norm_cost = 1, 100 | k_ext_ext_cost = 1, 101 | t_exp = t_exp) 102 | 103 | } 104 | tgconfig::set_param("mcell_mc2d_max_confu_deg",4,"metacell") 105 | 106 | 107 | } 108 | 109 | if (n_split == 2) { 110 | # next logistic loc 111 | param_ls = c(0,0.25,0.5,0.75,1,1.5,2,2.5) 112 | 113 | for(i in 1:length(param_ls)) { 114 | 115 | print(i) 116 | logist_loc = param_ls[i] 117 | mgraph_id = sprintf("logist_loc_%d",i) 118 | net_id = mgraph_id 119 | 120 | mgraph = mc2d_comp_mgraph_param(mc, feat_genes, logist_loc, logist_scale, logist_eps, max_d_fold) 121 | scdb_add_mgraph(id = mgraph_id,mgraph = tgMCManifGraph(mc_id = mc_id,mgraph = mgraph)) 122 | 123 | build_singemb_net(mat_id = mat_id, 124 | mc_id = mc_id, 125 | mgraph_id = mgraph_id, 126 | net_id = net_id, 127 | fig_dir = fig_dir, 128 | age_field = "age_group",mc_leak = mc_leak, 129 | capacity_var_factor = capacity_var_factor, 130 | k_norm_ext_cost = 1, 131 | k_ext_norm_cost = 1, 132 | k_ext_ext_cost = 1, 133 | t_exp = t_exp) 134 | 135 | } 136 | logist_loc = 1 137 | 138 | # next logistic scale 139 | param_ls = c(0.10,0.15,0.20,0.25,0.30,0.35,0.45,0.60,0.80) 140 | 141 | for(i in 1:length(param_ls)) { 142 | 143 | print(i) 144 | logist_scale = param_ls[i] 145 | mgraph_id = sprintf("logist_scale_%d",i) 146 | net_id = mgraph_id 147 | 148 | mgraph = mc2d_comp_mgraph_param(mc, feat_genes, logist_loc, logist_scale, logist_eps, max_d_fold) 149 | scdb_add_mgraph(id = mgraph_id,mgraph = tgMCManifGraph(mc_id = mc_id,mgraph = mgraph)) 150 | 151 | build_singemb_net(mat_id = mat_id, 152 | mc_id = mc_id, 153 | mgraph_id = mgraph_id, 154 | net_id = net_id, 155 | fig_dir = fig_dir, 156 | age_field = "age_group",mc_leak = mc_leak, 157 | capacity_var_factor = capacity_var_factor, 158 | k_norm_ext_cost = 1, 159 | k_ext_norm_cost = 1, 160 | k_ext_ext_cost = 1, 161 | t_exp = t_exp) 162 | 163 | } 164 | logist_scale = 0.2 165 | } 166 | 167 | 168 | if(n_split == 3) { 169 | 170 | # next follows capacity variance factor 171 | 172 | param_ls = c(0.01,0.05,0.1,0.25,0.4,0.6,0.8) 173 | 174 | for(i in 1:length(param_ls)) { 175 | 176 | print(i) 177 | cap_var_factor = param_ls[i] 178 | capacity_var_factor = rep(cap_var_factor,ncol(mc@e_gc)) 179 | net_id = sprintf("cap_var_factor_%d",i) 180 | 181 | build_singemb_net(mat_id = mat_id, 182 | mc_id = mc_id, 183 | mgraph_id = mgraph_id_orig, 184 | net_id = net_id, 185 | fig_dir = fig_dir, 186 | age_field = "age_group",mc_leak = mc_leak, 187 | capacity_var_factor = capacity_var_factor, 188 | k_norm_ext_cost = 1, 189 | k_ext_norm_cost = 1, 190 | k_ext_ext_cost = 1, 191 | t_exp = t_exp) 192 | 193 | } 194 | capacity_var_factor = rep(0.25,ncol(mc@e_gc)) 195 | 196 | 197 | # next follows off_capacity_cost1 198 | 199 | param_ls = c(0,1,2,5,10,20,50,100) 200 | 201 | for(i in 1:length(param_ls)) { 202 | 203 | print(i) 204 | off_capacity_cost1 = param_ls[i] 205 | net_id = sprintf("off_capacity_cost1_%d",i) 206 | 207 | build_singemb_net(mat_id = mat_id, 208 | mc_id = mc_id, 209 | mgraph_id = mgraph_id_orig, 210 | net_id = net_id, 211 | fig_dir = fig_dir, 212 | age_field = "age_group",mc_leak = mc_leak, 213 | capacity_var_factor = capacity_var_factor, 214 | k_norm_ext_cost = 1, 215 | k_ext_norm_cost = 1, 216 | k_ext_ext_cost = 1, 217 | t_exp = t_exp, 218 | off_capacity_cost1 = off_capacity_cost1, 219 | off_capacity_cost2 = off_capacity_cost2) 220 | 221 | } 222 | off_capacity_cost1 = 1 223 | 224 | # next follows off_capacity_cost2 225 | 226 | param_ls = c(1,10,20,50,100,200,500,1000,2000,5000) 227 | 228 | for(i in 1:length(param_ls)) { 229 | 230 | print(i) 231 | off_capacity_cost2 = param_ls[i] 232 | net_id = sprintf("off_capacity_cost2_%d",i) 233 | 234 | build_singemb_net(mat_id = mat_id, 235 | mc_id = mc_id, 236 | mgraph_id = mgraph_id_orig, 237 | net_id = net_id, 238 | fig_dir = fig_dir, 239 | age_field = "age_group",mc_leak = mc_leak, 240 | capacity_var_factor = capacity_var_factor, 241 | k_norm_ext_cost = 1, 242 | k_ext_norm_cost = 1, 243 | k_ext_ext_cost = 1, 244 | t_exp = t_exp, 245 | off_capacity_cost1 = off_capacity_cost1, 246 | off_capacity_cost2 = off_capacity_cost2) 247 | 248 | } 249 | off_capacity_cost2 = 1000 250 | 251 | } 252 | 253 | 254 | 255 | } 256 | 257 | 258 | gen_net_id_ls = function() { 259 | 260 | scdb_init("scrna_db/",force_reinit = T) 261 | mc = scdb_mc("sing_emb_wt10_recolored") 262 | network_color_ord = mc@color_key$color 263 | scdb_init("scrna_db/stability_analysis/",force_reinit = T) 264 | 265 | net_id_ls = c() 266 | all_param_ls = list() 267 | # start with t_exp 268 | # sample ten values 269 | param_ls = 2^(c((-3:6))) 270 | net_ids = paste0("t_exp_",c(1:length(param_ls))) 271 | net_id_ls = c(net_id_ls,net_ids) 272 | all_param_ls[[1]] = param_ls 273 | 274 | # next logistic loc 275 | param_ls = c(0,0.25,0.5,0.75,1,1.5,2,2.5) 276 | net_ids = paste0("logist_loc_",c(1:length(param_ls))) 277 | net_id_ls = c(net_id_ls,net_ids) 278 | all_param_ls[[2]] = param_ls 279 | 280 | # next logistic scale 281 | param_ls = c(0.10,0.15,0.20,0.25,0.30,0.35,0.45,0.60,0.80) 282 | net_ids = paste0("logist_scale_",c(1:length(param_ls))) 283 | net_id_ls = c(net_id_ls,net_ids) 284 | all_param_ls[[3]] = param_ls 285 | 286 | # next max degree mc 287 | param_ls = c(3,4,6,8,10,15,20,50) 288 | net_ids = paste0("max_degree_mc_",c(1:length(param_ls))) 289 | net_id_ls = c(net_id_ls,net_ids) 290 | all_param_ls[[4]] = param_ls 291 | 292 | # next follows capacity variance factor 293 | param_ls = c(0.01,0.05,0.1,0.25,0.4,0.6,0.8) 294 | net_ids = paste0("cap_var_factor_",c(1:length(param_ls))) 295 | net_id_ls = c(net_id_ls,net_ids) 296 | all_param_ls[[5]] = param_ls 297 | 298 | # next follows off_capacity_cost1 299 | param_ls = c(0,1,2,5,10,20,50,100) 300 | net_ids = paste0("off_capacity_cost1_",c(1:length(param_ls))) 301 | net_id_ls = c(net_id_ls,net_ids) 302 | all_param_ls[[6]] = param_ls 303 | 304 | # next follows off_capacity_cost2 305 | param_ls = c(1,10,20,50,100,200,500,1000,2000,5000) 306 | net_ids = paste0("off_capacity_cost2_",c(1:length(param_ls))) 307 | net_id_ls = c(net_id_ls,net_ids) 308 | all_param_ls[[7]] = param_ls 309 | 310 | 311 | 312 | ignore_ls = c() 313 | ignore_flow = c() 314 | for(net_id in net_id_ls) { 315 | print(net_id) 316 | mct = scdb_mctnetwork(net_id) 317 | 318 | tot_flow = sum(mct@network$flow) 319 | 320 | if((tot_flow < 20) | (tot_flow > 30)) { 321 | ignore_ls = c(ignore_ls,net_id) 322 | ignore_flow = c(ignore_flow,tot_flow) 323 | } 324 | 325 | } 326 | names(ignore_flow) = ignore_ls 327 | net_id_ls_f = setdiff(net_id_ls,ignore_ls) 328 | 329 | param_nms = c("t_exp","logist_loc","logist_scale","max_degree_mc","cap_var_factor","off_capacity_cost1","off_capacity_cost2") 330 | 331 | for (i in 1:length(all_param_ls)) { 332 | 333 | n_param = length(all_param_ls[[i]]) 334 | 335 | param_nm = param_nms[i] 336 | 337 | net_ids = paste(param_nm,c(1:n_param),sep = "_") 338 | 339 | f = net_ids %in% net_id_ls_f 340 | 341 | all_param_ls[[i]] = all_param_ls[[i]][f] 342 | } 343 | 344 | names(all_param_ls) = param_nms 345 | 346 | ls_of_param = all_param_ls 347 | 348 | save(net_id_ls,file = "data/parameter_stability_analysis/net_id_ls.Rda") 349 | save(net_id_ls_f,file = "data/parameter_stability_analysis/net_id_ls_f.Rda") 350 | save(ls_of_param,file = "data/parameter_stability_analysis/ls_of_param.Rda") 351 | } 352 | 353 | gen_col_summary_all = function() { 354 | 355 | param_ls = c("t_exp","logist_loc","logist_scale","max_degree_mc","cap_var_factor","off_capacity_cost1","off_capacity_cost2") 356 | 357 | load(file = "data/parameter_stability_analysis/net_id_ls.Rda") 358 | load("data/parameter_stability_analysis/net_id_ls_f.Rda") 359 | 360 | mc = scdb_mc("sing_emb_wt10_recolored") 361 | network_color_ord = mc@color_key$color 362 | 363 | for (param in param_ls) { 364 | print(param) 365 | net_ids = grep(pattern = param,x = net_id_ls_f,v = T) 366 | 367 | net_id = net_ids[1] 368 | #load(sprintf("data/parameter_stability_analysis/%s.Rda",net_id)) 369 | col_trans_per_t = param_gen_col_summary(net_id) 370 | 371 | 372 | col_dist_all = lapply(col_trans_per_t,function(col_dist) { 373 | return(list(col_dist)) 374 | }) 375 | 376 | for (i in 2:length(net_ids)) { 377 | print(i) 378 | net_id = net_ids[i] 379 | #load(sprintf("data/parameter_stability_analysis/%s.Rda",net_id)) 380 | col_trans_per_t = param_gen_col_summary(net_id) 381 | print(i) 382 | for (t in 1:12) { 383 | col_dist_ls_t = col_dist_all[[t]] 384 | col_dist_ls_t = c(col_dist_ls_t,list(col_trans_per_t[[t]])) 385 | col_dist_all[[t]] = col_dist_ls_t 386 | } 387 | 388 | } 389 | save(col_dist_all,file = sprintf("data/parameter_stability_analysis/%s_all.Rda",param)) 390 | 391 | } 392 | 393 | } 394 | 395 | param_gen_col_summary = function(net_id) { 396 | scdb_init("scrna_db/stability_analysis/",force_reinit = T) 397 | mct = scdb_mctnetwork(net_id) 398 | scdb_init("scrna_db/",force_reinit = T) 399 | mat = scdb_mat("sing_emb_wt10") 400 | mc = scdb_mc("sing_emb_wt10_recolored") 401 | 402 | all_colors = unique(mc@colors) 403 | 404 | cls_f = names(mc@mc) 405 | 406 | cls_spl = split(cls_f,mat@cell_metadata[cls_f,"age_group"]) 407 | 408 | col_dist_zero = matrix(0,nrow = ncol(mc@e_gc),ncol = length(all_colors)) 409 | rownames(col_dist_zero) = c(1:ncol(mc@e_gc)) 410 | colnames(col_dist_zero) = all_colors 411 | 412 | col_dist_per_t = lapply(cls_spl,function(cls) { 413 | 414 | col_dist = col_dist_zero 415 | col_dist_tmp = table(mc@mc[cls],mc@colors[mc@mc[cls]]) 416 | col_dist_tmp = col_dist_tmp/rowSums(col_dist_tmp) 417 | 418 | col_dist[rownames(col_dist_tmp),colnames(col_dist_tmp)] = col_dist_tmp 419 | 420 | return(col_dist) 421 | }) 422 | 423 | 424 | col_trans_per_t = lapply(c(1:length(mct@mc_forward)),function(i) { 425 | 426 | mc_trans = mct@mc_t_infer[,i]*mct@mc_forward[[i]] 427 | 428 | col_trans = t(col_dist_per_t[[i]]) %*% mc_trans %*% col_dist_per_t[[i+1]] 429 | return(col_trans) 430 | }) 431 | 432 | #save(col_trans_per_t,file = sprintf("data/parameter_stability_analysis/%s.Rda",net_id)) 433 | return(col_trans_per_t) 434 | } 435 | -------------------------------------------------------------------------------- /scripts/single_embryo_timing.r: -------------------------------------------------------------------------------- 1 | library("metacell") 2 | library("tidyverse") 3 | scdb_init("scrna_db/",force_reinit = T) 4 | # calculate embryo time 5 | 6 | gen_single_embryo_timing = function() { 7 | 8 | # external ranking of embryos containing morphology ranks and size measurements 9 | emb_age = read.table(file = "data/external_data/single_embryo_morphology_size_information.txt",sep = "\t",h = T) 10 | 11 | # generating intrinsic transcriptional ranking of embryos,"intrinsic ranking" if Figure 1 12 | intrinsic_ranking = wt10_gen_intrinsinc_ranking() 13 | 14 | # similarity matrix displayed in Figure 1B 15 | ee_similarity_mat = intrinsic_ranking$similarity_mat 16 | 17 | emb_age = left_join(emb_age,intrinsic_ranking$embryo_final_order,by = "embryo") 18 | 19 | 20 | # translating transcriptional ranks into embryonic developmental times 6.x to 8.x 21 | developmental_time = calculate_developmental_time_from_size_measurement(emb_age_time = emb_age) 22 | 23 | emb_age = left_join(emb_age,developmental_time,by = "transcriptional_rank") 24 | 25 | # projecting embryos on reference gastrulation atlas Pijuan-Sala et al Nature (2019) 26 | ref_age = calculate_ref_gastru_atlas_age() 27 | 28 | emb_age = left_join(emb_age,ref_age,by = "embryo") 29 | 30 | return(list(single_embryo_time = emb_age,transcriptional_similarity_matrix = ee_similarity_mat)) 31 | } 32 | 33 | 34 | calculate_developmental_time_from_size_measurement = function(emb_age_time) { 35 | 36 | mat = scdb_mat("sing_emb_wt10") 37 | mc = scdb_mc("sing_emb_wt10_recolored") 38 | 39 | emb_age_time = emb_age_time[,c("embryo","transcriptional_rank","morphology_rank","area","mouse_type","age_group")] 40 | #emb_age_time = unique(mat@cell_metadata[names(mc@mc),c("embryo","transcriptional_rank","morphology_rank","area","mouse_type","age_group")]) 41 | 42 | age_group_median_rank = tapply(emb_age_time$transcriptional_rank,emb_age_time$age_group,median) 43 | min_rank = round(age_group_median_rank[1]) 44 | max_rank = round(age_group_median_rank[13]) 45 | 46 | min_age = 6.5 47 | max_age = 8.1 48 | 49 | f = (emb_age_time$mouse_type != "ICR") & !(is.na(emb_age_time$area)) 50 | emb_age_time = emb_age_time[f,] 51 | 52 | f2 = emb_age_time$morphology_rank > 100 & log2(emb_age_time$area) < 16.3 53 | 54 | fit_rank_vs_area = smooth.spline(x = emb_age_time[!f2,"transcriptional_rank"],log2(emb_age_time[!f2,"area"]),spar = 0.9) 55 | 56 | 57 | min_x = fit_rank_vs_area$x[1] 58 | max_x = fit_rank_vs_area$x[length(fit_rank_vs_area$x)] 59 | 60 | extrapol_fit_x = fit_rank_vs_area$x 61 | extrapol_fit_y = fit_rank_vs_area$y 62 | if(min_x > 1) { 63 | 64 | new_fit_y = (extrapol_fit_y[10] - extrapol_fit_y[1])/(extrapol_fit_x[10] - extrapol_fit_x[1])*(c(1:(min_x-1)) - min_x) + extrapol_fit_y[1] 65 | extrapol_fit_x = c(c(1:(min_x-1)),extrapol_fit_x) 66 | extrapol_fit_y = c(new_fit_y,extrapol_fit_y) 67 | } 68 | 69 | if(max_x < 153) { 70 | 71 | new_fit_y = (extrapol_fit_y[length(extrapol_fit_y)] - extrapol_fit_y[(length(extrapol_fit_y) - 1)])/(extrapol_fit_x[length(extrapol_fit_x)] - extrapol_fit_x[(length(extrapol_fit_x) - 1)])*(c(1:(153-max_x))) + extrapol_fit_y[length(extrapol_fit_y)] 72 | extrapol_fit_x = c(extrapol_fit_x,c((max_x + 1):153)) 73 | extrapol_fit_y = c(extrapol_fit_y,new_fit_y) 74 | } 75 | 76 | area_interp = rep(0,153) 77 | 78 | for(n in 1:153) { 79 | if (n %in% extrapol_fit_x) { 80 | n_ind = which(extrapol_fit_x == n) 81 | area_interp[n] = extrapol_fit_y[n_ind] 82 | } else { 83 | 84 | n_l_ind = max(which(extrapol_fit_x < n)) 85 | n_h_ind = min(which(extrapol_fit_x > n)) 86 | 87 | n_l = extrapol_fit_x[n_l_ind] 88 | n_h = extrapol_fit_x[n_h_ind] 89 | 90 | t_interp = (n - n_l)/(n_h - n_l)*extrapol_fit_y[n_h_ind] + (n_h - n)/(n_h - n_l)*extrapol_fit_y[n_l_ind] 91 | 92 | area_interp[n] = t_interp 93 | } 94 | } 95 | 96 | 97 | infered_time = min_age + (area_interp - area_interp[min_rank])/(area_interp[max_rank] - area_interp[min_rank])*(max_age - min_age) 98 | 99 | embryonic_time = data.frame(transcriptional_rank = c(1:153),developmental_time = infered_time) 100 | 101 | return(embryonic_time) 102 | } 103 | 104 | 105 | calculate_ref_gastru_atlas_age = function() { 106 | 107 | atlas = mcell_gen_atlas(mat_id = "emb_gotg", 108 | mc_id = "emb_gotg_bs500f", 109 | gset_id = "emb_gotg", 110 | mc2d_id= "emb_gotg_bs500f") 111 | 112 | mat_id = "sing_emb_wt10" 113 | mat_ref_id = "emb_gotg" 114 | mc_id = "sing_emb_wt10_recolored" 115 | 116 | # the following cell types are ignored because they show strong embryo-to-embryo fluctuations in there frequency 117 | list_of_ignored_colors = c("#1A1A1A","#989898","#7F6874","#c9a997","#C72228") 118 | 119 | ref_mc = scdb_mc(atlas$mc_id) 120 | query_mc = scdb_mc(mc_id) 121 | mat = scdb_mat(mat_id) 122 | mat_ref = scdb_mat(mat_ref_id) 123 | 124 | gset = scdb_gset(atlas$gset_id) 125 | ref_mc2d = scdb_mc2d(atlas$mc2d_id) 126 | 127 | gene_name_map = gen_10x_mars_gene_match(mars_mc_id = mc_id, tenx_mc_id = atlas$mc_id) 128 | #check cor of all cells, features 129 | common_genes_ref = names(gset@gene_set) 130 | common_genes_ref = common_genes_ref[!is.na(gene_name_map[common_genes_ref])] 131 | common_genes_ref = intersect(common_genes_ref, rownames(ref_mc@e_gc)) 132 | common_genes_ref = common_genes_ref[!is.na(gene_name_map[common_genes_ref])] 133 | common_genes_ref = common_genes_ref[gene_name_map[common_genes_ref] %in% rownames(query_mc@e_gc)] 134 | common_genes = gene_name_map[common_genes_ref] 135 | 136 | if(mean(!is.null(common_genes)) < 0.5) { 137 | stop("less than half of the atlas feature genes can be mapped to reference gene names. Probably should provide a name conversion table") 138 | } 139 | 140 | # next calculate average age of reference atlas metacells 141 | mc_ref_ag = table(ref_mc@mc,mat_ref@cell_metadata[names(ref_mc@mc),"stage"]) 142 | mc_ref_ag = mc_ref_ag[,1:9] 143 | mc_ref_ag = mc_ref_ag[rowSums(mc_ref_ag)>0,] 144 | mc_list= as.numeric(rownames(mc_ref_ag)) 145 | 146 | mc_ref_ag_c = t(t(mc_ref_ag)/colSums(mc_ref_ag)) 147 | mc_ref_ag_cn = mc_ref_ag_c/rowSums(mc_ref_ag_c) 148 | 149 | time_points = 6.5 + c(0:8)/4 150 | mc_mean_age = (mc_ref_ag_cn %*% time_points)[,1] 151 | names(mc_mean_age) = rownames(mc_ref_ag_cn) 152 | 153 | # next find best reference metacell for each single cell from the query matrix 154 | feats = mat@mat[common_genes, names(query_mc@mc)] 155 | rownames(feats) = common_genes_ref 156 | 157 | ref_abs_lfp = log(1e-6+ref_mc@e_gc[common_genes_ref,as.character(mc_list)]) 158 | ref_abs_fp = ref_mc@e_gc[common_genes_ref,as.character(mc_list)] 159 | 160 | 161 | cross = tgs_cor((cbind(ref_abs_lfp, as.matrix(feats)))) 162 | cross1 = cross[1:ncol(ref_abs_lfp),ncol(ref_abs_lfp)+1:ncol(feats)] 163 | 164 | f = rowSums(is.na(cross1)) 165 | cross1[f,] = 0 166 | 167 | best_ref = as.numeric(unlist(apply(cross1,2,function(x) names(which.max(x))))) 168 | best_ref_cor = apply(cross1, 2, max) 169 | 170 | # only include cells in age calculation that are projected on an included cell type from the reference atlas 171 | list_of_ignored_reference_mcs = which(ref_mc@colors %in% list_of_ignored_colors) 172 | 173 | best_ref_mean_age = mc_mean_age[as.character(best_ref)] 174 | names(best_ref_mean_age) = colnames(cross1) 175 | best_ref_mean_age = best_ref_mean_age[!(best_ref %in% list_of_ignored_reference_mcs)] 176 | 177 | 178 | # summarise single embryo time distribution in a data.frame 179 | emb_age_dist = split(best_ref_mean_age,mat@cell_metadata[names(best_ref_mean_age),"embryo"]) 180 | n_cells = sapply(emb_age_dist,length) 181 | mean_age = sapply(emb_age_dist,mean) 182 | median_age = sapply(emb_age_dist,median) 183 | var_age = sapply(emb_age_dist,var) 184 | 185 | sing_emb_age = data.frame(embryo = names(emb_age_dist), 186 | ref_gastru_atlas_age = signif(mean_age[names(emb_age_dist)],4), 187 | ref_gastru_atlas_rank = rank(mean_age[names(emb_age_dist)]),stringsAsFactors = F) 188 | 189 | sing_emb_age = sing_emb_age[order(sing_emb_age$ref_gastru_atlas_rank),] 190 | 191 | 192 | 193 | return(sing_emb_age) 194 | } 195 | 196 | wt10_gen_intrinsinc_ranking = function() { 197 | 198 | mat = scdb_mat("sing_emb_wt10") 199 | 200 | mc = scdb_mc("sing_emb_wt10_recolored") 201 | legc = log2(1e-5 + mc@e_gc) 202 | 203 | # define blood mcs 204 | f = ( legc["Cited4",] > -14 ) 205 | blood_mcs = which(f) 206 | 207 | # next extraembryonic endo 208 | x1 = -8.4 209 | y1 = -14 210 | x2 = -11 211 | y2 = -8.4 212 | 213 | b_exe_endo = (y2 - y1)/(x2 - x1) 214 | a_exe_endo = (y1*x2 - y2*x1)/(x2 - x1) 215 | 216 | f = legc["Ttr",] > a_exe_endo + b_exe_endo*legc["Apoe",] 217 | exe_endo_mcs = which(f) 218 | 219 | excluded_mcs = c(blood_mcs,exe_endo_mcs) 220 | 221 | df_init_ord = read.table("data/intrinsic_temporal_ranking/intrinsic_ranking_embryo_initial_order_first_iteration.txt",sep = "\t",stringsAsFactors = F) 222 | 223 | emb_age_group = unique(mat@cell_metadata[names(mc@mc),c("embryo","age_group")]) 224 | 225 | 226 | intrinsic_ranking = sing_emb_intrinsic_ranking(mat_id = "sing_emb_wt10", 227 | graph_id = "sing_emb_wt10", 228 | mc_id = "sing_emb_wt10_recolored", 229 | df_embryo_coarse_ranking = df_init_ord, 230 | excluded_mcs = excluded_mcs) 231 | 232 | 233 | intrinsic_ranking = sing_emb_intrinsic_ranking(mat_id = "sing_emb_wt10", 234 | graph_id = "sing_emb_wt10", 235 | mc_id = "sing_emb_wt10_recolored", 236 | df_embryo_coarse_ranking = intrinsic_ranking$embryo_final_order, 237 | excluded_mcs = excluded_mcs) 238 | 239 | return(intrinsic_ranking) 240 | } 241 | 242 | sing_emb_intrinsic_ranking = function(mat_id, graph_id,mc_id,df_embryo_coarse_ranking,excluded_mcs, 243 | number_of_neighbors = 50) { 244 | 245 | # This function is based only on a coarse initial ranking 246 | cgraph = scdb_cgraph(graph_id) 247 | mat = scdb_mat(mat_id) 248 | mc = scdb_mc(mc_id) 249 | number_of_neighbors = number_of_neighbors 250 | remove_neighbors_from_same_batch = TRUE 251 | reshuffle = T 252 | 253 | # initial ranking - use coarse morphological marks 254 | 255 | #check if the embryo names in df_embryo_coarse_ranking are all contained in the mat 256 | embryos = df_embryo_coarse_ranking$embryo 257 | embryos = embryos[embryos %in% unique(mat@cell_metadata[colnames(mat@mat),"embryo"])] 258 | 259 | included_mcs = setdiff(c(1:ncol(mc@e_gc)),excluded_mcs) 260 | included_cells = names(mc@mc)[mc@mc %in% included_mcs] 261 | 262 | #embryos = as.character(unique(mat@cell_metadata[names(mc@mc),"embryo"])) 263 | 264 | cell_to_embryo = as.character(mat@cell_metadata[names(mc@mc),"embryo"]) 265 | names(cell_to_embryo) = names(mc@mc) 266 | 267 | #old way 268 | #included_edges = cgraph@edges[(( cgraph@edges$mc1 %in% names(mc@mc) ) & ( cgraph@edges$mc2 %in% names(mc@mc) )),] 269 | #included_edges = included_edges[included_edges$w > (1 - number_of_neighbors/100),] 270 | #included_edges = included_edges[( included_edges$mc1 %in% included_cells ) & ( included_edges$mc2 %in% included_cells ),] 271 | 272 | #new way 273 | included_edges = cgraph@edges[(( cgraph@edges$mc1 %in% included_cells ) & ( cgraph@edges$mc2 %in% included_cells )),] 274 | included_edges$mc1 = as.character(included_edges$mc1) 275 | included_edges$mc2 = as.character(included_edges$mc2) 276 | 277 | # exclude embryo self edges 278 | f = !(cell_to_embryo[included_edges$mc1] == cell_to_embryo[included_edges$mc2]) 279 | included_edges = included_edges[f,] 280 | 281 | included_edges = included_edges[order(included_edges$mc1),] 282 | list_temp = tapply(X = 1-included_edges$w,INDEX = as.character(included_edges$mc1),FUN = function(x) {rank(x)}) 283 | neighbor_ranks = unlist(tapply(X = 1-included_edges$w,INDEX = as.character(included_edges$mc1),FUN = rank)) 284 | included_edges = included_edges[neighbor_ranks < number_of_neighbors +1,] 285 | included_cells = unique(included_edges$mc1) 286 | 287 | mat_emb_neighbors = table(mat@cell_metadata[as.character(included_edges$mc1),"embryo"], 288 | mat@cell_metadata[as.character(included_edges$mc2),"embryo"]) 289 | mat_emb_neighbors = mat_emb_neighbors[embryos,embryos] 290 | mat_emb_neighbors = as.matrix(mat_emb_neighbors) 291 | #mat_emb_neighbors = mat_emb_neighbors - diag(diag(mat_emb_neighbors)) 292 | knn_neighbors = rowSums(mat_emb_neighbors) 293 | batch_to_embryo_table = unique(mat@cell_metadata[names(mc@mc),c("embryo","Sort.Date")]) 294 | 295 | embryo_weight = table(mat@cell_metadata[included_cells,"embryo"]) 296 | embryo_weight = embryo_weight[embryos] 297 | 298 | # Normalize each column by the total number of cells this embryo has. 299 | mat_emb_neighbors = t(t(mat_emb_neighbors)/as.numeric(embryo_weight)) 300 | 301 | 302 | if (remove_neighbors_from_same_batch) { 303 | for (i in 1:length(embryos)) { 304 | embryo = embryos[i] 305 | batch_id = batch_to_embryo_table$Sort.Date[batch_to_embryo_table$embryo == embryo] 306 | embryos_same_batch = as.character(batch_to_embryo_table$embryo[batch_to_embryo_table$Sort.Date == batch_id]) 307 | #mat_emb_neighbors[embryos_same_batch,embryo] = 0 308 | 309 | embryo_neighbors = setdiff(embryos[max(1,i-10):min(i+10,length(embryos))],embryo) 310 | # next comes the renormalization part. 311 | if(length(intersect(embryo_neighbors,embryos_same_batch)) > 0) { 312 | f = intersect(embryo_neighbors,embryos_same_batch) 313 | same_batch_mean_of_neighbors = mean(mat_emb_neighbors[embryo,f]) 314 | f2 = setdiff(embryo_neighbors,embryos_same_batch) 315 | if (length(f2) > 0) { 316 | other_batches_mean_of_neighbors = mean(mat_emb_neighbors[embryo,f2]) 317 | if (same_batch_mean_of_neighbors > 0) { 318 | mat_emb_neighbors[embryo,embryos_same_batch] = mat_emb_neighbors[embryo,embryos_same_batch]* 319 | other_batches_mean_of_neighbors/same_batch_mean_of_neighbors 320 | } 321 | } 322 | } 323 | 324 | } 325 | } 326 | 327 | mat_emb_neighbors = mat_emb_neighbors/rowSums(mat_emb_neighbors) 328 | 329 | embryo_init_order = embryos 330 | mat_emb_init = mat_emb_neighbors 331 | 332 | # calculate reshuffled matrix 333 | out = sing_emb_time_ranking_shuffle_rows_and_cols(embryos,mat_emb_neighbors) 334 | 335 | mat_emb_neighbors = out$mat_reshuffled 336 | embryos = out$embryos_final_order 337 | 338 | embryo_final_order = data.frame(embryo = rownames(mat_emb_neighbors),transcriptional_rank = c(1:nrow(mat_emb_neighbors)), 339 | stringsAsFactors = F) 340 | 341 | 342 | return(list(similarity_mat = mat_emb_neighbors,embryo_final_order = embryo_final_order)) 343 | } 344 | 345 | 346 | sing_emb_time_ranking_shuffle_rows_and_cols = function(embryos,mat_emb_neighbors) { 347 | 348 | count = 0 349 | n_emb = length(embryos) 350 | n_iter_max = 100 351 | n_iter = 0 352 | delta_count = 1 353 | 354 | while ((n_iter < n_iter_max) & (delta_count > 0) ) { 355 | 356 | old_count = count 357 | n_iter = n_iter + 1 358 | 359 | for(i in 1:(n_emb-1)) { 360 | 361 | embryo = embryos[i] 362 | next_embryo = embryos[i+1] 363 | 364 | if (i == 1) { 365 | row_2 = sum(mat_emb_neighbors[embryo,(i+2):n_emb]) - sum(mat_emb_neighbors[next_embryo,(i+2):n_emb]) 366 | col_2 = sum(mat_emb_neighbors[(i+2):n_emb,embryo]) - sum(mat_emb_neighbors[(i+2):n_emb,next_embryo]) 367 | if (row_2 +col_2 > 0) { 368 | embryos[i] = next_embryo 369 | embryos[i+1] = embryo 370 | mat_emb_neighbors = mat_emb_neighbors[embryos,embryos] 371 | count = count + 1 372 | } 373 | } else if (i == (n_emb-1)) { 374 | row_1 = sum(mat_emb_neighbors[embryo,1:(i-1)]) - sum(mat_emb_neighbors[next_embryo,1:(i-1)]) 375 | col_1 = sum(mat_emb_neighbors[1:(i-1),embryo]) - sum(mat_emb_neighbors[1:(i-1),next_embryo]) 376 | if (row_1 + col_1 > 0) { 377 | embryos[i] = next_embryo 378 | embryos[i+1] = embryo 379 | mat_emb_neighbors = mat_emb_neighbors[embryos,embryos] 380 | count = count + 1 381 | } 382 | 383 | 384 | } else { 385 | row_1 = sum(mat_emb_neighbors[embryo,1:(i-1)]) - sum(mat_emb_neighbors[next_embryo,1:(i-1)]) 386 | row_2 = sum(mat_emb_neighbors[embryo,(i+2):n_emb]) - sum(mat_emb_neighbors[next_embryo,(i+2):n_emb]) 387 | col_1 = sum(mat_emb_neighbors[1:(i-1),embryo]) - sum(mat_emb_neighbors[1:(i-1),next_embryo]) 388 | col_2 = sum(mat_emb_neighbors[(i+2):n_emb,embryo]) - sum(mat_emb_neighbors[(i+2):n_emb,next_embryo]) 389 | if (row_2 - row_1 + col_2 - col_1 > 0) { 390 | embryos[i] = next_embryo 391 | embryos[i+1] = embryo 392 | mat_emb_neighbors = mat_emb_neighbors[embryos,embryos] 393 | count = count + 1 394 | } 395 | } 396 | 397 | 398 | } 399 | delta_count = count - old_count 400 | print(delta_count) 401 | } 402 | 403 | 404 | return(list(mat_reshuffled = mat_emb_neighbors,embryos_final_order = embryos)) 405 | } 406 | 407 | -------------------------------------------------------------------------------- /scripts/test_cmp_network_flow_model.r: -------------------------------------------------------------------------------- 1 | # test network flow model construction 2 | source("scripts/generate_mc_mgraph_network/gen_network.r") 3 | 4 | test_network_flow_model = function() { 5 | 6 | build_sing_emb_wt10_network(net_id = "test") 7 | message("computed network flow model") 8 | mct1 = scdb_mctnetwork("sing_emb_wt10") 9 | mct2 = scdb_mctnetwork("test") 10 | 11 | return(paste0("Recomputed network flow model is identical with sing_emb_wt10: ", 12 | identical(mct1@network,mct2@network))) 13 | } --------------------------------------------------------------------------------