├── .gitignore ├── Background_genes_null_brain.csv ├── LICENSE ├── README.md ├── examples.m ├── install.m └── src ├── freesurfer ├── copyright_notice.txt ├── fsgettmppath.m ├── load_nifti.m ├── load_nifti_hdr.m └── strlen.m ├── gene_expression.mat ├── group_regions.m ├── permutation_expression_null_brain.m ├── permutation_expression_null_coexp.m ├── permutation_expression_null_spin.m ├── permutation_null_brain.m ├── permutation_null_coexp.m ├── permutation_null_spin.m ├── permutation_null_spin_correlated_genes.m └── y_rand_gs_coexp.m /.gitignore: -------------------------------------------------------------------------------- 1 | src/__MACOSX/ 2 | src/.DS_Store 3 | src/atlas/ 4 | src/examples/ 5 | src/gene_expression.mat 6 | src/gene_expression_spin/ 7 | output/ 8 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2021 yongbin-wei 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # GAMBA-MATLAB 2 | 3 | A MATLAB toolbox to study whether the expression of the gene(s) of interest (GOI) and neuroimaging-derived brain phenotypes show overlapping spatial patterns. Different statistical **null models** are available to examine both ***gene specificity*** and ***spatial specificity***. This toolbox is an extension of the web application [GAMBA](http://www.dutchconnectomelab.nl/GAMBA/). 4 | 5 | For details, please see: 6 | 7 | > Wei Y. et al. (2021), Statistical testing and annotation of gene transcriptomic-neuroimaging associations, bioRxiv 8 | 9 | ## Installation 10 | 11 | ### Requirements 12 | 13 | Before you start, make sure you have **MATLAB** on your machine. 14 | 15 | In our examples we use [FSL - FMRIB Software Library](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki) to perform image coregistration. You can use other equivalent tools for the same purpose. See `examples.m` for details. 16 | 17 | ### Download 18 | 19 | Through the command line in terminal: 20 | 21 | `git clone https://github.com/dutchconnectomelab/GAMBA-MATLAB.git` 22 | 23 | Alternatively, you can click 'Code -- Download ZIP' and unzip the downloaded file. 24 | 25 | ### Install 26 | 27 | Open MATLAB and run `install.m` to install the toolbox and retrieve all required data. 28 | 29 | Data files will be automatically downloaded from: https://www.dropbox.com/sh/psfudnzktyd0860/AABtx7ESvEphO60dcV_xbQ4qa?dl=0 and be saved to the folder *src/*. 30 | 31 | ## Examples 32 | 33 | We use examples to show the utility of this toolbox.
Examples cover the following uses of the toolbox: 34 | 35 | - "I have an imaging map (e.g., a nifti file) and a gene set. I want to test if the imaging pattern correlates to the gene expression pattern." 36 | - "I have an imaging data matrix (region by feature), a gene expression data matrix (region by gene), and a gene set. I want to test if the imaging pattern correlates to the gene expression pattern." 37 | - "I have a gene set. I want to test in which brain regions the gene set is differentially expressed." 38 | - "I have a gene expression data matrix and a gene set. I want to test in which brain regions the gene set is differentially expressed." 39 | - "I have an imaging map (e.g., a nifti file) and I want to look for the most correlated genes." 40 | 41 | Scripts for the above questions are included in `examples.m`. Here is a detailed tutorial: 42 | 43 | #### 1. "I have an imaging map (a nifti file) and a gene set. I want to test if the imaging pattern correlates to the gene expression pattern." 44 | 45 | This example uses a VBM meta-analysis result -- a brain map showing the vulnerability of brain volume -- for Alzheimer's disease (AD) and examines whether the VBM pattern is associated with the brain expression pattern of three AD risk genes ('APOE', 'APP', 'PSEN2') using the GAMBA null models. 46 | 47 | In the first step, you need to co-register the imaging file to *'/src/atlas/brain.nii.gz'* (which is the MNI152 atlas used for brain parcellation and segmentation). The following code performs the co-registration using FSL FLIRT.
48 | 49 | ```matlab 50 | input_img_file = fullfile(filepath, 'src', 'examples', 'alzheimers_ALE.nii.gz'); 51 | input_img_anat_file = fullfile(filepath, 'src', 'examples', 'Colin27_T1_seg_MNI_2x2x2.nii.gz'); % anatomical file in the same space 52 | ref_img_file = fullfile(filepath, 'src', 'atlas', 'brain.nii.gz'); % reference file in MNI152 space 53 | reg_file = fullfile(filepath, 'output', 'registration.mat'); 54 | output_img_file = fullfile(filepath, 'output', 'coreg_alzheimers_ALE.nii.gz'); 55 | 56 | system(['flirt -in ', input_img_anat_file, ' -ref ', ref_img_file, ' -omat ', reg_file]); 57 | system(['flirt -in ', input_img_file, ' -ref ', ref_img_file, ' -applyxfm -init ', reg_file, ' -out ', output_img_file]); 58 | ``` 59 | 60 | Note 1: If `flirt` doesn’t work, you may need to use the absolute path to FSL (which can be viewed in a terminal through `echo $FSLDIR`). The MATLAB call should then look something like `system(['/usr/local/fsl/bin/flirt -in ', input_img_anat_file, … …` 61 | 62 | Note 2: If you do not use FSL, please adjust the code to coregister the input imaging file to */src/atlas/brain.nii.gz* and then continue. 63 | 64 | Next, you need to compute region-wise measures based on the registered imaging map. Here, voxels within each brain region in the Desikan-Killiany-114-region (DK114) atlas are averaged. 65 | 66 | ```matlab 67 | res_Y = group_regions(output_img_file, 'DK114'); 68 | ``` 69 | 70 | After this you will get a 114 x 1 brain-phenotypic data array, together with region descriptions of the 114 regions. 71 | 72 | Now it’s time to compute the correlation between the phenotypic data array and the gene expression data matrix. 73 | 74 | A preprocessed gene expression data matrix (57 x 17879) from the [Allen Human Brain Atlas](https://human.brain-map.org) will be used. Because only AHBA transcriptomic data from the left hemisphere are used, you also need to extract the brain-phenotypic data for the left hemisphere.
75 | 76 | ```matlab 77 | img_data = res_Y.data(contains(res_Y.regionDescriptions, 'ctx-lh-')); 78 | ``` 79 | 80 | Then, you can examine for instance whether there is a **spatially specific** association between the brain-phenotypic pattern and the mean expression pattern of the input GOI (here 'APOE', 'APP', 'PSEN2') using the **null-spatial** model. 81 | 82 | ```matlab 83 | res_nullspatial = permutation_null_spin(img_data, {'APOE', 'APP', 'PSEN2'}); 84 | ``` 85 | 86 | As we showed in our paper, **spatial specificity** alone is *however* not enough; you also need to examine the level of **gene specificity**, namely, whether the observed association exceeds what one can expect from other gene sets with similar co-expression levels (i.e., the **null-coexpressed-gene model**). 87 | 88 | ```matlab 89 | res_nullcoexpGene = permutation_null_coexp(img_data, {'APOE', 'APP', 'PSEN2'}); 90 | ``` 91 | 92 | A more stringent null model can be used to examine how many genes of the GOI are **brain-expressed** genes and whether the observed association exceeds what one can expect from any brain-expressed genes (i.e., the **null-brain-gene model**). 93 | 94 | ```matlab 95 | res_nullbraingene = permutation_null_brain(img_data, {'APOE', 'APP', 'PSEN2'}); 96 | ``` 97 | 98 | After doing the above analysis, you will know whether there is a significant association between your brain-phenotypic pattern and the gene set of your interest, and, more importantly, whether the association is **spatially specific** and **gene-specific**. 99 | 100 | 101 | 102 | 103 | #### 2. "I have an imaging data matrix (region by feature), a gene expression data matrix (region by gene), and a gene set. I want to test if the imaging pattern correlates to the expression pattern." 104 | 105 | This example examines whether the expression pattern of 19 human-supragranular-enriched (HSE) genes is associated with the pattern of regional connectome metrics.
The input data for this example include: 1) a gene expression data matrix (57 regions x 5000 genes, a subset of the entire AHBA data); 2) an imaging data matrix (57 regions by 5 phenotypes, such as NOS-, FA-, SD-weighted nodal strength, nodal degree, and FC strength); 3) a gene set (19 genes). 106 | 107 | You can load the example data first: 108 | 109 | ```matlab 110 | load('src/examples/example_conn_5k_genes.mat'); 111 | ``` 112 | 113 | Then you can use the **null-spatial model** to examine whether there is a spatially specific association between the brain-phenotypic pattern and the mean expression pattern of the input GOI. **ATTENTION:** the null-spatial model ONLY works for the DK114 atlas here. If you use another atlas, please refer to [Alexander-Bloch et al., 2018](https://github.com/spin-test/spin-test) to first generate random gene expression matrices using the spin-model. 114 | 115 | ```matlab 116 | res_nullspatial = permutation_null_spin(img_data, geneset); 117 | ``` 118 | 119 | Going beyond the null-spatial model, you need to examine the level of **gene specificity** using the **null-coexpressed-gene model**. 120 | 121 | ```matlab 122 | res_nullcoexpGene = permutation_null_coexp(img_data, geneset, gene_expression, gene_symbols); 123 | ``` 124 | 125 | A more stringent **null-brain-gene model** is recommended to examine how many genes of the GOI are **brain-expressed** genes and whether the observed association exceeds what one can expect from any brain-expressed genes. 126 | 127 | ```matlab 128 | res_nullbraingene = permutation_null_brain(img_data, geneset, gene_expression, gene_symbols); 129 | ``` 130 | 131 | The above code tells you whether there is a significant association between your brain-phenotypic pattern and the gene set of your interest, and, more importantly, whether the association is **spatially specific** and **gene-specific**. 132 | 133 | #### 3. "I have a gene set.
I want to test in which brain regions the gene set is differentially expressed." 134 | 135 | In the third example, we show how to use this toolbox to examine in which brain regions the input GOI is differentially expressed considering different types of null models. The human-supragranular-enriched (HSE) genes are used here. 136 | 137 | You can load the example data first: 138 | 139 | ```matlab 140 | load('src/examples/example_conn_5k_genes.mat', 'geneset'); 141 | ``` 142 | 143 | Now you get a gene set of 19 HSE genes. Then you can use the **null-coexpressed-gene model** to examine whether the mean expression of HSE genes is significantly higher/lower than random genes with the similar coexpression conserved for each brain region. 144 | 145 | ```matlab 146 | res_nullcoexpGene = permutation_expression_null_coexp(geneset); 147 | ``` 148 | 149 | You can also use the **null-brain-gene** model to examine whether the mean expression of HSE genes is significantly higher/lower than random brain-expressed genes for each brain region. 150 | 151 | ```matlab 152 | res_nullbraingene = permutation_expression_null_brain(geneset); 153 | ``` 154 | 155 | Using the above null models we will be able to examine the level of *gene specificity*. 156 | 157 | Another question we may ask is *whether the mean expression of the GOI is specifically higher in one region in contrast to random brain regions*. You can get the answer using the **null-spatial** model. 158 | 159 | ```matlab 160 | res_nullspatial = permutation_expression_null_spin(geneset); 161 | ``` 162 | 163 | #### 4. "I have a gene expression data matrix and a gene set. I want to test in which brain regions the gene set is over-expressed." 164 | 165 | In the fourth example, we show how to use this toolbox to examine in which brain regions the input GOI is differentially expressed considering different types of null models, given a customized gene expression matrix. 
The human-supragranular-enriched (HSE) genes are used as an example here. 166 | 167 | You can load the example data first. Gene symbols of the GOI, the gene expression matrix of all genes, and symbols of all genes are needed. 168 | 169 | ```matlab 170 | load('src/examples/example_conn_5k_genes.mat', 'geneset', 'gene_expression', 'gene_symbols'); 171 | ``` 172 | 173 | Then you can examine in which brain regions the input GOI is differentially expressed compared to random genes, using the **null-coexpressed-gene model**, where the null distribution of gene expression of random genes with a similar coexpression level is generated and used for permutation testing. 174 | 175 | ```matlab 176 | res_nullcoexpGene = permutation_expression_null_coexp(geneset, gene_expression, gene_symbols); 177 | ``` 178 | 179 | You can also examine the same question using the more stringent **null-brain-gene** model, where the null distribution of gene expression of brain-expressed genes is generated. 180 | 181 | ```matlab 182 | res_nullbraingene = permutation_expression_null_brain(geneset, gene_expression, gene_symbols); 183 | ``` 184 | 185 | For the other research question -- "in which brain region the input GOI is differentially expressed compared to random brain regions" -- you can use the following code only if you work on the Desikan-Killiany atlas with 57 left-hemisphere regions. Otherwise please use the spin model [Alexander-Bloch et al., 2018](https://github.com/spin-test/spin-test) or other equivalents to generate random gene expression matrices first. 186 | 187 | ```matlab 188 | res_nullspatial = permutation_expression_null_spin(geneset, gene_expression, gene_symbols); 189 | ``` 190 | 191 | #### 5. "I have an imaging map (.nii file) and I want to look for the most correlated genes" 192 | 193 | In the last example, we show how to use our toolbox to look for the most correlated genes given an imaging map. This example uses VBM meta-analysis results for Alzheimer's disease (AD).
194 | 195 | You need to first coregister the imaging file to MNI152 space. 196 | 197 | ```matlab 198 | input_img_file = fullfile(filepath, 'src', 'examples', 'alzheimers_ALE.nii.gz'); 199 | input_img_anat_file = fullfile(filepath, 'src', 'examples', 'Colin27_T1_seg_MNI_2x2x2.nii.gz'); 200 | ref_img_file = fullfile(filepath, 'src', 'atlas', 'brain.nii.gz'); % reference file in MNI152 space 201 | reg_file = fullfile(filepath, 'output', 'registration.mat'); 202 | output_img_file = fullfile(filepath, 'output', 'coreg_alzheimers_ALE.nii.gz'); 203 | 204 | system(['flirt -in ', input_img_anat_file, ' -ref ', ref_img_file, ' -omat ', reg_file]); 205 | system(['flirt -in ', input_img_file, ' -ref ', ref_img_file, ' -applyxfm -init ', reg_file, ' -out ', output_img_file]); 206 | ``` 207 | 208 | Note 1: If `flirt` doesn’t work, you may need to use the absolute path to FSL (which can be viewed in a terminal through `echo $FSLDIR`). The MATLAB call should then look something like `system(['/usr/local/fsl/bin/flirt -in ', input_img_anat_file, … …` 209 | 210 | Note 2: If you do not use FSL, please adjust the code to coregister the input imaging file to */src/atlas/brain.nii.gz* and then continue. 211 | 212 | Next, you need to compute region-wise measurements based on the registered imaging map. Here, voxels within each brain region in the DK-114 atlas are averaged. 213 | 214 | ```matlab 215 | res_Y = group_regions(output_img_file, 'DK114'); 216 | ``` 217 | 218 | Using the resulting phenotypic data, you can now correlate single-gene expression profiles with the phenotypic profile, and then run permutations per gene using the null-spatial model.
219 | 220 | ```matlab 221 | res_nullspatial = permutation_null_spin_correlated_genes(res_Y.data(1:57)); 222 | ``` 223 | -------------------------------------------------------------------------------- /examples.m: -------------------------------------------------------------------------------- 1 | clc, clear, close all; 2 | filepath = fileparts(mfilename('fullpath')); 3 | addpath(genpath(filepath)); 4 | 5 | % Select an example to run 6 | exampleID = 3; % 1, 2, 3, 4, 5 7 | 8 | 9 | %% Example 1 10 | % Research question: 11 | % - I have an imaging map (a nifti file) and a gene set. I want to test if 12 | % the imaging pattern correlates to the expression pattern. 13 | if exampleID == 1 14 | disp(strcat('Example 1. Examining association between Alzheimer', ... 15 | "'", 's VBM map and APOE, APP, PSEN2 expression patterns.')); 16 | 17 | % 1.1 Co-register the imaging file to MNI152 18 | input_img_file = fullfile(filepath, 'src', 'examples', ... 19 | 'alzheimers_ALE.nii.gz'); % meta-analysis of alzheimer's VBM studies 20 | input_img_anat_file = fullfile(filepath, 'src', 'examples', ... 21 | 'Colin27_T1_seg_MNI_2x2x2.nii.gz'); % anatomical file in the same space 22 | ref_img_file = fullfile(filepath, 'src', 'atlas', ... 23 | 'brain.nii.gz'); % reference file in MNI152 space 24 | reg_file = fullfile(filepath, 'output', 'registration.mat'); 25 | output_img_file = fullfile(filepath, 'output', 'coreg_alzheimers_ALE.nii.gz'); 26 | 27 | % here using FSL flirt 28 | system(['flirt -in ', input_img_anat_file, ' -ref ', ref_img_file, ... 29 | ' -omat ', reg_file]); 30 | 31 | system(['flirt -in ', input_img_file, ' -ref ', ref_img_file, ...
32 | ' -applyxfm -init ', reg_file, ' -out ', output_img_file]); 33 | 34 | % 1.2 Group the imaging map into brain regions (here regional mean) 35 | res_Y = group_regions(output_img_file, 'DK114'); 36 | 37 | % 1.3 Association between the imaging profile and the expression profile 38 | img_data = res_Y.data(contains(res_Y.regionDescriptions, 'ctx-lh-')); 39 | 40 | % 1.3.1 Null-coexpression model 41 | res_nullcoexp = permutation_null_coexp(img_data, {'APOE', 'APP', 'PSEN2'}); 42 | 43 | % 1.3.2 Null-brain model 44 | res_nullbrain = permutation_null_brain(img_data, {'APOE', 'APP', 'PSEN2'}); 45 | 46 | % 1.3.3 Null-spin model 47 | res_nullspin = permutation_null_spin(img_data, {'APOE', 'APP', 'PSEN2'}); 48 | end 49 | 50 | %% Example 2 51 | % Research question: 52 | % - I have an imaging data matrix (region by feature), a gene expression data 53 | % matrix (region by gene), and a GOI. I want to test if the imaging 54 | % pattern correlates to the expression pattern. 55 | if exampleID == 2 56 | disp(['Example 2. Examining association between connectome ', ... 57 | 'metrics and the expression pattern of supragranular-enriched genes.']); 58 | 59 | % 2.1 load example data 60 | load('src/examples/example_conn_5k_genes.mat'); 61 | % expression data matrix: 57 regions by 5000 genes 62 | % imaging data matrix: 57 regions by 5 phenotypes 63 | % gene set: 19 Human-supragranular genes 64 | 65 | % 2.2 Association between the imaging profile and the expression profile 66 | % 2.2.1 null-coexpression model 67 | res_nullcoexp = permutation_null_coexp(img_data, geneset, ... 68 | gene_expression, gene_symbols); 69 | 70 | % 2.2.2 null-brain model 71 | res_nullbrain = permutation_null_brain(img_data, geneset, ... 72 | gene_expression, gene_symbols); 73 | 74 | % 2.2.3 null-spin model 75 | % ATTENTION: null-spin model ONLY works for DK114 atlas. If you use 76 | % another atlas, please refer to Alexander-Bloch et al., 2018 to first 77 | % generate gene expression data for 'spinned' atlas.
78 | res_nullspin = permutation_null_spin(img_data, geneset); 79 | end 80 | 81 | %% Example 3 82 | % Research question: 83 | % - I have a GOI. I want to test in which brain regions the GOI is 84 | % over-expressed. 85 | 86 | if exampleID == 3 87 | disp(['Example 3. Testing in which brain regions ', ... 88 | 'supragranular-enriched genes are differentially expressed.']); 89 | 90 | % 3.1 load example data 91 | load('src/examples/example_conn_5k_genes.mat', 'geneset'); 92 | % gene set: 19 Human-supragranular genes 93 | 94 | % 3.2.1 null-coexp model 95 | res_nullcoexp = permutation_expression_null_coexp(geneset); 96 | 97 | % 3.2.2 null-brain model 98 | res_nullbrain = permutation_expression_null_brain(geneset); 99 | 100 | % 3.2.3 null-spin model 101 | res_nullspin = permutation_expression_null_spin(geneset); 102 | end 103 | 104 | %% Example 4 105 | % Research question: 106 | % - I have a gene expression matrix and a GOI. I want to test in which 107 | % brain regions the gene-set is over-expressed. 108 | 109 | if exampleID == 4 110 | disp(['Example 4. Testing in which brain regions supragranular-enriched ', ... 111 | 'genes are differentially expressed, given a custom expression matrix.']); 112 | 113 | % 4.1 load example data 114 | load('src/examples/example_conn_5k_genes.mat', 'geneset', ... 115 | 'gene_expression', 'gene_symbols'); 116 | % gene set: 19 Human-supragranular genes 117 | 118 | % 4.2.1 null-coexp model 119 | res_nullcoexp = permutation_expression_null_coexp(geneset, ... 120 | gene_expression, gene_symbols); 121 | 122 | % 4.2.2 null-brain model 123 | res_nullbrain = permutation_expression_null_brain(geneset, ... 124 | gene_expression, gene_symbols); 125 | 126 | % 4.2.3 null-spin model 127 | res_nullspin = permutation_expression_null_spin(geneset, ...
128 | gene_expression, gene_symbols); 129 | end 130 | 131 | %% Example 5 132 | % Research question: 133 | % - I have an imaging map (.nii file) and I want to look for the most 134 | % correlated genes 135 | 136 | if exampleID == 5 137 | % 5.1 Co-register the imaging file to MNI152 space 138 | input_img_file = fullfile(filepath, 'src', 'examples', ... 139 | 'alzheimers_ALE.nii.gz'); % meta-analysis of alzheimer's VBM studies 140 | input_img_anat_file = fullfile(filepath, 'src', 'examples', ... 141 | 'Colin27_T1_seg_MNI_2x2x2.nii.gz'); % anatomical file in the same space 142 | ref_img_file = fullfile(filepath, 'src', 'atlas', ... 143 | 'brain.nii.gz'); % reference file in MNI152 space 144 | reg_file = fullfile(filepath, 'output', 'registration.mat'); 145 | output_img_file = fullfile(filepath, 'output', 'coreg_alzheimers_ALE.nii.gz'); 146 | 147 | % here using FSL flirt 148 | system(['flirt -in ', input_img_anat_file, ' -ref ', ref_img_file, ... 149 | ' -omat ', reg_file]); 150 | 151 | system(['flirt -in ', input_img_file, ' -ref ', ref_img_file, ... 
152 | ' -applyxfm -init ', reg_file, ' -out ', output_img_file]); 153 | 154 | % 5.2 Group the imaging map into brain regions (here regional mean) 155 | res_Y = group_regions(output_img_file, 'DK114'); 156 | 157 | % 5.3 Using the null-spin model to look for the most correlated genes 158 | res_nullspin = permutation_null_spin_correlated_genes(res_Y.data(1:57)); 159 | end 160 | -------------------------------------------------------------------------------- /install.m: -------------------------------------------------------------------------------- 1 | % Run this script for the first time to retrieve necessary data 2 | clc, clear, close all 3 | filepath = fileparts(mfilename('fullpath')); 4 | 5 | if ~exist(fullfile(filepath, 'output'), 'dir') 6 | mkdir(fullfile(filepath, 'output')); 7 | end 8 | 9 | % Add path 10 | addpath(genpath(filepath)); 11 | savepath; 12 | 13 | % Download atlas data 14 | fprintf('%s', '## Download atlas data ...'); 15 | outfilename = websave(fullfile(filepath, 'src', 'atlas.zip'), ... 16 | 'https://www.dropbox.com/s/a3ov1ztgaen5u2m/atlas.zip?dl=1'); 17 | unzip(outfilename, fullfile(filepath, 'src')); 18 | delete(outfilename); 19 | fprintf('%s\n', 'finished'); 20 | 21 | % Download examples 22 | fprintf('%s', '## Download examples ...'); 23 | outfilename = websave(fullfile(filepath, 'src', 'examples.zip'), ... 24 | 'https://www.dropbox.com/s/3iteb4btcbvi39p/examples.zip?dl=1'); 25 | unzip(outfilename, fullfile(filepath, 'src')); 26 | delete(outfilename); 27 | fprintf('%s\n', 'finished'); 28 | 29 | % Download data for the null-spatial model 30 | fprintf('%s', '## Download expression data for the null-spatial model (~1.5G) ...'); 31 | outfilename = websave(fullfile(filepath, 'src', 'gene_expression_spin.zip'), ... 
32 | 'https://www.dropbox.com/s/nwmtqro3dkma3u1/gene_expression_spin.zip?dl=1'); 33 | unzip(outfilename, fullfile(filepath, 'src')); 34 | delete(outfilename); 35 | fprintf('%s\n', 'finished'); 36 | -------------------------------------------------------------------------------- /src/freesurfer/copyright_notice.txt: -------------------------------------------------------------------------------- 1 | The scripts in this directory are copied or adapted from version 6.0.0 of the 2 | FreeSurfer software suite and are subject to the FreeSurfer Software License 3 | Agreement: 4 | https://surfer.nmr.mgh.harvard.edu/fswiki/FreeSurferSoftwareLicense 5 | 6 | -------------------------------------------------------------------------------- /src/freesurfer/fsgettmppath.m: -------------------------------------------------------------------------------- 1 | function tmppath = fsgettmppath(tmppathdefault) 2 | % tmppath = fsgettmppath() 3 | % Gets path to a temporary folder (does NOT generate a random file name) 4 | % First looks in the following order: 5 | % 1. $TMPDIR - env var must exist and folder must exist 6 | % 2. $TEMPDIR - env var must exist and folder must exist 7 | % 3. /scratch - folder must exist 8 | % 4. /tmp - folder must exist 9 | % 5. tmppathdefault - if passed 10 | % 6. current folder, ie, ./ (prints warning) 11 | 12 | 13 | tmppath = getenv('TMPDIR'); 14 | if(~isempty(tmppath)) 15 | if(exist(tmppath)) 16 | return; 17 | end 18 | end 19 | 20 | tmppath = getenv('TEMPDIR'); 21 | if(~isempty(tmppath)) 22 | if(exist(tmppath)) 23 | return; 24 | end 25 | end 26 | 27 | tmppath = '/scratch'; 28 | if(exist(tmppath)) return; end 29 | 30 | tmppath = '/tmp'; 31 | if(exist(tmppath)) return; end 32 | 33 | if(nargin > 0) 34 | tmppath = tmppathdefault; 35 | if(exist(tmppath)) return; end 36 | end 37 | 38 | tmppath = './'; 39 | fprintf(['WARNING: fsgettmppath: could not find a temporary folder,' ... 
40 | ' using current folder\n']); 41 | 42 | return; 43 | 44 | 45 | 46 | -------------------------------------------------------------------------------- /src/freesurfer/load_nifti.m: -------------------------------------------------------------------------------- 1 | function hdr = load_nifti(niftifile,hdronly) 2 | % hdr = load_nifti(niftifile,hdronly) 3 | % 4 | % Loads nifti header and volume. The volume is stored 5 | % in hdr.vol. Columns and rows are not swapped. 6 | % 7 | % Handles compressed nifti (nii.gz) by issuing a unix command to 8 | % uncompress the file to a temporary file, which is then deleted. 9 | % 10 | % Dimensions are in mm and msec 11 | % hdr.pixdim(1) = physical size of first dim (eg, 3.125 mm or 2000 ms) 12 | % hdr.pixdim(2) = ... 13 | % 14 | % The sform and qform matrices are stored in hdr.sform and hdr.qform. 15 | % 16 | % hdr.vox2ras is the vox2ras matrix based on sform (if valid), then 17 | % qform. 18 | % 19 | % Handles data structures with more than 32k cols by looking for 20 | % hdr.dim(2) = -1 in which case ncols = hdr.glmin. This is FreeSurfer 21 | % specific, for handling surfaces. When the total number of spatial 22 | % voxels equals 163842, then the volume is reshaped to 23 | % 163842x1x1xnframes. This is for handling the 7th order icosahedron 24 | % used by FS group analysis. 
25 | % 26 | % See also: load_nifti_hdr.m 27 | % 28 | 29 | 30 | % 31 | % load_nifti.m 32 | % 33 | % Original Author: Doug Greve 34 | % CVS Revision Info: 35 | % $Author: greve $ 36 | % $Date: 2016/01/19 21:18:27 $ 37 | % $Revision: 1.21 $ 38 | % 39 | % Copyright © 2011 The General Hospital Corporation (Boston, MA) "MGH" 40 | % 41 | % Terms and conditions for use, reproduction, distribution and contribution 42 | % are found in the 'FreeSurfer Software License Agreement' contained 43 | % in the file 'LICENSE' found in the FreeSurfer distribution, and here: 44 | % 45 | % https://surfer.nmr.mgh.harvard.edu/fswiki/FreeSurferSoftwareLicense 46 | % 47 | % Reporting: freesurfer@nmr.mgh.harvard.edu 48 | % 49 | 50 | hdr = []; 51 | 52 | if(nargin < 1 | nargin > 2) 53 | fprintf('hdr = load_nifti(niftifile,)\n'); 54 | return; 55 | end 56 | 57 | if(~exist('hdronly','var')) hdronly = []; end 58 | if(isempty(hdronly)) hdronly = 0; end 59 | 60 | % unzip if it is compressed 61 | ext = niftifile((strlen(niftifile)-2):strlen(niftifile)); 62 | if(strcmpi(ext,'.gz')) 63 | % Need to create unique file name (harder than it looks) 64 | %r0 = rand('state'); rand('state', sum(100*clock)); 65 | %gzipped = round(rand(1)*10000000 + sum(int16(niftifile))) + round(cputime); 66 | %rand('state',r0); 67 | new_niftifile = sprintf('%s.load_nifti.m.nii', tempname(fsgettmppath)); 68 | %fprintf('Uncompressing %s to %s\n',niftifile,new_niftifile); 69 | gzipped = 1; 70 | if(strcmp(computer,'MAC') || strcmp(computer,'MACI') || ismac) 71 | cmd = sprintf('gunzip -c %s > %s', niftifile, new_niftifile) 72 | else 73 | cmd = sprintf('zcat %s > %s', niftifile, new_niftifile); 74 | end 75 | [status result] = unix(cmd); 76 | if(status) 77 | fprintf('cd %s\n',pwd); 78 | fprintf('%s\n',cmd); 79 | fprintf('ERROR: %s\n',result); 80 | return; 81 | end 82 | niftifile = new_niftifile ; 83 | else 84 | gzipped = -1 ; 85 | end 86 | 87 | hdr = load_nifti_hdr(niftifile); 88 | if(isempty(hdr)) 89 | if(gzipped >=0) 90 | cmd = 
sprintf('rm -f %s', niftifile); 91 | [status result] = unix(cmd); 92 | if(status) 93 | fprintf('cd %s\n',pwd); 94 | fprintf('%s\n',cmd); 95 | fprintf('ERROR: %s\n',result); 96 | return; 97 | end 98 | end 99 | return; 100 | end 101 | 102 | % Check for ico7 103 | nspatial = prod(hdr.dim(2:4)); 104 | IsIco7 = 0; 105 | if(nspatial == 163842) IsIco7 = 1; end 106 | 107 | % If only header is desired, return now 108 | if(hdronly) 109 | if(gzipped >=0) unix(sprintf('rm -f %s', niftifile)); end 110 | if(IsIco7) 111 | % Reshape 112 | hdr.dim(2) = 163842; 113 | hdr.dim(3) = 1; 114 | hdr.dim(4) = 1; 115 | end 116 | return; 117 | end 118 | 119 | % Get total number of voxels 120 | dim = hdr.dim(2:end); 121 | ind0 = find(dim==0); 122 | dim(ind0) = 1; 123 | nvoxels = prod(dim); 124 | 125 | % Open to read the pixel data 126 | fp = fopen(niftifile,'r',hdr.endian); 127 | 128 | % Get past the header 129 | fseek(fp,round(hdr.vox_offset),'bof'); 130 | 131 | switch(hdr.datatype) 132 | % Note: 'char' seems to work upto matlab 7.1, but 'uchar' needed 133 | % for 7.2 and higher. 
134 | case 2, [hdr.vol nitemsread] = fread(fp,inf,'uchar'); 135 | case 4, [hdr.vol nitemsread] = fread(fp,inf,'short'); 136 | case 8, [hdr.vol nitemsread] = fread(fp,inf,'int'); 137 | case 16, [hdr.vol nitemsread] = fread(fp,inf,'float'); 138 | case 64, [hdr.vol nitemsread] = fread(fp,inf,'double'); 139 | case 512, [hdr.vol nitemsread] = fread(fp,inf,'ushort'); 140 | case 768, [hdr.vol nitemsread] = fread(fp,inf,'uint'); 141 | otherwise, 142 | fprintf('ERROR: data type %d not supported',hdr.datatype); 143 | hdr = []; 144 | fclose(fp); 145 | if(gzipped >=0) 146 | fprintf('Deleting temporary uncompressed file %s\n',niftifile); 147 | unix(sprintf('rm -f %s', niftifile)); 148 | end 149 | return; 150 | end 151 | 152 | fclose(fp); 153 | if(gzipped >=0) 154 | %fprintf('Deleting temporary uncompressed file %s\n',niftifile); 155 | unix(sprintf('rm -f %s', niftifile)); 156 | end 157 | 158 | % Check that that many voxels were read in 159 | if(nitemsread ~= nvoxels) 160 | fprintf('ERROR: %s, read in %d voxels, expected %d\n',... 161 | niftifile,nitemsread,nvoxels); 162 | hdr = []; 163 | return; 164 | end 165 | 166 | if(IsIco7) 167 | %fprintf('load_nifti: ico7 reshaping\n'); 168 | hdr.dim(2) = 163842; 169 | hdr.dim(3) = 1; 170 | hdr.dim(4) = 1; 171 | dim = hdr.dim(2:end); 172 | end 173 | 174 | hdr.vol = reshape(hdr.vol, dim'); 175 | if(hdr.scl_slope ~= 0) 176 | %fprintf('Rescaling NIFTI: slope = %g, intercept = %g\n',... 177 | % hdr.scl_slope,hdr.scl_inter); 178 | hdr.vol = hdr.vol * hdr.scl_slope + hdr.scl_inter; 179 | end 180 | 181 | return; 182 | 183 | 184 | 185 | 186 | 187 | -------------------------------------------------------------------------------- /src/freesurfer/load_nifti_hdr.m: -------------------------------------------------------------------------------- 1 | function hdr = load_nifti_hdr(niftifile) 2 | % hdr = load_nifti_hdr(niftifile) 3 | % 4 | % Changes units to mm and msec. 5 | % Creates hdr.sform and hdr.qform with the matrices in them. 
6 | % Creates hdr.vox2ras based on sform if valid, then qform. 7 | % Does not and will not handle compressed. Compression is handled 8 | % in load_nifti.m, which calls load_nifti_hdr.m after any 9 | % decompression. 10 | % 11 | % Endianness is returned as hdr.endian, which is either 'l' or 'b'. 12 | % When opening again, use fp = fopen(niftifile,'r',hdr.endian); 13 | % 14 | % Handles data structures with more than 32k cols by 15 | % reading hdr.glmin = ncols when hdr.dim(2) < 0. This 16 | % is FreeSurfer specific, for handling surfaces. 17 | % 18 | % $Id: load_nifti_hdr.m,v 1.10.4.1 2016/08/02 21:03:47 greve Exp $ 19 | 20 | 21 | % 22 | % load_nifti_hdr.m 23 | % 24 | % Original Author: Doug Greve 25 | % CVS Revision Info: 26 | % $Author: greve $ 27 | % $Date: 2016/08/02 21:03:47 $ 28 | % $Revision: 1.10.4.1 $ 29 | % 30 | % Copyright © 2011 The General Hospital Corporation (Boston, MA) "MGH" 31 | % 32 | % Terms and conditions for use, reproduction, distribution and contribution 33 | % are found in the 'FreeSurfer Software License Agreement' contained 34 | % in the file 'LICENSE' found in the FreeSurfer distribution, and here: 35 | % 36 | % https://surfer.nmr.mgh.harvard.edu/fswiki/FreeSurferSoftwareLicense 37 | % 38 | % Reporting: freesurfer@nmr.mgh.harvard.edu 39 | % 40 | 41 | 42 | hdr = []; 43 | 44 | if(nargin ~= 1) 45 | fprintf('hdr = load_nifti_hdr(niftifile)\n'); 46 | return; 47 | end 48 | 49 | % Try opening as big endian first 50 | fp = fopen(niftifile,'r','b'); 51 | if(fp == -1) 52 | fprintf('ERROR: could not read %s\n',niftifile); 53 | return; 54 | end 55 | 56 | hdr.sizeof_hdr = fread(fp,1,'int'); 57 | if(hdr.sizeof_hdr ~= 348) 58 | fclose(fp); 59 | % Now try opening as little endian 60 | fp = fopen(niftifile,'r','l'); 61 | hdr.sizeof_hdr = fread(fp,1,'int'); 62 | if(hdr.sizeof_hdr ~= 348) 63 | fclose(fp); 64 | fprintf('ERROR: %s: hdr size = %d, should be 348\n',... 
65 | niftifile,hdr.sizeof_hdr); 66 | hdr = []; 67 | return; 68 | end 69 | hdr.endian = 'l'; 70 | else 71 | hdr.endian = 'b'; 72 | end 73 | 74 | hdr.data_type = fscanf(fp,'%c',10); 75 | hdr.db_name = fscanf(fp,'%c',18); 76 | hdr.extents = fread(fp, 1,'int'); 77 | hdr.session_error = fread(fp, 1,'short'); 78 | hdr.regular = fread(fp, 1,'char'); 79 | hdr.dim_info = fread(fp, 1,'char'); 80 | hdr.dim = fread(fp, 8,'short'); 81 | hdr.intent_p1 = fread(fp, 1,'float'); 82 | hdr.intent_p2 = fread(fp, 1,'float'); 83 | hdr.intent_p3 = fread(fp, 1,'float'); 84 | hdr.intent_code = fread(fp, 1,'short'); 85 | hdr.datatype = fread(fp, 1,'short'); 86 | hdr.bitpix = fread(fp, 1,'short'); 87 | hdr.slice_start = fread(fp, 1,'short'); 88 | hdr.pixdim = fread(fp, 8,'float'); % physical units 89 | hdr.vox_offset = fread(fp, 1,'float'); 90 | hdr.scl_slope = fread(fp, 1,'float'); 91 | hdr.scl_inter = fread(fp, 1,'float'); 92 | hdr.slice_end = fread(fp, 1,'short'); 93 | hdr.slice_code = fread(fp, 1,'char'); 94 | hdr.xyzt_units = fread(fp, 1,'char'); 95 | hdr.cal_max = fread(fp, 1,'float'); 96 | hdr.cal_min = fread(fp, 1,'float'); 97 | hdr.slice_duration = fread(fp, 1,'float'); 98 | hdr.toffset = fread(fp, 1,'float'); 99 | hdr.glmax = fread(fp, 1,'int'); 100 | hdr.glmin = fread(fp, 1,'int'); 101 | hdr.descrip = fscanf(fp,'%c',80); 102 | hdr.aux_file = fscanf(fp,'%c',24); 103 | hdr.qform_code = fread(fp, 1,'short'); 104 | hdr.sform_code = fread(fp, 1,'short'); 105 | hdr.quatern_b = fread(fp, 1,'float'); 106 | hdr.quatern_c = fread(fp, 1,'float'); 107 | hdr.quatern_d = fread(fp, 1,'float'); 108 | hdr.quatern_x = fread(fp, 1,'float'); 109 | hdr.quatern_y = fread(fp, 1,'float'); 110 | hdr.quatern_z = fread(fp, 1,'float'); 111 | hdr.srow_x = fread(fp, 4,'float'); 112 | hdr.srow_y = fread(fp, 4,'float'); 113 | hdr.srow_z = fread(fp, 4,'float'); 114 | hdr.intent_name = fscanf(fp,'%c',16); 115 | hdr.magic = fscanf(fp,'%c',4); 116 | 117 | fclose(fp); 118 | 119 | % This is to accomodate structures 
with more than 32k cols 120 | % FreeSurfer specific. See also mriio.c. 121 | if(hdr.dim(2) < 0) 122 | hdr.dim(2) = hdr.glmin; 123 | hdr.glmin = 0; 124 | end 125 | 126 | % look at xyz units and convert to mm if needed 127 | xyzunits = bitand(hdr.xyzt_units,7); % 0x7 128 | switch(xyzunits) 129 | case 1, xyzscale = 1000.000; % meters 130 | case 2, xyzscale = 1.000; % mm 131 | case 3, xyzscale = .001; % microns 132 | otherwise, 133 | fprintf('WARNING: xyz units code %d is unrecognized, assuming mm\n',xyzunits); 134 | xyzscale = 1; % just assume mm 135 | end 136 | hdr.pixdim(2:4) = hdr.pixdim(2:4) * xyzscale; 137 | hdr.srow_x = hdr.srow_x * xyzscale; 138 | hdr.srow_y = hdr.srow_y * xyzscale; 139 | hdr.srow_z = hdr.srow_z * xyzscale; 140 | 141 | % look at time units and convert to msec if needed 142 | tunits = bitand(hdr.xyzt_units,3*16+8); % 0x38 143 | switch(tunits) 144 | case 8, tscale = 1000.000; % seconds 145 | case 16, tscale = 1.000; % msec 146 | case 32, tscale = .001; % microsec 147 | otherwise, tscale = 0; 148 | end 149 | hdr.pixdim(5) = hdr.pixdim(5) * tscale; 150 | 151 | % Change value in xyzt_units to reflect scale change 152 | hdr.xyzt_units = bitor(2,16); % 2=mm, 16=msec 153 | 154 | % Sform matrix 155 | hdr.sform = [hdr.srow_x'; 156 | hdr.srow_y'; 157 | hdr.srow_z'; 158 | 0 0 0 1]; 159 | 160 | % Qform matrix - not quite sure how all this works, 161 | % mainly just copied CH's code from mriio.c 162 | b = hdr.quatern_b; 163 | c = hdr.quatern_c; 164 | d = hdr.quatern_d; 165 | x = hdr.quatern_x; 166 | y = hdr.quatern_y; 167 | z = hdr.quatern_z; 168 | a = 1.0 - (b*b + c*c + d*d); 169 | if(abs(a) < 1.0e-7) 170 | a = 1.0 / sqrt(b*b + c*c + d*d); 171 | b = b*a; 172 | c = c*a; 173 | d = d*a; 174 | a = 0.0; 175 | else 176 | a = sqrt(a); 177 | end 178 | r11 = a*a + b*b - c*c - d*d; 179 | r12 = 2.0*b*c - 2.0*a*d; 180 | r13 = 2.0*b*d + 2.0*a*c; 181 | r21 = 2.0*b*c + 2.0*a*d; 182 | r22 = a*a + c*c - b*b - d*d; 183 | r23 = 2.0*c*d - 2.0*a*b; 184 | r31 = 2.0*b*d - 2*a*c; 
185 | r32 = 2.0*c*d + 2*a*b; 186 | r33 = a*a + d*d - c*c - b*b; 187 | if(hdr.pixdim(1) < 0.0) 188 | r13 = -r13; 189 | r23 = -r23; 190 | r33 = -r33; 191 | end 192 | qMdc = [r11 r12 r13; r21 r22 r23; r31 r32 r33]; 193 | D = diag(hdr.pixdim(2:4)); 194 | P0 = [x y z]'; 195 | hdr.qform = [qMdc*D P0; 0 0 0 1]; 196 | 197 | if(hdr.sform_code ~= 0) 198 | % Use sform first 199 | hdr.vox2ras = hdr.sform; 200 | elseif(hdr.qform_code ~= 0) 201 | % Then use qform first 202 | hdr.vox2ras = hdr.qform; 203 | else 204 | fprintf('WARNING: neither sform or qform are valid in %s\n', ... 205 | niftifile); 206 | D = diag(hdr.pixdim(2:4)); 207 | P0 = [0 0 0]'; 208 | hdr.vox2ras = [eye(3)*D P0; 0 0 0 1]; 209 | end 210 | 211 | return; 212 | 213 | 214 | 215 | 216 | 217 | -------------------------------------------------------------------------------- /src/freesurfer/strlen.m: -------------------------------------------------------------------------------- 1 | function len = strlen(str) 2 | % len = strlen(str) 3 | % compute the # of characters in str (ignoring 0s at the end) 4 | 5 | 6 | % 7 | % strlen.m 8 | % 9 | % Original Author: Bruce Fischl 10 | % CVS Revision Info: 11 | % $Author: nicks $ 12 | % $Date: 2011/03/02 00:04:13 $ 13 | % $Revision: 1.3 $ 14 | % 15 | % Copyright © 2011 The General Hospital Corporation (Boston, MA) "MGH" 16 | % 17 | % Terms and conditions for use, reproduction, distribution and contribution 18 | % are found in the 'FreeSurfer Software License Agreement' contained 19 | % in the file 'LICENSE' found in the FreeSurfer distribution, and here: 20 | % 21 | % https://surfer.nmr.mgh.harvard.edu/fswiki/FreeSurferSoftwareLicense 22 | % 23 | % Reporting: freesurfer@nmr.mgh.harvard.edu 24 | % 25 | 26 | len = length(str) ; 27 | for i=length(str):-1:1 28 | if (str(i) ~= 0) 29 | break ; 30 | end 31 | 32 | len = len-1; 33 | end 34 | 35 | 36 | -------------------------------------------------------------------------------- /src/gene_expression.mat: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/dutchconnectomelab/GAMBA-MATLAB/0c0128688ba898ea8bc14f0a54d6f18e12da2767/src/gene_expression.mat -------------------------------------------------------------------------------- /src/group_regions.m: -------------------------------------------------------------------------------- 1 | function res = group_regions(input_file, atlas) 2 | % GROUP_REGIONS(...) computes region-wise mean values from a voxel-wise 3 | % brain map. 4 | % 5 | % INPUT 6 | % input_file -- input brain map (.nii file) that has been co-registered 7 | % to the same space as './src/atlas/brain.nii.gz' (i.e., MNI152 brain) 8 | % 9 | % OPTIONAL: 10 | % atlas -- atlas name. Default: 'DK114'. Other options: 'aparc', 'DK219' 11 | % 12 | % OUTPUT 13 | % res.data -- N x 1 array of regional mean values extracted from the 14 | % input imaging data. N is the number of regions. 15 | % res.regionDescriptions -- N x 1 cell str of region descriptions. 16 | % res.regionIndexes -- N x 1 array of region indexes. 17 | % 18 | % REFERENCE 19 | % Wei Y. et al., (2021) Statistical testing and annotation of gene 20 | % transcriptomic-neuroimaging associations, bioRxiv 21 | % 22 | % NOTE: This function uses FreeSurfer's load_nifti(...) to read NIfTI files. 23 | % Fischl, B. 2012. “FreeSurfer.” NeuroImage 62 (2): 774–81. 
24 | 25 | 26 | if nargin == 1 27 | atlas = 'DK114'; 28 | end 29 | 30 | filepath = fileparts(mfilename('fullpath')); 31 | 32 | switch atlas 33 | case 'DK114' 34 | lookuptable = fullfile(filepath, 'atlas', 'lausanne120.txt'); 35 | ref_file = fullfile(filepath, 'atlas', 'lausanne120+aseg.nii.gz'); 36 | case 'aparc' 37 | lookuptable = fullfile(filepath, 'atlas', 'aparc.txt'); 38 | ref_file = fullfile(filepath, 'atlas', 'aparc+aseg.nii.gz'); 39 | case 'DK219' 40 | lookuptable = fullfile(filepath, 'atlas', 'lausanne250.txt'); 41 | ref_file = fullfile(filepath, 'atlas', 'lausanne250+aseg.nii.gz'); 42 | otherwise 43 | error('Atlas "%s" is not supported. Use ''DK114'', ''aparc'', or ''DK219''.', atlas); 44 | end 45 | 46 | % load data 47 | hdr = load_nifti(input_file); 48 | vol = hdr.vol; 49 | 50 | ref = load_nifti(ref_file); 51 | 52 | % read lookup table (region indexes and descriptions) 53 | tbl = readtable(lookuptable); 54 | res.regionDescriptions = tbl.Var2; 55 | res.regionIndexes = tbl.Var1; 56 | 57 | % compute regional mean 58 | res.data = nan(numel(res.regionDescriptions), 1); 59 | for ii = 1:numel(res.regionIndexes) 60 | tmp = vol(ref.vol == res.regionIndexes(ii)); 61 | res.data(ii, 1) = mean(tmp); 62 | end 63 | 64 | end -------------------------------------------------------------------------------- /src/permutation_expression_null_brain.m: -------------------------------------------------------------------------------- 1 | function res = permutation_expression_null_brain(geneset, expressions, gene_symbols, background) 2 | % PERMUTATION_EXPRESSION_NULL_BRAIN(GENESET, EXPRESSIONS, GENE_SYMBOLS) 3 | % performs permutation testing to examine in which regions the input gene 4 | % set is overexpressed. 5 | % 6 | % INPUT 7 | % geneset -- a cell array of gene symbols of the genes of interest 8 | % 9 | % OPTIONAL: 10 | % expressions -- a NxK matrix of gene expressions of all genes. N is the 11 | % number of brain regions, K is the number of genes. If not available, 12 | % default gene expression data in the DK114 atlas will be loaded. 
13 | % gene_symbols -- a cell array of gene symbols of all genes. This must be 14 | % provided if EXPRESSIONS is provided. If not available, gene symbols 15 | % will be loaded from the default gene expression data. 16 | % background -- a string indicating the selection of background genes 17 | % Options: "brain" -- genes over-expressed in brain compared to other 18 | % body sites (N=2655), default 19 | % "body" -- genes over-/similarly expressed in brain tissue 20 | % compared to other body sites (N=8296) 21 | % "general" -- genes expressed in brain without contrast to 22 | % other body sites (N=16778) 23 | % 24 | % OUTPUT 25 | % res.p -- two-tailed p-value in permutation testing 26 | % res.mean_expressions -- regional mean expressions of the input GOI 27 | % res.null_expressions -- regional mean expressions of random background genes 28 | % res.difference -- difference between mean_expressions and the mean of 29 | % null_expressions, indicating the effect direction. 30 | % 31 | % REFERENCE 32 | % Wei Y. et al., (2021) Statistical testing and annotation of gene 33 | % transcriptomic-neuroimaging associations, bioRxiv 34 | 35 | 36 | % ========================== Check input data ============================= 37 | disp('Running null-brain model'); 38 | filepath = fileparts(mfilename('fullpath')); 39 | 40 | if nargin == 1 41 | data_ge = load(fullfile(filepath, 'gene_expression.mat')); 42 | disp('## Loading default gene expression data in DK114 atlas ...'); 43 | expressions = data_ge.mDataGEctx; 44 | gene_symbols = data_ge.gene_symbols; 45 | regionDescriptions = data_ge.regionDescriptionCtx; 46 | background = 'brain'; 47 | elseif nargin == 2 48 | error('Please provide gene symbols of all genes included in the expression data.'); 49 | elseif nargin == 3 50 | background = 'brain'; 51 | end 52 | 53 | if isempty(expressions) 54 | warning('"expressions" is empty. 
Loading default expression matrix.'); 55 | data_ge = load(fullfile(filepath, 'gene_expression.mat')); 56 | expressions = data_ge.mDataGEctx; 57 | regionDescriptions = data_ge.regionDescriptionCtx; 58 | end 59 | 60 | if isempty(gene_symbols) 61 | warning('"gene_symbols" is empty. Loading default gene symbols.'); 62 | data_ge = load(fullfile(filepath, 'gene_expression.mat')); 63 | gene_symbols = data_ge.gene_symbols; 64 | regionDescriptions = data_ge.regionDescriptionCtx; 65 | end 66 | 67 | [N, K] = size(expressions); 68 | disp(['## ', num2str(K), ' genes detected in total.']); 69 | disp(['## ', num2str(N), ' brain regions detected.']); 70 | 71 | NG = numel(geneset); 72 | disp(['## ', num2str(NG), ' gene(s) of the GOI detected.']); 73 | 74 | NGA = numel(gene_symbols); 75 | if NGA~=K 76 | error('The number of gene symbols is different from the number of genes in the expression data'); 77 | end 78 | 79 | II = ismember(gene_symbols, geneset); 80 | if nnz(II) == 0 81 | error('None of the genes in the input gene set found in gene data.'); 82 | end 83 | disp(['## ', num2str(nnz(II)), '/', num2str(NG), ' genes with gene expression data available.']); 84 | 85 | % load brain genes 86 | data_ge = load(fullfile(filepath, 'gene_expression.mat'), 'BRAINgene_idx',... 
87 | 'BRAINandBODYgene_idx', 'BRAIN_expressed_gene_idx', 'gene_symbols'); 88 | 89 | switch background 90 | case 'brain' 91 | BRAINgenes = data_ge.gene_symbols(data_ge.BRAINgene_idx); 92 | disp(['## Background is selected as genes over-expressed in brain, N=',num2str(numel(BRAINgenes))]); 93 | case 'body' 94 | BRAINgenes = data_ge.gene_symbols(data_ge.BRAINandBODYgene_idx); 95 | disp(['## Background is selected as genes over- or similarly expressed in brain compared to other body sites, N=',num2str(numel(BRAINgenes))]); 96 | case 'general' 97 | BRAINgenes = data_ge.gene_symbols(data_ge.BRAIN_expressed_gene_idx); 98 | disp(['## Background is selected as genes expressed in brain, without contrast to other body sites, N=',num2str(numel(BRAINgenes))]); 99 | otherwise 100 | warning('Background genes are not properly selected. Setting to "brain" by default.') 101 | BRAINgenes = data_ge.gene_symbols(data_ge.BRAINgene_idx); 102 | disp(['## Background is selected as genes over-expressed in brain, N=',num2str(numel(BRAINgenes))]); 103 | end 104 | geneset = intersect(geneset, BRAINgenes); 105 | II = ismember(gene_symbols, geneset); 106 | disp(['## ', num2str(nnz(II)), ' genes of the input GOI are brain-enriched genes.']); 107 | 108 | 109 | % ========================= Perform permutation =========================== 110 | nPerm = 10000; 111 | 112 | % raw mean expressions 113 | mGE = nanmean(expressions(:, ismember(gene_symbols, geneset)), 2); 114 | res.mean_expressions = mGE; 115 | 116 | % initialize permutation 117 | fprintf('%s', '## Progress: '); 118 | idx_rand_genes = nan(nPerm, nnz(II)); 119 | [~, idx_background] = ismember(BRAINgenes, gene_symbols); 120 | idx_background(idx_background==0) = []; 121 | tmpGE = nan(N, nPerm); 122 | 123 | % permutation 124 | for kk = 1:nPerm 125 | fprintf('\b\b\b\b%.3d%%', round(kk/nPerm*100)); 126 | rid = idx_background(randperm(numel(idx_background), nnz(II))); 127 | idx_rand_genes(kk, :) = rid; 128 | tmpGE(:, kk) = nanmean(expressions(:, 
rid), 2); 129 | end 130 | res.null_expressions = tmpGE; 131 | res.difference = res.mean_expressions - nanmean(res.null_expressions, 2); 132 | 133 | % compute two-tailed permutation p-value per region 134 | for ii = 1: N 135 | P = nnz(double(tmpGE(ii, :) > mGE(ii))) ./ nPerm; 136 | if P > 0.5 137 | res.p(ii, 1) = (1 - P) * 2; 138 | else 139 | res.p(ii, 1) = P * 2; 140 | end 141 | end 142 | 143 | if exist('regionDescriptions', 'var') 144 | res.regionDescriptions = regionDescriptions; 145 | end 146 | 147 | disp(' >> finished without errors'); 148 | 149 | end 150 | 151 | 152 | 153 | -------------------------------------------------------------------------------- /src/permutation_expression_null_coexp.m: -------------------------------------------------------------------------------- 1 | function res = permutation_expression_null_coexp(geneset, expressions, gene_symbols) 2 | % PERMUTATION_EXPRESSION_NULL_COEXP(GENESET, EXPRESSIONS, GENE_SYMBOLS) 3 | % performs permutation testing to examine in which brain regions the input 4 | % gene set is differentially expressed, in comparison to random genes with 5 | % a similar level of coexpression. 6 | % 7 | % INPUT 8 | % geneset -- a cell array of gene symbols of the genes of interest 9 | % 10 | % OPTIONAL: 11 | % expressions -- a NxK matrix of gene expressions of all genes. N is the 12 | % number of brain regions, K is the number of genes. If not available, 13 | % default gene expression data in the DK114 atlas will be loaded. 14 | % gene_symbols -- a cell array of gene symbols of all genes. This must be 15 | % provided if EXPRESSIONS is provided. If not available, gene symbols 16 | % will be loaded from the default gene expression data. 
17 | % 18 | % OUTPUT 19 | % res.p -- two-tailed p-value in permutation testing 20 | % res.mean_expressions -- regional mean expressions of the input GOI 21 | % res.null_expressions -- regional mean expressions of random genes with similar coexpression 22 | % res.difference -- difference between mean_expressions and the mean of 23 | % null_expressions, indicating the effect direction. 24 | % res.coexp_mean -- mean coexpression level among the input GOI 25 | % res.permut_coexp_mean -- mean coexpression level among random genes 26 | % 27 | % REFERENCE 28 | % Wei Y. et al., (2021) Statistical testing and annotation of gene 29 | % transcriptomic-neuroimaging associations, bioRxiv 30 | 31 | 32 | % ========================== Check input data ============================= 33 | disp('Running null-coexpression model'); 34 | 35 | if nargin == 1 36 | filepath = fileparts(mfilename('fullpath')); 37 | data_ge = load(fullfile(filepath, 'gene_expression.mat')); 38 | disp('## Loading default gene expression data in DK114 atlas ...'); 39 | expressions = data_ge.mDataGEctx; 40 | gene_symbols = data_ge.gene_symbols; 41 | regionDescriptions = data_ge.regionDescriptionCtx; 42 | elseif nargin == 2 43 | error('Please provide gene symbols of all genes included in the expression data.'); 44 | end 45 | 46 | [N, K] = size(expressions); 47 | disp(['## ', num2str(K), ' genes detected in total.']); 48 | disp(['## ', num2str(N), ' brain regions detected.']); 49 | 50 | NG = numel(geneset); 51 | disp(['## ', num2str(NG), ' gene(s) of the GOI detected.']); 52 | if NG == 1 53 | error('Only 1 gene included in the GOI. 
Coexpression cannot be computed.') 54 | end 55 | 56 | NGA = numel(gene_symbols); 57 | if NGA~=K 58 | error('The number of gene symbols is different from the number of genes in the expression data'); 59 | end 60 | 61 | II = ismember(gene_symbols, geneset); 62 | if nnz(II) == 0 63 | error('None of the genes in the input gene set found in gene data'); 64 | end 65 | disp(['## ', num2str(nnz(II)), '/', num2str(NG), ' genes with gene expression data available']); 66 | 67 | 68 | % ========================= Perform permutation =========================== 69 | nPerm = 1000; 70 | 71 | % raw mean expressions 72 | mGE = nanmean(expressions(:, ismember(gene_symbols, geneset)), 2); 73 | res.mean_expressions = mGE; 74 | 75 | % compute coexpression of the input GOI 76 | G = expressions(:, II); 77 | coexp_mat = corr(G, 'rows', 'pairwise'); 78 | mask_tril = tril(ones(size(coexp_mat)), -1); 79 | coexp = nanmean(coexp_mat(mask_tril == 1)); 80 | disp(['## Mean coexpression: ', num2str(coexp)]); 81 | res.coexp_mean = coexp; 82 | 83 | coexp_null = nan(nPerm, 1); 84 | idx_rand_genes = nan(nPerm, nnz(II)); 85 | tmpGE = nan(N, nPerm); 86 | 87 | fprintf('%s', '## Progress: '); 88 | for kk = 1:nPerm 89 | tmp_status = false; 90 | while tmp_status ~= true 91 | [rid, coexp_null(kk), tmp_status] = y_rand_gs_coexp(... 
92 | expressions, coexp, nnz(II)); 93 | end 94 | fprintf('\b\b\b\b%.3d%%', round(kk/nPerm*100)); 95 | idx_rand_genes(kk, :) = rid; 96 | 97 | % gene expressions of random genes 98 | tmpGE(:, kk) = nanmean(expressions(:, rid), 2); 99 | end 100 | res.permut_gene_idx = idx_rand_genes; 101 | res.permut_coexp_mean = coexp_null; 102 | res.null_expressions = tmpGE; 103 | res.difference = res.mean_expressions - nanmean(res.null_expressions, 2); 104 | 105 | % compute two-tailed permutation p-value per region 106 | for ii = 1: N 107 | P = nnz(double(tmpGE(ii, :) > mGE(ii))) ./ nPerm; 108 | if P > 0.5 109 | res.p(ii, 1) = (1 - P) * 2; 110 | else 111 | res.p(ii, 1) = P * 2; 112 | end 113 | end 114 | 115 | if exist('regionDescriptions', 'var') 116 | res.regionDescriptions = regionDescriptions; 117 | end 118 | disp(' >> finished without errors'); 119 | 120 | end -------------------------------------------------------------------------------- /src/permutation_expression_null_spin.m: -------------------------------------------------------------------------------- 1 | function res = permutation_expression_null_spin(geneset, expressions, gene_symbols) 2 | % PERMUTATION_EXPRESSION_NULL_SPIN(GENESET, EXPRESSIONS, GENE_SYMBOLS) 3 | % performs permutation testing to examine in which regions the input gene 4 | % set is overexpressed, based on the null-spin model. 5 | % 6 | % INPUT 7 | % geneset -- a cell array of gene symbols of the genes of interest 8 | % 9 | % OPTIONAL: 10 | % expressions -- a NxK matrix of gene expressions of all genes. N is the 11 | % number of brain regions, K is the number of genes. If not available, 12 | % default gene expression data in the DK114 atlas will be loaded. 13 | % gene_symbols -- a cell array of gene symbols of all genes. This must be 14 | % provided if EXPRESSIONS is provided. If not available, gene symbols 15 | % will be loaded from the default gene expression data. 
16 | % 17 | % OUTPUT 18 | % res.p -- two-tailed p-value in permutation testing 19 | % res.mean_expressions -- regional mean expressions of the input GOI 20 | % res.null_expressions -- regional mean expressions under the null-spin model 21 | % res.difference -- difference between mean_expressions and the mean of 22 | % null_expressions, indicating the effect direction. 23 | % 24 | % REFERENCE 25 | % Wei Y. et al., (2021) Statistical testing and annotation of gene 26 | % transcriptomic-neuroimaging associations, bioRxiv 27 | 28 | 29 | % ========================== Check input data ============================= 30 | disp('Running null-spin model'); 31 | 32 | if nargin == 1 33 | filepath = fileparts(mfilename('fullpath')); 34 | data_ge = load(fullfile(filepath, 'gene_expression.mat')); 35 | disp('## Loading default gene expression data in DK114 atlas ...'); 36 | expressions = data_ge.mDataGEctx; 37 | regionDescriptions = data_ge.regionDescriptionCtx; 38 | gene_symbols = data_ge.gene_symbols; 39 | elseif nargin == 2 40 | error('Please provide gene symbols of all genes included in the expression data.'); 41 | end 42 | 43 | [N, K] = size(expressions); 44 | disp(['## ', num2str(K), ' genes detected in total.']); 45 | disp(['## ', num2str(N), ' brain regions detected.']); 46 | 47 | NG = numel(geneset); 48 | disp(['## ', num2str(NG), ' gene(s) of the GOI detected.']); 49 | 50 | NGA = numel(gene_symbols); 51 | if NGA~=K 52 | error('The number of gene symbols is different from the number of genes in the expression data'); 53 | end 54 | 55 | II = ismember(gene_symbols, geneset); 56 | if nnz(II)==0 57 | error('None of the genes in the input gene set found in gene data'); 58 | end 59 | disp(['## ', num2str(nnz(II)), '/', num2str(NG), ' genes with gene expression data available']); 60 | 61 | % load available gene symbols 62 | filepath = fileparts(mfilename('fullpath')); 63 | data_ge = load(fullfile(filepath, 'gene_expression.mat'), 'gene_symbols'); 64 | 65 | geneset = intersect(geneset, 
data_ge.gene_symbols); 66 | NG = numel(geneset); 67 | II = ismember(gene_symbols, geneset); 68 | disp(['## ', num2str(NG), ' genes of the input GOI are available in Null-spin model.']); 69 | 70 | 71 | % ========================= Perform permutation =========================== 72 | % raw mean expressions 73 | mGE = nanmean(expressions(:, ismember(gene_symbols, geneset)), 2); 74 | res.mean_expressions = mGE; 75 | 76 | % Load data for the null-spin model 77 | nPerm = 1000; 78 | null_spin_expression = nan(nPerm, 57, numel(geneset)); 79 | for ii = 1:NG 80 | filename = fullfile(filepath, 'gene_expression_spin', ... 81 | ['GE_spin_', geneset{ii}, '.txt']); 82 | null_spin_expression(:, :, ii) = dlmread(filename); 83 | end 84 | 85 | % Average across genes 86 | null_spin_exp_mean = nanmean(null_spin_expression, 3)'; 87 | 88 | % compute p-value 89 | for ii = 1: N 90 | P = nnz(double(null_spin_exp_mean(ii, :) > mGE(ii))) ./ nPerm; 91 | if P > 0.5 92 | res.p(ii, 1) = (1 - P) * 2; 93 | else 94 | res.p(ii, 1) = P * 2; 95 | end 96 | end 97 | res.null_expressions = null_spin_exp_mean; 98 | res.difference = res.mean_expressions - nanmean(res.null_expressions, 2); 99 | 100 | if exist('regionDescriptions', 'var') 101 | res.regionDescriptions = regionDescriptions; 102 | end 103 | 104 | disp(' >> finished without errors'); 105 | 106 | end 107 | -------------------------------------------------------------------------------- /src/permutation_null_brain.m: -------------------------------------------------------------------------------- 1 | function res = permutation_null_brain(img_data, geneset, expressions, gene_symbols, background) 2 | % PERMUTATION_NULL_BRAIN(...) performs permutation testing to examine 3 | % whether imaging profiles associate with gene expression profiles, based 4 | % on the null-brain model (where random genes were selected from 5 | % brain-expressed genes). 6 | % 7 | % INPUT 8 | % img_data -- a NxM matrix of imaging profiles. 
N is the number of 9 | % brain regions, M is the number of imaging traits. 10 | % geneset -- a cell array of gene symbols of the genes of interest 11 | % 12 | % OPTIONAL: 13 | % expressions -- a NxK matrix of gene expressions of all genes. N is the 14 | % number of brain regions, K is the number of genes. If not available, 15 | % default gene expression data in the DK114 atlas will be loaded. 16 | % gene_symbols -- a cell array of gene symbols of all genes. This must be 17 | % provided if EXPRESSIONS is provided. If not available, gene symbols 18 | % will be loaded from the default gene expression data. 19 | % background -- a string indicating the selection of background genes 20 | % Options: "brain" -- genes over-expressed in brain compared to other 21 | % body sites (N=2655), default 22 | % "body" -- genes over-/similarly expressed in brain tissue 23 | % compared to other body sites (N=8296) 24 | % "general" -- genes expressed in brain without contrast to 25 | % other body sites (N=16778) 26 | % 27 | % OUTPUT 28 | % res.p -- two-tailed p-value in permutation testing 29 | % res.lr.beta -- standardized beta in linear regression 30 | % res.lr.p -- two-tailed p-value in linear regression 31 | % res.permut_beta -- standardized beta in the null model 32 | % res.permut_gene_idx -- indexes of random genes in each permutation 33 | % 34 | % REFERENCE 35 | % Wei Y. 
et al., (2021) Statistical testing and annotation of gene 36 | % transcriptomic-neuroimaging associations, bioRxiv 37 | 38 | 39 | % ========================== Check input data ============================= 40 | disp('Running null-brain model'); 41 | filepath = fileparts(mfilename('fullpath')); 42 | 43 | if nargin == 2 44 | data_ge = load(fullfile(filepath, 'gene_expression.mat')); 45 | disp('## Loading default gene expression data in DK114 atlas ...'); 46 | expressions = data_ge.mDataGEctx; 47 | gene_symbols = data_ge.gene_symbols; 48 | background = 'brain'; 49 | elseif nargin == 3 50 | error('Please provide gene symbols of all genes included in the expression data.'); 51 | elseif nargin == 4 52 | background = 'brain'; 53 | elseif (nargin > 5) || (nargin < 2) 54 | error('Input error. Please check input data.') 55 | end 56 | 57 | if isempty(expressions) 58 | warning('"expressions" is empty. Loading default expression matrix.'); 59 | data_ge = load(fullfile(filepath, 'gene_expression.mat')); 60 | expressions = data_ge.mDataGEctx; 61 | end 62 | 63 | if isempty(gene_symbols) 64 | warning('"gene_symbols" is empty. Loading default gene symbols.'); 65 | data_ge = load(fullfile(filepath, 'gene_expression.mat')); 66 | gene_symbols = data_ge.gene_symbols; 67 | end 68 | 69 | [N, M] = size(img_data); 70 | disp(['## ', num2str(N), ' brain regions detected.']); 71 | disp(['## ', num2str(M), ' imaging trait(s) detected.']); 72 | 73 | [NE, K] = size(expressions); 74 | if N~=NE 75 | error('Different number of regions in imaging data and gene data. 
Please check input data.'); 76 | end 77 | disp(['## ', num2str(K), ' genes detected totally.']); 78 | 79 | NG = numel(geneset); 80 | disp(['## ', num2str(NG), ' gene(s) of the GOI detected.']); 81 | 82 | NGA = numel(gene_symbols); 83 | if NGA~=K 84 | error('The number of gene symbols is different from the number of genes in the expression data'); 85 | end 86 | 87 | II = ismember(gene_symbols, geneset); 88 | if nnz(II) == 0 89 | error('None of the genes in the input gene set found in gene data'); 90 | end 91 | disp(['## ', num2str(nnz(II)), '/', num2str(NG), ' genes with gene expression data available']); 92 | 93 | % load brain genes 94 | data_ge = load(fullfile(filepath, 'gene_expression.mat'), 'BRAINgene_idx',... 95 | 'BRAINandBODYgene_idx', 'BRAIN_expressed_gene_idx', 'gene_symbols'); 96 | 97 | switch background 98 | case 'brain' 99 | BRAINgenes = data_ge.gene_symbols(data_ge.BRAINgene_idx); 100 | disp(['## Background is selected as genes over-expressed in brain, N=',num2str(numel(BRAINgenes))]); 101 | case 'body' 102 | BRAINgenes = data_ge.gene_symbols(data_ge.BRAINandBODYgene_idx); 103 | disp(['## Background is selected as genes over- or similarly expressed in brain compared to other body sites, N=',num2str(numel(BRAINgenes))]); 104 | case 'general' 105 | BRAINgenes = data_ge.gene_symbols(data_ge.BRAIN_expressed_gene_idx); 106 | disp(['## Background is selected as genes expressed in brain, without contrast to other body sites, N=',num2str(numel(BRAINgenes))]); 107 | otherwise 108 | warning('Background genes are not properly selected. 
Setting to "brain" by default.') 109 | BRAINgenes = data_ge.gene_symbols(data_ge.BRAINgene_idx); 110 | disp(['## Background is selected as genes over-expressed in brain, N=',num2str(numel(BRAINgenes))]); 111 | end 112 | geneset = intersect(geneset, BRAINgenes); 113 | II = ismember(gene_symbols, geneset); 114 | disp(['## ', num2str(nnz(II)), ' gene(s) of the input GOI found in background.']); 115 | 116 | 117 | % ====================== Perform linear regression ======================== 118 | beta = nan(M, 1); 119 | pval = nan(M, 1); 120 | 121 | % Standardize (z-score) the mean GOI expression and each imaging trait 122 | X = nanmean(expressions(:, II), 2); 123 | X = (X - nanmean(X)) ./ nanstd(X); 124 | Y = img_data; 125 | Y = (Y - repmat(nanmean(Y, 1), N, 1)) ./ repmat(nanstd(Y, 0, 1), N, 1); 126 | 127 | for ii = 1: M 128 | stats = regstats(Y(:, ii), X, 'linear', {'beta','tstat'}); 129 | beta(ii, 1) = stats.beta(2); 130 | pval(ii, 1) = stats.tstat.pval(2); 131 | end 132 | res.lr.beta = beta; 133 | res.lr.p = pval; 134 | 135 | 136 | % ========================= Perform permutation =========================== 137 | nPerm = 10000; 138 | 139 | idx_rand_genes = nan(nPerm, nnz(II)); 140 | beta_null = nan(nPerm, M); 141 | fprintf('%s', '## Progress: '); 142 | 143 | [~, idx_background] = ismember(BRAINgenes, gene_symbols); 144 | idx_background(idx_background==0) = []; 145 | 146 | for kk = 1:nPerm 147 | fprintf('\b\b\b\b%.3d%%', round(kk/nPerm*100)); 148 | 149 | rid = idx_background(randperm(numel(idx_background), nnz(II))); 150 | idx_rand_genes(kk, :) = rid; 151 | 152 | % gene expressions of random genes 153 | X = nanmean(expressions(:, rid), 2); 154 | X = (X - nanmean(X)) ./ nanstd(X); 155 | 156 | for ii = 1: M 157 | stats = regstats(Y(:, ii), X, 'linear', {'beta'}); 158 | beta_null(kk, ii) = stats.beta(2); 159 | end 160 | end 161 | res.permut_gene_idx = idx_rand_genes; 162 | res.permut_beta = beta_null; 163 | 164 | % compute two-tailed permutation p-value per imaging trait 165 | for ii = 1: M 166 | P = nnz(double(beta_null(:, ii) > beta(ii))) ./ nPerm; 167 | if P > 0.5 
168 | res.p(ii, 1) = (1 - P) * 2; 169 | else 170 | res.p(ii, 1) = P * 2; 171 | end 172 | end 173 | 174 | disp(' >> finished without errors'); 175 | 176 | end 177 | -------------------------------------------------------------------------------- /src/permutation_null_coexp.m: -------------------------------------------------------------------------------- 1 | function res = permutation_null_coexp(img_data, geneset, expressions, gene_symbols) 2 | % PERMUTATION_NULL_COEXP(IMG_DATA, GENESET) performs permutation testing to examine 3 | % whether imaging profiles associate with gene expression profiles, based 4 | % on the null-coexpression model (where random gene sets are selected to 5 | % have a mean coexpression level similar to that of the input GOI). 6 | % 7 | % INPUT 8 | % img_data -- a NxM matrix of imaging profiles. N is the number of 9 | % brain regions, M is the number of imaging traits. 10 | % geneset -- a cell array of gene symbols of the genes of interest 11 | % 12 | % OPTIONAL: 13 | % expressions -- a NxK matrix of gene expressions of all genes. N is the 14 | % number of brain regions, K is the number of genes. If not available, 15 | % default gene expression data in the DK114 atlas will be loaded. 16 | % gene_symbols -- a cell array of gene symbols of all genes. This must be 17 | % provided if EXPRESSIONS is provided. If not available, gene symbols 18 | % will be loaded from the default gene expression data. 19 | % 20 | % OUTPUT 21 | % res.p -- two-tailed p-value in permutation testing 22 | % res.lr.beta -- standardized beta in linear regression 23 | % res.lr.p -- two-tailed p-value in linear regression 24 | % res.permut_beta -- standardized beta in the null model 25 | % res.permut_gene_idx -- indexes of random genes in each permutation 26 | % res.coexp_mean -- mean coexpression level among the input GOI 27 | % res.permut_coexp_mean -- mean coexpression level among random genes 28 | % 29 | % REFERENCE 30 | % Wei Y. 
et al., (2021) Statistical testing and annotation of gene 31 | % transcriptomic-neuroimaging associations, bioRxiv 32 | 33 | 34 | % ========================== Check input data ============================= 35 | disp('Running null-coexpression model'); 36 | 37 | if nargin == 2 38 | filepath = fileparts(mfilename('fullpath')); 39 | data_ge = load(fullfile(filepath, 'gene_expression.mat')); 40 | disp('## Loading default gene expression data in DK114 atlas ...'); 41 | expressions = data_ge.mDataGEctx; 42 | gene_symbols = data_ge.gene_symbols; 43 | elseif nargin == 3 44 | error('Please provide gene symbols of all genes included in the expression data.'); 45 | elseif (nargin > 4) || (nargin < 2) 46 | error('Input error. Please check input data.') 47 | end 48 | 49 | [N, M] = size(img_data); 50 | disp(['## ', num2str(N), ' brain regions detected.']); 51 | disp(['## ', num2str(M), ' imaging traits detected.']); 52 | 53 | [NE, K] = size(expressions); 54 | if N~=NE 55 | error('Different number of regions in imaging data and gene data. Please check input data.'); 56 | end 57 | disp(['## ', num2str(K), ' genes detected in total.']); 58 | 59 | NG = numel(geneset); 60 | disp(['## ', num2str(NG), ' gene(s) of the GOI detected.']); 61 | if NG == 1 62 | error('Only 1 gene included in the GOI. 
Coexpression cannot be computed.') 63 | end 64 | 65 | NGA = numel(gene_symbols); 66 | if NGA~=K 67 | error('The number of gene symbols is different from the number of genes in the expression data'); 68 | end 69 | 70 | II = ismember(gene_symbols, geneset); 71 | if nnz(II) == 0 72 | error('None of the genes in the input gene set found in gene data'); 73 | end 74 | disp(['## ', num2str(nnz(II)), '/', num2str(NG), ' genes with gene expression data available']); 75 | 76 | 77 | % ====================== Perform linear regression ======================== 78 | beta = nan(M, 1); 79 | pval = nan(M, 1); 80 | 81 | % Standardize 82 | X = nanmean(expressions(:, II), 2); 83 | X = (X - nanmean(X)) ./ nanstd(X); 84 | Y = img_data; 85 | Y = (Y - repmat(nanmean(Y, 1), N, 1)) ./ repmat(nanstd(Y, [], 1), N, 1); 86 | 87 | for ii = 1: M 88 | stats = regstats(Y(:, ii), X, 'linear', {'beta','tstat'}); 89 | beta(ii, 1) = stats.beta(2); 90 | pval(ii, 1) = stats.tstat.pval(2); 91 | end 92 | res.lr.beta = beta; 93 | res.lr.p = pval; 94 | 95 | 96 | % ========================= Perform permutation =========================== 97 | nPerm = 1000; 98 | 99 | % compute coexpression of the input GOI 100 | G = expressions(:, II); 101 | coexp_mat = corr(G, 'rows', 'pairwise'); 102 | mask_tril = tril(ones(size(coexp_mat)), -1); 103 | coexp = nanmean(coexp_mat(mask_tril == 1)); 104 | disp(['## Mean coexpression: ', num2str(coexp)]); 105 | res.coexp_mean = coexp; 106 | 107 | coexp_null = nan(nPerm, 1); 108 | idx_rand_genes = nan(nPerm, nnz(II)); 109 | beta_null = nan(nPerm, M); 110 | fprintf('%s', '## Progress: '); 111 | for kk = 1:nPerm 112 | tmp_status = false; 113 | while tmp_status ~= true 114 | [rid, coexp_null(kk), tmp_status] = y_rand_gs_coexp(... 
115 | expressions, coexp, nnz(II)); 116 | end 117 | fprintf('\b\b\b\b%.3d%%', round(kk/nPerm*100)); 118 | idx_rand_genes(kk, :) = rid; 119 | 120 | % gene expressions of random genes 121 | X = nanmean(expressions(:, rid), 2); 122 | X = (X - nanmean(X)) ./ nanstd(X); 123 | 124 | for ii = 1: M 125 | stats = regstats(Y(:, ii), X, 'linear', {'beta'}); 126 | beta_null(kk, ii) = stats.beta(2); 127 | end 128 | end 129 | res.permut_gene_idx = idx_rand_genes; 130 | res.permut_beta = beta_null; 131 | res.permut_coexp_mean = coexp_null; 132 | 133 | % compute p-value 134 | for ii = 1: M 135 | P = nnz(double(beta_null(:, ii) > beta(ii))) ./ nPerm; 136 | if P > 0.5 137 | res.p(ii, 1) = (1 - P) * 2; 138 | else 139 | res.p(ii, 1) = P * 2; 140 | end 141 | end 142 | disp(' >> finished without errors'); 143 | 144 | end -------------------------------------------------------------------------------- /src/permutation_null_spin.m: -------------------------------------------------------------------------------- 1 | function res = permutation_null_spin(img_data, geneset, expressions, gene_symbols) 2 | % PERMUTATION_NULL_SPIN(...) performs permutation testing to examine 3 | % whether imaging profiles associate with gene expression profiles, based 4 | % on the null-spin model (where the null distribution is built from the 5 | % precomputed spin-randomized expression maps in gene_expression_spin/). 6 | % 7 | % INPUT 8 | % img_data -- a NxM matrix of imaging profiles. N is the number of 9 | % brain regions, M is the number of imaging traits. 10 | % geneset -- a cell array of gene symbols of the genes of interest 11 | % 12 | % OPTIONAL: 13 | % expressions -- a NxK matrix of gene expressions of all genes. N is the 14 | % number of brain regions, K is the number of genes. If not available, 15 | % default gene expression data in the DK114 atlas will be loaded. 16 | % gene_symbols -- a cell array of gene symbols of all genes. This must be 17 | % provided if EXPRESSIONS is provided. 
If not available, gene symbols 18 | % will be loaded from the default gene expression data. 19 | % 20 | % OUTPUT 21 | % res.p -- two-tailed p-value in permutation testing 22 | % res.lr.beta -- standardized beta in linear regression 23 | % res.lr.p -- two-tailed p-value in linear regression 24 | % res.permut_beta -- standardized beta in the null model 25 | % res.permut_gene_idx -- indexes of random genes in each permutation 26 | % 27 | % REFERENCE 28 | % Wei Y. et al., (2021) Statistical testing and annotation of gene 29 | % transcriptomic-neuroimaging associations, bioRxiv 30 | 31 | 32 | % ========================== Check input data ============================= 33 | disp('Running null-spin model'); 34 | 35 | if nargin == 2 36 | filepath = fileparts(mfilename('fullpath')); 37 | data_ge = load(fullfile(filepath, 'gene_expression.mat')); 38 | disp('## Loading default gene expression data in DK114 atlas ...'); 39 | expressions = data_ge.mDataGEctx; 40 | gene_symbols = data_ge.gene_symbols; 41 | elseif nargin == 3 42 | error('Please provide gene symbols of all genes included in the expression data.'); 43 | elseif (nargin > 4) || (nargin < 2) 44 | error('Input error. Please check input data.') 45 | end 46 | 47 | [N, M] = size(img_data); 48 | disp(['## ', num2str(N), ' brain regions detected.']); 49 | disp(['## ', num2str(M), ' imaging traits detected.']); 50 | 51 | [NE, K] = size(expressions); 52 | if N~=NE 53 | error('Different number of regions in imaging data and gene data. 
Please check input data.'); 54 | end 55 | disp(['## ', num2str(K), ' genes detected totally.']); 56 | 57 | NG = numel(geneset); 58 | disp(['## ', num2str(NG), ' gene(s) of the GOI detected.']); 59 | 60 | NGA = numel(gene_symbols); 61 | if NGA~=K 62 | error('The number of gene symbols is different from the number of genes in the expression data'); 63 | end 64 | 65 | II = ismember(gene_symbols, geneset); 66 | if nnz(II) == 0 67 | error('None of the genes in the input gene set found in gene data'); 68 | end 69 | disp(['## ', num2str(nnz(II)), '/', num2str(NG), ' genes with gene expression data available']); 70 | 71 | % load available gene symbols 72 | filepath = fileparts(mfilename('fullpath')); 73 | data_ge = load(fullfile(filepath, 'gene_expression.mat'), 'gene_symbols'); 74 | 75 | geneset = intersect(geneset, data_ge.gene_symbols); 76 | NG = numel(geneset); 77 | II = ismember(gene_symbols, geneset); 78 | disp(['## ', num2str(NG), ' genes of the input GOI are available in Null-spin model.']); 79 | 80 | 81 | % ====================== Perform linear regression ======================== 82 | beta = nan(M, 1); 83 | pval = nan(M, 1); 84 | 85 | % Standardize 86 | X = nanmean(expressions(:, II), 2); 87 | X = (X - nanmean(X)) ./ nanstd(X); 88 | Y = img_data; 89 | Y = (Y - repmat(nanmean(Y, 1), N, 1)) ./ repmat(nanstd(Y, [], 1), N, 1); 90 | 91 | for ii = 1: M 92 | stats = regstats(Y(:, ii), X, 'linear', {'beta','tstat'}); 93 | beta(ii, 1) = stats.beta(2); 94 | pval(ii, 1) = stats.tstat.pval(2); 95 | end 96 | res.lr.beta = beta; 97 | res.lr.p = pval; 98 | 99 | 100 | % ========================= Perform permutation =========================== 101 | % Load data for the null-spin model 102 | null_spin_expression = nan(1000, 57, numel(geneset)); 103 | for ii = 1:NG 104 | filename = fullfile(filepath, 'gene_expression_spin', ... 
105 | ['GE_spin_', geneset{ii}, '.txt']); 106 | null_spin_expression(:, :, ii) = dlmread(filename); 107 | end 108 | 109 | % Average across genes 110 | null_spin_exp_mean = nanmean(null_spin_expression, 3); 111 | 112 | nPerm = 1000; % maximum 1000 here 113 | fprintf('%s', '## Progress: '); 114 | beta_null = nan(nPerm, M); 115 | for kk = 1:nPerm 116 | fprintf('\b\b\b\b%.3d%%', round(kk/nPerm*100)); 117 | 118 | % randomized gene expressions 119 | X = null_spin_exp_mean(kk, :); 120 | X = (X - nanmean(X)) ./ nanstd(X); 121 | 122 | for ii = 1: M 123 | stats = regstats(Y(:, ii), X, 'linear', {'beta'}); 124 | beta_null(kk, ii) = stats.beta(2); 125 | end 126 | end 127 | res.permut_beta = beta_null; 128 | 129 | % compute p-value 130 | for ii = 1: M 131 | P = nnz(double(beta_null(:, ii) > beta(ii))) ./ nPerm; 132 | if P > 0.5 133 | res.p(ii, 1) = (1 - P) * 2; 134 | else 135 | res.p(ii, 1) = P * 2; 136 | end 137 | end 138 | 139 | disp(' >> finished without errors'); 140 | 141 | end 142 | -------------------------------------------------------------------------------- /src/permutation_null_spin_correlated_genes.m: -------------------------------------------------------------------------------- 1 | function res = permutation_null_spin_correlated_genes(img_data) 2 | % PERMUTATION_NULL_SPIN_CORRELATED_GENES(IMG_DATA) performs permutation testing to examine 3 | % whether imaging profiles associate with the expression profile of each gene, based 4 | % on the null-spin model (where the null distribution is built from the 5 | % precomputed spin-randomized expression maps in gene_expression_spin/). 6 | % 7 | % INPUT 8 | % img_data -- a NxM matrix of imaging profiles. N is the number of 9 | % brain regions, M is the number of imaging traits. 10 | % 11 | % OUTPUT 12 | % res.p -- two-tailed p-value in permutation testing 13 | % res.lr.beta -- standardized beta in linear regression 14 | % res.lr.p -- two-tailed p-value in linear regression 15 | % res.gene_symbols -- gene symbols of all genes (row order of res.p and res.lr) 16 | % 17 | % REFERENCE 18 | % Wei Y. 
et al., (2021) Statistical testing and annotation of gene 19 | % transcriptomic-neuroimaging associations, bioRxiv 20 | 21 | % ========================== Check input data ============================= 22 | disp('Looking for top correlated genes'); 23 | 24 | filepath = fileparts(mfilename('fullpath')); 25 | data_ge = load(fullfile(filepath, 'gene_expression.mat')); 26 | disp('## Loading default gene expression data in DK114 atlas ...'); 27 | expressions = data_ge.mDataGEctx; 28 | gene_symbols = data_ge.gene_symbols; 29 | 30 | [N, M] = size(img_data); 31 | disp(['## ', num2str(N), ' brain regions detected.']); 32 | disp(['## ', num2str(M), ' imaging traits detected.']); 33 | 34 | [NE, K] = size(expressions); 35 | if N~=NE 36 | error('Different number of regions in imaging data and gene data. Please check input data.'); 37 | end 38 | disp(['## ', num2str(K), ' genes in total.']); 39 | 40 | 41 | % =========================== Loop all genes ============================== 42 | beta = nan(K, M); 43 | pval = nan(K, M); 44 | 45 | % Standardize 46 | Y = img_data; 47 | Y = (Y - repmat(nanmean(Y, 1), N, 1)) ./ repmat(nanstd(Y, [], 1), N, 1); 48 | 49 | fprintf('%s', '## Progress: '); 50 | for ii = 1:K 51 | fprintf('\b\b\b\b%.3d%%', round(ii/K*100)); 52 | 53 | X = expressions(:, ii); 54 | X = (X - nanmean(X)) ./ nanstd(X); 55 | 56 | beta_null = nan(1000, M); 57 | for jj = 1:M 58 | % linear regression 59 | stats = regstats(Y(:, jj), X, 'linear', {'beta', 'tstat'}); 60 | beta(ii, jj) = stats.beta(2); 61 | pval(ii, jj) = stats.tstat.pval(2); 62 | 63 | % null-spin 64 | filename = fullfile(filepath, 'gene_expression_spin', ... 
65 | ['GE_spin_', gene_symbols{ii}, '.txt']); 66 | if exist(filename, 'file') 67 | null_spin_expression = dlmread(filename); 68 | else 69 | null_spin_expression = nan(1000, N); 70 | end 71 | null_spin_expression = null_spin_expression'; 72 | for kk = 1:1000 73 | X = null_spin_expression(:, kk); 74 | X = (X - nanmean(X)) ./ nanstd(X); 75 | 76 | % linear regression 77 | stats = regstats(Y(:, jj), X, 'linear', {'beta','tstat'}); 78 | beta_null(kk, jj) = stats.beta(2); 79 | end 80 | 81 | % p value 82 | P = nnz(double(beta_null(:, jj) > beta(ii, jj))) ./ 1000; 83 | if P > 0.5 84 | res.p(ii, jj) = (1 - P) * 2; 85 | else 86 | res.p(ii, jj) = P * 2; 87 | end 88 | end 89 | end 90 | res.lr.beta = beta; 91 | res.lr.p = pval; 92 | res.gene_symbols = gene_symbols; 93 | 94 | disp(' >> finished without errors'); 95 | end 96 | 97 | -------------------------------------------------------------------------------- /src/y_rand_gs_coexp.m: -------------------------------------------------------------------------------- 1 | function [gene_id, coexp, status] = y_rand_gs_coexp(G, T, NG, maxDiff) 2 | % Y_RAND_GS_COEXP(...) looks for random gene sets with similar coexpression 3 | % level as the input genes of interest 4 | % 5 | % INPUT 6 | % G -- gene expression matrix, regions by genes 7 | % T -- target mean coexpression level 8 | % NG -- number of genes within the gene sets 9 | % 10 | % OPTIONAL: 11 | % maxDiff -- the maximum allowed difference of coexpression between the 12 | % target one (T) and the observed ones. Default: 0.025 13 | % 14 | % OUTPUT 15 | % gene_id -- index of random genes in the gene expression matrix G 16 | % coexp -- mean coexpression level among random genes 17 | % status -- indicator of success performance. false if the function fails 18 | % 19 | % REFERENCE 20 | % Wei Y. 
et al., (2021) Statistical testing and annotation of gene 21 | % transcriptomic-neuroimaging associations, bioRxiv 22 | 23 | 24 | if nargin == 3 25 | maxDiff = 0.025; 26 | end 27 | 28 | nGenes = size(G, 2); 29 | nCount = 0; 30 | status = true; 31 | 32 | % initialize a set of random genes 33 | gene_id = randperm(nGenes, NG); % random gene set S1 34 | Grand = G(:, gene_id); % expression matrix of S1 35 | 36 | % compute mean coexpression for each gene and for the whole gene set 37 | [~, tmp_coexp_mean, tmp_coexp_mean_mean] = compute_mean_coexp(Grand); 38 | coexp = tmp_coexp_mean_mean; 39 | delta_coexp = tmp_coexp_mean_mean - T; % difference from the target 40 | 41 | % compute coexpression between the random gene set and all genes 42 | Gsub_G_coexp = compute_Gsub_G_coexp(Grand, G); 43 | [~, Gsub_G_coexp_sorted_idx] = sort(Gsub_G_coexp, 'ascend'); 44 | Gsub_G_coexp_sorted_idx(ismember(Gsub_G_coexp_sorted_idx, gene_id)) = []; 45 | 46 | % the number of genes to be replaced during iterations 47 | nRep = ceil(NG/100); 48 | 49 | % iterate while abs(delta_coexp) > maxDiff 50 | while abs(delta_coexp) > maxDiff 51 | nCount = nCount + 1; 52 | 53 | % in case we need to decrease coexpression level 54 | if delta_coexp > 0 55 | % find gene(s) with max coexpression and drop them 56 | [~, idx_max] = maxk(tmp_coexp_mean, nRep); 57 | Grand(:, idx_max) = []; 58 | gene_id(idx_max) = []; 59 | 60 | % add new gene(s) 61 | rid = Gsub_G_coexp_sorted_idx(1:nRep); % replace with the gene(s) least coexpressed with the set 62 | Gsub_G_coexp_sorted_idx(1:nRep) = []; 63 | 64 | Grand = [Grand, G(:, rid)]; 65 | gene_id = [gene_id, rid]; 66 | 67 | % compute delta coexp after adding the new gene(s) 68 | [~, tmp_coexp_mean, tmp_coexp_mean_mean] = compute_mean_coexp(Grand); 69 | 70 | delta_coexp = tmp_coexp_mean_mean - T; 71 | coexp = tmp_coexp_mean_mean; 72 | % in case we need to increase coexpression level 73 | else 74 | % find gene(s) with min coexpression and drop them 75 | [~, idx_min] = mink(tmp_coexp_mean, nRep); 76 | Grand(:, idx_min) = []; 77 | 
gene_id(idx_min) = []; 78 | 79 | % add new gene(s) 80 | rid = Gsub_G_coexp_sorted_idx((end-nRep+1):end); % replace with the gene(s) most coexpressed with the set 81 | Gsub_G_coexp_sorted_idx((end-nRep+1):end) = []; 82 | 83 | Grand = [Grand, G(:, rid)]; 84 | gene_id = [gene_id, rid]; 85 | 86 | % compute delta coexp after adding the new gene(s) 87 | [~, tmp_coexp_mean, tmp_coexp_mean_mean] = compute_mean_coexp(Grand); 88 | 89 | delta_coexp = tmp_coexp_mean_mean - T; 90 | coexp = tmp_coexp_mean_mean; 91 | end 92 | 93 | if nCount == 500 94 | % disp('ERROR: fail to find random genes. Please use larger maxdiff and retry'); 95 | status = false; 96 | break 97 | end 98 | end 99 | % if status == true 100 | % disp('## Finished without errors'); 101 | % end 102 | 103 | end 104 | 105 | % compute mean coexpression 106 | function [coexp_mat, coexp_mean, coexp_mean_mean] = compute_mean_coexp(G_sub) 107 | coexp_mat = corr(G_sub, 'rows', 'pairwise'); 108 | coexp_mean = nanmean(coexp_mat, 1); 109 | mask_tril = tril(ones(size(coexp_mat)), -1); 110 | coexp_mean_mean = nanmean(coexp_mat(mask_tril == 1)); 111 | end 112 | 113 | % compute coexpression from geneset to all 114 | function Gsub_G_coexp = compute_Gsub_G_coexp(Gsub, G) 115 | rtmp = corr(Gsub, G, 'rows', 'pairwise'); 116 | Gsub_G_coexp = nanmean(rtmp, 1); 117 | end 118 | --------------------------------------------------------------------------------