├── .gitignore ├── requirements.txt ├── docs ├── img │ ├── ABCD-BIDS.png │ ├── matched_groups.png │ └── ABCD-BIDS_cropped.png ├── index.md ├── useful.md ├── inputs.md ├── recommendations.md ├── release_notes.md ├── postpipeline.md ├── derivatives.md └── pipelines.md ├── README.md ├── mkdocs.yml ├── .readthedocs.yaml └── LICENSE /.gitignore: -------------------------------------------------------------------------------- 1 | *.pdf 2 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | mkdocs 2 | -------------------------------------------------------------------------------- /docs/img/ABCD-BIDS.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ABCD-STUDY/nda-abcd-collection-3165/HEAD/docs/img/ABCD-BIDS.png -------------------------------------------------------------------------------- /docs/img/matched_groups.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ABCD-STUDY/nda-abcd-collection-3165/HEAD/docs/img/matched_groups.png -------------------------------------------------------------------------------- /docs/img/ABCD-BIDS_cropped.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ABCD-STUDY/nda-abcd-collection-3165/HEAD/docs/img/ABCD-BIDS_cropped.png -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Welcome to the NDA Collection 3165 DCAN Labs ABCD-BIDS Documentation Repository 2 | 3 | ## The full documentation lives here: [https://nda-abcd-collection-3165.readthedocs.io/](https://nda-abcd-collection-3165.readthedocs.io/) 4 | 
-------------------------------------------------------------------------------- /mkdocs.yml: -------------------------------------------------------------------------------- 1 | site_name: ABCC Archival Data Release Documentation 2 | theme: readthedocs 3 | plugins: 4 | - search 5 | docs_dir: docs 6 | nav: 7 | - Home: index.md 8 | - Release Notes: release_notes.md 9 | - Inputs: inputs.md 10 | - Pipeline: pipelines.md 11 | - Post Pipeline: postpipeline.md 12 | - Derivatives: derivatives.md 13 | - Recommendations: recommendations.md 14 | - Useful Links: useful.md 15 | -------------------------------------------------------------------------------- /docs/index.md: -------------------------------------------------------------------------------- 1 | # ABCC Archival Data Release Documentation 2 | 3 | **This is archived documentation associated with the [ABCD-BIDS Community Collection (ABCC) Collection 3165](https://nda.nih.gov/abcd-collection-3165.html) hosted by the [NIMH Data Archive (NDA)](https://nda.nih.gov/). 
Please visit the current release documentation for the latest data hosted via the NBDC Data Hub [here](https://docs.abcdstudy.org/latest/documentation/imaging/abcc_start_page.html).** 4 | 5 | 6 | 7 | 8 | -------------------------------------------------------------------------------- /.readthedocs.yaml: -------------------------------------------------------------------------------- 1 | # Read the Docs configuration file for MkDocs projects 2 | # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details 3 | 4 | # Required 5 | version: 2 6 | 7 | # Set the version of Python and other tools you might need 8 | build: 9 | os: ubuntu-22.04 10 | tools: 11 | python: "3.12" 12 | 13 | mkdocs: 14 | configuration: mkdocs.yml 15 | 16 | # Optionally declare the Python requirements required to build your docs 17 | # (requirements.txt sits at the repository root, next to mkdocs.yml) 18 | python: 19 | install: 20 | - requirements: requirements.txt 21 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 3-Clause License 2 | 3 | Copyright (c) 2019, Developmental Cognition and Neuroimaging Labs 4 | All rights reserved. 5 | 6 | Redistribution and use in source and binary forms, with or without 7 | modification, are permitted provided that the following conditions are met: 8 | 9 | 1. Redistributions of source code must retain the above copyright notice, this 10 | list of conditions and the following disclaimer. 11 | 12 | 2. Redistributions in binary form must reproduce the above copyright notice, 13 | this list of conditions and the following disclaimer in the documentation 14 | and/or other materials provided with the distribution. 15 | 16 | 3. Neither the name of the copyright holder nor the names of its 17 | contributors may be used to endorse or promote products derived from 18 | this software without specific prior written permission. 
19 | 20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 30 | -------------------------------------------------------------------------------- /docs/useful.md: -------------------------------------------------------------------------------- 1 | # Useful Links 2 | 3 | Note: Clicking a link within the readthedocs site will not open a new web browser tab. If you want to keep your docs open, either middle-click or right-click and choose "open in new tab" for the links you would like to follow. 4 | 5 | --- 6 | 7 | ## 1. About this Document 8 | 9 | This document collects the links from all of the other documents in one place. Use the first link below to report documentation issues or to request more data from this collection. 10 | 11 | - [Direct link to this repository's GitHub issues for requests and feedback](https://github.com/ABCD-STUDY/nda-abcd-collection-3165/issues) 12 | 13 | ## 2. References 14 | 15 | - [DCAN Labs ABCD-BIDS MRI processing pipeline on Zenodo](https://doi.org/10.5281/zenodo.2587210) 16 | - [The minimal preprocessing pipelines for the Human Connectome Project. Glasser, et al. NeuroImage. 
2013.](https://doi.org/10.1016/j.neuroimage.2013.04.127) 17 | - [Correction of respiratory artifacts in MRI head motion estimates. Fair, et al. NeuroImage. 2019.](https://doi.org/10.1016/j.neuroimage.2019.116400) 18 | - [Power, J. D., Mitra, A., Laumann, T. O., Snyder, A. Z., Schlaggar, B. L., & Petersen, S. E. (2014). Methods to detect, characterize, and remove motion artifact in resting state fMRI. NeuroImage, 84, 320–41. doi:10.1016/j.neuroimage.2013.08.048](https://www.sciencedirect.com/science/article/pii/S1053811913009117) 19 | - [Gorgolewski, K., Auer, T., Calhoun, V. et al. The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Sci Data 3, 160044 (2016) doi:10.1038/sdata.2016.44](https://www.nature.com/articles/sdata201644) 20 | 21 | ## 3. Websites & Documentation 22 | 23 | - [NDA Collection 3165 documentation repository](https://github.com/ABCD-STUDY/nda-abcd-collection-3165) 24 | - [NIMH Data Archive (NDA) Collection 3165 - DCAN Labs ABCD-BIDS](https://nda.nih.gov/edit_collection.html?id=3165) 25 | - [ABCD Fast Track Data explanation on the NDA](https://nda.nih.gov/abcd/query/abcd-fast-track-data.html) 26 | - [NIMH Data Archive (NDA) collection 2573 - ABCD Fast Track Data](https://nda.nih.gov/edit_collection.html?id=2573) 27 | - [Developmental Cognition and Neuroimaging (DCAN) Labs](http://www.ohsu.edu/dcan) 28 | - [Adolescent Brain Cognitive Development (ABCD) Study](https://abcdstudy.org/) 29 | - [ABCD Study Protocol_Imaging_Sequences.pdf](https://abcdstudy.org/images/Protocol_Imaging_Sequences.pdf) 30 | - [Brain Imaging Data Structure (BIDS)](https://bids.neuroimaging.io/) 31 | - [BIDS Apps](https://bids-apps.neuroimaging.io/about/) 32 | - [BIDS Extension Proposals (BEPs)](https://bids-specification.readthedocs.io/en/stable/06-extensions.html#bids-extension-proposals) 33 | - [BEP003](https://docs.google.com/document/d/1Wwc4A6Mow4ZPPszDIWfCUCRNstn7d_zzaWPcfcHmgI4/view) 34 | - 
[BEP011](https://docs.google.com/document/d/1YG2g4UkEio4t_STIBOqYOwneLEs1emHIXbGKynx7V0Y/view) 35 | - [BEP012](https://docs.google.com/document/d/1qBNQimDx6CuvHjbDvuFyBIrf2WRFUOJ-u50canWjjaw/view) 36 | - [-NIfTI, CIFTI, GIFTI in the HCP and Workbench: a primer- by Jo Etzel from Washington University in St. Louis](http://mvpa.blogspot.com/2014/03/nifti-cifti-gifti-in-hcp-and-workbench.html) 37 | - [-A layman’s guide to working with CIFTI files- by Mandy Mejia from Indiana University](https://mandymejia.com/2015/08/10/a-laymans-guide-to-working-with-cifti-files/) 38 | 39 | ## 4. Software 40 | 41 | - [DCAN Labs nda-abcd-s3-downloader GitHub repository for downloading BIDS inputs and derivatives from collection 3165 directly](https://github.com/ABCD-STUDY/nda-abcd-s3-downloader) 42 | - [ABCD repository for downloading and setting up the BIDS dataset (ABCD-STUDY/abcd-dicom2bids)](https://github.com/ABCD-STUDY/abcd-dicom2bids) 43 | - [Christophe Bedetti's Dcm2Bids GitHub repository](https://github.com/cbedetti/Dcm2Bids) 44 | - [The Chris Rorden's Lab dcm2niix](https://github.com/rordenlab/dcm2niix) 45 | - [A modified version of the minimal preprocessing pipeline for the Human Connectome Project (HCP)](https://github.com/DCAN-Labs/DCAN-HCP) 46 | - [ABCD-BIDS pipeline on GitHub](https://github.com/ABCD-STUDY/abcd-hcp-pipeline) 47 | - [ABCD-BIDS pipeline on DockerHub](https://hub.docker.com/r/dcanlabs/abcd-hcp-pipeline) 48 | - [Advanced Normalization Tools (ANTs)](https://github.com/ANTsX/ANTs) 49 | - [FreeSurfer](https://surfer.nmr.mgh.harvard.edu/) 50 | - [DCAN Labs resting state fMRI analysis tools](https://github.com/DCAN-Labs/dcan_bold_processing) 51 | - [BrainSprite](https://github.com/simexp/brainsprite.js) 52 | - [DCAN-Labs software on GitHub](https://github.com/DCAN-Labs) 53 | - [Custom Clean](https://github.com/DCAN-Labs/CustomClean) 54 | - [File Mapper](https://github.com/DCAN-Labs/file-mapper) 55 | - [DCAN-Labs/cifti-connectivity 
tools](https://github.com/DCAN-Labs/cifti-connectivity) 56 | - [Connectome Workbench](https://www.humanconnectome.org/software/connectome-workbench) 57 | - [The official BIDS validator](https://github.com/bids-standard/bids-validator) 58 | -------------------------------------------------------------------------------- /docs/inputs.md: -------------------------------------------------------------------------------- 1 | # Inputs 2 | 3 | --- 4 | 5 | ## 1. About this Document 6 | 7 | The data collection contains anatomical MRI images (T1w and T2w), functional MRI images processed through [a modified version](https://github.com/DCAN-Labs/DCAN-HCP) of the [minimal preprocessing pipeline for the Human Connectome Project (HCP)](https://doi.org/10.1016/j.neuroimage.2013.04.127), and spin-echo field maps used in preprocessing. [BIDS-formatted](https://bids-specification.readthedocs.io/en/stable/) diffusion-weighted images and their field maps are included as input data, but have not been minimally preprocessed. This document describes the BIDS input data and the steps involved in setting it up for processing through the ABCD-BIDS pipeline. Other documents describe how to [download the data](https://collection3165.readthedocs.io/en/stable/recommendations/#4-downloading-and-unpacking-data) using our [nda-abcd-s3-downloader](https://github.com/ABCD-STUDY/nda-abcd-s3-downloader) tool. 8 | 9 | ## 2. Input Data Subsets Breakdown 10 | 11 | Sections 3 and onward of this document generally describe what each of the input data subsets are. This section breaks down the exact contents of each of the input data subsets. Subject and session identifiers are instead labeled as `#`. Each input data subset comes with modality-agnostic BIDS-compatible `dataset_description.json`, `README`, and `CHANGES` files. 
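
As a rough illustration of how the filename templates in this section can be read, the sketch below hand-translates two of them (the anatomical and functional templates) into regular expressions. It assumes that `#` stands for an alphanumeric BIDS label, that parentheses without `|` mark an optional entity, and that parentheses with `|` list alternatives. The `classify` helper and the example subject label are hypothetical and are not part of any ABCD or DCAN Labs tool.

```python
import re

# Assumption: "#" in the templates stands for an alphanumeric BIDS label.
LABEL = r"[A-Za-z0-9]+"

# inputs.anat.(T1w|T2w): "_rec-normalized" and "_run-#" are treated as optional.
ANAT = re.compile(
    rf"sub-(?P<sub>{LABEL})/ses-(?P<ses>{LABEL})/anat/"
    rf"sub-(?P=sub)_ses-(?P=ses)(?:_rec-normalized)?(?:_run-(?P<run>{LABEL}))?"
    rf"_(?P<suffix>T1w|T2w)\.(?:json|nii\.gz)"
)

# inputs.func.task-(MID|nback|SST|rest): the run entity is always present.
FUNC = re.compile(
    rf"sub-(?P<sub>{LABEL})/ses-(?P<ses>{LABEL})/func/"
    rf"sub-(?P=sub)_ses-(?P=ses)_task-(?P<task>MID|nback|SST|rest)"
    rf"_run-(?P<run>{LABEL})_bold\.(?:json|nii\.gz)"
)


def classify(path):
    """Return (data subset name, matched entities), or None if neither template fits."""
    match = ANAT.fullmatch(path)
    if match:
        return f"inputs.anat.{match['suffix']}", match.groupdict()
    match = FUNC.fullmatch(path)
    if match:
        return f"inputs.func.task-{match['task']}", match.groupdict()
    return None


# Hypothetical subject label; the session label is the one used in this release.
example = (
    "sub-EXAMPLE01/ses-baselineYear1Arm1/anat/"
    "sub-EXAMPLE01_ses-baselineYear1Arm1_rec-normalized_run-01_T1w.nii.gz"
)
print(classify(example))
```

The backreferences `(?P=sub)` and `(?P=ses)` require the subject and session labels in the file name to agree with those in the directory path, which is a cheap consistency check when scanning a downloaded tree.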
12 | 13 | `inputs.anat.(T1w|T2w)` 14 | 15 | - `sub-#/ses-#/anat/sub-#_ses-#(_rec-normalized)(_run-#)_(T1w|T2w).json` 16 | - `sub-#/ses-#/anat/sub-#_ses-#(_rec-normalized)(_run-#)_(T1w|T2w).nii.gz` 17 | 18 | `inputs.dwi.dwi` 19 | 20 | - `sub-#/ses-#/dwi/sub-#_ses-#_dwi.bval` 21 | - `sub-#/ses-#/dwi/sub-#_ses-#_dwi.bvec` 22 | - `sub-#/ses-#/dwi/sub-#_ses-#_dwi.json` 23 | - `sub-#/ses-#/dwi/sub-#_ses-#_dwi.nii.gz` 24 | 25 | `inputs.fmap.all` 26 | 27 | - `sub-#/ses-#/fmap/sub-#_ses-#_dir-(AP|PA)_run-#_epi.json` 28 | - `sub-#/ses-#/fmap/sub-#_ses-#_dir-(AP|PA)_run-#_epi.nii.gz` 29 | - `sub-#/ses-#/fmap/sub-#_ses-#_acq-dwi_dir-(AP|PA)_epi.json` 30 | - `sub-#/ses-#/fmap/sub-#_ses-#_acq-dwi_dir-(AP|PA)_epi.nii.gz` 31 | 32 | `inputs.func.task-(MID|nback|SST|rest)` 33 | 34 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_run-#_bold.json` 35 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_run-#_bold.nii.gz` 36 | 37 | `sourcedata.func.task_events` 38 | 39 | - `sourcedata/sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST)_run-#_bold_EventRelatedInformation.txt` 40 | 41 | ## 3. DAIC Quality Control (QC) Selection Process 42 | 43 | QC is performed by scan operators at the time of the scan. Subjects may fail for a variety of reasons whether the subject was moving, began talking, fell asleep, used the squeeze ball, the scan had to be stopped, etc. These operator notes are taken into account and each scan is given a binary pass or fail. This QC information is now provided by the NIMH Data Archive (NDA) and instructions to download it can be found on the [ABCD repository for downloading and setting up the BIDS dataset (ABCD-STUDY/abcd-dicom2bids)](https://github.com/ABCD-STUDY/abcd-dicom2bids). Only scans that passed this initial QC were considered for processing. Subjects without a passing T1w image were excluded from this dataset. 
If a subject had valid anatomical scans but no functional scans, the subject was processed through an anatomical-only version of the pipeline that excludes any functional image processing. 44 | 45 | ## 4. DICOM to BIDS Conversion 46 | 47 | DICOMs were converted using the [abcd-dicom2bids wrapper](https://github.com/ABCD-STUDY/abcd-dicom2bids). The wrapper includes multiple steps involving pulling the data from ABCD fast-track, as described in recommendations [here](https://collection3165.readthedocs.io/en/stable/recommendations/#4-downloading-and-unpacking-data). The abcd-dicom2bids wrapper pulls DICOMs based on the ABCD fast-track QC. DICOMs were first converted into NIfTIs using [Christophe Bedetti's Dcm2Bids](https://github.com/cbedetti/Dcm2Bids), which is a wrapper for [Chris Rorden's Lab dcm2niix](https://github.com/rordenlab/dcm2niix) that restructures NIfTIs into BIDS format. 48 | 49 | ## 5. MRI Acquisition Parameters 50 | 51 | For a summary of the MRI acquisition parameters for each modality and scanner see [https://abcdstudy.org/images/Protocol_Imaging_Sequences.pdf](https://abcdstudy.org/images/Protocol_Imaging_Sequences.pdf). 52 | 53 | ## 6. Scanner Differences 54 | 55 | Images in this dataset were acquired from three brands of MRI scanner: Siemens, Philips, and General Electric (GE). After initially processing the dataset, we noticed a relatively high post-processing quality control failure rate, particularly for images derived from GE and Philips scanners. Upon further investigation, we noticed that images from GE scanners had no intensity normalization applied to them and that images from both Philips and GE appeared more "grainy" than those from Siemens scanners. 56 | 57 | This motivated inclusion of [Advanced Normalization Tools (ANTs)](https://github.com/ANTsX/ANTs) de-noising as well as ANTs N4 bias field correction during the [PreFreeSurfer](https://github.com/DCAN-Labs/DCAN-HCP/tree/master/PreFreeSurfer) stage of the processing pipeline for all data. 
We also implemented ANTs-based atlas registration in PostFreeSurfer instead of [FSL's FNIRT](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FNIRT)-based method. 58 | 59 | While these changes had little effect on the high-quality anatomical Siemens data, they significantly improved results for both Philips and GE scanner data. 60 | 61 | ## 7. Field Map Selection Process 62 | 63 | A pair of positive (posterior to anterior) and negative (anterior to posterior) spin echo field maps is acquired along with each functional scan. These field maps are highly susceptible to motion artifacts and can have a negative effect on processing if they are not high-quality. Field maps are largely consistent between runs, so we select the field map pair with the least variance from the registered group average and apply that pair to each functional scan to avoid processing with a poor-quality field map. This assumes that field maps with large motion artifacts will have more noise and higher variance from the average. 64 | 65 | ## 8. BIDS Field Map "IntendedFor" Metadata 66 | 67 | The chosen field map pair is used for all anatomical and functional bias field corrections. This is specified with the `IntendedFor` field in the sidecar JSONs of the associated field map's [BIDS](https://bids-specification.readthedocs.io/en/stable/) metadata. 68 | 69 | ## 9. Diffusion-Weighted Imaging (DWI) 70 | 71 | [BIDS-formatted](https://bids-specification.readthedocs.io/en/stable/) inputs for DWI have been included although they have not been processed through the minimal preprocessing pipeline. 72 | 73 | The bval and bvec files associated with the DWI data for each scanner and software version were provided by the DAIC and can be found in the [nda-abcd-s3-downloader](https://github.com/ABCD-STUDY/nda-abcd-s3-downloader) repository. This was necessary because the copies of these files packaged along with the DICOMs had formatting issues and irregularities. 
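
For readers who want to sanity-check a `.bval`/`.bvec` pair after download, the following is a minimal sketch and not the check used by the DAIC or DCAN Labs. It assumes the usual BIDS layout for these files: the `.bval` file holds one row of N whitespace-separated values and the `.bvec` file holds three rows of N values. The function names are hypothetical.

```python
def read_rows(text):
    """Parse whitespace-separated numeric rows, skipping blank lines."""
    return [[float(v) for v in line.split()] for line in text.splitlines() if line.strip()]


def check_gradient_table(bval_text, bvec_text):
    """Return a list of human-readable problems; an empty list means the shapes agree."""
    problems = []
    bvals = read_rows(bval_text)
    bvecs = read_rows(bvec_text)
    if len(bvals) != 1:
        problems.append(f"expected 1 bval row, found {len(bvals)}")
    if len(bvecs) != 3:
        problems.append(f"expected 3 bvec rows, found {len(bvecs)}")
    n_volumes = len(bvals[0]) if bvals else 0
    for index, row in enumerate(bvecs):
        if len(row) != n_volumes:
            problems.append(
                f"bvec row {index} has {len(row)} values, expected {n_volumes}"
            )
    return problems


# Tiny made-up example with three volumes (one b=0 and two b=1000).
print(check_gradient_table("0 1000 1000\n", "0 1 0\n0 0 1\n0 0 0\n"))
```

In practice the two text arguments would come from reading the `sub-#_ses-#_dwi.bval` and `sub-#_ses-#_dwi.bvec` files; the number of volumes could additionally be compared against the fourth dimension of the NIfTI image.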
74 | 75 | Field maps for the DWI data are included in each subject's `fmap` directory and can be distinguished from the functional field maps by the `_acq-dwi` tag in their filenames. 76 | 77 | ## 10. Event Related Information 78 | 79 | The text files containing fMRI task event related information (ERI) have duplicated information. Specifically, per task within each subject's session, each run's ERI text file contains both run 1 and run 2. When extracting task event information for task-fMRI analysis, please make sure to take into account the duplicated structure of each ERI file. Our abcd-bids-tfmri-pipeline already takes this duplication into account for both derived contrasts and the pipeline code itself. 80 | 81 | ## 11. BIDS Modality-Agnostic Files 82 | 83 | To maintain a valid [BIDS data structure](https://bids-specification.readthedocs.io/en/stable/), `dataset_description.json`, `README`, and `CHANGES` files are included. Respectively, they minimally describe the dataset, provide a short blurb about the dataset, and log the changes from version to version. 84 | 85 | ## 12. BIDS Validator Compliance 86 | 87 | This dataset was validated using [the official BIDS validator](https://github.com/bids-standard/bids-validator). 88 | -------------------------------------------------------------------------------- /docs/recommendations.md: -------------------------------------------------------------------------------- 1 | # Recommendations 2 | 3 | Note: Clicking a link within the readthedocs site will not open a new web browser tab. If you want to keep your docs open, either middle-click or right-click and choose "open in new tab" for the links you would like to follow. 4 | 5 | --- 6 | 7 | ## 1. About this Document 8 | 9 | This document highlights common recommendations for usage of the collection 3165 data. 10 | 11 | ## 2. 
The BIDS Participants Files and Matched Groups 12 | 13 | Demographic and socioeconomic variables relating to the ABCD participants included in Collection 3165 can be found in the `participants.tsv` spreadsheet. A data dictionary further explaining each variable is also included. They are available for download on [the main NDA Collection 3165 page](https://nda.nih.gov/edit_collection.html?id=3165). A high level overview of these variables is detailed below. 14 | 15 | 1. `participant_id`: NDA unique pGUID, starting with `sub-` 16 | 1. `session_id`: Participant's session ID (all data within this first release are `ses-baselineYear1Arm1`) 17 | 1. `collection_3165`: Presence or absence of the subject from this NDA collection 3165 uploaded data 18 | 1. `site`: ABCD site location 19 | 1. `scanner_manufacturer`: GE, Philips, or Siemens scanner 20 | 1. `scanner_model`: Scanner model name 21 | 1. `scanner_software`: Scanner software description 22 | 1. `matched_group`: Carefully matched similar groups 23 | 1. `sex`: Sex 24 | 1. `demo_race_a_p___10`: White 25 | 1. `demo_race_a_p___11`: Black/African American 26 | 1. `demo_race_a_p___12`: Native American 27 | 1. `demo_race_a_p___13`: Alaska Native 28 | 1. `demo_race_a_p___14`: Native Hawaiian 29 | 1. `demo_race_a_p___15`: Guamanian 30 | 1. `demo_race_a_p___16`: Samoan 31 | 1. `demo_race_a_p___17`: Other Pacific Islander 32 | 1. `demo_race_a_p___18`: Asian Indian 33 | 1. `demo_race_a_p___19`: Chinese 34 | 1. `demo_race_a_p___20`: Filipino 35 | 1. `demo_race_a_p___21`: Japanese 36 | 1. `demo_race_a_p___22`: Korean 37 | 1. `demo_race_a_p___23`: Vietnamese 38 | 1. `demo_race_a_p___24`: Other Asian 39 | 1. `demo_ethn_p`: Latinx 40 | 1. `demo_race_a_p___25`: Other Race 41 | 1. `demo_race_a_p___77`: Refuse To Answer 42 | 1. `demo_race_a_p___99`: Don't Know 43 | 1. `age`: Age in months 44 | 1. `handedness`: Handedness 45 | 1. `siblings_twins`: Family member status 46 | 1. `income`: Combined income 47 | 1. 
`participant_education`: Participant grade in school 48 | 1. `parental_education`: Highest level of parental education 49 | 1. `anesthesia_exposure`: History of participant anesthesia exposure 50 | 1. `neurocog_pc1.bl`: 51 | 1. `neruocog_pc2.bl`: 52 | 1. `neurocog_pc3.bl`: 53 | 1. `released`: Participants with updated fast track data based on revised QC (see: [known issues](https://collection3165.readthedocs.io/en/stable/release_notes/#released)) 54 | 1. `updated_dwi_input_json`: Participants scanned on GE with MR Software release DV25.0_R02_1549.b (see: [known issues](https://collection3165.readthedocs.io/en/stable/release_notes/#updated_dwi_input_json)) 55 | 56 | The `matched_group` field is the product of comparisons across site, age, sex, ethnicity, grade, highest level of parental education, handedness, combined family income, exposure to anesthesia, and family-relatedness, which show no significant differences between the ABCD-1 and ABCD-2 groups. Comparison of the counts and means for each of these factors shows that ABCD-1 and ABCD-2 are negligibly different samples. Gender shows the largest absolute difference of 2.5 percent. No other demographic variables differ by more than 1 percent. A further description can be found in the [ABCC paper](https://doi.org/10.1101/2021.07.09.451638). See the table below. 57 | 58 | ![Matched groups](img/matched_groups.png) 59 | 60 | A full-resolution version of this table can be found [here](https://github.com/ABCD-STUDY/nda-abcd-collection-3165/tree/master/docs/img/matched_groups.png). 61 | 62 | 63 | ## 3. The BIDS Quality Control File 64 | 65 | This Quality Control (QC) file contains QC metrics for data from this collection and is available for download on [the main NDA Collection 3165 page](https://nda.nih.gov/edit_collection.html?id=3165). Version 1.0.1 contains brain coverage scores for all runs of the `derivatives.func.runs_task-(MID|nback|rest|SST)_volume` data subsets. 
Currently, available fields in the QC file are: 66 | 67 | 1. `participant_id`: NDA unique pGUID, starting with `sub-` 68 | 1. `session_id`: Participant's session ID, starting with `ses-` 69 | 1. `data_subset`: Collection 3165 data subset 70 | 1. `task`: fMRI task name, starting with `task-` 71 | 1. `run`: Chronological run number, starting with `run-` 72 | 1. `path`: Relative path from the root of the data set 73 | 1. `brain_coverage_score`: Overlap of the functional run time series mean with the atlas mask 74 | 75 | ### Brain Coverage Score 76 | 77 | The brain coverage score is an estimate of how much overlap exists between the fMRI task volumes and the MNI atlas mask. It is determined by what percentage of the MNI atlas mask file is covered by each temporal mean of the fMRI time series volume. This is calculated by first taking the temporal mean of the 4-dimensional fMRI time series. The meaned 3-dimensional volume is then binarized using `fslmaths` and masked using the `MNI152_T1_2mm_brain_mask.nii.gz`. The brain coverage score is a percentage. The score is the number of non-zero voxels left in the binarized volume divided by the number of non-zero voxels in the MNI mask. 78 | 79 | ## 4. Downloading and Unpacking Data 80 | 81 | There are two ways to download ABCD Study data and get BIDS inputs or derivatives: 82 | 83 | 1. (***PREFERRED***) Downloading from NDA Collection 3165 will provide a "data structure manifest" spreadsheet with AWS S3 links and other key information. DCAN Labs has designed [a GitHub repository for selectively downloading only parts of the BIDS input and derivative data, the "nda-abcd-s3-downloader"](https://github.com/ABCD-STUDY/nda-abcd-s3-downloader). 84 | 2. [ABCD Fast Track Data on the NDA](https://nda.nih.gov/abcd/query/abcd-fast-track-data.html) can alternatively be downloaded and unpacked into BIDS with the [ABCD-STUDY abcd-dicom2bids GitHub repository](https://github.com/ABCD-STUDY/abcd-dicom2bids). 
Use this route if you specifically need DICOM files. 85 | 86 | ### [`nda-abcd-s3-downloader`](https://github.com/ABCD-STUDY/nda-abcd-s3-downloader) 87 | 88 | This downloader can parallelize downloads and lets you select only the data subsets of interest. 89 | 90 | ### [`abcd-dicom2bids`](https://github.com/ABCD-STUDY/abcd-dicom2bids) 91 | 92 | This tool pulls DICOMs and E-Prime files from the NDA's "fast-track" data. It also unpacks, converts, and BIDS-standardizes the fast-track data so that it becomes BIDS-compliant and matches the data uploaded to collection 3165. 93 | 94 | ## 5. MATLAB Motion Mask Files 95 | 96 | In order to make an accurate correlation matrix, use the MATLAB motion mask file described in release document 4, [Derivatives](https://collection3165.readthedocs.io/en/stable/derivatives/), under the **Motion MAT File** heading. 97 | 98 | ## 6. Interacting with Output Data Types 99 | 100 | Along with GIFTIs, released data follows the standards defined by the Human Connectome Project, such as reporting different metrics in standard grayordinate space and saving data using CIFTI standard file formats. 101 | 102 | Two blog posts offer more detailed coverage of CIFTI data types and how to work with them. These topics are only briefly discussed in this document. 103 | 104 | - [**NIfTI, CIFTI, GIFTI in the HCP and Workbench: a primer** by Jo Etzel from Washington University in St. Louis](http://mvpa.blogspot.com/2014/03/nifti-cifti-gifti-in-hcp-and-workbench.html) 105 | - [**A layman’s guide to working with CIFTI files** by Mandy Mejia from Indiana University](https://mandymejia.com/2015/08/10/a-laymans-guide-to-working-with-cifti-files/) 106 | 107 | The following data types, listed by file name extension, are available in this collection's BIDS derivatives. 108 | 109 | 1. `.dlabel.nii`: "Dense label files" contain the "labels" (a.k.a. parcels) within parcellations. 110 | 1. 
`.dscalar.nii`: "Dense scalar files" contain things like cortical thickness, curvature, and myelin maps on a scalar value per surface vertex basis. 111 | 1. `.dtseries.nii`: "Dense time series" contain functional time series from fMRI runs in surface space on a vector time series per surface vertex basis. 112 | 1. `.ptseries.nii`: "Parcellated time series" contain the dense time series parcellated by the corresponding dense label file. 113 | 1. `.surf.gii`: "GIFTI surface files" contain the "geometry" surface delineations/definitions of a particular surface, such as the midthickness surface. 114 | 115 | ### Dense and Parcellated Time Series 116 | 117 | The dense and parcellated time series files should be analyzed together with their corresponding motion files. Periods of high motion should be censored out for the purposes of regular connectivity/correlation matrix analysis. 118 | 119 | ### Correlation Matrices 120 | 121 | Correlation matrices should be generated from either the dense or parcellated time series using frame censoring from the aforementioned MATLAB motion mask files. We recommend the [DCAN-Labs/cifti-connectivity tools](https://github.com/DCAN-Labs/cifti-connectivity), which handle choosing a framewise displacement threshold, setting a threshold for the minimum number of remaining minutes of data, and outputting either dense (`.dconn.nii`) or parcellated (`.pconn.nii`) connectivity matrices. 122 | 123 | ### Connectome Workbench 124 | 125 | For visualization of all of these CIFTI files, use [Connectome Workbench](https://www.humanconnectome.org/software/connectome-workbench). 126 | 127 | ## 7. DCAN Labs Software 128 | 129 | We have built tools to utilize this data using our recommended methods. Read on for descriptions of each publicly-hosted open-source software GitHub repository from [DCAN-Labs](https://github.com/DCAN-Labs). 
130 | 131 | ### ABCD-BIDS Pipeline: [https://github.com/ABCD-STUDY/abcd-hcp-pipeline](https://github.com/ABCD-STUDY/abcd-hcp-pipeline) 132 | 133 | See these release notes' document 3: [**Pipeline**](https://collection3165.readthedocs.io/en/stable/pipeline/). 134 | 135 | ### Custom Clean: [https://github.com/DCAN-Labs/CustomClean](https://github.com/DCAN-Labs/CustomClean) 136 | 137 | Custom clean is a generalized piece of software which is great for defining common output files to delete when presented with similar folders of files. This is a common occurrence in data processing where you process an input dataset and end up with a similar set of output files for every processed job following some output folder convention. 138 | 139 | First you use a graphical user interface (GUI) to teach custom clean about what files should be regularly cleaned. The custom clean GUI outputs a "cleaning JSON file" which has all the definitions for files to be cleaned within. After that you can call the custom clean script with the cleaning JSON file as many times as you like on as many similar folders as you like. 140 | 141 | ### File Mapper: [https://github.com/DCAN-Labs/file-mapper](https://github.com/DCAN-Labs/file-mapper) 142 | 143 | File mapper is another generalized piece of software which is great for defining a common output folder/file hierarchy based on a template set of files to be mapped and an output hierarchy to which you can map. We use it to conform the commonly output "Human Connectome Project-styled" processed folders into BIDS-compliant derivative folders. 144 | 145 | Much like custom clean, you define a JSON file which says how to map a file from some common input to some common output in order to "reshape" your data outputs. 146 | 147 | ## 8. BIDS Folder Layout 148 | 149 | Your final BIDS folder structure will look like this tree if you download everything. 
Full descriptions of these BIDS input and BIDS derivative data are located in these release notes' documents 2 and 4, [**Inputs**](https://collection3165.readthedocs.io/en/stable/inputs/) and [**Derivatives**](https://collection3165.readthedocs.io/en/stable/derivatives/) respectively. 150 | 151 | ![ABCD-BIDS Layout](img/ABCD-BIDS_cropped.png) 152 | 153 | A full-resolution version of this picture, complete with descriptions, can be found [here](https://github.com/ABCD-STUDY/nda-abcd-collection-3165/tree/master/docs/img/ABCD-BIDS.png). -------------------------------------------------------------------------------- /docs/release_notes.md: -------------------------------------------------------------------------------- 1 | # Release Notes 2 | 3 | ## Introduction 4 | 5 | The ABCC houses a community-shared and continually updated ABCD neuroimaging dataset available under Brain Imaging Data Structure (BIDS) standards. Source data are converted to BIDS from the [NIMH Data Archive (NDA) share of ABCD fast-track data](https://nda.nih.gov/edit_collection.html?id=2573). Only data that passed the Data Analysis Imaging Center (DAIC) quality control are included. As a community share, the ABCC enables researchers to access **available derivatives** and share their **own derivatives**. The ABCD-BIDS datasets are continually updated as new ABCD releases become available. A list of currently available datasets is provided below. 6 | 7 | 1. `BIDS inputs` The input DICOM data to this [BIDS version 1.2.0](https://www.nature.com/articles/sdata201644) data collection were retrieved from the [NIMH Data Archive (NDA) share of ABCD fast-track data](https://nda.nih.gov/edit_collection.html?id=2573) and were last accessed on May 1, 2019. BIDS input data were converted from DICOMs using [Dcm2Bids](https://github.com/cbedetti/Dcm2Bids). 8 | 2. 
`abcd-hcp-pipeline` BIDS derivatives data were derived from the [DCAN Labs ABCD-BIDS MRI (version 0.0.3) processing pipeline](https://doi.org/10.5281/zenodo.2587210) which outputs [Human Connectome Project (HCP) Minimal Preprocessing Pipelines-style data](https://doi.org/10.1016/j.neuroimage.2013.04.127) in both volume and surface spaces. This collection is independent from ABCD Data Collection 2573. Users may access ABCD DICOM files via the ABCD fast-track imaging data release in Collection 2573. 9 | 3. `abcd-task-hcp-pipeline` a modified version of the TaskfMRIAnalysis stage of the HCP-pipeline (Glasser et al., 2013) developed at University of Vermont by Anthony Juliano, was used to process task-fmri data from the minimally processed ABCD-BIDS (Feczko et al., 2020b) processing pipeline (v.1.0) data, as well as derived ABCC data (Feczko, 2020; ABCD-3165). 10 | 4. `freesurfer-5.3.0-HCP` segmentation statistics and surface morphometrics from the FreeSurfer stage within the [DCAN Labs ABCD-BIDS MRI processing pipeline](https://doi.org/10.5281/zenodo.2587210) are provided here. 11 | 5. `fMRIPrep` fMRIPrep v20.2.0 was run on all 10,038 participants whose visit one data was successfully converted to BIDS. The limited fMRIPrep processing errors were due to subjects that did not have any valid fMRI runs, but we did not do any manual quality control of outputs. 9,484 participants have at least one output. The data is available in 18 submissions (a summary, including number of files and submission size can be found [here](https://docs.google.com/spreadsheets/d/1NbZ28vBvGVJb9miivgsJ695VVoFBSBuBQmWigN5pg_c/edit#gid=678992105)). 12 | 13 | 14 | A guiding principle for this collection is to release essential data for analysis. This collection will be updated with waves of data preparation and processing. As waves complete preparation or processing they will be uploaded and version-stamped with updated and versioned release notes. 
15 | 16 | ## Release History 17 | 18 | ### Release 2.0.0 (6/22/2022) 19 | 20 | New updates to the ABCD BIDS Community Collection cover both revisions to existing datasets and new derivatives. Revisions include: 21 | 22 | 1. Uploading 144 participants with new data due to revised fast track QC 23 | 2. Providing connectivity matrices for those participants with discrepancies in the number of timepoints used 24 | 3. Uploading JSONs for the diffusion inputs in some participants 25 | 4. Updating the participants.tsv to v1.0.2, which corrects the site and sex designations for a small subset of subjects based on new information from the DAIC 26 | 27 | New additions include: 28 | 29 | 1. Individual-specific network labels based on template matching and Infomap approaches 30 | 2. Derivatives for the fMRIPrep pipeline, and 31 | 3. Level-2 task files from the ABCD-task-fMRI pipeline. 32 | 33 | Details about each update are given below. 34 | 35 | #### fMRIPrep outputs 36 | 37 | fMRIPrep v20.2.0 was run on all 10,038 participants whose visit one data was successfully converted to BIDS. The limited fMRIPrep processing errors were due to subjects that did not have any valid fMRI runs, but we did not do any manual quality control of outputs. 9,484 participants have at least one output. The data is available in 18 submissions (a summary, including number of files and submission size, can be found [here](https://docs.google.com/spreadsheets/d/1NbZ28vBvGVJb9miivgsJ695VVoFBSBuBQmWigN5pg_c/edit#gid=678992105)). Detailed information about the files included in each submission is on the second tab of that spreadsheet. Files with no submission name listed have not yet been uploaded. If additional outputs are desired, please reach out to Dylan Nielson at . fMRIPrep was run in a Singularity container on resources from the NIH High Performance Computing Biowulf cluster.
38 | 39 | #### Replaced subjects 40 | The initial release was processed prior to new updates to the fast track QC spreadsheet that affected the original inputs for 144 participants. This led to discrepancies in the number of timepoints reported for connectivity matrices (see below) relative to the inputs. The 144 participants were re-processed through the ABCD-BIDS pipeline at the Minnesota Supercomputing Institute (MSI) and are being replaced following the required NDA review. The participants.tsv file indicates which subjects were reprocessed. 41 | 42 | These subjects have had their connectivity matrices regenerated and replaced (see: Connectivity Matrices). 43 | 44 | #### Connectivity matrices 45 | 46 | New Gordon 10 and 5 minute connectivity matrices were produced for the 144 participants with replaced fast track QC information mentioned above. The old matrices remain valid, but may use different frames from the new matrices. The labels for these connectivity matrices were defined in Gordon et al. 2017. These connectivity matrices were created using the [DCAN Labs cifti connectivity wrapper](https://github.com/DCAN-Labs/cifti-connectivity/). Timepoints used for connectivity calculations were thresholded based on data quality. Data quality was measured by the total frame displacement (FD) calculated from the frame-by-frame realignment parameters; frames above an FD of 0.2 mm were excluded. An outlier detection procedure was used to exclude remaining frames that were 2 standard deviations away from the mean. These procedures match the original procedures used to generate the connectivity matrices in the November release. 47 | 48 | *Submission IDs: 36449 - 36452* 49 | 50 | #### DWI sidecar JSON patch (Diffusion inputs) 51 | 52 | The DWI acquisition parameters from subjects scanned on Philips and GE with MR Software release versions 5.3.0_5.3.0.0 and DV25.0_R02_1549.b respectively (n=423) are missing the required field, PhaseEncodingDirection.
This omission occurred because the scanners reported the axis and not the direction; we therefore manually checked these images to determine the phase encoding direction, so that these JSON inputs are BIDS compatible and can be processed by pipelines like QSIPrep. These JSONs have been updated and uploaded. 53 | 54 | *Submission ID: 36448* 55 | 56 | #### Individual-specific network maps using the Infomap algorithm 57 | 58 | Infomap community detection is an unsupervised method of assigning nodes to communities in a graph based on information theory. Here, grayordinates are treated as nodes, and the edges are the correlation between the nodes. There are two versions of individual-specific maps available, depending on whether or not investigators are interested in the contribution of tasks to global network topography: 1) maps generated for subjects with at least 10 minutes of low-motion resting state data (see Hermosillo et al. 2021), and 2) maps generated with all available minutes below an FD threshold of 0.2 mm (and corresponding BOLD outlier detection) using concatenated rest and task data. 59 | 60 | Because the tie density scales exponentially with the number of grayordinates, Infomap community detection was only performed on the cortical surface and did not include subcortical structures (i.e. neither brainstem, cerebellum, nor diencephalon). Note that because Infomap is an unsupervised community detection method, the subject may have more or fewer networks than a canonical network set. Where possible, we have attempted to assign networks based on the networks observed in an average dataset using the Jaccard similarity (see Gordon et al. 2017). However, in some instances the Jaccard similarity was sufficiently low (<0.1) that the network did not resemble any of the canonical networks, in which case the network was given a novel network assignment.
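
The Jaccard-based assignment described above can be sketched roughly as follows. This is a minimal illustration and not the code used to produce the release: the function names `jaccard` and `assign_network`, the `min_jaccard` parameter, and the toy index ranges are all assumptions made for the example. Only the Jaccard similarity itself and the 0.1 cutoff come from the text.

```python
def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two sets of grayordinate indices."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_network(community, templates, min_jaccard=0.1):
    """Return (label, score) for the best-matching canonical network.

    The label is None when the best score falls below min_jaccard, i.e. the
    community does not resemble any canonical network (hypothetical helper).
    """
    best_label, best_score = None, 0.0
    for label, template in templates.items():
        score = jaccard(community, template)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < min_jaccard:
        return None, best_score
    return best_label, best_score


# Toy example with made-up grayordinate indices (not real data).
templates = {"DMN": range(0, 100), "VIS": range(100, 200)}
print(assign_network(range(50, 120), templates))   # overlaps the DMN template most
print(assign_network(range(195, 400), templates))  # too little overlap with any template
```

A community that returns `None` here corresponds to the case in which the release assigns a novel network label.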
61 | 62 | #### Template Matching 63 | 64 | Template matching is a supervised algorithm for identifying neural networks using resting state connectivity data, based on the spatial topography. Click [here](https://github.com/DCAN-Labs/compare_matrices_to_assign_networks) for documentation of source code as well as a written tutorial. Multiple versions of the time series are provided, to allow investigator flexibility in their desired analysis: either exactly 10 minutes of randomly sampled frames, all available frames below the 0.2mm FD threshold, or concatenated rest and task time series data in the following order: rest, MID, n-back, and SST (provided that the participant had an available scan for the task). For full details of inter- and intra- participant reliability, and motion correction, see Hermosillo et al. 2021 (in prep). 65 | 66 | *Submission IDs: 36458 - 36630* 67 | 68 | #### Task outputs 69 | 70 | [abcd-bids-tfmripipeline](https://github.com/DCAN-Labs/abcd-bids-tfmri-pipeline), a modified version of the TaskfMRIAnalysis stage of the HCP-pipeline (Glasser et al., 2013) developed at University of Vermont by Anthony Juliano, was used to process task-fMRI data from the minimally processed ABCD-BIDS (Feczko et al., 2020b) processing pipeline (v.1.0) data, as well as derived ABCC data (Feczko, 2020; ABCD-3165). An example fsf file template for ABCD's MID task is made available for users to review on ABCC (https://osf.io/psv5m/). MID, Nback, and SST level-2 task outputs are available for the baseline sessions for all data that passed task QC. These outputs include the fully-processed dtseries data that are subsequently ready for the user to perform their desired third-level or group-wise analyses.
71 | 72 | ### Release 1.1.1 (10/7/2020) 73 | 74 | This was a small version 1.0.0 release of the `derivatives_qc.(json|tsv)` with additional BIDS derivatives quality control data including a "brain coverage score" for the `derivatives.func.runs_task-(MID|nback|rest|SST)_volume` data subsets. 75 | 76 | ### Release 1.1.0 (7/27/2020) 77 | 78 | This was the next big release with the addition of: 79 | 80 | 1. 157 additional subjects due to updated fast track QC spreadsheet 81 | 1. `participants.(json|tsv)` version 1.0.0: BIDS standard participants files with matched groups 82 | 1. `sourcedata.func.task_events`: Task-based fMRI E-Prime files 83 | 1. `inputs.dwi.dwi`: DWI BIDS input data 84 | 1. `derivatives.anat.stats`: FreeSurfer stats files 85 | 1. `derivatives.anat.(T1w|T2w)`: T1 and T2 volumes 86 | 1. `derivatives.anat.wmparc`: white-matter volume ROIs 87 | 1. `derivatives.func.updated_motion_task-(MID|nback|SST|rest)`: Improved motion files (including outlier calculation) 88 | 1. `derivatives.func.pconns`: Curated parcellated connectivity files 89 | 1. `derivatives.func.runs_task-(MID|nback|SST|rest)_volume`: Minimally-processed fMRI volumes 90 | 91 | ### Release 1.0.0 (2/17/2020) 92 | 93 | This was the initial release of DCAN Labs ABCD-BIDS inputs and derivatives containing **10,038 MRI sessions worth of NDA imagingcollection01 data** and **9,647 MRI sessions worth of NDA fmriresults01 data**. 94 | 95 | #### Corrections 96 | 97 | #### `sourcedata` 98 | 99 | It was brought to our attention that Event Related Information `sourcedata` files can be CSV files as well as TXT files in the ABCD dataset. Unfortunately, we currently only account for TXT files in our BIDS conversion. Our NDA-uploaded source data only includes Event Related Information TXT files at this time. We intend to upload the CSV files with a future release. 
100 | 101 | ##### `task-rest_bold.json` 102 | 103 | Discovered in the middle of June 2020, the modality-specific BIDS inherited `task-rest_bold.json` file at the top of the directory tree which is nested in almost every `task-rest` associated record in the NDA database has a typo in it. The `"TaskDescription"` key has a value of `"See http://www.cognitiveatlas.org/task/id/tsk_4a57abb949e1a/"`. However, this link goes to the stop signal task page on the Cognitive Atlas website. Instead you should refer to [the Cognitive Atlas website for "rest eyes open"](http://www.cognitiveatlas.org/task/id/trm_4c8a834779883/). This website describes the task as: 104 | 105 | "Subjects rest passively with their eyes open. Often used as a baseline for comparison for other tasks." 106 | 107 | ##### `derivatives.func.runs_task-rest_volume` 108 | 109 | This data subset was originally uploaded in Release 1.1.0, but was missing all runs chronologically numbered 3 and up. We are uploading these missing data in Release 1.1.2. 110 | 111 | #### `updated_dwi_input_json` 112 | 113 | The DWI acquisition parameters from all subjects scanned on GE with MR Software release DV25.0_R02_1549.b (n=281) are missing the required field, PhaseEncodingDirection. This omission is because they reported the axis and not direction. -------------------------------------------------------------------------------- /docs/postpipeline.md: -------------------------------------------------------------------------------- 1 | # Post Pipeline Tools 2 | 3 | After fMRI processing we include a couple post processing and analysis steps 4 | 5 | ## Connectivity Matrix Generation 6 | Parcellated connectivity matrices (pconns) are much smaller, and can be further explored to estimate within-study results reproducibility. Using the different sets of parcellated timeseries, we calculated the lag-zero pearson’s correlation coefficient between every pair of parcellated regions of interest (ROIs). 
Per subject and parcellation scheme, this results in an ROI x ROI correlation matrix. The provided connectivity derivatives were extracted with an FD threshold of 0.2 mm for [5 minutes and 10 minutes of data](https://collection3165.readthedocs.io/en/stable/derivatives/#4-functional). A variance stabilization procedure was applied to the correlations prior to uploading. Specifically, the inverse hyperbolic tangent was applied to the correlations: z = arctanh(r). The maximum value displayed is 7.254329. Applying the hyperbolic tangent will recover the pearson's correlation: r = tanh(z). 7 | 8 | ## Individual-specific network maps 9 | 10 | Multiple versions of the time series are provided, to allow investigator flexibility in their desired analysis: either exactly 10 minutes of randomly sampled frames, all available frames below the 0.2mm FD threshold, or concatenated rest and task time series data in the following order: rest, MID, n-back, and SST (provided that the participant had an available scan for the task). For full details of inter- and intra- participant reliability, and motion correction, see Hermosillo et al. 2021 (in prep). 11 | 12 | ### Infomap 13 | 14 | Because brain synapses grow as a complex system of learning and evolving, neural networks don’t dutifully conform to anatomical coordinates across individuals. Therefore, it often makes sense to consider “function” as the pattern of connections between brain regions rather than assume function occurs at a specific anatomical location. Graph theory is an appropriate avenue for investigation because we can redefine brain regions as anatomically-irrespective nodes, and define the correlation (i.e. connectivity) between them as edges. Nodes that communicate heavily with each other are considered to be a part of the same community or network. Networks for the ABCD collection were detected using infomap (D. Edler, A. Holmgren and M. 
Rosvall, The MapEquation software package, available online at http://www.mapequation.org). Infomap is an algorithm that describes information flow in the network by attempting to minimize the number of bits necessary to describe the whole network (Martin Rosvall and Bergstrom 2008; M. Rosvall, Axelsson, and Bergstrom 2009). For example, would it require fewer bits to describe the whole brain with few networks containing many nodes, or many networks with fewer nodes? Similar algorithms maximize modularity metrics; Infomap, however, uses a random walk algorithm that uses edge weights (in this case, connectivity) to determine the minimum descriptor code length necessary. While the solution provides modules, it is not designed to maximize modularity. Importantly, neural networks have been shown to be scalable. As others have done previously (Gordon et al. 2017), we thresholded the whole brain correlation matrix (91282 x 91282 grayordinates) to the top x% of connections (or edges) because of the computational limitations of using a full set of 8.1 billion connections as descriptors in the map equation. We thresholded the connectivity matrix at 0.3%, 0.4%, 0.5%, 1%, 1.5%, 2%, 2.5%, and 3%. These thresholds were chosen to scale the number of edges. 15 | 16 | To generate a consensus across multiple edge percentages, we implemented a methodology developed by Gordon and colleagues (Gordon et al. 2017). Briefly, after Infomap detected communities for each subject, putative network assignments were given to each subject’s communities by matching them at each threshold to the independent group networks from the University of Washington (n=120). For each individual, at each percentage threshold, the spatial overlap of each unknown community was compared to each one of the independent group networks separately using the Jaccard similarity index.
The unknown community was then assigned the network identity with which it had the highest Jaccard similarity index. If the Jaccard index was less than 0.1, the community remained unassigned, so as to avoid assigning communities to known networks based on only a few vertices. Assignments were first made with the large, well-known networks (Default, Lateral Visual, Motor, Fronto-Parietal, Cingulo-Opercular, Dorsal Attention), and then to the smaller, less well-known networks (e.g. Ventral Attention, Salience, Parietal Memory, lateral hand-face motor). In each individual, a “consensus” network assignment was created by giving each grayordinate the canonical assignment it had at the sparsest threshold. 17 | 18 | Infomap community detection is an unsupervised method of assigning nodes to communities in a graph based on information theory. Here, grayordinates are treated as nodes, and the edges are the correlation between the nodes. There are two versions of individual-specific maps available depending on whether or not investigators are interested in the contribution of tasks to global network topography. 19 | 20 | The following maps are generated for subjects with at least 10 minutes of low-motion resting state data (see Hermosillo et al. 2021). The following are data subset names: 21 | 22 | ``` 23 | fmriresults01_derivatives.func.networkmaps_task-restonly_10min_Surfonly_infomap_singlenet_dscalar.nii 24 | ``` 25 | 26 | The following maps are generated with all available minutes below the FD threshold (and corresponding BOLD outlier detection) using concatenated rest and task data. 27 | 28 | ``` 29 | fmriresults01_derivatives.func.networkmaps_task-restandtask_allmin_Surfonly_infomap_singlenet_dscalar.nii 30 | ``` 31 | 32 | Because the tie density scales exponentially with the number of grayordinates, Infomap community detection was only performed on the cortical surface and did not include subcortical structures (i.e. neither brainstem, cerebellum, nor diencephalon).
Note that because Infomap is an unsupervised community detection method, the subject may have more or fewer networks than a canonical network set. Where possible, we have attempted to assign networks based on the networks observed in an average dataset using the Jaccard similarity (see Gordon et al. 2017). However, in some instances the Jaccard similarity was sufficiently low (<0.1) that the network did not resemble any of the canonical networks, in which case the network was given a novel network assignment. 33 | 34 | ### Template matching 35 | 36 | Template matching is a supervised algorithm for identifying neural networks using resting state connectivity data, based on the spatial topography. [Click here for documentation of source code as well as a written tutorial.](https://github.com/DCAN-Labs/compare_matrices_to_assign_networks) 37 | 38 | ### Creating a template 39 | 40 | This technique has been used previously to identify networks ([Gordon et al. 2017](https://pubmed.ncbi.nlm.nih.gov/26464473/); [Dworetsky 2021](https://www.sciencedirect.com/science/article/pii/S1053811921004419)). Briefly, in order to generate the templates, Infomap community detection was performed at several tie densities (for full details of average networks, see [Gordon, Laumann, Gilmore, et al. 2017](https://pubmed.ncbi.nlm.nih.gov/26464473/); [Gordon, Laumann, Adeyemo, et al. 2017](https://doi.org/10.1016/j.neuron.2015.06.037); [Laumann et al. 2015](https://pubmed.ncbi.nlm.nih.gov/26212711/)) on an average connectivity matrix (n=120 participants) using a two level solution. This provides a common set of networks based on average brain activity from which one can “match” the spatial topography of brain activation of a given grayordinate.
To generate a set of independent network templates, a seed-based correlation was performed using an average time series for each network correlated with a smoothed dense time series (spatial Gaussian smoothing kernel of 2.55 mm using each participant’s own mid-thickness surfaces) from each template participant. Seed-based correlation values were averaged across all the participants in the template group (n=164, 9-10 year olds), resulting in a vector (91282 x 1) of average correlation values for each network correlated with each grayordinate. Each network vector was averaged independently across subjects in the template group to generate seed-based templates for each network. We then thresholded each network template at Z ≥ 1. 41 | 42 | ### Matching connectivity to a network template 43 | 44 | To generate individual-specific maps for each participant in ABCD groups 1 and 2, we examined the whole-brain connectivity for each grayordinate by correlating the motion-censored dense time series with all other grayordinates, resulting in a 91282 x 91282 (or, where cortex-only analyses were performed, 59412 x 59412) correlation matrix. The correlation matrix was Z-scored separately for each hemisphere, and within and between the cortex and subcortical structures. Whole-brain connectivity for each grayordinate was thresholded to only include correlated grayordinates with Z-score values greater than or equal to one. This resulted in a vector of whole-brain connectivity for each grayordinate that only includes grayordinates that are strongly correlated to a given network template. We then calculated an eta^2 value between the remaining grayordinates and each of the network templates. The grayordinate was assigned to the network with the maximum eta^2 value. Two versions of individual-specific networks are available: one version was created without using subcortical data from the timeseries, the other was created including subcortical timeseries data.
For the following files, participants had varying amounts of data below the framewise displacement threshold (FD=0.2mm, see [Hermosillo et al. 2021](https://www.biorxiv.org/content/10.1101/2022.01.12.475422v1) for additional motion censor criteria). These maps were generated from concatenated rest and task timeseries data. 45 | 46 | - fmriresults01_derivatives.func.networkmaps_task-restandtask_allmin_Surfandsub_templatematching_singlenet_dscalar.nii 47 | - fmriresults01_derivatives.func.networkmaps_task-restandtask_allmin_Surfonly_templatematching_singlenet_dscalar.nii 48 | 49 | #### Movement Criteria 50 | 51 | We are providing individual-specific maps using concatenated tasks and rest using only 10 minutes of low-motion data or using all available low-motion data. 52 | 53 | - fmriresults01_derivatives.func.networkmaps_task-restonly_10min_Surfandsub_templatematching_singlenet_dscalar.nii 54 | - fmriresults01_derivatives.func.networkmaps_task-restonly_10min_Surfonly_templatematching_singlenet_dscalar.nii 55 | - fmriresults01_derivatives.func.networkmaps_task-restonly_allmin_Surfandsub_templatematching_singlenet_dscalar.nii 56 | 57 | Dscalars are provided in a fsLR32k format.
In the dscalars, each grayordinate (n=91282) has a single value, where each value corresponds with the following key: 58 | 59 | - 1 = Default mode network (DMN) 60 | - 2 = Visual network (VIS) 61 | - 3 = Frontoparietal network (FPN) 62 | - 4 = *there is no network 4* 63 | - 5 = Dorsal attention network (DAN) 64 | - 6 = *there is no network 6* 65 | - 7 = Ventral attention network (VAN) 66 | - 8 = Salience network (Sal) 67 | - 9 = Cingulo-opercular network (CO) 68 | - 10 = Dorsal sensorimotor network (SMd) 69 | - 11 = Lateral sensorimotor network (SMl) 70 | - 12 = Auditory network (AUD) 71 | - 13 = Temporal pole network (TP) 72 | - 14 = Medial temporal network (MTL) 73 | - 15 = Parietal occipital network (PON) 74 | - 16 = Parietal medial network (PMN) 75 | 76 | --- 77 | 78 | ## Template matching (multiple networks per grayordinate) 79 | 80 | To generate overlapping networks for each participant, we used the same template networks as described above. However, rather than assigning the grayordinate to the network with the maximum eta^2 value, a data-driven approach was used to assign multiple networks to each grayordinate. For each network we plotted the distribution of eta^2 values. The distribution of similarity (eta^2) for each network is both bimodal and skewed, such that most grayordinates do not resemble the network of interest (left peak), and some grayordinates have a spatial connectivity that is very similar to the template network (right peak). The eta^2 values were distributed into 10,000 bins, and the resulting distribution was fitted with a cubic spline and then smoothed (2,000 point Savitzky-Golay window), and the local minimum was taken. We then used this local minimum as the threshold for whether or not a grayordinate would be labelled with this network, where grayordinates above this threshold would receive the network assignment.
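
The thresholding step described above can be sketched as follows. This is a hedged illustration and not the pipeline's implementation: the function name `bimodal_threshold`, its parameters, and the synthetic eta^2 values are assumptions. The cubic-spline fit is omitted, and a Savitzky-Golay filter with local cubic fits (`polyorder=3`) over an odd 2,001-bin window stands in for the 2,000-point window mentioned in the text. Locating the valley as the minimum between the two most prominent peaks is likewise one plausible reading of "the local minimum".

```python
import numpy as np
from scipy.signal import find_peaks, savgol_filter


def bimodal_threshold(eta2, n_bins=10_000, window=2001, polyorder=3):
    """Estimate the valley between the two modes of an eta^2 distribution."""
    # Histogram the similarity values over [0, 1].
    counts, edges = np.histogram(eta2, bins=n_bins, range=(0.0, 1.0))
    centers = 0.5 * (edges[:-1] + edges[1:])

    # Smooth the histogram (odd window length, local cubic fits).
    smooth = savgol_filter(counts.astype(float), window_length=window,
                           polyorder=polyorder)

    # Keep the two most prominent peaks as the two modes.
    peaks, props = find_peaks(smooth, prominence=0)
    if len(peaks) < 2:
        raise ValueError("distribution does not look bimodal")
    top_two = np.argsort(props["prominences"])[-2:]
    left, right = np.sort(peaks[top_two])

    # The threshold is the lowest point of the smoothed curve between the modes.
    valley = left + int(np.argmin(smooth[left:right + 1]))
    return float(centers[valley])


# Synthetic eta^2 values for one network (illustrative only, not real data):
# most grayordinates do not resemble the template, a minority resemble it strongly.
rng = np.random.default_rng(0)
eta2 = np.clip(np.concatenate([rng.normal(0.2, 0.05, 160_000),
                               rng.normal(0.7, 0.05, 40_000)]), 0.0, 1.0)
threshold = bimodal_threshold(eta2)
in_network = eta2 > threshold  # grayordinates that would receive this network label
print(threshold, int(in_network.sum()))
```

Running this once per network would give each grayordinate zero, one, or several labels, which is how overlapping assignments arise.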
81 | 82 | ### Movement Criteria 83 | The following versions of individual-specific maps are available for subjects that had at least 10 minutes of low-motion resting state data. Networks were generated using exactly 10 minutes of data to ensure that an identical amount of time was used to generate correlation matrices for all participants. 84 | 85 | - fmriresults01_derivatives.func.networkmaps_task-restonly_10min_Surfandsub_templatematching_overlappingnet_dtseries.nii 86 | - fmriresults01_derivatives.func.networkmaps_task-restonly_10min_Surfonly_templatematching_overlappingnet_dtseries.nii 87 | 88 | In order to include additional participants that had less than 10 minutes of data and to capture additional information for participants that had more than 10 minutes, we have also included networks for participants using all available low-motion data. 89 | Note that subjects will have varying numbers of frames due to each participant’s movement in the scanner at the time of collection. 90 | 91 | - fmriresults01_derivatives.func.networkmaps_task-restonly_allmin_Surfandsub_templatematching_overlappingnet_dtseries.nii 92 | 93 | Lastly, as mentioned above, task and rest data were concatenated to leverage additional hemodynamic coactivation information related to network connectivity, under the assumption that task-based activations contribute only a minuscule variation to global network communication ([Gratton et al. 2018](https://www.sciencedirect.com/science/article/pii/S0896627318302411)). 94 | 95 | - fmriresults01_derivatives.func.networkmaps_task-restandtask_allmin_Surfandsub_templatematching_overlapping_dtseries.nii 96 | - fmriresults01_derivatives.func.networkmaps_task-restandtask_allmin_Surfonly_templatematching_overlappingnet_dtseries.nii 97 | 98 | Overlapping .dtseries.nii CIFTI files are provided with 1 network per column of the timeseries file, in the same order provided above in the single network example.
99 | -------------------------------------------------------------------------------- /docs/derivatives.md: -------------------------------------------------------------------------------- 1 | # Derivatives 2 | 3 | Note: Clicking any link within the readthedocs site will not open a new web browser tab. If you want to keep your docs open, either middle-click or right-click and choose open in new tab for the links you would like to follow. 4 | 5 | --- 6 | 7 | ## 1. About this Document 8 | 9 | This document reports and describes the derivative files containing processed data that are in the `abcd-hcp-pipeline` and `freesurfer-5.3.0-HCP` derivative folders from the ABCC. All BIDS derivative record data types within this document are written in `monospace` font. They are prefaced by their short names. References for other derivatives from pipelines like: 10 | 11 | 1. [fMRIPrep](https://fmriprep.org/) 12 | 1. [QSIPrep](https://qsiprep.readthedocs.io/en/stable/) 13 | 1. [abcd-bids-tfmri-pipeline](https://github.com/DCAN-Labs/abcd-bids-tfmri-pipeline) 14 | 15 | can be found by clicking on their respective links. 16 | 17 | ## 2. Anatomical 18 | 19 | There are eleven possible BIDS anatomical (`anat`) derivative subsets depending on whether a T2w image is present or not. If there is no T1w image present the data set is not processed. The following `monospace` descriptions are names of the subsets. 
20 | 21 | ### Prerequisite: At least one `T1w` MRI 22 | 23 | - Cortical Thickness: `derivatives.anat.space-fsLR32k_thickness` 24 | - Curvature: `derivatives.anat.space-fsLR32k_curv` 25 | - Sulcal Depth: `derivatives.anat.space-fsLR32k_sulc` 26 | - Discrete segmentation (native space): `derivatives.anat.space-ACPC_dseg` 27 | - Mid-thickness surface (MNI space, fsLR164k mesh): `derivatives.anat.space-MNI_mesh-fsLR164k_midthickness` 28 | - Mid-thickness surface (MNI space, fsLR32k mesh): `derivatives.anat.space-MNI_mesh-fsLR32k_midthickness` 29 | - Mid-thickness surface (MNI space, native mesh): `derivatives.anat.space-MNI_mesh-native_midthickness` 30 | - Mid-thickness surface (native space, fsLR32k mesh): `derivatives.anat.space-T1w_mesh-fsLR32k_midthickness` 31 | - Mid-thickness surface (native space, native mesh): `derivatives.anat.space-T1w_mesh-native_midthickness` 32 | 33 | ### Prerequisites: At least one `T1w` MRI and at least one `T2w` MRI 34 | 35 | - Myelin map (un-smoothed): `derivatives.anat.space-fsLR32k_myelinmap` 36 | - Myelin map (smoothed): `derivatives.anat.space-fsLR32k_desc-smoothed_myelinmap` 37 | 38 | ## 3. Functional 39 | 40 | There are eight possible BIDS functional (`func`) derivatives depending upon the presence of fMRI runs (all `_____` blanks below would instead be filled in by their fMRI type, see below bulleted list). See [here](https://pubmed.ncbi.nlm.nih.gov/29567376/) to learn more about the task paradigms. 
41 | 42 | - Individual run movement files (regression and censoring): `derivatives.func.motion_task-_____` 43 | - Individual run time series: `derivatives.func.runs_task-_____` 44 | - Concatenated dense time series: `derivatives.func.concat_task-______bold_desc-filtered_timeseries` 45 | - Gordon 2014 parcellated time series: `derivatives.func.concat_task-______bold_atlas-Gordon2014FreeSurferSubcortical_desc-filtered_timeseries` 46 | - Human Connectome Project 2016 parcellated time series: `derivatives.func.concat_task-______bold_atlas-HCP2016FreeSurferSubcortical_desc-filtered_timeseries` 47 | - Markov 2012 parcellated time series: `derivatives.func.concat_task-______bold_atlas-Markov2012FreeSurferSubcortical_desc-filtered_timeseries` 48 | - Power 2011 parcellated time series: `derivatives.func.concat_task-______bold_atlas-Power2011FreeSurferSubcortical_desc-filtered_timeseries` 49 | - Yeo 2011 parcellated time series: `derivatives.func.concat_task-______bold_atlas-Yeo2011FreeSurferSubcortical_desc-filtered_timeseries` 50 | 51 | ### Prerequisites for each fMRI type to have eight derivative data subsets: 52 | 53 | - Resting-state (`rest`): At least one `task-rest_bold` fMRI run 54 | - Monetary Incentive Delay (`MID`): A pair of `task-MID_bold` fMRI runs 55 | - N-back (`nback`): A pair of `task-nback_bold` fMRI runs 56 | - Stop Signal Task (`SST`): A pair of `task-SST_bold` fMRI runs 57 | 58 | ## 4. Derivative Data Subsets Breakdown 59 | 60 | Sections 3 and onward of this document generally describe what each of the derivative data subsets are. This section breaks down the exact contents of each of the derivative data subsets. Each NDA data subset is depicted by each subheading with a description of the files underneath. Subject and session identifiers are instead labeled as `#`. Each derivative data subset comes with modality-agnostic BIDS-compatible `dataset_description.json`, `README`, and `CHANGES` files and a `derivatives/abcd-hcp-pipeline/` subfolder. 
For readability, the `derivatives/abcd-hcp-pipeline/` subfolder has been omitted from all derivative subfolders and filenames below. 61 | 62 | ### FreeSurfer statistics: derivatives.anat.stats 63 | 64 | FreeSurfer stats folder. 65 | 66 | - `../freesurfer-5.3.0-HCP/sub-#/ses-#/stats/` 67 | 68 | **NOTE:** `` here denotes files directly from the FreeSurfer stats folder. 69 | 70 | ### Discrete segmentation (native space): derivatives.anat.space-ACPC_dseg 71 | 72 | Discrete segmentation in subject's native space in a NIfTI volume. 73 | 74 | - `sub-#/ses-#/anat/sub-#_ses-#_space-ACPC_dseg.nii.gz` 75 | 76 | ### Curvature (fsLR32k space): derivatives.anat.space-fsLR32k_curv 77 | 78 | Dense subject curvature CIFTI. 79 | 80 | - `sub-#/ses-#/anat/sub-#_ses-#_space-fsLR32k_curv.dscalar.nii` 81 | 82 | ### Sulcal Depth (fsLR32k space): derivatives.anat.space-fsLR32k_sulc 83 | 84 | Dense subject sulcal depth CIFTI. 85 | 86 | - `sub-#/ses-#/anat/sub-#_ses-#_space-fsLR32k_sulc.dscalar.nii` 87 | 88 | ### Cortical Thickness (fsLR32k space): derivatives.anat.space-fsLR32k_thickness 89 | 90 | Dense subject cortical thickness CIFTI. 91 | 92 | - `sub-#/ses-#/anat/sub-#_ses-#_space-fsLR32k_thickness.dscalar.nii` 93 | 94 | ### Myelin Map (Smoothed, fsLR32k space): derivatives.anat.space-fsLR32k_desc-smoothed_myelinmap 95 | 96 | Smoothed myelin map CIFTI (when a T2w image is present in the inputs). 97 | 98 | - `sub-#/ses-#/anat/sub-#_ses-#_space-fsLR32k_desc-smoothed_myelinmap.dscalar.nii` 99 | 100 | ### Myelin Map (Unsmoothed, fsLR32k space): derivatives.anat.space-fsLR32k_myelinmap 101 | 102 | Unsmoothed myelin map CIFTI (when a T2w image is present in the inputs). 103 | 104 | - `sub-#/ses-#/anat/sub-#_ses-#_space-fsLR32k_myelinmap.dscalar.nii` 105 | 106 | ### Mid-thickness surface (MNI space, fsLR164k mesh): derivatives.anat.space-MNI_mesh-fsLR164k_midthickness 107 | 108 | Left and right mid-thickness GIFTI surfaces in MNI space with the fsLR164k surface mesh.
109 | 110 | - `sub-#/ses-#/anat/sub-#_ses-#_hemi-L_space-MNI_mesh-fsLR164k_midthickness.surf.gii` 111 | - `sub-#/ses-#/anat/sub-#_ses-#_hemi-R_space-MNI_mesh-fsLR164k_midthickness.surf.gii` 112 | 113 | ### Mid-thickness surface (MNI space, fsLR32k mesh): derivatives.anat.space-MNI_mesh-fsLR32k_midthickness 114 | 115 | Left and right mid-thickness GIFTI surfaces in MNI space with the fsLR32k surface mesh. 116 | 117 | - `sub-#/ses-#/anat/sub-#_ses-#_hemi-L_space-MNI_mesh-fsLR32k_midthickness.surf.gii` 118 | - `sub-#/ses-#/anat/sub-#_ses-#_hemi-R_space-MNI_mesh-fsLR32k_midthickness.surf.gii` 119 | 120 | ### Mid-thickness surface (MNI space, native mesh): derivatives.anat.space-MNI_mesh-native_midthickness 121 | 122 | Left and right mid-thickness GIFTI surfaces in MNI space with the native surface mesh. 123 | 124 | - `sub-#/ses-#/anat/sub-#_ses-#_hemi-L_space-MNI_mesh-native_midthickness.surf.gii` 125 | - `sub-#/ses-#/anat/sub-#_ses-#_hemi-R_space-MNI_mesh-native_midthickness.surf.gii` 126 | 127 | ### Mid-thickness surface (native space, fsLR32k mesh): derivatives.anat.space-T1w_mesh-fsLR32k_midthickness 128 | 129 | Left and right mid-thickness GIFTI surfaces in native T1 space with the fsLR32k surface mesh. 130 | 131 | - `sub-#/ses-#/anat/sub-#_ses-#_hemi-L_space-T1w_mesh-fsLR32k_midthickness.surf.gii` 132 | - `sub-#/ses-#/anat/sub-#_ses-#_hemi-R_space-T1w_mesh-fsLR32k_midthickness.surf.gii` 133 | 134 | ### Mid-thickness surface (native space, native mesh): derivatives.anat.space-T1w_mesh-native_midthickness 135 | 136 | Left and right mid-thickness GIFTI surfaces in native T1 space with the native surface mesh. 137 | 138 | - `sub-#/ses-#/anat/sub-#_ses-#_hemi-L_space-T1w_mesh-native_midthickness.surf.gii` 139 | - `sub-#/ses-#/anat/sub-#_ses-#_hemi-R_space-T1w_mesh-native_midthickness.surf.gii` 140 | 141 | ### Anatomical Volume (MNI space): derivatives.anat.(T1w|T2w) 142 | 143 | Masked-brain or full-head anatomical image volume in MNI space.
144 | 145 | - `sub-#/ses-#/anat/sub-#_ses-#_(T1w|T2w)_space-MNI_(brain|head).nii.gz` 146 | 147 | ### White matter parcellation discrete segmentation (MNI space): derivatives.anat.wmparc 148 | 149 | White matter parcellation discrete segmentation file in a volume. 150 | 151 | - `sub-#/ses-#/anat/sub-#_ses-#_T1w_space-MNI_desc-wmparc_dseg.nii.gz` 152 | 153 | ### Executive Summary: derivatives.executivesummary.all 154 | 155 | DCAN Labs executive summary HTML processing visual inspection summary. 156 | 157 | - `sub-#/ses-#/sub-#_ses-#.html` 158 | - `sub-#/ses-#/img/` 159 | 160 | **NOTE:** `` denotes a variety of `.gif` and `.png` images used in conjunction with the `.html` file which vary in count based on counts of available input images. 161 | 162 | ### Parcellated functional timeseries: derivatives.func.concat_task-(MID|nback|SST|rest)_bold_atlas-(Gordon2014|HCP2016|Markov2012|Power2011|Yeo2011)FreeSurferSubcortical_desc-filtered_timeseries 163 | 164 | A dense label parcellation with FreeSurfer subcorticals with names corresponding to the first author and publication year. 165 | 166 | - `(Gordon2014|HCP2016|Markov2012|Power2011|Yeo2011)FreeSurferSubcortical_dparc.dlabel.nii` 167 | 168 | A "5 contiguous frames" motion censoring algorithm file of temporal masks by FD threshold (0mm->0.5mm) with or without outlier detection in use. 169 | 170 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_desc-filtered_motion_mask.mat` 171 | 172 | Concatenated functional task parcellated time series using the parcellations above. 173 | 174 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_bold_atlas-(Gordon2014|HCP2016|Markov2012|Power2011|Yeo2011)FreeSurferSubcortical_desc-filtered_timeseries.ptseries.nii` 175 | 176 | `derivatives.func.concat_task-(MID|nback|SST|rest)_bold_desc-filtered_timeseries` 177 | 178 | Concatenated functional task dense time series post-DCANBOLDProcessing (regression and filtering) in Atlas space. 
179 | 180 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_bold_desc-filtered_timeseries.dtseries.nii` 181 | 182 | A "5 contiguous frames" motion censoring algorithm file of temporal masks by FD threshold (0mm->0.5mm) with or without outlier detection in use. 183 | 184 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_desc-filtered_motion_mask.mat` 185 | 186 | ### Motion Framewise displacements: derivatives.func.motion_task-(MID|nback|SST|rest) 187 | 188 | A "5 contiguous frames" motion censoring algorithm file of temporal masks by FD threshold (0mm->0.5mm) with or without outlier detection in use. 189 | 190 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_desc-filtered_motion_mask.mat` 191 | 192 | Movement-artifact-unfiltered or Movement-artifact-filtered (`_desc-filtered`) movement numbers without an FD column. 193 | 194 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_run-#_motion.tsv` 195 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_run-#_desc-filtered_motion.tsv` 196 | 197 | ### Motion and Framewise displacement (filtered): derivatives.func.updated_motion_task-(MID|nback|SST|rest) 198 | 199 | Movement-artifact-unfiltered or Movement-artifact-filtered (`_desc-filtered`) movement numbers with an FD column included (this one is recommended over the original `derivatives.func.motion_task-(MID|nback|SST|rest)` motion files). 200 | 201 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_run-#_desc-includingFD_motion.tsv` 202 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_run-#_desc-filteredincludingFD_motion.tsv` 203 | 204 | ### Parcellated Connectivity: derivatives.func.pconns 205 | 206 | Connectivity matrix and its frame censor with either 5 minutes of data (`_censor-5min`), 10 minutes of data (`_censor-10min`), or all frames (`_censor-belowthresh`) below the FD threshold of 0.2 mm (`_thresh-fd0p2mm`). 
207 | 208 | - `sub-#/ses-#/func/sub-#_ses-#_task-rest_bold_atlas-Gordon2014FreeSurferSubcortical_desc-filtered_timeseries_thresh-fd0p2mm_censor-(5min|10min|belowthresh)_conndata-network_(censor.txt|connectivity.pconn.nii)` 209 | 210 | ### Individual functional run and motion files: derivatives.func.runs_task-(MID|nback|SST|rest) 211 | 212 | Individual minimally-processed functional task run dense time series in Atlas space pre-DCANBOLDProcessing with original filtered motion numbers. 213 | 214 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_run-#_bold_timeseries.dtseries.nii` 215 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_run-#_desc-filtered_motion.tsv` 216 | 217 | ### Individual functional volumes: derivatives.func.runs_task-(MID|nback|SST|rest)_volume 218 | 219 | Motion-corrected individual functional task run in MNI space in a volume. 220 | 221 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST|rest)_run-#_space-MNI_bold.nii.gz` 222 | 223 | 224 | ## 5. Task fMRI 225 | 226 | The task pipeline will produce its derivatives in the following BIDS-valid directory structure. 227 | 228 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST)_level-2_contrast_*_cope1.dtseries.nii` 229 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST)_level-2_contrast_*_tdof_t1.dtseries.nii` 230 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST)_level-2_contrast_*_logfile` 231 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST)_level-2_contrast_*_mask.dtseries.nii` 232 | - `sub-#/ses-#/func/sub-#_ses-#_task-(MID|nback|SST)_level-2_contrast_*_res4d.dtseries.nii` 233 | 234 | ## 6. Executive Summary 235 | 236 | The DCAN Labs executive summary is software for getting a basic visual quality control report to review processed output data. 237 | 238 | ### Prerequisites: At least one `T1w` and one fMRI 239 | 240 | - DCAN Labs Executive Summary: `derivatives.executivesummary.all` 241 | 242 | ## 7. 
Derivative Filenames 243 | 244 | Some BIDS derivative standards are still [BIDS Extension Proposals (BEPs)](https://bids-specification.readthedocs.io/en/stable/06-extensions.html#bids-extension-proposals) at the time of this writing, but we tried to conform to the derivative standards available at the time for common derivatives ([BEP003](https://docs.google.com/document/d/1Wwc4A6Mow4ZPPszDIWfCUCRNstn7d_zzaWPcfcHmgI4/view)), the structural preprocessing derivatives ([BEP011](https://docs.google.com/document/d/1YG2g4UkEio4t_STIBOqYOwneLEs1emHIXbGKynx7V0Y/view)), and the functional preprocessing derivatives ([BEP012](https://docs.google.com/document/d/1qBNQimDx6CuvHjbDvuFyBIrf2WRFUOJ-u50canWjjaw/view)). 245 | 246 | ## 8. Motion MAT File 247 | 248 | The MATLAB motion .MAT files are a product of the DCANBOLDProcessing stage of the pipeline. They should be used to select a frame censoring mask (frames to keep in analysis versus frames to censor out because of excessive motion). Each file contains a 1x51 MATLAB cell of MATLAB structs, where each struct holds the censoring information at a given framewise displacement (FD) threshold (0 to 0.5 millimeters in steps of 0.01 millimeters). 249 | 250 | These files use the motion censoring algorithm from the [Power et al., 2014 paper](https://www.sciencedirect.com/science/article/pii/S1053811913009117). In that paper, the authors describe a motion censoring method in which, in addition to the frames that exceed the FD threshold, segments of below-threshold data are also excluded when they span fewer than five contiguous frames between sequential censored frames. 251 | 252 | [*Power, J. D., Mitra, A., Laumann, T. O., Snyder, A. Z., Schlaggar, B. L., & Petersen, S. E. (2014). Methods to detect, characterize, and remove motion artifact in resting state fMRI. NeuroImage, 84, 320–41. doi:10.1016/j.neuroimage.2013.08.048*](https://www.sciencedirect.com/science/article/pii/S1053811913009117) 253 | 254 | ## 9. 
Caveats 255 | 256 | There were a few parts of the NDA fmriresults01 and imagingcollection01 data structures where we could not conform to the NDA's established standard. We plan to correct these in future releases. 257 | 258 | ### `qc_outcome` Field 259 | 260 | All `qc_outcome` fields within NDA records are marked as questionable because the data have not been formally quality controlled. 261 | 262 | ### Executive Summary not-yet-available for Anatomical-only Subjects 263 | 264 | A minor bug in the executive summary visual report in this round of processing prevents it from supporting sessions without any fMRI runs. 265 | 266 | ### Executive Summary labeled as fMRI `scan_type` in records 267 | 268 | For all NDA records within this first release the `scan_type` variable was set to fMRI for the executive summary visual reports. -------------------------------------------------------------------------------- /docs/pipelines.md: -------------------------------------------------------------------------------- 1 | # Pipeline 2 | 3 | ## 1. About this Document 4 | 5 | This document briefly describes the ABCD-BIDS pipeline, fMRIPrep, and QSIPrep used to process the BIDS input data and output the BIDS derivative data. 6 | 7 | Further documentation for these pipelines can be found at their respective links: 8 | [abcd-hcp-pipeline](https://hub.docker.com/r/dcanlabs/abcd-hcp-pipeline) 9 | [fMRIPrep](https://fmriprep.org/) 10 | [QSIPrep](https://qsiprep.readthedocs.io/en/stable/) 11 | 12 | ## 2. ABCD-BIDS Pipeline 13 | 14 | The ABCD-BIDS pipeline is available on [GitHub](https://github.com/ABCD-STUDY/abcd-hcp-pipeline), [OSF](https://doi.org/10.17605/OSF.IO/89PYD), and [DockerHub](https://hub.docker.com/r/dcanlabs/abcd-hcp-pipeline) at the time of this release as the `abcd-hcp-pipeline`.
It is a [BIDS App](https://bids-apps.neuroimaging.io/about/) which takes BIDS input data and uses the methods from both the [Human Connectome Project's minimal preprocessing pipeline](https://doi.org/10.1016/j.neuroimage.2013.04.127) and the [DCAN Labs resting state fMRI analysis tools](https://github.com/DCAN-Labs/dcan_bold_processing) to output preprocessed MRI data in both volume and surface spaces. 15 | 16 | It has been designed to be as BIDS compliant and user friendly as possible. While it has been used here specifically to process the ABCD data, it can be run by any investigator to process a wide variety of BIDS input MRI data as long as the data set contains a T1w image. 17 | 18 | Each stage of the larger pipeline has a distinct beginning and ending which is why we consider them stages. The pipeline completes each step in serial, though some steps can utilize multiple processor cores to speed up processing time. Below is a short explanation of each stage's intent and some of the methods. 19 | 20 | For full details read the following references: 21 | 22 | 1. [The minimal preprocessing pipelines for the Human Connectome Project. Glasser, et al. NeuroImage. 2013.](https://doi.org/10.1016/j.neuroimage.2013.04.127) 23 | 2. [Correction of respiratory artifacts in MRI head motion estimates. Fair, et al. NeuroImage. 2019.](https://doi.org/10.1016/j.neuroimage.2019.116400) 24 | 3. [Adolescent Brain Cognitive Development (ABCD) Community MRI Collection and Utilities. Feczko, et al. Biorxiv, 2021](https://www.biorxiv.org/content/10.1101/2021.07.09.451638v1) 25 | 26 | 27 | ### Stage 1: PreFreeSurfer 28 | 29 | The primary goal of PreFreeSurfer is to remove distortions from the anatomical data and align and extract the brain in the subject's native volume space. 
30 | 31 | Notable deviations from the original HCP minimal preprocessing pipeline include the use of [Advanced Normalization Tools (ANTs)](http://stnava.github.io/ANTs/) to perform denoising and N4 bias field correction, which significantly improves results for subjects scanned on General Electric (GE) and Philips scanners that tend to have more noise and are not always normalized following the scan. We moved Montreal Neurological Institute (MNI) standard space registration to the PostFreeSurfer stage to make use of the refined brain mask created in the FreeSurfer stage. 32 | 33 | We also enabled the ability to process subjects without a T2w image. A T2w image is necessary for myelin mapping estimates, but it does not make a big enough impact on other outputs such as segmentation and surface generation to require it for all image processing. 34 | 35 | ### Stage 2: FreeSurfer 36 | 37 | The brain gets segmented into predefined structures, the white and pial cortical surfaces are reconstructed, and [FreeSurfer's](https://surfer.nmr.mgh.harvard.edu/) standard folding-based surface registration to FreeSurfer's surface atlas is performed. This stage remains largely unchanged from the original HCP minimal preprocessing pipeline so refer to [Glasser, et al. 2013](https://doi.org/10.1016/j.neuroimage.2013.04.127) for more information. 38 | 39 | ### Stage 3: PostFreeSurfer 40 | 41 | The primary function of the PostFreeSurfer stage is generating CIFTI surface files and applying surface registration to the Conte-69 surface template. In addition to this, atlas registration is performed. By using the refined native space brain mask generated in FreeSurfer the pipeline is able to produce a more robust registration. Furthermore we found that ANTs' diffeomorphic symmetric image normalization method for registration outperforms FSL's FNIRT-based registration. 
Outputs from the GE and Philips scanners benefit from this modification as their input anatomical image contrast was less clear than the normalized from-the-scanner Siemens images (discussed in these release notes' document 2: [Inputs](https://collection3165.readthedocs.io/en/stable/inputs/)). 42 | 43 | ### Stage 4: FMRIVolume 44 | 45 | The FMRIVolume stage marks the start of the functional part of the image processing pipeline and it begins similarly to the anatomical portion with correction of gradient-nonlinearity-induced distortions to the EPIs. Each volume of the time series is aligned using an [FSL FLIRT](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FLIRT) rigid-body (six degrees-of-freedom) registration to the initial frame to correct for motion. The initial frame is used because it tends to have greater anatomical contrast. The registration of each volume results in a twelve column text file containing translation and rotation along each axis and their derivatives. A de-meaned and linearly de-trended motion parameter file is provided as well for nuisance regression. 46 | 47 | One pair of spin echo EPI scans with opposite phase encoding directions are used to correct for distortions in the phase encoding direction of each fMRI volume using [FSL's topup](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/topup). Only a single pair of spin echo field maps is used for all EPI images despite a pair being taken for each EPI image. This pair is denoted by the "IntendedFor" field populated in the JSON sidecar metadata associated among field maps. After removing the distortions, the single-band reference is registered to the T1w image and this registration is used to align all fMRI volumes to the anatomical data independently. Each fMRI volume is non-linearly registered to MNI space and finally masked. 48 | 49 | ### Stage 5: FMRISurface 50 | 51 | The purpose of the FMRISurface stage is primarily to take a volume time series and map it to the standard CIFTI grayordinates space. 
This stage has not been altered from the original pipeline so refer to [Glasser, et al. 2013](https://doi.org/10.1016/j.neuroimage.2013.04.127) for more information. 52 | 53 | ### Stage 6: DCANBOLDProcessing (DBP) 54 | 55 | [DCAN BOLD Processing](https://github.com/DCAN-Labs/dcan_bold_processing) is a signal processing software developed primarily by Dr. Oscar Miranda-Dominguez in the DCAN Labs with the primary function of nuisance regression from the dense time series and providing motion censoring information in accordance with [Power, et al. 2014](https://www.ncbi.nlm.nih.gov/pubmed/23994314). The motion numbers produced in the FMRIVolume stage are also filtered to remove artifactual motion caused by respiration. For more information on the respiration filtering see [Correction of respiratory artifacts in MRI head motion estimates. Fair, et al. NeuroImage. 2019.](https://doi.org/10.1016/j.neuroimage.2019.116400). 56 | 57 | This stage involves four broad steps: 58 | 59 | 1. Standard pre-processing 60 | 1. Application of a respiratory motion filter 61 | 1. Motion censoring followed by standard re-processing 62 | 1. Construction of parcellated timeseries 63 | 64 | #### 1. DBP Standard pre-processing 65 | 66 | Standard pre-processing comprises three steps. First all fMRI data are de-meaned and de-trended with respect to time. Next a general linear model is used to denoise the processed fMRI data. Denoising regressors comprise signal and movement variables. Signal variables comprise mean time series for white matter, CSF, and the global signal, which are derived from individualized segmentations generated during PostFreesurfer in ACPC-aligned subject native space. Movement variables comprise translational (X,Y,Z) and rotational (roll, pitch, and yaw) measures estimated by re-alignment during FMRIVolume and their Volterra expansion. 
The inclusion of mean greyordinate timeseries regression is critical for most resting-state functional MRI comparisons, as demonstrated empirically by multiple independent labs (Ciric et al., 2017; Power et al., 2017, 2019b; Satterthwaite et al., 2013). After denoising the fMRI data, the time series are band-pass filtered between 0.008 and 0.09 Hz using a 2nd order Butterworth filter. Such a band-pass filter is softer than other filters, and avoids potential aliasing of the time series signal. 67 | 68 | ##### On Global Signal Regression 69 | 70 | Global signal regression (GSR) has been consistently shown to reduce the effects of motion on BOLD signals and eliminate known batch effects that directly impact group comparisons (Ciric et al., 2017; Power et al., 2015, 2019b). Motion censoring (see below) combined with GSR has been shown to be the best existing method for eliminating artifacts produced by motion. 71 | 72 | #### 2. DBP Respiratory Motion Filter 73 | 74 | In working with ABCD data, we have found that a respiratory artifact is produced within multi-band data (Fair et al., 2020). While this artifact occurs outside the brain, it can affect estimates of frame alignment, leading to inappropriate motion censoring. By filtering the frequencies (18.582 to 25.726 breaths per minute) of the respiratory signal from the motion realignment data, our respiratory motion filter produces better estimates of FD. 75 | 76 | #### 3. DBP Motion censoring 77 | 78 | Our motion censoring procedure is used for performing the standard pre-processing and for the final construction of parcellated timeseries. For standard pre-processing, data are labeled as "bad" frames if they exceed an FD threshold of 0.3 mm. Such "bad" frames are removed when demeaning and detrending, and betas for the denoising are calculated using only the "good" frames. For band-pass filtering, interpolation is used initially to replace the "bad" frames and the residuals are extracted from the denoising GLM. 
In such a way, standard pre-processing of the timeseries only uses the "good" data but avoids potential aliasing due to missing timepoints. After motion censoring, timepoints are further censored using an outlier detection approach. Both a mask including outlier detection and a mask without outlier detection are created. [These masks](https://collection3165.readthedocs.io/en/stable/derivatives/#7-motion-mat-file) are HDF5 compatible .MAT files, which contain temporal masks from 0 ("No censoring") to 0.5 mm FD thresholds in steps of 0.01 mm. 79 | 80 | #### 4. DBP Generation of parcellated timeseries for specific atlases 81 | 82 | Using the processed resting-state fMRI data, this stage constructs parcellated time series for pre-defined atlases making it easy to construct correlation matrices or perform time series analysis on putative brain areas defined by independent datasets. The atlases comprise recent parcellations of brain regions that comprise different networks. In particular, parcellated timeseries are extracted for Evan Gordon’s 333 ROI atlas template (Gordon et al., 2014), Jonathan Power’s 264 ROI atlas template (Power et al., 2011), Thomas Yeo’s 118 ROI atlas template (Yeo et al., 2011), and the HCP’s 360 ROI atlas template (Glasser et al., 2016). These parcellations also include 19 individualized subcortical parcellations. Since we anticipate newer parcellated atlases as data acquisition, analytic techniques, and knowledge all improve, it is trivial to add new templates for this final stage. 83 | 84 | ### Stage 7: ExecutiveSummary 85 | 86 | The ExecutiveSummary stage produces an HTML visual quality control page that displays a [BrainSprite](https://github.com/simexp/brainsprite.js) viewer of the T1w and T2w segmentation, an overlay of the atlas registration on each single band reference created by FSL's slicer, and a visualization of the movement and grayordinate time series for each fMRI run pre- and post-regression. 
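
Returning to the motion censoring masks produced in Stage 6 (DCANBOLDProcessing), the following is a minimal Python sketch of two ideas from that description: locating a given FD threshold in the 0 to 0.5 mm grid (steps of 0.01 mm), and applying a "five contiguous frames" rule to a framewise displacement trace. It is not the DCAN BOLD Processing implementation. The function names, the `<=` comparison at the threshold, and the treatment of short runs at the start and end of a series are assumptions made for illustration, and the sketch does not model the "No censoring" meaning of the 0 mm entry or the outlier detection step.

```python
FD_STEP_MM = 0.01    # spacing between FD thresholds, as stated in the text
N_THRESHOLDS = 51    # 0.00, 0.01, ..., 0.50 mm
MIN_RUN = 5          # minimum number of contiguous retained frames


def threshold_index(fd_mm):
    """Zero-based position of an FD threshold in the 0 to 0.5 mm grid.

    Hypothetical helper: a MATLAB cell is one-based, so the matching
    cell entry would be this value plus one.
    """
    idx = int(round(fd_mm / FD_STEP_MM))
    on_grid = abs(idx * FD_STEP_MM - fd_mm) < 1e-9
    if not (0 <= idx < N_THRESHOLDS and on_grid):
        raise ValueError(f"{fd_mm} mm is not on the 0 to 0.5 mm grid")
    return idx


def censor_frames(fd, fd_threshold_mm, min_run=MIN_RUN):
    """Return a list of booleans, True for frames to keep.

    Assumption: a frame is retained when its FD is at or below the
    threshold. Runs of retained frames shorter than `min_run` are then
    censored as well, including runs at the edges of the series.
    """
    keep = [value <= fd_threshold_mm for value in fd]
    run_start = None
    # Walk one position past the end so the final run is also closed.
    for i in range(len(keep) + 1):
        retained = i < len(keep) and keep[i]
        if retained and run_start is None:
            run_start = i
        elif not retained and run_start is not None:
            if i - run_start < min_run:
                for j in range(run_start, i):
                    keep[j] = False
            run_start = None
    return keep


if __name__ == "__main__":
    # Made-up FD values (mm) for illustration only.
    fd_trace = [0.1, 0.1, 0.6, 0.1, 0.1, 0.1, 0.6, 0.1, 0.1, 0.1, 0.1, 0.1]
    print(threshold_index(0.2))
    print(censor_frames(fd_trace, 0.3))
```

In this made-up trace, the two short below-threshold segments (two and three frames long) are censored along with the two high-motion frames, and only the final five-frame segment is retained.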
87 | 88 | ### Stage 8: CustomClean 89 | 90 | [Custom clean](https://github.com/DCAN-Labs/CustomClean) is an optional (though recommended) stage that is especially useful when processing large volumes of data. Custom clean removes some non-critical pipeline outputs to minimize the footprint of a subject's processed dataset. 91 | 92 | ### Stage 9: FileMapper 93 | 94 | [File Mapper](https://github.com/DCAN-Labs/file-mapper) is responsible for mapping the HCP pipeline outputs into valid BIDS derivatives. 95 | 96 | ## abcd-bids-tfmri 97 | 98 | [abcd-bids-tfmri](https://github.com/DCAN-Labs/abcd-bids-tfmri-pipeline), a modified version of the TaskfMRIAnalysis stage of the HCP-pipeline (Glasser et al., 2013) developed at University of Vermont by Anthony Juliano, was used to process task fMRI data from the minimally processed ABCD-BIDS (Feczko et al., 2020b) processing pipeline (v.1.0) data, as well as derived ABCC data (Feczko, 2020; ABCD-3165). Given the abcd-bids-tfmri pipeline's focus on reproducibility in neuroimaging, it requires minimal user input while remaining flexible with regard to the task-based fMRI data that can be processed (including the type of task and the number of subject-level runs). The pipeline also supports transparency, as users can share the command line that was used to process their data when presenting their findings. 99 | 100 | Given its focus on CIFTI (like a dtseries) data, the abcd-bids-tfmri pipeline relies heavily on HCP workbench commands (https://www.humanconnectome.org/software/workbench-command). This includes completing the user-specified spatial smoothing (wb_command -cifti-smoothing), converting the smoothed data to and from a format that FSL (Jenkinson et al. 
2012) can interpret (wb_command -cifti-convert), separating the dtseries data into its comprised components (wb_command -cifti-separate-all), and reading in pertinent information from the dtseries data (wb_command -file-information), among others. Based on the user-specified parameters for censoring volumes (i.e. initial and/or high-motion frames), the pipeline will read in the filtered motion file (Fair et al., 2020) produced by the ABCD-BIDS processing pipeline and create a matrix for nuisance regression. Finally, high-pass filtering, with a cutoff of 0.005 Hz (200 seconds), is completed before running FSL's FILM (Woolrich et al. 2001). 101 | 102 | For FILM to run, users must supply their own subject-, task-, and run-specific event timing files that are in the FSL standard three column format (i.e. onset, duration, weight/magnitude). Additionally, users need to supply a task-specific fsf template file per task that they will be processing using the abcd-bids-tfmri pipeline. As the abcd-bids-tfmri pipeline modifies this template to make it subject- and run-specific, certain values need to be replaced with specific variables that the abcd-bids-tfmri pipeline will be able to recognize. An example fsf file template for ABCD’s MID task is made available for users to review on ABCC (https://osf.io/psv5m/). 103 | 104 | Users can specify which task data they would like to process by providing a list of task names within the abcd-bids-tfmri pipeline’s command line interface. If the user specifies multiple runs of the task, the pipeline will complete higher-level analyses (i.e. fixed effects modeling) to combine a given subject's run-level data. Therefore if a study has three different fMRI tasks that consist of two runs, all six level 1 analyses and all three level 2 analyses can be completed for a subject with a single run of the abcd-bids-tfmri pipeline. 
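
As a rough illustration of the inputs and bookkeeping described above, here is a minimal Python sketch that formats events in the FSL three-column layout (onset, duration, weight) and counts the level 1 and level 2 analyses implied by a set of tasks and runs. The function names, event values, and output filename are hypothetical, and the counting rule (one level 2 analysis per task that has more than one run) is an assumption drawn from the paragraph above, not from the pipeline's code.

```python
def format_three_column_ev(events):
    """Format (onset_s, duration_s, weight) tuples as three-column text.

    Each event becomes one whitespace-separated line, which is the
    layout the text refers to as the FSL standard three column format.
    """
    lines = [
        f"{onset:.3f}\t{duration:.3f}\t{weight:.3f}"
        for onset, duration, weight in events
    ]
    return "\n".join(lines) + "\n"


def count_analyses(runs_per_task):
    """Return (level 1 count, level 2 count) for a {task: n_runs} mapping.

    Assumption: every run gets a level 1 analysis, and every task with
    more than one run gets one level 2 (fixed effects) analysis.
    """
    level1 = sum(runs_per_task.values())
    level2 = sum(1 for n_runs in runs_per_task.values() if n_runs > 1)
    return level1, level2


if __name__ == "__main__":
    # Made-up event timings (seconds) for a single condition in one run.
    example_events = [(12.0, 2.0, 1.0), (30.5, 2.0, 1.0)]
    print(format_three_column_ev(example_events), end="")

    # Three tasks with two runs each, as in the example in the text.
    print(count_analyses({"MID": 2, "nback": 2, "SST": 2}))
```

The returned text can be saved with, for example, `pathlib.Path("sub-01_task-MID_run-01_cue.txt").write_text(...)`, where the filename is purely illustrative; one such file is needed per subject, task, run, and condition.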
105 | 106 | The outputs of the abcd-bids-tfmri pipeline include the fully-processed dtseries data that are subsequently ready for the user to perform their desired third-level or group-wise analyses. 107 | 108 | ## fMRIPrep 109 | 110 | fMRIPrep is a tool for preprocessing BIDS compatible fMRI datasets. If groups would like to analyze the ABCD fMRI results, these outputs will be helpful for analysis of resting state and task based fMRI data. This is the command that was used: 111 | 112 | ```bash 113 | singularity run --cleanenv /data/ABCD_MBDU/singularity_images/fmriprep_20.2.0.simg \ 114 | /data/ABCD_MBDU/abcd_bids/bids \ 115 | $TMPDIR/out \ 116 | participant \ 117 | --participant_label $PARTICIPANTID \ 118 | -w $TMPDIR/wrk \ 119 | --nthreads $SLURM_CPUS_PER_TASK \ 120 | --mem_mb $SLURM_MEM_PER_NODE \ 121 | --fs-license-file /data/ABCD_MBDU/singularity_images/license.txt \ 122 | --output-spaces MNI152NLin2009cAsym:res-2 fsnative fsaverage5 fsLR \ 123 | --cifti-output \ 124 | --skip-bids-validation \ 125 | --notrack \ 126 | --omp-nthreads 1 127 | ``` 128 | 129 | Any papers using outputs from this pipeline should acknowledge this contribution of computational resources with the following line: 130 | 131 | “This work used the computational resources of the NIH HPC (high-performance computing) Biowulf cluster ().” 132 | 133 | ## QSIPrep 134 | 135 | QSIPrep configures pipelines for processing diffusion-weighted MRI (dMRI or DWI) data. For more information see the [QSIPrep documentation](https://qsiprep.readthedocs.io/en/latest/). 
This is the command used to run ABCC subjects through QSIPrep preprocessing: 136 | 137 | ```bash 138 | singularity run --cleanenv -B ${PWD} \ 139 | pennlinc-containers/.datalad/environments/qsiprep-0-16-1/image \ 140 | inputs/data \ 141 | prep \ 142 | participant \ 143 | -w ${PWD}/.git/wkdir \ 144 | --n_cpus 8 \ 145 | --stop-on-first-crash \ 146 | --fs-license-file code/license.txt \ 147 | --skip-bids-validation \ 148 | --participant-label "$subid" \ 149 | --unringing-method mrdegibbs \ 150 | --output-resolution 1.7 \ 151 | --eddy-config code/eddy_params.json \ 152 | --notrack 153 | ``` 154 | 155 | Contents of code/eddy_params.json 156 | 157 | ```json 158 | { 159 | "flm": "linear", 160 | "slm": "linear", 161 | "fep": false, 162 | "interp": "spline", 163 | "nvoxhp": 1000, 164 | "fudge_factor": 10, 165 | "dont_sep_offs_move": false, 166 | "dont_peas": false, 167 | "niter": 5, 168 | "method": "jac", 169 | "repol": true, 170 | "num_threads": 1, 171 | "is_shelled": true, 172 | "use_cuda": false, 173 | "cnr_maps": true, 174 | "residuals": false, 175 | "output_type": "NIFTI_GZ", 176 | "args": "" 177 | } 178 | ``` 179 | --------------------------------------------------------------------------------