├── .github └── workflows │ └── automated_tests.yml ├── .gitignore ├── .gitlab-ci.yml ├── LICENSE ├── Makefile ├── README.md ├── bin └── assert.sh ├── main.nf ├── modules ├── 01_mutect2.nf ├── 02_learn_read_orientation.nf ├── 03_pileup_summary.nf ├── 04_calculate_contamination.nf ├── 05_filter_calls.nf └── 06_annotate.nf ├── mutect2_pon.nf ├── nextflow.config ├── test_data ├── SRR8244836.preprocessed.downsampled.bam ├── SRR8244836.preprocessed.downsampled.bam.bai ├── SRR8244887.preprocessed.downsampled.bam ├── SRR8244887.preprocessed.downsampled.bam.bai ├── gnomad.minimal.vcf.gz ├── gnomad.minimal.vcf.gz.tbi ├── intervals.minimal.bed ├── test_input.txt ├── test_input_with_replicates.txt ├── ucsc.hg19.minimal.dict ├── ucsc.hg19.minimal.fasta └── ucsc.hg19.minimal.fasta.fai └── tests ├── test_00.sh ├── test_01.sh ├── test_02.sh ├── test_03.sh ├── test_04.sh ├── test_05.sh ├── test_06.sh ├── test_07.sh ├── test_08.sh ├── test_09.sh └── test_10.sh /.github/workflows/automated_tests.yml: -------------------------------------------------------------------------------- 1 | name: Automated tests 2 | 3 | on: [push] 4 | 5 | jobs: 6 | test: 7 | runs-on: ubuntu-20.04 8 | 9 | steps: 10 | - uses: actions/checkout@v2 11 | - uses: actions/setup-java@v3 12 | with: 13 | distribution: 'zulu' # See 'Supported distributions' for available options 14 | java-version: '11' 15 | - uses: conda-incubator/setup-miniconda@v2 16 | with: 17 | auto-update-conda: true 18 | channels: defaults,conda-forge,bioconda 19 | - name: Install dependencies 20 | run: | 21 | apt-get update && apt-get --assume-yes install wget make procps software-properties-common 22 | wget -qO- https://get.nextflow.io | bash && cp nextflow /usr/local/bin/nextflow 23 | - name: Cache conda environments 24 | uses: actions/cache@v2 25 | with: 26 | path: | 27 | /home/runner/work/tronflow-mutect2/tronflow-mutect2/work/conda 28 | key: ${{ runner.os }}-tronflow-mutect2 29 | - name: Run tests 30 | run: | 31 | make 32 | 
-------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Default ignored files 2 | /shelf/ 3 | /workspace.xml 4 | work 5 | .nextflow.log* 6 | report.html 7 | timeline.html 8 | trace.txt 9 | dag.dot 10 | .nextflow 11 | .idea 12 | output 13 | -------------------------------------------------------------------------------- /.gitlab-ci.yml: -------------------------------------------------------------------------------- 1 | image: openjdk:11.0.10-jre-buster 2 | 3 | 4 | before_script: 5 | - java -version 6 | - apt-get update && apt-get --assume-yes install wget make procps 7 | - wget -qO- https://get.nextflow.io | bash && cp nextflow /usr/local/bin/nextflow 8 | - nextflow help 9 | - wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh 10 | - mkdir /root/.conda 11 | - bash Miniconda3-latest-Linux-x86_64.sh -b && cp /root/miniconda3/bin/* /usr/local/bin/ 12 | - rm -f Miniconda3-latest-Linux-x86_64.sh 13 | - conda --version 14 | 15 | stages: 16 | - test 17 | 18 | test: 19 | stage: test 20 | script: 21 | - make clean test 22 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 TRON 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the 
Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | 2 | all : clean test 3 | 4 | clean: 5 | rm -rf output 6 | #rm -rf work 7 | rm -f report.html* 8 | rm -f timeline.html* 9 | rm -f trace.txt* 10 | rm -f dag.dot* 11 | rm -f .nextflow.log* 12 | rm -rf .nextflow* 13 | 14 | 15 | test: 16 | bash tests/test_00.sh 17 | bash tests/test_01.sh 18 | bash tests/test_02.sh 19 | bash tests/test_03.sh 20 | bash tests/test_04.sh 21 | bash tests/test_05.sh 22 | bash tests/test_06.sh 23 | bash tests/test_08.sh 24 | bash tests/test_09.sh 25 | bash tests/test_10.sh 26 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # TronFlow Mutect2 2 | 3 | ![GitHub tag (latest SemVer)](https://img.shields.io/github/v/release/tron-bioinformatics/tronflow-mutect2?sort=semver) 4 | [![Run tests](https://github.com/TRON-Bioinformatics/tronflow-mutect2/actions/workflows/automated_tests.yml/badge.svg?branch=master)](https://github.com/TRON-Bioinformatics/tronflow-mutect2/actions/workflows/automated_tests.yml) 5 | [![DOI](https://zenodo.org/badge/355860788.svg)](https://zenodo.org/badge/latestdoi/355860788) 6 | [![License](https://img.shields.io/badge/license-MIT-green)](https://opensource.org/licenses/MIT) 7 | [![Powered by 
Nextflow](https://img.shields.io/badge/powered%20by-Nextflow-orange.svg?style=flat&colorA=E1523D&colorB=007D8A)](https://www.nextflow.io/) 8 | 9 | The TronFlow Mutect2 pipeline is part of a collection of computational workflows for tumor-normal pair somatic variant calling. 10 | 11 | Find the documentation here [![Documentation Status](https://readthedocs.org/projects/tronflow-docs/badge/?version=latest)](https://tronflow-docs.readthedocs.io/en/latest/?badge=latest) 12 | 13 | 14 | This workflow implements the Mutect2 (Benjamin, 2019) best practices somatic variant calling of tumor-normal pairs. 15 | ![Mutect2 best practices](https://drive.google.com/uc?id=1rDDE0v_F2YCeXfQnS00w0MY3cAGQvfho) 16 | 17 | It has the following steps: 18 | * **Mutect2** - the somatic variant caller. 19 | * **Learn read orientation model** - learn the prior probability of read orientation artifacts. 20 | * **Pile-up summaries** - summarizes counts of reads that support reference, alternate and other alleles for given sites (optional). 21 | * **Calculate contamination** - Given pileup data from GetPileupSummaries, calculates the fraction of reads coming from cross-sample contamination (optional). 
22 | * **Filter calls** - filters mutations from the raw Mutect2 variant calls 23 | * **Funcotator annotation** - add functional annotations (optional) 24 | 25 | 26 | ## How to run it 27 | 28 | Run it from GitHub as follows: 29 | ``` 30 | nextflow run tron-bioinformatics/tronflow-mutect2 -r v1.4.0 -profile conda --input_files $input --reference $reference --gnomad $gnomad 31 | ``` 32 | 33 | Otherwise download the project and run as follows: 34 | ``` 35 | nextflow main.nf -profile conda --input_files $input --reference $reference --gnomad $gnomad 36 | ``` 37 | 38 | Find the help as follows: 39 | ``` 40 | $ nextflow run tron-bioinformatics/tronflow-mutect2 --help 41 | 42 | Usage: 43 | nextflow run tron-bioinformatics/tronflow-mutect2 -profile conda --input_files input_files [--reference reference.fasta] 44 | 45 | This workflow is based on the implementation at /code/iCaM/scripts/mutect2_ID.sh 46 | 47 | Input: 48 | * input_files: the path to a tab-separated values file containing in each row the sample name, tumor bam and normal bam 49 | The input file does not have header! 50 | Example input file: 51 | name1 tumor_bam1 normal_bam1 52 | name2 tumor_bam2 normal_bam2 53 | * reference: path to the FASTA genome reference (indexes expected *.fai, *.dict) 54 | 55 | Optional input: 56 | * input_name: sample name (alternative to --input_files) 57 | * input_tumor_bam: comma separated list of tumor BAMs (alternative to --input_files) 58 | * input_normal_bam: comma separated list of normal BAMs (alternative to --input_files) 59 | * gnomad: path to the gnomad VCF or other germline resource (recommended). 
If not provided the contamination will 60 | not be estimated and the filter of common germline variants will be disabled 61 | * pon: path to the panel of normals VCF 62 | * intervals: path to a BED file containing the regions to analyse 63 | * output: the folder where to publish output (default: output) 64 | * enable_bam_output: outputs a new BAM file with the Mutect2 reassembly of reads (default: false) 65 | * disable_common_germline_filter: disable the use of GnomAD to filter out common variants in the population 66 | from the somatic calls. The GnomAD can still be provided though as this common SNPs are used elsewhere to 67 | calculate the contamination (default: false) 68 | * funcotator: To use Funcotator, supply the path to a database to be used. (can be downloaded from GATK FTP server) 69 | * reference_version_funcotator: version of the reference genome (default: "hg19") 70 | * output_format_funcotator: the output format of Funcotator. Can be VCF or MAF (default: "MAF") 71 | * transcript_selection_mode_funcotator: transcript selection method can be CANONICAL, BEST_EFFECT or ALL. (default: CANONICAL) 72 | * memory_mutect2: the ammount of memory used by mutect2 (default: 16g) 73 | * memory_read_orientation: the ammount of memory used by learn read orientation (default: 16g) 74 | * memory_pileup: the ammount of memory used by pileup (default: 32g) 75 | * memory_contamination: the ammount of memory used by contamination (default: 16g) 76 | * memory_filter: the ammount of memory used by filter (default: 16g) 77 | * memory_funcotator: the ammount of memory used by filter (default: 16g) 78 | * args_filter: optional arguments to the FilterMutectCalls function of GATK (e.g.: "--contamination-estimate 0.4 --min-allele-fraction 0.05 --min-reads-per-strand 1 --unique-alt-read-count 4") (see FilterMutectCalls documentation) 79 | * args_funcotator: optional arguments to Funcotator (e.g. 
"--remove-filtered-variants true") (see Funcotator documentation) 80 | * args_mutect2: optional arguments to Mutect2 (e.g. "--sites-only-vcf-output") (see Mutect2 documentation) 81 | 82 | Output: 83 | * Output VCF 84 | * Other intermediate files 85 | ``` 86 | 87 | 88 | ### Input tables 89 | 90 | The table with BAM files expects three tab-separated columns without a header. 91 | Multiple tumor or normal BAMs can be provided separated by commas. 92 | 93 | | Sample name | Tumor BAMs | Normal BAMs | 94 | |----------------------|---------------------------------|------------------------------| 95 | | sample_1 | /path/to/sample_1_tumor.bam | /path/to/sample_1_normal.bam | 96 | | sample_2 | /path/to/sample_2_tumor_1.bam,/path/to/sample_2_tumor_2.bam | /path/to/sample_2_normal.bam,/path/to/sample_2_normal_2.bam | 97 | 98 | ### About read group tags in BAM headers 99 | 100 | Mutect2 relies on several read group tags being present in the BAM header. 101 | The compulsory tags are: `RG:ID`, `RG:PU`, `RG:SM`, `RG:PL` and `RG:LB`. 102 | If your BAM files do not have these read group tags, use 103 | [Picard's AddOrReplaceReadGroups](https://gatk.broadinstitute.org/hc/en-us/articles/360037226472-AddOrReplaceReadGroups-Picard-). 104 | 105 | There are some further constraints on the sample tag (`RG:SM`), which is used to distinguish normal and tumor samples. 106 | This workflow expects that tumor and normal BAMs have different values of `RG:SM`; 107 | and when replicates are provided, all normal BAMs must share the same `RG:SM`, and likewise all tumor BAMs. 108 | The workflow will fail if these constraints are not met. 109 | 110 | 111 | ## Resources 112 | 113 | - FASTA reference genome with fai and dict indexes (see https://gatk.broadinstitute.org/hc/en-us/articles/360035531652-FASTA-Reference-genome-format for instructions on building the indices) 114 | - Analysis intervals file in BED format. 
These intervals determine the regions where variants will be called 115 | - VCF file with common germline variants (see [GnomAD](#gnomad)) 116 | - Optionally, a panel of normals (PON) may be used (see [PON](#panel-of-normals-pon)) 117 | 118 | ### GnomAD 119 | 120 | GnomAD (Karczewski, 2020) is the de facto standard database of population allele frequencies for germline variants. Mutect2 uses GnomAD as prior knowledge to reject potential germline variants, and it also uses its common SNPs to estimate contamination. 121 | 122 | To disable the use of GnomAD for filtering common germline variants out of the somatic calls, use `--disable_common_germline_filter`; GnomAD will still be used to estimate the contamination. 123 | 124 | This resource has a total of 14,967,411 variants, of which 14,078,157 are SNVs and 889,254 are indels. No variants are reported in the mitochondrial chromosome. Frequencies are mostly low as expected, although there are some variants with a frequency of 1.0. Overall, we have 95,542 common SNVs (i.e. AF > 5%), 96,599 low frequency SNVs (i.e. 0.5% <= AF <= 5%) and 13,886,016 rare SNVs (AF < 0.5%) (of which 13,703,545 have AF < 0.1%); and 13,200 common indels, 12,444 low frequency indels and 863,610 rare indels. 125 | 126 | GnomAD v2.1 for the coding region can be downloaded from https://storage.googleapis.com/gnomad-public/release/2.1.1/vcf/exomes/gnomad.exomes.r2.1.1.sites.vcf.bgz 127 | 128 | The details on this release are described here https://macarthurlab.org/2018/10/17/gnomad-v2-1/ 129 | 130 | We keep only the variants passing all filters and remove all annotations except AC, AF and AN, as Mutect2 does not use any other annotations such as population-specific frequencies. 
131 | ``` 132 | bcftools annotate --include 'FILTER="PASS"' --remove ^INFO/AC,INFO/AF,INFO/AN /projects/data/gatk_bundle/b37/gnomad.exomes.r2.1.1.sites.vcf.bgz --output-type z --output /projects/data/gatk_bundle/b37/gnomad.exomes.r2.1.1.sites.PASS.only_af.vcf.bgz --threads 4 133 | ``` 134 | 135 | The GnomAD file uses b37 coordinates, so it may need to be lifted over, for instance to hg19. Liftover chain files can be found here: ftp://gsapubftp-anonymous@ftp.broadinstitute.org/Liftover_Chain_Files. 136 | ``` 137 | java -jar /code/picard/2.21.2/picard.jar LiftoverVcf INPUT=/projects/data/gatk_bundle/b37/gnomad.exomes.r2.1.1.sites.PASS.only_af.vcf.bgz OUTPUT=/projects/data/gatk_bundle/hg19/gnomad.exomes. 138 | ``` 139 | 140 | ### Panel Of Normals (PON) 141 | 142 | The PON is used to filter out technical artifacts from the somatic variant calls. The PON should ideally be built from technically similar samples (i.e. same sequencing platform, same sample preparation) taken from healthy, young individuals, and should comprise at least 40 samples (see https://gatkforums.broadinstitute.org/gatk/discussion/11053/panel-of-normals-pon ). 143 | 144 | The normal samples are processed by the BAM preprocessing pipeline including marking duplicates and BQSR. 145 | 146 | Run Mutect2 on each normal sample as follows: 147 | 148 | ``` 149 | java -Xmx16g -jar /code/gatk/4.1.3.0/gatk-package-4.1.3.0-local.jar \ 150 | Mutect2 \ 151 | --reference ${params.reference} \ 152 | --intervals ${interval} \ 153 | --input ${bam} \ 154 | --tumor-sample ${name} \ 155 | --max-mnp-distance 0 \ 156 | --output ${bam.baseName}.${interval.baseName}.mutect.vcf 157 | ``` 158 | 159 | Note that the parameter "--max-mnp-distance 0" is needed to prevent MNPs from being called. 160 | 161 | The resulting per-sample VCFs then need to be combined with the GATK tool "CreateSomaticPanelOfNormals". 162 | 163 | This is implemented in the pipeline `mutect2_pon.nf`. 
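The combining step can be sketched as a dry run. This is a minimal illustration, not the code of the PON pipeline in this repository, which may do it differently. It assumes a recent GATK 4 release in which `CreateSomaticPanelOfNormals` reads from a GenomicsDB workspace created by `GenomicsDBImport`. The helper `print_pon_commands`, the workspace name `pon_db`, the output name `pon.vcf.gz` and all input file names are hypothetical placeholders. The function only prints the commands so they can be reviewed before anything is run.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints (does not execute) the two GATK commands that
# combine per-normal Mutect2 VCFs into a panel of normals.
# Usage: print_pon_commands <reference.fasta> <intervals> <normal.vcf>...
print_pon_commands() {
    local reference="$1"   # FASTA reference used for the Mutect2 calls
    local intervals="$2"   # same intervals file used for the Mutect2 calls
    shift 2

    # Build one "-V <vcf>" argument per normal sample VCF
    local vcf_args=""
    local vcf
    for vcf in "$@"; do
        vcf_args+=" -V ${vcf}"
    done

    # Step 1: import all normal VCFs into a GenomicsDB workspace (assumed name: pon_db)
    echo "gatk GenomicsDBImport -R ${reference} -L ${intervals} --genomicsdb-workspace-path pon_db${vcf_args}"
    # Step 2: create the panel of normals from that workspace
    echo "gatk CreateSomaticPanelOfNormals -R ${reference} -V gendb://pon_db -O pon.vcf.gz"
}

print_pon_commands reference.fasta intervals.bed normal_1.mutect.vcf normal_2.mutect.vcf
```

Printing the commands instead of running them keeps the sketch independent of a local GATK installation; piping the output to `bash` would execute them once the paths are real.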
164 | 165 | Once the panel of normals is created pass it to the workflow using the parameter `--pon`. 166 | 167 | ### Configuring Funcotator 168 | 169 | Funcotator annotation is an optional step. To configure funcotator follow the indications here https://gatk.broadinstitute.org/hc/en-us/articles/360035889931-Funcotator-Information-and-Tutorial. 170 | 171 | In order to use funcotator provide the path to your local funcotator database with the parameter `--funcotator`. 172 | Also, make sure that the reference version provided to funcotator with `--reference_version_funcotator` is consistent with the provided reference with `--reference`. 173 | 174 | 175 | ## How to run the Panel of Normals (PON) pipeline 176 | 177 | ``` 178 | $ nextflow mutect2_pon.nf --help 179 | Usage: 180 | mutect2_pon.nf --input_files input_files 181 | 182 | This workflow aims to compute a panel of normals to be used with MuTect2 183 | 184 | Input: 185 | * input_files: the path to a file containing in each row the sample name and the path to a BAM file to be included in the PON 186 | example: 187 | sample1 /path/to/sample1.bam 188 | sample2 /path/to/sample2.bam 189 | NOTE: the sample name must be set in the @SN annotation 190 | 191 | Optional input: 192 | * output: the folder where to publish output 193 | 194 | Output: 195 | * Output combined VCF pon.vcf 196 | ``` 197 | 198 | ## References 199 | 200 | - Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. Nature Biotechnology, 35(4), 316–319. https://doi.org/10.1038/nbt.3820 201 | - Benjamin, D., Sato, T., Cibulskis, K., Getz, G., Stewart, C., & Lichtenstein, L. (2019). Calling Somatic SNVs and Indels with Mutect2. BioRxiv. https://doi.org/10.1101/861054 202 | - GATK team. Somatic short variant discovery (SNVs + Indels). 
Retrieved from https://gatk.broadinstitute.org/hc/en-us/articles/360035894731-Somatic-short-variant-discovery-SNVs-Indels- 203 | - Karczewski, K.J., Francioli, L.C., Tiao, G. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020). https://doi.org/10.1038/s41586-020-2308-7 204 | -------------------------------------------------------------------------------- /bin/assert.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | ##################################################################### 4 | ## 5 | ## title: Assert Extension 6 | ## 7 | ## description: 8 | ## Assert extension of shell (bash, ...) 9 | ## with the common assert functions 10 | ## Function list based on: 11 | ## http://junit.sourceforge.net/javadoc/org/junit/Assert.html 12 | ## Log methods : inspired by 13 | ## - https://natelandau.com/bash-scripting-utilities/ 14 | ## author: Mark Torok 15 | ## 16 | ## date: 07. Dec. 
2016 17 | ## 18 | ## license: MIT 19 | ## 20 | ##################################################################### 21 | 22 | if command -v tput &>/dev/null && tty -s; then 23 | RED=$(tput setaf 1) 24 | GREEN=$(tput setaf 2) 25 | MAGENTA=$(tput setaf 5) 26 | NORMAL=$(tput sgr0) 27 | BOLD=$(tput bold) 28 | else 29 | RED=$(echo -en "\e[31m") 30 | GREEN=$(echo -en "\e[32m") 31 | MAGENTA=$(echo -en "\e[35m") 32 | NORMAL=$(echo -en "\e[00m") 33 | BOLD=$(echo -en "\e[01m") 34 | fi 35 | 36 | log_header() { 37 | printf "\n${BOLD}${MAGENTA}========== %s ==========${NORMAL}\n" "$@" >&2 38 | } 39 | 40 | log_success() { 41 | printf "${GREEN}✔ %s${NORMAL}\n" "$@" >&2 42 | } 43 | 44 | log_failure() { 45 | printf "${RED}✖ %s${NORMAL}\n" "$@" >&2 46 | } 47 | 48 | 49 | assert_eq() { 50 | local expected="$1" 51 | local actual="$2" 52 | local msg="${3-}" 53 | 54 | if [ "$expected" == "$actual" ]; then 55 | return 0 56 | else 57 | [ "${#msg}" -gt 0 ] && log_failure "$expected == $actual :: $msg" || true 58 | return 1 59 | fi 60 | } 61 | 62 | assert_not_eq() { 63 | local expected="$1" 64 | local actual="$2" 65 | local msg="${3-}" 66 | 67 | if [ ! "$expected" == "$actual" ]; then 68 | return 0 69 | else 70 | [ "${#msg}" -gt 0 ] && log_failure "$expected != $actual :: $msg" || true 71 | return 1 72 | fi 73 | } 74 | 75 | assert_true() { 76 | local actual="$1" 77 | local msg="${2-}" 78 | 79 | assert_eq true "$actual" "$msg" 80 | return "$?" 81 | } 82 | 83 | assert_false() { 84 | local actual="$1" 85 | local msg="${2-}" 86 | 87 | assert_eq false "$actual" "$msg" 88 | return "$?" 89 | } 90 | 91 | assert_array_eq() { 92 | 93 | declare -a expected=("${!1-}") 94 | # echo "AAE ${expected[@]}" 95 | 96 | declare -a actual=("${!2}") 97 | # echo "AAE ${actual[@]}" 98 | 99 | local msg="${3-}" 100 | 101 | local return_code=0 102 | if [ ! "${#expected[@]}" == "${#actual[@]}" ]; then 103 | return_code=1 104 | fi 105 | 106 | local i 107 | for (( i=1; i < ${#expected[@]} + 1; i+=1 )); do 108 | if [ ! 
"${expected[$i-1]}" == "${actual[$i-1]}" ]; then 109 | return_code=1 110 | break 111 | fi 112 | done 113 | 114 | if [ "$return_code" == 1 ]; then 115 | [ "${#msg}" -gt 0 ] && log_failure "(${expected[*]}) != (${actual[*]}) :: $msg" || true 116 | fi 117 | 118 | return "$return_code" 119 | } 120 | 121 | assert_array_not_eq() { 122 | 123 | declare -a expected=("${!1-}") 124 | declare -a actual=("${!2}") 125 | 126 | local msg="${3-}" 127 | 128 | local return_code=1 129 | if [ ! "${#expected[@]}" == "${#actual[@]}" ]; then 130 | return_code=0 131 | fi 132 | 133 | local i 134 | for (( i=1; i < ${#expected[@]} + 1; i+=1 )); do 135 | if [ ! "${expected[$i-1]}" == "${actual[$i-1]}" ]; then 136 | return_code=0 137 | break 138 | fi 139 | done 140 | 141 | if [ "$return_code" == 1 ]; then 142 | [ "${#msg}" -gt 0 ] && log_failure "(${expected[*]}) == (${actual[*]}) :: $msg" || true 143 | fi 144 | 145 | return "$return_code" 146 | } 147 | 148 | assert_empty() { 149 | local actual=$1 150 | local msg="${2-}" 151 | 152 | assert_eq "" "$actual" "$msg" 153 | return "$?" 154 | } 155 | 156 | assert_not_empty() { 157 | local actual=$1 158 | local msg="${2-}" 159 | 160 | assert_not_eq "" "$actual" "$msg" 161 | return "$?" 
162 | } 163 | 164 | assert_contain() { 165 | local haystack="$1" 166 | local needle="${2-}" 167 | local msg="${3-}" 168 | 169 | if [ -z "${needle:+x}" ]; then 170 | return 0; 171 | fi 172 | 173 | if [ -z "${haystack##*$needle*}" ]; then 174 | return 0 175 | else 176 | [ "${#msg}" -gt 0 ] && log_failure "$haystack doesn't contain $needle :: $msg" || true 177 | return 1 178 | fi 179 | } 180 | 181 | assert_not_contain() { 182 | local haystack="$1" 183 | local needle="${2-}" 184 | local msg="${3-}" 185 | 186 | if [ -z "${needle:+x}" ]; then 187 | return 0; 188 | fi 189 | 190 | if [ "${haystack##*$needle*}" ]; then 191 | return 0 192 | else 193 | [ "${#msg}" -gt 0 ] && log_failure "$haystack contains $needle :: $msg" || true 194 | return 1 195 | fi 196 | } 197 | 198 | assert_gt() { 199 | local first="$1" 200 | local second="$2" 201 | local msg="${3-}" 202 | 203 | if [[ "$first" -gt "$second" ]]; then 204 | return 0 205 | else 206 | [ "${#msg}" -gt 0 ] && log_failure "$first > $second :: $msg" || true 207 | return 1 208 | fi 209 | } 210 | 211 | assert_ge() { 212 | local first="$1" 213 | local second="$2" 214 | local msg="${3-}" 215 | 216 | if [[ "$first" -ge "$second" ]]; then 217 | return 0 218 | else 219 | [ "${#msg}" -gt 0 ] && log_failure "$first >= $second :: $msg" || true 220 | return 1 221 | fi 222 | } 223 | 224 | assert_lt() { 225 | local first="$1" 226 | local second="$2" 227 | local msg="${3-}" 228 | 229 | if [[ "$first" -lt "$second" ]]; then 230 | return 0 231 | else 232 | [ "${#msg}" -gt 0 ] && log_failure "$first < $second :: $msg" || true 233 | return 1 234 | fi 235 | } 236 | 237 | assert_le() { 238 | local first="$1" 239 | local second="$2" 240 | local msg="${3-}" 241 | 242 | if [[ "$first" -le "$second" ]]; then 243 | return 0 244 | else 245 | [ "${#msg}" -gt 0 ] && log_failure "$first <= $second :: $msg" || true 246 | return 1 247 | fi 248 | } -------------------------------------------------------------------------------- /main.nf: 
-------------------------------------------------------------------------------- 1 | #!/usr/bin/env nextflow 2 | 3 | nextflow.enable.dsl = 2 4 | 5 | include { MUTECT2 } from './modules/01_mutect2' 6 | include { LEARN_READ_ORIENTATION_MODEL } from './modules/02_learn_read_orientation' 7 | include { PILEUP_SUMMARIES } from './modules/03_pileup_summary' 8 | include { CALCULATE_CONTAMINATION } from './modules/04_calculate_contamination' 9 | include { FILTER_CALLS } from './modules/05_filter_calls' 10 | include { FUNCOTATOR } from './modules/06_annotate' 11 | 12 | params.help= false 13 | params.input_files = false 14 | params.input_name = false 15 | params.input_tumor_bam = false 16 | params.input_normal_bam = false 17 | params.reference = false 18 | params.gnomad = false 19 | params.output = 'output' 20 | params.funcotator = false 21 | 22 | def helpMessage() { 23 | log.info params.help_message 24 | } 25 | 26 | if (params.help) { 27 | helpMessage() 28 | exit 0 29 | } 30 | if (!params.reference) { 31 | log.error "--reference is required" 32 | exit 1 33 | } 34 | 35 | // checks required inputs 36 | if (params.input_files) { 37 | Channel 38 | .fromPath(params.input_files) 39 | .splitCsv(header: ['name', 'tumor_bam', 'normal_bam'], sep: "\t") 40 | .map{ row-> tuple(row.name, row.tumor_bam, row.normal_bam) } 41 | .set { input_files } 42 | } else if (params.input_name && params.input_tumor_bam && params.input_normal_bam) { 43 | Channel 44 | .fromList([tuple(params.input_name, params.input_tumor_bam, params.input_normal_bam)]) 45 | .set { input_files } 46 | } else { 47 | exit 1, "Input file not specified!" 
48 | } 49 | 50 | workflow { 51 | 52 | MUTECT2(input_files) 53 | LEARN_READ_ORIENTATION_MODEL(MUTECT2.out.f1r2_stats) 54 | 55 | if (params.gnomad) { 56 | PILEUP_SUMMARIES(input_files) 57 | CALCULATE_CONTAMINATION(PILEUP_SUMMARIES.out.pileupsummaries) 58 | FILTER_CALLS( 59 | CALCULATE_CONTAMINATION.out.contaminationTables.join( 60 | LEARN_READ_ORIENTATION_MODEL.out.read_orientation_model).join(MUTECT2.out.unfiltered_vcfs)) 61 | } 62 | else { 63 | FILTER_CALLS( 64 | input_files.map{ row-> tuple(row[0], file("dummy"), file("dummy2")) }.join( 65 | LEARN_READ_ORIENTATION_MODEL.out.read_orientation_model).join(MUTECT2.out.unfiltered_vcfs)) 66 | } 67 | 68 | FILTER_CALLS.out.final_vcfs.map {it.join("\t")}.collectFile(name: "${params.output}/mutect2_output_files.txt", newLine: true) 69 | if(params.funcotator){ 70 | FUNCOTATOR(FILTER_CALLS.out.anno_input) 71 | } 72 | } 73 | -------------------------------------------------------------------------------- /modules/01_mutect2.nf: -------------------------------------------------------------------------------- 1 | params.memory_mutect2 = "16g" 2 | params.output = 'output' 3 | params.gnomad = false 4 | params.pon = false 5 | params.disable_common_germline_filter = false 6 | params.reference = false 7 | params.intervals = false 8 | params.args_mutect2 = "" 9 | params.enable_bam_output = false 10 | 11 | 12 | process MUTECT2 { 13 | cpus 2 14 | memory params.memory_mutect2 15 | tag "${name}" 16 | publishDir "${params.output}/${name}", mode: "copy" 17 | 18 | conda (params.enable_conda ? 
"bioconda::gatk4=4.2.6.1 bioconda::samtools=1.12" : null) 19 | 20 | input: 21 | tuple val(name), val(tumor_bam), val(normal_bam) 22 | 23 | output: 24 | tuple val("${name}"), path("${name}.mutect2.unfiltered.vcf"), path("${name}.mutect2.unfiltered.vcf.stats"), emit: unfiltered_vcfs 25 | tuple val("${name}"), path("${name}.f1r2.tar.gz"), emit: f1r2_stats 26 | tuple path("${name}.mutect2.assembled_haplotypes.bam"), path("${name}.mutect2.assembled_haplotypes.bai"), optional: true 27 | 28 | script: 29 | normal_panel_option = params.pon ? "--panel-of-normals ${params.pon}" : "" 30 | germline_filter = params.disable_common_germline_filter || ! params.gnomad ? "" : "--germline-resource ${params.gnomad}" 31 | normal_inputs = normal_bam.split(",").collect({v -> "--input $v"}).join(" ") 32 | tumor_inputs = tumor_bam.split(",").collect({v -> "--input $v"}).join(" ") 33 | normalRGSMs = normal_bam.split(",").collect({v -> "\$(samtools view -H $v | grep -oP '(?<=SM:)[^ |\\t]*' | head -1)"}) 34 | normalRGSM = normalRGSMs.first() 35 | tumorRGSMs = tumor_bam.split(",").collect({v -> "\$(samtools view -H $v | grep -oP '(?<=SM:)[^ |\\t]*' | head -1)"}) 36 | tumorRGSM = tumorRGSMs.first() 37 | intervals_option = params.intervals ? "--intervals ${params.intervals}" : "" 38 | bam_output_option = params.enable_bam_output ? "--bam-output ${name}.mutect2.assembled_haplotypes.bam" : "" 39 | """ 40 | # sanity checks on the RGSM 41 | source assert.sh 42 | 43 | assert_eq \$(echo ${tumorRGSMs} | sed 's/\\[//g' | sed 's/\\]//g' | sed 's/, /\\n/g' | sort | uniq | wc -l) 1 "All tumor BAMs RGSM tags must be equal" 44 | assert_eq \$(echo ${normalRGSMs} | sed 's/\\[//g' | sed 's/\\]//g' | sed 's/, /\\n/g' | sort | uniq | wc -l) 1 "All normal BAMs RGSM tags must be equal" 45 | assert_not_eq "${normalRGSM}" "${tumorRGSM}" "Tumor and normal RGSM must be different!" 
46 | 47 | gatk --java-options '-Xmx${params.memory_mutect2}' Mutect2 \ 48 | --reference ${params.reference} \ 49 | ${intervals_option} \ 50 | ${germline_filter} \ 51 | ${normal_panel_option} \ 52 | ${bam_output_option} \ 53 | ${normal_inputs} --normal-sample ${normalRGSM} \ 54 | ${tumor_inputs} --tumor-sample ${tumorRGSM} \ 55 | --output ${name}.mutect2.unfiltered.vcf \ 56 | --f1r2-tar-gz ${name}.f1r2.tar.gz ${params.args_mutect2} 57 | """ 58 | } 59 | -------------------------------------------------------------------------------- /modules/02_learn_read_orientation.nf: -------------------------------------------------------------------------------- 1 | params.memory_read_orientation = "16g" 2 | params.output = 'output' 3 | 4 | 5 | process LEARN_READ_ORIENTATION_MODEL { 6 | cpus 2 7 | memory params.memory_read_orientation 8 | tag "${name}" 9 | publishDir "${params.output}/${name}", mode: "copy" 10 | 11 | conda (params.enable_conda ? "bioconda::gatk4=4.2.6.1" : null) 12 | 13 | input: 14 | tuple val(name), path(f1r2_stats) 15 | 16 | output: 17 | tuple val(name), path("${name}.read-orientation-model.tar.gz"), emit: read_orientation_model 18 | 19 | """ 20 | gatk --java-options '-Xmx${params.memory_read_orientation}' LearnReadOrientationModel \ 21 | --input ${f1r2_stats} \ 22 | --output ${name}.read-orientation-model.tar.gz 23 | """ 24 | } 25 | -------------------------------------------------------------------------------- /modules/03_pileup_summary.nf: -------------------------------------------------------------------------------- 1 | params.memory_pileup = "32g" 2 | params.output = 'output' 3 | params.gnomad = false 4 | 5 | 6 | process PILEUP_SUMMARIES { 7 | cpus 2 8 | memory params.memory_pileup 9 | tag "${name}" 10 | publishDir "${params.output}/${name}", mode: "copy" 11 | 12 | conda (params.enable_conda ? 
"bioconda::gatk4=4.2.6.1" : null) 13 | 14 | input: 15 | tuple val(name), val(tumor_bam), val(normal_bam) 16 | 17 | output: 18 | tuple val("${name}"), path("${name}.pileupsummaries.table"), emit: pileupsummaries 19 | 20 | script: 21 | tumor_inputs = tumor_bam.split(",").collect({v -> "--input $v"}).join(" ") 22 | """ 23 | gatk --java-options '-Xmx${params.memory_pileup}' GetPileupSummaries \ 24 | --intervals ${params.gnomad} \ 25 | --variant ${params.gnomad} \ 26 | ${tumor_inputs} \ 27 | --output ${name}.pileupsummaries.table 28 | """ 29 | } -------------------------------------------------------------------------------- /modules/04_calculate_contamination.nf: -------------------------------------------------------------------------------- 1 | params.memory_contamination = "16g" 2 | params.output = 'output' 3 | 4 | 5 | process CALCULATE_CONTAMINATION { 6 | cpus 2 7 | memory params.memory_contamination 8 | tag "${name}" 9 | publishDir "${params.output}/${name}", mode: "copy" 10 | 11 | conda (params.enable_conda ? 
"bioconda::gatk4=4.2.6.1" : null) 12 | 13 | input: 14 | tuple val(name), path(table) 15 | 16 | output: 17 | tuple val(name), path("${name}.segments.table"), path("${name}.calculatecontamination.table"), emit: contaminationTables 18 | 19 | """ 20 | gatk --java-options '-Xmx${params.memory_contamination}' CalculateContamination \ 21 | --input ${table} \ 22 | -tumor-segmentation ${name}.segments.table \ 23 | --output ${name}.calculatecontamination.table 24 | """ 25 | } -------------------------------------------------------------------------------- /modules/05_filter_calls.nf: -------------------------------------------------------------------------------- 1 | params.memory_filter = "16g" 2 | params.output = 'output' 3 | params.reference = false 4 | params.args_filter = "" 5 | 6 | 7 | process FILTER_CALLS { 8 | cpus 2 9 | memory params.memory_filter 10 | tag "${name}" 11 | publishDir "${params.output}/${name}", mode: "copy" 12 | 13 | conda (params.enable_conda ? "bioconda::gatk4=4.2.6.1" : null) 14 | 15 | input: 16 | tuple val(name), path(segments_table), path(contamination_table), path(model), path(unfiltered_vcf), path(vcf_stats) 17 | 18 | output: 19 | tuple val(name), val("${params.output}/${name}/${name}.mutect2.vcf"), emit: final_vcfs 20 | tuple val(name), path("${name}.mutect2.vcf"), emit: anno_input 21 | path "${name}.mutect2.vcf" 22 | 23 | script: 24 | segments_table_param = segments_table.exists() ? "--tumor-segmentation ${segments_table}" : "" 25 | contamination_table_param = contamination_table.exists() ? 
"--contamination-table ${contamination_table}" : "" 26 | """ 27 | gatk --java-options '-Xmx${params.memory_filter}' FilterMutectCalls \ 28 | -V ${unfiltered_vcf} \ 29 | --reference ${params.reference} \ 30 | ${segments_table_param} \ 31 | ${contamination_table_param} \ 32 | --ob-priors ${model} \ 33 | --output ${name}.mutect2.vcf ${params.args_filter} 34 | """ 35 | } -------------------------------------------------------------------------------- /modules/06_annotate.nf: -------------------------------------------------------------------------------- 1 | params.memory_funcotator = "16g" 2 | params.funcotator = false 3 | params.reference = false 4 | params.reference_version_funcotator = "hg19" 5 | params.output_format_funcotator = "MAF" 6 | params.transcript_selection_mode_funcotator = "CANONICAL" 7 | params.args_funcotator = "" 8 | 9 | process FUNCOTATOR { 10 | 11 | cpus 2 12 | memory params.memory_funcotator 13 | tag "${name}" 14 | publishDir "${params.output}/${name}", mode: "copy" 15 | 16 | conda (params.enable_conda ? 
"bioconda::gatk4=4.2.6.1" : null) 17 | 18 | input: 19 | tuple val(name), path(vcf) 20 | 21 | output: 22 | tuple val(name), val("${params.output}/${name}/${name}.mutect2.funcotated.vcf"), emit: vcf_anno 23 | path "${name}.mutect2.funcotated.maf" 24 | 25 | """ 26 | gatk --java-options '-Xmx${params.memory_funcotator}' Funcotator \ 27 | --variant ${vcf} \ 28 | --reference ${params.reference} \ 29 | --ref-version ${params.reference_version_funcotator} \ 30 | --data-sources-path ${params.funcotator} \ 31 | --output ${name}.mutect2.funcotated.maf \ 32 | --output-file-format ${params.output_format_funcotator} \ 33 | --transcript-selection-mode ${params.transcript_selection_mode_funcotator} \ 34 | ${params.args_funcotator} 35 | """ 36 | } -------------------------------------------------------------------------------- /mutect2_pon.nf: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env nextflow 2 | 3 | gatk4_jar = "/code/gatk/4.1.3.0/gatk-package-4.1.3.0-local.jar" 4 | // this version is needed to build a PON from multiple VCFs, it may change in the future 5 | gatk40_jar = "/code/gatk/4.0.12.0/gatk-package-4.0.12.0-local.jar" 6 | picard_jar = "/code/picard/2.21.2/picard.jar" 7 | 8 | params.help= false 9 | params.input_files = false 10 | params.reference = "/projects/data/gatk_bundle/hg19/ucsc.hg19.fasta" // TODO: remove this hard coded bit 11 | params.intervals = "/projects/data/gatk_bundle/hg19/hg19_refseq_exons.sorted.merged.bed" 12 | params.output = 'output' 13 | 14 | def helpMessage() { 15 | log.info""" 16 | Usage: 17 | mutect2_pon.nf --input_files input_files 18 | 19 | This workflow aims to compute a panel of normals to be used with MuTect2 20 | 21 | Input: 22 | * input_files: the path to a file containing in each row the sample name and the path to a BAM file to be included in the PON 23 | example: 24 | sample1 /path/to/sample1.bam 25 | sample2 /path/to/sample2.bam 26 | NOTE: the sample name must be set in the @SN 
annotation 27 | 28 | Optional input: 29 | * output: the folder where to publish output 30 | 31 | Output: 32 | * Output combined VCF pon.vcf 33 | """ 34 | } 35 | 36 | if (params.help) { 37 | helpMessage() 38 | exit 0 39 | } 40 | 41 | // checks required inputs 42 | if (params.input_files) { 43 | Channel 44 | .fromPath(params.input_files) 45 | .splitCsv(header: ['name', 'bam'], sep: "\t") 46 | .map { row-> tuple(row.name, file(row.bam), file(row.bam.replaceAll(/.bam$/, ".bai"))) } 47 | .set { input_files } 48 | } else { 49 | exit 1, "Input file not specified!" 50 | } 51 | 52 | if (params.intervals) { 53 | Channel 54 | .fromPath(params.intervals) 55 | .splitText(by: 30000, file: true) 56 | .set { intervals } 57 | } 58 | 59 | process mutect2Pon { 60 | cpus 2 61 | memory '16g' 62 | module 'java/1.8.0' 63 | errorStrategy 'finish' 64 | 65 | input: 66 | set val(name), file(bam), file(bai), file(interval) from input_files.combine(intervals) 67 | 68 | output: 69 | set val("${bam.baseName}"), file("${bam.baseName}.${interval.baseName}.mutect.vcf") into mutect_vcfs 70 | 71 | """ 72 | mkdir -p `pwd`/scratch/tmp 73 | java -Xmx16g -Djava.io.tmpdir=`pwd`/scratch/tmp -jar ${gatk4_jar} \ 74 | Mutect2 \ 75 | --reference ${params.reference} \ 76 | --intervals ${interval} \ 77 | --input ${bam} \ 78 | --tumor-sample ${name} \ 79 | --max-mnp-distance 0 \ 80 | --output ${bam.baseName}.${interval.baseName}.mutect.vcf 81 | """ 82 | } 83 | 84 | process gatherVcfs { 85 | cpus 1 86 | memory '32g' 87 | module 'java/1.8.0' 88 | publishDir "${params.output}", mode: "copy" 89 | 90 | input: 91 | set name, file(vcf_list) from mutect_vcfs.groupTuple() // group by name 92 | 93 | output: 94 | file("${name}.vcf") into whole_vcfs 95 | 96 | script: 97 | // NOTE: the input VCFs need to be provided in order by genomic coordinates 98 | input_vcfs = "$vcf_list".split(" ") 99 | .sort{ a, b -> a.tokenize(".")[-3].toInteger().compareTo b.tokenize(".")[-3].toInteger() } 100 | .collect{"INPUT=" + it}.join(" ") 101 
| """ 102 | mkdir -p `pwd`/scratch/tmp 103 | java -Xmx32g -Djava.io.tmpdir=`pwd`/scratch/tmp -jar $picard_jar \ 104 | GatherVcfs \ 105 | ${input_vcfs} \ 106 | OUTPUT=${name}.vcf 107 | """ 108 | } 109 | 110 | process createPON { 111 | cpus 1 112 | memory '32g' 113 | module 'java/1.8.0' 114 | publishDir "${params.output}", mode: "copy" 115 | 116 | input: 117 | file(vcf_list) from whole_vcfs.collect() 118 | 119 | output: 120 | file("pon.vcf") 121 | file("pon.vcf.idx") 122 | 123 | script: 124 | input_vcfs = "$vcf_list".split(" ").collect{"--vcfs " + it}.join(" ") 125 | """ 126 | # combines VCFs and keeps variants occurring in at least two VCFs 127 | mkdir -p `pwd`/scratch/tmp 128 | java -Xmx32g -Djava.io.tmpdir=`pwd`/scratch/tmp -jar $gatk40_jar \ 129 | CreateSomaticPanelOfNormals ${input_vcfs} \ 130 | --output pon.vcf 131 | """ 132 | } 133 | -------------------------------------------------------------------------------- /nextflow.config: -------------------------------------------------------------------------------- 1 | /* 2 | * ------------------------------------------------- 3 | * TRON-Bioinformatics/tronflow-mutect2 Nextflow config file 4 | * ------------------------------------------------- 5 | * Default config options for all environments. 
6 | */ 7 | 8 | profiles { 9 | conda { 10 | params.enable_conda = true 11 | conda.enabled = true 12 | } 13 | debug { process.beforeScript = 'echo $HOSTNAME' } 14 | ci { 15 | params.memory_mutect2 = "2g" 16 | params.cpus_mutect2 = 1 17 | params.memory_read_orientation = "2g" 18 | params.cpus_read_orientation = 1 19 | params.memory_pileup = "2g" 20 | params.cpus_pileup = 1 21 | params.memory_contamination = "2g" 22 | params.cpus_contamination = 1 23 | params.memory_filter = "2g" 24 | params.cpus_filter = 1 25 | timeline.enabled = false 26 | report.enabled = false 27 | trace.enabled = false 28 | dag.enabled = false 29 | } 30 | test { 31 | params.reference = "$baseDir/test_data/ucsc.hg19.minimal.fasta" 32 | params.intervals = "$baseDir/test_data/intervals.minimal.bed" 33 | params.gnomad = "$baseDir/test_data/gnomad.minimal.vcf.gz" 34 | } 35 | } 36 | 37 | // Export this variable to prevent local Python libraries from conflicting with those in the container 38 | env { 39 | PYTHONNOUSERSITE = 1 40 | } 41 | 42 | // Capture exit codes from upstream processes when piping 43 | process.shell = ['/bin/bash', '-euo', 'pipefail'] 44 | 45 | VERSION = '1.8.1' 46 | DOI = 'https://zenodo.org/badge/latestdoi/355860788' 47 | 48 | manifest { 49 | name = 'TRON-Bioinformatics/tronflow-mutect2' 50 | author = 'Pablo Riesgo-Ferreiro, Özlem Muslu, Luisa Bresadola' 51 | homePage = 'https://github.com/TRON-Bioinformatics/tronflow-mutect2' 52 | description = 'Mutect2 best practices workflow' 53 | mainScript = 'main.nf' 54 | nextflowVersion = '>=19.10.0' 55 | version = VERSION 56 | doi = DOI 57 | } 58 | 59 | params.help_message = """ 60 | TronFlow Mutect2 v${VERSION} ${DOI} 61 | 62 | Usage: 63 | nextflow run tron-bioinformatics/tronflow-mutect2 -profile conda --input_files input_files [--reference reference.fasta] 64 | 65 | This workflow is based on the implementation at /code/iCaM/scripts/mutect2_ID.sh 66 | 67 | Input: 68 | * input_files: the path to a tab-separated values file containing in each 
row the sample name, tumor bam and normal bam 69 | The input file does not have a header! 70 | Example input file: 71 | name1 tumor_bam1 normal_bam1 72 | name2 tumor_bam2 normal_bam2 73 | * reference: path to the FASTA genome reference (indexes expected *.fai, *.dict) 74 | 75 | Optional input: 76 | * input_name: sample name (alternative to --input_files) 77 | * input_tumor_bam: comma-separated list of tumor BAMs (alternative to --input_files) 78 | * input_normal_bam: comma-separated list of normal BAMs (alternative to --input_files) 79 | * gnomad: path to the gnomad VCF or other germline resource (recommended). If not provided, the contamination will 80 | not be estimated and the filter of common germline variants will be disabled 81 | * pon: path to the panel of normals VCF 82 | * intervals: path to a BED file containing the regions to analyse 83 | * output: the folder where to publish output 84 | * enable_bam_output: outputs a new BAM file with the Mutect2 reassembly of reads (default: false) 85 | * disable_common_germline_filter: disable the use of GnomAD to filter out common variants in the population 86 | from the somatic calls. The GnomAD resource is still required, though, as these common SNPs are used elsewhere to 87 | calculate the contamination (default: false) 88 | * funcotator: To use Funcotator, supply the path to a database to be used. (can be downloaded from GATK FTP server) 89 | * reference_version_funcotator: version of the reference genome (default: "hg19") 90 | * output_format_funcotator: the output format of Funcotator. Can be VCF or MAF (default: "MAF") 91 | * transcript_selection_mode_funcotator: transcript selection method can be CANONICAL, BEST_EFFECT or ALL. 
(default: CANONICAL) 92 | * memory_mutect2: the amount of memory used by mutect2 (default: 16g) 93 | * memory_read_orientation: the amount of memory used by learn read orientation (default: 16g) 94 | * memory_pileup: the amount of memory used by pileup (default: 32g) 95 | * memory_contamination: the amount of memory used by contamination (default: 16g) 96 | * memory_filter: the amount of memory used by filter (default: 16g) 97 | * memory_funcotator: the amount of memory used by funcotator (default: 16g) 98 | * args_filter: optional arguments to the FilterMutectCalls function of GATK (e.g.: "--contamination-estimate 0.4 --min-allele-fraction 0.05 --min-reads-per-strand 1 --unique-alt-read-count 4") (see FilterMutectCalls documentation) 99 | * args_funcotator: optional arguments to Funcotator (e.g. "--remove-filtered-variants true") (see Funcotator documentation) 100 | * args_mutect2: optional arguments to Mutect2 (e.g. "--sites-only-vcf-output") (see Mutect2 documentation) 101 | 102 | 103 | 104 | Output: 105 | * Output VCF/MAF 106 | * Other intermediate files 107 | """ 108 | -------------------------------------------------------------------------------- /test_data/SRR8244836.preprocessed.downsampled.bam: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TRON-Bioinformatics/tronflow-mutect2/efc439ea2d571b54886086922c94482dca089fb9/test_data/SRR8244836.preprocessed.downsampled.bam -------------------------------------------------------------------------------- /test_data/SRR8244836.preprocessed.downsampled.bam.bai: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TRON-Bioinformatics/tronflow-mutect2/efc439ea2d571b54886086922c94482dca089fb9/test_data/SRR8244836.preprocessed.downsampled.bam.bai -------------------------------------------------------------------------------- /test_data/SRR8244887.preprocessed.downsampled.bam: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/TRON-Bioinformatics/tronflow-mutect2/efc439ea2d571b54886086922c94482dca089fb9/test_data/SRR8244887.preprocessed.downsampled.bam -------------------------------------------------------------------------------- /test_data/SRR8244887.preprocessed.downsampled.bam.bai: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TRON-Bioinformatics/tronflow-mutect2/efc439ea2d571b54886086922c94482dca089fb9/test_data/SRR8244887.preprocessed.downsampled.bam.bai -------------------------------------------------------------------------------- /test_data/gnomad.minimal.vcf.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TRON-Bioinformatics/tronflow-mutect2/efc439ea2d571b54886086922c94482dca089fb9/test_data/gnomad.minimal.vcf.gz -------------------------------------------------------------------------------- /test_data/gnomad.minimal.vcf.gz.tbi: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TRON-Bioinformatics/tronflow-mutect2/efc439ea2d571b54886086922c94482dca089fb9/test_data/gnomad.minimal.vcf.gz.tbi -------------------------------------------------------------------------------- /test_data/intervals.minimal.bed: -------------------------------------------------------------------------------- 1 | chr1 11873 12227 NR_046018#exon.0 255 + 2 | chr1 12612 12721 NR_046018#exon.1 255 + 3 | chr1 13220 14409 NR_046018#exon.2 255 + 4 | chr1 14361 14829 NR_024540#exon.10 255 - 5 | chr1 14969 15038 NR_024540#exon.9 255 - 6 | chr1 15795 15947 NR_024540#exon.8 255 - 7 | chr1 16606 16765 NR_024540#exon.7 255 - 8 | chr1 16857 17055 NR_024540#exon.6 255 - 9 | chr1 17232 17368 NR_024540#exon.5 255 - 10 | chr1 17605 17742 NR_024540#exon.4 255 - 11 | chr1 17914 18061 NR_024540#exon.3 255 - 
12 | chr1 18267 18366 NR_024540#exon.2 255 - 13 | chr1 24737 24891 NR_024540#exon.1 255 - 14 | chr1 29320 29370 NR_024540#exon.0 255 - 15 | chr1 34610 35174 NR_026820#exon.2 255 - 16 | chr1 34610 35174 NR_026818#exon.2 255 - 17 | chr1 35276 35481 NR_026820#exon.1 255 - 18 | chr1 35276 35481 NR_026818#exon.1 255 - 19 | chr1 35720 36081 NR_026820#exon.0 255 - 20 | chr1 35720 36081 NR_026818#exon.0 255 - 21 | chr1 69090 70008 NM_001005484#exon.0 255 + 22 | chr1 134772 139696 NR_039983#exon.2 255 - 23 | chr1 139789 139847 NR_039983#exon.1 255 - 24 | chr1 140074 140566 NR_039983#exon.0 255 - 25 | chr1 323891 324060 NR_028327#exon.0 255 + 26 | chr1 323891 324060 NR_028322#exon.0 255 + 27 | chr1 323891 324060 NR_028325#exon.0 255 + 28 | chr1 324287 324345 NR_028327#exon.1 255 + 29 | chr1 324287 324345 NR_028325#exon.1 255 + 30 | chr1 324287 324345 NR_028322#exon.1 255 + 31 | chr1 324438 326938 NR_028327#exon.2 255 + 32 | chr1 324438 328581 NR_028325#exon.2 255 + 33 | chr1 324438 328581 NR_028322#exon.2 255 + 34 | chr1 327035 328581 NR_028327#exon.3 255 + 35 | chr1 367658 368597 NM_001005224#exon.0 255 + 36 | chr1 367658 368597 NM_001005221#exon.0 255 + 37 | chr1 367658 368597 NM_001005277#exon.0 255 + 38 | chr1 621095 622034 NM_001005224#exon.0 255 - 39 | chr1 621095 622034 NM_001005221#exon.0 255 - 40 | chr1 621095 622034 NM_001005277#exon.0 255 - 41 | chr1 661138 665184 NR_028327#exon.2 255 - 42 | chr1 665277 665335 NR_028327#exon.1 255 - 43 | chr1 665562 665731 NR_028327#exon.0 255 - 44 | chr1 700244 700627 NR_033908#exon.6 255 - 45 | chr1 701708 701767 NR_033908#exon.5 255 - 46 | chr1 703927 703993 NR_033908#exon.4 255 - 47 | chr1 704876 705092 NR_033908#exon.3 255 - 48 | chr1 708355 708487 NR_033908#exon.2 255 - 49 | chr1 709550 709660 NR_033908#exon.1 255 - 50 | chr1 713663 714068 NR_033908#exon.0 255 - 51 | chr1 761585 762902 NR_024321#exon.0 255 - 52 | chr1 762970 763155 NR_047520#exon.0 255 + 53 | chr1 762970 763155 NR_047526#exon.0 255 + 54 | chr1 762970 763155 
NR_047523#exon.0 255 + 55 | chr1 762970 763155 NR_047524#exon.0 255 + 56 | chr1 762970 763155 NR_047522#exon.0 255 + 57 | chr1 762970 763155 NR_047519#exon.0 255 + 58 | chr1 762970 763155 NR_015368#exon.0 255 + 59 | chr1 762970 763155 NR_047521#exon.0 255 + 60 | chr1 763177 763229 NR_047525#exon.0 255 + 61 | chr1 764382 764484 NR_047522#exon.1 255 + 62 | chr1 764382 764484 NR_047525#exon.1 255 + 63 | chr1 764382 764484 NR_015368#exon.1 255 + 64 | chr1 764382 764484 NR_047524#exon.1 255 + 65 | chr1 764382 764484 NR_047521#exon.1 255 + 66 | chr1 764382 764484 NR_047523#exon.1 255 + 67 | chr1 764382 764484 NR_047519#exon.1 255 + 68 | chr1 764382 764484 NR_047520#exon.1 255 + 69 | chr1 764382 764484 NR_047526#exon.1 255 + 70 | chr1 776579 778984 NR_047526#exon.2 255 + 71 | chr1 783033 783186 NR_015368#exon.2 255 + 72 | chr1 783033 783186 NR_047519#exon.2 255 + 73 | chr1 783033 783186 NR_047520#exon.2 255 + 74 | chr1 787306 787490 NR_047520#exon.3 255 + 75 | chr1 787306 787490 NR_047525#exon.2 255 + 76 | chr1 787306 787490 NR_047521#exon.2 255 + 77 | chr1 787306 787490 NR_047523#exon.2 255 + 78 | chr1 787306 787490 NR_047522#exon.2 255 + 79 | chr1 787306 787490 NR_047519#exon.3 255 + 80 | chr1 787306 787490 NR_015368#exon.3 255 + 81 | chr1 788050 788146 NR_047520#exon.4 255 + 82 | chr1 788050 788146 NR_015368#exon.4 255 + 83 | chr1 788050 788146 NR_047525#exon.3 255 + 84 | chr1 788050 788146 NR_047522#exon.3 255 + 85 | chr1 788050 788146 NR_047524#exon.2 255 + 86 | chr1 788050 788146 NR_047519#exon.4 255 + 87 | chr1 788050 788146 NR_047521#exon.3 255 + 88 | chr1 788770 794826 NR_047521#exon.4 255 + 89 | chr1 788770 788809 NR_047520#exon.5 255 + 90 | chr1 788770 788854 NR_015368#exon.5 255 + 91 | chr1 788770 794826 NR_047519#exon.5 255 + 92 | chr1 788770 794826 NR_047523#exon.3 255 + 93 | chr1 788770 794826 NR_047524#exon.3 255 + 94 | chr1 788770 794826 NR_047525#exon.4 255 + 95 | chr1 788770 788809 NR_047522#exon.4 255 + 96 | chr1 788858 794826 NR_047522#exon.5 255 + 97 
| chr1 788858 794826 NR_047520#exon.6 255 + 98 | chr1 788952 788956 NR_015368#exon.6 255 + 99 | chr1 789152 794826 NR_015368#exon.7 255 + 100 | chr1 803450 804055 NR_027055#exon.2 255 - 101 | chr1 809491 810535 NR_027055#exon.1 255 - 102 | chr1 812125 812182 NR_027055#exon.0 255 - 103 | chr1 852952 853100 NR_026874#exon.3 255 - 104 | chr1 853401 853555 NR_026874#exon.2 255 - 105 | chr1 854204 854295 NR_026874#exon.1 255 - 106 | chr1 854714 854817 NR_026874#exon.0 255 - 107 | chr1 861120 861180 NM_152486#exon.0 255 + 108 | chr1 861301 861393 NM_152486#exon.1 255 + 109 | chr1 865534 865716 NM_152486#exon.2 255 + 110 | chr1 866418 866469 NM_152486#exon.3 255 + 111 | chr1 871151 871276 NM_152486#exon.4 255 + 112 | chr1 874419 874509 NM_152486#exon.5 255 + 113 | chr1 874654 874840 NM_152486#exon.6 255 + 114 | chr1 876523 876686 NM_152486#exon.7 255 + 115 | chr1 877515 877631 NM_152486#exon.8 255 + 116 | chr1 877789 877868 NM_152486#exon.9 255 + 117 | chr1 877938 878438 NM_152486#exon.10 255 + 118 | chr1 878632 878757 NM_152486#exon.11 255 + 119 | chr1 879077 879188 NM_152486#exon.12 255 + 120 | chr1 879287 879961 NM_152486#exon.13 255 + 121 | chr1 879582 880180 NM_015658#exon.18 255 - 122 | chr1 880436 880526 NM_015658#exon.17 255 - 123 | chr1 880897 881033 NM_015658#exon.16 255 - 124 | chr1 881552 881666 NM_015658#exon.15 255 - 125 | chr1 881781 881925 NM_015658#exon.14 255 - 126 | chr1 883510 883612 NM_015658#exon.13 255 - 127 | chr1 883869 883983 NM_015658#exon.12 255 - 128 | chr1 886506 886618 NM_015658#exon.11 255 - 129 | chr1 887379 887519 NM_015658#exon.10 255 - 130 | chr1 887791 887980 NM_015658#exon.9 255 - 131 | chr1 888554 888668 NM_015658#exon.8 255 - 132 | chr1 889161 889272 NM_015658#exon.7 255 - 133 | chr1 889383 889462 NM_015658#exon.6 255 - 134 | chr1 891302 891393 NM_015658#exon.5 255 - 135 | chr1 891474 891595 NM_015658#exon.4 255 - 136 | chr1 892273 892405 NM_015658#exon.3 255 - 137 | chr1 892478 892653 NM_015658#exon.2 255 - 138 | chr1 894308 894461 
NM_015658#exon.1 255 - 139 | chr1 894594 894679 NM_015658#exon.0 255 - 140 | chr1 895966 896180 NM_198317#exon.0 255 + 141 | chr1 896672 896932 NM_198317#exon.1 255 + 142 | chr1 897008 897130 NM_198317#exon.2 255 + 143 | chr1 897205 897427 NM_198317#exon.3 255 + 144 | chr1 897734 897851 NM_198317#exon.4 255 + 145 | chr1 898083 898297 NM_198317#exon.5 255 + 146 | chr1 898488 898633 NM_198317#exon.6 255 + 147 | chr1 898716 898884 NM_198317#exon.7 255 + 148 | chr1 899299 899388 NM_198317#exon.8 255 + 149 | chr1 899486 899560 NM_198317#exon.9 255 + 150 | chr1 899728 899910 NM_198317#exon.10 255 + 151 | chr1 900342 901099 NM_198317#exon.11 255 + 152 | chr1 901876 901994 NM_001160184#exon.0 255 + 153 | chr1 901876 901994 NM_032129#exon.0 255 + 154 | chr1 902083 902183 NM_001160184#exon.1 255 + 155 | chr1 902083 902183 NM_032129#exon.1 255 + 156 | chr1 905656 905803 NM_001160184#exon.2 255 + 157 | chr1 905656 905803 NM_032129#exon.2 255 + 158 | chr1 905900 905981 NM_001160184#exon.3 255 + 159 | chr1 905900 905981 NM_032129#exon.3 255 + 160 | chr1 906065 906138 NM_001160184#exon.4 255 + 161 | chr1 906065 906138 NM_032129#exon.4 255 + 162 | chr1 906258 906386 NM_001160184#exon.5 255 + 163 | chr1 906258 906386 NM_032129#exon.5 255 + 164 | chr1 906456 906588 NM_001160184#exon.6 255 + 165 | chr1 906492 906588 NM_032129#exon.6 255 + 166 | chr1 906703 906784 NM_001160184#exon.7 255 + 167 | chr1 906703 906784 NM_032129#exon.7 255 + 168 | chr1 907454 907530 NM_001160184#exon.8 255 + 169 | chr1 907454 907530 NM_032129#exon.8 255 + 170 | chr1 907667 907804 NM_032129#exon.9 255 + 171 | chr1 907667 907804 NM_001160184#exon.9 255 + 172 | chr1 908240 908390 NM_032129#exon.10 255 + 173 | chr1 908240 908390 NM_001160184#exon.10 255 + 174 | chr1 908565 908706 NM_032129#exon.11 255 + 175 | chr1 908879 909020 NM_001160184#exon.11 255 + 176 | chr1 908879 909020 NM_032129#exon.12 255 + 177 | chr1 909212 909431 NM_032129#exon.13 255 + 178 | chr1 909212 909431 NM_001160184#exon.12 255 + 179 | 
chr1 909695 909744 NM_032129#exon.14 255 + 180 | chr1 909695 909744 NM_001160184#exon.13 255 + 181 | chr1 909821 910484 NM_032129#exon.15 255 + 182 | chr1 909821 910484 NM_001160184#exon.14 255 + 183 | chr1 910578 911649 NR_027693#exon.4 255 - 184 | chr1 911878 912004 NR_027693#exon.3 255 - 185 | chr1 914260 916037 NR_027693#exon.2 255 - 186 | chr1 916516 916553 NR_027693#exon.1 255 - 187 | chr1 917444 917473 NR_027693#exon.0 255 - 188 | chr1 934341 934812 NM_021170#exon.3 255 - 189 | chr1 934343 934812 NM_001142467#exon.2 255 - 190 | chr1 934905 934993 NM_021170#exon.2 255 - 191 | chr1 934905 934993 NM_001142467#exon.1 255 - 192 | chr1 935071 935167 NM_021170#exon.1 255 - 193 | chr1 935071 935552 NM_001142467#exon.0 255 - 194 | chr1 935245 935552 NM_021170#exon.0 255 - 195 | chr1 948846 948956 NM_005101#exon.0 255 + 196 | chr1 949363 949919 NM_005101#exon.1 255 + 197 | chr1 955502 955753 NM_198576#exon.0 255 + 198 | chr1 957580 957842 NM_198576#exon.1 255 + 199 | chr1 970656 970704 NM_198576#exon.2 255 + 200 | chr1 976044 976260 NM_198576#exon.3 255 + 201 | chr1 976552 976777 NM_198576#exon.4 255 + 202 | chr1 976857 977082 NM_198576#exon.5 255 + 203 | chr1 977335 977542 NM_198576#exon.6 255 + 204 | chr1 978618 978837 NM_198576#exon.7 255 + 205 | chr1 978917 979112 NM_198576#exon.8 255 + 206 | chr1 979202 979403 NM_198576#exon.9 255 + 207 | chr1 979488 979637 NM_198576#exon.10 255 + 208 | chr1 979713 979819 NM_198576#exon.11 255 + 209 | chr1 980540 980657 NM_198576#exon.12 255 + 210 | chr1 980738 980903 NM_198576#exon.13 255 + 211 | chr1 981112 981256 NM_198576#exon.14 255 + 212 | chr1 981343 981468 NM_198576#exon.15 255 + 213 | chr1 981539 981645 NM_198576#exon.16 255 + 214 | chr1 981776 982115 NM_198576#exon.17 255 + 215 | chr1 982199 982337 NM_198576#exon.18 255 + 216 | chr1 982706 982834 NM_198576#exon.19 255 + 217 | chr1 982952 983067 NM_198576#exon.20 255 + 218 | chr1 983155 983275 NM_198576#exon.21 255 + 219 | chr1 983391 983745 NM_198576#exon.22 255 + 220 | 
chr1 984246 984439 NM_198576#exon.23 255 + 221 | chr1 984615 984831 NM_198576#exon.24 255 + 222 | chr1 984945 985175 NM_198576#exon.25 255 + 223 | chr1 985282 985417 NM_198576#exon.26 255 + 224 | chr1 985612 985709 NM_198576#exon.27 255 + 225 | chr1 985806 985971 NM_198576#exon.28 255 + 226 | chr1 986105 986217 NM_198576#exon.29 255 + 227 | chr1 986632 986749 NM_198576#exon.30 255 + 228 | chr1 986832 987025 NM_198576#exon.31 255 + 229 | chr1 987107 987195 NM_198576#exon.32 255 + 230 | chr1 989132 989357 NM_198576#exon.33 255 + 231 | chr1 989827 989931 NM_198576#exon.34 255 + 232 | chr1 990203 991499 NM_198576#exon.35 255 + 233 | chr2 38813 41627 NM_001077710#exon.1 255 - 234 | chr2 45439 46588 NM_001077710#exon.0 255 - 235 | chr2 218135 219001 NM_001159597#exon.8 255 - 236 | chr2 218135 219001 NM_015677#exon.9 255 - 237 | chr2 224863 224920 NM_015677#exon.8 255 - 238 | chr2 229965 230044 NM_001159597#exon.7 255 - 239 | chr2 229965 230044 NM_015677#exon.7 255 - 240 | chr2 231022 231191 NM_001159597#exon.6 255 - 241 | chr2 231022 231191 NM_015677#exon.6 255 - 242 | chr2 233100 233229 NM_015677#exon.5 255 - 243 | chr2 233100 233229 NM_001159597#exon.5 255 - 244 | chr2 234159 234272 NM_015677#exon.4 255 - 245 | chr2 234159 234272 NM_001159597#exon.4 255 - 246 | chr2 247537 247602 NM_015677#exon.3 255 - 247 | chr2 247537 247602 NM_001159597#exon.3 255 - 248 | chr2 249730 249844 NM_001159597#exon.2 255 - 249 | chr2 249730 249844 NM_015677#exon.2 255 - 250 | chr2 253004 253115 NM_001159597#exon.1 255 - 251 | chr2 253004 253115 NM_015677#exon.1 255 - 252 | chr2 263983 264068 NM_001159597#exon.0 255 - 253 | chr2 263983 264068 NM_015677#exon.0 255 - 254 | chr2 264868 265007 NM_001040649#exon.0 255 + 255 | chr2 264868 265007 NM_007099#exon.0 255 + 256 | chr2 264868 265007 NR_024080#exon.0 255 + 257 | chr2 264868 265007 NM_004300#exon.0 255 + 258 | chr2 271865 271939 NM_001040649#exon.1 255 + 259 | chr2 271865 271939 NM_004300#exon.1 255 + 260 | chr2 271865 271939 
NM_007099#exon.1 255 + 261 | chr2 271865 271939 NR_024080#exon.1 255 + 262 | chr2 272036 272065 NR_024080#exon.2 255 + 263 | chr2 272036 272150 NM_004300#exon.2 255 + 264 | chr2 272036 272481 NM_001040649#exon.2 255 + 265 | chr2 272191 272305 NM_007099#exon.2 255 + 266 | chr2 272191 272305 NR_024080#exon.3 255 + 267 | chr2 275139 275201 NM_007099#exon.3 255 + 268 | chr2 275139 275201 NR_024080#exon.4 255 + 269 | chr2 275139 275201 NM_004300#exon.3 255 + 270 | chr2 276979 277085 NM_007099#exon.4 255 + 271 | chr2 276979 277085 NR_024080#exon.5 255 + 272 | chr2 276979 277085 NM_004300#exon.4 255 + 273 | chr2 277226 278282 NM_007099#exon.5 255 + 274 | chr2 277226 278282 NR_024080#exon.6 255 + 275 | chr2 277226 278282 NM_004300#exon.5 255 + 276 | chr2 279560 280152 NM_001002919#exon.5 255 - 277 | chr2 283110 283175 NM_001002919#exon.4 255 - 278 | chr2 286122 286203 NM_001002919#exon.3 255 - 279 | chr2 286289 286343 NM_001002919#exon.2 255 - 280 | chr2 287582 287892 NM_001002919#exon.1 255 - 281 | chr2 288012 288308 NM_001002919#exon.0 255 - 282 | chr2 667972 669675 NM_152834#exon.4 255 - 283 | chr2 669756 669850 NM_152834#exon.3 255 - 284 | chr2 672807 672862 NM_152834#exon.2 255 - 285 | chr2 675509 675630 NM_152834#exon.1 255 - 286 | chr2 677288 677439 NM_152834#exon.0 255 - 287 | chr2 779836 780588 NR_033880#exon.3 255 - 288 | chr2 806036 806257 NR_033880#exon.2 255 - 289 | chr2 849657 849896 NR_033880#exon.1 255 - 290 | chr2 863803 864112 NR_033880#exon.0 255 - 291 | chr2 946553 946754 NM_018968#exon.0 255 + 292 | chr3 238278 238746 NM_006614#exon.0 255 + 293 | chr3 238278 238746 NM_001253387#exon.0 255 + 294 | chr3 239325 239775 NR_045572#exon.0 255 + 295 | chr3 286295 286375 NR_045572#exon.1 255 + 296 | chr3 286295 286375 NM_006614#exon.1 255 + 297 | chr3 286295 286375 NM_001253387#exon.1 255 + 298 | chr3 288250 290282 NR_045572#exon.2 255 + 299 | chr3 361365 361550 NM_006614#exon.2 255 + 300 | chr3 361365 361550 NM_001253388#exon.0 255 + 301 | chr3 361365 361550 
NM_001253387#exon.2 255 + 302 | chr3 367641 367747 NM_001253388#exon.1 255 + 303 | chr3 367641 367747 NM_001253387#exon.3 255 + 304 | chr3 367641 367747 NM_006614#exon.3 255 + 305 | chr3 369849 370037 NM_001253387#exon.4 255 + 306 | chr3 369849 370037 NM_006614#exon.4 255 + 307 | chr3 369849 370037 NM_001253388#exon.2 255 + 308 | chr3 382476 382599 NM_001253387#exon.5 255 + 309 | chr3 382476 382599 NM_006614#exon.5 255 + 310 | chr3 382476 382599 NM_001253388#exon.3 255 + 311 | chr3 383594 383765 NM_001253387#exon.6 255 + 312 | chr3 383594 383765 NM_006614#exon.6 255 + 313 | chr3 383594 383765 NM_001253388#exon.4 255 + 314 | chr3 384666 384714 NM_006614#exon.7 255 + 315 | chr3 384666 384714 NM_001253388#exon.5 255 + 316 | chr3 386271 386392 NM_001253387#exon.7 255 + 317 | chr3 386271 386392 NM_001253388#exon.6 255 + 318 | chr3 386271 386392 NM_006614#exon.8 255 + 319 | chr3 391041 391226 NM_001253388#exon.7 255 + 320 | chr3 391041 391226 NM_006614#exon.9 255 + 321 | chr3 391041 391226 NM_001253387#exon.8 255 + 322 | chr3 396322 396454 NM_001253388#exon.8 255 + 323 | chr3 396322 396454 NM_006614#exon.10 255 + 324 | chr3 396322 396454 NM_001253387#exon.9 255 + 325 | chr3 401966 402107 NM_001253387#exon.10 255 + 326 | chr3 401966 402107 NM_001253388#exon.9 255 + 327 | chr3 401966 402107 NM_006614#exon.11 255 + 328 | chr3 403381 403493 NM_001253387#exon.11 255 + 329 | chr3 403381 403493 NM_001253388#exon.10 255 + 330 | chr3 403381 403493 NM_006614#exon.12 255 + 331 | chr3 404899 405066 NM_001253387#exon.12 255 + 332 | chr3 404899 405066 NM_001253388#exon.11 255 + 333 | chr3 404899 405066 NM_006614#exon.13 255 + 334 | chr3 407632 407798 NM_006614#exon.14 255 + 335 | chr3 407632 407798 NM_001253387#exon.13 255 + 336 | chr3 407632 407798 NM_001253388#exon.12 255 + 337 | chr3 419500 419625 NM_006614#exon.15 255 + 338 | chr3 419500 419625 NM_001253387#exon.14 255 + 339 | chr3 419500 419625 NM_001253388#exon.13 255 + 340 | chr3 423861 423963 NM_001253388#exon.14 255 + 341 | 
chr3 423861 423963 NM_006614#exon.16 255 + 342 | chr3 423861 423963 NM_001253387#exon.15 255 + 343 | chr3 424156 424354 NM_001253388#exon.15 255 + 344 | chr3 424156 424354 NM_006614#exon.17 255 + 345 | chr3 424156 424354 NM_001253387#exon.16 255 + 346 | chr3 425498 425569 NM_001253388#exon.16 255 + 347 | chr3 425498 425569 NM_001253387#exon.17 255 + 348 | chr3 425498 425569 NM_006614#exon.18 255 + 349 | chr3 430934 431157 NM_001253388#exon.17 255 + 350 | chr3 430934 431157 NM_001253387#exon.18 255 + 351 | chr3 430934 431157 NM_006614#exon.19 255 + 352 | chr3 432383 432499 NM_001253388#exon.18 255 + 353 | chr3 432383 432499 NM_001253387#exon.19 255 + 354 | chr3 432383 432499 NM_006614#exon.20 255 + 355 | chr3 432637 432842 NM_006614#exon.21 255 + 356 | chr3 432637 432842 NM_001253388#exon.19 255 + 357 | chr3 432637 432842 NM_001253387#exon.20 255 + 358 | chr3 433357 433480 NM_006614#exon.22 255 + 359 | chr3 433357 433480 NM_001253388#exon.20 255 + 360 | chr3 433357 433480 NM_001253387#exon.21 255 + 361 | chr3 436375 436555 NM_006614#exon.23 255 + 362 | chr3 436375 436555 NM_001253388#exon.21 255 + 363 | chr3 436375 436555 NM_001253387#exon.22 255 + 364 | chr3 439909 440068 NM_006614#exon.24 255 + 365 | chr3 439909 440068 NM_001253387#exon.23 255 + 366 | chr3 440699 440831 NM_006614#exon.25 255 + 367 | chr3 440699 440831 NM_001253387#exon.24 255 + 368 | chr3 440699 440831 NM_001253388#exon.22 255 + 369 | chr3 443308 443381 NM_006614#exon.26 255 + 370 | chr3 443308 443381 NM_001253388#exon.23 255 + 371 | chr3 443308 443381 NM_001253387#exon.25 255 + 372 | chr3 447177 451097 NM_001253387#exon.26 255 + 373 | chr3 447177 451097 NM_006614#exon.27 255 + 374 | chr3 447177 451097 NM_001253388#exon.24 255 + 375 | chr4 53226 53385 NM_182524#exon.0 255 + 376 | chr4 53276 53385 NM_001039127#exon.0 255 + 377 | chr4 59322 59449 NM_182524#exon.1 255 + 378 | chr4 59322 59449 NM_001039127#exon.1 255 + 379 | chr4 59950 60046 NM_182524#exon.2 255 + 380 | chr4 59950 60046 
NM_001039127#exon.2 255 + 381 | chr4 85621 86032 NM_182524#exon.3 255 + 382 | chr4 86033 86035 NM_182524#exon.4 255 + 383 | chr4 86036 88099 NM_182524#exon.5 255 + 384 | chr4 154701 156490 NM_001039127#exon.3 255 + 385 | chr4 206388 206570 NR_027481#exon.0 255 + 386 | chr4 247348 249773 NR_027481#exon.1 255 + 387 | chr4 264463 266419 NM_001137608#exon.2 255 - 388 | chr4 289226 289322 NM_001137608#exon.1 255 - 389 | chr4 289817 289944 NM_001137608#exon.0 255 - 390 | chr4 331595 331775 NM_003441#exon.0 255 + 391 | chr4 337570 337697 NM_003441#exon.1 255 + 392 | chr4 338123 338219 NM_003441#exon.2 255 + 393 | chr4 366452 367691 NM_003441#exon.3 255 + 394 | chr4 419223 420762 NR_002451#exon.2 255 - 395 | chr4 433778 438221 NM_133474#exon.2 255 - 396 | chr4 466363 466490 NR_002451#exon.1 255 - 397 | chr4 466363 466490 NM_133474#exon.1 255 - 398 | chr4 467769 467998 NR_002451#exon.0 255 - 399 | chr4 492844 493442 NM_133474#exon.0 255 - 400 | chr4 492988 493278 NM_017733#exon.0 255 + 401 | chr4 492988 493278 NM_001127178#exon.0 255 + 402 | chr4 494184 494390 NM_001127178#exon.1 255 + 403 | chr4 494184 494390 NM_017733#exon.1 255 + 404 | chr4 499506 499716 NM_001127178#exon.2 255 + 405 | chr4 499506 499716 NM_017733#exon.2 255 + 406 | chr4 501193 501382 NM_001127178#exon.3 255 + 407 | chr4 501193 501382 NM_017733#exon.3 255 + 408 | chr4 502617 502759 NM_001127178#exon.4 255 + 409 | chr4 502617 502759 NM_017733#exon.4 255 + 410 | chr4 509761 509974 NM_017733#exon.5 255 + 411 | chr4 509761 509974 NM_001127178#exon.5 255 + 412 | chr4 514844 515038 NM_017733#exon.6 255 + 413 | chr4 514844 515062 NM_001127178#exon.6 255 + 414 | chr4 515448 515730 NM_017733#exon.7 255 + 415 | chr4 515448 515730 NM_001127178#exon.7 255 + 416 | chr4 517247 517702 NM_001127178#exon.8 255 + 417 | chr4 517247 517702 NM_017733#exon.8 255 + 418 | chr4 520827 521019 NM_017733#exon.9 255 + 419 | chr4 520827 521019 NM_001127178#exon.9 255 + 420 | chr4 524224 524534 NM_017733#exon.10 255 + 421 | chr4 
524224 524534 NM_001127178#exon.10 255 + 422 | chr4 527606 527770 NM_017733#exon.11 255 + 423 | chr4 527606 527770 NM_001127178#exon.11 255 + 424 | chr4 532941 533320 NM_017733#exon.12 255 + 425 | chr4 532941 533320 NM_001127178#exon.12 255 + 426 | chr4 619362 619883 NM_001145291#exon.0 255 + 427 | chr4 619362 619883 NM_000283#exon.0 255 + 428 | chr4 628465 628618 NM_001145291#exon.1 255 + 429 | chr4 628465 628618 NM_000283#exon.1 255 + 430 | chr4 629668 629758 NM_001145291#exon.2 255 + 431 | chr4 629668 629758 NM_000283#exon.2 255 + 432 | chr4 646965 647108 NM_001145292#exon.0 255 + 433 | chr4 647640 647781 NM_001145292#exon.1 255 + 434 | chr4 647640 647781 NM_001145291#exon.3 255 + 435 | chr4 647640 647781 NM_000283#exon.3 255 + 436 | chr4 647868 647943 NM_001145291#exon.4 255 + 437 | chr4 647868 647943 NM_001145292#exon.2 255 + 438 | chr4 647868 647943 NM_000283#exon.4 255 + 439 | chr4 648612 648677 NM_001145291#exon.5 255 + 440 | chr4 648612 648677 NM_001145292#exon.3 255 + 441 | chr4 648612 648677 NM_000283#exon.5 255 + 442 | chr4 649728 649795 NM_001145292#exon.4 255 + 443 | chr4 649728 649795 NM_001145291#exon.6 255 + 444 | chr4 649728 649795 NM_000283#exon.6 255 + 445 | chr4 650033 650081 NM_001145291#exon.7 255 + 446 | chr4 650033 650081 NM_001145292#exon.5 255 + 447 | chr4 650033 650081 NM_000283#exon.7 255 + 448 | chr4 650662 650812 NM_000283#exon.8 255 + 449 | chr4 650662 650812 NM_001145291#exon.8 255 + 450 | chr4 650662 650812 NM_001145292#exon.6 255 + 451 | chr4 651139 651283 NM_001145291#exon.9 255 + 452 | chr4 651139 651283 NM_000283#exon.9 255 + 453 | chr4 651139 651283 NM_001145292#exon.7 255 + 454 | chr4 652740 652806 NM_000283#exon.10 255 + 455 | chr4 652740 652806 NM_001145291#exon.10 255 + 456 | chr4 652740 652806 NM_001145292#exon.8 255 + 457 | chr4 654255 654402 NM_000283#exon.11 255 + 458 | chr4 654255 654402 NM_001145292#exon.9 255 + 459 | chr4 654255 654402 NM_001145291#exon.11 255 + 460 | chr4 655922 656030 NM_000283#exon.12 255 + 461 | 
chr4 655922 656030 NM_001145291#exon.12 255 + 462 | chr4 655922 656030 NM_001145292#exon.10 255 + 463 | chr4 656297 656407 NM_001145291#exon.13 255 + 464 | chr4 656297 656407 NM_001145292#exon.11 255 + 465 | chr4 656297 656407 NM_000283#exon.13 255 + 466 | chr4 656888 656976 NM_001145291#exon.14 255 + 467 | chr4 656888 656976 NM_001145292#exon.12 255 + 468 | chr4 656888 656976 NM_000283#exon.14 255 + 469 | chr4 657558 657659 NM_001145291#exon.15 255 + 470 | chr4 657558 657659 NM_001145292#exon.13 255 + 471 | chr4 657558 657659 NM_000283#exon.15 255 + 472 | chr4 657902 658010 NM_001145292#exon.14 255 + 473 | chr4 657902 658010 NM_001145291#exon.16 255 + 474 | chr4 657902 658010 NM_000283#exon.16 255 + 475 | chr4 658669 658733 NM_001145292#exon.15 255 + 476 | chr4 658669 658733 NM_001145291#exon.17 255 + 477 | chr4 658669 658733 NM_000283#exon.17 255 + 478 | chr4 659043 659118 NM_001145291#exon.18 255 + 479 | chr4 659043 659118 NM_000283#exon.18 255 + 480 | chr4 659043 659118 NM_001145292#exon.16 255 + 481 | chr4 660319 660403 NM_001145291#exon.19 255 + 482 | chr4 660319 660403 NM_000283#exon.19 255 + 483 | chr4 660319 660403 NM_001145292#exon.17 255 + 484 | chr4 661644 661795 NM_000283#exon.20 255 + 485 | chr4 661644 661795 NM_001145291#exon.20 255 + 486 | chr4 661644 661795 NM_001145292#exon.18 255 + 487 | chr4 663834 664681 NM_000283#exon.21 255 + 488 | chr4 663834 664681 NM_001145292#exon.19 255 + 489 | chr4 663837 664681 NM_001145291#exon.21 255 + 490 | chr4 666224 666308 NM_007100#exon.3 255 - 491 | chr4 666224 666308 NR_033743#exon.2 255 - 492 | chr4 667091 667190 NR_033743#exon.1 255 - 493 | chr4 667091 667190 NM_007100#exon.2 255 - 494 | chr4 667700 667755 NM_007100#exon.1 255 - 495 | chr4 668000 668127 NR_033743#exon.0 255 - 496 | chr4 668000 668127 NM_007100#exon.0 255 - 497 | chr4 671710 671818 NM_002477#exon.0 255 + 498 | chr4 672446 672554 NM_002477#exon.1 255 + 499 | chr4 672746 672822 NM_002477#exon.2 255 + 500 | chr4 673702 673807 NM_002477#exon.3 
255 + 501 | chr4 674297 674376 NM_002477#exon.4 255 + 502 | chr4 674880 674929 NM_002477#exon.5 255 + 503 | chr4 675617 676165 NM_032219#exon.9 255 - 504 | chr4 675681 675817 NM_002477#exon.6 255 + 505 | chr4 676569 676679 NM_032219#exon.8 255 - 506 | chr4 676998 677156 NM_032219#exon.7 255 - 507 | chr4 677397 677550 NM_032219#exon.6 255 - 508 | chr4 678271 678388 NM_032219#exon.5 255 - 509 | chr4 678507 678645 NM_032219#exon.4 255 - 510 | chr4 679623 679700 NM_032219#exon.3 255 - 511 | chr4 679877 680091 NM_032219#exon.2 255 - 512 | chr4 680320 680479 NM_032219#exon.1 255 - 513 | chr4 682781 682973 NM_032219#exon.0 255 - 514 | chr4 699572 699759 NM_006315#exon.0 255 + 515 | chr4 724418 724457 NM_006315#exon.1 255 + 516 | chr4 724758 724899 NM_006315#exon.2 255 + 517 | chr4 727460 727578 NM_006315#exon.3 255 + 518 | chr4 728719 728816 NM_006315#exon.4 255 + 519 | chr4 731254 731310 NM_006315#exon.5 255 + 520 | chr4 737261 737372 NM_006315#exon.6 255 + 521 | chr4 738387 738476 NM_006315#exon.7 255 + 522 | chr4 755066 755204 NM_006315#exon.8 255 + 523 | chr4 758771 758852 NM_006315#exon.9 255 + 524 | chr4 759819 764427 NM_006315#exon.10 255 + 525 | chr4 773936 775636 NR_036511#exon.0 255 - 526 | chr4 773936 774476 NR_036512#exon.1 255 - 527 | chr4 775559 775636 NR_036512#exon.0 255 - 528 | chr4 778744 780486 NM_006651#exon.3 255 - 529 | chr4 786220 786396 NM_006651#exon.2 255 - 530 | chr4 818279 818389 NM_006651#exon.1 255 - 531 | chr4 819833 819945 NM_006651#exon.0 255 - 532 | chr4 843064 843562 NM_005255#exon.27 255 - 533 | chr4 843679 843856 NM_005255#exon.26 255 - 534 | chr4 844723 844872 NM_005255#exon.25 255 - 535 | chr4 845537 845762 NM_005255#exon.24 255 - 536 | chr4 853393 853510 NM_005255#exon.23 255 - 537 | chr4 858909 859032 NM_005255#exon.22 255 - 538 | chr4 860151 860322 NM_005255#exon.21 255 - 539 | chr4 860743 861220 NM_005255#exon.20 255 - 540 | chr4 862326 862473 NM_005255#exon.19 255 - 541 | chr4 864498 864692 NM_005255#exon.18 255 - 542 | chr4 
870317 870397 NM_005255#exon.17 255 - 543 | chr4 870877 870995 NM_005255#exon.16 255 - 544 | chr4 871402 871597 NM_005255#exon.15 255 - 545 | chr4 875694 875828 NM_005255#exon.14 255 - 546 | chr4 876484 876607 NM_005255#exon.13 255 - 547 | chr4 877102 877251 NM_005255#exon.12 255 - 548 | chr4 877824 877874 NM_005255#exon.11 255 - 549 | chr4 882634 882758 NM_005255#exon.10 255 - 550 | chr4 884319 884410 NM_005255#exon.9 255 - 551 | chr4 887164 887277 NM_005255#exon.8 255 - 552 | chr4 887661 887797 NM_005255#exon.7 255 - 553 | chr4 890247 890337 NM_005255#exon.6 255 - 554 | chr4 891820 891946 NM_005255#exon.5 255 - 555 | chr4 898424 898567 NM_005255#exon.4 255 - 556 | chr4 905460 905575 NM_005255#exon.3 255 - 557 | chr4 906522 906582 NM_005255#exon.2 255 - 558 | chr4 907394 907456 NM_005255#exon.1 255 - 559 | chr4 925830 926174 NM_005255#exon.0 255 - 560 | chr4 926261 926328 NM_032326#exon.0 255 + 561 | chr4 941496 941680 NM_032326#exon.1 255 + 562 | chr4 941903 941942 NM_032326#exon.2 255 + 563 | chr4 944208 944306 NM_032326#exon.3 255 + 564 | chr4 944994 945046 NM_032326#exon.4 255 + 565 | chr4 945469 945505 NM_032326#exon.5 255 + 566 | chr4 946154 946238 NM_032326#exon.6 255 + 567 | chr4 946977 947142 NM_032326#exon.7 255 + 568 | chr4 949192 949271 NM_032326#exon.8 255 + 569 | chr4 949542 949678 NM_032326#exon.9 255 + 570 | chr4 951611 952443 NM_032326#exon.10 255 + 571 | chr4 952671 954509 NM_001347#exon.22 255 - 572 | chr4 954836 954989 NM_001347#exon.21 255 - 573 | chr4 955254 955366 NM_001347#exon.20 255 - 574 | chr4 955475 955622 NM_001347#exon.19 255 - 575 | chr4 955769 955870 NM_001347#exon.18 255 - 576 | chr4 956222 956401 NM_001347#exon.17 255 - 577 | chr4 956559 956708 NM_001347#exon.16 255 - 578 | chr4 956926 957078 NM_001347#exon.15 255 - 579 | chr4 958963 959079 NM_001347#exon.14 255 - 580 | chr4 959278 959317 NM_001347#exon.13 255 - 581 | chr4 959715 959866 NM_001347#exon.12 255 - 582 | chr4 960253 960315 NM_001347#exon.11 255 - 583 | chr4 960535 
960590 NM_001347#exon.10 255 - 584 | chr4 960751 960842 NM_001347#exon.9 255 - 585 | chr4 960916 961149 NM_001347#exon.8 255 - 586 | chr4 961336 961437 NM_001347#exon.7 255 - 587 | chr4 961515 961590 NM_001347#exon.6 255 - 588 | chr4 961667 961815 NM_001347#exon.5 255 - 589 | chr4 962069 962195 NM_001347#exon.4 255 - 590 | chr4 962266 962352 NM_001347#exon.3 255 - 591 | chr4 962598 962698 NM_001347#exon.2 255 - 592 | chr4 964780 964860 NM_001347#exon.1 255 - 593 | chr4 966999 967348 NM_001347#exon.0 255 - 594 | chr4 972862 973292 NM_134425#exon.2 255 - 595 | chr4 980784 981030 NM_000203#exon.0 255 + 596 | chr4 981444 984150 NM_022042#exon.2 255 - 597 | chr4 981444 984150 NM_213613#exon.3 255 - 598 | chr4 981596 981737 NM_000203#exon.1 255 + 599 | chr4 984915 985518 NM_213613#exon.2 255 - 600 | chr4 984915 985518 NM_022042#exon.1 255 - 601 | chr4 984915 985518 NM_134425#exon.1 255 - 602 | chr4 986508 986723 NM_213613#exon.1 255 - 603 | chr4 987088 987183 NM_022042#exon.0 255 - 604 | chr4 987088 987224 NM_213613#exon.0 255 - 605 | chr4 987088 987183 NM_134425#exon.0 255 - 606 | chr4 994399 994485 NM_000203#exon.2 255 + 607 | chr4 994669 994777 NM_000203#exon.3 255 + 608 | chr4 995255 995351 NM_000203#exon.4 255 + 609 | chr4 995466 995669 NM_000203#exon.5 255 + 610 | chr4 995769 995949 NM_000203#exon.6 255 + 611 | chr4 996056 996273 NM_000203#exon.7 255 + 612 | chr4 996519 996732 NM_000203#exon.8 255 + 613 | chr4 996823 996945 NM_000203#exon.9 255 + 614 | chr4 997132 997258 NM_000203#exon.10 255 + 615 | chr4 997336 997413 NM_000203#exon.11 255 + 616 | chr4 997799 997900 NM_000203#exon.12 255 + 617 | chr4 998047 998317 NM_000203#exon.13 255 + 618 | -------------------------------------------------------------------------------- /test_data/test_input.txt: -------------------------------------------------------------------------------- 1 | sample_name /home/priesgo/src/github/tronflow-mutect2/test_data/SRR8244887.preprocessed.downsampled.bam 
/home/priesgo/src/github/tronflow-mutect2/test_data/SRR8244836.preprocessed.downsampled.bam 2 | -------------------------------------------------------------------------------- /test_data/test_input_with_replicates.txt: -------------------------------------------------------------------------------- 1 | sample_name_with_replicates /home/priesgo/src/github/tronflow-mutect2/test_data/TESTX_S1_L001.bam,/home/priesgo/src/github/tronflow-mutect2/test_data/TESTX_S1_L001.bam /home/priesgo/src/github/tronflow-mutect2/test_data/TESTX_S1_L002.bam,/home/priesgo/src/github/tronflow-mutect2/test_data/TESTX_S1_L002.bam 2 | -------------------------------------------------------------------------------- /test_data/ucsc.hg19.minimal.dict: -------------------------------------------------------------------------------- 1 | @HD VN:1.6 2 | @SQ SN:chr1 LN:1000000 M5:f6b4870ef0a68d56d0a063ec02e002dd UR:file:/home/priesgo/src/tronflow-mutect2/test_data/ucsc.hg19.minimal.fasta 3 | @SQ SN:chr2 LN:1000000 M5:733f916294fa3763451afe21fa840337 UR:file:/home/priesgo/src/tronflow-mutect2/test_data/ucsc.hg19.minimal.fasta 4 | @SQ SN:chr3 LN:1000000 M5:6eade235671c9124cb0318fa90bab90d UR:file:/home/priesgo/src/tronflow-mutect2/test_data/ucsc.hg19.minimal.fasta 5 | @SQ SN:chr4 LN:1000000 M5:f99f74284026740843aa921e0ef4241f UR:file:/home/priesgo/src/tronflow-mutect2/test_data/ucsc.hg19.minimal.fasta 6 | -------------------------------------------------------------------------------- /test_data/ucsc.hg19.minimal.fasta.fai: -------------------------------------------------------------------------------- 1 | chr1 1000000 6 60 61 2 | chr2 1000000 1016679 60 61 3 | chr3 1000000 2033352 60 61 4 | chr4 1000000 3050025 60 61 5 | -------------------------------------------------------------------------------- /tests/test_00.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | 4 | nextflow main.nf --help 
-------------------------------------------------------------------------------- /tests/test_01.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | 4 | source bin/assert.sh 5 | output=output/test1 6 | 7 | echo -e "sample_name\t"`pwd`"/test_data/SRR8244887.preprocessed.downsampled.bam\t"`pwd`"/test_data/SRR8244836.preprocessed.downsampled.bam" > test_data/test_input.txt 8 | nextflow main.nf -profile test,conda,ci --output $output --input_files test_data/test_input.txt 9 | 10 | test -s $output/sample_name/sample_name.mutect2.vcf || { echo "Missing output VCF file!"; exit 1; } -------------------------------------------------------------------------------- /tests/test_02.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | 4 | source bin/assert.sh 5 | output=output/test2 6 | 7 | nextflow main.nf -profile test,conda,ci --disable_common_germline_filter --output $output --input_files test_data/test_input.txt 8 | 9 | test -s $output/sample_name/sample_name.mutect2.vcf || { echo "Missing output VCF file!"; exit 1; } -------------------------------------------------------------------------------- /tests/test_03.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | 4 | source bin/assert.sh 5 | output=output/test3 6 | 7 | echo -e "sample_name_with_replicates\t"`pwd`"/test_data/SRR8244887.preprocessed.downsampled.bam,"`pwd`"/test_data/SRR8244887.preprocessed.downsampled.bam\t"`pwd`"/test_data/SRR8244836.preprocessed.downsampled.bam,"`pwd`"/test_data/SRR8244836.preprocessed.downsampled.bam" > test_data/test_input_with_replicates.txt 8 | nextflow main.nf -profile test,conda,ci --input_files test_data/test_input_with_replicates.txt --output $output 9 | 10 | test -s $output/sample_name_with_replicates/sample_name_with_replicates.mutect2.vcf || { echo "Missing output VCF file!"; exit 1; } 
-------------------------------------------------------------------------------- /tests/test_04.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | 4 | source bin/assert.sh 5 | output=output/test4 6 | 7 | nextflow main.nf -profile test,conda,ci --output $output --input_files test_data/test_input.txt --intervals false 8 | 9 | test -s $output/sample_name/sample_name.mutect2.vcf || { echo "Missing output VCF file!"; exit 1; } -------------------------------------------------------------------------------- /tests/test_05.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | ################################################################## 3 | # Error condition: RGSM tag is the same in tumor and normal #### 4 | ################################################################## 5 | 6 | source bin/assert.sh 7 | output=output/test5 8 | 9 | echo -e "sample_name\t"`pwd`"/test_data/SRR8244887.preprocessed.downsampled.bam\t"`pwd`"/test_data/SRR8244836.preprocessed.downsampled.bam" > test_data/test_input.txt 10 | 11 | { # try 12 | nextflow main.nf -profile test,conda,ci --output $output --input_files test_data/test_input.txt && 13 | assert_true false "Error condition not captured" 14 | } || { # catch 15 | assert_true true 16 | } -------------------------------------------------------------------------------- /tests/test_06.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | ############################################################################## 4 | # Error condition: RGSM tag is different between two tumor #### 5 | # samples or two normal samples #### 6 | ############################################################################## 7 | 8 | source bin/assert.sh 9 | output=output/test6 10 | 11 | echo -e 
"sample_name\t"`pwd`"/test_data/SRR8244887.preprocessed.downsampled.bam,"`pwd`"/test_data/SRR8244836.preprocessed.downsampled.bam\t"`pwd`"/test_data/SRR8244836.preprocessed.downsampled.bam" > test_data/test_input.txt 12 | { # try 13 | nextflow main.nf -profile test,conda,ci --output $output --input_files test_data/test_input.txt && 14 | assert_true false "Error condition not captured" 15 | } || { # catch 16 | assert_true true 17 | } 18 | 19 | echo -e "sample_name\t"`pwd`"/test_data/SRR8244887.preprocessed.downsampled.bam\t"`pwd`"/test_data/SRR8244887.preprocessed.downsampled.bam,"`pwd`"/test_data/SRR8244836.preprocessed.downsampled.bam" > test_data/test_input.txt 20 | { # try 21 | nextflow main.nf -profile test,conda,ci --output $output --input_files test_data/test_input.txt && 22 | assert_true false "Error condition not captured" 23 | } || { # catch 24 | assert_true true 25 | } -------------------------------------------------------------------------------- /tests/test_07.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Test Funcotator functionallity 4 | 5 | source bin/assert.sh 6 | output=output/test5 7 | 8 | echo -e "sample_name\t"`pwd`"/test_data/SRR8244887.preprocessed.downsampled.bam\t"`pwd`"/test_data/SRR8244836.preprocessed.downsampled.bam" > test_data/test_input.txt 9 | nextflow main.nf -profile test,conda,ci --output $output --input_files test_data/test_input.txt \ 10 | --reference_version_funcotator hg19 \ 11 | --funcotator /home/priesgo/funcotator/funcotator_dataSources.v1.7.20200521s 12 | 13 | test -s $output/sample_name/sample_name.mutect2.funcotated.maf || { echo "Missing output VCF file!"; exit 1; } -------------------------------------------------------------------------------- /tests/test_08.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | 4 | source bin/assert.sh 5 | output=output/test8 6 | 7 | echo -e 
"sample_name\t"`pwd`"/test_data/SRR8244887.preprocessed.downsampled.bam\t"`pwd`"/test_data/SRR8244836.preprocessed.downsampled.bam" > test_data/test_input.txt 8 | nextflow main.nf -profile test,conda,ci --output $output --input_files test_data/test_input.txt --enable_bam_output 9 | 10 | test -s $output/sample_name/sample_name.mutect2.vcf || { echo "Missing output VCF file!"; exit 1; } 11 | test -s $output/sample_name/sample_name.mutect2.assembled_haplotypes.bam || { echo "Missing output BAM file!"; exit 1; } 12 | test -s $output/sample_name/sample_name.mutect2.assembled_haplotypes.bai || { echo "Missing output BAI file!"; exit 1; } -------------------------------------------------------------------------------- /tests/test_09.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | 4 | source bin/assert.sh 5 | output=output/test9 6 | 7 | echo -e "sample_name\t"`pwd`"/test_data/SRR8244887.preprocessed.downsampled.bam\t"`pwd`"/test_data/SRR8244836.preprocessed.downsampled.bam" > test_data/test_input.txt 8 | nextflow main.nf -profile test,conda,ci --output $output --input_files test_data/test_input.txt --gnomad false --args_filter "--contamination-estimate 0.2" 9 | 10 | test -s $output/sample_name/sample_name.mutect2.vcf || { echo "Missing output VCF file!"; exit 1; } -------------------------------------------------------------------------------- /tests/test_10.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | 4 | source bin/assert.sh 5 | output=output/test10 6 | 7 | nextflow main.nf -profile test,conda,ci --output $output \ 8 | --input_name sample_name \ 9 | --input_tumor_bam `pwd`/test_data/SRR8244887.preprocessed.downsampled.bam \ 10 | --input_normal_bam `pwd`/test_data/SRR8244836.preprocessed.downsampled.bam 11 | 12 | test -s $output/sample_name/sample_name.mutect2.vcf || { echo "Missing output VCF file!"; exit 1; } 
--------------------------------------------------------------------------------