13 |
14 |
TIER 1 - SNV/INDEL
POLE/MMR deficiency
15 |
16 |
17 |
TIER 2 - SNV/INDEL
POLD1/MMR deficiency
18 |
19 |
20 |
sCNA
MUTYH/BER deficiency
21 |
22 |
23 |
24 |
TUMOR PURITY
MUTYH/BER deficiency
25 |
26 |
27 |
TUMOR PLOIDY
Aristolochic acid exposure
28 |
29 |
30 |
MSI STATUS
POLD1/MMR deficiency
31 |
32 |
33 |
DOMINANT SIGNATURE ETIOLOGY
Sequencing artefact
34 |
35 |
36 |
MUTATIONAL BURDEN
37 |
38 |
39 |
KATAEGIS EVENTS
40 |
41 |
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
1 | [build-system]
2 | requires = ["setuptools >= 61.0"]
3 | build-backend = "setuptools.build_meta"
4 |
5 |
6 | [project]
7 | name = "pcgr"
8 | version = "2.2.5" # versioned by bump2version
9 | description = "Personal Cancer Genome Reporter (PCGR) - variant interpretation for precision cancer medicine"
10 | authors = [
11 | {name = "Sigve Nakken", email = "sigven@gmail.com"},
12 | ]
13 | maintainers = [
14 | {name = "Sigve Nakken", email = "sigven@gmail.com"},
15 | {name = "Peter Diakumis", email = "peterdiakumis@gmail.com"},
16 | ]
17 | readme = "README.md"
18 | license = {file = "LICENSE"}
19 | keywords = ["cancer", "genomics", "pcgr"]
20 | classifiers = [
21 | "License :: OSI Approved :: MIT License",
22 | "Development Status :: 5 - Production/Stable",
23 | "Intended Audience :: Science/Research",
24 | "Operating System :: MacOS :: MacOS X",
25 | "Operating System :: POSIX",
26 | "Operating System :: Unix",
27 | "Programming Language :: Python :: 3",
28 | "Programming Language :: R",
29 | "Topic :: Scientific/Engineering :: Bio-Informatics",
30 | ]
31 |
32 |
33 | [project.urls]
34 | Homepage = "https://sigven.github.io/pcgr/"
35 | Documentation = "https://sigven.github.io/pcgr/"
36 | Repository = "https://github.com/sigven/pcgr"
37 | Changelog = "https://sigven.github.io/pcgr/articles/CHANGELOG.html"
38 |
39 |
40 | [project.scripts]
41 | pcgr = "pcgr.main:cli"
42 | cpsr = "pcgr.cpsr:main"
43 |
44 | [tool.setuptools.packages]
45 | find = {}
46 |
47 | [tool.setuptools]
48 | script-files = [
49 | "scripts/cpsr_validate_input.py",
50 | "scripts/pcgr_summarise.py",
51 | "scripts/pcgr_validate_input.py",
52 | "scripts/pcgr_vcfanno.py",
53 | "scripts/pcgrr.R",
54 | "scripts/cpsr.R",
55 | ]
56 |
--------------------------------------------------------------------------------
/pcgrr/data-raw/oncogenicity.tsv:
--------------------------------------------------------------------------------
1 | code score category pole description resource
2 | ONCG_OVS1 8 funcvar P Null variant - predicted as LoF - in bona fide tumor suppressor gene VEP;CGC;CancerMine
3 | ONCG_OS1 4 funcvar P Same amino acid change as previously established oncogenic variant - regardless of nucleotide change ClinVar
4 | ONCG_OS3 4 funcvar P Located in a mutation hotspot with >= 50 samples with variant at AA position, >= 10 samples with same AA change cancerhotspots.org
5 | ONCG_OM1 2 funcvar P Presumably critical site of functional domain CIViC
6 | ONCG_OM2 2 funcvar P Protein length changes from in-frame dels/ins in known oncogene/tumor suppressor genes or stop-loss variants in a tumor suppressor gene VEP;CGC;CancerMine
7 | ONCG_OM3 2 funcvar P Missense variant at an amino acid residue where a different missense variant determined to be oncogenic (using this standard) has been documented ClinVar
8 | ONCG_OM4 2 funcvar P Located in a mutation hotspot with < 50 samples with variant at AA position, >= 10 samples with same AA change cancerhotspots.org
9 | ONCG_OP1 1 funccomp P Multiple lines of computational evidence support of a damaging variant effect on the gene or gene product dbNSFP
10 | ONCG_OP3 1 funcvar P Located in a mutation hotspot with < 10 samples with the same amino acid change cancerhotspots.org
11 | ONCG_OP4 1 clinpop P Absent from controls (gnomAD) / very low MAF (any five major gnomAD subpopulations) gnomAD
12 | ONCG_SBVS1 -8 clinpop B Very high MAF (any five major gnomAD subpopulations) gnomAD
13 | ONCG_SBS1 -4 clinpop B High MAF (any five major gnomAD subpopulations) gnomAD
14 | ONCG_SBP1 -1 funccomp B Multiple lines of computational evidence support a benign variant effect on the gene or gene product dbNSFP
15 | ONCG_SBP2 -1 funcvar B Silent and intronic changes outside of the consensus splice site VEP
16 |
--------------------------------------------------------------------------------
/pcgrr/tests/test2.css:
--------------------------------------------------------------------------------
1 | #container {
2 | /*border: 2px dashed #444;*/
3 | height: 30px;
4 | margin: 7px;
5 |
6 | /* just for demo */
7 | min-width: 750px;
8 | }
9 |
10 | .red, .amber, .exploratory, .green, .nolist, .custom {
11 | width: 100px;
12 | height: 28px;
13 | /*font-family: Helvetica;*/
14 | margin-right: 5px;
15 | margin-top: 8px;
16 | padding: 5px;
17 | color: white;
18 | display: inline-block;
19 | text-align: center;
20 |
21 | /*display: inline;*/
22 | /*zoom: 1*/
23 | }
24 | /*.stretch {
25 | width: 100%;
26 | display: inline-block;
27 | font-size: 0;
28 | line-height: 0
29 | }
30 | */
31 |
32 | .exploratory {
33 | background: #000;
34 | border-radius: 5px;
35 |
36 | }
37 |
38 | .green {
39 | background: #3fad46;
40 | border-radius: 5px;
41 |
42 | }
43 |
44 | .red {
45 | background: #d9534f;
46 | border-radius: 5px;
47 |
48 | }
49 |
50 | .nolist {
51 | background: #b8b8ba;
52 | border-radius: 5px;
53 |
54 |
55 | }
56 |
57 | .amber {
58 | background: #f0ad4e;
59 | border-radius: 5px;
60 |
61 | }
62 |
63 | .custom {
64 | background: darkmagenta;
65 | border-radius: 5px;
66 |
67 | }
68 |
69 | .custom > a:hover,
70 | .green > a:hover,
71 | .red > a:hover,
72 | .amber > a:hover,
73 | .nolist > a:hover,
74 | .exploratory > a:hover {
75 | color: white;
76 | text-decoration: underline;
77 | }
78 |
79 | .custom > a:link,
80 | .green > a:link,
81 | .red > a:link,
82 | .amber > a:link,
83 | .nolist > a:link,
84 | .exploratory > a:link{
85 | text-decoration: none;
86 | color: white;
87 | }
88 |
89 | .custom > a:visited,
90 | .green > a:visited,
91 | .red > a:visited,
92 | .amber > a:visited,
93 | .nolist > a:visited,
94 | .exploratory > a:visited{
95 | text-decoration: none;
96 | color: white;
97 | }
98 |
--------------------------------------------------------------------------------
/pcgrr/tests/cpsr.css:
--------------------------------------------------------------------------------
1 | #container {
2 | /*border: 2px dashed #444;*/
3 | height: 30px;
4 | margin: 7px;
5 |
6 | /* just for demo */
7 | min-width: 750px;
8 | }
9 |
10 | .red, .amber, .exploratory, .green, .nolist, .custom {
11 | width: 100px;
12 | height: 28px;
13 | /*font-family: Helvetica;*/
14 | margin-right: 5px;
15 | margin-top: 8px;
16 | padding: 5px;
17 | color: white;
18 | display: inline-block;
19 | text-align: center;
20 |
21 | /*display: inline;*/
22 | /*zoom: 1*/
23 | }
24 | /*.stretch {
25 | width: 100%;
26 | display: inline-block;
27 | font-size: 0;
28 | line-height: 0
29 | }
30 | */
31 |
32 | .exploratory {
33 | background: #000;
34 | border-radius: 5px;
35 |
36 | }
37 |
38 | .green {
39 | background: #3fad46;
40 | border-radius: 5px;
41 |
42 | }
43 |
44 | .red {
45 | background: #d9534f;
46 | border-radius: 5px;
47 |
48 | }
49 |
50 | .nolist {
51 | background: #b8b8ba;
52 | border-radius: 5px;
53 |
54 |
55 | }
56 |
57 | .amber {
58 | background: #f0ad4e;
59 | border-radius: 5px;
60 |
61 | }
62 |
63 | .custom {
64 | background: darkmagenta;
65 | border-radius: 5px;
66 |
67 | }
68 |
69 | .custom > a:hover,
70 | .green > a:hover,
71 | .red > a:hover,
72 | .amber > a:hover,
73 | .nolist > a:hover,
74 | .exploratory > a:hover {
75 | color: white;
76 | text-decoration: underline;
77 | }
78 |
79 | .custom > a:link,
80 | .green > a:link,
81 | .red > a:link,
82 | .amber > a:link,
83 | .nolist > a:link,
84 | .exploratory > a:link{
85 | text-decoration: none;
86 | color: white;
87 | }
88 |
89 | .custom > a:visited,
90 | .green > a:visited,
91 | .red > a:visited,
92 | .amber > a:visited,
93 | .nolist > a:visited,
94 | .exploratory > a:visited{
95 | text-decoration: none;
96 | color: white;
97 | }
98 |
99 |
--------------------------------------------------------------------------------
/pcgrr/data-raw/effect_prediction_algorithms.tsv:
--------------------------------------------------------------------------------
1 | algorithm url display_name
2 | primateai https://github.com/Illumina/PrimateAI PrimateAI
3 | metalr https://www.ncbi.nlm.nih.gov/pubmed/25552646 Ensembl-LogisticRegression
4 | metasvm https://www.ncbi.nlm.nih.gov/pubmed/25552646 Ensembl-SVM
5 | deogen2 https://www.ncbi.nlm.nih.gov/pubmed/28498993 DEOGEN2
6 | mutationassessor http://mutationassessor.org MutationAssessor
7 | mutationtaster http://www.mutationtaster.org MutationTaster
8 | sift https://sift.bii.a-star.edu.sg/ SIFT
9 | lrt http://www.genetics.wustl.edu/jflab/lrt_query.html LRT
10 | provean http://provean.jcvi.org/index.php PROVEAN
11 | mutpred http://mutpred.mutdb.org MutPred
12 | m-cap http://bejerano.stanford.edu/MCAP/ M-CAP
13 | splice_site_rf http://nar.oxfordjournals.org/content/42/22/13534 Splice site effect (Random forest)
14 | splice_site_ada http://nar.oxfordjournals.org/content/42/22/13534 Splice site effect (Adaptive boosting)
15 | gerp_rs http://mendel.stanford.edu/SidowLab/downloads/gerp/ GERP++ RS score
16 | list_s2 https://doi.org/10.1093/nar/gkaa288 LIST-S2
17 | bayesdel_addaf https://doi.org/10.1002/humu.23158 BayesDel
18 | aloft http://aloft.gersteinlab.org/ ALoFT
19 | esm1b https://huggingface.co/spaces/ntranoslab/esm_variants/tree/main ESM1b
20 | alphamissense https://console.cloud.google.com/storage/browser/dm_alphamissense AlphaMissense
21 | mutformer https://github.com/WGLab/mutformer MutFormer
22 | phactboost https://github.com/CompGenomeLab/PHACTboost PHACTboost
23 | metarnn http://www.liulab.science/metarnn.html MetaRNN
24 | cadd https://cadd.gs.washington.edu/ CADD
25 | vest4 https://www.cravat.us/CRAVAT/help.jsp#vest-4 VEST-4
26 | fathmm_xf https://fathmm.biocompute.org.uk/fathmm-xf/ FATHMM-XF
27 | clinpred https://sites.google.com/site/clinpred/ ClinPred
28 | polyphen2_hvar http://genetics.bwh.harvard.edu/pph2/ PolyPhen2
29 | revel https://sites.google.com/site/revelgenomics/ REVEL
30 |
--------------------------------------------------------------------------------
/scripts/pcgrr.R:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env Rscript
2 |
3 | options(warn=-1)
4 | .libPaths(R.home("library")) # use conda R pkgs, not e.g. user's local installation
5 |
6 | suppressWarnings(suppressPackageStartupMessages(library(pcgrr)))
7 | suppressWarnings(suppressPackageStartupMessages(library(log4r)))
8 | suppressWarnings(suppressPackageStartupMessages(library(argparse)))
9 |
10 | args <- commandArgs(trailingOnly=TRUE)
11 |
12 | ## YAML file produced by PCGR Python workflow
13 | ## - settings and paths to reference data and annotated input sample files
14 | yaml_fname <- as.character(args[1])
15 | quarto_evars_path <- as.character(args[2])
16 | pcgrr::export_quarto_evars(quarto_evars_path)
17 |
18 | my_log4r_layout <- function(level, ...) {
19 | paste0(format(Sys.time()), " - pcgr-report-generation - ",
20 | level, " - ", ..., "\n", collapse = "")
21 | }
22 |
23 | log4r_logger <-
24 | log4r::logger(
25 | threshold = "INFO", appenders = log4r::console_appender(my_log4r_layout))
26 |
27 | ## this gets passed on to all the log4r_* functions inside the pkg
28 | options("PCGRR_LOG4R_LOGGER" = log4r_logger)
29 |
30 | ## Generate report content
31 | pcg_report <- pcgrr::generate_report(
32 | yaml_fname = yaml_fname
33 | )
34 |
35 | ## Write report contents to output files (HTML, XLSX, TSV)
36 | if (!is.null(pcg_report)) {
37 | if(pcg_report$settings$conf$other$no_html == FALSE){
38 | pcgrr::write_report_quarto_html(report = pcg_report)
39 | }
40 | else{
41 | pcgrr::log4r_info("Skipping HTML report generation (option '--no_html' set to TRUE)")
42 | }
43 |
44 | pcgrr::write_report_excel(report = pcg_report)
45 | pcgrr::write_report_tsv(report = pcg_report, output_type = 'snv_indel')
46 | pcgrr::write_report_tsv(report = pcg_report, output_type = 'snv_indel_unfiltered')
47 | pcgrr::write_report_tsv(report = pcg_report, output_type = 'cna_gene')
48 | pcgrr::write_report_tsv(report = pcg_report, output_type = 'msigs')
49 | }
50 |
--------------------------------------------------------------------------------
/pcgrr/DESCRIPTION:
--------------------------------------------------------------------------------
1 | Package: pcgrr
2 | Type: Package
3 | Title: Personal Cancer Genome ReporteR
4 | Version: 2.2.5
5 | Authors@R:
6 | c(person(given = "Sigve",
7 | family = "Nakken",
8 | role = c("aut", "cre"),
9 | email = "sigven@ifi.uio.no",
10 | comment = c(ORCID = "0000-0001-8468-2050")),
11 | person(given = "Peter",
12 | family = "Diakumis",
13 | role = c("aut", "ctb"),
14 | email = "peterdiakumis@gmail.com",
15 | comment = c(ORCID = "0000-0002-7502-7545")))
16 | Maintainer: Sigve Nakken, Peter Diakumis
17 | URL: https://github.com/sigven/pcgr,
18 | https://sigven.github.io/pcgr/
19 | BugReports: https://github.com/sigven/pcgr/issues
20 | Description: Functions, tools and utilities for the generation of clinical
21 | cancer genome reports with PCGR. This R package is an integrated part of
22 | the PCGR workflow (https://github.com/sigven/pcgr), it should thus not be
23 | used as a stand-alone package.
24 | License: MIT + file LICENSE
25 | biocViews:
26 | Imports:
27 | assertable,
28 | assertthat,
29 | Biostrings,
30 | bslib,
31 | caret,
32 | crosstalk,
33 | dplyr,
34 | DT,
35 | formattable,
36 | GenomeInfoDb,
37 | ggplot2,
38 | glue,
39 | htmltools,
40 | log4r,
41 | MutationalPatterns,
42 | openxlsx2,
43 | plotly,
44 | quantiseqr,
45 | quarto,
46 | randomForest,
47 | readr,
48 | reshape2,
49 | rlang,
50 | rrapply,
51 | S4Vectors,
52 | scales,
53 | shiny,
54 | stringr,
55 | tibble,
56 | tidyr,
57 | yaml
58 | Depends:
59 | R (>= 4.0)
60 | Suggests:
61 | testthat,
62 | devtools,
63 | stringi,
64 | BSgenome.Hsapiens.UCSC.hg19,
65 | BSgenome.Hsapiens.UCSC.hg38
66 | Encoding: UTF-8
67 | LazyData: true
68 | RoxygenNote: 7.3.2
69 | Roxygen: list(markdown = TRUE)
70 |
--------------------------------------------------------------------------------
/pcgrr/R/value_boxes.R:
--------------------------------------------------------------------------------
1 |
2 | #' Function that plots nine value boxes with the most
3 | #' important findings in the cancer genome
4 | #'
5 | #' @param pcg_report pcg report with list elements
6 | #' @return p
7 | #'
8 | #' @export
9 |
10 | plot_value_boxes <- function(pcg_report) {
11 | df <- data.frame(
12 | x = rep(seq(0, 16, 8), 3),
13 | y = c(rep(1, 3), rep(4.5, 3), rep(8, 3)),
14 | h = rep(3, 9),
15 | w = rep(7, 9),
16 | info = c(pcg_report[["content"]][["value_box"]][["tmb"]],
17 | pcg_report[["content"]][["value_box"]][["signatures"]],
18 | pcg_report[["content"]][["value_box"]][["kataegis"]],
19 | pcg_report[["content"]][["value_box"]][["tier1"]],
20 | pcg_report[["content"]][["value_box"]][["tier2"]],
21 | pcg_report[["content"]][["value_box"]][["scna"]],
22 | pcg_report[["content"]][["value_box"]][["tumor_purity"]],
23 | pcg_report[["content"]][["value_box"]][["tumor_ploidy"]],
24 | pcg_report[["content"]][["value_box"]][["msi"]]
25 |
26 | ),
27 | color = factor(1:9)
28 | )
29 |
30 | assay_props <-
31 | pcg_report[["metadata"]][["config"]][["assay_props"]]
32 |
33 | ## color - tumor-control
34 | color <- rep(pcgrr::color_palette[["tier"]][["values"]][1], 9)
35 | if (assay_props[["vcf_tumor_only"]] == TRUE) {
36 | ## color - tumor-only
37 | color <- rep(pcgrr::color_palette[["report_color"]][["values"]][2], 9)
38 | }
39 |
40 | p <- ggplot2::ggplot(df, ggplot2::aes(.data$x, .data$y, height = .data$h, width = .data$w,
41 | label = .data$info, fill = color)) +
42 | ggplot2::geom_tile() +
43 | ggplot2::geom_text(color = "white", fontface = "bold", size = 7) +
44 | ggplot2::coord_fixed() +
45 | ggplot2::scale_fill_manual(values = rep(color, 9)) +
46 | ggplot2::theme_void() +
47 | ggplot2::guides(fill = "none")
48 |
49 | return(p)
50 | }
51 |
52 |
53 |
54 |
--------------------------------------------------------------------------------
/pcgrr/inst/templates/pcgr_quarto_report/kataegis.qmd:
--------------------------------------------------------------------------------
1 | ## Kataegis events
2 |
3 | Kataegis describes a pattern of localized hypermutations identified in some cancer genomes, in which [a large number of highly-patterned basepair mutations occur in a small region of DNA](https://en.wikipedia.org/wiki/Kataegis). Kataegis is frequently seen in breast cancer, and it also occurs in lung, cervical, head and neck, and bladder cancers, as shown by tracing APOBEC mutation signatures (see the Wikipedia article linked above). PCGR implements the kataegis detection algorithm outlined in the [KataegisPortal R package](https://github.com/MeichunCai/KataegisPortal).
4 |
5 | Explanation of key columns in the resulting table of potential kataegis events:
6 |
7 | * __weight.C>X__: proportion of C>X mutations
8 | * __confidence__: confidence degree of potential kataegis events (range: 0 to 3)
9 | - 0 - a hypermutation with weight.C>X < 0.8;
10 | - 1 - one hypermutation with weight.C>X >= 0.8 in a chromosome;
11 | - 2 - two hypermutations with weight.C>X >= 0.8 in a chromosome;
12 | - 3 - high confidence with three or more hypermutations with weight.C>X >= 0.8 in a chromosome
13 |
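The confidence levels listed above can be read as a simple per-chromosome counting rule. The following is a minimal Python sketch of that reading, not the KataegisPortal implementation: the function name `assign_confidence`, the dictionary keys `chrom` and `weight_cx`, and the constant `WEIGHT_THRESHOLD` are hypothetical names introduced here for illustration only.

```python
from collections import Counter

# Assumption: 0.8 is the weight.C>X cutoff quoted in the list above.
WEIGHT_THRESHOLD = 0.8


def assign_confidence(events):
    """Return one confidence level (0-3) per candidate kataegis event.

    `events` is a list of dicts with the hypothetical keys
    'chrom' (chromosome name) and 'weight_cx' (proportion of C>X mutations).
    """
    # Count, per chromosome, the events that reach the C>X weight cutoff.
    strong_per_chrom = Counter(
        e["chrom"] for e in events if e["weight_cx"] >= WEIGHT_THRESHOLD
    )
    levels = []
    for e in events:
        if e["weight_cx"] < WEIGHT_THRESHOLD:
            # Below the cutoff: confidence 0, regardless of other events.
            levels.append(0)
        else:
            # One, two, or three-or-more qualifying events on the chromosome.
            levels.append(min(strong_per_chrom[e["chrom"]], 3))
    return levels


# Toy input (made-up values, not real data).
toy_events = [
    {"chrom": "chr1", "weight_cx": 0.90},
    {"chrom": "chr1", "weight_cx": 0.85},
    {"chrom": "chr1", "weight_cx": 0.50},
    {"chrom": "chr2", "weight_cx": 0.95},
]
print(assign_confidence(toy_events))  # prints [2, 2, 0, 1]
```

Under this reading, the confidence of an event depends on how many other events on the same chromosome also reach the cutoff, which is why the two strong chr1 events both receive level 2.
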
14 |
15 | ```{r mutsigs_kataegis}
16 | #| echo: false
17 | #| eval: true
18 |
19 | df <- data.frame(
20 | 'sample_id' = character(),
21 | 'chrom' = character(),
22 | 'start' = integer(),
23 | 'end' = integer(),
24 | 'chrom.arm' = character(),
25 | 'length' = integer(),
26 | 'number.mut' = integer(),
27 | 'weight.C>X' = numeric(),
28 | 'confidence' = integer(),
29 | stringsAsFactors = F)
30 |
31 | if(is.data.frame(pcg_report$content$kataegis$events)){
32 | df <- pcg_report$content$kataegis$events
33 | }
34 | ## data frame with potential kataegis events present in tumor sample
35 | myOptions <- list(paging = F,pageLength=5, searching=F,caching=F,
36 | buttons = c('csv','excel'),dom = 'Bfrtip')
37 | DT::datatable(df ,options = myOptions,extensions=c("Buttons","Responsive"))
38 |
39 | ```
40 |
41 |
--------------------------------------------------------------------------------
/.bumpversion.toml:
--------------------------------------------------------------------------------
1 | [tool.bumpversion]
2 | current_version = "2.2.5"
3 | search = "{current_version}"
4 | replace = "{new_version}"
5 | message = "Bump version: {current_version} → {new_version}"
6 | regex = false
7 | ignore_missing_version = false
8 | ignore_missing_files = false
9 | commit = true
10 | parse = """(?x)
11 | (?P<major>0|[1-9]\\d*)\\.
12 | (?P<minor>0|[1-9]\\d*)\\.
13 | (?P<patch>0|[1-9]\\d*)
14 | (?:\\.(?P<dev>\\d+))?
15 | """
16 |
17 | serialize = [
18 | "{major}.{minor}.{patch}.{dev}",
19 | "{major}.{minor}.{patch}",
20 | ]
21 |
22 | [[tool.bumpversion.files]]
23 | filename = "pcgrr/DESCRIPTION"
24 | search = "Version: {current_version}"
25 | replace = "Version: {new_version}"
26 |
27 | [[tool.bumpversion.files]]
28 | filename = "pcgrr/vignettes/installation.Rmd"
29 | search = "{current_version}"
30 | replace = "{new_version}"
31 |
32 | [[tool.bumpversion.files]]
33 | filename = "pcgr/_version.py"
34 | search = "__version__ = '{current_version}'"
35 | replace = "__version__ = '{new_version}'"
36 |
37 | [[tool.bumpversion.files]]
38 | filename = "pyproject.toml"
39 | search = 'version = "{current_version}"'
40 | replace = 'version = "{new_version}"'
41 |
42 | [[tool.bumpversion.files]]
43 | filename = "conda/recipe/pcgr/meta.yaml"
44 | search = "version: {current_version}"
45 | replace = "version: {new_version}"
46 |
47 | [[tool.bumpversion.files]]
48 | filename = "conda/recipe/pcgrr/meta.yaml"
49 | search = "version: {current_version}"
50 | replace = "version: {new_version}"
51 |
52 | [[tool.bumpversion.files]]
53 | filename = "conda/env/yml/pcgr.yml"
54 | search = "pcgr =={current_version}"
55 | replace = "pcgr =={new_version}"
56 |
57 | [[tool.bumpversion.files]]
58 | filename = "conda/env/yml/pcgrr.yml"
59 | search = "pcgrr =={current_version}"
60 | replace = "pcgrr =={new_version}"
61 |
62 | [[tool.bumpversion.files]]
63 | filename = "conda/env/yml/pkgdown.yml"
64 | search = "pcgrr =={current_version}"
65 | replace = "pcgrr =={new_version}"
66 |
67 | [[tool.bumpversion.files]]
68 | filename = ".github/workflows/build_conda_recipes.yaml"
69 | search = "VERSION: '{current_version}'"
70 | replace = "VERSION: '{new_version}'"
71 |
--------------------------------------------------------------------------------
/pcgrr/pkgdown/_pkgdown.yml:
--------------------------------------------------------------------------------
1 | url: https://sigven.github.io/pcgr/
2 | title: PCGR
3 | toc:
4 | depth: 3
5 | template:
6 | bootstrap: 5
7 | bslib:
8 | info: "#9B3297"
9 | dropdown-link-hover-bg: "#9B3297"
10 | dropdown-link-hover-color: "white"
11 | dropdown-link-active-color: "white"
12 | navbar-light-color: "white"
13 | navbar-light-brand-color: "white"
14 | navbar-light-brand-hover-color: "white"
15 | navbar-link-color: "white"
16 | includes:
17 | in_header: |
18 |
19 |
26 | authors:
27 | Sigve Nakken:
28 | href: "https://github.com/sigven"
29 | Peter Diakumis:
30 | href: "https://github.com/pdiakumis"
31 | navbar:
32 | link-color: "white"
33 | light-color: "white"
34 | light-brand-color: "white"
35 | type: light
36 | bg: info
37 | structure:
38 | left: [installation, running, articles, faq, changelog]
39 | right: [search, github]
40 | components:
41 | installation:
42 | text: Installation
43 | href: articles/installation.html
44 | running:
45 | text: Running
46 | href: articles/running.html
47 | articles:
48 | text: Articles
49 | menu:
50 | - text: Input files
51 | href: articles/input.html
52 | - text: Output files
53 | href: articles/output.html
54 | - text: Variant classification
55 | href: articles/variant_classification.html
56 | - text: Annotation resources
57 | href: articles/annotation_resources.html
58 | - text: Developer notes
59 | href: articles/developers.html
60 | - text: Primary tumor sites
61 | href: articles/primary_tumor_sites.html
62 | faq:
63 | text: FAQ
64 | href: articles/faq.html
65 | changelog:
66 | text: CHANGELOG
67 | href: articles/CHANGELOG.html
68 | home:
69 | sidebar:
70 | structure: [links, license, community, citation, authors, dev]
71 | components:
72 | citation:
73 | title: Citation
74 | text: "[Citing PCGR](authors.html#citation)"
75 | news:
76 | one_page: true
77 |
--------------------------------------------------------------------------------
/conda/recipe/pcgrr/meta.yaml:
--------------------------------------------------------------------------------
1 | package:
2 | name: r-pcgrr
3 | version: 2.2.5 # versioned by bump2version
4 |
5 | source:
6 | path: ../../../pcgrr
7 |
8 | build:
9 | number: 0
10 | noarch: generic
11 | rpaths:
12 | - lib/R/lib/
13 | - lib/
14 |
15 | requirements:
16 | build:
17 | - git
18 | host:
19 | - r-base ==4.3.3
20 | - r-assertable
21 | - r-assertthat
22 | - bioconductor-biostrings
23 | - r-bslib
24 | - r-caret
25 | - r-crosstalk
26 | - r-dplyr
27 | - r-dt
28 | - r-formattable
29 | - bioconductor-genomeinfodb
30 | - r-ggplot2
31 | - r-glue
32 | - r-htmltools
33 | - r-log4r
34 | - bioconductor-mutationalpatterns
35 | - r-openxlsx2
36 | - r-plotly
37 | - bioconductor-quantiseqr
38 | - r-quarto
39 | - quarto
40 | - r-randomforest
41 | - r-readr
42 | - r-reshape2
43 | - r-rlang
44 | - r-rrapply
45 | - bioconductor-s4vectors
46 | - r-scales
47 | - r-shiny
48 | - r-stringr
49 | - r-stringi
50 | - r-tidyr
51 | - r-yaml
52 |
53 | run:
54 | - r-base ==4.3.3
55 | - r-assertable
56 | - r-assertthat
57 | - bioconductor-biostrings
58 | - r-bslib
59 | - r-caret
60 | - r-crosstalk
61 | - r-dplyr
62 | - r-dt
63 | - r-formattable
64 | - bioconductor-genomeinfodb
65 | - r-ggplot2
66 | - r-glue
67 | - r-htmltools
68 | - r-log4r
69 | - bioconductor-mutationalpatterns
70 | - r-openxlsx2
71 | - r-plotly
72 | - bioconductor-quantiseqr
73 | - r-quarto
74 | - quarto
75 | - r-randomforest
76 | - r-readr
77 | - r-reshape2
78 | - r-rlang
79 | - r-rrapply
80 | - bioconductor-s4vectors
81 | - r-scales
82 | - r-shiny
83 | - r-stringr
84 | - r-stringi
85 | - r-tidyr
86 | - r-yaml
87 |
88 | test:
89 | commands:
90 | - $R -e "library('pcgrr')"
91 |
92 | about:
93 | home: https://github.com/sigven/pcgr/pcgrr
94 | license: MIT
95 | summary: Personal Cancer Genome ReporteR.
96 | Functions, tools and utilities for the generation of clinical
97 | cancer genome reports with PCGR. This R package is an integrated
98 | part of the Docker/Conda-based PCGR workflow (https://github.com/sigven/pcgr),
99 | it should thus not be used as a stand-alone package.
100 |
--------------------------------------------------------------------------------
/pcgrr/inst/templates/pcgr_quarto_report/mutational_signatures/signature_similarity.qmd:
--------------------------------------------------------------------------------
1 | ### Signature similarity
2 |
3 | Here, we compare the sample's raw mutational spectrum to each of the known signatures (SBS) in COSMIC. Input samples with fewer than 30 SNVs are omitted from this analysis. The cosine similarity is calculated between the mutational spectrum of the sample (i.e. frequency of DNA trinucleotide contexts) and each of the COSMIC signatures. The cosine similarity (*SIMILARITY* column below) ranges from 0 to 1, where 1 indicates a perfect match.
4 |
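To make the comparison described above concrete, here is a minimal Python sketch of a cosine similarity ranking. It is an illustration under stated assumptions, not the code PCGR runs: the function names `cosine_similarity` and `rank_signatures`, the constant `MIN_SNVS`, and the four-channel toy profiles are all hypothetical (real SBS profiles have 96 trinucleotide channels).

```python
import math

# Assumption: samples with fewer than 30 SNVs are skipped, as stated above.
MIN_SNVS = 30


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-negative vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # An all-zero vector has no direction; report no similarity.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_signatures(sample_counts, reference_signatures):
    """Rank reference signatures by similarity to the sample spectrum.

    `sample_counts` holds mutation counts per context channel;
    `reference_signatures` maps a signature ID to its profile.
    Returns (signature_id, similarity) pairs, best match first,
    or an empty list when the sample has too few SNVs.
    """
    if sum(sample_counts) < MIN_SNVS:
        return []
    scores = [
        (sig_id, cosine_similarity(sample_counts, profile))
        for sig_id, profile in reference_signatures.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data with four channels (made-up values, not COSMIC profiles).
toy_sample = [10, 20, 30, 40]
toy_references = {"SIG_A": [1, 2, 3, 4], "SIG_B": [4, 3, 2, 1]}
for sig_id, score in rank_signatures(toy_sample, toy_references):
    print(sig_id, round(score, 3))
```

Because cosine similarity depends only on the direction of the two vectors, raw counts and normalised frequencies give the same score, so the sample spectrum does not need to be rescaled before the comparison.
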
5 | ```{r highlight_signatures}
6 | #| echo: false
7 | #| eval: !expr as.logical(pcg_report$settings$conf$somatic_snv$mutational_signatures$all_reference_signatures) == FALSE & as.logical(pcg_report$settings$conf$somatic_snv$mutational_signatures$no_prevalence_data) == FALSE
8 | #| output: asis
9 |
10 | cat('\n::: {.callout-note}\n## Site-specific signatures\n\n',
11 | 'Signatures that have previously been attributed to ',
12 | conf$sample_properties$site,
13 | ' cancers (if any available, with prevalence >= ',
14 | as.character(msig_conf$prevalence_reference_signatures),
15 | '%) are highlighted. If you see non site-attributed signatures ',
16 | 'with very high similarity to the mutational spectrum of the input sample, a re-run of the fitting procedure using all ',
17 | 'reference signatures is warranted. \n\n:::\n\n', sep='')
18 | ```
19 |
20 |
21 |
22 | ```{r signature_similarity}
23 | #| eval: !expr is.null(msig_content$result$signature_similarity) == FALSE
24 | #| output: asis
25 |
26 | similarity_table <-
27 | DT::datatable(
28 | msig_content$result$signature_similarity,
29 | extensions = "Responsive",
30 | options = list(
31 | pageLength = 13,
32 | dom = 'tp'),
33 | escape = F) |>
34 | DT::formatStyle(
35 | 'SIMILARITY',
36 | fontWeight = 'bold') |>
37 | DT::formatStyle(
38 | c('SIGNATURE_ID',
39 | 'AETIOLOGY_KEYWORD'),
40 | 'SITE_SPECIFIC',
41 | fontWeight = 'bold',
42 | color = "white",
43 | backgroundColor = DT::styleEqual(
44 | c('YES','NO','NOT_DEFINED'),
45 | c(pcg_report$settings$conf$report_color,
46 | pcgrr::color_palette$none,
47 | pcgrr::color_palette$none)
48 | )
49 | )
50 |
51 | #similarity_table$x$data$SITE_SPECIFIC <- NULL
52 |
53 | bslib::card(
54 | height = "700px",
55 | bslib::card_header(
56 | class = "bg-dark",
57 | paste0(
58 | "Similarity of ",
59 | pcg_report$settings$sample_id,
60 | " to COSMIC signatures (SBS)")
61 | ),
62 | bslib::card_body(
63 | similarity_table
64 | )
65 | )
66 |
67 |
68 |
69 | ```
70 |
71 |
72 |
--------------------------------------------------------------------------------
/pcgrr/inst/templates/pcgr_quarto_report/snv_indel.qmd:
--------------------------------------------------------------------------------
1 | ## Somatic SNVs/InDels
2 |
3 |
4 |
5 | ```{r key_snv_indel_numbers}
6 | #| echo: false
7 | #| output: asis
8 | #| eval: true
9 |
10 |
11 | bslib::page_fillable(
12 | bslib::layout_columns(
13 | col_widths = c(3,3,3,3),
14 | height = "100px",
15 | bslib::value_box(
16 | title = "Total variants",
17 | value = paste0(
18 | pcg_report$content$snv_indel$vstats$n),
19 | showcase = NULL,
20 | theme = dplyr::if_else(
21 | pcg_report$content$snv_indel$vstats$n > 0,
22 | "dark",
23 | "dark"
24 | )
25 | ),
26 | bslib::value_box(
27 | title = "Coding variants",
28 | value = paste0(
29 | pcg_report$content$snv_indel$vstats$n_coding),
30 | showcase = NULL,
31 | theme = dplyr::if_else(
32 | pcg_report$content$snv_indel$vstats$n_coding > 0,
33 | "dark",
34 | "dark"
35 | )
36 | ),
37 | bslib::value_box(
38 | title = "SNVs",
39 | value = paste0(
40 | pcg_report$content$snv_indel$vstats$n_snv),
41 | showcase = NULL,
42 | theme = dplyr::if_else(
43 | pcg_report$content$snv_indel$vstats$n_snv > 0,
44 | "dark",
45 | "dark"
46 | )
47 | ),
48 | bslib::value_box(
49 | title = "InDels",
50 | value = paste0(
51 | pcg_report$content$snv_indel$vstats$n_indel),
52 | showcase = NULL,
53 | theme = dplyr::if_else(
54 | pcg_report$content$snv_indel$vstats$n_indel > 0,
55 | "dark",
56 | "dark"
57 | )
58 | )
59 | )
60 | )
61 |
62 | ```
63 |
64 |
65 | ```{r section_variant_filtering}
66 | #| eval: !expr as.logical(pcg_report$settings$conf$assay_properties$vcf_tumor_only) == TRUE
67 | #| output: asis
68 | #| child: pcgr_quarto_report/snv_indel/variant_filtering.qmd
69 |
70 | ```
71 |
72 |
73 | ```{r section_variant_statistics}
74 | #| eval: !expr as.logical(pcg_report$content$snv_indel$vstats$n > 0) == TRUE
75 | #| output: asis
76 | #| child: pcgr_quarto_report/snv_indel/variant_statistics.qmd
77 |
78 | ```
79 |
80 |
81 |
82 |
83 |
84 |
85 | ```{r section_oncogenicity}
86 | #| output: asis
87 | #| eval: !expr as.logical(pcg_report$content$snv_indel$vstats$n != 0) == TRUE
88 | #| child: pcgr_quarto_report/snv_indel/oncogenicity.qmd
89 |
90 | ```
91 |
92 |
93 |
94 | ```{r section_actionability}
95 | #| output: asis
96 | #| eval: !expr as.logical(pcg_report$content$snv_indel$vstats$n != 0) == TRUE
97 | #| child: pcgr_quarto_report/snv_indel/actionability.qmd
98 | ```
99 |
100 |
--------------------------------------------------------------------------------
/pcgrr/inst/templates/pcgrr.scss:
--------------------------------------------------------------------------------
1 | /*-- scss:defaults --*/
2 |
3 | $theme: "cosmo" !default;
4 |
5 | //
6 | // Color system
7 | //
8 |
9 | $white: #fff !default;
10 | $gray-100: #f8f9fa !default;
11 | $gray-200: #e9ecef !default;
12 | $gray-300: #dee2e6 !default;
13 | $gray-400: #ced4da !default;
14 | $gray-500: #adb5bd !default;
15 | $gray-600: #868e96 !default;
16 | $gray-700: #495057 !default;
17 | $gray-800: #373a3c !default;
18 | $gray-900: #212529 !default;
19 | $black: #000 !default;
20 |
21 | $blue: #2780e3 !default;
22 | $indigo: #6610f2 !default;
23 | $purple: #613d7c !default;
24 | $pink: #e83e8c !default;
25 | $red: #ff0039 !default;
26 | $orange: #f0ad4e !default;
27 | $yellow: #ff7518 !default;
28 | $green: #3fb618 !default;
29 | $teal: #20c997 !default;
30 | $cyan: #9954bb !default;
31 |
32 | $primary: $blue !default;
33 | $secondary: $gray-800 !default;
34 | $success: $green !default;
35 | $info: $cyan !default;
36 | $warning: $yellow !default;
37 | $danger: $red !default;
38 | $light: $gray-100 !default;
39 | $dark: $gray-800 !default;
40 | $navbar-bg: #9954bb;
41 | $min-contrast-ratio: 2.6 !default;
42 |
43 | // Options
44 |
45 | $enable-rounded: false !default;
46 |
47 | // Body
48 |
49 | $body-color: $gray-800 !default;
50 |
51 | // Fonts
52 |
53 | // stylelint-disable-next-line value-keyword-case
54 | $font-family-sans-serif: "Source Sans Pro", -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol" !default;
55 | $headings-font-weight: 400 !default;
56 |
57 | // Navbar
58 |
59 | $navbar-dark-hover-color: rgba($white, 1) !default;
60 | $navbar-light-hover-color: rgba($black, .9) !default;
61 |
62 | // Alerts
63 |
64 | $alert-border-width: 0 !default;
65 |
66 | // Progress bars
67 |
68 | $progress-height: .5rem !default;
69 |
70 |
71 |
72 | /*-- scss:rules --*/
73 |
74 |
75 | // Variables
76 |
77 | $web-font-path: "https://fonts.googleapis.com/css2?family=Source+Sans+Pro:wght@300;400;700&display=swap" !default;
78 | @if $web-font-path {
79 | @import url($web-font-path);
80 | }
81 |
82 | // Typography
83 |
84 | body {
85 | -webkit-font-smoothing: antialiased;
86 | }
87 |
88 | // Indicators
89 |
90 | .badge {
91 | &.bg-light {
92 | color: $dark;
93 | }
94 | }
95 |
96 | // Progress bars
97 |
98 | .progress {
99 | @include box-shadow(none);
100 |
101 | .progress-bar {
102 | font-size: 8px;
103 | line-height: 8px;
104 | }
105 | }
106 |
107 |
108 |
--------------------------------------------------------------------------------
/pcgrr/inst/templates/pcgr_quarto.css:
--------------------------------------------------------------------------------
1 |
2 | .value_box_container {
3 | /*border: 2px dashed #444;*/
4 | height: 210px;
5 | /*margin: 7px;*/
6 |
7 | /* just for demo */
8 | width: 1200px;
9 | position: absolute;
10 | /*max-width: 600px;*/
11 | /*height: 500px;*/
12 | }
13 |
14 | .quarto-title-banner {
15 | /*height: 120px;*/
16 | margin-left: -15px;
17 | margin-right: -15px;
18 | margin-top: -15px;
19 | }
20 |
21 | .value_box {
22 | width: 250px;
23 | height: 150px;
24 | /*position: fixed;*/
25 | font-family: sans-serif;
26 | margin-right: 15px;
27 | margin-top: 15px;
28 | padding: 10px;
29 | color: white;
30 | position: relative;
31 | display: inline-block;
32 | text-align: center;
33 | background: #000;
34 | /*border-radius: 25px;*/
35 |
36 | /*display: inline;*/
37 | /*zoom: 1*/
38 | }
39 |
40 | h2.title{
41 | color: black;
42 | }
43 |
44 | h2 {
45 | color: black;
46 | }
47 |
48 | h2, #TOC>ul>li {
49 | color: black;
50 | }
51 |
52 | #container {
53 | /*border: 2px dashed #444;*/
54 | height: 30px;
55 | margin: 7px;
56 |
57 | /* just for demo */
58 | min-width: 750px;
59 | }
60 |
61 | .red, .amber, .exploratory, .green, .nolist, .custom, .app_combo {
62 | width: 100px;
63 | height: 28px;
64 | /*font-family: Helvetica;*/
65 | margin-right: 5px;
66 | margin-top: 8px;
67 | padding: 5px;
68 | color: white;
69 | display: inline-block;
70 | text-align: center;
71 |
72 | /*display: inline;*/
73 | /*zoom: 1*/
74 | }
75 | /*.stretch {
76 | width: 100%;
77 | display: inline-block;
78 | font-size: 0;
79 | line-height: 0
80 | }
81 | */
82 |
83 | .exploratory {
84 | background: #000;
85 | border-radius: 5px;
86 |
87 | }
88 |
89 | .app_combo {
90 | background: #084594;
91 | border-radius: 5px;
92 | color: white;
93 |
94 | }
95 |
96 | .green {
97 | background: #3fad46;
98 | border-radius: 5px;
99 |
100 | }
101 |
102 | .red {
103 | background: #d9534f;
104 | border-radius: 5px;
105 |
106 | }
107 |
108 | .nolist {
109 | background: #b8b8ba;
110 | border-radius: 5px;
111 |
112 |
113 | }
114 |
115 | .amber {
116 | background: #f0ad4e;
117 | border-radius: 5px;
118 |
119 | }
120 |
121 | .custom {
122 | background: darkmagenta;
123 | border-radius: 5px;
124 |
125 | }
126 |
127 | .custom > a:hover,
128 | .green > a:hover,
129 | .red > a:hover,
130 | .amber > a:hover,
131 | .app_combo > a:hover,
132 | .nolist > a:hover,
133 | .exploratory > a:hover {
134 | color: white;
135 | text-decoration: underline;
136 | }
137 |
138 | .custom > a:link,
139 | .green > a:link,
140 | .red > a:link,
141 | .amber > a:link,
142 | .nolist > a:link,
143 | .app_combo > a:link,
144 | .exploratory > a:link{
145 | text-decoration: none;
146 | color: white;
147 | }
148 |
149 | .custom > a:visited,
150 | .green > a:visited,
151 | .red > a:visited,
152 | .amber > a:visited,
153 | .nolist > a:visited,
154 | .app_combo > a:visited,
155 | .exploratory > a:visited{
156 | text-decoration: none;
157 | color: white;
158 | }
159 |
160 | .bslib-value-box .value-box-value {
161 | font-size: clamp(.1em, 8cqw, 4em) !important;
162 | }
163 |
164 |
--------------------------------------------------------------------------------
/pcgrr/vignettes/annotation_resources.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Annotation resources"
3 | output: rmarkdown::html_document
4 | ---
5 |
6 | ### Basic variant consequence annotation
7 | * [VEP](http://www.ensembl.org/info/docs/tools/vep/index.html) - Variant Effect Predictor release 113 ([GENCODE v47](https://www.gencodegenes.org/human/) as gene reference database (v19 for grch37))
8 |
9 | ### *In silico* predictions of effect of coding variants
10 | * [dbNSFP](https://sites.google.com/site/jpopgen/dbNSFP) - database of non-synonymous functional predictions (v5.0, January 2025)
11 |
12 | ### Variant frequency databases
13 | * [gnomAD](http://exac.broadinstitute.org/) - germline variant frequencies exome-wide (r4.1, April 2024)
14 | * [dbSNP](http://www.ncbi.nlm.nih.gov/SNP/) - database of short genetic variants (build 156)
15 | * [Cancer Hotspots](http://cancerhotspots.org) - a resource for statistically significant mutations in cancer (v2, 2017)
16 | * [TCGA](https://portal.gdc.cancer.gov/) - somatic mutations discovered across 33 tumor type cohorts (release 41.0, August 2024)
17 |
18 | ### Variant databases of clinical utility
19 | * [ClinVar](http://www.ncbi.nlm.nih.gov/clinvar/) - database of clinically related variants (March 2025)
20 | * [CIViC](https://civicdb.org) - clinical interpretations of variants in cancer (March 13th 2025)
21 | * [CGI](http://www.cancergenomeinterpreter.org/biomarkers) - Cancer Genome Interpreter Cancer Biomarkers Database (CGI) (October 18th 2022)
22 |
23 | ### Protein domains/functional features
24 | * [UniProt/SwissProt KnowledgeBase](http://www.uniprot.org) - resource on protein sequence and functional information (2025_01)
25 | * [Pfam](https://www.ebi.ac.uk/interpro/entry/pfam/#table) - database of protein families and domains (v37.0)
26 |
27 | ### Knowledge resources on gene and protein targets
28 | * [CancerMine](https://zenodo.org/records/7689627) - Literature-mined database of tumor suppressor genes/proto-oncogenes (v50, March 2023)
29 | * [Open Targets Platform](https://www.targetvalidation.org/) - Database on disease-target associations, molecularly targeted drugs and tractability aggregated from multiple sources (literature, pathways, mutations) (2024.09)
30 |
31 | ### Notes on variant annotation datasets
32 |
33 | #### Data quality
34 |
35 | __Genomic biomarkers__
36 |
37 | Genomic biomarkers utilized in PCGR are currently limited to the following:
38 |
39 | * Evidence items for specific markers in CIViC must be *accepted* (*submitted* evidence items are not considered or shown)
40 | * Markers reported at the exact variant level (e.g. __BRAF p.V600E__, __MET c.3028+1G>T__, __g.7:140753336A>T__)
41 | * Markers reported at the codon level (e.g. __KRAS p.G12__)
42 | * Markers reported at the exon level (e.g. __KIT exon 11 mutation__, __EGFR exon 19 deletion__)
43 | * Markers reported at the gene level (e.g. __BRAF mutation__, __TP53 loss-of-function mutation__, __BRCA1 oncogenic mutation__)
44 | * Within the [Cancer bioMarkers database (CGI)](https://www.cancergenomeinterpreter.org/biomarkers), only biomarkers curated from FDA/NCCN guidelines, scientific literature, and clinical trials are included (biomarkers collected from conference abstracts etc. are not included)
45 | * Copy number gains/losses
46 | * RNA fusion and gene expression biomarkers are included in the PCGR reference databundle, but are not currently utilized in the PCGR biomarker matching procedure
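
The resolution levels listed above (exact variant, codon, exon, gene) can be illustrated with a small sketch. The helper `classify_biomarker_level()` below is hypothetical and is not part of pcgrr; it only guesses the level from the textual form of a biomarker label, assuming labels look like the examples in the list, whereas the actual PCGR matching procedure works on annotated variants.

```r
# Hypothetical helper (NOT part of pcgrr): guess the resolution level of a
# biomarker from its label, using the label formats shown in the list above.
classify_biomarker_level <- function(label) {
  # Exon-level markers mention an exon number, e.g. "KIT exon 11 mutation"
  if (grepl("exon [0-9]+", label, ignore.case = TRUE)) {
    return("exon")
  }
  # Exact markers carry a full protein change (p.V600E) or a cDNA/genomic
  # coordinate expression (c. / g.)
  if (grepl("p[.][A-Z][0-9]+[A-Z*]", label) || grepl("(^| )[cg][.]", label)) {
    return("exact")
  }
  # Codon-level markers stop after the residue position, e.g. "KRAS p.G12"
  if (grepl("p[.][A-Z][0-9]+$", label)) {
    return("codon")
  }
  # Everything else is treated as a gene-level marker, e.g. "BRAF mutation"
  "gene"
}

labels <- c(
  "BRAF p.V600E", "MET c.3028+1G>T", "g.7:140753336A>T",
  "KRAS p.G12", "EGFR exon 19 deletion", "TP53 loss-of-function mutation"
)
levels_found <- vapply(labels, classify_biomarker_level, character(1))
print(levels_found)
```

With the example labels from the list, the first three are classified as `exact`, followed by `codon`, `exon` and `gene`.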
47 |
--------------------------------------------------------------------------------
/pcgrr/tests/testthat/test_biomarker.R:
--------------------------------------------------------------------------------
1 | #context("Biomarker check")
2 | #pcgr_data <- readRDS(file='/Users/sigven/research/docker/pcgr/data/grch37/rds/pcgr_data.rds')
3 | #
4 | #eitems_raw <- pcgr_data[["biomarkers"]]
5 | #
6 | #test_that("Biomarker load returns correct origin/type", {
7 | # expect_error(pcgrr::load_all_eitems(eitems_raw = eitems_raw,
8 | # alteration_type = "",
9 | # origin = "Somatic"))
10 | # expect_error(pcgrr::load_all_eitems(eitems_raw = eitems_raw,
11 | # alteration_type = "MUT",
12 | # origin = ""))
13 | # expect_error(pcgrr::load_all_eitems(eitems_raw = NULL,
14 | # alteration_type = "MUT",
15 | # origin = "Somatic"))
16 | # expect_equal(unique(pcgrr::load_all_eitems(eitems_raw = eitems_raw,
17 | # alteration_type = "MUT",
18 | # origin = "Somatic")$VARIANT_ORIGIN),
19 | # c('Somatic','Somatic Mutation'))
20 | # expect_equal(unique(pcgrr::load_all_eitems(eitems_raw = eitems_raw,
21 | # alteration_type = "MUT",
22 | # origin = "Germline")$VARIANT_ORIGIN),
23 | # "Germline")
24 | # expect_equal(unique(pcgrr::load_all_eitems(eitems_raw = eitems_raw,
25 | # alteration_type = "CNA",
26 | # origin = "Somatic")$VARIANT_ORIGIN),
27 | # c("Somatic", "Somatic Mutation"))
28 | # expect_equal(unique(pcgrr::load_all_eitems(eitems_raw = eitems_raw,
29 | # alteration_type = "MUT",
30 | # origin = "Somatic")$ALTERATION_TYPE),
31 | # "MUT")
32 | # expect_equal(unique(pcgrr::load_all_eitems(eitems_raw = eitems_raw,
33 | # alteration_type = "CNA",
34 | # origin = "Somatic")$ALTERATION_TYPE),
35 | # "CNA")
36 | # expect_equal(unique(pcgrr::load_all_eitems(eitems_raw = eitems_raw,
37 | # alteration_type = "CNA",
38 | # origin = "Germline")$ALTERATION_TYPE),
39 | # "CNA")
40 | # expect_gte(ncol(pcgrr::load_all_eitems(eitems_raw = eitems_raw,
41 | # alteration_type = "MUT",
42 | # origin = "Somatic")), 19)
43 | # expect_gte(nrow(pcgrr::load_all_eitems(eitems_raw = eitems_raw,
44 | # alteration_type = "MUT_LOF",
45 | # origin = "Germline")), 1)
46 | # expect_gte(ncol(pcgrr::load_all_eitems(eitems_raw = eitems_raw,
47 | # alteration_type = "CNA",
48 | # origin = "Somatic")), 17)
49 | # expect_equal(unique(pcgrr::load_all_eitems(eitems_raw = eitems_raw,
50 | # alteration_type = "MUT",
51 | # origin = "Somatic")$BIOMARKER_MAPPING),
52 | # c("exact", "gene", "codon", "exon"))
53 | #})
54 |
--------------------------------------------------------------------------------
/pcgrr/inst/templates/pcgr_quarto_report/mutational_signatures/mutational_spectra.qmd:
--------------------------------------------------------------------------------
1 | ### Raw mutational spectrum
2 |
3 | As background, we show several views of the raw mutational spectrum of the sample, i.e. without any signature re-fitting analysis.
4 |
5 | ::: {.panel-tabset .nav-pills}
6 |
7 | #### SBS - mutational contexts
8 |
9 |
10 |
11 | ```{r setup_spectra_data}
12 | #| echo: false
13 | #| eval: true
14 |
15 | msig_content <-
16 | pcg_report$content$mutational_signatures
17 | ```
18 |
19 | ```{r raw_context_plot}
20 | #| output: asis
21 | #| eval: !expr is.null(msig_content$result$mut_mat) == FALSE
22 |
23 | catalogue_mat <-
24 | msig_content$result$mut_mat
25 |
26 | y_max <-
27 | plyr::round_any(
28 | max(catalogue_mat[,1] / sum(catalogue_mat[,1])),
29 | 0.05, f = ceiling)
30 |
31 | plotly::ggplotly(
32 | height = 400,
33 | MutationalPatterns::plot_96_profile(
34 | catalogue_mat,
35 | colors = head(pcgrr::color_palette$tier$values, 6),
36 | ymax = y_max)
37 | )
38 |
39 | ```
40 |
41 |
42 | ```{r missing_context_plot}
43 | #| output: asis
44 | #| eval: !expr is.null(msig_content$result$mut_mat) == TRUE
45 |
46 | cat("\n::: {.callout-warning}\n## No SNVs\n\nNo SNVs/single base substitutions",
47 | "were found in this sample\n\n:::\n\n")
48 |
49 | ```
50 |
51 |
52 | #### SBS - type occurrences
53 |
54 |
55 |
56 | ```{r sbs_type_occurrences}
57 | #| eval: !expr is.null(msig_content$result$type_occurrences) == FALSE
58 | #| output: asis
59 |
60 |
61 | if(rowSums(msig_content$result$type_occurrences[1:6]) > 0){
62 | plotly::ggplotly(
63 | height = 400,
64 | MutationalPatterns::plot_spectrum(
65 | msig_content$result$type_occurrences,
66 | CT = TRUE,
67 | error_bars = 'none',
68 | colors = head(
69 | pcgrr::color_palette$tier$values, 7))) |>
70 | plotly::layout(
71 | legend = list(
72 | orientation = "h", # horizontal legend
73 | x = 0.5, # center it
74 | xanchor = "center", # anchor it at the center
75 | y = -0.2 # place it below the plot
76 | )
77 | )
78 | }
79 |
80 | ```
81 |
82 | ```{r sbs_types_missing}
83 | #| eval: !expr (is.null(msig_content$result$type_occurrences) == TRUE) & pcg_report$content$snv_indel$vstats$n_snv == 0
84 | #| output: asis
85 |
86 | cat("\n::: {.callout-warning}\n## No SNVs\n\nNo SNVs/single base substitutions",
87 | "were found in this sample\n\n:::\n\n")
88 |
89 | ```
90 |
91 |
92 | #### ID - mutational contexts
93 |
94 |
95 |
96 | ```{r indel_contexts}
97 | #| eval: !expr is.null(msig_content$result$indel_counts) == FALSE
98 | #| fig-width: 14
99 | #| fig-height: 7
100 |
101 | MutationalPatterns::plot_indel_contexts(
102 | counts = msig_content$result$indel_counts,
103 | condensed = T) +
104 | ggplot2::theme(legend.position = "bottom") +
105 | ggplot2::theme(text = ggplot2::element_text(size = 14))
106 |
107 |
108 | ```
109 |
110 | ```{r indel_contexts_missing}
111 | #| eval: !expr is.null(msig_content$result$indel_counts) == TRUE
112 | #| output: asis
113 |
114 | cat(
115 | "\n::: {.callout-warning}\n## No/very few InDels\n\nNo/very few insertion/deletions",
116 | "were analyzed in this sample\n\n:::\n")
117 |
118 | ```
119 |
120 | :::
121 |
--------------------------------------------------------------------------------
/pcgrr/inst/templates/pcgr_quarto_report/germline.qmd:
--------------------------------------------------------------------------------
1 | ## Germline findings
2 |
3 | ```{r prepare_panel_url}
4 | #| echo: false
5 | #| results: asis
6 |
7 | panel_link <- pcg_report[['content']][['germline_classified']][['panel_info']][['description']]
8 | if(pcg_report[['content']][['germline_classified']][['panel_info']][['panel_id']] != "-1" &
9 | !stringr::str_detect(pcg_report[['content']][['germline_classified']][['panel_info']][['url']], ",")){
10 | description <- pcg_report[['content']][['germline_classified']][['panel_info']][['description']]
11 | description_trait <-
12 | pcg_report[['content']][['germline_classified']][['panel_info']][['description_trait']]
13 | url_raw <- pcg_report[['content']][['germline_classified']][['panel_info']][['url']]
14 | description_full <- paste0(description,': ', description_trait)
15 | if(pcg_report[['content']][['germline_classified']][['panel_info']][['panel_id']] == "0"){
16 | description_full <- description
17 | }
18 | panel_link <- paste0("",
19 | description_full,
20 | "")
21 | }
22 |
23 | ```
24 |
25 |
26 | * Based on a germline variant analysis of the query case using the [Cancer Predisposition Sequencing Reporter (CPSR)](https://github.com/sigven/cpsr), we here list the variants of clinical significance in cancer predisposition genes, both novel (not recorded in ClinVar), and those with existing classifications in ClinVar.
27 | * Virtual panel of cancer predisposition genes screened: `r panel_link`
28 | * Protein-coding variants of uncertain significance shown: __`r !pcg_report$settings$conf$germline$ignore_vus`__
29 |
30 |
31 | ```{r cpsr_clinvar_findings}
32 | #| echo: false
33 | #| eval: true
34 | #| output: asis
35 |
36 | cat('\n')
37 | htmltools::br()
38 |
39 | germline_calls <- pcg_report[['content']][['germline_classified']][['callset']]
40 |
41 | if(NROW(germline_calls$variant_display) > 100){
42 | cat('NOTE - only considering top 100 variants (due to limitations with client-side tables)\n', sep = "\n")
43 | cat('\n')
44 | germline_calls$variant_display <-
45 | head(germline_calls$variant_display, 100)
46 | }
47 |
48 | germline_dt <- DT::datatable(
49 | germline_calls$variant_display,
50 | escape = F,
51 | extensions = c("Buttons","Responsive"),
52 | options = list(
53 | pageLength = 10,
54 | scrollCollapse = T,
55 | buttons = c('csv','excel'),
56 | dom = 'Bfrtip'
57 | )) |>
58 | DT::formatStyle(
59 | columns = c("SYMBOL","ALTERATION"),
60 | valueColumns = c("CLINICAL_SIGNIFICANCE"),
61 | color = "white",
62 | backgroundColor =
63 | DT::styleEqual(
64 | c("Pathogenic", "Likely_Pathogenic","VUS"),
65 | c("#9E0142","#D53E4F","#000000")
66 | )
67 | )
68 |
69 | bslib::page_fillable(
70 | bslib::card(
71 | bslib::card_header(
72 | class = "bg-dark",
73 | paste0("Germline variants - ",
74 | pcg_report[['content']][['germline_classified']]$sample_id)
75 | ),
76 | bslib::card_body(
77 | height = min(500, 150 + NROW(germline_calls$variant_display) * 80),
78 | if(NROW(germline_calls$variant_display) > 0){
79 | germline_dt
80 | }else{
81 | "NO cancer-predisposing variants of clinical significance were found in the query case (CPSR report)."
82 | }
83 | )
84 | )
85 | )
86 |
87 |
88 | ```
89 |
90 |
91 |
92 |
--------------------------------------------------------------------------------
/pcgrr/R/clinicaltrials.R:
--------------------------------------------------------------------------------
1 | #' Function that retrieves relevant (interventional based on molecular target)
2 | #' clinical trials for a given tumor type
3 | #' @param pcgr_data PCGR data bundle object
4 | #' @param config PCGR run configurations
5 | #' @param sample_name sample name
6 | #'
7 | #' @return pcg_report_trials data frame with all report elements
8 | #' @export
9 |
10 | generate_report_data_trials <- function(pcgr_data, config, sample_name) {
11 |
12 |
13 | invisible(assertthat::assert_that(!is.null(pcgr_data)))
14 | invisible(assertthat::assert_that(!is.null(pcgr_data$clinicaltrials$trials)))
15 | invisible(assertthat::assert_that(is.data.frame(
16 | pcgr_data$clinicaltrials$trials)))
17 | invisible(assertthat::assert_that(
18 | NROW(pcgr_data$clinicaltrials$trials) > 0))
19 |
20 |
21 | pcg_report_trials <- pcgrr::init_report(config = config,
22 | class = "clinicaltrials")
23 | pcg_report_trials[["eval"]] <- T
24 |
25 | pcg_report_trials[["trials"]] <-
26 | pcgr_data[["clinicaltrials"]][["trials"]] |>
27 | dplyr::filter(.data$primary_site ==
28 | config[["t_props"]][["tumor_type"]])
29 |
30 | if (nrow(pcg_report_trials[["trials"]]) > 0) {
31 |
32 | pcg_report_trials[["trials"]] <- pcg_report_trials[["trials"]] |>
33 | dplyr::select(.data$nct_id, .data$title, .data$overall_status,
34 | .data$cui_link, .data$intervention_link,
35 | .data$phase, .data$start_date,
36 | .data$primary_completion_date, .data$cui_name,
37 | .data$intervention,
38 | .data$intervention_target,
39 | .data$biomarker_context,
40 | .data$chromosome_abnormality,
41 | .data$clinical_context,
42 | .data$world_region,
43 | .data$metastases, .data$gender,
44 | .data$minimum_age, .data$maximum_age, .data$phase,
45 | .data$n_primary_cancer_sites,
46 | .data$study_design_primary_purpose) |>
47 | dplyr::rename(condition_raw = .data$cui_name,
48 | condition = .data$cui_link,
49 | intervention2 = .data$intervention_link,
50 | intervention_raw = .data$intervention,
51 | biomarker_index = .data$biomarker_context,
52 | keyword = .data$clinical_context,
53 | chrom_abnormalities = .data$chromosome_abnormality,
54 | metastases_index = .data$metastases) |>
55 | dplyr::rename(intervention = .data$intervention2)
56 |
57 | colnames(pcg_report_trials[["trials"]]) <-
58 | toupper(colnames(pcg_report_trials[["trials"]]))
59 | pcg_report_trials[["trials"]] <- pcg_report_trials[["trials"]] |>
60 | #magrittr::set_colnames(toupper(names(.))) |>
61 | dplyr::arrange(.data$N_PRIMARY_CANCER_SITES,
62 | .data$OVERALL_STATUS,
63 | dplyr::desc(.data$START_DATE),
64 | dplyr::desc(nchar(.data$BIOMARKER_INDEX)),
65 | dplyr::desc(.data$STUDY_DESIGN_PRIMARY_PURPOSE)) |>
66 | dplyr::select(-c(.data$N_PRIMARY_CANCER_SITES, .data$STUDY_DESIGN_PRIMARY_PURPOSE))
67 |
68 | if (nrow(pcg_report_trials[["trials"]]) > 2000) {
69 | pcg_report_trials[["trials"]] <-
70 | utils::head(pcg_report_trials[["trials"]], 2000)
71 | }
72 |
73 | }else{
74 | pcg_report_trials[["missing_data"]] <- T
75 | }
76 | return(pcg_report_trials)
77 | }
78 |
--------------------------------------------------------------------------------
/pcgrr/NAMESPACE:
--------------------------------------------------------------------------------
1 | # Generated by roxygen2: do not edit by hand
2 |
3 | export(af_distribution)
4 | export(append_annotation_links)
5 | export(append_cancer_association_ranks)
6 | export(append_cancer_gene_evidence)
7 | export(append_dbmts_var_link)
8 | export(append_dbnsfp_var_link)
9 | export(append_drug_var_link)
10 | export(append_gwas_citation_phenotype)
11 | export(append_oncogenicity_docs)
12 | export(append_targeted_drug_annotations)
13 | export(append_tcga_var_link)
14 | export(append_tfbs_annotation)
15 | export(assign_amp_asco_tiers)
16 | export(assign_germline_popfreq_status)
17 | export(assign_mutation_type)
18 | export(assign_somatic_classification)
19 | export(assign_somatic_germline_evidence)
20 | export(check_common_colnames)
21 | export(check_file_exists)
22 | export(clinvar_germline_status)
23 | export(cosmic_somatic_status)
24 | export(dbsnp_germline_status)
25 | export(deduplicate_eitems)
26 | export(detect_vcf_sample_name)
27 | export(df_string_replace)
28 | export(exclude_non_chrom_variants)
29 | export(expand_biomarker_items)
30 | export(export_quarto_evars)
31 | export(filter_eitems_by_site)
32 | export(filter_maf_file)
33 | export(filter_read_support)
34 | export(generate_annotation_link)
35 | export(generate_report)
36 | export(generate_report_data_expression)
37 | export(generate_report_data_kataegis)
38 | export(generate_report_data_msi)
39 | export(generate_report_data_rainfall)
40 | export(generate_report_data_signatures)
41 | export(generate_report_data_tmb)
42 | export(generate_report_data_trials)
43 | export(generate_tier_tsv)
44 | export(get_clin_assocs_cna)
45 | export(get_dt_tables)
46 | export(get_excel_sheets)
47 | export(get_genome_obj)
48 | export(get_oncogenic_cna_events)
49 | export(get_prevalent_site_signatures)
50 | export(get_tumor_only_filtering_criteria)
51 | export(get_valid_chromosomes)
52 | export(get_variant_statistics)
53 | export(het_af_germline_status)
54 | export(hex_to_rgba)
55 | export(hom_af_status)
56 | export(init_cna_vstats)
57 | export(init_expression_content)
58 | export(init_germline_content)
59 | export(init_kataegis_content)
60 | export(init_m_signature_content)
61 | export(init_msi_content)
62 | export(init_rainfall_content)
63 | export(init_report)
64 | export(init_snv_indel_vstats)
65 | export(init_tmb_content)
66 | export(init_tumor_only_content)
67 | export(init_var_content)
68 | export(kataegis_detect)
69 | export(kataegis_input)
70 | export(load_all_eitems)
71 | export(load_cpsr_classified_variants)
72 | export(load_dna_variants)
73 | export(load_eitems)
74 | export(load_expression_csq)
75 | export(load_expression_outliers)
76 | export(load_expression_similarity)
77 | export(load_reference_data)
78 | export(load_somatic_cna)
79 | export(load_somatic_snv_indel)
80 | export(load_yaml)
81 | export(log4r_debug)
82 | export(log4r_fatal)
83 | export(log4r_info)
84 | export(log4r_warn)
85 | export(log_var_eitem_stats)
86 | export(max_af_gnomad)
87 | export(mkdir)
88 | export(msi_indel_fraction_plot)
89 | export(msi_indel_load_plot)
90 | export(order_variants)
91 | export(plot_cna_segments)
92 | export(plot_filtering_stats_exonic)
93 | export(plot_filtering_stats_germline)
94 | export(plot_signature_contributions)
95 | export(plot_tmb_primary_site_tcga)
96 | export(plot_value_boxes)
97 | export(plotly_pie_chart)
98 | export(pon_status)
99 | export(predict_msi_status)
100 | export(qc_var_eitems)
101 | export(remove_cols_from_df)
102 | export(sort_chromosomal_segments)
103 | export(strip_html)
104 | export(structure_var_eitems)
105 | export(tcga_somatic_status)
106 | export(tier_af_distribution)
107 | export(update_report)
108 | export(vaf_plot)
109 | export(variant_stats_report)
110 | export(write_processed_vcf)
111 | export(write_report_excel)
112 | export(write_report_quarto_html)
113 | export(write_report_tsv)
114 | importFrom(rlang,":=")
115 | importFrom(rlang,.data)
116 |
--------------------------------------------------------------------------------
/pcgrr/R/variant_stats.R:
--------------------------------------------------------------------------------
1 | #' Function that computes various variant statistics from a data frame
2 | #' with variant records
3 | #'
4 | #' @param var_df data frame with variants
5 | #' @param pct_other_limit numeric value specifying the percentage limit
6 | #' for the 'Other' category
7 | #'
8 | #' @export
9 | #'
10 | get_variant_statistics <- function(var_df = NULL, pct_other_limit = 4){
11 |
12 | assertthat::assert_that(
13 | !is.null(var_df),
14 | is.data.frame(var_df),
15 | msg = "Argument 'var_df' must be a valid data.frame"
16 | )
17 |
18 | assertable::assert_colnames(
19 | var_df, c("VARIANT_CLASS", "CONSEQUENCE","CODING_STATUS"),
20 | only_colnames = F, quiet = T
21 | )
22 |
23 | consequence_stats <-
24 | var_df |>
25 | dplyr::mutate(CONSEQUENCE = stringr::str_replace_all(
26 | .data$CONSEQUENCE, "(, [0-9A-Za-z_]{1,}){1,}$",""
27 | )) |>
28 | dplyr::group_by(.data$CONSEQUENCE) |>
29 | dplyr::summarise(
30 | N = dplyr::n(),
31 | .groups = "drop"
32 | ) |>
33 | dplyr::arrange(dplyr::desc(.data$N))
34 |
35 | if(NROW(consequence_stats) > 5) {
36 | consequence_stats_top <- utils::head(consequence_stats, 4)
37 | consequence_stats_other <- consequence_stats |>
38 | dplyr::slice_tail(n = -4) |>
39 | dplyr::summarise(
40 | N = sum(.data$N),
41 | CONSEQUENCE = "other_consequences"
42 | )
43 | consequence_stats <- dplyr::bind_rows(
44 | consequence_stats_top, consequence_stats_other) |>
45 | dplyr::arrange(dplyr::desc(.data$N))
46 | }
47 |
48 | consequence_stats <- consequence_stats |>
49 | dplyr::mutate(Pct = .data$N / sum(.data$N) * 100)
50 |
51 | consequence_stats_coding <-
52 | var_df |>
53 | dplyr::filter(.data$CODING_STATUS == "coding")
54 |
55 | if(NROW(consequence_stats_coding) > 0) {
56 | consequence_stats_coding <-
57 | consequence_stats_coding |>
58 | dplyr::mutate(CONSEQUENCE = stringr::str_replace_all(
59 | .data$CONSEQUENCE, "(, [0-9A-Za-z_]{1,}){1,}$",""
60 | )) |>
61 | dplyr::group_by(.data$CONSEQUENCE) |>
62 | dplyr::summarise(
63 | N = dplyr::n(),
64 | .groups = "drop"
65 | ) |>
66 | dplyr::arrange(dplyr::desc(.data$N))
67 |
68 | if(NROW(consequence_stats_coding) > 5) {
69 | consequence_stats_coding_top <- utils::head(consequence_stats_coding, 4)
70 | consequence_stats_coding_other <- consequence_stats_coding |>
71 | dplyr::slice_tail(n = -4) |>
72 | dplyr::summarise(
73 | N = sum(.data$N),
74 | CONSEQUENCE = "other_consequences"
75 | )
76 | consequence_stats_coding <- dplyr::bind_rows(
77 | consequence_stats_coding_top,
78 | consequence_stats_coding_other) |>
79 | dplyr::arrange(dplyr::desc(.data$N))
80 | }
81 |
82 | consequence_stats_coding <-
83 | consequence_stats_coding |>
84 | dplyr::mutate(Pct = .data$N / sum(.data$N) * 100)
85 | }
86 |
87 |
88 | variant_class_stats <-
89 | var_df |>
90 | dplyr::group_by(.data$VARIANT_CLASS) |>
91 | dplyr::summarise(
92 | N = dplyr::n(),
93 | .groups = "drop"
94 | ) |>
95 | dplyr::mutate(Pct = .data$N / sum(.data$N) * 100) |>
96 | dplyr::arrange(dplyr::desc(.data$Pct))
97 |
98 | coding_stats <-
99 | var_df |>
100 | dplyr::group_by(.data$CODING_STATUS) |>
101 | dplyr::summarise(
102 | N = dplyr::n(),
103 | .groups = "drop"
104 | ) |>
105 | dplyr::mutate(Pct = .data$N / sum(.data$N) * 100) |>
106 | dplyr::arrange(dplyr::desc(.data$Pct))
107 |
108 | result <- list()
109 | result[['consequence']] <- consequence_stats
110 | result[['consequence_coding']] <- consequence_stats_coding
111 | result[['variant_class']] <- variant_class_stats
112 | result[['coding']] <- coding_stats
113 |
114 | return(result)
115 | }
116 |
--------------------------------------------------------------------------------
/pcgrr/R/data.R:
--------------------------------------------------------------------------------
1 | #' List of URLs and variant identifiers for variant/gene/protein domain databases
2 | #'
3 | #'
4 | #' @format A data.frame with 6 rows and 5 columns that indicates URLs for various variant/gene databases
5 | #' and how to use PCGR annotation columns to generate variant links
6 | #' \itemize{
7 | #' \item \emph{name} - Name encoding for variant/gene database
8 | #' \item \emph{group_by_var} - Which column should be used for grouping
9 | #' \item \emph{url_prefix} - URL prefix
10 | #' \item \emph{link_key_var} - Which column to be used as the key value in link
11 | #' \item \emph{link_display_var} - Which column to be used as the display variable in link
12 | #' }
13 | #'
14 | "variant_db_url"
15 |
16 |
17 | #' Oncogenicity criteria (ClinGen/CGC/VICC)
18 | #'
19 | "oncogenicity_criteria"
20 |
21 | #' Fixed data types/categories used for biomarker evidence, e.g. 'types','levels' etc.
22 | #'
23 | "biomarker_evidence"
24 |
25 | #' List of coltype definitions for input files to pcgrr (e.g. VCF-converted TSV, CNA TSV etc.)
26 | #'
27 | "data_coltype_defs"
28 |
29 | #' List of COSMIC reference mutational signatures (SBS, v3.4)
30 | #'
31 | #' @format A list with two matrix objects ('all' and 'no_artefacts').
32 | #' One matrix contains the COSMIC reference mutational signatures without signature
33 | #' artefacts ('no_artefacts', number of columns = 68), while the other contains
34 | #' all signatures, including artefacts ('all', number of columns = 86). Each
35 | #' matrix has 96 rows, one for each of the 96 possible trinucleotide contexts.
36 | #'
37 | "cosmic_sbs_signatures"
38 |
39 | #' Data frame with all TCGA cohorts
40 | #'
41 | #' @format A data.frame with 33 rows and 2 columns that indicates TCGA cohorts
42 | "tcga_cohorts"
43 |
44 |
45 | #' Data frame with immune cell types
46 | #'
47 | #' @format A data.frame with 11 rows and 2 columns that indicates immune
48 | #' cell types used in immune contexture analysis by quanTIseq
49 | #'
50 | #'
51 | "immune_celltypes"
52 |
53 | #' Germline filtering criteria
54 | #'
55 | #' @format A character vector listing all germline filtering criteria
56 | #' applied on input callsets (SNVs/InDels) in tumor-only mode
57 | #'
58 | "germline_filter_levels"
59 |
60 | #' List of URLs for a range of variant effect prediction algorithms
61 | #'
62 | #'
63 | #' @format A data.frame with 21 rows and 3 columns that indicates URLs for
64 | #' variant effect prediction algorithms
65 | #' \itemize{
66 | #' \item \emph{algorithm} - Name encoding for effect prediction algorithm
67 | #' \item \emph{url} - URL
68 | #' \item \emph{display_name} - Display name for use in reporting
69 | #' }
70 | #'
71 | "effect_prediction_algos"
72 |
73 | #' Regular expression of terms indicative of cancer-related phenotypes and syndromes
74 | #'
75 | #' @format A long regular expression of cancer-related phenotype terms
76 | #'
77 | "cancer_phenotypes_regex"
78 |
79 | #' Color encodings for report elements of PCGR/CPSR
80 | #'
81 | #' @format A list object with different report elements that are color-coded in
82 | #' PCGR/CPSR reports. Each list element has two vectors: 'levels' and 'values'.
83 | #' Currently, the following list elements are included:
84 | #' \itemize{
85 | #' \item \emph{pathogenicity} - Colors for five-level pathogenicity levels (CPSR)
86 | #' \item \emph{clinical_evidence} - Colors for strength of evidence of cancer-variant associations (A-E)
87 | #' \item \emph{tier} - Colors for tier levels for variant prioritization (PCGR)
88 | #' \item \emph{report_color} - Colors for PCGR assay mode (tumor-control vs. tumor-only)
89 | #' \item \emph{warning} - Color for warning (low confidence in PCGR analysis output)
90 | #' \item \emph{success} - Color for success (no evident uncertainty in PCGR analysis output)
91 | #' }
92 | #'
93 | "color_palette"
94 |
95 | #' TSV columns
96 | "tsv_cols"
97 |
98 | #' DT Display
99 | "dt_display"
100 |
--------------------------------------------------------------------------------
/pcgrr/R/expression.R:
--------------------------------------------------------------------------------
1 | #' Function that generates expression data for PCGR report
2 | #'
3 | #' @param ref_data PCGR reference data object
4 | #' @param settings PCGR run/configuration settings
5 | #'
6 | #' @export
7 | generate_report_data_expression <-
8 | function(ref_data = NULL,
9 | settings = NULL) {
10 |
11 | pcg_report_expression <-
12 | pcgrr::init_expression_content()
13 |
14 | pcg_report_expression[["eval"]] <- TRUE
15 |
16 | if(as.logical(settings$conf$expression$similarity_analysis) == TRUE){
17 | pcg_report_expression[["similarity_analysis"]] <-
18 | load_expression_similarity(settings = settings)
19 | }
20 |
21 | if(settings$molecular_data$fname_expression_outliers_tsv != "None" &&
22 | file.exists(settings$molecular_data$fname_expression_outliers_tsv)){
23 |
24 | pcg_report_expression[["outliers"]] <-
25 | pcgrr::load_expression_outliers(settings = settings,
26 | ref_data = ref_data)
27 | }
28 |
29 | if(settings$molecular_data$fname_expression_tsv != "None" &&
30 | file.exists(settings$molecular_data$fname_expression_tsv)){
31 |
32 | exp_data <-
33 | readr::read_tsv(
34 | settings$molecular_data$fname_expression_tsv,
35 | show_col_types = FALSE, na = "."
36 | )
37 |
38 | pcg_report_expression[["expression"]] <- exp_data
39 |
40 | if("SYMBOL" %in% colnames(exp_data) == FALSE ||
41 | ("TPM" %in% colnames(exp_data) == FALSE &&
42 | "TPM_GENE" %in% colnames(exp_data) == FALSE) ||
43 | "BIOTYPE" %in% colnames(exp_data) == FALSE){
44 | pcgrr::log4r_warn(
45 | "Missing a required column in expression file: SYMBOL, TPM/TPM_GENE, BIOTYPE")
46 | }else{
47 |
48 | n_pc <- sum(exp_data$BIOTYPE == "protein_coding", na.rm = TRUE)
49 |
50 | if("TPM_GENE" %in% colnames(exp_data)){
51 | exp_data$TPM <- as.numeric(exp_data$TPM_GENE)
52 | }
53 |
54 | if(n_pc > 0){
55 | pcgrr::log4r_info(
56 | "Estimating immune contexture of tumor sample from RNA-seq data")
57 | exp_protein_coding <- exp_data |>
58 | dplyr::filter(.data$BIOTYPE == "protein_coding") |>
59 | dplyr::group_by(.data$SYMBOL) |>
60 | dplyr::summarise(TPM = sum(.data$TPM, na.rm = TRUE)) |>
61 | dplyr::select(c("SYMBOL", "TPM")) |>
62 | dplyr::distinct()
63 |
64 | if(NROW(exp_protein_coding) > 0){
65 | rown <- exp_protein_coding$SYMBOL
66 | mat <- as.matrix(exp_protein_coding$TPM)
67 | rownames(mat) <- rown
68 | colnames(mat) <- "TPM"
69 |
70 | pcg_report_expression[["immune_contexture"]] <-
71 | suppressMessages(quantiseqr::run_quantiseq(
72 | expression_data = mat,
73 | is_tumordata = TRUE
74 | ))
75 |
76 | if(is.data.frame(pcg_report_expression[["immune_contexture"]]) &&
77 | "Sample" %in% colnames(pcg_report_expression[["immune_contexture"]])){
78 | pcg_report_expression[["immune_contexture"]] <-
79 | pcg_report_expression[["immune_contexture"]] |>
80 | dplyr::rename(sample_id = "Sample") |>
81 | dplyr::mutate(sample_id = settings$sample_id)
82 | rownames(pcg_report_expression[["immune_contexture"]]) <- NULL
83 |
84 | pcg_report_expression[["immune_contexture"]] <-
85 | pcg_report_expression[["immune_contexture"]] |>
86 | tidyr::pivot_longer(
87 | !.data$sample_id, names_to = "method_cell_type",
88 | values_to = "fraction") |>
89 | dplyr::mutate(fraction = round(
90 | .data$fraction, digits = 3)) |>
91 | dplyr::left_join(
92 | pcgrr::immune_celltypes, by = "method_cell_type") |>
93 | dplyr::distinct()
94 |
95 | }
96 | }
97 | }
98 | }
99 |
100 | }
101 |
102 |
103 | return(pcg_report_expression)
104 | }
105 |
106 |
107 |
--------------------------------------------------------------------------------
/pcgrr/inst/templates/pcgr_quarto_report/snv_indel/variant_filtering.qmd:
--------------------------------------------------------------------------------
1 | ```{r variant_filtering_prep}
2 | #| eval: true
3 | #| output: asis
4 | #| echo: false
5 |
6 | #pcg_report <- readRDS(
7 | # file = "/Users/sigven/project_data/packages/package__pcgr/pcgr/pcgrr/pcg_report.rds")
8 |
9 | to_settings <- pcg_report$settings$conf$somatic_snv$tumor_only
10 |
11 | filtering_stats <- list()
12 |
13 | filtering_stats[['germline']] <- pcgrr::plot_filtering_stats_germline(
14 | report = pcg_report)
15 |
16 |
17 | if(pcg_report$settings$conf$somatic_snv$tumor_only$exclude_nonexonic == TRUE){
18 |
19 | filtering_stats[['exonic']] <- pcgrr::plot_filtering_stats_exonic(
20 | report = pcg_report,
21 | plot_margin_bottom = 100)
22 | }
23 |
24 |
25 |
26 | ```
27 |
28 | ### Variant filtering
29 |
30 | To minimize the impact of germline events among input variants (SNVs/InDels) called through a **tumor-only assay**, we show here the results of the germline variant filters applied to the raw input set. The filters classify each variant as somatic or likely germline (i.e. caught by one or more filters). Likely germline variants are excluded and are not used in any analysis shown in this report.
31 |
32 | ::: {.panel-tabset}
33 |
34 | #### Filtering results
35 |
36 | ```{r variant_filtering_pie}
37 | #| eval: true
38 | #| output: asis
39 | #| echo: false
40 |
41 |
42 | if(pcg_report$settings$conf$somatic_snv$tumor_only$exclude_nonexonic == TRUE){
43 |
44 | bslib::card(
45 | full_screen = TRUE,
46 | height = "365px",
47 | bslib::card_header(
48 | class = "bg-dark",
49 | paste0("Variant filtering statistics - ",
50 | pcg_report$settings$sample_id)),
51 | bslib::card_body(
52 | bslib::layout_columns(
53 | widths = c(6,6),
54 | filtering_stats[['germline']]$plot,
55 | filtering_stats[['exonic']]$plot
56 | )
57 | )
58 | )
59 |
60 | }else{
61 | bslib::card(
62 | full_screen = TRUE,
63 | height = "365px",
64 | bslib::card_header(
65 | class = "bg-dark",
66 | paste0("Variant filtering statistics - ",
67 | pcg_report$settings$sample_id)),
68 | bslib::card_body(
69 | filtering_stats[['germline']]$plot)
70 | )
71 | }
72 |
73 |
74 | ```
75 |
76 | #### Settings
77 |
78 | The variant filtering has been performed based on the following criteria:
79 |
80 | :::: {.columns}
81 |
82 | ::: {.column width="47.5%"}
83 |
84 | * Variant filtering against gnomAD (filter *GERMLINE_GNOMAD*): __TRUE__ (not configurable)
85 | * maximum allowed population-specific gnomAD MAFs for somatic events (configurable):
86 | * AFR: __`r to_settings$maf_gnomad_afr`__
87 | * AMR: __`r to_settings$maf_gnomad_amr`__
88 | * ASJ: __`r to_settings$maf_gnomad_asj`__
89 | * EAS: __`r to_settings$maf_gnomad_eas`__
90 | * FIN: __`r to_settings$maf_gnomad_fin`__
91 | * NFE: __`r to_settings$maf_gnomad_nfe`__
92 | * OTH: __`r to_settings$maf_gnomad_oth`__
93 | * SAS: __`r to_settings$maf_gnomad_sas`__
94 | * maximum allowed global gnomAD MAF for somatic events: __`r to_settings$maf_gnomad_global`__
95 |
96 | :::
97 |
98 | ::: {.column width="5%"}
99 |
100 | :::
101 |
102 | ::: {.column width="47.5%"}
103 | * Variant filtering against ClinVar (filter *GERMLINE_CLINVAR*): __`r as.logical(to_settings$exclude_clinvar_germline)`__
104 | * Variant filtering against dbSNP (filter *GERMLINE_DBSNP*): __`r as.logical(to_settings$exclude_dbsnp_nonsomatic)`__
105 | * Variant filtering based on tumor VAF value (likely heterozygous germline event, VAF in 0.4 - 0.6, filter *GERMLINE_HET*): __`r as.logical(to_settings$exclude_likely_het_germline)`__
106 | * Variant filtering based on tumor VAF value (likely homozygous germline event, VAF = 1, filter *GERMLINE_HOM*): __`r as.logical(to_settings$exclude_likely_hom_germline)`__
107 | * Variant filtering against panel-of-normals (filter *GERMLINE_PON*): __`r as.logical(to_settings$exclude_pon)`__
108 | * Variant filtering based on exonic regions: __`r as.logical(to_settings$exclude_nonexonic)`__
109 |
110 | :::
111 |
112 | ::::
113 |
114 | :::
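
The criteria listed under *Settings* can be illustrated with a small, non-executed R sketch. It only covers the gnomAD and tumor VAF filters as they are described above, and it is not the PCGR implementation, which may apply additional conditions. The function name `flag_likely_germline`, the column names `gnomad_af_global` and `vaf_tumor`, the default MAF threshold, and the inclusive VAF boundaries are all assumptions made for illustration.

```r
# Sketch only: flag likely germline variants from a global gnomAD MAF
# threshold and the tumor VAF. Column names and defaults are hypothetical.
flag_likely_germline <- function(variants,
                                 max_maf_global = 0.002,
                                 exclude_het = TRUE,
                                 exclude_hom = TRUE) {
  maf <- variants$gnomad_af_global
  vaf <- variants$vaf_tumor

  # GERMLINE_GNOMAD: population frequency above the allowed maximum
  variants$GERMLINE_GNOMAD <- !is.na(maf) & maf > max_maf_global
  # GERMLINE_HET: VAF in 0.4 - 0.6 (boundaries assumed inclusive)
  variants$GERMLINE_HET <- exclude_het & !is.na(vaf) & vaf >= 0.4 & vaf <= 0.6
  # GERMLINE_HOM: VAF equal to 1
  variants$GERMLINE_HOM <- exclude_hom & !is.na(vaf) & vaf == 1

  variants$likely_germline <- variants$GERMLINE_GNOMAD |
    variants$GERMLINE_HET | variants$GERMLINE_HOM
  variants
}

# Toy input with made-up values
toy_variants <- data.frame(
  id = c("v1", "v2", "v3", "v4"),
  gnomad_af_global = c(NA, 0.05, 0.0001, NA),
  vaf_tumor = c(0.23, 0.31, 0.5, 1.0)
)
flagged <- flag_likely_germline(toy_variants)
flagged[, c("id", "likely_germline")]
```

In this toy input only `v1` passes all three filters. Variants caught by any single filter are treated as likely germline, mirroring the somatic versus likely germline classification described at the top of this section.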
115 |
116 |
117 |
--------------------------------------------------------------------------------
/pcgrr/inst/templates/pcgr_quarto_report/clinicaltrials.qmd:
--------------------------------------------------------------------------------
1 |
2 | ## Clinical trials
3 |
4 | * Ongoing or planned clinical trials in the relevant tumor type have been retrieved from [clinicaltrials.gov](https://clinicaltrials.gov), focusing on the subset with molecularly targeted therapies
5 | * Key information entities (interventions/drugs, conditions) in trial records have been mapped to established thesauri ([ChEMBL](https://www.ebi.ac.uk/chembl/), [NCI Thesaurus](https://ncithesaurus.nci.nih.gov/ncitbrowser/), [UMLS/MedGen](https://www.ncbi.nlm.nih.gov/medgen/))
6 | * Results from a text-mining procedure on unstructured trial text (e.g. inclusion/exclusion criteria) attempt to highlight the presence of established molecular biomarkers in cancer and relevant therapeutic contexts.
7 |
8 |
9 |
10 | ```{r table_browse_trials, echo=F, results = "asis", eval = !pcg_report[['content']][['clinicaltrials']][['missing_data']]}
11 |
12 | trials_ttype <- crosstalk::SharedData$new(pcg_report[['content']][['clinicaltrials']][['trials']])
13 | crosstalk::bscols(
14 | list(
15 | crosstalk::filter_select("CONDITION_RAW", "Condition (cancer subtype)", trials_ttype, ~CONDITION_RAW),
16 | crosstalk::filter_select("OVERALL_STATUS", "Status", trials_ttype, ~OVERALL_STATUS),
17 | crosstalk::filter_select("WORLD_REGION", "Location", trials_ttype, ~WORLD_REGION),
18 | crosstalk::filter_select("INTERVENTION_RAW", "Drug(s)", trials_ttype, ~INTERVENTION_RAW),
19 | crosstalk::filter_select("INTERVENTION_TARGET", "Drug target(s)", trials_ttype, ~INTERVENTION_TARGET),
20 | crosstalk::filter_select("KEYWORD", "Therapeutic context mentions (text-mined)", trials_ttype, ~KEYWORD),
21 | crosstalk::filter_select("BIOMARKER_INDEX", "Biomarker mentions (text-mined)", trials_ttype, ~BIOMARKER_INDEX)
22 |
23 | ),
24 | list(
25 | crosstalk::filter_select("PHASE","Phase",trials_ttype,~PHASE),
26 | crosstalk::filter_checkbox("GENDER", "Gender", trials_ttype, ~GENDER),
27 | crosstalk::filter_slider("MINIMUM_AGE", "Minimum age", trials_ttype, ~MINIMUM_AGE),
28 | crosstalk::filter_slider("MAXIMUM_AGE", "Maximum age", trials_ttype, ~MAXIMUM_AGE),
29 | crosstalk::filter_select("METASTASES_INDEX", "Metastases mentions (text-mined)", trials_ttype, ~METASTASES_INDEX)
30 | )
31 | )
32 |
33 |
34 | ```
35 |
36 |
37 | ```{r trials_missing_filters, echo=F, results = 'asis', eval = pcg_report[['content']][['clinicaltrials']][['missing_data']]}
38 | cat('\n* No molecularly targeted trials retrieved for the tumor type in question.', sep='\n')
39 | cat('\n')
40 | ```
41 |
42 |
43 |
44 |
45 | ```{r trials_table_all, eval = !pcg_report[['content']][['clinicaltrials']][['missing_data']]}
46 |
47 | trials_ttype |>
48 | DT::datatable(escape = F,
49 | extensions = c("Buttons","Responsive"),
50 | options = list(pageLength = 10,
51 | buttons = c('csv','excel'),
52 | dom = 'Bfrtip')) |>
53 | DT::formatStyle("OVERALL_STATUS",color="white",
54 | backgroundColor = DT::styleEqual(c('Recruiting',
55 | 'Not yet recruiting',
56 | 'Active, not recruiting',
57 | 'Enrolling by invitation',
58 | 'Completed',
59 | 'Suspended',
60 | 'Withdrawn',
61 | 'Unknown status'), c("#00a65a","#00a65a","#CD534C","#CD534C","#CD534C","#CD534C","#CD534C", "#8F7700")))
62 |
63 | # DT::formatStyle(color="white", "SYMBOL", "BM_RESOLUTION", fontWeight = 'bold', `text-align` = 'center',
64 | # backgroundColor = DT::styleEqual(c('exact','codon','exon','gene'),
65 | # c('#000','#000',pcgrr::color_palette[['warning']][['values']][1], pcgrr::color_palette[['warning']][['values']][1])))
66 |
67 | ```
68 |
69 |
70 | ```{r trials_missing_data, echo=F, results = 'asis', eval = pcg_report[['content']][['clinicaltrials']][['missing_data']]}
71 | cat('\n* No molecularly targeted trials retrieved for the tumor type in question.',sep='\n')
72 | cat('\n')
73 | ```
74 |
--------------------------------------------------------------------------------
/pcgrr/vignettes/faq.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "FAQ"
3 | output: rmarkdown::html_document
4 | ---
5 |
6 | Frequently asked questions regarding PCGR usage and functionality:
7 |
8 | __1. I do not see any data related to allelic depth/support in my report. I thought that PCGR can grab this information automatically from my VCF?__
9 |
10 | _Answer: VCF variant genotype data (i.e. AD/DP) is something that you as a user need to specify explicitly when running PCGR. In our experience, there is currently no uniform way that variant callers format these types of data (allelic fraction/depth, tumor/normal) in the VCF, and this makes it very challenging for PCGR to automatically grab this information from any VCF. Please take a careful look at the example VCF files (`examples` folder) that come with PCGR to see how PCGR expects this information to be formatted, and make sure your VCF is formatted accordingly. There is also an in-depth explanation of the matter [here](input.html#formatting-of-allelic-depthsupport-dpad)._
11 |
12 | __2. Is it possible to utilize PCGR for analysis of multiple samples?__
13 |
14 | _Answer: As the name of the tool implies, PCGR was developed for the detailed analysis of individual tumor samples. However, if you take advantage of the different outputs from PCGR, it can also be utilized for analysis of multiple samples. First, make sure your input files are organized per sample (i.e. one VCF file per sample, one CNA file per sample), so that they can be fed directly to PCGR. Once all samples have been processed with PCGR, note that all the tab-separated output files (i.e. annotated SNVs, gene copy numbers) contain the sample identifier, which allows them to be aggregated for a downstream multi-sample analysis. Also note the multi-sheet Excel workbook, which contains numerous outputs from PCGR, and can be processed to aggregate findings across samples._
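
A minimal sketch of such an aggregation step is shown below, using base R only. It assumes one tab-separated file per sample collected in a single directory. The helper name `aggregate_pcgr_tsv`, the file names, and the `SAMPLE_ID` and `SYMBOL` columns in the toy data are illustrative assumptions, so check the column names in your own PCGR output files before adapting it.

```r
# Sketch only: stack per-sample TSV outputs into a single data frame.
# File names and the SAMPLE_ID/SYMBOL columns are illustrative assumptions.
aggregate_pcgr_tsv <- function(tsv_files) {
  stopifnot(length(tsv_files) > 0, all(file.exists(tsv_files)))
  per_sample <- lapply(tsv_files, function(f) {
    utils::read.delim(f, header = TRUE, sep = "\t",
                      stringsAsFactors = FALSE, check.names = FALSE)
  })
  # Keep only the columns shared by all files so that rbind() succeeds
  shared_cols <- Reduce(intersect, lapply(per_sample, colnames))
  do.call(rbind, lapply(per_sample, function(df) {
    df[, shared_cols, drop = FALSE]
  }))
}

# Toy demonstration with two made-up single-variant files
out_dir <- tempfile("pcgr_demo_")
dir.create(out_dir)
for (id in c("sampleA", "sampleB")) {
  utils::write.table(
    data.frame(SAMPLE_ID = id, SYMBOL = "TP53", stringsAsFactors = FALSE),
    file = file.path(out_dir, paste0(id, ".tsv")),
    sep = "\t", quote = FALSE, row.names = FALSE)
}
all_samples <- aggregate_pcgr_tsv(
  list.files(out_dir, pattern = "\\.tsv$", full.names = TRUE))
table(all_samples$SAMPLE_ID)
```

Restricting the result to shared columns is a deliberate simplification: it keeps the sketch robust if individual runs produced slightly different column sets, at the cost of dropping columns that are missing from any one file.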
15 |
16 | __3. I do not see the expected transcript-specific consequence for a particular variant. In what way is the primary variant consequence established?__
17 |
18 | _Answer: PCGR relies upon_ [VEP](https://www.ensembl.org/info/docs/tools/vep/index.html) _for consequence prioritization, in which a specific transcript-specific consequence is chosen as the primary variant consequence. In the PCGR configuration file, you may customise how this is chosen by changing the order of criteria applied when choosing a primary consequence block - parameter_ [vep_pick_order](https://www.ensembl.org/info/docs/tools/vep/script/vep_other.html#pick_options)
19 |
20 | __4. Is it possible to use RefSeq as the underlying gene transcript model in PCGR?__
21 |
22 | _Answer: No. PCGR uses GENCODE as the primary gene transcript model, but we provide cross-references to corresponding RefSeq transcripts when these are available._
23 |
24 | __5. I have a VCF with structural variants detected in my tumor sample, can PCGR process those as well?__
25 |
26 | _Answer: This is currently not supported as input for PCGR, but is something we want to incorporate in the future._
27 |
28 | __6. Is it possible to see all the individual cancer subtypes that belong to each of the 30 different tumor sites?__
29 |
30 | _Answer: Yes, see_ [an overview of phenotypes associated with primary tumor sites](primary_tumor_sites.html). See also the related GitHub repository [phenOncoX](https://github.com/sigven/phenOncoX)
31 |
32 | __7. Are there any plans to incorporate genomic biomarker evidence from__ [OncoKB](https://www.oncokb.org) __in PCGR?__
33 |
34 | _Answer: No. PCGR relies upon publicly available, open-source resources, and on the PCGR reference bundle being freely distributable to the user community. It is our understanding that_ [OncoKB's terms of use](https://www.oncokb.org/terms) _do not fit well with this strategy._
35 |
36 | __8. I have RNA fusion data that I want to analyse and include in the report. Is this possible with PCGR?__
37 |
38 | _Answer: This is currently not supported as input for PCGR, but is something we are actively working on. The focus will be on whether detected RNA fusion events are previously known (i.e. seen in other tumor samples, e.g. from the Mitelman database), and whether some of them are presently in use as biomarkers for diagnosis or treatment._
39 |
40 | __9. Is it possible for the users to update the data bundle to get the most recent versions of all underlying data sources?__
41 |
42 | _Answer: As of now, the data bundle is updated only with each release of PCGR. The data harmonization pipeline for the knowledge databases in PCGR contains numerous and complex procedures, with several cleaning, quality control, and re-formatting steps, and is semi-automated in its present form. The versions of all databases and key software elements are outlined in each PCGR report._
43 |
--------------------------------------------------------------------------------