165 |
166 |
167 |
168 | Langmead, Ben, Cole Trapnell, Mihai Pop, and Steven L. Salzberg. 2009.
169 | “Ultrafast and Memory-Efficient Alignment of Short DNA Sequences to the
170 | Human Genome.” *Genome Biology* 10 (3): R25.
171 |
172 |
173 |
174 |
175 |
176 |
177 | Li, Heng, and Richard Durbin. 2009. “Fast and Accurate Short Read
178 | Alignment with Burrows–Wheeler Transform.” *Bioinformatics* 25 (14):
179 | 1754–60.
180 |
181 |
182 |
183 |
184 |
--------------------------------------------------------------------------------
/Design_CRISPRbe/README.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Using crisprDesign to design gRNAs for CRISPRbe"
3 | author: Jean-Philippe Fortin, Luke Hoberecht
4 | output:
5 | github_document:
6 | toc: true
7 | bibliography: references.bib
8 | ---
9 |
10 | ```{r, echo=FALSE, results="hide"}
11 | options("knitr.graphics.auto_pdf"=TRUE)
12 | ```
13 |
14 |
15 | # Introduction
16 |
17 | In this tutorial, we illustrate the CRISPR base editing (CRISPRbe) functionalities
18 | of `crisprDesign` by designing and characterizing gRNAs targeting the
19 | human gene KRAS using the cytidine base editor BE4max [@koblan2018improving].
20 |
21 |
22 | # Installation
23 |
24 | See the [Installation tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Installation) to learn how to install the packages necessary for this tutorial:
25 | `crisprDesign`, `crisprDesignData`
26 |
27 |
28 | # Terminology
29 |
30 | See the [CRISPRko design tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) to get familiar
31 | with the terminology used throughout this tutorial.
32 |
33 |
34 | # CRISPR base editing with BE4max
35 |
36 | ## Loading packages
37 |
38 | We first load the necessary packages for this tutorial:
39 |
40 | ```{r, warning=FALSE, message=FALSE, results='hide'}
41 | library(crisprBase)
42 | library(crisprDesign)
43 | library(crisprDesignData)
44 | library(BSgenome.Hsapiens.UCSC.hg38)
45 | ```
46 |
47 |
48 | ## Creating the GuideSet
49 |
50 | We first load the BE4max `BaseEditor` object from the `crisprBase` package:
51 |
52 | ```{r}
53 | data(BE4max, package="crisprBase")
54 | BE4max
55 | ```
56 |
57 | The editing probabilities of the base editor BE4max are stored in a
58 | matrix where rows correspond to the different nucleotide substitutions,
59 | and columns correspond to the genomic coordinate relative to the PAM site.
60 | The `editingWeights` function from `crisprBase` retrieves those probabilities.
61 | One can see that C to T editing is optimal around 15 nucleotides upstream of
62 | the PAM site for the BE4max base editor:
63 |
64 | ```{r}
65 | crisprBase::editingWeights(BE4max)["C2T",]
66 | ```
67 |
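The position with the highest weight can also be located programmatically. The sketch below is only an illustration: it uses a small made-up named vector in place of the real BE4max weights, and it assumes that the vector names are the positions relative to the PAM site, as described above.

```r
# Made-up C2T weights, named by position relative to the PAM site.
# Real values would come from crisprBase::editingWeights(BE4max)["C2T",].
w <- c("-18"=0.2, "-17"=0.5, "-16"=0.8, "-15"=1.0, "-14"=0.9, "-13"=0.4)

# Name (position) of the entry with the largest weight.
best <- names(w)[which.max(w)]
best
```

The same two lines should apply to the vector returned by `editingWeights`, provided its columns are named by position.
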
68 | Let's create the `GuideSet` containing gRNAs targeting KRAS.
69 |
70 | We first load the data containing gene regions for the human genome
71 | from `crisprDesignData`:
72 |
73 | ```{r}
74 | data("txdb_human", package="crisprDesignData")
75 | ```
76 |
77 | For more information on `txdb_human` and how to create
78 | similar gene annotation objects, see the [Building a gene annotation object tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Building_Gene_Annotation).
79 |
80 |
81 | We will also load the `BSgenome` package containing DNA sequences for the hg38
82 | genome:
83 |
84 | ```{r, message=FALSE, warning=FALSE, results='hide'}
85 | library(BSgenome.Hsapiens.UCSC.hg38)
86 | ```
87 |
88 |
89 | We retrieve the genomic coordinates of the KRAS CDS
90 | ```{r}
91 | gr <- queryTxObject(txObject=txdb_human,
92 | featureType="cds",
93 | queryColumn="gene_symbol",
94 | queryValue="KRAS")
95 | ```
96 |
97 | and design all possible gRNAs using the function `findSpacers`:
98 |
99 | ```{r}
100 | bsgenome <- BSgenome.Hsapiens.UCSC.hg38
101 | gs <- findSpacers(gr,
102 | bsgenome=bsgenome,
103 | crisprNuclease=BE4max)
104 | ```
105 |
106 |
107 | ## Annotating the GuideSet
108 |
109 | Next, we annotate our candidate gRNAs to assess quality. There are several
110 | functions in `crisprDesign` that provide annotation for features that are
111 | not specific to CRISPRbe, for which we refer the reader to the
112 | [CRISPRko design with Cas9](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) tutorial for more information. The sections below will
113 | cover annotation functions that are of particular interest to,
114 | or deserve extra care for, CRISPRbe applications.
115 |
116 |
117 | ### Adding edited alleles
118 |
119 | The function `addEditedAlleles` finds, characterizes, and scores
120 | predicted edited alleles for each gRNA and a chosen transcript.
121 | It requires a transcript-specific annotation that can be obtained
122 | with the `getTxInfoDataFrame` function. Here, we perform the analysis
123 | using the primary isoform of KRAS (Ensembl transcript ID: ENST00000311936).
124 |
125 | We first get the transcript table for our transcript
126 |
127 | ```{r}
128 | txid <- "ENST00000311936"
129 | txTable <- getTxInfoDataFrame(tx_id=txid,
130 | txObject=txdb_human,
131 | bsgenome=bsgenome)
132 | head(txTable)
133 | ```
134 |
135 | and then add the edited alleles annotation to the `GuideSet`:
136 |
137 | ```{r}
138 | editingWindow <- c(-20,-8)
139 | gs <- addEditedAlleles(gs,
140 | baseEditor=BE4max,
141 | txTable=txTable,
142 | editingWindow=editingWindow)
143 | ```
144 |
145 | The `editingWindow` argument specifies the window of editing that we
146 | are interested in. When it is not provided, the function uses the default
147 | window stored in the `BaseEditor` object. Note that large windows can
148 | substantially increase computing time, since the number of possible
149 | alleles grows exponentially with the size of the window.
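
To give a rough sense of this growth, here is a back-of-the-envelope sketch. It is not how `crisprDesign` enumerates alleles: it simply assumes, for illustration, that each editable base in the window has four possible outcomes (unchanged, or one of three substitutions), so that `k` editable bases yield `4^k` candidate alleles.

```r
# Hypothetical illustration only, not the crisprDesign implementation:
# assume 4 possible outcomes per editable base in the editing window.
n_editable <- 1:8
n_alleles <- 4^n_editable

# Candidate allele count as the number of editable bases increases.
data.frame(n_editable, n_alleles)
```

Under this assumption, going from four to eight editable bases multiplies the number of candidate alleles by 256.
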
150 |
151 | Let's retrieve the edited alleles for the first gRNA:
152 |
153 | ```{r}
154 | alleles <- editedAlleles(gs)[[1]]
155 | ```
156 |
157 | We get a `DataFrame` object with useful metadata:
158 |
159 | ```{r}
160 | metadata(alleles)
161 | ```
162 |
163 | The `wildtypeAllele` reports the unedited nucleotide sequence of the
164 | region specified by the editing window (with respect to the gRNA PAM site).
165 | It is always reported from the 5' to 3' direction on the strand corresponding
166 | to the gRNA strand. The `start` and `end` fields specify the
167 | corresponding coordinates on the transcript.
168 |
169 | Let's look at the edited alleles:
170 |
171 | ```{r}
172 | head(alleles)
173 | ```
174 |
175 | The `DataFrame` is ordered by descending values in the `score` column.
176 | This `score` represents the likelihood of the edited allele to occur
177 | relative to all possible edited alleles, and is calculated using the
178 | editing weights stored in the `BE4max` object. The `seq` column represents
179 | the edited nucleotide sequences. As with the `wildtypeAllele` in the metadata,
180 | they are always reported from the 5' to 3' direction on the strand
181 | corresponding to the gRNA strand.
182 |
183 | The `variant` column describes the
184 | functional consequence of the editing event (silent, nonsense or
185 | missense mutation). If an edited allele results in multiple editing
186 | events, as can happen when multiple bases are edited, the most
187 | consequential mutation (nonsense over missense, missense over silent)
188 | is reported. Finally, the `aa` column reports the resulting edited
189 | amino acid sequence, with each single letter code mapping to its
190 | corresponding nucleotide (`*` for termination).
191 |
192 | Note that `addEditedAlleles` also appended several gRNA-level aggregate
193 | scores to the `GuideSet` object:
194 |
195 | ```{r}
196 | head(gs)
197 | ```
198 |
199 | The `score_missense`, `score_nonsense` and `score_silent` columns report
200 | aggregated scores for each mutation type. They are calculated by summing
201 | all scores of a given mutation type across the set of edited alleles for
202 | a given gRNA. The `maxVariant` column indicates the most probable mutation
203 | type for the given gRNA based on the maximum aggregated score, which is
204 | stored in `maxVariantScore`. In our example, the highest score for `spacer_4`
205 | is `score_nonsense`, and so `maxVariant` is set to `nonsense`.
206 |
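The aggregation described above can be sketched in base R. The allele table below is invented for illustration and is not output from `addEditedAlleles`; only the column names `variant` and `score` follow the text.

```r
# Toy allele table for a single gRNA (values are invented).
alleles <- data.frame(
    variant = c("missense", "nonsense", "missense", "silent"),
    score   = c(0.30, 0.45, 0.10, 0.05)
)

# Sum the scores within each mutation type.
agg <- tapply(alleles$score, alleles$variant, sum)

# Most probable mutation type and its aggregated score.
maxVariant <- names(agg)[which.max(agg)]
maxVariantScore <- max(agg)
```

Here the two missense alleles are summed before the comparison, so a mutation type can have the highest aggregated score even when no single allele of that type is the top-scoring one.
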
207 |
208 |
209 | # Session Info
210 |
211 | ```{r}
212 | sessionInfo()
213 | ```
214 |
215 |
216 | # References
217 |
218 |
--------------------------------------------------------------------------------
/Design_Custom_Sequence/README.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Using crisprDesign to design gRNAs for custom sequences"
3 | author: Jean-Philippe Fortin, Luke Hoberecht
4 | output:
5 | github_document:
6 | toc: true
7 | bibliography: references.bib
8 | ---
9 |
10 | ```{r, echo=FALSE, results="hide"}
11 | options("knitr.graphics.auto_pdf"=TRUE)
12 | ```
13 |
14 | # Introduction
15 |
16 | In this tutorial, we illustrate the main functionalities of `crisprDesign`
17 | for designing gRNAs for custom sequences. To design gRNAs for targets
18 | located in an organism's genome, see the [introductory CRISPRko tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9).
19 |
20 |
21 |
22 | # Installation
23 |
24 | See the [Installation tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Installation) to learn how to install the packages necessary for this tutorial:
25 | `crisprDesign`, `crisprDesignData`
26 |
27 |
28 | # Use case: designing gRNAs against EGFP
29 |
30 | Suppose we are engineering a human cell line to express the
31 | enhanced green fluorescent protein (EGFP) marker, and that we want
32 | to design gRNAs that knock out EGFP as experimental controls.
33 | Such control gRNAs should (1) target EGFP with high efficiency,
34 | and (2) be specific to EGFP, that is, they should not target
35 | the cell genome (the human genome in this case). Suppose also that the
36 | cell line stably expresses SpCas9.
37 |
38 | ## Loading necessary packages
39 |
40 | We start by loading the necessary packages:
41 |
42 | ```{r, message=FALSE, warning=FALSE, results='hide'}
43 | library(Biostrings)
44 | library(crisprBase)
45 | library(crisprDesign)
46 | library(crisprDesignData)
47 | library(BSgenome.Hsapiens.UCSC.hg38)
48 | ```
49 |
50 |
51 | ## Obtaining the DNA sequence
52 |
53 | In the folder `data`, we have included a FASTA file containing the DNA
54 | sequence of the EGFP marker. The sequence was obtained from the [SnapGene website](https://www.snapgene.com/resources/plasmid-files/?set=fluorescent_protein_genes_and_plasmids&plasmid=EGFP).
55 |
56 |
57 | We can read in the FASTA file using the `readDNAStringSet` function from
58 | the package `Biostrings`:
59 |
60 | ```{r}
61 | dna <- Biostrings::readDNAStringSet("data/egfp.fa")
62 | names(dna) <- "EGFP"
63 | dna
64 | ```
65 |
66 | This could also be simply constructed from a regular string:
67 |
68 | ```{r, eval=FALSE}
69 | dna <- "ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGTAA"
70 | dna <- DNAStringSet(dna)
71 | names(dna) <- "EGFP"
72 | ```
73 |
74 | (Note that `findSpacers`, used below, also accepts a simple string, which is
75 | internally converted into a `DNAStringSet`.) This is the custom sequence
76 | input that we will use to design gRNAs.
77 |
78 | ## Constructing the `GuideSet` object
79 |
80 | Next, we design all possible SpCas9 gRNAs targeting EGFP. First, we load
81 | the SpCas9 object from the `crisprBase` package:
82 |
83 | ```{r}
84 | data(SpCas9, package="crisprBase")
85 | ```
86 |
87 | and we design gRNAs using the function `findSpacers` from `crisprDesign`:
88 |
89 | ```{r}
90 | gs <- findSpacers(dna,
91 | crisprNuclease=SpCas9)
92 | head(gs)
93 | ```
94 |
95 | The resulting output is a regular `GuideSet` object, and all functionalities
96 | described in the [introductory CRISPRko tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9)
97 | can be applied here as well.
98 |
99 | There are a few key differences to note with respect to a `GuideSet` object
100 | constructed using a reference genome. First, the name of the input DNA sequence
101 | (EGFP) is used as the chromosome name stored in the `seqnames` field.
102 | Second, the `pam_site` and `cut_site` coordinates are all relative to the
103 | first nucleotide of the custom DNA sequence. Finally, the `GuideSet` object
104 | stores the input sequence, which can be accessed using the function
105 | `customSequences`:
106 |
107 | ```{r}
108 | customSequences(gs)
109 | ```
110 |
111 | ## Finding off-targets in the human genome to find gRNAs specific to EGFP
112 |
113 |
114 | Now that we have designed all possible gRNAs targeting EGFP, we will
115 | filter out gRNAs that have on- and off-targets located in the human
116 | genome. We will use the bowtie aligner to find targets, so we need to
117 | first specify the path of a bowtie index constructed on the human genome:
118 |
119 | ```{r}
120 | # Path of the hg38 bowtie index on my personal laptop:
121 | bowtie_index <- "/Users/fortinj2/crisprIndices/bowtie/hg38/hg38"
122 | ```
123 |
124 | For instructions on how to build a Bowtie index from a given reference genome,
125 | see the [genome index tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Building_Genome_Indices).
126 |
127 | To annotate off-targets with genomic context, for instance to know whether
128 | or not they are located in coding regions, we will also need a gene model
129 | object. We will use the gene model object `txdb_human`
130 | from `crisprDesignData`, which contains genomic coordinates of all
131 | human protein-coding genes. See the [crisprDesignData package](https://github.com/crisprVerse/crisprDesignData) for more details.
132 |
133 | ```{r}
134 | data(txdb_human, package="crisprDesignData")
135 | ```
136 |
137 | We are now ready to find all on- and off-targets using the
138 | `addSpacerAlignments` function from `crisprDesign`:
139 |
140 |
141 | ```{r, warning=FALSE, message=FALSE}
142 | gs <- addSpacerAlignments(gs,
143 | aligner="bowtie",
144 | aligner_index=bowtie_index,
145 | bsgenome=BSgenome.Hsapiens.UCSC.hg38,
146 | n_mismatches=3,
147 | txObject=txdb_human)
148 | gs
149 | ```
150 |
151 |
152 |
153 | ## Predicting on-target activity
154 |
155 | We also want to make sure to filter out gRNAs that are predicted to have
156 | poor on-target activity. To do so, we annotate gRNAs with the DeepHF
157 | on-target activity score:
158 |
159 | ```{r}
160 | gs <- addOnTargetScores(gs, methods="deephf")
161 | ```
162 |
163 |
164 |
165 | Finally, we characterize the spacer sequences using the `addSequenceFeatures`
166 | function from `crisprDesign`:
167 |
168 | ```{r}
169 | gs <- addSequenceFeatures(gs)
170 | ```
171 |
172 | ## Final selection
173 |
174 | For our use case, we will only retain gRNAs that do not map to the human
175 | genome (`n0=0`), don't have any 1- or 2-mismatch off-targets (`n1=0`
176 | and `n2=0`), and do not have 3-mismatch off-targets located
177 | in coding regions (`n3_c=0`):
178 |
179 | ```{r}
180 | gs <- gs[gs$n0==0 & gs$n1==0 & gs$n2==0 & gs$n3_c==0]
181 | ```
182 |
183 | We also remove gRNAs that contain polyT sequences
184 |
185 | ```{r}
186 | gs <- gs[!gs$polyT,]
187 | ```
188 |
189 | and only keep gRNAs that don't have extreme GC content:
190 |
191 | ```{r}
192 | gs <- gs[gs$percentGC>=20 & gs$percentGC<=80]
193 | ```
194 |
195 | Finally, we rank gRNAs from the highest to the lowest on-target activity score:
196 |
197 | ```{r}
198 | gs <- gs[order(-gs$score_deephf)]
199 | head(gs)
200 | ```
201 |
202 | Users can select the top gRNAs as their control gRNAs.
203 |
204 | # Session Info
205 |
206 | ```{r}
207 | sessionInfo()
208 | ```
209 |
210 | # References
211 |
212 |
213 |
--------------------------------------------------------------------------------
/Design_CRISPRkd_CasRx/README.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Using crisprDesign to design gRNAs for CRISPRkd with CasRx"
3 | author: Jean-Philippe Fortin, Luke Hoberecht
4 | output:
5 | github_document:
6 | toc: true
7 | bibliography: references.bib
8 | ---
9 |
10 | ```{r, echo=FALSE, results="hide"}
11 | options("knitr.graphics.auto_pdf"=TRUE)
12 | ```
13 |
14 | # Introduction
15 |
16 | In this tutorial, we will design gRNAs for the RNA-targeting nuclease
17 | CasRx (RfxCas13d) [@cas13d]. We will design all gRNAs targeting the primary
18 | isoform of the human gene KRAS.
19 |
20 | # Installation
21 |
22 | See the [Installation tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Installation) to learn how to install the packages necessary for this tutorial:
23 | `crisprDesign`, `crisprDesignData`
24 |
25 |
26 | # Terminology
27 |
28 | See the [CRISPRko design vignette](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) to get familiar with the terminology used throughout
29 | this tutorial.
30 |
31 |
32 | # End-to-end gRNA design workflow
33 |
34 | We start by loading the crisprVerse packages needed for
35 | this tutorial:
36 |
37 | ```{r, message=FALSE, warning=FALSE, results='hide'}
38 | library(crisprBase)
39 | library(crisprDesign)
40 | library(crisprDesignData)
41 | ```
42 |
43 | We will also load the `BSgenome` package containing DNA sequences for the hg38
44 | genome:
45 |
46 | ```{r, message=FALSE, warning=FALSE, results='hide'}
47 | library(BSgenome.Hsapiens.UCSC.hg38)
48 | ```
49 |
50 |
51 | ## Creating the GuideSet
52 |
53 | We begin by loading the CasRx `CrisprNuclease` object from the `crisprBase`
54 | package:
55 |
56 | ```{r}
57 | data(CasRx, package="crisprBase")
58 | CasRx
59 | ```
60 |
61 | The PFS sequence (the equivalent of a PAM sequence for RNA-targeting nucleases)
62 | for CasRx is `N`, meaning that no specific PFS sequence is preferred by CasRx.
63 |
64 | Next, we will extract the mRNA sequence for the KRAS transcript ENST00000311936
65 | with the function `getMrnaSequences` from `crisprDesign`. The function
66 | requires a gene annotation object. We will load the Ensembl model from
67 | the `crisprDesignData` package stored in the `GRangesList` object `txdb_human`:
68 |
69 | ```{r}
70 | data("txdb_human", package="crisprDesignData")
71 | ```
72 |
73 | For more information on `txdb_human` and how to create similar gene
74 | annotation objects, see the [Building a gene annotation object](https://github.com/crisprVerse/Tutorials/tree/master/Building_Gene_Annotation) tutorial.
75 |
76 | We also need a `BSgenome` object containing the DNA sequences:
77 |
78 | ```{r}
79 | bsgenome <- BSgenome.Hsapiens.UCSC.hg38
80 | ```
81 |
82 | We are now ready to obtain our mRNA sequence:
83 |
84 | ```{r}
85 | txid <- "ENST00000311936"
86 | mrna <- getMrnaSequences(txids=txid,
87 | bsgenome=bsgenome,
88 | txObject=txdb_human)
89 | mrna
90 | ```
91 |
92 | Similar to the CRISPRko gRNA design, we use the function `findSpacers`
93 | to design our gRNAs:
94 |
95 | ```{r, warning=FALSE}
96 | gs <- findSpacers(mrna,
97 | crisprNuclease=CasRx)
98 | head(gs)
99 | ```
100 |
101 | Note that all protospacer sequences are located on the original strand of
102 | the mRNA sequence. For RNA-targeting nucleases, the spacer and protospacer
103 | sequences are the reverse complement of each other.
104 | (Compare the output of the code below with a `GuideSet` that
105 | uses a DNA-targeting nuclease; for such `GuideSet` objects, the outputs
106 | of `spacers` and `protospacers` are identical.)
107 |
108 | ```{r}
109 | head(spacers(gs))
110 | head(protospacers(gs))
111 | ```
112 |
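The reverse-complement relationship can be checked by hand. The sketch below uses a small base-R helper, `revcomp`, which is hypothetical and written only for this illustration; `Biostrings::reverseComplement` provides the same operation for `DNAString` and `DNAStringSet` objects. The example sequence is arbitrary.

```r
# Hypothetical base-R helper: reverse complement of a DNA string
# containing only A, C, G and T.
revcomp <- function(x) {
    bases <- rev(strsplit(x, "")[[1]])
    chartr("ACGT", "TGCA", paste(bases, collapse=""))
}

protospacer <- "ATGACTGAATATAAACTTGTGGT"  # arbitrary example sequence
spacer <- revcomp(protospacer)

# Applying the operation twice recovers the original sequence.
identical(revcomp(spacer), protospacer)
```
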
113 |
114 | ## Annotating the GuideSet
115 |
116 | Next, we annotate our candidate gRNAs to assess quality. There are several
117 | functions in `crisprDesign` that provide annotation for features that are
118 | not specific to CRISPRkd, for which we refer the reader to the
119 | [CRISPRko design with Cas9](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) tutorial for more information. The sections below will
120 | cover annotation functions that are of particular interest to,
121 | or deserve extra care for, CRISPRkd applications.
122 |
123 | ### Adding spacer alignments
124 |
125 | Since our CRISPR nuclease targets RNA rather than DNA, off-target
126 | searches should be restricted to the transcriptome. We can perform
127 | such a search using one of two methods.
128 |
129 | #### Adding spacer alignments with Biostrings
130 |
131 | For the first method, we set the `aligner` argument to `"biostrings"`
132 | and pass a `DNAStringSet` representation of the transcriptome to the
133 | argument `custom_seq`. We can create this representation with
134 | `getMrnaSequences` and all transcript IDs found in `txdb_human`.
135 | The code below uses this method to search for off-targets having up to
136 | one mismatch and passes `txdb_human` to the `txObject` argument so
137 | that the alignments will be accompanied by gene annotation.
138 |
139 | ```{r, eval=FALSE}
140 | tx_ids <- unique(txdb_human$exons$tx_id)
141 | mrnasHuman <- getMrnaSequences(tx_ids,
142 | bsgenome=BSgenome.Hsapiens.UCSC.hg38,
143 | txObject=txdb_human)
144 | ## long run time
145 | results <- addSpacerAlignments(gs,
146 | aligner="biostrings",
147 | txObject=txdb_human,
148 | n_mismatches=1,
149 | custom_seq=mrnasHuman)
150 | ```
151 |
152 | NOTE: since `mrnasHuman` contains many sequences (>100k), this method
153 | has a very long run time; for transcriptome-wide searches,
154 | or for searches against a large number of sequences, we recommend the
155 | following method instead.
156 |
157 |
158 | #### Adding spacer alignments with bowtie or BWA
159 |
160 | The second method uses the `bowtie` (or `bwa`) aligner. This requires
161 | building a transcriptome bowtie (or BWA) index file first. See the [Building genome indices for short read aligners](https://github.com/crisprVerse/Tutorials/tree/master/Building_Genome_Indices) tutorial for more information.
162 |
163 | Here we set `aligner` to `"bowtie"` and pass a precomputed
164 | transcriptome bowtie index to `aligner_index` to find off-targets:
165 |
166 | ```{r, warning=FALSE, message=FALSE, results='hide'}
167 | bowtie_index <- "/Users/fortinj2/crisprIndices/bowtie/ensembl_human_104/ensembl_human_104"
168 | results <- addSpacerAlignments(gs,
169 | aligner="bowtie",
170 | aligner_index=bowtie_index,
171 | txObject=txdb_human,
172 | n_mismatches=1)
173 | ```
174 | ```{r}
175 | head(results)
176 | ```
177 |
178 | The columns `n0_gene` and `n0_tx` report the number of on-targets
179 | at the gene- and transcript-level, respectively. For instance,
180 | each spacer shown above has `n0_gene` equal to 1 and `n0_tx`
181 | equal to 4, meaning each spacer maps to all four isoforms of KRAS.
182 | We can retrieve information about each alignment with the `onTargets`
183 | function. Looking at the on-targets for the first spacer we can see
184 | where the target `pam_site` is relative to the start of the transcript
185 | with respect to each isoform of KRAS.
186 |
187 | ```{r}
188 | onTargets(results["spacer_1"])
189 | ```
190 |
191 | Note that each annotated alignment is specific to the transcript
192 | ID given under `seqnames`.
193 |
194 | Below is a spacer that targets (with no mismatches) multiple genes:
195 |
196 | ```{r}
197 | results["spacer_244"]
198 | ```
199 |
200 | Upon further inspection of this spacer's alignments,
201 | however, we can see that the off-target occurs in the pseudogene KRASP1,
202 | and should be harmless.
203 |
204 | ```{r}
205 | onTargets(results["spacer_244"])
206 | ```
207 |
208 |
209 | ## On-target scoring (gRNA efficiency)
210 |
211 | Finally, we add an on-target activity score using the CasRx-RF
212 | method [@casrxrf] with the `addOnTargetScores` function from the `crisprDesign`
213 | package:
214 |
215 | ```{r, eval=TRUE, warning=FALSE, message=FALSE}
216 | gs <- addOnTargetScores(gs, methods=c("casrxrf"))
217 | gs
218 | ```
219 |
220 |
221 |
222 | # Session Info
223 |
224 | ```{r}
225 | sessionInfo()
226 | ```
227 |
228 |
229 | # References
230 |
231 |
--------------------------------------------------------------------------------
/Design_CRISPRkd_Csm/README.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Using crisprDesign to design gRNAs for the CRISPR-Csm complex"
3 | author: Jean-Philippe Fortin, Luke Hoberecht
4 | output:
5 | github_document:
6 | toc: true
7 | bibliography: references.bib
8 | ---
9 |
10 | ```{r, echo=FALSE, results="hide"}
11 | options("knitr.graphics.auto_pdf"=TRUE)
12 | ```
13 |
14 | # Introduction
15 |
16 | The CRISPR-Csm complex is a programmable RNA-targeting system that does
17 | not induce indiscriminate trans-cleavage activity, which is
18 | an important advantage in comparison to the CRISPR-Cas13 family of RNA-targeting
19 | nucleases [@csm1]. It has recently been shown that it can be used to
20 | perform effective single-molecule live-cell RNA imaging [@csm2].
21 |
22 | In this tutorial, we will design gRNAs for the CRISPR-Csm system for the
23 | primary isoform of the human gene KRAS.
24 |
25 | # Installation
26 |
27 | See the [Installation tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Installation) to learn how to install the packages necessary for this tutorial:
28 | `crisprDesign`, `crisprDesignData`
29 |
30 |
31 | # Terminology
32 |
33 | See the [CRISPRko design vignette](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) to get familiar with the terminology used throughout
34 | this tutorial.
35 |
36 |
37 | # End-to-end gRNA design workflow
38 |
39 | We start by loading the crisprVerse packages needed for
40 | this tutorial:
41 |
42 | ```{r, message=FALSE, warning=FALSE, results='hide'}
43 | library(crisprBase)
44 | library(crisprDesign)
45 | library(crisprDesignData)
46 | ```
47 |
48 | We will also load the `BSgenome` package containing DNA sequences for the hg38
49 | genome:
50 |
51 | ```{r, message=FALSE, warning=FALSE, results='hide'}
52 | library(BSgenome.Hsapiens.UCSC.hg38)
53 | ```
54 |
55 |
56 | ## Creating the GuideSet
57 |
58 | We begin by loading the Csm `CrisprNuclease` object from the `crisprBase`
59 | package:
60 |
61 | ```{r}
62 | data(Csm, package="crisprBase")
63 | Csm
64 | ```
65 |
66 | The PFS sequence (the equivalent of a PAM sequence for RNA-targeting nucleases)
67 | for Csm is `N`, meaning that no specific PFS sequence is preferred by Csm.
68 | The default spacer length of the Csm nuclease is 32nt. This can be changed
69 | using `spacerLength` (for instance, `spacerLength(Csm) <- 36`).
70 |
71 | Next, we will extract the mRNA sequence for the KRAS transcript ENST00000311936
72 | with the function `getMrnaSequences` from `crisprDesign`. The function
73 | requires a gene annotation object. We will load the Ensembl model from
74 | the `crisprDesignData` package stored in the `GRangesList` object `txdb_human`:
75 |
76 | ```{r}
77 | data("txdb_human", package="crisprDesignData")
78 | ```
79 |
80 | For more information on `txdb_human` and how to create similar gene
81 | annotation objects, see the [Building a gene annotation object](https://github.com/crisprVerse/Tutorials/tree/master/Building_Gene_Annotation) tutorial.
82 |
83 | We also need a `BSgenome` object containing the DNA sequences:
84 |
85 | ```{r}
86 | bsgenome <- BSgenome.Hsapiens.UCSC.hg38
87 | ```
88 |
89 | We are now ready to obtain our mRNA sequence:
90 |
91 | ```{r}
92 | txid <- "ENST00000311936"
93 | mrna <- getMrnaSequences(txids=txid,
94 | bsgenome=bsgenome,
95 | txObject=txdb_human)
96 | mrna
97 | ```
98 |
99 | Similar to the CRISPRko gRNA design, we use the function `findSpacers`
100 | to design our gRNAs:
101 |
102 | ```{r, warning=FALSE}
103 | gs <- findSpacers(mrna,
104 | crisprNuclease=Csm)
105 | head(gs)
106 | ```
107 |
108 | Note that all protospacer sequences are located on the original strand of
109 | the mRNA sequence. For RNA-targeting nucleases, the spacer and protospacer
110 | sequences are the reverse complement of each other.
111 | (Compare the output of the code below with a `GuideSet` that
112 | uses a DNA-targeting nuclease; for such `GuideSet` objects, the outputs
113 | of `spacers` and `protospacers` are identical.)
114 |
115 | ```{r}
116 | head(spacers(gs))
117 | head(protospacers(gs))
118 | ```
119 |
120 |
121 | ## Annotating the GuideSet
122 |
123 | Next, we annotate our candidate gRNAs to assess quality. There are several
124 | functions in `crisprDesign` that provide annotation for features that are
125 | not specific to CRISPRkd, for which we refer the reader to the
126 | [CRISPRko design with Cas9](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) tutorial for more information. The sections below will
127 | cover annotation functions that are of particular interest to,
128 | or deserve extra care for, CRISPRkd applications.
129 |
130 | ### Adding spacer alignments
131 |
132 | Since our CRISPR nuclease targets RNA rather than DNA, off-target
133 | searches should be restricted to the transcriptome. We can perform
134 | such a search using one of two methods.
135 |
136 | #### Adding spacer alignments with Biostrings
137 |
138 | For the first method, we set the `aligner` argument to `"biostrings"`
139 | and pass a `DNAStringSet` representation of the transcriptome to the
140 | argument `custom_seq`. We can create this representation with
141 | `getMrnaSequences` and all transcript IDs found in `txdb_human`.
142 | The code below uses this method to search for off-targets having up to
143 | one mismatch and passes `txdb_human` to the `txObject` argument so
144 | that the alignments will be accompanied by gene annotation.
145 |
146 | ```{r, eval=FALSE}
147 | tx_ids <- unique(txdb_human$exons$tx_id)
148 | mrnasHuman <- getMrnaSequences(tx_ids,
149 | bsgenome=BSgenome.Hsapiens.UCSC.hg38,
150 | txObject=txdb_human)
151 | ## long run time
152 | results <- addSpacerAlignments(gs,
153 | aligner="biostrings",
154 | txObject=txdb_human,
155 | n_mismatches=1,
156 | custom_seq=mrnasHuman)
157 | ```
158 |
159 | NOTE: since `mrnasHuman` contains many sequences (>100k), this method
160 | has a very long run time; for transcriptome-wide searches,
161 | or for searches against a large number of sequences, we recommend the
162 | following method instead.
163 |
164 |
165 | #### Adding spacer alignments with bowtie or BWA
166 |
167 | The second method uses the `bowtie` (or `bwa`) aligner. This requires
168 | building a transcriptome bowtie (or BWA) index file first. See the [Building genome indices for short read aligners](https://github.com/crisprVerse/Tutorials/tree/master/Building_Genome_Indices) tutorial for more information.
169 |
170 | Here we set `aligner` to `"bowtie"` and pass a precomputed
171 | transcriptome bowtie index to `aligner_index` to find off-targets:
172 |
173 | ```{r, warning=FALSE, message=FALSE, results='hide'}
174 | bowtie_index <- "/Users/fortinj2/crisprIndices/bowtie/ensembl_human_104/ensembl_human_104"
175 | results <- addSpacerAlignments(gs,
176 | aligner="bowtie",
177 | aligner_index=bowtie_index,
178 | txObject=txdb_human,
179 | n_mismatches=1)
180 | ```
181 | ```{r}
182 | head(results)
183 | ```
184 |
185 | The columns `n0_gene` and `n0_tx` report the number of on-targets
186 | at the gene- and transcript-level, respectively. For instance,
187 | each spacer shown above shows `n0_gene` equal to 1 and `n0_tx`
188 | equal to 4, meaning each spacer maps to all four isoforms of KRAS.
189 | We can retrieve information about each alignment with the `onTargets`
190 | function. Looking at the on-targets for the first spacer, we can see
191 | where the target `pam_site` is relative to the start of the transcript
192 | with respect to each isoform of KRAS.
193 |
194 | ```{r}
195 | onTargets(results["spacer_1"])
196 | ```
197 |
198 | Note that each annotated alignment is specific to the transcript
199 | ID given under `seqnames`.
200 |
201 | Below is a spacer that targets (with no mismatches) multiple genes:
202 |
203 | ```{r}
204 | results["spacer_244"]
205 | ```
206 |
207 | Upon further inspection of this spacer's alignments,
208 | however, we can see that the off-target occurs in the pseudogene KRASP1,
209 | and should be harmless.
210 |
211 | ```{r}
212 | onTargets(results["spacer_244"])
213 | ```
214 |
215 |
216 |
217 | # Session Info
218 |
219 | ```{r}
220 | sessionInfo()
221 | ```
222 |
223 |
224 | # References
225 |
226 |
--------------------------------------------------------------------------------
/Building_Database_Human_Cas9/script.R:
--------------------------------------------------------------------------------
1 | # This is the script used in the tutorial:
2 |
3 | #0. required packages
4 | library(crisprDatabase)
5 | library(crisprBase)
6 | library(crisprDesign)
7 | library(crisprDesignData)
8 | library(crisprDesignFacilitator)
9 | overwrite <- FALSE
10 |
11 |
12 | #i=1
13 | #species <- "human"
14 | #species <- "mouse"
15 | #modality <- "crispra"
16 | #modality <- "crisprkd"
17 | #modality <- "crisprko"
18 | #nuclease <- "SpCas9"
19 | #nuclease <- "enAsCas12a"
20 | #nuclease <- "CasRx"
21 | #version <- "v6"
22 |
23 |
24 | #1. Tech options
25 | i <- as.numeric(commandArgs(TRUE)[1])
26 | species <- as.character(commandArgs(TRUE)[2])
27 | modality <- as.character(commandArgs(TRUE)[3])
28 | nuclease <- as.character(commandArgs(TRUE)[4])
29 | version <- "v6"
30 |
31 |
32 |
33 | #2. gRNA design options:
34 | scoring_methods_cas9 <- c("ruleset1", "azimuth",
35 | "deephf", "crisprscan", "crisprai",
36 | "crisprater", "deepspcas9", "ruleset3")
37 | scoring_methods_cas12a <- c("deepcpf1", "enpamgb")
38 | scoring_methods_cas13d <- "casrxrf"
39 | scoring_methods <- c(scoring_methods_cas9,
40 | scoring_methods_cas12a,
41 | scoring_methods_cas13d)
42 | n_mismatches <- 3
43 | max_mm <- 2
44 | canonical_offtarget <- FALSE
45 | if (nuclease=="SpCas9"){
46 | canonical_ontarget=TRUE
47 | } else if (nuclease=="enAsCas12a"){
48 | canonical_ontarget=FALSE
49 | } else if (nuclease=="CasRx"){
50 | canonical_ontarget=FALSE
51 | }
52 |
53 | ### TSS WINDOW
54 | if (modality=="crisprko" || modality=="crisprkd"){
55 | tss_window <- NULL
56 | } else if (modality=="crispra"){
57 | tss_window <- c(-500,0)
58 | } else if (modality=="crispri"){
59 | tss_window <- c(0, 500)
60 | }
61 |
62 |
63 | ### Getting CRISPR nuclease object:
64 | data(SpCas9, package="crisprBase")
65 | data(enAsCas12a, package="crisprBase")
66 | data(CasRx, package="crisprBase")
67 | if (nuclease=="SpCas9"){
68 | crisprNuclease <- SpCas9
69 | } else if (nuclease=="enAsCas12a"){
70 | crisprNuclease <- enAsCas12a
71 | } else if (nuclease=="CasRx"){
72 | crisprNuclease <- CasRx
73 | }
74 |
75 |
76 | ### Necessary annotation files
77 | snpFile <- getSNPFile()
78 | if (nuclease=="CasRx"){
79 | bowtie_index <- getBowtieIndex(species=species, what="rna")
80 | } else {
81 | bowtie_index <- getBowtieIndex(species=species, what="dna")
82 | }
83 | bsgenome <- getGenomePackage(species=species)
84 |
85 | # Stuff for CRISPRai
86 | if (species=="human"){
87 | chromatinFiles <- getChromatinFiles()
88 | fastaFile <- getGenomeFasta()
89 | } else {
90 | chromatinFiles <- NULL
91 | fastaFile <- NULL
92 | }
93 | if (species=="human" & nuclease=="SpCas9"){
94 | useDistanceToTss <- FALSE
95 | } else {
96 | useDistanceToTss <- TRUE
97 | }
98 |
99 | # Getting binaries for CasRx
100 | if (nuclease=="CasRx"){
101 | binaries <- crisprDesignFacilitator::getCasRxRfBinaries()
102 | } else {
103 | binaries <- NULL
104 | }
105 |
106 |
107 |
108 | # SNP stuff
109 | if (species=="human"){
110 | vcf <- getSNPFile()
111 | } else {
112 | vcf <- NULL
113 | }
114 |
115 | # Conservation stuff
116 | if (modality=="crisprko"){
117 | conservationFile <- crisprDesignFacilitator::getConservationFiles(species)
118 | } else {
119 | conservationFile <- NULL
120 | }
121 |
122 |
123 |
124 | # Isoform stuff:
125 | if (modality=="crisprko"){
126 | if (species=="human"){
127 | data(canonicalHuman, package="crisprDesignData")
128 | canonicalIsoforms <- canonicalHuman
129 | } else {
130 | data(canonicalMouse, package="crisprDesignData")
131 | canonicalIsoforms <- canonicalMouse
132 | }
133 | } else {
134 | canonicalIsoforms <- NULL
135 | }
136 |
137 |
138 | if (species=="human"){
139 | data(tss_human, package="crisprDesignFacilitator")
140 | data(txdb_human, package="crisprDesignData")
141 | data(gr.repeats.hg38, package="crisprDesignData")
142 | data(pfamTableHuman, package="crisprDesignData")
143 | txObject <- txdb_human
144 | tssObject <- tss_human
145 | grRepeats <- gr.repeats.hg38
146 | pfamTable <- pfamTableHuman
147 | } else {
148 | data(tss_mouse, package="crisprDesignFacilitator")
149 | data(txdb_mouse, package="crisprDesignData")
150 | data(gr.repeats.mm10, package="crisprDesignData")
151 | data(pfamTableMouse, package="crisprDesignData")
152 | txObject <- txdb_mouse
153 | tssObject <- tss_mouse
154 | grRepeats <- gr.repeats.mm10
155 | pfamTable <- pfamTableMouse
156 | }
157 |
158 | ### Modality for crisprDesign
159 | modality2 <- gsub("crispr", "CRISPR", modality)
160 |
161 |
162 |
163 | #3. gene specification:
164 | genedir <- crisprDatabase::getGeneModelDir(version=version,
165 | species=species)
166 | if (modality=="crisprko"){
167 | ids <- readRDS(file.path(genedir, "genesids.500chunks.rds"))
168 | } else if (modality=="crisprkd"){
169 | ids <- readRDS(file.path(genedir, "txids.500chunks.rds"))
170 | } else {
171 | ids <- readRDS(file.path(genedir, "tssids.500chunks.rds"))
172 | }
173 | ids <- ids[[i]]
174 |
175 |
176 |
177 | if (modality=="crisprko"){
178 | queryColumn="gene_id"
179 | } else if (modality=="crispra" | modality=="crispri"){
180 | queryColumn="ID"
181 | } else if (modality=="crisprkd"){
182 | queryColumn=NULL
183 | }
184 |
185 |
186 | #gene <- "ENSG00000133703" #KRAS
187 | #gene <- "ENSG00000130270" #With repeats
188 | extdir <- crisprDatabase::getCrisprDir(version=version,
189 | modality=modality,
190 | nuclease=nuclease,
191 | species=species)
192 | if (!dir.exists(extdir)){
193 | dir.create(extdir, recursive=TRUE)
194 | }
195 | for (k in seq_along(ids)){
196 | id <- ids[k]
197 | filename <- file.path(extdir, paste0(id, '.rds'))
198 | print(modality)
199 | print(species)
200 | print(nuclease)
201 | print(k)
202 | print(id)
203 | if (!overwrite & file.exists(filename)){
204 | cat("Overwrite mode is off, and data already generated for this gene. Skipping :) \n")
205 | } else {
206 | gs <- crisprDesign::designCompleteAnnotation(queryValue=id,
207 | queryColumn=queryColumn,
208 | modality=modality2,
209 | bsgenome=bsgenome,
210 | vcf=vcf,
211 | tssObject=tssObject,
212 | txObject=txObject,
213 | bowtie_index=bowtie_index,
214 | crisprNuclease=crisprNuclease,
215 | n_mismatches=n_mismatches,
216 | scoring_methods=scoring_methods,
217 | max_mm=max_mm,
218 | tss_window=tss_window,
219 | canonical_ontarget=canonical_ontarget,
220 | canonical_offtarget=canonical_offtarget,
221 | grRepeats=grRepeats,
222 | fastaFile=fastaFile,
223 | chromatinFiles=chromatinFiles,
224 | conservationFile=conservationFile,
225 | geneCol="gene_symbol",
226 | canonicalIsoforms=canonicalIsoforms,
227 | binaries=binaries,
228 | pfamTable=pfamTable)
229 | if (modality=="crisprko"){
230 | txid <- canonicalIsoforms$tx_id[match(id, canonicalIsoforms$gene_id)]
231 | } else {
232 | txid <- NULL
233 | }
234 | if (is(gs, "GuideSet")){
235 | gs <- rankSpacers(gs,
236 | modality=modality2,
237 | commonExon=TRUE,
238 | tx_id=txid,
239 | useDistanceToTss=useDistanceToTss)
240 | }
241 | saveRDS(gs, file=filename)
242 | }
243 | }
244 | q("no")
245 |
246 |
247 |
248 |
249 |
250 |
251 |
252 |
253 |
254 |
--------------------------------------------------------------------------------
/Design_Minor_Major_Allele/README.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Using crisprDesign to design gRNAs with minor and major alleles"
3 | author: Jean-Philippe Fortin, Luke Hoberecht
4 | output:
5 | github_document:
6 | toc: true
7 | ---
8 |
9 | ```{r, echo=FALSE, results="hide"}
10 | options("knitr.graphics.auto_pdf"=TRUE)
11 | ```
12 |
13 |
14 | # Introduction
15 |
16 | Genetic variants such as single nucleotide polymorphisms (SNPs) can
17 | be problematic in guide RNA (gRNA) design, as different alleles can
18 | result in unintended gRNA:DNA mismatches for on-targets that reduce
19 | gRNA efficacy. To circumvent this, it is advisable to generally avoid
20 | targeting sequences that contain variants. However, this may not always
21 | be possible, due to a small target window and/or few target options,
22 | or desirable, if, for example, a CRISPR application intends to target
23 | a pathogenic variant.
24 |
25 | Functions in `crisprDesign` are well equipped to handle these cases.
26 | gRNAs overlapping SNPs can be identified with the `addSNPAnnotation`
27 | function, as documented in the [CRISPRko design with Cas9](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) tutorial. Should the user wish to target a region despite (or because of)
28 | the presence of variants, the user only needs to take care in the choice
29 | of `BSgenome` when constructing the `GuideSet` (an alternative option is
30 | to supply a custom target sequence; see the [Working with a custom sequence](https://github.com/crisprVerse/Tutorials/tree/master/Design_Custom_Sequence) tutorial for more information).
31 |
32 | This tutorial covers use cases for `BSgenome` objects that store variants
33 | of the reference human genome (hg38) injected with major and minor alleles.
34 | It assumes the reader is familiar with constructing and using gene
35 | annotation objects (see the [Building a gene annotation object](https://github.com/crisprVerse/Tutorials/tree/master/Building_Gene_Annotation) tutorial) and `GuideSet` objects (see the [CRISPRko design with Cas9](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) tutorial) so that the content may focus on the utility of
36 | the `BSgenome` variants discussed herein. Please consult the
37 | applicable tutorials if necessary.
38 |
39 | Finally, the user should be familiar with the sequence(s) for which
40 | gRNAs are being designed, including any known variation in those
41 | sequences.
42 |
43 |
44 |
45 | # Installation
46 |
47 | See the [Installation tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Installation) to learn how to install the crisprVerse packages necessary
48 | for this tutorial.
49 |
50 |
51 |
52 | # Terminology
53 |
54 | See the [CRISPRko design tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) to get familiar
55 | with the terminology used throughout this tutorial.
56 |
57 | # gRNA design
58 |
59 | ## Loading packages
60 |
61 | We first load the necessary packages for this tutorial:
62 |
63 | ```{r, warning=FALSE, message=FALSE, results='hide'}
64 | library(crisprBase)
65 | library(crisprDesign)
66 | library(crisprDesignData)
67 | library(BSgenome.Hsapiens.UCSC.hg38)
68 | library(BSgenome.Hsapiens.UCSC.hg38.dbSNP151.major)
69 | library(BSgenome.Hsapiens.UCSC.hg38.dbSNP151.minor)
70 | ```
71 |
72 | We will also load the gene annotation model `txdb_human` from the
73 | `crisprDesignData` package:
74 |
75 | ```{r}
76 | data(txdb_human, package="crisprDesignData")
77 | ```
78 |
79 |
80 | The `BSgenome.Hsapiens.UCSC.hg38.dbSNP151.major` and
81 | `BSgenome.Hsapiens.UCSC.hg38.dbSNP151.minor` packages are `BSgenome`
82 | packages that contain the major and minor alleles of the human reference
83 | genome hg38 based on dbSNP151. For more information, type the following:
84 |
85 |
86 | ```{r, eval=FALSE}
87 | help(BSgenome.Hsapiens.UCSC.hg38.dbSNP151.major)
88 | help(BSgenome.Hsapiens.UCSC.hg38.dbSNP151.minor)
89 | ```
90 |
91 | ## Designing gRNAs for human (hg38) with injected major alleles
92 |
93 | It is worth noting that the human reference genome sequence (GRCh38.p12)
94 | does not give the major allele
95 | (i.e. most common allele in a population) at all nucleotide locations.
96 | Indeed, given that it was historically constructed from a small set of
97 | human genomes, it contains minor alleles that were common across this
98 | set of human genomes.
99 |
100 | For example, in the coding region (CDS) of the human gene SMC3,
101 | the reference genome contains the minor allele of the SNP rs2419565; the
102 | reference allele (A) frequency is 0.00577, and the alternative allele (G)
103 | frequency is 0.99423, as indicated in the table [here](https://www.ncbi.nlm.nih.gov/snp/rs2419565).
104 |
105 |
106 | Designing gRNAs targeting the CDS of SMC3 with the SpCas9 nuclease returns
107 | one gRNA that overlaps this SNP. Below, we first construct a
108 | `GuideSet` object using the reference `BSgenome` object:
109 |
110 |
111 | ```{r, collapse=TRUE, results='markup'}
112 | smc3 <- queryTxObject(txdb_human,
113 | featureType="cds",
114 | queryColumn="gene_symbol",
115 | queryValue="SMC3")
116 | gs_reference <- findSpacers(smc3,
117 | crisprNuclease=SpCas9,
118 | bsgenome=BSgenome.Hsapiens.UCSC.hg38)
119 | ```
120 |
121 | and a `GuideSet` object with the `BSgenome` object that contains
122 | the major alleles:
123 |
124 | ```{r, collapse=TRUE, results='markup'}
125 | gs_major <- findSpacers(smc3,
126 | crisprNuclease=SpCas9,
127 | bsgenome=BSgenome.Hsapiens.UCSC.hg38.dbSNP151.major)
128 | ```
129 |
130 | Let's compare the protospacer sequences from both objects:
131 |
132 | ```{r, collapse=TRUE, results='markup'}
133 |
134 | protospacers(gs_reference["spacer_199"])
135 | protospacers(gs_major["spacer_199"])
136 | ```
137 |
138 | The variant occurs in the seed sequence of this gRNA, 5 bases upstream of
139 | the `pam_site`, so a gRNA:DNA mismatch at this location is likely detrimental
140 | to its efficacy. Also, as this major allele occurs at >99% frequency, it
141 | may be more beneficial to design gRNAs in this example using the major allele
142 | genome contained in `BSgenome.Hsapiens.UCSC.hg38.dbSNP151.major`.
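
To make the position of a mismatch relative to the PAM concrete, the following base-R sketch compares two protospacers and reports how far each mismatch lies from the PAM. The sequences and the helper name `mismatchDistanceFromPam` are made up for illustration; they are not the SMC3 protospacers and not part of `crisprDesign`. The sketch assumes a Cas9-like nuclease whose PAM sits immediately 3' of the protospacer.

```r
# Hypothetical helper (not part of crisprDesign): distance, in bases, between
# each mismatch and the PAM for two equal-length protospacers.
mismatchDistanceFromPam <- function(ref, alt){
    stopifnot(nchar(ref) == nchar(alt))
    refBases <- strsplit(ref, "")[[1]]
    altBases <- strsplit(alt, "")[[1]]
    positions <- which(refBases != altBases)
    # The last base of the protospacer is 1 base upstream of the PAM
    length(refBases) - positions + 1
}

# Toy 20-nt sequences that differ at position 16 only
ref <- "ACGTACGTACGTACGAACGT"
alt <- "ACGTACGTACGTACGGACGT"
mismatchDistanceFromPam(ref, alt)
```

For these toy sequences the single mismatch sits 5 bases upstream of the PAM, the same relative position as the variant discussed above.
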
143 |
144 |
145 | ## Designing gRNAs for human (hg38) with injected minor alleles
146 |
147 | It may be desirable, in some applications, to target a genic sequence
148 | that contains a minor allele (i.e. less common allele) rather than the
149 | major or reference allele. For example, if a particular minor allele is
150 | pathogenic and the host cell has a single copy of that allele, the user
151 | may want to target that pathogenic variant and disrupt its behavior while
152 | leaving the other copy undisturbed.
153 |
154 | As an example, using `BSgenome.Hsapiens.UCSC.hg38.dbSNP151.minor`, we can
155 | target the pathogenic minor allele ([rs398122995](https://www.ncbi.nlm.nih.gov/clinvar/variation/92240/?oq=rs398122995&m=NM_001378454.1(ALMS1):c.1897C%3ET%20(p.Gln633Ter))) located in the human gene
156 | ALMS1:
157 |
158 |
159 |
160 | ```{r}
161 | alms1 <- queryTxObject(txdb_human, 'cds', 'gene_symbol', 'ALMS1')
162 | gs_minor <- findSpacers(alms1,
163 | crisprNuclease=SpCas9,
164 | bsgenome=BSgenome.Hsapiens.UCSC.hg38.dbSNP151.minor)
165 | gs_minor <- unique(gs_minor)
166 | ```
167 |
168 | We also include, for comparison, the resulting `GuideSet` using the
169 | reference genome sequence:
170 |
171 | ```{r}
172 | gs_reference <- findSpacers(alms1,
173 | crisprNuclease=SpCas9,
174 | bsgenome=BSgenome.Hsapiens.UCSC.hg38)
175 | gs_reference <- unique(gs_reference)
176 | ```
177 |
178 | and compare the two versions of the gRNA:
179 |
180 | ```{r}
181 | gs_reference["spacer_615"]
182 | gs_minor["spacer_612"]
183 | ```
184 |
185 | The variant occurs 1 base upstream of the `pam_site` and likely influences
186 | gRNA activity. That is, we can design a gRNA that targets the minor allele
187 | and has a much lower affinity for the reference, or major, allele.
188 |
189 | Note that while the two `GuideSet` objects differ only by their
190 | `BSgenome` object, we need to provide different indices to access
191 | protospacers at equivalent `pam_site`s. This is due to variants in
192 | one `BSgenome` (in this case the one with minor alleles) eliminating
193 | PAM sequences, that is, one of the Gs in NGG is changed to another base
194 | such that SpCas9 does not recognize it. This, where permissible, is also
195 | an effective way of ensuring gRNAs only target a specific sequence if
196 | that sequence contains the desired variant.
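
The index shift described above can be illustrated with a small base-R sketch. The two sequences and the helper `findPamSites` are hypothetical and serve only to show the effect; the sketch scans the forward strand of a toy sequence for NGG and does not reproduce how `findSpacers` works internally.

```r
# Start positions of NGG PAMs on the forward strand of a sequence.
# The lookahead lets overlapping PAMs (e.g. in "AGGG") all be reported.
# Returns -1 if no PAM is found.
findPamSites <- function(seq){
    as.integer(gregexpr("(?=[ACGT]GG)", seq, perl=TRUE)[[1]])
}

# Toy reference sequence with three NGG PAMs
reference <- "ATCAGGTTACGTTGGCAATCGGA"
# Same toy sequence with one G of the second PAM changed to A
variant   <- "ATCAGGTTACGTTGACAATCGGA"

findPamSites(reference)
findPamSites(variant)
```

In this toy case the variant removes the second PAM, so the PAM that is third in the reference becomes second in the variant. The same mechanism explains why equivalent protospacers can carry different spacer indices in the two `GuideSet` objects.
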
197 |
198 |
199 | # Session Info
200 |
201 | ```{r}
202 | sessionInfo()
203 | ```
204 |
205 |
206 |
207 |
--------------------------------------------------------------------------------
/Design_Cross_Reactivity/README.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Using crisprDesign to design gRNAs that map across species"
3 | author: Jean-Philippe Fortin, Luke Hoberecht
4 | output:
5 | github_document:
6 | toc: true
7 | ---
8 |
9 | ```{r, echo=FALSE, results="hide"}
10 | options("knitr.graphics.auto_pdf"=TRUE)
11 | ```
12 |
13 |
14 |
15 | # Introduction
16 |
17 | This tutorial describes how to design guide RNAs (gRNAs) that target
18 | homologous genes across multiple species using functions from
19 | the [crisprDesign](https://github.com/crisprVerse/crisprDesign) package.
20 | This strategy can be applied to any two (or more) species for which the
21 | genome sequence and gene model annotation are available.
22 |
23 |
24 | # Installation
25 |
26 | See the [Installation tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Installation) to learn how to install the packages necessary for this tutorial:
27 | `crisprDesign`, `crisprDesignData`
28 |
29 |
30 | # Terminology
31 |
32 | See the [CRISPRko design tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) to get familiar
33 | with the terminology used throughout this tutorial.
34 |
35 |
36 | # Mapping gRNAs across species
37 |
38 | ## Loading packages
39 |
40 | We first load the necessary packages for this tutorial:
41 |
42 | ```{r, warning=FALSE, message=FALSE, results='hide'}
43 | library(crisprBase)
44 | library(crisprDesign)
45 | library(crisprDesignData)
46 | library(BSgenome.Hsapiens.UCSC.hg38)
47 | library(BSgenome.Mmusculus.UCSC.mm10)
48 | ```
49 |
50 |
51 | ## Creating the GuideSet
52 |
53 | In this tutorial, we will design gRNAs using the SpCas9 nuclease that
54 | target both the human KRAS gene and its mouse ortholog Kras. There
55 | are multiple ways to go about this, which we describe in the following
56 | sections.
57 |
58 | We first create a `GuideSet` object containing gRNAs targeting the
59 | coding sequence (CDS) of human KRAS.
60 | To do so, we start by loading the SpCas9 `CrisprNuclease` object from
61 | the `crisprBase` package:
62 |
63 | ```{r}
64 | data(SpCas9, package="crisprBase")
65 | ```
66 |
67 | and then load data containing gene regions for the human genome
68 | from the `crisprDesignData` package, `txdb_human` (we will also load
69 | a similar object for the mouse genome, `txdb_mouse`):
70 |
71 | ```{r}
72 | data(txdb_human, package="crisprDesignData")
73 | data(txdb_mouse, package="crisprDesignData")
74 | ```
75 |
76 | For more information on `txdb_human` and `txdb_mouse` and how to create
77 | similar gene annotation objects, see the [Building a gene annotation object](https://github.com/crisprVerse/Tutorials/tree/master/Building_Gene_Annotation) tutorial.
78 |
79 | Next, we find the coordinates for the CDS of KRAS using the `queryTxObject` function:
80 |
81 | ```{r}
82 | kras_human <- queryTxObject(txdb_human,
83 | featureType="cds",
84 | queryColumn="gene_symbol",
85 | queryValue="KRAS")
86 | ```
87 |
88 | and build our `GuideSet` object with the `findSpacers` function:
89 |
90 | ```{r}
91 | gs_human <- findSpacers(kras_human,
92 | crisprNuclease=SpCas9,
93 | bsgenome=BSgenome.Hsapiens.UCSC.hg38)
94 | ```
95 |
96 |
97 | ## Mapping gRNAs across species via `intersect`
98 |
99 | As a first strategy to find gRNAs that target both species, we
100 | create a similar `GuideSet` targeting the mouse ortholog Kras:
101 |
102 | ```{r}
103 | kras_mouse <- queryTxObject(txdb_mouse,
104 | featureType="cds",
105 | queryColumn="gene_symbol",
106 | queryValue="Kras")
107 | gs_mouse <- findSpacers(kras_mouse,
108 | crisprNuclease=SpCas9,
109 | bsgenome=BSgenome.Mmusculus.UCSC.mm10)
110 | ```
111 |
112 | Then, we find the common spacers between the two `GuideSet` objects
113 | using `intersect`:
114 |
115 | ```{r}
116 | common_spacers <- intersect(spacers(gs_human),
117 | spacers(gs_mouse))
118 | length(common_spacers)
119 | ```
120 |
121 | There are 18 spacers that target KRAS in both species. We can filter
122 | each `GuideSet` object for this common spacer set:
123 |
124 | ```{r}
125 | results_human <- gs_human[spacers(gs_human) %in% common_spacers]
126 | results_mouse <- gs_mouse[spacers(gs_mouse) %in% common_spacers]
127 | ```
128 |
129 | Let's look at the results:
130 |
131 | ```{r}
132 | results_human
133 | results_mouse
134 | ```
135 |
136 | This simple approach, however, has some drawbacks. It requires gRNAs
137 | to have perfect sequence matching, which, while perhaps acceptable
138 | for targets having many gRNA choices, may be too restrictive for
139 | those applications that have fewer choices and may need to tolerate
140 | mismatches in the target genes. Also, and more notably, we now have
141 | multiple `GuideSet` objects to maintain in the process of selecting
142 | candidate gRNAs (see [CRISPRko design with Cas9](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9)), which is essentially twice the work.
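
To illustrate the first drawback, here is a minimal base-R sketch that contrasts exact matching with matching that tolerates one mismatch. The spacer sequences are made up and merely stand in for `spacers(gs_human)` and `spacers(gs_mouse)`; `countMismatches` is a hypothetical helper, not a `crisprDesign` function.

```r
# Toy spacer sets (made-up sequences, not real KRAS/Kras spacers)
human <- c("ACGTACGTACGTACGTACGT",
           "TTTTCCCCGGGGAAAATTTT",
           "GATTACAGATTACAGATTAC")
mouse <- c("ACGTACGTACGTACGTACGT",
           "TTTTCCCCGGGGAAAATTTA",
           strrep("C", 20))

# Number of mismatching positions between two equal-length sequences
countMismatches <- function(a, b){
    sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Exact matching, as with intersect(): only the first spacer is shared
intersect(human, mouse)

# For each human spacer, the smallest mismatch count against any mouse spacer
minMismatches <- vapply(human, function(h){
    min(vapply(mouse, countMismatches, FUN.VALUE=integer(1), b=h))
}, FUN.VALUE=integer(1))

# Allowing one mismatch also recovers the second spacer
human[minMismatches <= 1]
```

This all-against-all comparison is only practical for toy inputs; for real designs, a short read aligner does the same job at scale.
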
143 |
144 |
145 | ## Mapping gRNAs across species via `addSpacerAlignments`
146 |
147 | To avoid the drawbacks of the above strategy, we can use
148 | the `addSpacerAlignments` function on our human KRAS `GuideSet`
149 | to append alignment annotation of the **mouse** genome.
150 |
151 | For this example, we will use the bowtie aligner, and we need to specify a
152 | bowtie index for the mouse genome:
153 |
154 | ```{r}
155 | # Path of the mm10 bowtie index on my personal laptop:
156 | bowtie_index_mouse <- "/Users/fortinj2/crisprIndices/bowtie/mm10/mm10"
157 | ```
158 |
159 | For instructions on how to build a Bowtie index from a given reference genome,
160 | see the [genome index tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Building_Genome_Indices).
161 |
162 | We will also search with up to 1 mismatch and pass the gene model
163 | object `txdb_mouse` to the `txObject` argument, so the alignments will
164 | be annotated with genomic context and we can determine which of our spacers
165 | map to the CDS of Kras.
166 |
167 | As we will also want to search for off-targets in the human genome in a later
168 | step, we can ensure these results are not overwritten by setting
169 | the `colname` argument to a non-default value, such as `alignments_mouse`.
170 |
171 | ```{r}
172 | results_human <- addSpacerAlignments(gs_human,
173 | aligner="bowtie",
174 | aligner_index=bowtie_index_mouse,
175 | bsgenome=BSgenome.Mmusculus.UCSC.mm10,
176 | txObject=txdb_mouse,
177 | colname="alignments_mouse",
178 | n_mismatches=1)
179 | results_human
180 | ```
181 |
182 | Our results are stored in the `alignments_mouse` column. We can access
183 | these alignments with the `alignments` function and by specifying
184 | the `columnName`:
185 |
186 | ```{r}
187 | alignments(results_human, columnName="alignments_mouse")
188 | ```
189 |
190 | With these data, we can filter our gRNAs for those that target
191 | both orthologs (and we have off-target annotation for the mouse genome).
192 |
193 | ```{r}
194 | aln <- alignments(results_human, columnName="alignments_mouse")
195 | cds_targets <- aln$cds
196 | aln <- aln[!is.na(cds_targets) & cds_targets == "Kras"]
197 | targets_Kras <- unique(names(aln))
198 | results_human <- results_human[targets_Kras]
199 | ```
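
The filter above checks for missing values before comparing gene symbols. The short base-R sketch below shows why, using a toy `data.frame` as a stand-in for the alignment object. The column names mirror those used above, but the content is invented, and the sketch assumes that `cds` is `NA` for alignments outside any coding sequence.

```r
# Toy stand-in for the alignment table: `cds` is NA when an alignment
# does not fall in any coding sequence
aln_toy <- data.frame(spacer=c("spacer_1", "spacer_1", "spacer_2", "spacer_3"),
                      cds=c("Kras", NA, NA, "Kras"),
                      stringsAsFactors=FALSE)

# Without the NA guard, the comparison itself returns NA for non-CDS hits
aln_toy$cds == "Kras"

# With the guard, the filter is strictly TRUE/FALSE
keep <- !is.na(aln_toy$cds) & aln_toy$cds == "Kras"
unique(aln_toy$spacer[keep])
```

Here only `spacer_1` and `spacer_3` are retained, since `spacer_2` has no alignment in the CDS of Kras.
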
200 |
201 | Adding alignments for the human genome (or any other genome) will overwrite
202 | the summary columns in `results_human` (`n0`, `n0_c`, `n1`, and `n1_c`)
203 | unless we set `addSummary=FALSE` in `addSpacerAlignments`. We should
204 | also take care to ensure the column name for our alignments
205 | annotation remains unique so it will not be overwritten. Here,
206 | we add alignment annotation for the human genome, but overwrite the
207 | mouse alignment summary columns (see the warning message below).
208 |
209 | ```{r}
210 | # Path of the hg38 bowtie index on my personal laptop:
211 | bowtie_index_human <- "/Users/fortinj2/crisprIndices/bowtie/hg38/hg38"
212 |
213 | results_human <- addSpacerAlignments(results_human,
214 | aligner="bowtie",
215 | aligner_index=bowtie_index_human,
216 | bsgenome=BSgenome.Hsapiens.UCSC.hg38,
217 | txObject=txdb_human,
218 | colname="alignments_human",
219 | n_mismatches=1)
220 | results_human
221 | ```
222 |
223 |
224 |
225 | # Session Info
226 |
227 | ```{r}
228 | sessionInfo()
229 | ```
230 |
231 |
232 |
233 |
--------------------------------------------------------------------------------
/Building_Database_Human_Cas9/README.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Building a genome-wide gRNA database"
3 | author: Jean-Philippe Fortin, Luke Hoberecht
4 | output:
5 | github_document:
6 | toc: true
7 | ---
8 |
9 | ```{r, echo=FALSE, results="hide"}
10 | options("knitr.graphics.auto_pdf"=TRUE)
11 | ```
12 |
13 |
14 | # Introduction
15 |
16 | In this tutorial, we provide reproducible code to design and annotate
17 | gRNAs against all human protein-coding genes using the nuclease SpCas9.
18 |
19 | # Loading necessary packages
20 |
21 | We first load the necessary packages:
22 |
23 | ```{r, warning=FALSE, message=FALSE}
24 | library(crisprBase)
25 | library(crisprScore)
26 | library(crisprDesign)
27 | library(crisprDesignData)
28 | library(BSgenome.Hsapiens.UCSC.hg38)
29 | ```
30 |
31 |
32 | ### Specifying the genome
33 |
34 | We specify a `BSgenome` object that contains the DNA sequence of the human
35 | genome in hg38 coordinates:
36 |
37 | ```{r}
38 | bsgenome <- BSgenome.Hsapiens.UCSC.hg38
39 | ```
40 |
41 | ### Specifying the genome index
42 |
43 | We specify the file path of the Bowtie index that we will need for
44 | off-target alignment:
45 |
46 | ```{r}
47 | bowtie_index <- "/Users/fortinj2/crisprIndices/bowtie/hg38/hg38"
48 | ```
49 |
50 | For instructions on how to build a Bowtie index from a given reference
51 | genome, see the [genome index tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Building_Genome_Indices).
52 |
53 |
54 |
55 | ### Specifying a SNP VCF file
56 |
57 | To flag gRNAs overlapping common SNPs, we specify a VCF file obtained from the
58 | dbSNP website containing common SNPs from the dbSNP151 release:
59 |
60 | ```{r}
61 | vcf <- "/Users/fortinj2/crisprIndices/snps/dbsnp151.grch38/00-common_all.vcf.gz"
62 | ```
63 |
64 | The VCF file was obtained from [NCBI](https://www.ncbi.nlm.nih.gov/variation/docs/human_variation_vcf).
65 |
66 | ### Specifying the nuclease
67 |
68 | We load a `CrisprNuclease` object representing the SpCas9 nuclease from
69 | the `crisprBase` package:
70 |
71 | ```{r}
72 | data(SpCas9, package="crisprBase")
73 | crisprNuclease <- SpCas9
74 | ```
75 |
76 | To learn how to specify or build a custom nuclease, see the [nuclease tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Building_Custom_Nuclease).
77 |
78 |
79 |
80 | ### Specifying on-target scoring methods
81 |
82 | We specify which on-target scoring methods should be used to score
83 | the gRNAs:
84 |
85 | ```{r}
86 | scoring_methods <- c("deephf", "deepspcas9")
87 | ```
88 |
89 |
90 | One can see which scoring methods are available for a given nuclease using
91 | the following command:
92 |
93 |
94 | ```{r}
95 | crisprScore::scoringMethodsInfo
96 | ```
97 |
98 |
99 | ### Specifying gene models and TSS annotations
100 |
101 | To annotate gRNAs with a gene and TSS annotation, we need to specify a gene model
102 | formatted as a `GRangesList` object, as well as a TSS annotation with a
103 | `GRanges` object. The `crisprDesignData` package contains such objects for both
104 | the human and mouse genomes, in GRCh38 (hg38) and GRCm38 (mm10) coordinates,
105 | respectively. Ensembl gene models were used to generate such objects.
106 | We load those objects:
107 |
108 |
109 | ```{r, warning=FALSE, message=FALSE}
110 | data(txdb_human, package="crisprDesignData")
111 | data(tss_human, package="crisprDesignData")
112 | txObject <- txdb_human
113 | tssObject <- tss_human
114 | ```
115 |
116 | See the [gene annotation tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Building_Gene_Annotation)
117 | to learn how to build such objects. The [crisprDesignData](https://github.com/crisprVerse/crisprDesignData) package page also has
118 | more information about these objects.
119 |
120 |
121 | ### Specifying repeat elements
122 |
123 | To avoid designing gRNAs targeting repeat elements, we will specify a `GRanges`
124 | object containing repeat coordinates for the human genome. Here, we use the object `gr.repeats.hg38` in `crisprDesignData`. It contains genomic coordinates of
125 | the RepeatMasker UCSC track, for the hg38 reference genome:
126 |
127 | ```{r, warning=FALSE, message=FALSE}
128 | data(gr.repeats.hg38, package="crisprDesignData")
129 | grRepeats <- gr.repeats.hg38
130 | ```
131 |
132 |
133 | # Building a complete annotation for a given gene
134 |
135 |
136 | ```{r, echo=FALSE, warning=FALSE, message=FALSE}
137 | # Explicitly loading those packages here to avoid their startup
138 | # printout when calling designCompleteAnnotation. This can be omitted.
139 | library(crisprScoreData)
140 | library(ExperimentHub)
141 | library(AnnotationHub)
142 | library(BiocFileCache)
143 | library(dbplyr)
144 | ```
145 |
146 | The `designCompleteAnnotation` function in `crisprDesign` provides a
147 | one-step workflow to design and annotate all gRNAs targeting a given gene.
148 | The function was designed to be as comprehensive as possible to
149 | design and annotate gRNAs in one step. It does the following:
150 |
151 | - Extract the DNA/RNA sequences with `queryTss`/`queryTxObject`
152 | - Design gRNAs with `findSpacers`
153 | - Remove gRNAs targeting repeat elements with `removeRepeats`
154 | - Characterize spacer sequences with `addSequenceFeatures`
155 | - Find on- and off-targets with `addSpacerAlignmentsIterative`
156 | - Add gene annotation with `addGeneAnnotation`
157 | - Add TSS annotation with `addTssAnnotation`
158 | - Add on-target efficiency scores with `addOnTargetScores`
159 | - Add off-target specificity scores with `addOffTargetScores`
160 | - Add SNP annotation with `addSNPAnnotation`
161 | - Add restriction enzyme information with `addRestrictionEnzymes`
162 |
163 |
164 | Here, we design all CRISPRko gRNAs targeting the
165 | human KRAS gene (ENSG00000133703):
166 |
167 |
168 |
169 | ```{r, warning=FALSE}
170 | gs <- designCompleteAnnotation(queryValue="ENSG00000133703",
171 | queryColumn="gene_id",
172 | modality="CRISPRko",
173 | bsgenome=bsgenome,
174 | bowtie_index=bowtie_index,
175 | crisprNuclease=SpCas9,
176 | txObject=txObject,
177 | tssObject=tssObject,
178 | grRepeats=grRepeats,
179 | vcf=vcf,
180 | n_mismatches=1,
181 | scoring_methods=scoring_methods)
182 | ```
183 |
184 |
185 | The resulting object is a `GuideSet` object. To learn more about
186 | `GuideSet` objects and how to interact with them, see the [CRISPRko gRNA design tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9).
187 |
188 |
189 | ```{r}
190 | gs
191 | ```
192 |
193 |
194 | ### Converting the `GuideSet` object to a list of data.frames
195 |
196 | The `flattenGuideSet` function in `crisprDesign` is a convenience function
197 | to convert a `GuideSet` object into a set of `data.frames` that can be
198 | saved as plain text files:
199 |
200 | ```{r}
201 | dfs <- flattenGuideSet(gs)
202 | ```
203 |
204 | We can look at the names of the data.frames:
205 |
206 | ```{r}
207 | names(dfs)
208 | ```
209 |
210 | As an example, let's look at the first rows of the primary data.frame:
211 |
212 | ```{r}
213 | head(dfs$primary)
214 | ```
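
As a minimal sketch of the saving step, the chunk below writes each element of a named list of data.frames to its own tab-delimited file. The list `dfsToy`, its column names, and the output directory are made up for illustration; in practice the list returned by `flattenGuideSet` would take its place.

```{r}
# Toy stand-in for the list returned by flattenGuideSet()
# (element and column names here are illustrative assumptions)
dfsToy <- list(primary=data.frame(ID=c("spacer_1", "spacer_2"),
                                  percentGC=c(45, 60)),
               alignments=data.frame(ID="spacer_1",
                                     n_mismatches=0))

# Write one plain text file per data.frame, named after the list element
outDir <- file.path(tempdir(), "guideset_tables")
dir.create(outDir, recursive=TRUE, showWarnings=FALSE)
for (name in names(dfsToy)){
    write.table(dfsToy[[name]],
                file=file.path(outDir, paste0(name, ".txt")),
                sep="\t",
                quote=FALSE,
                row.names=FALSE)
}
list.files(outDir)
```

Keeping one file per table preserves the separation between the primary annotation and the secondary tables.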
215 |
216 | # Building a complete gRNA database across all protein-coding genes
217 |
218 | We first get all possible genes from our gene model:
219 |
220 | ```{r}
221 | gene_ids <- unique(txObject$cds$gene_id)
222 | head(gene_ids)
223 | ```
224 |
225 |
226 | and specify where to save the `GuideSet` objects:
227 |
228 |
229 | ```{r, eval=FALSE}
230 | dir <- "./crisprko_cas9_hg38"
231 | if (!dir.exists(dir)){
232 | dir.create(dir, recursive=TRUE)
233 | }
234 | ```
235 |
236 |
237 | We now loop over all genes to generate the data:
238 |
239 | ```{r, eval=FALSE}
240 | lapply(gene_ids, function(gene){
241 | gs <- designCompleteAnnotation(queryValue=gene,
242 | queryColumn="gene_id",
243 | modality="CRISPRko",
244 | bsgenome=bsgenome,
245 | bowtie_index=bowtie_index,
246 | crisprNuclease=SpCas9,
247 | txObject=txObject,
248 | tssObject=tssObject,
249 | grRepeats=grRepeats,
250 | vcf=vcf,
251 | n_mismatches=3,
252 | scoring_methods=scoring_methods)
253 | saveRDS(gs, file=file.path(dir, paste0(gene, ".rds")))
254 | })
255 | ```
256 |
257 | This loop can be modified by the user to use an embarrassingly-parallel
258 | approach, using the [BiocParallel](https://bioconductor.org/packages/BiocParallel/) package, for instance.
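
One possible shape for such a parallel loop is sketched below. It is not the authors' implementation: `designOneGene` is a hypothetical stand-in for the `designCompleteAnnotation` call, the gene list and output directory are toy values, and `SerialParam()` is used only so that the sketch runs anywhere. Swapping in a parallel backend such as `BiocParallel::MulticoreParam(workers=4)` would distribute the genes across workers. The sketch falls back to `lapply` when BiocParallel is not installed.

```{r}
# Hypothetical stand-in for the designCompleteAnnotation() call;
# replace the body with the real call in practice.
designOneGene <- function(gene){
    data.frame(gene_id=gene, n_char=nchar(gene))
}

# Process one gene and save the result. Genes already on disk are
# skipped so that an interrupted run can be resumed.
processGene <- function(gene, dir){
    file <- file.path(dir, paste0(gene, ".rds"))
    if (!file.exists(file)){
        saveRDS(designOneGene(gene), file=file)
    }
    file
}

demoDir <- file.path(tempdir(), "crisprko_demo")
dir.create(demoDir, recursive=TRUE, showWarnings=FALSE)
demoGenes <- c("ENSG00000133703", "ENSG00000141510")

if (requireNamespace("BiocParallel", quietly=TRUE)){
    demoFiles <- BiocParallel::bplapply(demoGenes,
                                        processGene,
                                        dir=demoDir,
                                        BPPARAM=BiocParallel::SerialParam())
} else {
    demoFiles <- lapply(demoGenes, processGene, dir=demoDir)
}
demoFiles <- unlist(demoFiles)
```

Because each gene is written to its own file, the workers do not need to share any state.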
259 |
260 |
261 | Building a database for CRISPRa and CRISPRi applications works similarly.
262 | See `?designCompleteAnnotation` for more information.
263 |
264 |
265 | # Reproducibility
266 |
267 | ```{r}
268 | sessionInfo()
269 | ```
270 |
--------------------------------------------------------------------------------
/Building_Custom_Nuclease/references.bib:
--------------------------------------------------------------------------------
1 | @article{komor,
2 | title={Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage},
3 | author={Komor, Alexis C and Kim, Yongjoo B and Packer, Michael S and Zuris, John A and Liu, David R},
4 | journal={Nature},
5 | volume={533},
6 | number={7603},
7 | pages={420--424},
8 | year={2016},
9 | publisher={Nature Publishing Group}
10 | }
11 |
12 | @article{behive,
13 | title={Determinants of base editing outcomes from target library analysis and machine learning},
14 | author={Arbab, Mandana and Shen, Max W and Mok, Beverly and Wilson, Christopher and Matuszek, {\.Z}aneta and Cassa, Christopher A and Liu, David R},
15 | journal={Cell},
16 | volume={182},
17 | number={2},
18 | pages={463--480},
19 | year={2020},
20 | publisher={Elsevier}
21 | }
22 |
23 | @article{langmead2009bowtie,
24 | Abstract = {Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source http://bowtie.cbcb.umd.edu.},
25 | Author = {Langmead, Ben and Trapnell, Cole and Pop, Mihai and Salzberg, Steven L.},
26 | Da = {2009/03/04},
27 | Doi = {10.1186/gb-2009-10-3-r25},
28 | Id = {Langmead2009},
29 | Isbn = {1474-760X},
30 | Journal = {Genome Biology},
31 | Number = {3},
32 | Pages = {R25},
33 | Title = {Ultrafast and memory-efficient alignment of short DNA sequences to the human genome},
34 | Ty = {JOUR},
35 | Url = {https://doi.org/10.1186/gb-2009-10-3-r25},
36 | Volume = {10},
37 | Year = {2009}
38 | }
39 |
40 |
41 | @article{lindel,
42 | title={Massively parallel profiling and predictive modeling of the outcomes of CRISPR/Cas9-mediated double-strand break repair},
43 | author={Chen, Wei and McKenna, Aaron and Schreiber, Jacob and Haeussler, Maximilian and Yin, Yi and Agarwal, Vikram and Noble, William Stafford and Shendure, Jay},
44 | journal={Nucleic acids research},
45 | volume={47},
46 | number={15},
47 | pages={7989--8003},
48 | year={2019},
49 | publisher={Oxford University Press}
50 | }
51 |
52 | @article{azimuth,
53 | title={Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9},
54 | author={Doench, John G and Fusi, Nicolo and Sullender, Meagan and Hegde, Mudra and Vaimberg, Emma W and Donovan, Katherine F and Smith, Ian and Tothova, Zuzana and Wilen, Craig and Orchard, Robert and others},
55 | journal={Nature biotechnology},
56 | volume={34},
57 | number={2},
58 | pages={184},
59 | year={2016},
60 | publisher={Nature Publishing Group}
61 | }
62 |
63 | @article{deepcas9,
64 | title={Optimized CRISPR guide RNA design for two high-fidelity Cas9 variants by deep learning},
65 | author={Wang, Daqi and Zhang, Chengdong and Wang, Bei and Li, Bin and Wang, Qiang and Liu, Dong and Wang, Hongyan and Zhou, Yan and Shi, Leming and Lan, Feng and others},
66 | journal={Nature communications},
67 | volume={10},
68 | number={1},
69 | pages={1--14},
70 | year={2019},
71 | publisher={Nature Publishing Group}
72 | }
73 |
74 | @article{deepcpf1,
75 | title={Deep learning improves prediction of CRISPR--Cpf1 guide RNA activity},
76 | author={Kim, Hui Kwon and Min, Seonwoo and Song, Myungjae and Jung, Soobin and Choi, Jae Woo and Kim, Younggwang and Lee, Sangeun and Yoon, Sungroh and Kim, Hyongbum Henry},
77 | journal={Nature biotechnology},
78 | volume={36},
79 | number={3},
80 | pages={239},
81 | year={2018},
82 | publisher={Nature Publishing Group}
83 | }
84 |
85 | @article{perturbseq,
86 | title={A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response},
87 | author={Adamson, Britt and Norman, Thomas M and Jost, Marco and Cho, Min Y and Nu{\~n}ez, James K and Chen, Yuwen and Villalta, Jacqueline E and Gilbert, Luke A and Horlbeck, Max A and Hein, Marco Y and others},
88 | journal={Cell},
89 | volume={167},
90 | number={7},
91 | pages={1867--1882},
92 | year={2016},
93 | publisher={Elsevier}
94 | }
95 |
96 | @article{cropseq,
97 | title={Pooled CRISPR screening with single-cell transcriptome readout},
98 | author={Datlinger, Paul and Rendeiro, Andr{\'e} F and Schmidl, Christian and Krausgruber, Thomas and Traxler, Peter and Klughammer, Johanna and Schuster, Linda C and Kuchler, Amelie and Alpar, Donat and Bock, Christoph},
99 | journal={Nature methods},
100 | volume={14},
101 | number={3},
102 | pages={297},
103 | year={2017},
104 | publisher={Nature Publishing Group}
105 | }
106 |
107 | @article{crispracrisprireview,
108 | title={CRISPRi and CRISPRa screens in mammalian cells for precision biology and medicine},
109 | author={Kampmann, Martin},
110 | journal={ACS chemical biology},
111 | volume={13},
112 | number={2},
113 | pages={406--416},
114 | year={2018},
115 | publisher={ACS Publications}
116 | }
117 |
118 |
119 |
120 | @article{crispri,
121 | title={CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes},
122 | author={Gilbert, Luke A and Larson, Matthew H and Morsut, Leonardo and Liu, Zairan and Brar, Gloria A and Torres, Sandra E and Stern-Ginossar, Noam and Brandman, Onn and Whitehead, Evan H and Doudna, Jennifer A and others},
123 | journal={Cell},
124 | volume={154},
125 | number={2},
126 | pages={442--451},
127 | year={2013},
128 | publisher={Elsevier}
129 | }
130 |
131 |
132 | @article{sam,
133 | title={Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex},
134 | author={Konermann, Silvana and Brigham, Mark D and Trevino, Alexandro E and Joung, Julia and Abudayyeh, Omar O and Barcena, Clea and Hsu, Patrick D and Habib, Naomi and Gootenberg, Jonathan S and Nishimasu, Hiroshi and others},
135 | journal={Nature},
136 | volume={517},
137 | number={7536},
138 | pages={583},
139 | year={2015},
140 | publisher={Nature Publishing Group}
141 | }
142 |
143 | @article{fortin2019,
144 | title={Multiple-gene targeting and mismatch tolerance can confound analysis of genome-wide pooled CRISPR screens},
145 | author={Fortin, Jean-Philippe and Tan, Jenille and Gascoigne, Karen E and Haverty, Peter M and Forrest, William F and Costa, Michael R and Martin, Scott E},
146 | journal={Genome biology},
147 | volume={20},
148 | number={1},
149 | pages={21},
150 | year={2019},
151 | publisher={Springer}
152 | }
153 |
154 |
155 |
156 | @article{sanson2018optimized,
157 | title={Optimized libraries for CRISPR-Cas9 genetic screens with multiple modalities},
158 | author={Sanson, Kendall R and Hanna, Ruth E and Hegde, Mudra and Donovan, Katherine F and Strand, Christine and Sullender, Meagan E and Vaimberg, Emma W and Goodale, Amy and Root, David E and Piccioni, Federica and others},
159 | journal={Nature communications},
160 | volume={9},
161 | number={1},
162 | pages={1--15},
163 | year={2018},
164 | publisher={Nature Publishing Group}
165 | }
166 |
167 |
168 |
169 | @article{horlbeck2016compact,
170 | title={Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation},
171 | author={Horlbeck, Max A and Gilbert, Luke A and Villalta, Jacqueline E and Adamson, Britt and Pak, Ryan A and Chen, Yuwen and Fields, Alexander P and Park, Chong Yon and Corn, Jacob E and Kampmann, Martin and others},
172 | journal={Elife},
173 | volume={5},
174 | pages={e19760},
175 | year={2016},
176 | publisher={eLife Sciences Publications Limited}
177 | }
178 |
179 |
180 | @article{ceres,
181 | title={Computational correction of copy number effect improves specificity of CRISPR--Cas9 essentiality screens in cancer cells},
182 | author={Meyers, Robin M and Bryan, Jordan G and McFarland, James M and Weir, Barbara A and Sizemore, Ann E and Xu, Han and Dharia, Neekesh V and Montgomery, Phillip G and Cowley, Glenn S and Pantel, Sasha and others},
183 | journal={Nature genetics},
184 | volume={49},
185 | number={12},
186 | pages={1779--1784},
187 | year={2017},
188 | publisher={Nature Publishing Group}
189 | }
190 |
191 |
192 | @article{score,
193 | title={Prioritization of cancer therapeutic targets using CRISPR--Cas9 screens},
194 | author={Behan, Fiona M and Iorio, Francesco and Picco, Gabriele and Gon{\c{c}}alves, Emanuel and Beaver, Charlotte M and Migliardi, Giorgia and Santos, Rita and Rao, Yanhua and Sassi, Francesco and Pinnelli, Marika and others},
195 | journal={Nature},
196 | volume={568},
197 | number={7753},
198 | pages={511},
199 | year={2019},
200 | publisher={Nature Publishing Group}
201 | }
202 |
203 |
204 |
205 | @article{mit,
206 | title={DNA targeting specificity of RNA-guided Cas9 nucleases},
207 | author={Hsu, Patrick D and Scott, David A and Weinstein, Joshua A and Ran, F Ann and Konermann, Silvana and Agarwala, Vineeta and Li, Yinqing and Fine, Eli J and Wu, Xuebing and Shalem, Ophir and others},
208 | journal={Nature biotechnology},
209 | volume={31},
210 | number={9},
211 | pages={827},
212 | year={2013},
213 | publisher={Nature Publishing Group}
214 | }
215 |
216 |
217 |
218 | @article{rebase,
219 | title={REBASE—a database for DNA restriction and modification: enzymes, genes and genomes},
220 | author={Roberts, Richard J and Vincze, Tamas and Posfai, Janos and Macelis, Dana},
221 | journal={Nucleic acids research},
222 | volume={38},
223 | number={suppl\_1},
224 | pages={D234--D236},
225 | year={2010},
226 | publisher={Oxford University Press}
227 | }
228 |
229 |
230 |
231 |
--------------------------------------------------------------------------------
/Design_CRISPRbe/references.bib:
--------------------------------------------------------------------------------
1 | @article{ops,
2 | title={Optical pooled screens in human cells},
3 | author={Feldman, David and Singh, Avtar and Schmid-Burgk, Jonathan L and Carlson, Rebecca J and Mezger, Anja and Garrity, Anthony J and Zhang, Feng and Blainey, Paul C},
4 | journal={Cell},
5 | volume={179},
6 | number={3},
7 | pages={787--799},
8 | year={2019},
9 | publisher={Elsevier}
10 | }
11 |
12 | @article{cas13d,
13 | author = {Konermann, Silvana and Lotfy, Peter and Brideau, Nicholas J and Oki, Jennifer and Shokhirev, Maxim N and Hsu, Patrick D},
14 | journal = {Cell},
15 | number = {3},
16 | pages = {665--676},
17 | publisher = {Elsevier},
18 | title = {Transcriptome engineering with RNA-targeting type VI-D CRISPR effectors},
19 | volume = {173},
20 | year = {2018}}
21 |
22 | @article{koblan2018improving,
23 | title={Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction},
24 | author={Koblan, Luke W and Doman, Jordan L and Wilson, Christopher and Levy, Jonathan M and Tay, Tristan and Newby, Gregory A and Maianti, Juan Pablo and Raguram, Aditya and Liu, David R},
25 | journal={Nature biotechnology},
26 | volume={36},
27 | number={9},
28 | pages={843--846},
29 | year={2018},
30 | publisher={Nature Publishing Group}
31 | }
32 |
33 | @article{sanson2018optimized,
34 | title={Optimized libraries for CRISPR-Cas9 genetic screens with multiple modalities},
35 | author={Sanson, Kendall R and Hanna, Ruth E and Hegde, Mudra and Donovan, Katherine F and Strand, Christine and Sullender, Meagan E and Vaimberg, Emma W and Goodale, Amy and Root, David E and Piccioni, Federica and others},
36 | journal={Nature communications},
37 | volume={9},
38 | number={1},
39 | pages={1--15},
40 | year={2018},
41 | publisher={Nature Publishing Group}
42 | }
43 |
44 | @article{langmead2009bowtie,
45 | Abstract = {Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source http://bowtie.cbcb.umd.edu.},
46 | Author = {Langmead, Ben and Trapnell, Cole and Pop, Mihai and Salzberg, Steven L.},
47 | Da = {2009/03/04},
48 | Doi = {10.1186/gb-2009-10-3-r25},
49 | Id = {Langmead2009},
50 | Isbn = {1474-760X},
51 | Journal = {Genome Biology},
52 | Number = {3},
53 | Pages = {R25},
54 | Title = {Ultrafast and memory-efficient alignment of short DNA sequences to the human genome},
55 | Ty = {JOUR},
56 | Url = {https://doi.org/10.1186/gb-2009-10-3-r25},
57 | Volume = {10},
58 | Year = {2009}
59 | }
60 |
61 |
62 | @article{lindel,
63 | title={Massively parallel profiling and predictive modeling of the outcomes of CRISPR/Cas9-mediated double-strand break repair},
64 | author={Chen, Wei and McKenna, Aaron and Schreiber, Jacob and Haeussler, Maximilian and Yin, Yi and Agarwal, Vikram and Noble, William Stafford and Shendure, Jay},
65 | journal={Nucleic acids research},
66 | volume={47},
67 | number={15},
68 | pages={7989--8003},
69 | year={2019},
70 | publisher={Oxford University Press}
71 | }
72 |
73 | @article{azimuth,
74 | title={Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9},
75 | author={Doench, John G and Fusi, Nicolo and Sullender, Meagan and Hegde, Mudra and Vaimberg, Emma W and Donovan, Katherine F and Smith, Ian and Tothova, Zuzana and Wilen, Craig and Orchard, Robert and others},
76 | journal={Nature biotechnology},
77 | volume={34},
78 | number={2},
79 | pages={184},
80 | year={2016},
81 | publisher={Nature Publishing Group}
82 | }
83 |
84 | @article{deepcas9,
85 | title={Optimized CRISPR guide RNA design for two high-fidelity Cas9 variants by deep learning},
86 | author={Wang, Daqi and Zhang, Chengdong and Wang, Bei and Li, Bin and Wang, Qiang and Liu, Dong and Wang, Hongyan and Zhou, Yan and Shi, Leming and Lan, Feng and others},
87 | journal={Nature communications},
88 | volume={10},
89 | number={1},
90 | pages={1--14},
91 | year={2019},
92 | publisher={Nature Publishing Group}
93 | }
94 |
95 | @article{deepcpf1,
96 | title={Deep learning improves prediction of CRISPR--Cpf1 guide RNA activity},
97 | author={Kim, Hui Kwon and Min, Seonwoo and Song, Myungjae and Jung, Soobin and Choi, Jae Woo and Kim, Younggwang and Lee, Sangeun and Yoon, Sungroh and Kim, Hyongbum Henry},
98 | journal={Nature biotechnology},
99 | volume={36},
100 | number={3},
101 | pages={239},
102 | year={2018},
103 | publisher={Nature Publishing Group}
104 | }
105 |
106 | @article{perturbseq,
107 | title={A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response},
108 | author={Adamson, Britt and Norman, Thomas M and Jost, Marco and Cho, Min Y and Nu{\~n}ez, James K and Chen, Yuwen and Villalta, Jacqueline E and Gilbert, Luke A and Horlbeck, Max A and Hein, Marco Y and others},
109 | journal={Cell},
110 | volume={167},
111 | number={7},
112 | pages={1867--1882},
113 | year={2016},
114 | publisher={Elsevier}
115 | }
116 |
117 | @article{cropseq,
118 | title={Pooled CRISPR screening with single-cell transcriptome readout},
119 | author={Datlinger, Paul and Rendeiro, Andr{\'e} F and Schmidl, Christian and Krausgruber, Thomas and Traxler, Peter and Klughammer, Johanna and Schuster, Linda C and Kuchler, Amelie and Alpar, Donat and Bock, Christoph},
120 | journal={Nature methods},
121 | volume={14},
122 | number={3},
123 | pages={297},
124 | year={2017},
125 | publisher={Nature Publishing Group}
126 | }
127 |
128 | @article{crispracrisprireview,
129 | title={CRISPRi and CRISPRa screens in mammalian cells for precision biology and medicine},
130 | author={Kampmann, Martin},
131 | journal={ACS chemical biology},
132 | volume={13},
133 | number={2},
134 | pages={406--416},
135 | year={2018},
136 | publisher={ACS Publications}
137 | }
138 |
139 |
140 |
141 | @article{crispri,
142 | title={CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes},
143 | author={Gilbert, Luke A and Larson, Matthew H and Morsut, Leonardo and Liu, Zairan and Brar, Gloria A and Torres, Sandra E and Stern-Ginossar, Noam and Brandman, Onn and Whitehead, Evan H and Doudna, Jennifer A and others},
144 | journal={Cell},
145 | volume={154},
146 | number={2},
147 | pages={442--451},
148 | year={2013},
149 | publisher={Elsevier}
150 | }
151 |
152 |
153 | @article{sam,
154 | title={Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex},
155 | author={Konermann, Silvana and Brigham, Mark D and Trevino, Alexandro E and Joung, Julia and Abudayyeh, Omar O and Barcena, Clea and Hsu, Patrick D and Habib, Naomi and Gootenberg, Jonathan S and Nishimasu, Hiroshi and others},
156 | journal={Nature},
157 | volume={517},
158 | number={7536},
159 | pages={583},
160 | year={2015},
161 | publisher={Nature Publishing Group}
162 | }
163 |
164 | @article{fortin2019,
165 | title={Multiple-gene targeting and mismatch tolerance can confound analysis of genome-wide pooled CRISPR screens},
166 | author={Fortin, Jean-Philippe and Tan, Jenille and Gascoigne, Karen E and Haverty, Peter M and Forrest, William F and Costa, Michael R and Martin, Scott E},
167 | journal={Genome biology},
168 | volume={20},
169 | number={1},
170 | pages={21},
171 | year={2019},
172 | publisher={Springer}
173 | }
174 |
175 |
176 |
190 | @article{horlbeck2016compact,
191 | title={Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation},
192 | author={Horlbeck, Max A and Gilbert, Luke A and Villalta, Jacqueline E and Adamson, Britt and Pak, Ryan A and Chen, Yuwen and Fields, Alexander P and Park, Chong Yon and Corn, Jacob E and Kampmann, Martin and others},
193 | journal={Elife},
194 | volume={5},
195 | pages={e19760},
196 | year={2016},
197 | publisher={eLife Sciences Publications Limited}
198 | }
199 |
200 |
201 | @article{ceres,
202 | title={Computational correction of copy number effect improves specificity of CRISPR--Cas9 essentiality screens in cancer cells},
203 | author={Meyers, Robin M and Bryan, Jordan G and McFarland, James M and Weir, Barbara A and Sizemore, Ann E and Xu, Han and Dharia, Neekesh V and Montgomery, Phillip G and Cowley, Glenn S and Pantel, Sasha and others},
204 | journal={Nature genetics},
205 | volume={49},
206 | number={12},
207 | pages={1779--1784},
208 | year={2017},
209 | publisher={Nature Publishing Group}
210 | }
211 |
212 |
213 | @article{score,
214 | title={Prioritization of cancer therapeutic targets using CRISPR--Cas9 screens},
215 | author={Behan, Fiona M and Iorio, Francesco and Picco, Gabriele and Gon{\c{c}}alves, Emanuel and Beaver, Charlotte M and Migliardi, Giorgia and Santos, Rita and Rao, Yanhua and Sassi, Francesco and Pinnelli, Marika and others},
216 | journal={Nature},
217 | volume={568},
218 | number={7753},
219 | pages={511},
220 | year={2019},
221 | publisher={Nature Publishing Group}
222 | }
223 |
224 |
225 |
226 | @article{mit,
227 | title={DNA targeting specificity of RNA-guided Cas9 nucleases},
228 | author={Hsu, Patrick D and Scott, David A and Weinstein, Joshua A and Ran, F Ann and Konermann, Silvana and Agarwala, Vineeta and Li, Yinqing and Fine, Eli J and Wu, Xuebing and Shalem, Ophir and others},
229 | journal={Nature biotechnology},
230 | volume={31},
231 | number={9},
232 | pages={827},
233 | year={2013},
234 | publisher={Nature Publishing Group}
235 | }
236 |
237 |
238 |
239 |
--------------------------------------------------------------------------------
/Design_PairedGuides/README.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Using crisprDesign to design paired gRNAs"
3 | author: Jean-Philippe Fortin, Luke Hoberecht
4 | output:
5 | github_document:
6 | toc: true
7 | bibliography: references.bib
8 | ---
9 |
10 | ```{r, echo=FALSE, results="hide"}
11 | options("knitr.graphics.auto_pdf"=TRUE)
12 | ```
13 |
14 |
15 | # Introduction
16 |
17 | In this tutorial, we illustrate the main functionalities of `crisprDesign`
18 | for designing pairs of gRNAs.
19 |
20 | # Getting started
21 |
22 | ## Installation
23 |
24 | See the [Installation tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Installation) to learn how to install the packages necessary for this tutorial:
25 | `crisprDesign`, `crisprDesignData`
26 |
27 |
28 | ## Terminology
29 |
30 | See the [CRISPRko Cas9 design tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) to get familiar with the terminology used throughout this tutorial.
31 |
32 | ## Paired gRNA design overview
33 |
34 | There are several applications that require the design of gRNA pairs:
35 |
36 | 1. Double nicking with CRISPR/Cas9 [@ran2013double]
37 | 2. Dual-promoter screening systems [@han2017synergistic]
38 | 3. Multiplexing gRNAs with enAsCas12a [@deweirdt2021optimization]
39 | 4. Nanopore Cas9-targeted sequencing (nCATS) [@gilpatrick2020targeted]
40 |
41 | The `crisprDesign` package provides an infrastructure to store and annotate
42 | gRNA pairs via the `PairedGuideSet` object, which behaves very similarly
43 | to the `GuideSet` object used for unpaired gRNAs. We designed the
44 | functionalities for paired gRNAs with the aforementioned applications in mind.
45 |
46 | In this tutorial, we will go through a simple example to illustrate
47 | the general concept behind paired gRNA design with `crisprDesign`.
48 |
49 |
50 | # A simple example: deleting a KRAS exon with a pair of gRNAs
51 |
52 | We will show here how to design an optimal pair of Cas9 gRNAs flanking the
53 | second exon of the human gene KRAS (ENSG00000133703), with the goal of
54 | creating a deletion that will excise the exon.
55 |
56 | We start by loading the necessary packages:
57 |
58 | ```{r, message=FALSE, warning=FALSE, results='hide' }
59 | library(crisprDesign)
60 | library(crisprDesignData)
61 | library(crisprBase)
62 | library(BSgenome.Hsapiens.UCSC.hg38)
63 | ```
64 |
65 | We will be designing gRNAs for the SpCas9 nuclease, which can be loaded
66 | from the `crisprBase` package (see the `crisprBase` [vignette](https://github.com/crisprVerse/crisprBase) for
67 | instructions on how to create or load alternative nucleases):
68 |
69 |
70 | ```{r}
71 | data(SpCas9, package="crisprBase")
72 | ```
73 |
74 | Let's get the genomic coordinates of the second exon.
75 | First, we obtain from `crisprDesignData` a `GRangesList` object that defines
76 | the genomic coordinates (hg38 genome) of human protein-coding genes:
77 |
78 | ```{r}
79 | data(txdb_human, package="crisprDesignData")
80 | ```
81 |
82 | We then get the exonic coordinates of the canonical transcript ENST00000311936
83 | using the function `queryTxObject` from `crisprDesign`:
84 |
85 |
86 | ```{r}
87 | exons <- queryTxObject(txObject=txdb_human,
88 | featureType="exons",
89 | queryColumn="tx_id",
90 | queryValue="ENST00000311936")
91 | exons
92 | ```
93 |
94 | Finally, we select the second exon:
95 |
96 | ```{r}
97 | exon <- exons[exons$exon_rank==2]
98 | names(exon) <- "exon_kras"
99 | exon
100 | ```
101 |
102 | The exon is on chr12, and spans the region 25245274-25245395 (122
103 | nucleotides in length). We aim to design gRNA pairs for which one
104 | gRNA is located upstream of the exon, and the other is located downstream
105 | of the exon. To be able to find good gRNA candidates, let's define those
106 | regions to have 100 nucleotides on each side:
107 |
108 |
109 | ```{r}
110 | library(IRanges)
111 | regionUpstream <- IRanges::flank(exon, width=100, start=FALSE)
112 | regionDownstream <- IRanges::flank(exon, width=100, start=TRUE)
113 | names(regionUpstream) <- "upstreamTarget"
114 | names(regionDownstream) <- "downstreamTarget"
115 | ```
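
To make the coordinates concrete, the following sketch works out the two 100-nucleotide windows by hand from the exon coordinates quoted above, using plain arithmetic only. The variable names are illustrative. For stranded ranges, which of the two windows `flank` returns for `start=TRUE` depends on the strand of the exon, so printing `regionUpstream` and `regionDownstream` is an easy way to check which is which.

```{r}
# Exon coordinates quoted above (chr12, hg38)
exonStart <- 25245274
exonEnd <- 25245395
flankWidth <- 100

# Width of the exon: both endpoints are included
exonWidth <- exonEnd - exonStart + 1

# Window immediately before the exon in genomic coordinates
lowerFlank <- c(start=exonStart - flankWidth, end=exonStart - 1)

# Window immediately after the exon in genomic coordinates
upperFlank <- c(start=exonEnd + 1, end=exonEnd + flankWidth)

exonWidth
lowerFlank
upperFlank
```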
116 |
117 | Similar to the `findSpacers` function in `crisprDesign`, we will need
118 | to specify a `BSgenome` object containing the reference genome DNA
119 | sequences:
120 |
121 | ```{r}
122 | bsgenome <- BSgenome.Hsapiens.UCSC.hg38
123 | ```
124 |
125 | We are now ready to find all candidate gRNA pairs:
126 |
127 | ```{r}
128 | pairs <- findSpacerPairs(x1=regionUpstream,
129 | x2=regionDownstream,
130 | bsgenome=bsgenome,
131 | crisprNuclease=SpCas9)
132 | ```
133 |
134 | The `x1` and `x2` arguments specify the genomic regions that gRNAs at
135 | position 1 and position 2 should target, respectively.
136 | The function finds all possible pair combinations between spacers
137 | found in the region specified by `x1` and spacers found in the region
138 | specified by `x2`. Let's first name our pairs:
139 |
140 | ```{r}
141 | names(pairs) <- paste0("pair_", seq_along(pairs))
142 | ```
143 |
144 | Let's see what the results look like:
145 | ```{r}
146 | head(pairs, n=3)
147 | ```
148 |
149 | The returned object is a `PairedGuideSet`, which can be thought of as a list
150 | of two `GuideSet` objects. The first and second `GuideSet` store
151 | information about gRNAs at position 1 and position 2, respectively.
152 | They can be accessed using the `first` and `second` functions:
153 |
154 | ```{r}
155 | grnas1 <- first(pairs)
156 | head(grnas1, n=3)
157 | ```
158 |
159 | and
160 |
161 | ```{r}
162 | grnas2 <- second(pairs)
163 | head(grnas2, n=3)
164 | ```
165 |
166 | The `pamOrientation` function returns the PAM orientation of the pairs:
167 |
168 | ```{r}
169 | head(pamOrientation(pairs))
170 | ```
171 |
172 | and takes 4 different values: `in` (for PAM-in configuration), `out` (for
173 | PAM-out configuration), `fwd` (both gRNAs target the forward strand),
174 | and `rev` (both gRNAs target the reverse strand); see figure below for
175 | an illustration of the PAM orientations for the SpCas9 nuclease.
176 | The importance of the PAM orientation is application-specific. For
177 | Nanopore Cas9-targeted sequencing, PAM-in configuration is preferred.
178 | For double nicking with CRISPR/Cas9, PAM-out configuration is preferred.
179 | For applications using a dual-promoter system, no configuration is
180 | preferred.
181 |
182 |
183 | ```{r, echo=FALSE, out.width = "75%", fig.align="center", fig.cap="Different PAM orientations for Cas9 paired gRNAs"}
184 | knitr::include_graphics("./figures/paired_simplified.svg")
185 | ```
186 |
187 |
188 | The function `pamDistance` returns the distance between the PAM sites of the
189 | two gRNAs. The function `cutLength` returns the distance between the
190 | cut sites of the two gRNAs, and the function `spacerDistance`
191 | returns the distance between the two spacer sequences of the gRNAs.
192 |
193 | Most functionalities available for designing single gRNAs (`GuideSet`
194 | annotation functions described in [this tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9)) work similarly for `PairedGuideSet` objects. This includes:
195 |
196 | - `addSequenceFeatures`
197 | - `addSpacerAlignments`
198 | - `addGeneAnnotation`
199 | - `addTssAnnotation`
200 | - `addOnTargetScores`
201 | - `addOffTargetScores`
202 | - `addPamScores`
203 | - `addSNPAnnotation`
204 | - `addRestrictionEnzymes`
205 | - `addCompositeScores`
206 | - `addConservationScores`
207 |
208 | Each function adds an annotation to the first and second `GuideSet` objects
209 | stored in the `PairedGuideSet`. Let's look at an example using
210 | `addSequenceFeatures`:
211 |
212 | ```{r}
213 | pairs <- addSequenceFeatures(pairs)
214 | ```
215 |
216 | and let's look at the `GuideSet` in the first position:
217 |
218 | ```{r}
219 | head(first(pairs), n=3)
220 | ```
221 |
222 | This comes in handy to filter out pairs with unwanted sgRNA characteristics,
223 | e.g. sgRNAs with polyT stretches:
224 |
225 | ```{r}
226 | good1 <- !first(pairs)$polyT
227 | good2 <- !second(pairs)$polyT
228 | pairs <- pairs[good1 & good2]
229 | ```
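
The filtering pattern used here keeps a pair only when both of its members pass. The toy sketch below shows the same logic on plain logical vectors. The pair names and flags are made up, with each position in the two vectors describing the first and second gRNA of the same pair.

```{r}
# Made-up polyT flags for three pairs: one vector per position in the pair
polyTFirst <- c(FALSE, TRUE, FALSE)
polyTSecond <- c(FALSE, FALSE, TRUE)
pairNames <- paste0("pair_", seq_along(polyTFirst))

# A pair is kept only if neither of its gRNAs has a polyT stretch
keep <- !polyTFirst & !polyTSecond
pairNames[keep]
```

Only the first toy pair is kept, since each of the other two has one member with a polyT stretch.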
230 |
231 | To select the final candidate pairs to excise the KRAS exon, we will
232 | filter out pairs with low predicted on-target activity using the
233 | DeepHF on-target activity score. We first add the score:
234 |
235 | ```{r}
236 | pairs <- addOnTargetScores(pairs, methods="deephf")
237 | ```
238 |
239 | and only keep pairs for which both gRNAs have a score of at least 0.5:
240 |
241 | ```{r}
242 | good1 <- first(pairs)$score_deephf>=0.5
243 | good2 <- second(pairs)$score_deephf>=0.5
244 | pairs <- pairs[good1 & good2]
245 | ```
246 |
247 | This leaves us with 2 candidate pairs:
248 |
249 | ```{r}
250 | pairs
251 | ```
252 |
253 |
254 | Finally, let's check for off-targets.
255 | We need to specify the path of the bowtie index that was generated
256 | from the human reference genome:
257 |
258 | ```{r}
259 | bowtie_index <- "/Users/fortinj2/crisprIndices/bowtie/hg38/hg38"
260 | ```
261 |
262 | For instructions on how to build a Bowtie index from a given reference genome, see the [genome index tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Building_Genome_Indices) or the [crisprBowtie page](https://github.com/crisprVerse/crisprBowtie).
263 |
264 | We are now ready to search for off-targets with up to 3 mismatches:
265 |
266 |
267 | ```{r}
268 | pairs <- addSpacerAlignments(pairs,
269 | txObject=txdb_human,
270 | aligner_index=bowtie_index,
271 | bsgenome=bsgenome,
272 | n_mismatches=3)
273 | ```
274 |
275 | We are in luck: none of the spacer sequences has an off-target in the coding
276 | region of another gene:
277 |
278 |
279 | ```{r}
280 | good1 <- first(pairs)$n1_c==0 & first(pairs)$n2_c==0 & first(pairs)$n3_c==0
281 | good2 <- second(pairs)$n1_c==0 & second(pairs)$n2_c==0 & second(pairs)$n3_c==0
282 | pairs <- pairs[good1 & good2]
283 | pairs
284 | ```
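
The two filtering lines above repeat the same logic for each position of the pair. A minimal sketch of how that logic could be factored into a helper is shown here on a mock data frame; the helper name `noCodingOffTargets` and the toy counts are hypothetical, and only the column names `n1_c`, `n2_c`, and `n3_c` come from the example above:

```r
# Mock per-guide counts of off-targets located in coding regions (toy values)
aln <- data.frame(n1_c=c(0, 0, 1),
                  n2_c=c(0, 2, 0),
                  n3_c=c(0, 0, 0))

# TRUE when a guide has no coding off-target with 1, 2, or 3 mismatches
noCodingOffTargets <- function(x) {
    rowSums(as.matrix(x[, c("n1_c", "n2_c", "n3_c")])) == 0
}

noCodingOffTargets(aln)
```

With a `PairedGuideSet`, the same idea would presumably be applied to the metadata columns of `first(pairs)` and `second(pairs)` rather than to a plain data frame.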
285 |
286 |
287 | One can get the spacer sequences using the `spacers` accessor function
288 | as usual:
289 |
290 | ```{r}
291 | spacers(pairs)
292 | ```
293 |
294 |
295 |
296 | # Session Info
297 |
298 | ```{r}
299 | sessionInfo()
300 | ```
301 |
302 | # References
303 |
304 |
305 |
--------------------------------------------------------------------------------
/Design_OPS/README.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Using crisprDesign to design gRNAs for optical pooled screening (OPS)"
3 | author: Jean-Philippe Fortin, Luke Hoberecht
4 | output:
5 | github_document:
6 | toc: true
7 | bibliography: references.bib
8 | ---
9 |
10 | ```{r, echo=FALSE, results="hide"}
11 | options("knitr.graphics.auto_pdf"=TRUE)
12 | ```
13 |
14 |
15 | # Introduction
16 |
17 | Optical pooled screening (OPS) combines image-based sequencing (in
18 | situ sequencing) of gRNAs and optical phenotyping on the same physical
19 | wells [@ops]. In such experiments, guide RNA (gRNA) spacer sequences
20 | are partially sequenced from the 5-prime end; the length of these
21 | truncated sequences, or barcodes, which corresponds to the number of
22 | sequencing cycles, is fixed and chosen by the experimentalist. From a
23 | gRNA design perspective, additional constraints are needed to ensure
24 | sufficient dissimilarity between the truncated barcodes for their
25 | identification during the analysis.
26 |
27 | This tutorial will demonstrate how to design gRNAs for use in
28 | optical pooled screens, with emphasis on the constraints described
29 | above. Common gRNA design steps that are not specific to OPS (e.g. off-target
30 | search or on-target activity prediction) are omitted from this tutorial.
31 | Users can browse the list of [available tutorials](https://github.com/crisprVerse/Tutorials) for more information
32 | regarding application-specific gRNA design rules.
33 |
34 |
35 |
36 | # Installation
37 |
38 | See the [Installation tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Installation) to learn how to install the packages necessary for this tutorial:
39 | `crisprDesign`, `crisprDesignData`
40 |
41 |
42 | # Terminology
43 |
44 | See the [CRISPRko design tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) to get familiar with the terminology used throughout this tutorial.
45 |
46 |
47 | # Design for optical pooled screening (OPS)
48 |
49 | To illustrate the functionalities of `crisprDesign` for designing
50 | OPS libraries, we will design a small CRISPRko OPS library targeting
51 | 3 genes of the human RAS family: KRAS, HRAS, and NRAS. We will use the SpCas9
52 | nuclease.
53 |
54 | We will design gRNAs for an experiment that uses 8 in situ sequencing cycles:
55 |
56 | ```{r}
57 | n_cycles <- 8
58 | ```
59 |
60 | ## Loading packages
61 |
62 | Before we start, we first load the necessary packages for this tutorial:
63 |
64 | ```{r, warning=FALSE, message=FALSE, results='hide'}
65 | library(crisprBase)
66 | library(crisprDesign)
67 | library(crisprDesignData)
68 | library(BSgenome.Hsapiens.UCSC.hg38)
69 | ```
70 |
71 |
72 | ## Creating the GuideSet
73 |
74 | We begin by loading the SpCas9 `CrisprNuclease` object from the `crisprBase`
75 | package
76 |
77 | ```{r}
78 | data(SpCas9, package="crisprBase")
79 | ```
80 |
81 | as well as data containing gene regions for the human genome:
82 |
83 | ```{r}
84 | data(txdb_human, package="crisprDesignData")
85 | ```
86 |
87 | For more information on `txdb_human` and how to create similar gene annotation objects, see the [Building a gene annotation object](https://github.com/crisprVerse/Tutorials/tree/master/Building_Gene_Annotation) tutorial.
88 |
89 |
90 | Next, we find the CDS coordinates for our genes using
91 | the `queryTxObject` function:
92 |
93 | ```{r}
94 | target_genes <- c("KRAS", "HRAS", "NRAS")
95 | target_regions <- queryTxObject(txdb_human,
96 | featureType="cds",
97 | queryColumn="gene_symbol",
98 | queryValue=target_genes)
99 | ```
100 |
101 | then build our `GuideSet` with the `findSpacers` function:
102 |
103 | ```{r}
104 | gs <- findSpacers(target_regions,
105 | crisprNuclease=SpCas9,
106 | bsgenome=BSgenome.Hsapiens.UCSC.hg38)
107 | ```
108 |
109 | As we will want to distinguish which gene each spacer targets,
110 | we will add `gene_symbol` and `gene_id` columns from `target_regions`.
111 |
112 | ```{r}
113 | gene_info <- target_regions[gs$region]
114 | gs$gene_symbol <- gene_info$gene_symbol
115 | gs$gene_id <- gene_info$gene_id
116 | ```
117 |
118 |
119 | ## Adding OPS barcodes
120 |
121 | We can add our OPS barcodes to the GuideSet with the `addOpsBarcodes` function.
122 | This function extracts the `n_cycles` nucleotides from the 5-prime end of
123 | our spacers and stores them in the `opsBarcode` column:
124 |
125 | ```{r}
126 | gs <- addOpsBarcodes(gs,
127 | n_cycles=n_cycles)
128 | head(gs$opsBarcode)
129 | ```
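
Conceptually, this amounts to taking a fixed-length prefix of each spacer. The base R sketch below illustrates the idea with made-up spacer sequences; the variable names are illustrative and independent of the `GuideSet` above, and this is not the implementation used by `addOpsBarcodes`:

```r
# Hypothetical 20-nt spacers, written 5' to 3'
toy_spacers <- c("GCTGAATTAGCTGTATCGTC",
                 "TTGGATATTCTCGACACAGC")
toy_cycles <- 8

# Keep the first `toy_cycles` nucleotides from the 5' end
toy_barcodes <- substr(toy_spacers, 1, toy_cycles)
toy_barcodes
```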
130 |
131 |
132 | ## Barcode distance matrix
133 |
134 | We can pass our barcodes to the function `getBarcodeDistanceMatrix` to
135 | calculate the nucleotide distance between them. The `dist_method` argument
136 | determines the type of distance to calculate: `"hamming"`, which only
137 | considers substitutions (default) or `"levenstein"`, which also allows for
138 | insertions and deletions.
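
To make the difference between the two distance types concrete, here is a small base R sketch. The `hamming` helper is written for illustration only; the Levenshtein distance is computed with `adist` from the `utils` package. Neither is claimed to be what `getBarcodeDistanceMatrix` uses internally:

```r
# Hamming distance: number of mismatched positions between equal-length strings
hamming <- function(a, b) {
    stopifnot(nchar(a) == nchar(b))
    sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# A single substitution at the last position
hamming("ACGT", "ACGA")

# A barcode shifted by one nucleotide mismatches at every position
hamming("ACGT", "CGTA")

# Levenshtein distance via utils::adist(), which also allows indels
drop(adist("ACGT", "CGTA"))
```

The shifted barcode differs from the original at all four positions under the Hamming distance, but it is only two edits away (one deletion and one insertion) under the Levenshtein distance.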
139 |
140 | As a brief demonstration, let's look at the distances between the first few
141 | barcodes in our `GuideSet`. We set the `binnarize` argument (more on this
142 | parameter later) to `FALSE` to show distances:
143 |
144 | ```{r}
145 | barcodes <- gs$opsBarcode
146 | dist <- getBarcodeDistanceMatrix(barcodes[1:5],
147 | binnarize=FALSE)
148 | dist
149 | ```
150 |
151 | Note that the output is a sparse matrix, so the barcodes along the diagonal
152 | (i.e., compared against themselves) return `.`, or a distance of zero.
153 | To compare one set of barcodes against another, we can pass the other set
154 | to the `targetBarcodes` argument (the former barcode set being passed
155 | to the `queryBarcodes` argument, which is compared against
156 | itself when `targetBarcodes` is `NULL`):
157 |
158 | ```{r}
159 | dist <- getBarcodeDistanceMatrix(barcodes[1:5],
160 | targetBarcodes=barcodes[6:10],
161 | binnarize=FALSE)
162 | dist
163 | ```
164 |
165 | The question we are interested in with respect to barcode distances is
166 | whether the barcodes are sufficiently dissimilar for accurate identification
167 | of spacers during sequencing. The required minimum distance edit (`min_dist_edit`)
168 | depends on the accuracy of various steps in the experiment. Suppose,
169 | as a conservative estimate, that we can expect no more than two edits
170 | per barcode in our example. A `min_dist_edit` of `3` should
171 | suffice. Setting the `binnarize` argument to `TRUE`, and passing
172 | our minimum distance edit value to `min_dist_edit` will binarize the
173 | output, flagging barcodes (with a value of `1`) that are too similar
174 | and should not both be included in our library:
175 |
176 | ```{r}
177 | dist <- getBarcodeDistanceMatrix(barcodes[1:5],
178 | barcodes[6:10],
179 | binnarize=TRUE,
180 | min_dist_edit=3)
181 | dist
182 | ```
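
The binarization step can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's implementation: the barcodes are made up, the `hamming` helper is hypothetical, and a pair is assumed to be flagged when its distance is strictly below the threshold:

```r
# Illustrative Hamming distance between two equal-length strings
hamming <- function(a, b) sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])

query  <- c("AAAAAAAA", "ACGTACGT")
target <- c("AAAAAATT", "TTTTTTTT", "ACGTACGA")

# Pairwise distances: rows are query barcodes, columns are target barcodes
d <- outer(query, target, Vectorize(hamming))
dimnames(d) <- list(query, target)

# Assumption: a pair is flagged (1) when its distance is below the threshold
min_dist <- 3
flagged <- (d < min_dist) * 1
flagged
```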
183 |
184 | Using this function with large sets of barcodes can be taxing on memory.
185 | To manage this, it is recommended to set `splitByChunks=TRUE` and specify
186 | the number of chunks with `n_chunks` (see `?getBarcodeDistanceMatrix`).
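
The general idea behind chunking is to compute the distance matrix one block of rows at a time instead of all at once. The sketch below shows this in base R with toy barcodes; the `hamming` helper and all variable names are hypothetical, and this is not necessarily how `splitByChunks` is implemented:

```r
# Illustrative Hamming distance between two equal-length strings
hamming <- function(a, b) sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])

toy <- c("AAAA", "AAAT", "TTTT", "CCCC", "GGGG")
n_parts <- 2

# Assign each barcode index to one of `n_parts` roughly equal chunks
chunk_id <- cut(seq_along(toy), breaks=n_parts, labels=FALSE)
chunks <- split(seq_along(toy), chunk_id)

# Compute one block of rows at a time, then stack the blocks
blocks <- lapply(chunks, function(idx) {
    outer(toy[idx], toy, Vectorize(hamming))
})
d_chunked <- do.call(rbind, blocks)
d_chunked
```

Each block only needs as many rows as its chunk contains, so the peak size of any intermediate object shrinks as the number of chunks grows.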
187 |
188 |
189 | ## Designing OPS libraries
190 |
191 | The `designOpsLibrary` function allows users to perform a complete
192 | end-to-end OPS library design. We will design our library with 4
193 | gRNAs per gene using the `n_guides` and `gene_field` (to identify gRNAs
194 | by gene target) parameters. We will use the same distance method as in
195 | the example above, with a stricter minimum distance edit of 5.
196 |
197 | Note that this function requires a `rank` column in the metadata columns of the `GuideSet` object so that the best guides are selected first. For the purpose of this tutorial, we will create a mock rank column. To learn how to rank gRNAs in practice, see the [Cas9 gRNA design tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9).
198 |
199 | ```{r}
200 | gs$rank <- seq_along(gs)
201 | ```
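
As a hedged illustration of what a non-mock ranking might look like, the base R sketch below converts a vector of scores into ranks. The scores are invented, and the choice of ranking by a single score is an assumption made only for this example:

```r
# Hypothetical on-target scores for five gRNAs (higher is better)
toy_scores <- c(0.42, 0.91, 0.67, 0.91, 0.15)

# Rank 1 is the best guide; ties are broken by order of appearance
toy_rank <- rank(-toy_scores, ties.method="first")
toy_rank
```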
202 |
203 | NOTE: it is advised to first complete other steps in gRNA design
204 | (annotating, filtering, and ranking gRNAs in the `GuideSet`) prior to
205 | using this function; this will ensure the library contains the best
206 | gRNAs. As this example did not rank gRNAs, we are notified that
207 | rankings are assigned by the order in which gRNAs appear in our input.
208 |
209 | ```{r}
210 | opsLibrary <- designOpsLibrary(gs,
211 | n_cycles=n_cycles,
212 | n_guides=4,
213 | gene_field="gene_symbol",
214 | min_dist_edit=5,
215 | dist_method="hamming")
216 | opsLibrary
217 | ```
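
To build intuition for what such a design involves, here is a minimal greedy selection sketch in base R. It is an assumption-laden illustration, not the algorithm used by `designOpsLibrary`: guides are visited in rank order, and a guide is accepted if its gene still needs guides and its barcode is at least `min_dist` away from every barcode already accepted. The `hamming` and `greedy_pick` helpers and the toy data are hypothetical:

```r
# Illustrative Hamming distance between two equal-length strings
hamming <- function(a, b) sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])

greedy_pick <- function(barcodes, genes, ranks, n_guides, min_dist) {
    keep <- integer(0)
    for (i in order(ranks)) {
        # Skip genes that already have enough guides
        if (sum(genes[keep] == genes[i]) >= n_guides) next
        # Accept only if far enough from every barcode kept so far
        far_enough <- all(vapply(keep, function(j) {
            hamming(barcodes[i], barcodes[j]) >= min_dist
        }, logical(1)))
        if (far_enough) keep <- c(keep, i)
    }
    keep
}

toy_barcodes <- c("AAAA", "AAAT", "TTTT", "CCCC")
toy_genes <- c("g1", "g1", "g2", "g2")

# Indices of the selected guides, in order of selection
picked <- greedy_pick(toy_barcodes, toy_genes, ranks=1:4,
                      n_guides=2, min_dist=2)
picked
```

In this toy example the second guide is rejected because its barcode is only one substitution away from the first, so gene `g1` ends up with a single guide.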
218 |
219 |
220 | ## Adding gRNAs to an existing OPS library
221 |
222 | Suppose we later wish to add another gene target to our library,
223 | but also want to retain the gRNAs that are currently in our library.
224 | We can append these additional gRNAs with the `updateOpsLibrary` function.
225 | This function has the same parameters as `designOpsLibrary`, with an
226 | additional `opsLibrary` argument to which we pass our original OPS library.
227 |
228 | To demonstrate, we will add the MRAS gene to our library. We first
229 | construct the `GuideSet` for MRAS:
230 |
231 | ```{r}
232 | target_region <- queryTxObject(txdb_human,
233 | featureType="cds",
234 | queryColumn="gene_symbol",
235 | queryValue="MRAS")
236 | gs_mras <- findSpacers(target_region,
237 | crisprNuclease=SpCas9,
238 | bsgenome=BSgenome.Hsapiens.UCSC.hg38)
239 | gs_mras$gene_symbol <- "MRAS"
240 | gs_mras$gene_id <- "ENSG00000158186"
241 | gs_mras$rank <- seq_along(gs_mras)
242 | ```
243 |
244 | then add barcodes:
245 |
246 | ```{r}
247 | ## add OPS barcodes
248 | gs_mras <- addOpsBarcodes(gs_mras,
249 | n_cycles=n_cycles)
250 | ```
251 |
252 | which we then pass with our other parameters to `updateOpsLibrary`:
253 |
254 | ```{r}
255 | opsLibrary <- updateOpsLibrary(opsLibrary,
256 | gs_mras,
257 | n_cycles=n_cycles,
258 | n_guides=4,
259 | gene_field="gene_symbol",
260 | min_dist_edit=5,
261 | dist_method="hamming")
262 | opsLibrary
263 | ```
264 |
265 |
266 |
267 | # Session Info
268 |
269 | ```{r}
270 | sessionInfo()
271 | ```
272 |
273 |
274 | # References
275 |
--------------------------------------------------------------------------------
/Design_CRISPRkd_CasRx/references.bib:
--------------------------------------------------------------------------------
1 | @article{casrxrf,
2 | title={Massively parallel Cas13 screens reveal principles for guide RNA design},
3 | author={Wessels, Hans-Hermann and M{\'e}ndez-Mancilla, Alejandro and Guo, Xinyi and Legut, Mateusz and Daniloski, Zharko and Sanjana, Neville E},
4 | journal={Nature biotechnology},
5 | volume={38},
6 | number={6},
7 | pages={722--727},
8 | year={2020},
9 | publisher={Nature Publishing Group}
10 | }
11 |
12 |
13 | @article{ops,
14 | title={Optical pooled screens in human cells},
15 | author={Feldman, David and Singh, Avtar and Schmid-Burgk, Jonathan L and Carlson, Rebecca J and Mezger, Anja and Garrity, Anthony J and Zhang, Feng and Blainey, Paul C},
16 | journal={Cell},
17 | volume={179},
18 | number={3},
19 | pages={787--799},
20 | year={2019},
21 | publisher={Elsevier}
22 | }
23 |
24 | @article{cas13d,
25 | author = {Konermann, Silvana and Lotfy, Peter and Brideau, Nicholas J and Oki, Jennifer and Shokhirev, Maxim N and Hsu, Patrick D},
26 | journal = {Cell},
27 | number = {3},
28 | pages = {665--676},
29 | publisher = {Elsevier},
30 | title = {Transcriptome engineering with RNA-targeting type VI-D CRISPR effectors},
31 | volume = {173},
32 | year = {2018}}
33 |
34 | @article{koblan2018improving,
35 | title={Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction},
36 | author={Koblan, Luke W and Doman, Jordan L and Wilson, Christopher and Levy, Jonathan M and Tay, Tristan and Newby, Gregory A and Maianti, Juan Pablo and Raguram, Aditya and Liu, David R},
37 | journal={Nature biotechnology},
38 | volume={36},
39 | number={9},
40 | pages={843--846},
41 | year={2018},
42 | publisher={Nature Publishing Group}
43 | }
44 |
45 | @article{sanson2018optimized,
46 | title={Optimized libraries for CRISPR-Cas9 genetic screens with multiple modalities},
47 | author={Sanson, Kendall R and Hanna, Ruth E and Hegde, Mudra and Donovan, Katherine F and Strand, Christine and Sullender, Meagan E and Vaimberg, Emma W and Goodale, Amy and Root, David E and Piccioni, Federica and others},
48 | journal={Nature communications},
49 | volume={9},
50 | number={1},
51 | pages={1--15},
52 | year={2018},
53 | publisher={Nature Publishing Group}
54 | }
55 |
56 | @article{langmead2009bowtie,
57 | Abstract = {Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source http://bowtie.cbcb.umd.edu.},
58 | Author = {Langmead, Ben and Trapnell, Cole and Pop, Mihai and Salzberg, Steven L.},
59 | Da = {2009/03/04},
60 | Doi = {10.1186/gb-2009-10-3-r25},
61 | Id = {Langmead2009},
62 | Isbn = {1474-760X},
63 | Journal = {Genome Biology},
64 | Number = {3},
65 | Pages = {R25},
66 | Title = {Ultrafast and memory-efficient alignment of short DNA sequences to the human genome},
67 | Ty = {JOUR},
68 | Url = {https://doi.org/10.1186/gb-2009-10-3-r25},
69 | Volume = {10},
70 | Year = {2009}
71 | }
72 |
73 |
74 | @article{lindel,
75 | title={Massively parallel profiling and predictive modeling of the outcomes of CRISPR/Cas9-mediated double-strand break repair},
76 | author={Chen, Wei and McKenna, Aaron and Schreiber, Jacob and Haeussler, Maximilian and Yin, Yi and Agarwal, Vikram and Noble, William Stafford and Shendure, Jay},
77 | journal={Nucleic acids research},
78 | volume={47},
79 | number={15},
80 | pages={7989--8003},
81 | year={2019},
82 | publisher={Oxford University Press}
83 | }
84 |
85 | @article{azimuth,
86 | title={Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9},
87 | author={Doench, John G and Fusi, Nicolo and Sullender, Meagan and Hegde, Mudra and Vaimberg, Emma W and Donovan, Katherine F and Smith, Ian and Tothova, Zuzana and Wilen, Craig and Orchard, Robert and others},
88 | journal={Nature biotechnology},
89 | volume={34},
90 | number={2},
91 | pages={184},
92 | year={2016},
93 | publisher={Nature Publishing Group}
94 | }
95 |
96 | @article{deepcas9,
97 | title={Optimized CRISPR guide RNA design for two high-fidelity Cas9 variants by deep learning},
98 | author={Wang, Daqi and Zhang, Chengdong and Wang, Bei and Li, Bin and Wang, Qiang and Liu, Dong and Wang, Hongyan and Zhou, Yan and Shi, Leming and Lan, Feng and others},
99 | journal={Nature communications},
100 | volume={10},
101 | number={1},
102 | pages={1--14},
103 | year={2019},
104 | publisher={Nature Publishing Group}
105 | }
106 |
107 | @article{deepcpf1,
108 | title={Deep learning improves prediction of CRISPR--Cpf1 guide RNA activity},
109 | author={Kim, Hui Kwon and Min, Seonwoo and Song, Myungjae and Jung, Soobin and Choi, Jae Woo and Kim, Younggwang and Lee, Sangeun and Yoon, Sungroh and Kim, Hyongbum Henry},
110 | journal={Nature biotechnology},
111 | volume={36},
112 | number={3},
113 | pages={239},
114 | year={2018},
115 | publisher={Nature Publishing Group}
116 | }
117 |
118 | @article{perturbseq,
119 | title={A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response},
120 | author={Adamson, Britt and Norman, Thomas M and Jost, Marco and Cho, Min Y and Nu{\~n}ez, James K and Chen, Yuwen and Villalta, Jacqueline E and Gilbert, Luke A and Horlbeck, Max A and Hein, Marco Y and others},
121 | journal={Cell},
122 | volume={167},
123 | number={7},
124 | pages={1867--1882},
125 | year={2016},
126 | publisher={Elsevier}
127 | }
128 |
129 | @article{cropseq,
130 | title={Pooled CRISPR screening with single-cell transcriptome readout},
131 | author={Datlinger, Paul and Rendeiro, Andr{\'e} F and Schmidl, Christian and Krausgruber, Thomas and Traxler, Peter and Klughammer, Johanna and Schuster, Linda C and Kuchler, Amelie and Alpar, Donat and Bock, Christoph},
132 | journal={Nature methods},
133 | volume={14},
134 | number={3},
135 | pages={297},
136 | year={2017},
137 | publisher={Nature Publishing Group}
138 | }
139 |
140 | @article{crispracrisprireview,
141 | title={CRISPRi and CRISPRa screens in mammalian cells for precision biology and medicine},
142 | author={Kampmann, Martin},
143 | journal={ACS chemical biology},
144 | volume={13},
145 | number={2},
146 | pages={406--416},
147 | year={2018},
148 | publisher={ACS Publications}
149 | }
150 |
151 |
152 |
153 | @article{crispri,
154 | title={CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes},
155 | author={Gilbert, Luke A and Larson, Matthew H and Morsut, Leonardo and Liu, Zairan and Brar, Gloria A and Torres, Sandra E and Stern-Ginossar, Noam and Brandman, Onn and Whitehead, Evan H and Doudna, Jennifer A and others},
156 | journal={Cell},
157 | volume={154},
158 | number={2},
159 | pages={442--451},
160 | year={2013},
161 | publisher={Elsevier}
162 | }
163 |
164 |
165 | @article{sam,
166 | title={Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex},
167 | author={Konermann, Silvana and Brigham, Mark D and Trevino, Alexandro E and Joung, Julia and Abudayyeh, Omar O and Barcena, Clea and Hsu, Patrick D and Habib, Naomi and Gootenberg, Jonathan S and Nishimasu, Hiroshi and others},
168 | journal={Nature},
169 | volume={517},
170 | number={7536},
171 | pages={583},
172 | year={2015},
173 | publisher={Nature Publishing Group}
174 | }
175 |
176 | @article{fortin2019,
177 | title={Multiple-gene targeting and mismatch tolerance can confound analysis of genome-wide pooled CRISPR screens},
178 | author={Fortin, Jean-Philippe and Tan, Jenille and Gascoigne, Karen E and Haverty, Peter M and Forrest, William F and Costa, Michael R and Martin, Scott E},
179 | journal={Genome biology},
180 | volume={20},
181 | number={1},
182 | pages={21},
183 | year={2019},
184 | publisher={Springer}
185 | }
186 |
187 |
188 |
200 |
201 |
202 | @article{horlbeck2016compact,
203 | title={Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation},
204 | author={Horlbeck, Max A and Gilbert, Luke A and Villalta, Jacqueline E and Adamson, Britt and Pak, Ryan A and Chen, Yuwen and Fields, Alexander P and Park, Chong Yon and Corn, Jacob E and Kampmann, Martin and others},
205 | journal={Elife},
206 | volume={5},
207 | pages={e19760},
208 | year={2016},
209 | publisher={eLife Sciences Publications Limited}
210 | }
211 |
212 |
213 | @article{ceres,
214 | title={Computational correction of copy number effect improves specificity of CRISPR--Cas9 essentiality screens in cancer cells},
215 | author={Meyers, Robin M and Bryan, Jordan G and McFarland, James M and Weir, Barbara A and Sizemore, Ann E and Xu, Han and Dharia, Neekesh V and Montgomery, Phillip G and Cowley, Glenn S and Pantel, Sasha and others},
216 | journal={Nature genetics},
217 | volume={49},
218 | number={12},
219 | pages={1779--1784},
220 | year={2017},
221 | publisher={Nature Publishing Group}
222 | }
223 |
224 |
225 | @article{score,
226 | title={Prioritization of cancer therapeutic targets using CRISPR--Cas9 screens},
227 | author={Behan, Fiona M and Iorio, Francesco and Picco, Gabriele and Gon{\c{c}}alves, Emanuel and Beaver, Charlotte M and Migliardi, Giorgia and Santos, Rita and Rao, Yanhua and Sassi, Francesco and Pinnelli, Marika and others},
228 | journal={Nature},
229 | volume={568},
230 | number={7753},
231 | pages={511},
232 | year={2019},
233 | publisher={Nature Publishing Group}
234 | }
235 |
236 |
237 |
238 | @article{mit,
239 | title={DNA targeting specificity of RNA-guided Cas9 nucleases},
240 | author={Hsu, Patrick D and Scott, David A and Weinstein, Joshua A and Ran, F Ann and Konermann, Silvana and Agarwala, Vineeta and Li, Yinqing and Fine, Eli J and Wu, Xuebing and Shalem, Ophir and others},
241 | journal={Nature biotechnology},
242 | volume={31},
243 | number={9},
244 | pages={827},
245 | year={2013},
246 | publisher={Nature Publishing Group}
247 | }
248 |
249 |
250 |
251 |
--------------------------------------------------------------------------------
/Design_CRISPRa/references.bib:
--------------------------------------------------------------------------------
1 | %% This BibTeX bibliography file was created using BibDesk.
2 | %% https://bibdesk.sourceforge.io/
3 |
4 | %% Created for Luke Hoberecht at 2022-08-02 15:33:04 -0700
5 |
6 |
7 | %% Saved with string encoding Unicode (UTF-8)
8 |
9 |
10 |
11 | @article{crisprai,
12 | author = {Horlbeck, Max A and Gilbert, Luke A and Villalta, Jacqueline E and Adamson, Britt and Pak, Ryan A and Chen, Yuwen and Fields, Alexander P and Park, Chong Yon and Corn, Jacob E and Kampmann, Martin and others},
13 | date-added = {2022-08-02 15:33:01 -0700},
14 | date-modified = {2022-08-02 15:33:01 -0700},
15 | journal = {elife},
16 | publisher = {eLife Sciences Publications, Ltd},
17 | title = {Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation},
18 | volume = {5},
19 | year = {2016}}
20 |
21 | @article{ops,
22 | author = {Feldman, David and Singh, Avtar and Schmid-Burgk, Jonathan L and Carlson, Rebecca J and Mezger, Anja and Garrity, Anthony J and Zhang, Feng and Blainey, Paul C},
23 | journal = {Cell},
24 | number = {3},
25 | pages = {787--799},
26 | publisher = {Elsevier},
27 | title = {Optical pooled screens in human cells},
28 | volume = {179},
29 | year = {2019}}
30 |
31 | @article{cas13d,
32 | author = {Konermann, Silvana and Lotfy, Peter and Brideau, Nicholas J and Oki, Jennifer and Shokhirev, Maxim N and Hsu, Patrick D},
33 | journal = {Cell},
34 | number = {3},
35 | pages = {665--676},
36 | publisher = {Elsevier},
37 | title = {Transcriptome engineering with RNA-targeting type VI-D CRISPR effectors},
38 | volume = {173},
39 | year = {2018}}
40 |
41 | @article{koblan2018improving,
42 | author = {Koblan, Luke W and Doman, Jordan L and Wilson, Christopher and Levy, Jonathan M and Tay, Tristan and Newby, Gregory A and Maianti, Juan Pablo and Raguram, Aditya and Liu, David R},
43 | journal = {Nature biotechnology},
44 | number = {9},
45 | pages = {843--846},
46 | publisher = {Nature Publishing Group},
47 | title = {Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction},
48 | volume = {36},
49 | year = {2018}}
50 |
51 | @article{sanson2018optimized,
52 | author = {Sanson, Kendall R and Hanna, Ruth E and Hegde, Mudra and Donovan, Katherine F and Strand, Christine and Sullender, Meagan E and Vaimberg, Emma W and Goodale, Amy and Root, David E and Piccioni, Federica and others},
53 | journal = {Nature communications},
54 | number = {1},
55 | pages = {1--15},
56 | publisher = {Nature Publishing Group},
57 | title = {Optimized libraries for CRISPR-Cas9 genetic screens with multiple modalities},
58 | volume = {9},
59 | year = {2018}}
60 |
61 | @article{langmead2009bowtie,
62 | abstract = {Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source http://bowtie.cbcb.umd.edu.},
63 | author = {Langmead, Ben and Trapnell, Cole and Pop, Mihai and Salzberg, Steven L.},
64 | da = {2009/03/04},
65 | doi = {10.1186/gb-2009-10-3-r25},
66 | id = {Langmead2009},
67 | isbn = {1474-760X},
68 | journal = {Genome Biology},
69 | number = {3},
70 | pages = {R25},
71 | title = {Ultrafast and memory-efficient alignment of short DNA sequences to the human genome},
72 | ty = {JOUR},
73 | url = {https://doi.org/10.1186/gb-2009-10-3-r25},
74 | volume = {10},
75 | year = {2009},
76 | bdsk-url-1 = {https://doi.org/10.1186/gb-2009-10-3-r25}}
77 |
78 | @article{lindel,
79 | author = {Chen, Wei and McKenna, Aaron and Schreiber, Jacob and Haeussler, Maximilian and Yin, Yi and Agarwal, Vikram and Noble, William Stafford and Shendure, Jay},
80 | journal = {Nucleic acids research},
81 | number = {15},
82 | pages = {7989--8003},
83 | publisher = {Oxford University Press},
84 | title = {Massively parallel profiling and predictive modeling of the outcomes of CRISPR/Cas9-mediated double-strand break repair},
85 | volume = {47},
86 | year = {2019}}
87 |
88 | @article{azimuth,
89 | author = {Doench, John G and Fusi, Nicolo and Sullender, Meagan and Hegde, Mudra and Vaimberg, Emma W and Donovan, Katherine F and Smith, Ian and Tothova, Zuzana and Wilen, Craig and Orchard, Robert and others},
90 | journal = {Nature biotechnology},
91 | number = {2},
92 | pages = {184},
93 | publisher = {Nature Publishing Group},
94 | title = {Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9},
95 | volume = {34},
96 | year = {2016}}
97 |
98 | @article{deepcas9,
99 | author = {Wang, Daqi and Zhang, Chengdong and Wang, Bei and Li, Bin and Wang, Qiang and Liu, Dong and Wang, Hongyan and Zhou, Yan and Shi, Leming and Lan, Feng and others},
100 | journal = {Nature communications},
101 | number = {1},
102 | pages = {1--14},
103 | publisher = {Nature Publishing Group},
104 | title = {Optimized CRISPR guide RNA design for two high-fidelity Cas9 variants by deep learning},
105 | volume = {10},
106 | year = {2019}}
107 |
108 | @article{deepcpf1,
109 | author = {Kim, Hui Kwon and Min, Seonwoo and Song, Myungjae and Jung, Soobin and Choi, Jae Woo and Kim, Younggwang and Lee, Sangeun and Yoon, Sungroh and Kim, Hyongbum Henry},
110 | journal = {Nature biotechnology},
111 | number = {3},
112 | pages = {239},
113 | publisher = {Nature Publishing Group},
114 | title = {Deep learning improves prediction of CRISPR--Cpf1 guide RNA activity},
115 | volume = {36},
116 | year = {2018}}
117 |
118 | @article{perturbseq,
119 | author = {Adamson, Britt and Norman, Thomas M and Jost, Marco and Cho, Min Y and Nu{\~n}ez, James K and Chen, Yuwen and Villalta, Jacqueline E and Gilbert, Luke A and Horlbeck, Max A and Hein, Marco Y and others},
120 | journal = {Cell},
121 | number = {7},
122 | pages = {1867--1882},
123 | publisher = {Elsevier},
124 | title = {A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response},
125 | volume = {167},
126 | year = {2016}}
127 |
128 | @article{cropseq,
129 | author = {Datlinger, Paul and Rendeiro, Andr{\'e} F and Schmidl, Christian and Krausgruber, Thomas and Traxler, Peter and Klughammer, Johanna and Schuster, Linda C and Kuchler, Amelie and Alpar, Donat and Bock, Christoph},
130 | journal = {Nature methods},
131 | number = {3},
132 | pages = {297},
133 | publisher = {Nature Publishing Group},
134 | title = {Pooled CRISPR screening with single-cell transcriptome readout},
135 | volume = {14},
136 | year = {2017}}
137 |
138 | @article{crispracrisprireview,
139 | author = {Kampmann, Martin},
140 | journal = {ACS chemical biology},
141 | number = {2},
142 | pages = {406--416},
143 | publisher = {ACS Publications},
144 | title = {CRISPRi and CRISPRa screens in mammalian cells for precision biology and medicine},
145 | volume = {13},
146 | year = {2018}}
147 |
148 | @article{crispri,
149 | author = {Gilbert, Luke A and Larson, Matthew H and Morsut, Leonardo and Liu, Zairan and Brar, Gloria A and Torres, Sandra E and Stern-Ginossar, Noam and Brandman, Onn and Whitehead, Evan H and Doudna, Jennifer A and others},
150 | journal = {Cell},
151 | number = {2},
152 | pages = {442--451},
153 | publisher = {Elsevier},
154 | title = {CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes},
155 | volume = {154},
156 | year = {2013}}
157 |
158 | @article{sam,
159 | author = {Konermann, Silvana and Brigham, Mark D and Trevino, Alexandro E and Joung, Julia and Abudayyeh, Omar O and Barcena, Clea and Hsu, Patrick D and Habib, Naomi and Gootenberg, Jonathan S and Nishimasu, Hiroshi and others},
160 | journal = {Nature},
161 | number = {7536},
162 | pages = {583},
163 | publisher = {Nature Publishing Group},
164 | title = {Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex},
165 | volume = {517},
166 | year = {2015}}
167 |
168 | @article{fortin2019,
169 | author = {Fortin, Jean-Philippe and Tan, Jenille and Gascoigne, Karen E and Haverty, Peter M and Forrest, William F and Costa, Michael R and Martin, Scott E},
170 | journal = {Genome biology},
171 | number = {1},
172 | pages = {21},
173 | publisher = {Springer},
174 | title = {Multiple-gene targeting and mismatch tolerance can confound analysis of genome-wide pooled CRISPR screens},
175 | volume = {20},
176 | year = {2019}}
177 |
188 | @article{horlbeck2016compact,
189 | author = {Horlbeck, Max A and Gilbert, Luke A and Villalta, Jacqueline E and Adamson, Britt and Pak, Ryan A and Chen, Yuwen and Fields, Alexander P and Park, Chong Yon and Corn, Jacob E and Kampmann, Martin and others},
190 | journal = {Elife},
191 | pages = {e19760},
192 | publisher = {eLife Sciences Publications Limited},
193 | title = {Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation},
194 | volume = {5},
195 | year = {2016}}
196 |
197 | @article{ceres,
198 | author = {Meyers, Robin M and Bryan, Jordan G and McFarland, James M and Weir, Barbara A and Sizemore, Ann E and Xu, Han and Dharia, Neekesh V and Montgomery, Phillip G and Cowley, Glenn S and Pantel, Sasha and others},
199 | journal = {Nature genetics},
200 | number = {12},
201 | pages = {1779--1784},
202 | publisher = {Nature Publishing Group},
203 | title = {Computational correction of copy number effect improves specificity of CRISPR--Cas9 essentiality screens in cancer cells},
204 | volume = {49},
205 | year = {2017}}
206 |
207 | @article{score,
208 | author = {Behan, Fiona M and Iorio, Francesco and Picco, Gabriele and Gon{\c{c}}alves, Emanuel and Beaver, Charlotte M and Migliardi, Giorgia and Santos, Rita and Rao, Yanhua and Sassi, Francesco and Pinnelli, Marika and others},
209 | journal = {Nature},
210 | number = {7753},
211 | pages = {511},
212 | publisher = {Nature Publishing Group},
213 | title = {Prioritization of cancer therapeutic targets using CRISPR--Cas9 screens},
214 | volume = {568},
215 | year = {2019}}
216 |
217 | @article{mit,
218 | author = {Hsu, Patrick D and Scott, David A and Weinstein, Joshua A and Ran, F Ann and Konermann, Silvana and Agarwala, Vineeta and Li, Yinqing and Fine, Eli J and Wu, Xuebing and Shalem, Ophir and others},
219 | journal = {Nature biotechnology},
220 | number = {9},
221 | pages = {827},
222 | publisher = {Nature Publishing Group},
223 | title = {DNA targeting specificity of RNA-guided Cas9 nucleases},
224 | volume = {31},
225 | year = {2013}}
226 |
--------------------------------------------------------------------------------
/Design_CRISPRi/references.bib:
--------------------------------------------------------------------------------
1 | %% This BibTeX bibliography file was created using BibDesk.
2 | %% https://bibdesk.sourceforge.io/
3 |
4 | %% Created for Luke Hoberecht at 2022-08-02 15:33:04 -0700
5 |
6 |
7 | %% Saved with string encoding Unicode (UTF-8)
8 |
9 |
10 |
11 | @article{crisprai,
12 | author = {Horlbeck, Max A and Gilbert, Luke A and Villalta, Jacqueline E and Adamson, Britt and Pak, Ryan A and Chen, Yuwen and Fields, Alexander P and Park, Chong Yon and Corn, Jacob E and Kampmann, Martin and others},
13 | date-added = {2022-08-02 15:33:01 -0700},
14 | date-modified = {2022-08-02 15:33:01 -0700},
15 | journal = {eLife},
16 | publisher = {eLife Sciences Publications, Ltd},
17 | title = {Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation},
18 | volume = {5},
19 | year = {2016}}
20 |
21 | @article{ops,
22 | author = {Feldman, David and Singh, Avtar and Schmid-Burgk, Jonathan L and Carlson, Rebecca J and Mezger, Anja and Garrity, Anthony J and Zhang, Feng and Blainey, Paul C},
23 | journal = {Cell},
24 | number = {3},
25 | pages = {787--799},
26 | publisher = {Elsevier},
27 | title = {Optical pooled screens in human cells},
28 | volume = {179},
29 | year = {2019}}
30 |
31 | @article{cas13d,
32 | author = {Konermann, Silvana and Lotfy, Peter and Brideau, Nicholas J and Oki, Jennifer and Shokhirev, Maxim N and Hsu, Patrick D},
33 | journal = {Cell},
34 | number = {3},
35 | pages = {665--676},
36 | publisher = {Elsevier},
37 | title = {Transcriptome engineering with RNA-targeting type VI-D CRISPR effectors},
38 | volume = {173},
39 | year = {2018}}
40 |
41 | @article{koblan2018improving,
42 | author = {Koblan, Luke W and Doman, Jordan L and Wilson, Christopher and Levy, Jonathan M and Tay, Tristan and Newby, Gregory A and Maianti, Juan Pablo and Raguram, Aditya and Liu, David R},
43 | journal = {Nature biotechnology},
44 | number = {9},
45 | pages = {843--846},
46 | publisher = {Nature Publishing Group},
47 | title = {Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction},
48 | volume = {36},
49 | year = {2018}}
50 |
51 | @article{sanson2018optimized,
52 | author = {Sanson, Kendall R and Hanna, Ruth E and Hegde, Mudra and Donovan, Katherine F and Strand, Christine and Sullender, Meagan E and Vaimberg, Emma W and Goodale, Amy and Root, David E and Piccioni, Federica and others},
53 | journal = {Nature communications},
54 | number = {1},
55 | pages = {1--15},
56 | publisher = {Nature Publishing Group},
57 | title = {Optimized libraries for CRISPR-Cas9 genetic screens with multiple modalities},
58 | volume = {9},
59 | year = {2018}}
60 |
61 | @article{langmead2009bowtie,
62 | abstract = {Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source http://bowtie.cbcb.umd.edu.},
63 | author = {Langmead, Ben and Trapnell, Cole and Pop, Mihai and Salzberg, Steven L.},
64 | da = {2009/03/04},
65 | doi = {10.1186/gb-2009-10-3-r25},
66 | id = {Langmead2009},
67 | isbn = {1474-760X},
68 | journal = {Genome Biology},
69 | number = {3},
70 | pages = {R25},
71 | title = {Ultrafast and memory-efficient alignment of short DNA sequences to the human genome},
72 | ty = {JOUR},
73 | url = {https://doi.org/10.1186/gb-2009-10-3-r25},
74 | volume = {10},
75 | year = {2009},
76 | bdsk-url-1 = {https://doi.org/10.1186/gb-2009-10-3-r25}}
77 |
78 | @article{lindel,
79 | author = {Chen, Wei and McKenna, Aaron and Schreiber, Jacob and Haeussler, Maximilian and Yin, Yi and Agarwal, Vikram and Noble, William Stafford and Shendure, Jay},
80 | journal = {Nucleic acids research},
81 | number = {15},
82 | pages = {7989--8003},
83 | publisher = {Oxford University Press},
84 | title = {Massively parallel profiling and predictive modeling of the outcomes of CRISPR/Cas9-mediated double-strand break repair},
85 | volume = {47},
86 | year = {2019}}
87 |
88 | @article{azimuth,
89 | author = {Doench, John G and Fusi, Nicolo and Sullender, Meagan and Hegde, Mudra and Vaimberg, Emma W and Donovan, Katherine F and Smith, Ian and Tothova, Zuzana and Wilen, Craig and Orchard, Robert and others},
90 | journal = {Nature biotechnology},
91 | number = {2},
92 | pages = {184},
93 | publisher = {Nature Publishing Group},
94 | title = {Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9},
95 | volume = {34},
96 | year = {2016}}
97 |
98 | @article{deepcas9,
99 | author = {Wang, Daqi and Zhang, Chengdong and Wang, Bei and Li, Bin and Wang, Qiang and Liu, Dong and Wang, Hongyan and Zhou, Yan and Shi, Leming and Lan, Feng and others},
100 | journal = {Nature communications},
101 | number = {1},
102 | pages = {1--14},
103 | publisher = {Nature Publishing Group},
104 | title = {Optimized CRISPR guide RNA design for two high-fidelity Cas9 variants by deep learning},
105 | volume = {10},
106 | year = {2019}}
107 |
108 | @article{deepcpf1,
109 | author = {Kim, Hui Kwon and Min, Seonwoo and Song, Myungjae and Jung, Soobin and Choi, Jae Woo and Kim, Younggwang and Lee, Sangeun and Yoon, Sungroh and Kim, Hyongbum Henry},
110 | journal = {Nature biotechnology},
111 | number = {3},
112 | pages = {239},
113 | publisher = {Nature Publishing Group},
114 | title = {Deep learning improves prediction of CRISPR--Cpf1 guide RNA activity},
115 | volume = {36},
116 | year = {2018}}
117 |
118 | @article{perturbseq,
119 | author = {Adamson, Britt and Norman, Thomas M and Jost, Marco and Cho, Min Y and Nu{\~n}ez, James K and Chen, Yuwen and Villalta, Jacqueline E and Gilbert, Luke A and Horlbeck, Max A and Hein, Marco Y and others},
120 | journal = {Cell},
121 | number = {7},
122 | pages = {1867--1882},
123 | publisher = {Elsevier},
124 | title = {A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response},
125 | volume = {167},
126 | year = {2016}}
127 |
128 | @article{cropseq,
129 | author = {Datlinger, Paul and Rendeiro, Andr{\'e} F and Schmidl, Christian and Krausgruber, Thomas and Traxler, Peter and Klughammer, Johanna and Schuster, Linda C and Kuchler, Amelie and Alpar, Donat and Bock, Christoph},
130 | journal = {Nature methods},
131 | number = {3},
132 | pages = {297},
133 | publisher = {Nature Publishing Group},
134 | title = {Pooled CRISPR screening with single-cell transcriptome readout},
135 | volume = {14},
136 | year = {2017}}
137 |
138 | @article{crispracrisprireview,
139 | author = {Kampmann, Martin},
140 | journal = {ACS chemical biology},
141 | number = {2},
142 | pages = {406--416},
143 | publisher = {ACS Publications},
144 | title = {CRISPRi and CRISPRa screens in mammalian cells for precision biology and medicine},
145 | volume = {13},
146 | year = {2018}}
147 |
148 | @article{crispri,
149 | author = {Gilbert, Luke A and Larson, Matthew H and Morsut, Leonardo and Liu, Zairan and Brar, Gloria A and Torres, Sandra E and Stern-Ginossar, Noam and Brandman, Onn and Whitehead, Evan H and Doudna, Jennifer A and others},
150 | journal = {Cell},
151 | number = {2},
152 | pages = {442--451},
153 | publisher = {Elsevier},
154 | title = {CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes},
155 | volume = {154},
156 | year = {2013}}
157 |
158 | @article{sam,
159 | author = {Konermann, Silvana and Brigham, Mark D and Trevino, Alexandro E and Joung, Julia and Abudayyeh, Omar O and Barcena, Clea and Hsu, Patrick D and Habib, Naomi and Gootenberg, Jonathan S and Nishimasu, Hiroshi and others},
160 | journal = {Nature},
161 | number = {7536},
162 | pages = {583},
163 | publisher = {Nature Publishing Group},
164 | title = {Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex},
165 | volume = {517},
166 | year = {2015}}
167 |
168 | @article{fortin2019,
169 | author = {Fortin, Jean-Philippe and Tan, Jenille and Gascoigne, Karen E and Haverty, Peter M and Forrest, William F and Costa, Michael R and Martin, Scott E},
170 | journal = {Genome biology},
171 | number = {1},
172 | pages = {21},
173 | publisher = {Springer},
174 | title = {Multiple-gene targeting and mismatch tolerance can confound analysis of genome-wide pooled CRISPR screens},
175 | volume = {20},
176 | year = {2019}}
177 |
188 | @article{horlbeck2016compact,
189 | author = {Horlbeck, Max A and Gilbert, Luke A and Villalta, Jacqueline E and Adamson, Britt and Pak, Ryan A and Chen, Yuwen and Fields, Alexander P and Park, Chong Yon and Corn, Jacob E and Kampmann, Martin and others},
190 | journal = {eLife},
191 | pages = {e19760},
192 | publisher = {eLife Sciences Publications Limited},
193 | title = {Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation},
194 | volume = {5},
195 | year = {2016}}
196 |
197 | @article{ceres,
198 | author = {Meyers, Robin M and Bryan, Jordan G and McFarland, James M and Weir, Barbara A and Sizemore, Ann E and Xu, Han and Dharia, Neekesh V and Montgomery, Phillip G and Cowley, Glenn S and Pantel, Sasha and others},
199 | journal = {Nature genetics},
200 | number = {12},
201 | pages = {1779--1784},
202 | publisher = {Nature Publishing Group},
203 | title = {Computational correction of copy number effect improves specificity of CRISPR--Cas9 essentiality screens in cancer cells},
204 | volume = {49},
205 | year = {2017}}
206 |
207 | @article{score,
208 | author = {Behan, Fiona M and Iorio, Francesco and Picco, Gabriele and Gon{\c{c}}alves, Emanuel and Beaver, Charlotte M and Migliardi, Giorgia and Santos, Rita and Rao, Yanhua and Sassi, Francesco and Pinnelli, Marika and others},
209 | journal = {Nature},
210 | number = {7753},
211 | pages = {511},
212 | publisher = {Nature Publishing Group},
213 | title = {Prioritization of cancer therapeutic targets using CRISPR--Cas9 screens},
214 | volume = {568},
215 | year = {2019}}
216 |
217 | @article{mit,
218 | author = {Hsu, Patrick D and Scott, David A and Weinstein, Joshua A and Ran, F Ann and Konermann, Silvana and Agarwala, Vineeta and Li, Yinqing and Fine, Eli J and Wu, Xuebing and Shalem, Ophir and others},
219 | journal = {Nature biotechnology},
220 | number = {9},
221 | pages = {827},
222 | publisher = {Nature Publishing Group},
223 | title = {DNA targeting specificity of RNA-guided Cas9 nucleases},
224 | volume = {31},
225 | year = {2013}}
226 |
--------------------------------------------------------------------------------
/Design_CRISPRa/README.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "gRNA design for CRISPR activation (CRISPRa)"
3 | author: Jean-Philippe Fortin, Luke Hoberecht
4 | output:
5 | github_document:
6 | toc: true
7 | bibliography: references.bib
8 | ---
9 |
10 | ```{r, echo=FALSE, results="hide"}
11 | options("knitr.graphics.auto_pdf"=TRUE)
12 | ```
13 |
14 |
15 | # Introduction
16 |
17 | This tutorial will demonstrate how to use `crisprDesign` to design gRNAs
18 | for CRISPR activation (CRISPRa). Specifically, we will target the
19 | human KRAS gene and use the SpCas9 nuclease.
20 |
21 |
22 | # Installation
23 |
24 | See the [Installation tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Installation) to learn how to install the packages necessary for this tutorial:
25 | `crisprDesign`, `crisprDesignData`
26 |
27 |
28 | # Terminology
29 |
30 | See the [CRISPRko design tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) to get familiar
31 | with the terminology used throughout this tutorial.
32 |
33 |
34 |
35 | # CRISPRa design
36 |
37 | For CRISPR activation (CRISPRa) and interference (CRISPRi) applications,
38 | the CRISPR nuclease is engineered to lose its endonuclease activity,
39 | and should therefore not introduce double-stranded breaks (DSBs).
40 | We will use the dead SpCas9 (dSpCas9) nuclease as an example here.
41 | Note that users don't have to distinguish between dSpCas9 and SpCas9
42 | when specifying the nuclease in the crisprVerse as they
43 | do not differ in terms of the characteristics stored in
44 | the `CrisprNuclease` object.
45 |
46 | In CRISPRa, dSpCas9 is used to activate gene expression by coupling
47 | the dead nuclease with activation factors. Several CRISPRa systems have
48 | been developed (see @crispracrisprireview for a review). For
49 | optimal activation, gRNAs are usually designed to target the
50 | region directly upstream of the gene transcription start site (TSS).
51 |
52 | `crisprDesign` provides functions that account for design rules
53 | specific to CRISPRa applications. The `queryTss` function
54 | returns the genomic coordinates of promoter regions for a given TSS window.
55 | The `addTssAnnotation` function annotates gRNAs with respect to known TSSs,
56 | and includes a column `dist_to_tss` that gives the distance in
57 | nucleotides between the TSS position and the PAM site of the gRNA.
58 | For CRISPRa, we recommend targeting the region 75-150bp upstream of
59 | the TSS for optimal activation; see @sanson2018optimized for
60 | more information. Finally, the function `addCrispraiScores` adds on-target
61 | activity scores based on the work of @crisprai.
62 |
63 |
64 | ## Creating the GuideSet
65 |
66 | We start by loading the required packages:
67 |
68 | ```{r, message=FALSE, warning=FALSE, results='hide' }
69 | library(crisprBase)
70 | library(crisprDesign)
71 | library(crisprDesignData)
72 | library(BSgenome.Hsapiens.UCSC.hg38)
73 | ```
74 |
75 |
76 | To demonstrate CRISPRa design, we will design gRNAs to activate the human KRAS
77 | gene using the SpCas9 nuclease. To accomplish this, we want our gRNAs to
78 | target the region upstream of the KRAS TSS; let's consider the window
79 | containing 500bp immediately upstream of the TSS. We first need to retrieve
80 | the TSS coordinates for KRAS. These data are conveniently stored in the
81 | `crisprDesignData` package as the dataset `tss_human`. For more information
82 | on `tss_human` and how to create similar TSS annotation objects,
83 | see the [Building a gene annotation object](https://github.com/crisprVerse/Tutorials/tree/master/Building_Gene_Annotation)
84 | tutorial.
85 |
86 | We load the TSS coordinates stored in the `tss_human` object
87 |
88 | ```{r, warning=FALSE, message=FALSE, results='hide'}
89 | data("tss_human", package="crisprDesignData")
90 | ```
91 |
92 | and query for KRAS using the `queryTss` function from `crisprDesign`:
93 |
94 | ```{r, warning=FALSE, message=FALSE, results='hide'}
95 | target_window <- c(-500, 0)
96 | target_region <- queryTss(tss_human,
97 | queryColumn="gene_symbol",
98 | queryValue="KRAS",
99 | tss_window=target_window)
100 | ```
101 |
102 | ```{r, collapse=TRUE}
103 | target_region
104 | ```
105 |
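To make the meaning of `tss_window` concrete, the sketch below shows the strand-aware arithmetic that turns a TSS-relative window into genomic coordinates. It is an illustration only, not the implementation used by `queryTss`: the helper `tss_window_to_range` is hypothetical, the TSS position is made up, and inclusive endpoints are an assumption.

```{r}
# Hypothetical helper (not part of crisprDesign): convert a TSS-relative
# window into genomic start/end coordinates, taking the strand into account.
tss_window_to_range <- function(tss_pos, strand, window) {
    stopifnot(strand %in% c("+", "-"), length(window) == 2)
    if (strand == "+") {
        # On the plus strand, upstream (negative offsets) means smaller coordinates
        coords <- tss_pos + window
    } else {
        # On the minus strand, upstream means larger coordinates
        coords <- tss_pos - window
    }
    c(start=min(coords), end=max(coords))
}

# Made-up TSS position on the minus strand with a 500bp upstream window
tss_window_to_range(tss_pos=1000000, strand="-", window=c(-500, 0))
```

For a gene on the minus strand, such as KRAS, the upstream window therefore lies at coordinates larger than the TSS position.
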
106 | We load the `CrisprNuclease` object storing information about the
107 | SpCas9 nuclease from the `crisprBase` package:
108 |
109 | ```{r}
110 | data(SpCas9, package="crisprBase")
111 | ```
112 |
113 | We then find all candidate protospacer sequences in our target region
114 | with `findSpacers`:
115 |
116 | ```{r, warning=FALSE, message=FALSE, results='hide'}
117 | gs <- findSpacers(target_region,
118 | crisprNuclease=SpCas9,
119 | bsgenome=BSgenome.Hsapiens.UCSC.hg38)
120 | ```
121 |
122 | ```{r, collapse=TRUE}
123 | gs
124 | ```
125 |
126 |
127 | ## Annotating the GuideSet
128 |
129 | Next, we annotate our candidate gRNAs to assess quality. There are several
130 | functions in `crisprDesign` that provide annotation for features that are
131 | not specific to CRISPRa, for which we refer the reader to the
132 | [CRISPRko design with Cas9](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) tutorial for more information. The sections below will
133 | cover annotation functions that are of particular interest to
134 | CRISPRa applications.
135 |
136 |
137 | ## Adding TSS annotation
138 |
139 | As the name implies, the `addTssAnnotation` function annotates gRNAs with
140 | TSS context such as the distance between the gRNA and the TSS, as well as
141 | which TSS is targeted (many genes contain different TSSs corresponding to
142 | different isoforms).
143 |
144 | The function requires a `tssObject` object, and the `tss_window` values
145 | that we used earlier to define the target region. We can then retrieve
146 | the appended annotation with the accessor function `tssAnnotation`:
147 |
148 | ```{r, collapse=TRUE}
149 | gs <- addTssAnnotation(gs,
150 | tssObject=tss_human,
151 | tss_window=target_window)
152 | tssAnnotation(gs)
153 | ```
154 |
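The recommended 75-150bp upstream window can be applied as a simple filter on `dist_to_tss`. The sketch below uses a toy data frame with made-up values in place of the output of `tssAnnotation(gs)`, and assumes that negative values of `dist_to_tss` denote positions upstream of the TSS; check the sign convention on your own output before reusing it.

```{r}
# Toy stand-in for the data frame returned by tssAnnotation(gs);
# spacer IDs and distances are made up for illustration.
toy_annot <- data.frame(
    spacer_id=c("spacer_1", "spacer_2", "spacer_3", "spacer_4"),
    dist_to_tss=c(-30, -80, -140, -300)
)

# Assumption: negative dist_to_tss means upstream of the TSS,
# so 75-150bp upstream corresponds to the interval [-150, -75].
in_window <- toy_annot$dist_to_tss >= -150 & toy_annot$dist_to_tss <= -75
preferred <- toy_annot[in_window, ]
preferred
```
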
155 |
156 | ## Adding spacer alignments with TSS annotation
157 |
158 | As with all CRISPR applications, off-targets are an important concern in
159 | assessing gRNA quality. While this concern is somewhat moderated for
160 | CRISPRa, since the dead CRISPR nuclease does not make DSBs, we should
161 | be aware of off-targets occurring in the promoter regions of other genes.
162 | This can be handled by passing our `tssObject` to the `addSpacerAlignments`
163 | function. We will search for up to 2 mismatches and increase the
164 | size of our `tss_window` to err on the safe side.
165 |
166 | Similar to the CRISPRko design tutorial, we need to specify a Bowtie index
167 | of the human reference genome; see the [Building genome indices for short read aligners](https://github.com/crisprVerse/Tutorials/tree/master/Building_Genome_Indices) tutorial to learn how to create such an index.
168 |
169 | Here we specify the index that was available to us when generating this
170 | tutorial:
171 |
172 | ```{r}
173 | index_path <- "/Users/fortinj2/crisprIndices/bowtie/hg38/hg38"
174 | ```
175 |
176 | (users need to replace this path with the location of their own index).
177 | We are now ready to add on- and off-target alignments:
178 |
179 | ```{r, warning=FALSE, message=FALSE, results='hide'}
180 | gs <- addSpacerAlignments(gs,
181 | aligner="bowtie",
182 | aligner_index=index_path,
183 | bsgenome=BSgenome.Hsapiens.UCSC.hg38,
184 | n_mismatches=2,
185 | tssObject=tss_human,
186 | tss_window=c(-2000, 500))
187 | ```
188 | ```{r, collapse=TRUE}
189 | gs
190 | ```
191 |
192 | Passing a `tssObject` argument to the `addSpacerAlignments` function
193 | appends columns to the `GuideSet` that tally the alignments restricted
194 | to the defined (via `tss_window`) promoter regions:
195 | `n0_p`, `n1_p`, and `n2_p` (the `_p` suffix denotes "promoter").
196 |
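As a rough illustration of how the promoter-restricted tallies might be used, the sketch below keeps spacers with a single perfect match in a promoter region and no 1- or 2-mismatch promoter alignments. The data frame is a toy stand-in with made-up values, and the filtering rule is one possible choice rather than a recommendation from `crisprDesign`.

```{r}
# Toy stand-in for the promoter alignment columns; values are made up.
toy_aln <- data.frame(
    spacer_id=c("spacer_1", "spacer_2", "spacer_3"),
    n0_p=c(1, 1, 2),
    n1_p=c(0, 0, 0),
    n2_p=c(0, 3, 0)
)

# Keep spacers with exactly one perfect promoter match (presumably the
# intended target) and no promoter alignments with 1 or 2 mismatches.
is_specific <- toy_aln$n0_p == 1 & toy_aln$n1_p == 0 & toy_aln$n2_p == 0
toy_aln$spacer_id[is_specific]
```
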
197 |
198 | ## Adding CRISPRai scores
199 |
200 | The CRISPRai algorithm was developed by the Weissman lab to score SpCas9
201 | gRNAs for CRISPRa and CRISPRi applications for the
202 | human genome [@crisprai]. The function `addCrispraiScores`
203 | implements this algorithm to add scores to the `GuideSet`.
204 | Compared to other on-target scoring algorithms, it requires several additional
205 | inputs:
206 |
207 | - The `gr` argument is the `GRanges` object derived from the `queryTss`
208 | function and used to create the `GuideSet` object.
209 | In our example, this is the object named `target_region`.
210 | - The `tssObject` argument is a `GRanges` object that contains
211 | TSS coordinates and annotation. It must also contain the following
212 | columns: `ID`, `promoter`, `tx_id`, and `gene_symbol`.
213 | Our `tssObject` in this instance is `tss_human`.
214 | - `geneCol` indicates which column of `tssObject` should be used as the
215 | unique gene identifier.
216 | - `modality` is the modality of the CRISPR application, in our case, `CRISPRa`.
217 | - `fastaFile` is the path of a FASTA file containing the sequence of the
218 | human reference genome in hg38 coordinates. This file is available [here](https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz).
219 | - `chromatinFiles` is a vector of length 3 specifying the path of files
220 | containing the chromatin accessibility data needed for the algorithm
221 | in hg38 coordinates. The chromatin files can be downloaded from Zenodo [here](https://zenodo.org/record/6716721#.YrzCfS-cY4d).
222 |
223 |
224 | We first prepare all needed inputs for `addCrispraiScores`.
225 | We start by specifying the location of the FASTA file on our local machine:
226 |
227 | ```{r}
228 | fastaPath <- "/Users/fortinj2/crisprIndices/genomes/hg38/hg38.fa"
229 | ```
230 |
231 | This corresponds to the path where the downloaded file from [here](https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz) is
232 | stored. Next, we specify the location of the chromatin files:
233 |
234 |
235 |
236 | ```{r}
237 | mnasePath <- "/Users/fortinj2/crisprIndices/chromatin/hg38/crispria_mnase_human_K562_hg38.bigWig"
238 | dnasePath <- "/Users/fortinj2/crisprIndices/chromatin/hg38/crispria_dnase_human_K562_hg38.bigWig"
239 | fairePath <- "/Users/fortinj2/crisprIndices/chromatin/hg38/crispria_faire_human_K562_hg38.bigWig"
240 | chromatinFiles <- c(mnase=mnasePath,
241 | dnase=dnasePath,
242 | faire=fairePath)
243 | ```
244 |
245 | This should correspond to the files that were downloaded from
246 | [here](https://zenodo.org/record/6716721#.YrzCfS-cY4d).
247 |
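Because the scoring step depends on several large local files, it can help to confirm that every path exists before running it. The helper `find_missing_files` below is hypothetical (it is not part of `crisprDesign`), and the demonstration uses temporary files rather than the real inputs.

```{r}
# Hypothetical helper: return the names of any input files that are missing.
find_missing_files <- function(paths) {
    names(paths)[!file.exists(paths)]
}

# Demonstration with temporary files instead of the real inputs
existing <- tempfile(fileext=".bigWig")
invisible(file.create(existing))
demo_paths <- c(mnase=existing,
                dnase=file.path(tempdir(), "does_not_exist.bigWig"))
missing_inputs <- find_missing_files(demo_paths)
missing_inputs
```

In this tutorial, the same check could be run as `find_missing_files(c(fasta=fastaPath, chromatinFiles))`; an empty result means all four files were found.
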
248 | We are now ready to add the scores:
249 |
250 | ```{r, warning=FALSE, message=FALSE, results='hide', eval=TRUE}
251 | results <- addCrispraiScores(gs,
252 | gr=target_region,
253 | tssObject=tss_human,
254 | geneCol="gene_id",
255 | modality="CRISPRa",
256 | fastaFile=fastaPath,
257 | chromatinFiles=chromatinFiles)
258 | ```
259 |
260 | Let's look at the results:
261 |
262 | ```{r, collapse=TRUE, eval=TRUE}
263 | results
264 | ```
265 |
266 | You can see that the column `score_crispra` was added to the `GuideSet`.
267 | Note that this function works identically for CRISPRi applications, with
268 | the `modality` argument replaced by `CRISPRi`.
269 |
270 |
271 |
272 | # Session Info
273 |
274 | ```{r}
275 | sessionInfo()
276 | ```
277 |
278 |
279 | # References
280 |
--------------------------------------------------------------------------------
/Design_CRISPRi/README.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "gRNA design for CRISPR interference (CRISPRi)"
3 | author: Jean-Philippe Fortin, Luke Hoberecht
4 | output:
5 | github_document:
6 | toc: true
7 | bibliography: references.bib
8 | ---
9 |
10 | ```{r, echo=FALSE, results="hide"}
11 | options("knitr.graphics.auto_pdf"=TRUE)
12 | ```
13 |
14 |
15 |
16 | # Introduction
17 |
18 | This tutorial will demonstrate how to use `crisprDesign` to design gRNAs
19 | for CRISPR interference (CRISPRi). Specifically, we will target the
20 | human KRAS gene and use the SpCas9 nuclease.
21 |
22 | # Installation
23 |
24 | See the [Installation tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Installation) to learn how to install the packages necessary for this tutorial:
25 | `crisprDesign`, `crisprDesignData`
26 |
27 |
28 | # Terminology
29 |
30 | See the [CRISPRko design tutorial](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) to get familiar
31 | with the terminology used throughout this tutorial.
32 |
33 |
34 | # CRISPRi design
35 |
36 | For CRISPR activation (CRISPRa) and interference (CRISPRi) applications,
37 | the CRISPR nuclease is engineered to lose its endonuclease activity,
38 | and should therefore not introduce double-stranded breaks (DSBs).
39 | We will use the dead SpCas9 (dSpCas9) nuclease as an example here.
40 | Note that users don't have to distinguish between dSpCas9 and SpCas9
41 | when specifying the nuclease in the crisprVerse as they
42 | do not differ in terms of the characteristics stored in
43 | the `CrisprNuclease` object.
44 |
45 | In CRISPRi, fusing dSpCas9 with a Krüppel-associated box (KRAB) domain has been
46 | shown to be effective at repressing transcription in mammalian cells
47 | [@crispri]. The dSpCas9-KRAB fusion protein is a commonly used construct for
48 | CRISPRi experiments. To achieve optimal inhibition,
49 | gRNAs are usually designed to target the region directly downstream of the gene
50 | transcription start site (TSS).
51 |
52 | `crisprDesign` provides functions that account for design rules
53 | specific to CRISPRi applications. The `queryTss` function
54 | returns the genomic coordinates of promoter regions for a given TSS window.
55 | The `addTssAnnotation` function annotates gRNAs with respect to known TSSs,
56 | and includes a column `dist_to_tss` that gives the distance in
57 | nucleotides between the TSS position and the PAM site of the gRNA.
58 | For CRISPRi, we recommend targeting the region 25-75bp
59 | downstream of the TSS for optimal inhibition; see @sanson2018optimized for
60 | more information. Finally, the function `addCrispraiScores` adds
61 | CRISPRi-specific on-target activity scores based on the work of @crisprai.
62 |
63 |
64 | ## Creating the GuideSet
65 |
66 | We start by loading the required packages:
67 |
68 | ```{r, message=FALSE, warning=FALSE, results='hide' }
69 | library(crisprBase)
70 | library(crisprDesign)
71 | library(crisprDesignData)
72 | library(BSgenome.Hsapiens.UCSC.hg38)
73 | ```
74 |
75 |
76 | To demonstrate CRISPRi design, we will design gRNAs to inhibit
77 | expression of the human KRAS
78 | gene using the SpCas9 nuclease. To accomplish this, we want our gRNAs to
79 | target the region downstream of the KRAS TSS; let's consider the window
80 | containing 500bp immediately downstream of the TSS to explore candidate gRNAs.
81 |
82 | We first need to retrieve the TSS coordinates for KRAS.
83 | These data are conveniently stored in the
84 | `crisprDesignData` package as the dataset `tss_human`. For more information
85 | on `tss_human` and how to create similar TSS annotation objects,
86 | see the [Building a gene annotation object](https://github.com/crisprVerse/Tutorials/tree/master/Building_Gene_Annotation)
87 | tutorial.
88 |
89 |
90 | We load the TSS coordinates stored in the `tss_human` object
91 |
92 | ```{r, warning=FALSE, message=FALSE, results='hide'}
93 | data("tss_human", package="crisprDesignData")
94 | ```
95 |
96 | and query for KRAS using the `queryTss` function from `crisprDesign`:
97 |
98 | ```{r, warning=FALSE, message=FALSE, results='hide'}
99 | target_window <- c(0, 500)
100 | target_region <- queryTss(tss_human,
101 | queryColumn="gene_symbol",
102 | queryValue="KRAS",
103 | tss_window=target_window)
104 | ```
105 |
106 | ```{r, collapse=TRUE}
107 | target_region
108 | ```
109 |
110 | We load the `CrisprNuclease` object storing information about the
111 | SpCas9 nuclease from the `crisprBase` package:
112 |
113 | ```{r}
114 | data(SpCas9, package="crisprBase")
115 | ```
116 |
117 | We then find all candidate protospacer sequences in our target region
118 | with `findSpacers`:
119 |
120 | ```{r, warning=FALSE, message=FALSE, results='hide'}
121 | gs <- findSpacers(target_region,
122 | crisprNuclease=SpCas9,
123 | bsgenome=BSgenome.Hsapiens.UCSC.hg38)
124 | ```
125 |
126 | ```{r, collapse=TRUE}
127 | gs
128 | ```
129 |
130 |
131 | ## Annotating the GuideSet
132 |
133 | Next, we annotate our candidate gRNAs to assess quality. There are several
134 | functions in `crisprDesign` that provide annotation for features that are
135 | not specific to CRISPRi, for which we refer the reader to the
136 | [CRISPRko design with Cas9](https://github.com/crisprVerse/Tutorials/tree/master/Design_CRISPRko_Cas9) tutorial for more information. The sections below will
137 | cover annotation functions that are of particular interest to
138 | CRISPRi applications.
139 |
140 |
141 | ## Adding TSS annotation
142 |
143 | As the name implies, the `addTssAnnotation` function annotates gRNAs with
144 | TSS context such as the distance between the gRNA and the TSS, as well as
145 | which TSS is targeted (many genes contain different TSSs corresponding to
146 | different isoforms).
147 |
148 | The function requires a `tssObject` object, and the `tss_window` values
149 | that we used earlier to define the target region. We can then retrieve
150 | the appended annotation with the accessor function `tssAnnotation`:
151 |
152 | ```{r, collapse=TRUE}
153 | gs <- addTssAnnotation(gs,
154 | tssObject=tss_human,
155 | tss_window=target_window)
156 | tssAnnotation(gs)
157 | ```
158 |
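The recommended 25-75bp downstream window can be applied as a filter on `dist_to_tss`, and the remaining candidates can be ordered by how close they sit to the middle of that window. The sketch below uses a toy data frame with made-up values in place of `tssAnnotation(gs)`, assumes that positive values of `dist_to_tss` denote positions downstream of the TSS, and uses the window midpoint as an arbitrary ordering heuristic.

```{r}
# Toy stand-in for the data frame returned by tssAnnotation(gs);
# spacer IDs and distances are made up for illustration.
toy_annot <- data.frame(
    spacer_id=c("spacer_1", "spacer_2", "spacer_3", "spacer_4"),
    dist_to_tss=c(10, 70, 45, 200)
)

# Assumption: positive dist_to_tss means downstream of the TSS.
in_window <- toy_annot$dist_to_tss >= 25 & toy_annot$dist_to_tss <= 75
candidates <- toy_annot[in_window, ]

# Order candidates by distance to the window midpoint (50bp);
# this ordering is an illustrative heuristic, not a published rule.
candidates <- candidates[order(abs(candidates$dist_to_tss - 50)), ]
candidates
```
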
159 |
160 | ## Adding spacer alignments with TSS annotation
161 |
162 | As with all CRISPR applications, potential off-target effects are an
163 | important concern in assessing gRNA quality. While this concern is
164 | somewhat moderated for CRISPRi, since the dead CRISPR nuclease does
165 | not make DSBs, we should be aware of off-targets occurring in the promoter regions of other genes. This can be handled by passing our `tssObject` to the `addSpacerAlignments` function. We will search for up to 2 mismatches and increase the size of our `tss_window` (which defines the promoter region when searching for off-targets) to err
166 | on the safe side.
167 |
168 | Similar to the CRISPRko design tutorial, we need to specify a Bowtie index
169 | of the human reference genome; see the [Building genome indices for short read aligners](https://github.com/crisprVerse/Tutorials/tree/master/Building_Genome_Indices) tutorial to learn how to create such an index.
170 |
171 | Here, we specify the index that was available to us when generating this
172 | tutorial:
173 |
174 | ```{r}
175 | # Users need to specify the path of their bowtie index
176 | index_path <- "/Users/fortinj2/crisprIndices/bowtie/hg38/hg38"
177 | ```
178 |
179 | We are ready to add on- and off-target alignments:
180 |
181 | ```{r, warning=FALSE, message=FALSE, results='hide'}
182 | gs <- addSpacerAlignments(gs,
183 | aligner="bowtie",
184 | aligner_index=index_path,
185 | bsgenome=BSgenome.Hsapiens.UCSC.hg38,
186 | n_mismatches=2,
187 | tssObject=tss_human,
188 | tss_window=c(-500, 2000))
189 | ```
190 | ```{r, collapse=TRUE}
191 | gs
192 | ```
193 |
194 | Passing a `tssObject` argument to the `addSpacerAlignments` function
195 | appends columns to the `GuideSet` that tally the alignments restricted
196 | to the promoter regions (defined via `tss_window`):
197 | `n0_p`, `n1_p`, and `n2_p` (the `_p` suffix denotes "promoter").
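
As a rough sketch of how these tallies might be used, the chunk below keeps only the gRNAs that have no 1- or 2-mismatch alignments in promoter regions. The object names `keep` and `gs_filtered` and the choice of filter are illustrative assumptions rather than a recommendation; `n0_p` is left untouched because it should include the on-target alignment itself.

```{r, eval=FALSE}
# Keep gRNAs without 1- or 2-mismatch alignments located in promoter regions
# (illustrative filter; adjust the criteria to your own screen design)
keep <- gs$n1_p == 0 & gs$n2_p == 0
gs_filtered <- gs[keep]

# Number of gRNAs remaining after filtering
length(gs_filtered)
```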
198 |
199 |
200 | ## Adding CRISPRai scores
201 |
202 | The CRISPRai algorithm was developed by the Weissman lab to score SpCas9
203 | gRNAs for CRISPRa and CRISPRi applications in the
204 | human genome [@crisprai]. The function `addCrispraiScores`
205 | implements this algorithm to add scores to the `GuideSet`.
206 | Compared to other on-target scoring algorithms, it requires several additional
207 | inputs:
208 |
209 | - The `gr` argument is the `GRanges` object derived from the `queryTss`
210 | function and used to create the `GuideSet` object.
211 | In our example, this is the object named `target_region`.
212 | - The `tssObject` argument is a `GRanges` object that contains
213 | TSS coordinates and annotation. It must also contain the following
214 | columns: `ID`, `promoter`, `tx_id`, and `gene_symbol`.
215 | Our `tssObject` in this instance is `tss_human`.
216 | - `geneCol` indicates which column of `tssObject` should be used as the
217 | unique gene identifier.
218 | - `modality` specifies the CRISPR application, in our case `CRISPRi`.
219 | - `fastaFile` is the path of a FASTA file containing the sequence of the
220 | human reference genome in hg38 coordinates. This file is available [here](https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz).
221 | - `chromatinFiles` is a vector of length 3 specifying the path of files
222 | containing the chromatin accessibility data needed for the algorithm
223 | in hg38 coordinates. The chromatin files can be downloaded from Zenodo [here](https://zenodo.org/record/6716721#.YrzCfS-cY4d).
224 |
225 |
226 | We first prepare all needed inputs for `addCrispraiScores`.
227 | We start by specifying the location of the FASTA file on our local machine:
228 |
229 | ```{r}
230 | fastaPath <- "/Users/fortinj2/crisprIndices/genomes/hg38/hg38.fa"
231 | ```
232 |
233 | This corresponds to the path where the downloaded file from [here](https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz) is
234 | stored. Next, we specify the location of the chromatin files:
235 |
236 | ```{r}
237 | mnasePath <- "/Users/fortinj2/crisprIndices/chromatin/hg38/crispria_mnase_human_K562_hg38.bigWig"
238 | dnasePath <- "/Users/fortinj2/crisprIndices/chromatin/hg38/crispria_dnase_human_K562_hg38.bigWig"
239 | fairePath <- "/Users/fortinj2/crisprIndices/chromatin/hg38/crispria_faire_human_K562_hg38.bigWig"
240 | chromatinFiles <- c(mnase=mnasePath,
241 | dnase=dnasePath,
242 | faire=fairePath)
243 | ```
244 |
245 | These paths should correspond to the files that were downloaded from
246 | [here](https://zenodo.org/record/6716721#.YrzCfS-cY4d).
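
Before running the scoring function, it can help to confirm that every input file is where we expect it to be. The chunk below is a minimal sketch using base R only; the helper name `checkInputFiles` is hypothetical and is not part of `crisprDesign`.

```{r}
# Hypothetical helper (not part of crisprDesign): stop with an informative
# message if any of the supplied file paths does not exist.
checkInputFiles <- function(paths) {
    missing <- paths[!file.exists(paths)]
    if (length(missing) > 0) {
        stop("File(s) not found: ", paste(missing, collapse=", "))
    }
    invisible(TRUE)
}
```

For the objects defined above, the call would be `checkInputFiles(c(fastaPath, chromatinFiles))`, which returns silently when all four files are present.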
247 |
248 | We are now ready to add the scores:
249 |
250 | ```{r, warning=FALSE, message=FALSE, results='hide', eval=TRUE}
251 | results <- addCrispraiScores(gs,
252 | gr=target_region,
253 | tssObject=tss_human,
254 | geneCol="gene_id",
255 | modality="CRISPRi",
256 | fastaFile=fastaPath,
257 | chromatinFiles=chromatinFiles)
258 | ```
259 |
260 | Let's look at the results:
261 |
262 | ```{r, collapse=TRUE, eval=TRUE}
263 | results
264 | ```
265 |
266 | You can see that the column `score_crispri` was added to the `GuideSet`.
267 | Note that this function works identically for CRISPRa applications, with
268 | the `modality` argument set to `CRISPRa`.
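
One natural next step is to rank the gRNAs by their score. The chunk below is a minimal sketch that assumes higher `score_crispri` values indicate more active gRNAs (worth confirming in the `addCrispraiScores` documentation); the object name `ranked` is ours.

```{r, eval=FALSE}
# Order gRNAs from highest to lowest CRISPRi score (NA scores are placed last)
ranked <- results[order(results$score_crispri, decreasing=TRUE)]

# Show up to the first five gRNAs of the ranked set
ranked[seq_len(min(5, length(ranked)))]
```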
269 |
270 |
271 |
272 | # Session Info
273 |
274 | ```{r}
275 | sessionInfo()
276 | ```
277 |
278 |
279 | # References
280 |
--------------------------------------------------------------------------------