VALID
Verify Allogenic Loci in DNA
Version 1.6.1
Use Cases
Evaluate the outcome of an editing experiment. For validating Sanger sequencing, please use our sangerAnalyzeR tool in FormBench.
Summary and Methods
This workflow is designed to help the user validate editing experiments. Currently, validation is supported for two sequencing types: Targeted Amplicon Sequencing and Whole Genome Sequencing. Each supported sequencing type is described below.
Targeted Amplicon Sequencing
Methods
This analysis was performed using the Verify Allogenic Loci in DNA workflow on the Form Bio platform. This workflow examines the efficiency of gene editing techniques on sequence data using CRISPResso2. CRISPResso2 [1] is a software pipeline for analyzing genome editing outcomes from deep sequencing data. It aligns sequencing reads to a reference amplicon, quantifies insertions, deletions, and substitutions within a window around the expected edit site, and produces plots and summary tables of the observed alleles. CRISPResso2 supports nuclease (knockout and HDR), base editing, and prime editing experiments, and it is not tied to a particular organism because the amplicon sequence is supplied by the user.
Whole Genome Sequencing
Methods
This analysis was performed using the Verify Allogenic Loci in DNA workflow on the Form Bio platform. This workflow examines the efficiency of gene editing techniques on sequence data using variant callers. When the data type is “Somatic”, this workflow can be used to identify genetic variants in tumor NGS data relative to a supported reference genome. When a normal sample is provided, specialized somatic variant calling methods are applied, allowing users to filter germline variants from the resulting VCF files. Reads are trimmed using TrimGalore [2] to remove low-quality read ends (quality < 25) and to discard reads shorter than 35 bp. Trimmed reads are aligned to a reference genome using BWA-MEM [1] or Minimap2 [3]. Duplicate reads are marked using Picard MarkDuplicates [4]. Somatic variants can be detected in somatic or tumor-only mode using Strelka2 [5], FreeBayes [6], and MuTect2 [7]. To increase the speed of analysis, the Parabricks-optimized versions of these algorithms are used for BWA-MEM, Alignment QC, and GATK4. Quality reports are produced by MultiQC. When a matched normal sample is present, tumor/normal germline SNP matching is confirmed using NGSCheckMate [8] and microsatellite stability is assessed using MSIsensor [9]. Variant effects are determined using SnpEff [10].
Inputs
Experiment Design
- Type of Sequencing: Sanger or NGS
- Input Folder: Folder containing sequencing files
Additional Inputs for NGS
- Editing Strategy: Homology Directed Repair, Base Editing, Prime Editing
- Batch File
- This is a tab delimited file containing the information for each batch run.
- Required columns: name, amplicon_seq
- Knockout/HDR columns: guide_seq, expected_hdr_amplicon_seq
- Base-editing columns: guide_seq, conversion_nuc_from, conversion_nuc_to
- Prime-editing columns: prime_editing_pegRNA_spacer_seq, prime_editing_pegRNA_extension_seq, prime_editing_pegRNA_scaffold_seq, prime_editing_nicking_guide_seq
- PASTE columns: prime_editing_pegRNA_spacer_seq, prime_editing_pegRNA_extension_seq, prime_editing_pegRNA_scaffold_seq, prime_editing_nicking_guide_seq, prime_editing_override_prime_edited_ref_seq
- Additional columns can be added to include extra information. Please see the CRISPResso2 manual for more parameter options.
- Window Size
- Number of bp upstream and downstream from the quantification window center.
- Only mutations within this range will be used to classify reads.
- Setting quantification window size to 0 sets the window to search the entire amplicon.
- Window Center
- Center of the quantification window with respect to 3' end of sgRNA.
- Predicted cleavage position.
- Alignment Score Cutoff
- Minimum identity percentage an alignment must reach to be kept
- Minimum/Maximum Overlap
- Minimum and maximum values for overlaps of paired-end reads
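As a rough illustration of how Window Size and Window Center interact, the sketch below computes the quantification window as amplicon indices. The function name, the 0-based half-open coordinates, the forward-strand-only search, and the default values (center -3, size 1) are assumptions made for this example; the workflow itself delegates this logic to CRISPResso2, which also handles reverse-strand guides.

```python
def quantification_window(amplicon, guide, center=-3, size=1):
    """Return (start, end) of the quantification window.

    Indices are 0-based and half-open. Hypothetical helper for illustration.
    """
    guide_start = amplicon.find(guide)
    if guide_start == -1:
        raise ValueError("guide not found on the forward strand of the amplicon")
    # Index just past the 3' end of the guide.
    guide_end = guide_start + len(guide)
    # The window center is given relative to the 3' end of the sgRNA;
    # -3 places it at the predicted cleavage position.
    cut = guide_end + center
    # A size of 0 means the entire amplicon is searched.
    if size == 0:
        return 0, len(amplicon)
    return max(0, cut - size), min(len(amplicon), cut + size)


toy_amplicon = "AAAAACCCCCGGGGGTTTTT"
toy_guide = "CCCCCGGGGG"
print(quantification_window(toy_amplicon, toy_guide))          # window around the cut
print(quantification_window(toy_amplicon, toy_guide, size=0))  # whole amplicon
```

Only mutations that fall inside the returned interval would be used to classify a read as modified under these assumptions.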
name | guide_seq | amplicon_seq |
sample1 | CCTCGTGACCACCCTGACCTA | GAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACA |
sample2 | GCTGAAGCACTGCACGCCGT | GAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACA |
sample3 | GCTGAAGCACTGCACGCCGT | GAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACA |
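Before submitting a run, it can help to confirm that a batch file carries the columns listed above. The sketch below is one way to do that with the Python standard library. The function name, the strategy keys, and the shortened placeholder amplicon are hypothetical; whether every strategy column is mandatory is not stated here, so the check only reports which ones are absent.

```python
import csv
import io

# Column names taken from the Batch File description above.
REQUIRED_COLUMNS = {"name", "amplicon_seq"}
STRATEGY_COLUMNS = {
    "knockout_hdr": {"guide_seq", "expected_hdr_amplicon_seq"},
    "base_editing": {"guide_seq", "conversion_nuc_from", "conversion_nuc_to"},
    "prime_editing": {
        "prime_editing_pegRNA_spacer_seq",
        "prime_editing_pegRNA_extension_seq",
        "prime_editing_pegRNA_scaffold_seq",
        "prime_editing_nicking_guide_seq",
    },
}


def check_batch_header(batch_text, strategy):
    """Report required and strategy columns missing from a tab-delimited header."""
    reader = csv.reader(io.StringIO(batch_text), delimiter="\t")
    header = set(next(reader, []))
    return {
        "missing_required": sorted(REQUIRED_COLUMNS - header),
        "absent_strategy_columns": sorted(STRATEGY_COLUMNS[strategy] - header),
    }


# Placeholder row: the amplicon is shortened for readability.
example = (
    "name\tguide_seq\tamplicon_seq\n"
    "sample1\tCCTCGTGACCACCCTGACCTA\tGAAGTTCATCTG\n"
)
report = check_batch_header(example, "knockout_hdr")
print(report)
```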
Outputs
- HTML Report
- Summary report to be viewed in a web browser.
- Contains links to all supporting data and results.
- Output plots and summary statistics
- Output Folder
- frequency tables
- mapping statistics
- quantification of editing frequency
- Summary Tables
- Special summaries of the allele information for quick viewing in Excel
- PDF Reports
- PDF versions of the HTML reports
Workflow Walkthrough
- Navigate to the VALID workflow on the Form Bio platform. You can find the workflow using the search bar at the top right corner or by using the Genome Editing filter on the left-hand side.
- Select the version from the dropdown menu. When ready to begin, click “Run Workflow”.
- Select the type of sequencing to be analyzed, either Targeted Amplicon or Whole Genome. Remaining parameters may change based on your selection.
- For Targeted Amplicon, select whether the sample is single input or pooled, and select the type of editing experiment performed.
- For Whole Genome, provide the sequencer platform that was used to generate the data and select which analysis is to be performed.
- For Targeted Amplicon, tune additional parameters related to the workflow run. For most purposes, default parameters are optimal.
- For Whole Genome, select a reference genome and annotation version for the workflow run. Then, tune additional parameters related to the workflow run. For most purposes, default parameters are optimal.
- Provide the directory containing the files to be analyzed, as well as an NGS attribute file (this table can be created within the workflow itself).
- Give your workflow a unique name, and take a moment to review the chosen inputs and parameters. When ready to begin, click “Run Workflow” to submit your analysis.
Results Walkthrough
- To view the results of your VALID workflow run, first find and select your run from the Activity tab of the Form Bio platform.
- Once selected, press Open Analysis to view the results of your workflow run in a new tab. It might take a few moments to load the results viewer.
- View the results of your analysis on the new page that opens. Use the left-hand sidebar to navigate.
Citations
- Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv preprint arXiv:1303.3997 [q-bio.GN] (2013).
- Krueger, F., James, F., Ewels, P., Afyounian, E. & Schuster-Boeckler, B. FelixKrueger/TrimGalore: V0.6.7 - DOI via Zenodo. (2021) doi:10.5281/ZENODO.5127899.
- Li, H. Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
- Thomer, A. K., Twidale, M. B., Guo, J. & Yoder, M. J. Picard Tools. in Conference on Human Factors in Computing Systems - Proceedings (2016).
- Kim, S. et al. Strelka2: Fast and accurate calling of germline and somatic variants. Nature Methods 15, 591–594 (2018).
- Garrison, E. & Marth, G. Haplotype-based variant detection from short-read sequencing. (2012).
- Benjamin, D. et al. Calling Somatic SNVs and Indels with Mutect2. (2019) doi:10.1101/861054.
- Lee, S. et al. NGSCheckMate: Software for validating sample identity in Next-generation sequencing studies within and across data types. Nucleic Acids Research 45, e103 (2017).
- Jia, P. et al. MSIsensor-pro: Fast, Accurate, and Matched-normal-sample-free Detection of Microsatellite Instability. Genomics, Proteomics and Bioinformatics 18, 65–71 (2020).
- Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 6, 80–92 (2012).
DR GENE
Design and Rank Guides for Editing Nucleotides with Enzymes
Version 2.1.1
Use Cases
- The user wishes to predict and design optimal single-guide RNA (sgRNA) sequences for making a knock-out or edit in a genome.
- The user wishes to rank created guide sequences to determine efficacy for a particular experiment.
- The user wishes to determine potential off-target effects for created guide sequences.
Summary and Methods
This workflow is designed to help the user predict and design the best single-guide RNA (sgRNA) sequences for making a knock-out or edit in a genome. Currently, sgRNA design is supported for two functions: Knockout Experiment and Genome Editing Experiment. This is a new workflow based on improvements made to the CRISPRank and CRISPR Knock-out workflows with additional functionality and optimized algorithms. This workflow can also be used to perform off-target searches. Each function is described below.
Knockout Experiment
Summary
This workflow can help the user predict and design the best guide RNA sequences for use in CRISPR knockout experiments. The user will provide an input file that specifies genomic sequences or locations to be knocked out and will set parameters for the type of CRISPR enzyme they wish to create guides for. The user will receive several guide RNA sequences to choose from, PCR primers for confirming the knockout, and potential off-target effects. Results are viewable through an RShiny app with JBrowse for easy comparison.
Methods
This analysis was performed on the Form Bio platform using the DR GENE workflow. An input file was provided detailing locations or nucleotide sequences to be knocked out. Information about the CRISPR analysis to be performed, including the enzyme to design guides for, was provided to the workflow. Knock-out guides were predicted using R libraries crisprVerse [1], BSgenome [2], and Biostrings [3]. Primers were predicted using primer3 [4].
Genome Editing Experiment
Summary
This workflow can help the user predict and design the best guide RNA sequences for use in CRISPR gene editing experiments. The user will provide an input file that specifies genomic edits to be performed and will set parameters for the type of CRISPR edit they wish to create guides for. The user will receive several guide RNA sequences to choose from, PCR primers for confirming the edit, and potential off-target effects. Results are viewable through an RShiny app with JBrowse for easy comparison.
Methods
This analysis was performed on the Form Bio platform using the DR GENE workflow. An input file was provided detailing locations and nucleotide edits to be made. Information about the CRISPR analysis to be performed, including the genome editing strategies to design guides for, was provided to the workflow. Homology-Directed Repair (HDR) guides were predicted using R libraries crisprVerse [1], BSgenome [2], and Biostrings [3]. Prime Editing guides were predicted using PRIDICT [5]. Base Editing guides were predicted using a proprietary Form algorithm with experimentally derived scoring of PAM efficiency [6] denoted as the Chatterjee Score. Primers were predicted using primer3 [4].
Off-Target Search
Summary
This workflow can help the user search for off-targets in a given genome. The user will specify a genome and provide a list of spacers to search, and will be shown all off-target effects in the genome.
Methods
This analysis was performed on the Form Bio platform using the DR GENE workflow. An input file was provided detailing spacers to search the genome for. Information about the CRISPR analysis to be performed, including the genome editing strategies to design guides for, was provided to the workflow.
DR GENE vs CRISPRank Optimization Validation
Workflow Run | DR GENE | CRISPRank | Optimization |
CRISPR Knockout of APOE from GRCh38 | 25m 6s | 1-2 days | > 30x Faster |
CRISPR Knockout of 10 genomic regions | 16m 14s | 4h 52m 25s | > 18x Faster |
CRISPR Edit of 6 SNP edits | 26m 9s | 1h 12m 3s | > 2.5x Faster |
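The speed-up figures in the table can be checked with simple arithmetic. The sketch below parses the reported run times and divides them; the helper name is hypothetical, and the APOE row is left out because its CRISPRank time is given only as a range.

```python
import re

SECONDS_PER_UNIT = {"h": 3600, "m": 60, "s": 1}


def to_seconds(duration):
    """Convert a string such as '4h 52m 25s' to seconds."""
    parts = re.findall(r"(\d+)\s*([hms])", duration)
    return sum(int(value) * SECONDS_PER_UNIT[unit] for value, unit in parts)


# (DR GENE time, CRISPRank time) as reported in the table above.
runs = {
    "Knockout of 10 genomic regions": ("16m 14s", "4h 52m 25s"),
    "Edit of 6 SNP edits": ("26m 9s", "1h 12m 3s"),
}

speedups = {
    name: to_seconds(old) / to_seconds(new) for name, (new, old) in runs.items()
}
for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x faster")
```

For the APOE row, even the lower bound of one day (86,400 s) against 25m 6s (1,506 s) gives a ratio above 57, which is consistent with the stated "> 30x".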
Inputs
- Experiment Design
- Type of Experiment: Knockout, Genome Editing, or Off-target Search
- Editing Strategies: Knockout, Homology Directed Repair, Base Editing, Prime Editing 3 and 3b
- Organism: Reference Genome used for alignment
- Reference Genome Annotation: Annotation that should be used for determining gene and transcript counts
- Nuclease: Cutting Enzyme
- Efficiency Score: Filtered based on the Nuclease selection
- Input
- Gene List: List of gene symbols
- Genomic Regions List: BED file of genomic regions
- Edit Table: Table of the locations and edits to be made
- Sequence: Target and Edited sequence inputs
- Adapter Sequences
- adapter caps for oligo sequences
Additional Inputs for PCR Primer Design
- Forward Primer Window
- Window to filter forward primers.
- Distance upstream from edit location to forward primer.
- default: 90-110
- Reverse Primer Window
- Window to filter reverse primers.
- Distance downstream from edit location to reverse primer.
- default: 130-150
- Size Restrictions
- Temperature Restrictions
- Length Restrictions
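The primer windows can be read as distance filters around the edit location. The sketch below applies the default windows to candidate primer positions. It assumes distances are measured from the edit position to each primer's start coordinate and that window bounds are inclusive; neither detail is specified above, and the function names are hypothetical.

```python
def in_window(distance, window):
    """True if distance lies inside the inclusive (low, high) window."""
    low, high = window
    return low <= distance <= high


def filter_primers(edit_pos, forward_starts, reverse_starts,
                   fwd_window=(90, 110), rev_window=(130, 150)):
    """Keep forward primers upstream and reverse primers downstream of the edit."""
    # Forward primers sit upstream, so their distance is edit_pos - start.
    forward = [p for p in forward_starts if in_window(edit_pos - p, fwd_window)]
    # Reverse primers sit downstream, so their distance is start - edit_pos.
    reverse = [p for p in reverse_starts if in_window(p - edit_pos, rev_window)]
    return forward, reverse


kept_fwd, kept_rev = filter_primers(
    edit_pos=1000,
    forward_starts=[880, 900, 950],
    reverse_starts=[1100, 1140, 1160],
)
print(kept_fwd, kept_rev)
```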
Additional Inputs for Homology Directed Repair
- Donor Sequence Information
- Length
- ssODN or dsDNA
Additional Inputs for Knockout
- Genomic Window
- Length upstream/downstream
Outputs
Homology Directed Repair sgRNA Predictions
- sgRNA/${EditLocation}_sgRNAdata.tsv
- sgRNA/${EditLocation}_OffTargetdata.tsv
- accompanying IGV files:
- IGVbed/${EditLocation}_sgRNA.bed
- IGVbed/${EditLocation}_Donorsequences.bed
- IGVbed/${EditLocation}_offtargets.bed
Column Name | Description | Example |
Name | Spacer name | spacer_1 |
Chromosome | Chromosome location of the spacer | chr7 |
Spacer_Start | Start location of the spacer | 142353300 |
Spacer_End | End location of the spacer | 142353320 |
Strand | Direction relative to target sequence | + |
Spacer_Sequence | Predicted guide RNAs | GTGATCGCTTCTCTGCAGAG |
PAM_sequence | PAM sequence | AGG |
PAM_Location | Start location of the PAM | 142353320 |
Cut_Site | Location of the cut site | 142353317 |
Cut2EditDistance | Distance from Cut_Site to edit location | 89 |
Efficiency_Score | Predicted Efficiency | 0.73 |
GC_Content | GC content of target sequence excluding PAM | 55 |
MM0 | Number of off-targets with 0 mismatches | 0 |
MM1 | Number of off-targets with 1 mismatch | 0 |
MM2 | Number of off-targets with 2 mismatches | 8 |
MM3 | Number of off-targets with 3 mismatches | 23 |
selfHairpin | Presence of a self-hairpin | FALSE |
backboneHairpin | Presence of a backbone-hairpin | FALSE |
HomopolymerA | Presence of 4 or more repeating A | FALSE |
HomopolymerC | Presence of 4 or more repeating C | FALSE |
HomopolymerG | Presence of 4 or more repeating G | FALSE |
HomopolymerT | Presence of 4 or more repeating T | FALSE |
startingGGGGG | Does the Spacer start with repeating G | FALSE |
EcoRI | Restriction enzyme binding | FALSE |
KpnI | Restriction enzyme binding | FALSE |
BsmBI | Restriction enzyme binding | FALSE |
BbsI | Restriction enzyme binding | FALSE |
PacI | Restriction enzyme binding | FALSE |
BsaI | Restriction enzyme binding | FALSE |
Donor_Start | Start location of the HDR sequence (for easy amplicon sequencing) | 142353100 |
Donor_End | End location of the HDR sequence (for easy amplicon sequencing) | 142353522 |
Donor_Sequence | Nucleotide sequence of the HDR containing sgRNA (for easy amplicon sequencing) |
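Several of the sequence-derived columns above (GC_Content, the Homopolymer flags, startingGGGGG) can be reproduced from the spacer alone. The sketch below shows one straightforward way to compute them; it is an illustration with hypothetical function names, not the code the workflow runs.

```python
import re


def gc_content(spacer):
    """Percent G or C in the spacer (PAM excluded by the caller)."""
    seq = spacer.upper()
    return round(100 * (seq.count("G") + seq.count("C")) / len(seq), 2)


def homopolymer_flags(spacer, min_run=4):
    """Flag runs of min_run or more identical bases, one flag per base."""
    seq = spacer.upper()
    return {
        f"Homopolymer{base}": bool(re.search(f"{base}{{{min_run},}}", seq))
        for base in "ACGT"
    }


def starts_with_g_run(spacer, run=5):
    """True if the spacer begins with `run` consecutive G bases."""
    return spacer.upper().startswith("G" * run)


spacer = "GTGATCGCTTCTCTGCAGAG"  # Spacer_Sequence example from the table
print(gc_content(spacer))
print(homopolymer_flags(spacer))
print(starts_with_g_run(spacer))
```

With the example spacer from the table, the GC content works out to 55 and every flag is FALSE, matching the Example column.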
Column Name | Description | Example |
Spacer_Sequence | Predicted Spacer | CTTCTCTGTGACCTTGTTAC |
OffTarget_Sequence | nucleotide sequence of the off-target | CTTCaCTGTGACgTTGccACTGG |
Mismatches | Number of mismatches between target and off-target | 4 |
CFD_Score | Cutting Frequency Determination | 0.02 |
MIT_Score | MIT Specificity Score | 0.02 |
Chromosome | Chromosome location of the off-target sequence | chr1 |
Start | Start location of the off-target sequence | 10837397 |
End | End location of the off-target sequence | 10837419 |
Strand | Direction relative to off-target sequence | + |
PAM_Sequence | PAM sequence of the off-target | NGG |
PAM_Location | Start location of the PAM sequence | 10837419 |
Canonical_PAM | Is this PAM sequence the highest ranked PAM | TRUE |
Cutsite_Location | Location of the off-target cutsite | 10837416 |
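The Mismatches column can be reproduced by comparing the spacer with the off-target sequence base by base. The sketch below assumes the off-target sequence lists the protospacer first, followed by a 3' PAM, and that lowercase letters only mark the mismatched bases; the function name is hypothetical.

```python
def mismatch_positions(spacer, off_target):
    """Return 1-based positions where the off-target protospacer differs."""
    # Compare only the protospacer part; the trailing bases are the PAM.
    protospacer = off_target[: len(spacer)]
    return [
        i + 1
        for i, (a, b) in enumerate(zip(spacer.upper(), protospacer.upper()))
        if a != b
    ]


spacer = "CTTCTCTGTGACCTTGTTAC"
off_target = "CTTCaCTGTGACgTTGccACTGG"

positions = mismatch_positions(spacer, off_target)
pam = off_target[len(spacer):].upper()
print(len(positions), positions, pam)
```

For the example row this gives four mismatches, in agreement with the table.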
Base Editing sgRNA Predictions
- sgRNA/${EditLocation}_Guides.tsv
- sgRNA/${EditLocation}_OffTarget.tsv
- accompanying IGV files:
- IGVbed/${EditLocation}_sgRNA.bed
- IGVbed/${EditLocation}_offtargets.bed
Column Name | Description | Example |
Spacer_Sequence | Predicted guide RNAs | CGGAACGTCTCGAAGCGCTC |
PAM_Sequence | PAM sequence | ACGC |
Chromosome | Chromosome location of the Spacer | chr17 |
Spacer_Start | Start location of the Spacer | 7670683 |
Spacer_End | End Location of the Spacer | 7670707 |
Strand | Direction relative to target sequence | + |
PAM Score | Higher score means better binding efficiency (Pranam score: experimentally based score of the PAM sequence) | -5.2 |
EditLocation | Specific location of the edit being made | chr17_7670690 |
EnzymeName | The column name is the enzyme to use; the value is the number of off-targets. NA means not usable. | 0 |
MM0 | Number of off-targets with 0 mismatches | 0 |
MM1 | Number of off-targets with 1 mismatch | 0 |
MM2 | Number of off-targets with 2 mismatches | 8 |
MM3 | Number of off-targets with 3 mismatches | 23 |
selfHairpin | Presence of a self-hairpin | FALSE |
backboneHairpin | Presence of a backbone-hairpin | FALSE |
HomopolymerA | Presence of 4 or more repeating A | FALSE |
HomopolymerC | Presence of 4 or more repeating C | FALSE |
HomopolymerG | Presence of 4 or more repeating G | FALSE |
HomopolymerT | Presence of 4 or more repeating T | FALSE |
startingGGGGG | Does the Spacer start with repeating G | FALSE |
EcoRI | Restriction enzyme binding | FALSE |
KpnI | Restriction enzyme binding | FALSE |
BsmBI | Restriction enzyme binding | FALSE |
BbsI | Restriction enzyme binding | FALSE |
PacI | Restriction enzyme binding | FALSE |
Column Name | Description | Example |
Spacer_Sequence | Predicted Spacer | CTTCTCTGTGACCTTGTTAC |
OffTarget_Sequence | nucleotide sequence of the off-target | CTTCaCTGTGACgTTGccACTGG |
Mismatches | Number of mismatches between target and off-target | 4 |
CFD_Score | Cutting Frequency Determination | 0.02 |
MIT_Score | MIT Specificity Score | 0.02 |
Chromosome | Chromosome location of the off-target sequence | chr1 |
Start | Start location of the off-target sequence | 10837397 |
End | End location of the off-target sequence | 10837419 |
Strand | Direction relative to off-target sequence | + |
PAM_Sequence | PAM sequence of the off-target | NGG |
PAM_Location | Start location of the PAM sequence | 10837419 |
Canonical_PAM | Is this PAM sequence the highest ranked PAM | TRUE |
Cutsite_Location | Location of the off-target cutsite | 10837416 |
Prime Editing sgRNA Predictions
- pegRNA/${EditLocation}_pegRNA_Pridict_full.csv
- pegRNA/${EditLocation}_nicking_guides.csv
Column Name | Description | Example |
Original_Sequence | Original Input Sequence | AGTAGATGCGCGGGGCGCTAGAGTCGATTAGAGTACGTGCTAGCTAGCTAGCGGGCTAC |
Edited-Sequences | Sequence after Editing | GTCGGCGTGgctgcttGtgcgggctgTGAACAGCATGCTAGCTAGGTCGATCC |
Target-Strand | Strand/Direction | + |
Mutation_Type | Type of Edit made | 1bpReplacement |
Correction_Type | Mode of edit | Replacement |
Correction_Length | length of edit sequence | 10 |
Editing_Position | Position in pegRNA sequence where edit occurs | 11 |
PBSlength | Length of PBS sequence | 11 |
RToverhanglength | Length of RTT overhang | 7 |
RTlength | Length of RTT | 19 |
EditedAllele | Allele after the edit | C |
OriginalAllele | Original allele before the edit | T |
Protospacer-Sequence | pegRNA spacer sequence | GTCATGTCCTTTATCAAGTT |
PBSrevcomp13bp | PBS sequence inside the extension | TTGATAAAGGA |
RTseqoverhangrevcomp | RTT overhang sequence inside extension seq | AAATGAC |
RTrevcomp | RTT sequence inside extension seq | AAATGACgATGATCCAAAC |
Protospacer-Oligo-FW | Forward strand spacer sequence with cloning oligos | CACCGTCATGTCCTTTATCAAGTTGTTTC |
Protospacer-Oligo-RV | Reverse strand spacer sequence with cloning oligos | CTCTGAAACAACTTGATAAAGGACATGAC |
Extension-Oligo-FW | Forward strand extension sequence with cloning oligo | GTGCAAATGACgATGATCCAAACTTGATAAAGGA |
Extension-Oligo-RV | Reverse strand extension sequence with cloning oligo | AAAATCCTTTATCAAGTTTGGATCATcGTCATTT |
pegRNA | full pegRNA sequence | GTCATGTCCTTTATCAAGTTGTTTCAGAGCTATGCTGGAAACAGCATAGCAAGTTGAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAAATGACgATGATCCAAACTTGATAAAGGA |
Editor_Variant | Type of editor enzyme | PE2-NGG |
protospacermt | Melting temperature of the spacer sequence | 54.0 |
extensionmt | Melting temperature of the extension sequence | 80.0 |
RTmt | Melting temperature of the RTT sequence | 52.0 |
RToverhangmt | Melting temperature of the RTT overhang sequence | 18.0 |
PBSmt | Melting temperature of the PBS sequence | 28.0 |
MFE_* | Minimum Free Energy | -37.0 |
PRIDICT_editing_Score_deep | Editing score | 75.68932 |
PRIDICT_unintended_Score_deep | Unintended edits | 2.84563 |
Column Name | Description | Example |
Nicking-Protospacer | Nicking guide sequence | TAAGGAGATCATTTCCCTG |
Nicking-Position-to-edit | Distance from nick to edit | -40 |
PE3b | PE3b capability | No_PE3b |
Nicking-PAMdisrupt | Is the PAM disrupted | No_nicking_PAM_disrupt |
Target_Strand | Strand location | Fw |
DeepCas9score | Nicking efficiency score for SpCas9 | 59.73 |
Nicking-Proto-Oligo-FW | Forward strand nicking spacer with cloning adapters | caccgTAAGGAGATCATTTCCCTG |
Nicking-Proto-Oligo-RV | Reverse strand nicking spacer with cloning adapters | aaacCAGGGAAATGATCTCCTTAc |
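The example values in the pegRNA table fit together in a way that is easy to verify: the full pegRNA reads as spacer, then scaffold, then extension, where the extension is the RTT followed by the PBS. The sketch below splits the example pegRNA on that basis. It also checks that the PBS is the reverse complement of the spacer bases just 5' of the nick, assuming the usual SpCas9 nick 3 nt upstream of the PAM (after spacer base 17). Function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]


def split_pegrna(pegrna, spacer, rtt, pbs):
    """Split a pegRNA into (spacer, scaffold, extension)."""
    extension = rtt + pbs
    if not (pegrna.startswith(spacer) and pegrna.endswith(extension)):
        raise ValueError("pegRNA does not match spacer + scaffold + RTT + PBS")
    scaffold = pegrna[len(spacer): len(pegrna) - len(extension)]
    return spacer, scaffold, extension


# Example values copied from the table above.
spacer = "GTCATGTCCTTTATCAAGTT"
rtt = "AAATGACgATGATCCAAAC"
pbs = "TTGATAAAGGA"
pegrna = (
    "GTCATGTCCTTTATCAAGTTGTTTCAGAGCTATGCTGGAAACAGCATAGCAAGTTGAAATAAGGCTAGTCC"
    "GTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAAATGACgATGATCCAAACTTGATAAAGGA"
)

_, scaffold, extension = split_pegrna(pegrna, spacer, rtt, pbs)
# Assumed nick position: after spacer base 17 (3 nt upstream of the PAM).
nick = len(spacer) - 3
pbs_from_spacer = reverse_complement(spacer[nick - len(pbs): nick])
print(scaffold)
print(pbs_from_spacer == pbs)
```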
Knockout sgRNA Predictions
- sgRNA/${EditLocation}_sgRNAdata.tsv
- sgRNA/${EditLocation}_OffTargetdata.tsv
- accompanying IGV files:
- IGVbed/${EditLocation}_sgRNA.bed
- IGVbed/${EditLocation}_offtargets.bed
Column Name | Description | Example |
Name | Spacer name | spacer_1 |
Chromosome | Chromosome location of the spacer | chr7 |
Spacer_Start | Start location of the spacer | 142353300 |
Spacer_End | End location of the spacer | 142353320 |
Strand | Direction relative to target sequence | + |
Spacer_Sequence | Predicted guide RNAs | GTGATCGCTTCTCTGCAGAG |
PAM_sequence | PAM sequence | AGG |
PAM_Location | Start location of the PAM | 142353320 |
Cut_Site | Location of the cut site | 142353317 |
Efficiency_Score | Predicted Efficiency | 0.73 |
GC_Content | GC content of target sequence excluding PAM | 55 |
MM0 | Number of off-targets with 0 mismatches | 0 |
MM1 | Number of off-targets with 1 mismatch | 0 |
MM2 | Number of off-targets with 2 mismatches | 8 |
MM3 | Number of off-targets with 3 mismatches | 23 |
selfHairpin | Presence of a self-hairpin | FALSE |
backboneHairpin | Presence of a backbone-hairpin | FALSE |
HomopolymerA | Presence of 4 or more repeating A | FALSE |
HomopolymerC | Presence of 4 or more repeating C | FALSE |
HomopolymerG | Presence of 4 or more repeating G | FALSE |
HomopolymerT | Presence of 4 or more repeating T | FALSE |
startingGGGGG | Does the Spacer start with repeating G | FALSE |
EcoRI | Restriction enzyme binding | FALSE |
KpnI | Restriction enzyme binding | FALSE |
BsmBI | Restriction enzyme binding | FALSE |
BbsI | Restriction enzyme binding | FALSE |
PacI | Restriction enzyme binding | FALSE |
BsaI | Restriction enzyme binding | FALSE |
Column Name | Description | Example |
Spacer_Sequence | Predicted Spacer | CTTCTCTGTGACCTTGTTAC |
OffTarget_Sequence | nucleotide sequence of the off-target | CTTCaCTGTGACgTTGccACTGG |
Mismatches | Number of mismatches between target and off-target | 4 |
CFD_Score | Cutting Frequency Determination | 0.02 |
MIT_Score | MIT Specificity Score | 0.02 |
Chromosome | Chromosome location of the off-target sequence | chr1 |
Start | Start location of the off-target sequence | 10837397 |
End | End location of the off-target sequence | 10837419 |
Strand | Direction relative to off-target sequence | + |
PAM_Sequence | PAM sequence of the off-target | NGG |
PAM_Location | Start location of the PAM sequence | 10837419 |
Canonical_PAM | Is this PAM sequence the highest ranked PAM | TRUE |
Cutsite_Location | Location of the off-target cutsite | 10837416 |
PCR Primer Design
- Primers/${EditLocation}.for
- Primers/${EditLocation}.int
- Primers/${EditLocation}.rev
Column Name | Description | Example |
sequence | Primer sequence | GCAGTCCCACCACCACTC |
1-based start | Start location of primer sequence | 10837297 |
ln | Length of primer sequence | 18 |
# N | Number of N nucleotides | 0 |
GC% | Percent GC | 66.67 |
Tm | Melting temperature | 59.967 |
self any-th | General reactivity | 0.00 |
self end_th | End reactivity | 0.00 |
hairpin | Hairpin prediction | 0.00 |
quality | quality score | 0.033 |
Workflow Walkthrough
- Navigate to the DR GENE workflow on the Form Bio platform. You can search for this workflow using the bar at the top-right corner or by selecting the Genome Editing or Candidate Validation filters on the left-hand side.
- Select the version from the dropdown menu in the top right corner. Take a moment to review some information about the workflow analysis, inputs, and outputs. When ready to begin, click “Run Workflow”.
- Select one of three functions, either Genome Editing, Knock-Out, or Genome-wide Offtarget Search. Depending on your choice, you will be asked to tune certain parameters about the type of experiment to design guides for. All three editing strategies are checked by default. Specify target genome and target genome version. Choose the nuclease that was used in the experiment. Also determine how you want the enzyme efficiency to be scored. Most efficiency scores are trained on specific enzymes in specific organisms. We attempt to account for this variance and weigh each score differently.
- Provide the necessary input file for your experiment. For Genome Editing this will be an edit table, for Knock-Out this will be a list of genes, and for Genome-wide Offtarget Search this will be a spacer sequence list.
- For guide design functions, you’ll be asked to tune some parameters related to PCR primers. In most cases, defaults are desirable.
- For some genome editing algorithms, you’ll be asked to tune some additional parameters related to guide design for that specific algorithm.
- Give your workflow a unique name, and take a moment to review the chosen inputs and parameters. When ready to begin, click “Run Workflow” to submit your analysis.
Results Walkthrough
- Locate your workflow run from the Activity tab, and select it.
- On this page, you can view a variety of information about the workflow run, including inputs, outputs, and parameters. To view the analysis, click Open Analysis in the top right corner.
- A new tab will open containing sequences of interest for all selected editing experiments. Use the tabs in the top-left corner to navigate.
Citations
- Hoberecht, L., Perampalam, P., Lun, A. & Fortin, J.-P. A comprehensive Bioconductor ecosystem for the design of CRISPR guide RNAs across nucleases and technologies. Nature Communications 13, 6568 (2022).
- Pagès, H. & Maduka ), P. C. BSgenome: Software infrastructure for efficient representation of full genomes and their SNPs. (2023) doi:10.18129/B9.bioc.BSgenome.
- Pagès, H. et al. Biostrings: Efficient manipulation of biological strings. (2023) doi:10.18129/B9.bioc.Biostrings.
- Untergasser, A. et al. Primer3: new capabilities and interfaces. Nucleic Acids Research 40, e115 (2012).
- Mathis, N. et al. Predicting prime editing efficiency and product purity by deep learning. Nature Biotechnology 1–9 (2023) doi:10.1038/s41587-022-01613-7.
- Walton, R. T., Christie, K. A., Whittaker, M. N. & Kleinstiver, B. P. Unconstrained genome targeting with near-PAMless engineered CRISPR-Cas9 variants. Science 368, 290–296 (2020).