Genome Editing

VALID

Verify Allogenic Loci in DNA

Version 1.6.1

Use Cases

Evaluate the outcome of an editing experiment. For validating Sanger sequencing, please use our sangerAnalyzeR tool in FormBench.

Summary and Methods

This workflow is designed to help the user validate editing experiments. Currently, validation is supported for two sequencing types: Targeted Amplicon Sequencing and Whole Genome Sequencing. Each supported sequencing type is described below.

Targeted Amplicon Sequencing

Methods

This analysis was performed using the Verify Allogenic Loci in DNA workflow on the Form Bio platform. This workflow examines the efficiency of gene editing techniques on sequence data using CRISPResso2 (Clement et al., Nature Biotechnology 37, 224–226, 2019). CRISPResso2 is a software pipeline for analyzing genome editing outcomes from deep sequencing data. It aligns sequencing reads to a reference amplicon sequence, quantifies insertions, deletions, and substitutions around the expected edit site, and produces summary plots and allele frequency tables. It supports the analysis of nuclease, base editing, and prime editing experiments.

Whole Genome Sequencing

Methods

This analysis was performed using the Verify Allogenic Loci in DNA workflow on the Form Bio platform. This workflow examines the efficiency of gene editing techniques on sequence data using variant callers. When the data type is “Somatic”, the workflow identifies genetic variants in tumor NGS data relative to a supported reference genome. When a normal sample is provided, specialized somatic variant calling methods are applied, allowing users to filter germline variants from the resulting VCF files. Reads are trimmed using TrimGalore [2] to remove low-quality read ends (quality < 25) and to discard reads shorter than 35 bp. Trimmed reads are aligned to a reference genome using BWA-MEM [1] or Minimap2 [3]. Duplicate reads are marked using Picard MarkDuplicates [4]. Somatic variants are detected in somatic (tumor with matched normal) or tumor-only mode using Strelka2 [5], FreeBayes [6], and Mutect2 [7]. To increase the speed of analysis, Parabricks-optimized versions of BWA-MEM, alignment QC, and GATK4 are used. Quality reports are produced by MultiQC. When a matched normal sample is present, tumor/normal sample identity is confirmed from germline SNPs using NGSCheckMate [8], and microsatellite instability is assessed using MSI-Sensor [9]. Variant effects are determined using SnpEff [10].
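
The trimming thresholds quoted above can be illustrated with a minimal Python sketch. This is not TrimGalore's code: it assumes the partial-sum 3' quality-trimming rule used by BWA and Cutadapt, and the function names `trim_3prime` and `keep_read` are hypothetical.

```python
def trim_3prime(seq, quals, cutoff=25):
    """Trim the low-quality 3' end of a read (partial-sum rule).

    quals holds one integer Phred score per base of seq.
    """
    running = 0
    lowest = 0
    cut = len(seq)
    # Walk from the 3' end and remember where the running sum of
    # (quality - cutoff) reaches its minimum; stop once it turns positive.
    for i in range(len(quals) - 1, -1, -1):
        running += quals[i] - cutoff
        if running > 0:
            break
        if running < lowest:
            lowest = running
            cut = i
    return seq[:cut], quals[:cut]


def keep_read(seq, min_length=35):
    """Discard reads that are shorter than min_length after trimming."""
    return len(seq) >= min_length
```

For a 40 bp read whose last four bases have quality 10 and whose remaining bases have quality 30, the sketch trims the read to 36 bp and keeps it.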

Inputs

Experiment Design

Type of Sequencing: Sanger or NGS
Input Folder: Folder containing sequencing files

Additional Inputs for NGS

Editing Strategy: Homology Directed Repair, Base Editing, Prime Editing
Batch File

This is a tab-delimited file containing the information for each batch run.
Required columns: name, amplicon_seq
Knockout/HDR columns: guide_seq, expected_hdr_amplicon_seq
Base-editing columns: guide_seq, conversion_nuc_from, conversion_nuc_to
Prime-editing columns: prime_editing_pegRNA_spacer_seq, prime_editing_pegRNA_extension_seq, prime_editing_pegRNA_scaffold_seq, prime_editing_nicking_guide_seq
PASTE columns: prime_editing_pegRNA_spacer_seq, prime_editing_pegRNA_extension_seq, prime_editing_pegRNA_scaffold_seq, prime_editing_nicking_guide_seq, prime_editing_override_prime_edited_ref_seq
Additional columns can be added to include extra information. Please see the CRISPResso2 manual for more parameter options.

name	guide_seq	amplicon_seq
sample1	CCTCGTGACCACCCTGACCTA	GAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACA
sample2	GCTGAAGCACTGCACGCCGT	GAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACA
sample3	GCTGAAGCACTGCACGCCGT	GAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACA
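
Before submitting a run, a batch file can be checked locally. The sketch below is a hedged example, not part of the workflow: the strategy keys (`hdr`, `base_editing`, `prime_editing`) are hypothetical labels, the column names are taken from the list above, and the check that a guide occurs in the amplicon on either strand is an assumption of this sketch.

```python
import csv
import io

REQUIRED = {"name", "amplicon_seq"}
# Keys are hypothetical labels; column names come from the list above.
STRATEGY_COLUMNS = {
    "hdr": {"guide_seq", "expected_hdr_amplicon_seq"},
    "base_editing": {"guide_seq", "conversion_nuc_from", "conversion_nuc_to"},
    "prime_editing": {
        "prime_editing_pegRNA_spacer_seq",
        "prime_editing_pegRNA_extension_seq",
        "prime_editing_pegRNA_scaffold_seq",
        "prime_editing_nicking_guide_seq",
    },
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def missing_columns(batch_text, strategy):
    """Return the sorted column names the batch file header lacks."""
    reader = csv.DictReader(io.StringIO(batch_text), delimiter="\t")
    header = set(reader.fieldnames or [])
    return sorted((REQUIRED | STRATEGY_COLUMNS[strategy]) - header)


def guide_in_amplicon(guide, amplicon):
    """True if the guide matches the amplicon on either strand."""
    guide, amplicon = guide.upper(), amplicon.upper()
    reverse_complement = guide.translate(COMPLEMENT)[::-1]
    return guide in amplicon or reverse_complement in amplicon
```

Calling `missing_columns(text, "base_editing")` on a file with only `name`, `guide_seq`, and `amplicon_seq` reports the two conversion columns as missing.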

Window Size

Number of bp upstream and downstream from the quantification window center.
Only mutations within this range will be used to classify reads.
Setting quantification window size to 0 sets the window to search the entire amplicon.

Window Center

Center of the quantification window with respect to 3' end of sgRNA.
Predicted cleavage position.
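
How Window Center and Window Size combine can be sketched as follows. This is an illustration under stated assumptions, not the workflow's implementation: the guide is assumed to lie on the forward strand of the amplicon, coordinates are 0-based, and the defaults of -3 and 1 are typical Cas9 settings assumed for this sketch rather than values taken from this page.

```python
def quantification_window(amplicon, guide, center=-3, size=1):
    """Return the inclusive 0-based (start, end) of the quantification window."""
    if size == 0:
        # A size of 0 means the entire amplicon is searched.
        return 0, len(amplicon) - 1
    guide_start = amplicon.find(guide)
    if guide_start < 0:
        raise ValueError("guide not found on the forward strand of the amplicon")
    # Index of the base immediately left of the window center,
    # counted from the 3' end of the guide.
    left_of_center = guide_start + len(guide) + center - 1
    start = max(0, left_of_center - size + 1)
    end = min(len(amplicon) - 1, left_of_center + size)
    return start, end
```

With a 20 nt guide starting at index 10, a center of -3 and a size of 1, the window covers indices 26 and 27, the two bases flanking the predicted cleavage position.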

Alignment Score Cutoff

Minimum identity percentage for alignment cutoffs.

Minimum/Maximum Overlap

Minimum and maximum values for the overlap of paired-end reads.

Outputs

HTML Report

Summary report to be viewed in a web browser.
Contains links to all supporting data and results.
Output plots and summary statistics

Output Folder

frequency tables
mapping statistics
quantification of editing frequency

Summary Tables

Special summaries of the allele information for quick viewing in Excel

PDF Reports

PDF versions of the HTML reports

Workflow Walkthrough

Navigate to the VALID workflow on the Form Bio platform. You can find the workflow using the search bar at the top right corner or by using the Genome Editing filter on the left-hand side.
Select the version from the dropdown menu. When ready to begin, click “Run Workflow”.

Select the type of sequencing to be analyzed, either Targeted Amplicon or Whole Genome. Remaining parameters may change based on your selection.

For Targeted Amplicon, select whether the sample is single input or pooled, and select the type of editing experiment performed.

For Whole Genome, provide the sequencer platform that was used to generate the data and select which analysis is to be performed.

Also provide the directory containing the files to be analyzed as well as an NGS attribute file (this table can be created within the workflow itself).

For Targeted Amplicon, tune additional parameters related to the workflow run. For most purposes, default parameters are optimal.

For Whole Genome, select a reference genome and annotation version for the workflow run. Then, tune additional parameters related to the workflow run. For most purposes, default parameters are optimal.

Give your workflow a unique name, and take a moment to review the chosen inputs and parameters. When ready to begin, click “Run Workflow” to submit your analysis.

Results Walkthrough

To view the results of your VALID workflow run, first find and select your run from the Activity tab of the Form Bio platform.
Once selected, press Open Analysis to view the results of your workflow run in a new tab. It might take a few moments to load the results viewer.

View the results of your analysis on the new page that opens. Use the left-hand sidebar to navigate.

Citations

Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv preprint arXiv:1303.3997 [q-bio.GN] (2013).
Krueger, F., James, F., Ewels, P., Afyounian, E. & Schuster-Boeckler, B. FelixKrueger/TrimGalore: V0.6.7 - DOI via Zenodo. (2021) doi:10.5281/ZENODO.5127899.
Li, H. Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Thomer, A. K., Twidale, M. B., Guo, J. & Yoder, M. J. Picard Tools. in Conference on Human Factors in Computing Systems - Proceedings (2016).
Kim, S. et al. Strelka2: Fast and accurate calling of germline and somatic variants. Nature Methods 15, 591–594 (2018).
Garrison, E. & Marth, G. Haplotype-based variant detection from short-read sequencing. (2012).
Benjamin, D. et al. Calling Somatic SNVs and Indels with Mutect2. (2019) doi:10.1101/861054.
Lee, S. et al. NGSCheckMate: Software for validating sample identity in Next-generation sequencing studies within and across data types. Nucleic Acids Research 45, e103 (2017).
Jia, P. et al. MSIsensor-pro: Fast, Accurate, and Matched-normal-sample-free Detection of Microsatellite Instability. Genomics, Proteomics and Bioinformatics 18, 65–71 (2020).
Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 6, 80–92 (2012).

DR GENE

Design and Rank Guides for Editing Nucleotides with Enzymes

Version 2.1.1

Use Cases

The user wishes to predict and design optimal single-guide RNA (sgRNA) sequences for making a knock-out or edit in a genome.

The user wishes to rank created guide sequences to determine efficacy for a particular experiment.
The user wishes to determine potential off-target effects for created guide sequences.

Summary and Methods

This workflow is designed to help the user predict and design the best single-guide RNA (sgRNA) sequences for making a knock-out or edit in a genome. Currently, sgRNA design is supported for two functions: Knockout Experiment and Genome Editing Experiment. The workflow can also be used to perform off-target searches. It is a new workflow that builds on the CRISPRank and CRISPR Knock-out workflows, adding functionality and optimized algorithms. Each function is described below.

Knockout Experiment

Summary

This workflow can help the user predict and design the best guide RNA sequences for use in CRISPR knockout experiments. The user will provide an input file that specifies genomic sequences or locations to be knocked out and will set parameters for the type of CRISPR enzyme they wish to create guides for. The user will receive several guide RNA sequences to choose from, PCR primers for confirming the knockout, and potential off-target effects. Results are viewable through an RShiny app with JBrowse for easy comparison.

Methods

This analysis was performed on the Form Bio platform using the DR GENE workflow. An input file was provided detailing locations or nucleotide sequences to be knocked out. Information about the CRISPR analysis to be performed, including the enzyme to design guides for, was provided to the workflow. Knock-out guides were predicted using R libraries crisprVerse [1], BSgenome [2], and Biostrings [3]. Primers were predicted using primer3 [4].

Genome Editing Experiment

Summary

This workflow can help the user predict and design the best guide RNA sequences for use in CRISPR gene editing experiments. The user will provide an input file that specifies genomic edits to be performed and will set parameters for the type of CRISPR edit they wish to create guides for. The user will receive several guide RNA sequences to choose from, PCR primers for confirming the edit, and potential off-target effects. Results are viewable through an RShiny app with JBrowse for easy comparison.

Methods

This analysis was performed on the Form Bio platform using the DR GENE workflow. An input file was provided detailing locations and nucleotide edits to be made. Information about the CRISPR analysis to be performed, including the genome editing strategies to design guides for, was provided to the workflow. Homology-Directed Repair (HDR) guides were predicted using R libraries crisprVerse [1], BSgenome [2], and Biostrings [3]. Prime Editing guides were predicted using PRIDICT [5]. Base Editing guides were predicted using a proprietary Form algorithm with experimentally derived scoring of PAM efficiency [6] denoted as the Chatterjee Score. Primers were predicted using primer3 [4].

Off-Target Search

Summary

This workflow can help the user search for off-targets in a given genome. The user will specify a genome and provide a list of spacers to search, and will be shown all off-target effects in the genome.

Methods

This analysis was performed on the Form Bio platform using the DR GENE workflow. An input file was provided detailing spacers to search the genome for. Information about the CRISPR analysis to be performed, including the genome editing strategies to design guides for, was provided to the workflow.

DR GENE vs CRISPRank Optimization Validation

Workflow Run	DR GENE	CRISPRank	Optimization
CRISPR Knockout of APOE from GRCh38	25m 6s	1-2 days	> 30x Faster
CRISPR Knockout of 10 genomic regions	16m 14s	4h 52m 25s	> 18x Faster
CRISPR Edit of 6 SNP edits	26m 9s	1h 12m 3s	> 2.5x Faster
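
The speed-up factors in the table can be re-derived from the listed run times. The short sketch below parses durations such as "4h 52m 25s" and divides them; the helper names are hypothetical, and the "1-2 days" entry is taken at its lower bound of one day (86,400 seconds), which is an assumption of this sketch.

```python
import re

UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def to_seconds(text):
    """Convert a duration such as '4h 52m 25s' to seconds."""
    parts = re.findall(r"(\d+)\s*([hms])", text)
    return sum(int(number) * UNIT_SECONDS[unit] for number, unit in parts)


def speedup(old, new):
    """Ratio of the older run time to the newer run time."""
    return to_seconds(old) / to_seconds(new)


region_speedup = speedup("4h 52m 25s", "16m 14s")   # 10 genomic regions
snp_speedup = speedup("1h 12m 3s", "26m 9s")        # 6 SNP edits
apoe_speedup = 86400 / to_seconds("25m 6s")         # lower bound: 1 day
```

The three ratios come out at roughly 18, 2.8, and 57, consistent with the "> 18x", "> 2.5x", and "> 30x" figures in the table.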

Inputs

Experiment Design

Type of Experiment: Knockout, Genome Editing, or Off-target Search
Editing Strategies: Knockout, Homology Directed Repair, Base Editing, Prime Editing 3 and 3b
Organism: Reference Genome used for alignment
Reference Genome Annotation: Annotation that should be used for determining gene and transcript counts
Nuclease: Cutting Enzyme
Efficiency Score: Filtered based on the Nuclease selection

Input

Gene List: List of gene symbols
Genomic Regions List: BED file of genomic regions
Edit Table: Table of the locations and edits to be made
Sequence: Target and Edited sequence inputs

Adapter Sequences

Adapter caps for oligo sequences.

Additional Inputs for PCR Primer Design

Forward Primer Window

Window to filter forward primers.
Distance upstream from edit location to forward primer.
default: 90-110

Reverse Primer Window

Window to filter reverse primers.
Distance downstream from edit location to reverse primer.
default: 130-150

Size Restrictions
Temperature Restrictions
Length Restrictions
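
The primer windows can be read as distance filters around the edit location. The sketch below illustrates this with the default windows listed above. It assumes distances are measured from the edit position to each primer's start coordinate, which this page does not specify, and the function names are hypothetical.

```python
def in_window(distance, window):
    """True if distance lies within the inclusive (low, high) window."""
    low, high = window
    return low <= distance <= high


def filter_primers(edit_pos, forward_starts, reverse_starts,
                   fwd_window=(90, 110), rev_window=(130, 150)):
    """Keep forward primers upstream and reverse primers downstream
    of the edit whose distance falls inside the respective window."""
    forward = [p for p in forward_starts
               if in_window(edit_pos - p, fwd_window)]
    reverse = [p for p in reverse_starts
               if in_window(p - edit_pos, rev_window)]
    return forward, reverse
```

For an edit at position 1000, forward candidates at 900 and 910 pass the default window, while candidates at 880 or 950 are filtered out.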

Additional Inputs for Homology Directed Repair

Donor Sequence Information

Length
ssODN or dsDNA

Additional Inputs for Knockout

Genomic Window

Length upstream/downstream

Outputs

Homology Directed Repair sgRNA Predictions

sgRNA/${EditLocation}_sgRNAdata.tsv

Column Name	Description	Example
Name	Spacer name	spacer_1
Chromosome	Chromosome location of the spacer	chr7
Spacer_Start	Start location of the spacer	142353300
Spacer_End	End location of the spacer	142353320
Strand	Direction relative to target sequence	+
Spacer_Sequence	Predicted guide RNAs	GTGATCGCTTCTCTGCAGAG
PAM_sequence	PAM sequence	AGG
PAM_Location	Start location of the PAM	142353320
Cut_Site	Location of the cut site	142353317
Cut2EditDistance	Distance from Cut_Site to edit location	89
Efficiency_Score	Predicted Efficiency	0.73
GC_Content	GC content of target sequence excluding PAM	55
MM0	Number of off-targets with 0 mismatches	0
MM1	Number of off-targets with 1 mismatch	0
MM2	Number of off-targets with 2 mismatches	8
MM3	Number of off-targets with 3 mismatches	23
selfHairpin	Presence of a self-hairpin	FALSE
backboneHairpin	Presence of a backbone-hairpin	FALSE
HomopolymerA	Presence of 4 or more repeating A	FALSE
HomopolymerC	Presence of 4 or more repeating C	FALSE
HomopolymerG	Presence of 4 or more repeating G	FALSE
HomopolymerT	Presence of 4 or more repeating T	FALSE
startingGGGGG	Does the Spacer start with repeating G	FALSE
EcoRI	Restriction enzyme binding	FALSE
KpnI	Restriction enzyme binding	FALSE
BsmBI	Restriction enzyme binding	FALSE
BbsI	Restriction enzyme binding	FALSE
PacI	Restriction enzyme binding	FALSE
BsaI	Restriction enzyme binding	FALSE
Donor_Start	Start location of the HDR sequence (for easy amplicon sequencing)	142353100
Donor_End	End location of the HDR sequence (for easy amplicon sequencing)	142353522
Donor_Sequence	Nucleotide sequence of the HDR containing sgRNA (for easy amplicon sequencing)
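
One possible way to shortlist guides from this table is sketched below. The ranking rule is an assumption made for illustration (drop guides with perfect-match off-targets, then prefer higher `Efficiency_Score` and a cut site closer to the edit); it is not the workflow's own ranking, and the rows are invented toy data using a subset of the documented columns.

```python
import csv
import io


def rank_guides(tsv_text, max_mm1=0):
    """Keep guides with no 0-mismatch off-targets and at most max_mm1
    1-mismatch off-targets, then sort them best-first."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = [r for r in rows
            if int(r["MM0"]) == 0 and int(r["MM1"]) <= max_mm1]
    # Higher efficiency first; ties go to the cut site closest to the edit.
    return sorted(kept, key=lambda r: (-float(r["Efficiency_Score"]),
                                       abs(int(r["Cut2EditDistance"]))))


# Invented toy rows; only the column names come from the table above.
EXAMPLE_ROWS = [
    ["Name", "Efficiency_Score", "Cut2EditDistance", "MM0", "MM1"],
    ["spacer_1", "0.73", "89", "0", "0"],
    ["spacer_2", "0.81", "12", "0", "2"],
    ["spacer_3", "0.73", "15", "0", "0"],
]
EXAMPLE_TSV = "\n".join("\t".join(row) for row in EXAMPLE_ROWS)
ranked_names = [r["Name"] for r in rank_guides(EXAMPLE_TSV)]
```

In the toy data, `spacer_2` is excluded because of its single-mismatch off-targets, and `spacer_3` ranks ahead of `spacer_1` because its cut site is closer to the edit.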

sgRNA/${EditLocation}_OffTargetdata.tsv

Column Name	Description	Example
Spacer_Sequence	Predicted Spacer	CTTCTCTGTGACCTTGTTAC
OffTarget_Sequence	nucleotide sequence of the off-target	CTTCaCTGTGACgTTGccACTGG
Mismatches	Number of mismatches between target and off-target	4
CFD_Score	Cutting Frequency Determination	0.02
MIT_Score	MIT Specificity Score	0.02
Chromosome	Chromosome location of the off-target sequence	chr1
Start	Start location of the off-target sequence	10837397
End	End location of the off-target sequence	10837419
Strand	Direction relative to off-target sequence	+
PAM_Sequence	PAM sequence of the off-target	NGG
PAM_Location	Start location of the PAM sequence	10837419
Canonical_PAM	Is this PAM sequence the highest ranked PAM	TRUE
Cutsite_Location	Location of the off-target cutsite	10837416
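
The `Mismatches` value in the example row can be reproduced by comparing the spacer with the off-target sequence. The sketch below assumes, based on the example, that the off-target sequence is the protospacer followed by a 3' PAM and that lowercase letters mark mismatched positions. The function names are hypothetical.

```python
def count_mismatches(spacer, off_target):
    """Count positions where the spacer differs from the off-target
    protospacer (the PAM at the 3' end is ignored)."""
    protospacer = off_target[:len(spacer)]
    return sum(a != b for a, b in zip(spacer.upper(), protospacer.upper()))


def lowercase_positions(off_target):
    """0-based positions flagged as mismatches by lowercase letters."""
    return [i for i, base in enumerate(off_target) if base.islower()]


example_count = count_mismatches("CTTCTCTGTGACCTTGTTAC",
                                 "CTTCaCTGTGACgTTGccACTGG")
```

For the example row, the count is 4, and the lowercase positions in the off-target sequence coincide with the mismatched positions.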

accompanying IGV files:

IGVbed/${EditLocation}_sgRNA.bed
IGVbed/${EditLocation}_Donorsequences.bed
IGVbed/${EditLocation}_offtargets.bed

Base Editing sgRNA Predictions

sgRNA/${EditLocation}_Guides.tsv

Column Name	Description	Example
Spacer_Sequence	Predicted guide RNAs	CGGAACGTCTCGAAGCGCTC
PAM_Sequence	PAM sequence	ACGC
Chromosome	Chromosome location of the Spacer	chr17
Spacer_Start	Start location of the Spacer	7670683
Spacer_End	End Location of the Spacer	7670707
Strand	Direction relative to target sequence	+
PAM Score	Experimentally derived score of the PAM sequence (Chatterjee Score); higher values mean better binding efficiency	-5.2
EditLocation	Specific location of the edit being made	chr17_7670690
EnzymeName	The column name is the enzyme to use; the value is the number of off-targets. NA means the enzyme is not usable.	0
MM0	Number of off-targets with 0 mismatches	0
MM1	Number of off-targets with 1 mismatch	0
MM2	Number of off-targets with 2 mismatches	8
MM3	Number of off-targets with 3 mismatches	23
selfHairpin	Presence of a self-hairpin	FALSE
backboneHairpin	Presence of a backbone-hairpin	FALSE
HomopolymerA	Presence of 4 or more repeating A	FALSE
HomopolymerC	Presence of 4 or more repeating C	FALSE
HomopolymerG	Presence of 4 or more repeating G	FALSE
HomopolymerT	Presence of 4 or more repeating T	FALSE
startingGGGGG	Does the Spacer start with repeating G	FALSE
EcoRI	Restriction enzyme binding	FALSE
KpnI	Restriction enzyme binding	FALSE
BsmBI	Restriction enzyme binding	FALSE
BbsI	Restriction enzyme binding	FALSE
PacI	Restriction enzyme binding	FALSE

sgRNA/${EditLocation}_OffTarget.tsv

Column Name	Description	Example
Spacer_Sequence	Predicted Spacer	CTTCTCTGTGACCTTGTTAC
OffTarget_Sequence	nucleotide sequence of the off-target	CTTCaCTGTGACgTTGccACTGG
Mismatches	Number of mismatches between target and off-target	4
CFD_Score	Cutting Frequency Determination	0.02
MIT_Score	MIT Specificity Score	0.02
Chromosome	Chromosome location of the off-target sequence	chr1
Start	Start location of the off-target sequence	10837397
End	End location of the off-target sequence	10837419
Strand	Direction relative to off-target sequence	+
PAM_Sequence	PAM sequence of the off-target	NGG
PAM_Location	Start location of the PAM sequence	10837419
Canonical_PAM	Is this PAM sequence the highest ranked PAM	TRUE
Cutsite_Location	Location of the off-target cutsite	10837416

accompanying IGV files:

IGVbed/${EditLocation}_sgRNA.bed
IGVbed/${EditLocation}_offtargets.bed

Prime Editing sgRNA Predictions

pegRNA/${EditLocation}_pegRNA_Pridict_full.csv

Column Name	Description	Example
Original_Sequence	Original Input Sequence	AGTAGATGCGCGGGGCGCTAGAGTCGATTAGAGTACGTGCTAGCTAGCTAGCGGGCTAC
Edited-Sequences	Sequence after Editing	GTCGGCGTGgctgcttGtgcgggctgTGAACAGCATGCTAGCTAGGTCGATCC
Target-Strand	Strand/Direction	+
Mutation_Type	Type of Edit made	1bpReplacement
Correction_Type	Mode of edit	Replacement
Correction_Length	length of edit sequence	10
Editing_Position	Position in pegRNA sequence where edit occurs	11
PBSlength	Length of PBS sequence	11
RToverhanglength	Length of RTT overhang	7
RTlength	Length of RTT	19
EditedAllele	Edited allele after the edit	C
OriginalAllele	Original allele before the edit	T
Protospacer-Sequence	pegRNA spacer sequence	GTCATGTCCTTTATCAAGTT
PBSrevcomp13bp	PBS sequence inside the extension	TTGATAAAGGA
RTseqoverhangrevcomp	RTT overhang sequence inside extension seq	AAATGAC
RTrevcomp	RTT sequence inside extension seq	AAATGACgATGATCCAAAC
Protospacer-Oligo-FW	Forward strand spacer sequence with cloning oligos	CACCGTCATGTCCTTTATCAAGTTGTTTC
Protospacer-Oligo-RV	Reverse strand spacer sequence with cloning oligos	CTCTGAAACAACTTGATAAAGGACATGAC
Extension-Oligo-FW	Forward strand extension sequence with cloning oligo	GTGCAAATGACgATGATCCAAACTTGATAAAGGA
Extension-Oligo-RV	Reverse strand extension sequence with cloning oligo	AAAATCCTTTATCAAGTTTGGATCATcGTCATTT
pegRNA	full pegRNA sequence	GTCATGTCCTTTATCAAGTTGTTTCAGAGCTATGCTGGAAACAGCATAGCAAGTTGAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAAATGACgATGATCCAAACTTGATAAAGGA
Editor_Variant	Type of editor enzyme	PE2-NGG
protospacermt	Melting temperature of spacer sequence	54.0
extensionmt	Melting temperature of extension sequence	80.0
RTmt	Melting temperature of RTT sequence	52.0
RToverhangmt	Melting temperature of RTT overhang sequence	18.0
PBSmt	Melting temperature of PBS sequence	28.0
MFE_*	Minimum Free Energy	-37.0
PRIDICT_editing_Score_deep	Editing score	75.68932
PRIDICT_unintended_Score_deep	Unintended edits	2.84563
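
The example row shows how the pegRNA columns relate to one another. The sketch below splits the example `pegRNA` into spacer, scaffold, RTT, and PBS, and rebuilds `Extension-Oligo-RV` by reverse-complementing the extension. The `AAAA` and `GTGC` overhangs are read off the example values and are assumptions about the cloning design, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]


# Values copied from the example row above.
SPACER = "GTCATGTCCTTTATCAAGTT"   # Protospacer-Sequence
RTT = "AAATGACgATGATCCAAAC"       # RTrevcomp
PBS = "TTGATAAAGGA"               # PBSrevcomp13bp
PEGRNA = ("GTCATGTCCTTTATCAAGTTGTTTCAGAGCTATGCTGGAAACAGCATAGCAAGTTGAAAT"
          "AAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC"
          "AAATGACgATGATCCAAACTTGATAAAGGA")


def split_pegrna(pegrna, spacer, rtt, pbs):
    """Return the scaffold left after removing the spacer (5' end)
    and the RTT + PBS extension (3' end)."""
    extension = rtt + pbs
    if not (pegrna.startswith(spacer) and pegrna.endswith(extension)):
        raise ValueError("pegRNA is not spacer + scaffold + RTT + PBS")
    return pegrna[len(spacer):len(pegrna) - len(extension)]


scaffold = split_pegrna(PEGRNA, SPACER, RTT, PBS)
extension_fw = "GTGC" + RTT + PBS
extension_rv = "AAAA" + reverse_complement(RTT + PBS)
```

Under these assumptions, `extension_fw` and `extension_rv` match the `Extension-Oligo-FW` and `Extension-Oligo-RV` values in the table, and the PBS and RTT lengths agree with `PBSlength` (11) and `RTlength` (19).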

pegRNA/${EditLocation}_nicking_guides.csv

Knockout sgRNA Predictions

sgRNA/${EditLocation}_sgRNAdata.tsv

Column Name	Description	Example
Name	Spacer name	spacer_1
Chromosome	Chromosome location of the spacer	chr7
Spacer_Start	Start location of the spacer	142353300
Spacer_End	End location of the spacer	142353320
Strand	Direction relative to target sequence	+
Spacer_Sequence	Predicted guide RNAs	GTGATCGCTTCTCTGCAGAG
PAM_sequence	PAM sequence	AGG
PAM_Location	Start location of the PAM	142353320
Cut_Site	Location of the cut site	142353317
Efficiency_Score	Predicted Efficiency	0.73
GC_Content	GC content of target sequence excluding PAM	55
MM0	Number of off-targets with 0 mismatches	0
MM1	Number of off-targets with 1 mismatch	0
MM2	Number of off-targets with 2 mismatches	8
MM3	Number of off-targets with 3 mismatches	23
selfHairpin	Presence of a self-hairpin	FALSE
backboneHairpin	Presence of a backbone-hairpin	FALSE
HomopolymerA	Presence of 4 or more repeating A	FALSE
HomopolymerC	Presence of 4 or more repeating C	FALSE
HomopolymerG	Presence of 4 or more repeating G	FALSE
HomopolymerT	Presence of 4 or more repeating T	FALSE
startingGGGGG	Does the Spacer start with repeating G	FALSE
EcoRI	Restriction enzyme binding	FALSE
KpnI	Restriction enzyme binding	FALSE
BsmBI	Restriction enzyme binding	FALSE
BbsI	Restriction enzyme binding	FALSE
PacI	Restriction enzyme binding	FALSE
BsaI	Restriction enzyme binding	FALSE
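
Several of the sequence-derived columns can be recomputed from the spacer alone. The sketch below reproduces `GC_Content` and the homopolymer flags for the example spacer, and shows that the example `Cut_Site` sits 3 bp before `PAM_Location` on the + strand. That offset is inferred from the example row only (the minus-strand convention is not stated here), and the function names are hypothetical.

```python
def gc_percent(spacer):
    """Percentage of G and C bases in the spacer (PAM excluded)."""
    spacer = spacer.upper()
    return 100 * sum(base in "GC" for base in spacer) / len(spacer)


def has_homopolymer(spacer, base, min_run=4):
    """True if the spacer contains min_run or more consecutive copies of base."""
    return base * min_run in spacer.upper()


def cut_site_plus_strand(pam_location, offset=3):
    """Cut site for a + strand guide, assuming the cut lies offset bp
    upstream of the PAM start (as in the example row)."""
    return pam_location - offset


EXAMPLE_SPACER = "GTGATCGCTTCTCTGCAGAG"
example_gc = gc_percent(EXAMPLE_SPACER)
example_flags = {base: has_homopolymer(EXAMPLE_SPACER, base) for base in "ACGT"}
example_cut = cut_site_plus_strand(142353320)
```

For the example spacer this gives a GC content of 55, no homopolymer runs, and a cut site of 142353317, matching the example column values.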

sgRNA/${EditLocation}_OffTargetdata.tsv

Column Name	Description	Example
Spacer_Sequence	Predicted Spacer	CTTCTCTGTGACCTTGTTAC
OffTarget_Sequence	nucleotide sequence of the off-target	CTTCaCTGTGACgTTGccACTGG
Mismatches	Number of mismatches between target and off-target	4
CFD_Score	Cutting Frequency Determination	0.02
MIT_Score	MIT Specificity Score	0.02
Chromosome	Chromosome location of the off-target sequence	chr1
Start	Start location of the off-target sequence	10837397
End	End location of the off-target sequence	10837419
Strand	Direction relative to off-target sequence	+
PAM_Sequence	PAM sequence of the off-target	NGG
PAM_Location	Start location of the PAM sequence	10837419
Canonical_PAM	Is this PAM sequence the highest ranked PAM	TRUE
Cutsite_Location	Location of the off-target cutsite	10837416

accompanying IGV files:

IGVbed/${EditLocation}_sgRNA.bed
IGVbed/${EditLocation}_offtargets.bed

PCR Primer Design

Primers/${EditLocation}.for
Primers/${EditLocation}.int
Primers/${EditLocation}.rev

Column Name	Description	Example
sequence	Primer sequence	GCAGTCCCACCACCACTC
1-based start	Start location of primer sequence	10837297
ln	Length of primer sequence	18
# N	Number of N nucleotides	0
GC%	Percent GC	66.67
Tm	Melting temperature	59.967
self any-th	General reactivity	0.00
self end_th	End reactivity	0.00
hairpin	Hairpin prediction	0.00
quality	quality score	0.033

Workflow Walkthrough

Navigate to the DR GENE workflow on the Form Bio platform. You can search for this workflow using the bar at the top-right corner or by selecting the Genome Editing or Candidate Validation filters on the left-hand side.

Select the version from the dropdown menu in the top right corner. Take a moment to review some information about the workflow analysis, inputs, and outputs. When ready to begin, click “Run Workflow”.

Select one of three functions: Genome Editing, Knock-Out, or Genome-wide Offtarget Search. Depending on your choice, you will be asked to set parameters for the type of experiment to design guides for. All three editing strategies are checked by default. Specify the target genome and the target genome version. Choose the nuclease for the experiment and how the enzyme efficiency should be scored. Most efficiency scores are trained on specific enzymes in specific organisms; the workflow attempts to account for this variance by weighting each score differently.

Provide the necessary input file for your experiment. For Genome Editing this will be an edit table, for Knock-Out this will be a list of genes, and for Genome-wide Offtarget Search this will be a spacer sequence list.

For guide design functions, you’ll be asked to tune some parameters related to PCR primers. In most cases, defaults are desirable.

For some genome editing algorithms, you’ll be asked to tune some additional parameters related to guide design for that specific algorithm.

Give your workflow a unique name, and take a moment to review the chosen inputs and parameters. When ready to begin, click “Run Workflow” to submit your analysis.

Results Walkthrough

Locate your workflow run from the Activity tab, and select it.
On this page, you can view a variety of information about the workflow run, including inputs, outputs, and parameters. To view the analysis, click Open Analysis in the top right corner.

A new tab will open containing sequences of interest for all selected editing experiments. Use the tabs in the top-left corner to navigate.

Citations

Hoberecht, L., Perampalam, P., Lun, A. & Fortin, J.-P. A comprehensive Bioconductor ecosystem for the design of CRISPR guide RNAs across nucleases and technologies. Nature Communications 13, 6568 (2022).
Pagès, H. & Maduka ), P. C. BSgenome: Software infrastructure for efficient representation of full genomes and their SNPs. (2023) doi:10.18129/B9.bioc.BSgenome.
Pagès, H. et al. Biostrings: Efficient manipulation of biological strings. (2023) doi:10.18129/B9.bioc.Biostrings.
Untergasser, A. et al. Primer3 - new capabilities and interfaces. Nucleic Acids Research 40, e115 (2012).
Mathis, N. et al. Predicting prime editing efficiency and product purity by deep learning. Nature Biotechnology 1–9 (2023) doi:10.1038/s41587-022-01613-7.
Walton, R. T., Christie, K. A., Whittaker, M. N. & Kleinstiver, B. P. Unconstrained genome targeting with near-PAMless engineered CRISPR-Cas9 variants. Science 368, 290–296 (2020).