✂️

Genome Editing

VALID

Verify Allogenic Loci in DNA

Version 1.6.1

Use Cases

Evaluate the outcome of an editing experiment. For validating Sanger sequencing, please use our sangerAnalyzeR tool in FormBench.

Summary and Methods

This workflow is designed to help the user validate editing experiments. Currently, validation is supported for two sequencing types: Targeted Amplicon Sequencing and Whole Genome Sequencing. Each sequencing type is described below.

Targeted Amplicon Sequencing

Methods

This analysis was performed using the Verify Allogenic Loci in DNA workflow on the Form Bio platform. This workflow examines the efficiency of gene editing techniques on sequence data using CRISPResso2. CRISPResso2 [1] is a software pipeline for analyzing genome editing outcomes from amplicon sequencing data. It aligns sequencing reads to the reference amplicon, quantifies insertions, deletions, and substitutions around the expected edit site, and reports editing frequencies for each sample. It supports batch analysis as well as knockout/HDR, base editing, and prime editing experiments.

Whole Genome Sequencing

Methods

This analysis was performed using the Verify Allogenic Loci in DNA workflow on the Form Bio platform. This workflow examines the efficiency of gene editing techniques on sequence data using variant callers. When the data type is “Somatic”, the workflow identifies genetic variants in tumor NGS data relative to a supported reference genome. When a normal sample is provided, specialized somatic variant calling methods are applied, allowing users to filter germline variants from the resulting VCF files. Reads are trimmed using TrimGalore [2] to remove low-quality read ends (quality < 25) and to discard reads shorter than 35 bp. Trimmed reads are aligned to a reference genome using BWA-MEM [1] or Minimap2 [3]. Duplicate reads are marked using Picard MarkDuplicates [4]. Somatic variants can be detected in somatic (tumor/normal) or tumor-only mode using Strelka2 [5], FreeBayes [6], and Mutect2 [7]. To speed up the analysis, Parabricks-optimized versions of BWA-MEM, alignment QC, and GATK4 are used. Quality reports are produced by MultiQC. When a matched normal sample is present, tumor/normal germline SNP matching is confirmed using NGSCheckMate [8] and microsatellite stability is assessed using MSIsensor-pro [9]. Variant effects are determined using SnpEff [10].
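
The trimming thresholds above (quality < 25, minimum length 35 bp) can be illustrated with a small sketch. This is not TrimGalore's code: it is a simplified Python version of the running-sum 3' quality trimming used by BWA and Cutadapt (the tool TrimGalore wraps). The function names `quality_trim_index` and `trim_read` are hypothetical and chosen only for this example.

```python
def quality_trim_index(qualities, cutoff=25):
    """Return the index at which to cut the 3' end of a read.

    Walks from the 3' end, summing (quality - cutoff). The read is cut
    where that running sum is lowest; the walk stops once the sum
    becomes positive.
    """
    running = 0
    lowest = 0
    cut = len(qualities)
    for i in range(len(qualities) - 1, -1, -1):
        running += qualities[i] - cutoff
        if running > 0:
            break
        if running < lowest:
            lowest = running
            cut = i
    return cut


def trim_read(seq, qualities, cutoff=25, min_length=35):
    """Trim the low-quality 3' end; return None if the read becomes too short."""
    cut = quality_trim_index(qualities, cutoff)
    if cut < min_length:
        return None  # read is discarded, as for reads < 35 bp in the workflow
    return seq[:cut], qualities[:cut]


# A 45 bp read whose last 5 bases have low quality is trimmed and kept.
kept = trim_read("A" * 45, [30] * 40 + [10] * 5)
# A 35 bp read with the same low-quality tail falls below 35 bp and is dropped.
dropped = trim_read("A" * 35, [30] * 30 + [10] * 5)
```

The running-sum approach tolerates an isolated good base inside a poor-quality tail, which a simple "cut at the first bad base" rule would not.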

Inputs

Experiment Design

  • Type of Sequencing: Sanger or NGS
  • Input Folder: Folder containing sequencing files

Additional Inputs for NGS

  • Editing Strategy: Homology Directed Repair, Base Editing, Prime Editing
  • Batch File
    • This is a tab delimited file containing the information for each batch run.
    • Required columns: name, amplicon_seq
    • Knockout/HDR columns: guide_seq, expected_hdr_amplicon_seq
    • Base-editing columns: guide_seq, conversion_nuc_from, conversion_nuc_to
    • Prime-editing columns: prime_editing_pegRNA_spacer_seq, prime_editing_pegRNA_extension_seq, prime_editing_pegRNA_scaffold_seq, prime_editing_nicking_guide_seq
    • PASTE columns: prime_editing_pegRNA_spacer_seq, prime_editing_pegRNA_extension_seq, prime_editing_pegRNA_scaffold_seq, prime_editing_nicking_guide_seq, prime_editing_override_prime_edited_ref_seq
    • Additional columns can be added to include extra information. Please see the CRISPResso2 manual for more parameter options.
    • Example:

      | name | guide_seq | amplicon_seq |
      | --- | --- | --- |
      | sample1 | CCTCGTGACCACCCTGACCTA | GAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACA |
      | sample2 | GCTGAAGCACTGCACGCCGT | GAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACA |
      | sample3 | GCTGAAGCACTGCACGCCGT | GAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACA |
  • Window Size
    • Number of bp upstream and downstream from the quantification window center.
    • Only mutations within this range will be used to classify reads.
    • Setting quantification window size to 0 sets the window to search the entire amplicon.
  • Window Center
    • Center of the quantification window with respect to 3' end of sgRNA.
    • Predicted cleavage position.
  • Alignment Score Cutoff
    • Minimum identity percentage an alignment must reach to pass the cutoff
  • Minimum/Maximum Overlap
    • Minimum and maximum values for overlaps of paired end reads
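
Before submitting a run, it can help to check the batch file against the column requirements listed above. The following is a minimal sketch using only the Python standard library. The column names come from this page; the strategy labels (`hdr`, `base_editing`, `prime_editing`), the function names, and the `bad_guide` row are assumptions made for illustration, and the amplicon is a short excerpt of the example amplicon above. Checking that each guide occurs in its amplicon (in either orientation) is an extra sanity check added here, not a documented workflow step.

```python
import csv
import io

REQUIRED = {"name", "amplicon_seq"}
# Hypothetical strategy labels mapped to the columns listed in this section.
STRATEGY_COLUMNS = {
    "hdr": {"guide_seq", "expected_hdr_amplicon_seq"},
    "base_editing": {"guide_seq", "conversion_nuc_from", "conversion_nuc_to"},
    "prime_editing": {
        "prime_editing_pegRNA_spacer_seq",
        "prime_editing_pegRNA_extension_seq",
        "prime_editing_pegRNA_scaffold_seq",
        "prime_editing_nicking_guide_seq",
    },
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def missing_columns(batch_text, strategy):
    """Columns required for the strategy that are absent from the header."""
    reader = csv.DictReader(io.StringIO(batch_text), delimiter="\t")
    header = set(reader.fieldnames or [])
    return sorted((REQUIRED | STRATEGY_COLUMNS[strategy]) - header)


def unmatched_guides(batch_text):
    """Names of rows whose guide is not found in the amplicon on either strand."""
    reader = csv.DictReader(io.StringIO(batch_text), delimiter="\t")
    bad = []
    for row in reader:
        guide = (row.get("guide_seq") or "").upper()
        amplicon = row["amplicon_seq"].upper()
        if guide and guide not in amplicon and revcomp(guide) not in amplicon:
            bad.append(row["name"])
    return bad


# Excerpt of the example amplicon, shortened for readability.
AMPLICON = "TGGCCCAC" + "CCTCGTGACCACCCTGACCTA" + "CGGCGTGCAGTGCTTCAGCCGCTAC"
rows = [
    ["name", "guide_seq", "amplicon_seq"],
    ["sample1", "CCTCGTGACCACCCTGACCTA", AMPLICON],
    ["sample2", "GCTGAAGCACTGCACGCCGT", AMPLICON],  # matches the reverse strand
    ["bad_guide", "AAAAAAAAAAAAAAAAAAAA", AMPLICON],  # hypothetical failing row
]
batch_text = "\n".join("\t".join(r) for r in rows) + "\n"

print(missing_columns(batch_text, "hdr"))
print(unmatched_guides(batch_text))
```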

Outputs

  • HTML Report
    • Summary report to be viewed in a web browser.
    • Contains links to all supporting data and results.
    • Output plots and summary statistics
  • Output Folder
    • frequency tables
    • mapping statistics
    • quantification of editing frequency
  • Summary Tables
    • Special summaries of the allele information for quick viewing in Excel
  • PDF Reports
    • PDF versions of the HTML reports

Workflow Walkthrough

  1. Navigate to the VALID workflow on the Form Bio platform. You can find the workflow using the search bar at the top right corner or by using the Genome Editing filter on the left-hand side.
  2. Select the version from the dropdown menu. When ready to begin, click “Run Workflow”.
  3. image
  4. Select the type of sequencing to be analyzed, either Targeted Amplicon or Whole Genome. Remaining parameters may change based on your selection.
    1. For Targeted Amplicon, select whether the sample is single input or pooled, and select the type of editing experiment performed.
    2. image
    3. For Whole Genome, provide the sequencer platform that was used to generate the data and select which analysis is to be performed.
    4. image

      Also provide the directory containing the files to be analyzed as well as an NGS attribute file (this table can be created within the workflow itself).

  5. For Targeted Amplicon, tune additional parameters related to the workflow run. For most purposes, default parameters are optimal.
  6. image
  7. For Whole Genome, select a reference genome and annotation version for the workflow run. Then, tune additional parameters related to the workflow run. For most purposes, default parameters are optimal.
  8. image
    image
  9. Give your workflow a unique name, and take a moment to review the chosen inputs and parameters. When ready to begin, click “Run Workflow” to submit your analysis.
  10. image

Results Walkthrough

  1. To view the results of your VALID workflow run, first find and select your run from the Activity tab of the Form Bio platform.
  2. Once selected, press Open Analysis to view the results of your workflow run in a new tab. It might take a few moments to load the results viewer.
  3. image
  4. View the results of your analysis on the new page that opens. Use the left-hand sidebar to navigate.
  5. image

Citations

  1. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv preprint arXiv:1303.3997 [q-bio.GN] (2013).
  2. Krueger, F., James, F., Ewels, P., Afyounian, E. & Schuster-Boeckler, B. FelixKrueger/TrimGalore: V0.6.7 - DOI via Zenodo. (2021) doi:10.5281/ZENODO.5127899.
  3. Li, H. Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
  4. Thomer, A. K., Twidale, M. B., Guo, J. & Yoder, M. J. Picard Tools. in Conference on Human Factors in Computing Systems - Proceedings (2016).
  5. Kim, S. et al. Strelka2: Fast and accurate calling of germline and somatic variants. Nature Methods 15, 591–594 (2018).
  6. Garrison, E. & Marth, G. Haplotype-based variant detection from short-read sequencing. (2012).
  7. Benjamin, D. et al. Calling Somatic SNVs and Indels with Mutect2. (2019) doi:10.1101/861054.
  8. Lee, S. et al. NGSCheckMate: Software for validating sample identity in Next-generation sequencing studies within and across data types. Nucleic Acids Research 45, e103 (2017).
  9. Jia, P. et al. MSIsensor-pro: Fast, Accurate, and Matched-normal-sample-free Detection of Microsatellite Instability. Genomics, Proteomics and Bioinformatics 18, 65–71 (2020).
  10. Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 6, 80–92 (2012).


DR GENE

Design and Rank Guides for Editing Nucleotides with Enzymes

Version 2.1.1

Use Cases

  • The user wishes to predict and design optimal single-guide RNA (sgRNA) sequences for making a knock-out or edit in a genome.
  • The user wishes to rank created guide sequences to determine efficacy for a particular experiment.
  • The user wishes to determine potential off-target effects for created guide sequences.

Summary and Methods

This workflow is designed to help the user predict and design the best single-guide RNA (sgRNA) sequences for making a knock-out or edit in a genome. Currently, sgRNA design is supported for two functions: Knockout Experiment and Genome Editing Experiment. This is a new workflow based on improvements made to the CRISPRank and CRISPR Knock-out workflows, with additional functionality and optimized algorithms. This workflow can also be used to perform off-target searches. Each function is described below.

Knockout Experiment

Summary

This workflow can help the user predict and design the best guide RNA sequences for use in CRISPR knockout experiments. The user will provide an input file that specifies genomic sequences or locations to be knocked out and will set parameters for the type of CRISPR enzyme they wish to create guides for. The user will receive several guide RNA sequences to choose from, PCR primers for confirming the knockout, and potential off-target effects. Results are viewable through an RShiny app with JBrowse for easy comparison.

Methods

This analysis was performed on the Form Bio platform using the DR GENE workflow. An input file was provided detailing locations or nucleotide sequences to be knocked out. Information about the CRISPR analysis to be performed, including the enzyme to design guides for, was provided to the workflow. Knock-out guides were predicted using R libraries crisprVerse [1], BSgenome [2], and Biostrings [3]. Primers were predicted using primer3 [4].

Genome Editing Experiment

Summary

This workflow can help the user predict and design the best guide RNA sequences for use in CRISPR gene editing experiments. The user will provide an input file that specifies genomic edits to be performed and will set parameters for the type of CRISPR edit they wish to create guides for. The user will receive several guide RNA sequences to choose from, PCR primers for confirming the edit, and potential off-target effects. Results are viewable through an RShiny app with JBrowse for easy comparison.

Methods

This analysis was performed on the Form Bio platform using the DR GENE workflow. An input file was provided detailing locations and nucleotide edits to be made. Information about the CRISPR analysis to be performed, including the genome editing strategies to design guides for, was provided to the workflow. Homology-Directed Repair (HDR) guides were predicted using R libraries crisprVerse [1], BSgenome [2], and Biostrings [3]. Prime Editing guides were predicted using PRIDICT [5]. Base Editing guides were predicted using a proprietary Form algorithm with experimentally derived scoring of PAM efficiency [6] denoted as the Chatterjee Score. Primers were predicted using primer3 [4].

Off-Target Search

Summary

This workflow can help the user search for off-targets in a given genome. The user will specify a genome and provide a list of spacers to search, and will be shown all off-target effects in the genome.

Methods

This analysis was performed on the Form Bio platform using the DR GENE workflow. An input file was provided detailing spacers to search the genome for. Information about the CRISPR analysis to be performed, including the genome editing strategies to design guides for, was provided to the workflow.

DR GENE vs CRISPRank Optimization Validation

| Workflow Run | DR GENE | CRISPRank | Optimization |
| --- | --- | --- | --- |
| CRISPR Knockout of APOE from GRCh38 | 25m 6s | 1-2 days | > 30x Faster |
| CRISPR Knockout of 10 genomic regions | 16m 14s | 4h 52m 25s | > 18x Faster |
| CRISPR Edit of 6 SNP edits | 26m 9s | 1h 12m 3s | > 2.5x Faster |
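
The speedup figures follow directly from the run times in the table. As a quick arithmetic check, the sketch below converts each run time to seconds and divides the CRISPRank time by the DR GENE time. For the APOE run, the "1-2 days" range is represented by its lower bound of one day, which is an assumption made here to obtain a conservative ratio.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert a run time to seconds."""
    return hours * 3600 + minutes * 60 + seconds


# (DR GENE time, CRISPRank time) in seconds, taken from the table above.
runs = {
    "CRISPR Knockout of APOE from GRCh38": (
        to_seconds(minutes=25, seconds=6),
        to_seconds(hours=24),  # lower bound of "1-2 days"
    ),
    "CRISPR Knockout of 10 genomic regions": (
        to_seconds(minutes=16, seconds=14),
        to_seconds(hours=4, minutes=52, seconds=25),
    ),
    "CRISPR Edit of 6 SNP edits": (
        to_seconds(minutes=26, seconds=9),
        to_seconds(hours=1, minutes=12, seconds=3),
    ),
}

speedups = {name: old / new for name, (new, old) in runs.items()}
for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x faster")
```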

Inputs

  • Experiment Design
    • Type of Experiment: Knockout, Genome Editing, or Off-target Search
    • Editing Strategies: Knockout, Homology Directed Repair, Base Editing, Prime Editing 3 and 3b
    • Organism: Reference Genome used for alignment
    • Reference Genome Annotation: Annotation that should be used for determining gene and transcript counts
    • Nuclease: Cutting Enzyme
    • Efficiency Score: Filtered based on the Nuclease selection
  • Input
    • Gene List: List of gene symbols
    • Genomic Regions List: BED file of genomic regions
    • Edit Table: Table of the locations and edits to be made
    • Sequence: Target and Edited sequence inputs
  • Adapter Sequences
    • adapter caps for oligo sequences

Additional Inputs for PCR Primer Design

  • Forward Primer Window
    • Window to filter forward primers.
    • Distance upstream from edit location to forward primer.
    • default: 90-110
  • Reverse Primer Window
    • Window to filter reverse primers.
    • Distance downstream from edit location to reverse primer.
    • default: 130-150
  • Size Restrictions
  • Temperature Restrictions
  • Length Restrictions
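
The primer windows can be read as distance filters around the edit location. The sketch below shows one possible interpretation under stated assumptions: distances are measured in bp from the primer's start coordinate to the edit position, forward primers lie upstream and reverse primers downstream, and the window bounds are inclusive. How DR GENE measures these distances internally is not documented here; the `Primer` class, `filter_primers` function, coordinates, and most sequences are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Primer:
    sequence: str
    start: int   # start coordinate of the primer on the reference
    strand: str  # "+" for a forward primer, "-" for a reverse primer


def filter_primers(primers, edit_pos, fwd_window=(90, 110), rev_window=(130, 150)):
    """Keep primers whose distance to the edit falls inside their window."""
    kept = []
    for primer in primers:
        if primer.strand == "+":
            distance = edit_pos - primer.start  # upstream of the edit
            low, high = fwd_window
        else:
            distance = primer.start - edit_pos  # downstream of the edit
            low, high = rev_window
        if low <= distance <= high:
            kept.append(primer)
    return kept


candidates = [
    Primer("GCAGTCCCACCACCACTC", 900, "+"),  # 100 bp upstream: kept
    Primer("ACGTACGTACGTACGTAC", 950, "+"),  # 50 bp upstream: too close
    Primer("TGCATGCATGCATGCATG", 1140, "-"),  # 140 bp downstream: kept
    Primer("CATGCATGCATGCATGCA", 1200, "-"),  # 200 bp downstream: too far
]
selected = filter_primers(candidates, edit_pos=1000)
```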

Additional Inputs for Homology Directed Repair

  • Donor Sequence Information
    • Length
    • ssODN or dsDNA

Additional Inputs for Knockout

  • Genomic Window
    • Length upstream/downstream

Outputs

Homology Directed Repair sgRNA Predictions

  • sgRNA/${EditLocation}_sgRNAdata.tsv
    | Column Name | Description | Example |
    | --- | --- | --- |
    | Name | Spacer name | spacer_1 |
    | Chromosome | Chromosome location of the spacer | chr7 |
    | Spacer_Start | Start location of the spacer | 142353300 |
    | Spacer_End | End location of the spacer | 142353320 |
    | Strand | Direction relative to target sequence | + |
    | Spacer_Sequence | Predicted guide RNAs | GTGATCGCTTCTCTGCAGAG |
    | PAM_sequence | PAM sequence | AGG |
    | PAM_Location | Start location of the PAM | 142353320 |
    | Cut_Site | Location of the cut site | 142353317 |
    | Cut2EditDistance | Distance from Cut_Site to edit location | 89 |
    | Efficiency_Score | Predicted efficiency | 0.73 |
    | GC_Content | GC content of target sequence excluding PAM | 55 |
    | MM0 | Number of off-targets with 0 mismatches | 0 |
    | MM1 | Number of off-targets with 1 mismatch | 0 |
    | MM2 | Number of off-targets with 2 mismatches | 8 |
    | MM3 | Number of off-targets with 3 mismatches | 23 |
    | selfHairpin | Presence of a self-hairpin | FALSE |
    | backboneHairpin | Presence of a backbone-hairpin | FALSE |
    | HomopolymerA | Presence of 4 or more repeating A | FALSE |
    | HomopolymerC | Presence of 4 or more repeating C | FALSE |
    | HomopolymerG | Presence of 4 or more repeating G | FALSE |
    | HomopolymerT | Presence of 4 or more repeating T | FALSE |
    | startingGGGGG | Whether the spacer starts with repeating G | FALSE |
    | EcoRI | Restriction enzyme binding site present | FALSE |
    | KpnI | Restriction enzyme binding site present | FALSE |
    | BsmBI | Restriction enzyme binding site present | FALSE |
    | BbsI | Restriction enzyme binding site present | FALSE |
    | PacI | Restriction enzyme binding site present | FALSE |
    | BsaI | Restriction enzyme binding site present | FALSE |
    | Donor_Start | Start location of the HDR sequence (for easy amplicon sequencing) | 142353100 |
    | Donor_End | End location of the HDR sequence (for easy amplicon sequencing) | 142353522 |
    | Donor_Sequence | Nucleotide sequence of the HDR containing sgRNA (for easy amplicon sequencing) | |
  • sgRNA/${EditLocation}_OffTargetdata.tsv
    | Column Name | Description | Example |
    | --- | --- | --- |
    | Spacer_Sequence | Predicted spacer | CTTCTCTGTGACCTTGTTAC |
    | OffTarget_Sequence | Nucleotide sequence of the off-target | CTTCaCTGTGACgTTGccACTGG |
    | Mismatches | Number of mismatches between target and off-target | 4 |
    | CFD_Score | Cutting Frequency Determination | 0.02 |
    | MIT_Score | MIT Specificity Score | 0.02 |
    | Chromosome | Chromosome location of the off-target sequence | chr1 |
    | Start | Start location of the off-target sequence | 10837397 |
    | End | End location of the off-target sequence | 10837419 |
    | Strand | Direction relative to off-target sequence | + |
    | PAM_Sequence | PAM sequence of the off-target | NGG |
    | PAM_Location | Start location of the PAM sequence | 10837419 |
    | Canonical_PAM | Whether this PAM sequence is the highest-ranked PAM | TRUE |
    | Cutsite_Location | Location of the off-target cut site | 10837416 |
  • accompanying IGV files:
    • IGVbed/${EditLocation}_sgRNA.bed
    • IGVbed/${EditLocation}_Donorsequences.bed
    • IGVbed/${EditLocation}_offtargets.bed
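
Once the sgRNA table is downloaded, guides can be shortlisted with a few lines of Python. The sketch below is one possible filter, not the ranking DR GENE applies: it drops guides with perfect or single-mismatch off-targets or with predicted hairpins, then sorts by `Efficiency_Score`. The column names come from the table above; the thresholds, the function name `rank_guides`, and every row except `spacer_1` are hypothetical, and only the columns needed for the filter are included.

```python
import csv
import io


def rank_guides(tsv_text):
    """Return guide rows passing basic filters, best Efficiency_Score first."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = [
        row
        for row in reader
        if int(row["MM0"]) == 0
        and int(row["MM1"]) == 0
        and row["selfHairpin"] == "FALSE"
        and row["backboneHairpin"] == "FALSE"
    ]
    return sorted(kept, key=lambda row: float(row["Efficiency_Score"]), reverse=True)


# Small in-memory stand-in for sgRNA/${EditLocation}_sgRNAdata.tsv.
table = [
    ["Name", "Efficiency_Score", "MM0", "MM1", "selfHairpin", "backboneHairpin"],
    ["spacer_1", "0.73", "0", "0", "FALSE", "FALSE"],
    ["spacer_2", "0.91", "0", "2", "FALSE", "FALSE"],  # dropped: MM1 > 0
    ["spacer_3", "0.80", "0", "0", "TRUE", "FALSE"],   # dropped: self-hairpin
    ["spacer_4", "0.85", "0", "0", "FALSE", "FALSE"],
]
tsv_text = "\n".join("\t".join(row) for row in table) + "\n"

ranked = rank_guides(tsv_text)
print([row["Name"] for row in ranked])
```

For HDR designs, `Cut2EditDistance` is another column worth filtering on, since the cut site should generally sit close to the intended edit.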

Base Editing sgRNA Predictions

  • sgRNA/${EditLocation}_Guides.tsv
    | Column Name | Description | Example |
    | --- | --- | --- |
    | Spacer_Sequence | Predicted guide RNAs | CGGAACGTCTCGAAGCGCTC |
    | PAM_Sequence | PAM sequence | ACGC |
    | Chromosome | Chromosome location of the spacer | chr17 |
    | Spacer_Start | Start location of the spacer | 7670683 |
    | Spacer_End | End location of the spacer | 7670707 |
    | Strand | Direction relative to target sequence | + |
    | PAM Score | Higher score means better binding efficiency (Pranam score: experimentally based score of the PAM sequence) | -5.2 |
    | EditLocation | Specific location of the edit being made | chr17_7670690 |
    | EnzymeName | The column name is the enzyme to use; the value is the number of off-targets. NA means the enzyme is not usable. | 0 |
    | MM0 | Number of off-targets with 0 mismatches | 0 |
    | MM1 | Number of off-targets with 1 mismatch | 0 |
    | MM2 | Number of off-targets with 2 mismatches | 8 |
    | MM3 | Number of off-targets with 3 mismatches | 23 |
    | selfHairpin | Presence of a self-hairpin | FALSE |
    | backboneHairpin | Presence of a backbone-hairpin | FALSE |
    | HomopolymerA | Presence of 4 or more repeating A | FALSE |
    | HomopolymerC | Presence of 4 or more repeating C | FALSE |
    | HomopolymerG | Presence of 4 or more repeating G | FALSE |
    | HomopolymerT | Presence of 4 or more repeating T | FALSE |
    | startingGGGGG | Whether the spacer starts with repeating G | FALSE |
    | EcoRI | Restriction enzyme binding site present | FALSE |
    | KpnI | Restriction enzyme binding site present | FALSE |
    | BsmBI | Restriction enzyme binding site present | FALSE |
    | BbsI | Restriction enzyme binding site present | FALSE |
    | PacI | Restriction enzyme binding site present | FALSE |
  • sgRNA/${EditLocation}_OffTarget.tsv
    | Column Name | Description | Example |
    | --- | --- | --- |
    | Spacer_Sequence | Predicted spacer | CTTCTCTGTGACCTTGTTAC |
    | OffTarget_Sequence | Nucleotide sequence of the off-target | CTTCaCTGTGACgTTGccACTGG |
    | Mismatches | Number of mismatches between target and off-target | 4 |
    | CFD_Score | Cutting Frequency Determination | 0.02 |
    | MIT_Score | MIT Specificity Score | 0.02 |
    | Chromosome | Chromosome location of the off-target sequence | chr1 |
    | Start | Start location of the off-target sequence | 10837397 |
    | End | End location of the off-target sequence | 10837419 |
    | Strand | Direction relative to off-target sequence | + |
    | PAM_Sequence | PAM sequence of the off-target | NGG |
    | PAM_Location | Start location of the PAM sequence | 10837419 |
    | Canonical_PAM | Whether this PAM sequence is the highest-ranked PAM | TRUE |
    | Cutsite_Location | Location of the off-target cut site | 10837416 |
  • accompanying IGV files:
    • IGVbed/${EditLocation}_sgRNA.bed
    • IGVbed/${EditLocation}_offtargets.bed
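
The `Mismatches` value in the off-target table can be reproduced from the two sequences in the example row. The sketch below assumes a 20-nt spacer followed by a 3' PAM and treats the lowercase letters in `OffTarget_Sequence` as mismatch markers; both assumptions are inferred from the example row rather than from a stated specification, and the function names are hypothetical.

```python
def split_protospacer_pam(site, spacer_length=20):
    """Split an off-target site into its protospacer and trailing PAM."""
    return site[:spacer_length], site[spacer_length:]


def count_mismatches(spacer, protospacer):
    """Count positions where the spacer and protospacer differ (case-insensitive)."""
    if len(spacer) != len(protospacer):
        raise ValueError("spacer and protospacer must have the same length")
    return sum(a != b for a, b in zip(spacer.upper(), protospacer.upper()))


# Values from the example row of the off-target table.
spacer = "CTTCTCTGTGACCTTGTTAC"
off_target = "CTTCaCTGTGACgTTGccACTGG"

protospacer, pam = split_protospacer_pam(off_target)
mismatches = count_mismatches(spacer, protospacer)
print(mismatches, pam)
```

The example coordinates are consistent with this layout: `End - Start + 1` equals 23 (20-nt protospacer plus 3-nt PAM), and `Cutsite_Location` is 3 bp before `PAM_Location`.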

Prime Editing sgRNA Predictions

  • pegRNA/${EditLocation}_pegRNA_Pridict_full.csv
    | Column Name | Description | Example |
    | --- | --- | --- |
    | Original_Sequence | Original input sequence | AGTAGATGCGCGGGGCGCTAGAGTCGATTAGAGTACGTGCTAGCTAGCTAGCGGGCTAC |
    | Edited-Sequences | Sequence after editing | GTCGGCGTGgctgcttGtgcgggctgTGAACAGCATGCTAGCTAGGTCGATCC |
    | Target-Strand | Strand/direction | + |
    | Mutation_Type | Type of edit made | 1bpReplacement |
    | Correction_Type | Mode of edit | Replacement |
    | Correction_Length | Length of edit sequence | 10 |
    | Editing_Position | Position in pegRNA sequence where edit occurs | 11 |
    | PBSlength | Length of PBS sequence | 11 |
    | RToverhanglength | Length of RTT overhang | 7 |
    | RTlength | Length of RTT | 19 |
    | EditedAllele | Allele after the edit | C |
    | OriginalAllele | Allele before the edit | T |
    | Protospacer-Sequence | pegRNA spacer sequence | GTCATGTCCTTTATCAAGTT |
    | PBSrevcomp13bp | PBS sequence inside the extension | TTGATAAAGGA |
    | RTseqoverhangrevcomp | RTT overhang sequence inside the extension | AAATGAC |
    | RTrevcomp | RTT sequence inside the extension | AAATGACgATGATCCAAAC |
    | Protospacer-Oligo-FW | Forward strand spacer sequence with cloning oligos | CACCGTCATGTCCTTTATCAAGTTGTTTC |
    | Protospacer-Oligo-RV | Reverse strand spacer sequence with cloning oligos | CTCTGAAACAACTTGATAAAGGACATGAC |
    | Extension-Oligo-FW | Forward strand extension sequence with cloning oligo | GTGCAAATGACgATGATCCAAACTTGATAAAGGA |
    | Extension-Oligo-RV | Reverse strand extension sequence with cloning oligo | AAAATCCTTTATCAAGTTTGGATCATcGTCATTT |
    | pegRNA | Full pegRNA sequence | GTCATGTCCTTTATCAAGTTGTTTCAGAGCTATGCTGGAAACAGCATAGCAAGTTGAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAAATGACgATGATCCAAACTTGATAAAGGA |
    | Editor_Variant | Type of editor enzyme | PE2-NGG |
    | protospacermt | Melting temperature of the spacer sequence | 54.0 |
    | extensionmt | Melting temperature of the extension sequence | 80.0 |
    | RTmt | Melting temperature of the RTT sequence | 52.0 |
    | RToverhangmt | Melting temperature of the RTT overhang sequence | 18.0 |
    | PBSmt | Melting temperature of the PBS sequence | 28.0 |
    | MFE_* | Minimum free energy | -37.0 |
    | PRIDICT_editing_Score_deep | Editing score | 75.68932 |
    | PRIDICT_unintended_Score_deep | Unintended edits | 2.84563 |
  • pegRNA/${EditLocation}_nicking_guides.csv
    | Column Name | Description | Example |
    | --- | --- | --- |
    | Nicking-Protospacer | Nicking guide sequence | TAAGGAGATCATTTCCCTG |
    | Nicking-Position-to-edit | Distance from nick to edit | -40 |
    | PE3b | PE3b capability | No_PE3b |
    | Nicking-PAMdisrupt | Is the PAM disrupted | No_nicking_PAM_disrupt |
    | Target_Strand | Strand location | Fw |
    | DeepCas9score | Nicking efficiency score for SpCas9 | 59.73 |
    | Nicking-Proto-Oligo-FW | Forward strand nicking spacer with cloning adapters | caccgTAAGGAGATCATTTCCCTG |
    | Nicking-Proto-Oligo-RV | Reverse strand nicking spacer with cloning adapters | aaacCAGGGAAATGATCTCCTTAc |
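
The pegRNA columns relate to each other in a simple way that can be checked against the example row: the pegRNA is the spacer, the scaffold, the RTT, and the PBS joined in that order, and the PBS is the reverse complement of the spacer bases immediately 5' of the nick. The sketch below reproduces the example values under the assumption of an SpCas9-style nick 3 bases from the 3' end of the spacer. The scaffold is the portion of the example pegRNA between the spacer and the extension; the function names are hypothetical and this is not PRIDICT's implementation.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Reverse complement, preserving upper/lower case."""
    return seq.translate(COMPLEMENT)[::-1]


def pbs_from_spacer(spacer, pbs_length, nick_offset=3):
    """PBS = reverse complement of the spacer bases just 5' of the nick."""
    nick = len(spacer) - nick_offset
    return revcomp(spacer[nick - pbs_length:nick])


def assemble_pegrna(spacer, scaffold, rtt, pbs):
    """Join pegRNA parts 5' to 3': spacer, scaffold, RTT, PBS."""
    return spacer + scaffold + rtt + pbs


# Values from the example row of the pegRNA table.
spacer = "GTCATGTCCTTTATCAAGTT"
scaffold = (
    "GTTTCAGAGCTATGCTGGAAACAGCATAGCAAGTTGAAATAAGGCTAGTCC"
    "GTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC"
)
rtt = "AAATGACgATGATCCAAAC"  # lowercase base marks the edit

pbs = pbs_from_spacer(spacer, pbs_length=11)
extension = rtt + pbs
pegrna = assemble_pegrna(spacer, scaffold, rtt, pbs)
print(pbs)
print(pegrna)
```

In the example row, the extension cloning oligos are this extension with a `GTGC` prefix on the forward strand and its reverse complement with an `AAAA` prefix on the reverse strand.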

Knockout sgRNA Predictions

  • sgRNA/${EditLocation}_sgRNAdata.tsv
    | Column Name | Description | Example |
    | --- | --- | --- |
    | Name | Spacer name | spacer_1 |
    | Chromosome | Chromosome location of the spacer | chr7 |
    | Spacer_Start | Start location of the spacer | 142353300 |
    | Spacer_End | End location of the spacer | 142353320 |
    | Strand | Direction relative to target sequence | + |
    | Spacer_Sequence | Predicted guide RNAs | GTGATCGCTTCTCTGCAGAG |
    | PAM_sequence | PAM sequence | AGG |
    | PAM_Location | Start location of the PAM | 142353320 |
    | Cut_Site | Location of the cut site | 142353317 |
    | Efficiency_Score | Predicted efficiency | 0.73 |
    | GC_Content | GC content of target sequence excluding PAM | 55 |
    | MM0 | Number of off-targets with 0 mismatches | 0 |
    | MM1 | Number of off-targets with 1 mismatch | 0 |
    | MM2 | Number of off-targets with 2 mismatches | 8 |
    | MM3 | Number of off-targets with 3 mismatches | 23 |
    | selfHairpin | Presence of a self-hairpin | FALSE |
    | backboneHairpin | Presence of a backbone-hairpin | FALSE |
    | HomopolymerA | Presence of 4 or more repeating A | FALSE |
    | HomopolymerC | Presence of 4 or more repeating C | FALSE |
    | HomopolymerG | Presence of 4 or more repeating G | FALSE |
    | HomopolymerT | Presence of 4 or more repeating T | FALSE |
    | startingGGGGG | Whether the spacer starts with repeating G | FALSE |
    | EcoRI | Restriction enzyme binding site present | FALSE |
    | KpnI | Restriction enzyme binding site present | FALSE |
    | BsmBI | Restriction enzyme binding site present | FALSE |
    | BbsI | Restriction enzyme binding site present | FALSE |
    | PacI | Restriction enzyme binding site present | FALSE |
    | BsaI | Restriction enzyme binding site present | FALSE |
  • sgRNA/${EditLocation}_OffTargetdata.tsv
    | Column Name | Description | Example |
    | --- | --- | --- |
    | Spacer_Sequence | Predicted spacer | CTTCTCTGTGACCTTGTTAC |
    | OffTarget_Sequence | Nucleotide sequence of the off-target | CTTCaCTGTGACgTTGccACTGG |
    | Mismatches | Number of mismatches between target and off-target | 4 |
    | CFD_Score | Cutting Frequency Determination | 0.02 |
    | MIT_Score | MIT Specificity Score | 0.02 |
    | Chromosome | Chromosome location of the off-target sequence | chr1 |
    | Start | Start location of the off-target sequence | 10837397 |
    | End | End location of the off-target sequence | 10837419 |
    | Strand | Direction relative to off-target sequence | + |
    | PAM_Sequence | PAM sequence of the off-target | NGG |
    | PAM_Location | Start location of the PAM sequence | 10837419 |
    | Canonical_PAM | Whether this PAM sequence is the highest-ranked PAM | TRUE |
    | Cutsite_Location | Location of the off-target cut site | 10837416 |
  • accompanying IGV files:
    • IGVbed/${EditLocation}_sgRNA.bed
    • IGVbed/${EditLocation}_offtargets.bed
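
Several of the sequence-derived columns in the sgRNA table (GC content, homopolymer flags, leading G run, restriction sites) can be recomputed from the spacer alone. The sketch below is an illustration rather than DR GENE's implementation: it checks the spacer only, while the workflow may evaluate sites in a wider context such as the oligo with adapters. The recognition sequences are the standard ones for each enzyme, non-palindromic sites are also checked as their reverse complement, and hairpin prediction is out of scope. The function name `spacer_features` is hypothetical.

```python
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "KpnI": "GGTACC",
    "BsmBI": "CGTCTC",
    "BbsI": "GAAGAC",
    "PacI": "TTAATTAA",
    "BsaI": "GGTCTC",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def spacer_features(spacer):
    """Recompute simple sequence flags for a spacer (PAM excluded)."""
    spacer = spacer.upper()
    gc_count = sum(base in "GC" for base in spacer)
    features = {"GC_Content": round(100 * gc_count / len(spacer))}
    for base in "ACGT":
        # "4 or more repeating" bases means a run of at least four.
        features[f"Homopolymer{base}"] = base * 4 in spacer
    features["startingGGGGG"] = spacer.startswith("GGGGG")
    for enzyme, site in RESTRICTION_SITES.items():
        features[enzyme] = site in spacer or revcomp(site) in spacer
    return features


# Spacer from the example row of the sgRNA table.
example = spacer_features("GTGATCGCTTCTCTGCAGAG")
print(example["GC_Content"])
```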

PCR Primer Design

  • Primers/${EditLocation}.for
  • Primers/${EditLocation}.int
  • Primers/${EditLocation}.rev
    | Column Name | Description | Example |
    | --- | --- | --- |
    | sequence | Primer sequence | GCAGTCCCACCACCACTC |
    | 1-based start | Start location of primer sequence | 10837297 |
    | ln | Length of primer sequence | 18 |
    | # N | Number of N nucleotides | 0 |
    | GC% | Percent GC | 66.67 |
    | Tm | Melting temperature | 59.967 |
    | self_any_th | General self-reactivity (anywhere in the primer) | 0.00 |
    | self_end_th | Self-reactivity at the primer end | 0.00 |
    | hairpin | Hairpin prediction | 0.00 |
    | quality | Quality score | 0.033 |

Workflow Walkthrough

  1. Navigate to the DR GENE workflow on the Form Bio platform. You can search for this workflow using the bar at the top-right corner or by selecting the Genome Editing or Candidate Validation filters on the left-hand side.
  2. image
  3. Select the version from the dropdown menu in the top right corner. Take a moment to review some information about the workflow analysis, inputs, and outputs. When ready to begin, click “Run Workflow”.
  4. image
  5. Select one of three functions: Genome Editing, Knock-Out, or Genome-wide Offtarget Search. Depending on your choice, you will be asked to tune parameters for the type of experiment to design guides for. All three editing strategies are checked by default. Specify the target genome and its version. Choose the nuclease for the experiment, and decide how enzyme efficiency should be scored. Most efficiency scores are trained on specific enzymes in specific organisms; the workflow attempts to account for this variance by weighting each score differently.
  6. image
  7. Provide the necessary input file for your experiment. For Genome Editing this will be an edit table, for Knock-Out this will be a list of genes, and for Genome-wide Offtarget Search this will be a spacer sequence list.
  8. image
  9. For guide design functions, you’ll be asked to tune some parameters related to PCR primers. In most cases, defaults are desirable.
  10. image
  11. For some genome editing algorithms, you’ll be asked to tune some additional parameters related to guide design for that specific algorithm.
  12. image
  13. Give your workflow a unique name, and take a moment to review the chosen inputs and parameters. When ready to begin, click “Run Workflow” to submit your analysis.

Results Walkthrough

  1. Locate your workflow run from the Activity tab, and select it.
  2. On this page, you can view a variety of information about the workflow run, including inputs, outputs, and parameters. To view the analysis, click Open Analysis in the top right corner.
  3. image
  4. A new tab will open containing sequences of interest for all selected editing experiments. Use the tabs in the top-left corner to navigate.
  5. image

Citations

  1. Hoberecht, L., Perampalam, P., Lun, A. & Fortin, J.-P. A comprehensive Bioconductor ecosystem for the design of CRISPR guide RNAs across nucleases and technologies. Nature Communications 13, 6568 (2022).
  2. Pagès, H. & Maduka, P. C. BSgenome: Software infrastructure for efficient representation of full genomes and their SNPs. (2023) doi:10.18129/B9.bioc.BSgenome.
  3. Pagès, H. et al. Biostrings: Efficient manipulation of biological strings. (2023) doi:10.18129/B9.bioc.Biostrings.
  4. Untergasser, A. et al. Primer3 - new capabilities and interfaces. Nucleic Acids Research 40, e115 (2012).
  5. Mathis, N. et al. Predicting prime editing efficiency and product purity by deep learning. Nature Biotechnology 1–9 (2023) doi:10.1038/s41587-022-01613-7.
  6. Walton, R. T., Christie, K. A., Whittaker, M. N. & Kleinstiver, B. P. Unconstrained genome targeting with near-PAMless engineered CRISPR-Cas9 variants. Science 368, 290–296 (2020).
