Updated: May 26, 2026

Sassy: fuzzy searching DNA sequences using SIMD.

Rick Beeloo (1), Ragnar Groot Koerkamp (2, 3)

  • (1) Department of Theoretical Biology and Bioinformatics, Utrecht University, Utrecht, 3584CH, Netherlands.

Bioinformatics (Oxford, England) | May 24, 2026
Summary
This summary is machine-generated.

Sassy is a new tool for approximate string matching (ASM) that exhaustively finds all pattern matches in long texts with up to k errors. It offers significant speedups for applications like CRISPR off-target detection.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Approximate string matching (ASM) is crucial for tasks like CRISPR off-target detection.
  • Existing ASM methods may not guarantee exhaustive results, even though exhaustiveness is essential for critical applications.
  • There is a need for efficient and exhaustive ASM tools for analyzing large biological datasets.

Purpose of the Study:

  • To introduce Sassy, a novel library and tool for fast and exhaustive approximate string matching.
  • To enable the detection of all pattern occurrences in long texts with a specified number of errors (k).
  • To provide a solution for applications requiring guaranteed complete results, such as CRISPR off-target analysis.
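
To make "all pattern occurrences with up to k errors" concrete, here is a minimal reference sketch of approximate string matching using the textbook dynamic program (often attributed to Sellers). It is a plain O(nm) baseline for illustration only and is not Sassy's implementation; the function name `find_approx_matches` and its return format are assumptions made for this example.

```python
def find_approx_matches(pattern: str, text: str, k: int) -> list[tuple[int, int]]:
    """Return (end, cost) for every text position where a match ends.

    A hit (end, cost) means some substring text[start:end] is within edit
    distance `cost` of `pattern`, with cost <= k. Assumes k < len(pattern).
    """
    m = len(pattern)
    # prev[i] = cheapest cost of aligning pattern[:i] to a text substring
    # ending at the previous text position. Column 0: only deletions possible.
    prev = list(range(m + 1))
    hits = []
    for j, c in enumerate(text, start=1):
        # cur[0] = 0 because a match may start at any text position for free.
        cur = [0] * (m + 1)
        for i in range(1, m + 1):
            substitute = prev[i - 1] + (pattern[i - 1] != c)
            skip_text_char = prev[i] + 1
            skip_pattern_char = cur[i - 1] + 1
            cur[i] = min(substitute, skip_text_char, skip_pattern_char)
        if cur[m] <= k:
            hits.append((j, cur[m]))
        prev = cur
    return hits


if __name__ == "__main__":
    # Exhaustive: every end position with cost <= k is reported.
    print(find_approx_matches("ACGT", "TTACGTTT", 1))
    # prints [(5, 1), (6, 0), (7, 1)]
```

Because every column of the table is checked, no end position within k errors is skipped, which is the sense in which a search is exhaustive; the cost is that this baseline touches all n × m cells.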

Main Methods:

  • Sassy employs a parallel text-splitting approach and utilizes bitvectors in the text direction.
  • The algorithm runs in O(k⌈n/W⌉) time on random texts, where n is the text length, k is the maximum number of errors, and W = 256 is the SIMD width in bits.
  • It incorporates an overhang cost to find matches near sequence ends, accommodating biological read alignments.
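
As background for the bitvector idea, the sketch below shows the classic Myers-style bit-parallel search, which packs one column of the edit-distance table into machine-word-sized bitvectors and updates it with a handful of bitwise operations per text character. Note the assumption: this classic layout runs its bits along the pattern, whereas the summary above says Sassy lays its bitvectors along the text and searches text parts in parallel, so this is an illustration of bit-packing in general and not Sassy's algorithm. The name `myers_search` is hypothetical, and Python's unbounded integers stand in for a fixed-width SIMD word.

```python
def myers_search(pattern: str, text: str, k: int) -> list[tuple[int, int]]:
    """Bit-parallel approximate search; returns (end, cost) with cost <= k."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    mask = (1 << m) - 1      # keeps every bitvector to m bits
    top = 1 << (m - 1)       # bit of the last pattern row

    # peq[c] has bit i set exactly when pattern[i] == c.
    peq: dict[str, int] = {}
    for i, c in enumerate(pattern):
        peq[c] = peq.get(c, 0) | (1 << i)

    # pv / mv encode +1 / -1 differences between vertically adjacent cells.
    pv, mv = mask, 0
    score = m                # value of the bottom cell of the current column
    hits = []
    for j, c in enumerate(text, start=1):
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask   # horizontal +1 differences
        mh = pv & xh                    # horizontal -1 differences
        if ph & top:
            score += 1
        elif mh & top:
            score -= 1
        # Shift without setting the low bit: the top row is all zeros,
        # so a match may start at any text position.
        ph = (ph << 1) & mask
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
        if score <= k:
            hits.append((j, score))
    return hits


if __name__ == "__main__":
    print(myers_search("ACGT", "TTACGTTT", 1))
```

The point of the packing is that one word-wide operation advances many table cells at once, which is where a word-width factor such as the W in the stated complexity comes from.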

Main Results:

  • For patterns up to 1000 bp, Sassy is 4x to 15x faster than Edlib and over 100x faster than parasail.
  • Sassy reaches a throughput of nearly 2 Gbp/s.
  • In CRISPR off-target detection, Sassy is 100x faster than SWOffinder and scales well to larger values of k, unlike tools such as CHOPOFF.

Conclusions:

  • Sassy provides a fast and exhaustive solution for approximate string matching in long texts.
  • Its efficiency and accuracy make it highly suitable for demanding applications like CRISPR off-target detection.
  • The tool is readily available as a library and binary, promoting its adoption in the research community.