Scalable sequence database search using partitioned aggregated Bloom comb trees.

Camille Marchet¹, Antoine Limasset¹

¹University of Lille, CNRS, Centrale Lille, UMR 9189 CRIStAL, F-59000 Lille, France.

Bioinformatics (Oxford, England)

June 30, 2023

Summary

This summary is machine-generated.

We developed PAC, a novel data structure for efficiently searching massive sequence datasets. PAC significantly improves construction time and query speed for large biological sequence collections.

Area of Science:

Bioinformatics
Computational Biology
Genomics

Background:

The Sequence Read Archive exceeds 45 petabytes, making traditional BLAST-like searches infeasible.
Existing k-mer based strategies and approximate membership query structures struggle with immense datasets.

Purpose of the Study:

To introduce PAC, a novel approximate membership query data structure for querying large sequence collections.
To demonstrate PAC's efficiency in terms of construction time and query performance.

Main Methods:

PAC builds its index in a streaming fashion, with a minimal disk footprint.
The data structure is designed for scalable querying of sequence datasets.
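
To make the idea of an approximate membership query over k-mers concrete, here is a minimal Python sketch of a plain Bloom filter filled by streaming over sequences. It is an illustration only and not PAC's actual structure: the class `BloomFilter`, the helpers `kmers`, `index_dataset`, and `query_fraction`, and the parameters `num_bits` and `num_hashes` are all hypothetical names and values chosen for this example.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping k-mer (substring of length k) of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


class BloomFilter:
    """Plain Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from one digest (double hashing).
        digest = hashlib.blake2b(item.encode(), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def index_dataset(sequences, k, num_bits=1 << 16, num_hashes=3):
    """Insert the k-mers of each sequence as it streams by; nothing else is stored."""
    bloom = BloomFilter(num_bits, num_hashes)
    for seq in sequences:
        for kmer in kmers(seq, k):
            bloom.add(kmer)
    return bloom


def query_fraction(bloom, query, k):
    """Fraction of the query's k-mers that the filter reports as present."""
    found = [kmer in bloom for kmer in kmers(query, k)]
    return sum(found) / len(found) if found else 0.0
```

A query whose k-mers were all inserted always scores 1.0, since a Bloom filter has no false negatives; an unrelated query can still score above zero because of false positives. Indexing many datasets at this scale requires considerably more than one filter per dataset, which is the problem structures such as PAC address.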

Main Results:

PAC offers a 3- to 6-fold improvement in construction time over comparable compressed methods.
Indexing the entire GenBank bacterial genome collection (3.5 TB) took one day.
Querying 500,000 transcript sequences was achieved in under an hour.

Conclusions:

PAC is a highly scalable and efficient data structure for querying massive sequence collections.
It represents a significant advancement for biological data accessibility and analysis.