Metadata retrieval from sequence databases with ffq.

Ángel Gálvez-Merchán¹, Kyung Hoi Joseph Min², Lior Pachter¹,³

¹Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA.

Bioinformatics (Oxford, England)

January 7, 2023

Summary

This summary is machine-generated.

ffq is a command-line tool for querying genomic databases for sequence data and metadata. It exploits the hierarchical structure of these databases to retrieve metadata and raw data links from an accession number or DOI, making genomic data more accessible.

Area of Science:

Bioinformatics
Genomics
Data Science

Background:

Genomic databases contain vast amounts of sequence data and associated metadata.
Existing tools lack specialized functionality to exploit the hierarchical structure of these databases for metadata extraction.

Purpose of the Study:

To develop a command-line tool, ffq, for efficient querying of genomic sequence data and metadata.
To enable users to retrieve metadata and raw data links using accession numbers or DOIs.

Main Methods:

ffq is a command-line tool designed for querying sequence databases.
It processes user-provided accessions or DOIs to fetch relevant metadata.
The tool outputs data and metadata in JSON format.
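
As a rough illustration of the first step, accepting either an accession or a DOI, the sketch below sorts an identifier into a category by its prefix. It is not ffq's own code: the function name `classify_identifier` and the category labels are hypothetical, and only the widely used SRA/ENA/DDBJ and GEO prefix conventions are assumed. The DOI in the usage lines is a placeholder.

```python
import re

# Prefix conventions used by SRA/ENA/DDBJ (S/E/D + RP, RS, RX, RR) and GEO
# (GSE, GSM). The category labels on the left are illustrative names only.
PATTERNS = [
    ("study", re.compile(r"^[SED]RP\d+$", re.IGNORECASE)),
    ("sample", re.compile(r"^[SED]RS\d+$", re.IGNORECASE)),
    ("experiment", re.compile(r"^[SED]RX\d+$", re.IGNORECASE)),
    ("run", re.compile(r"^[SED]RR\d+$", re.IGNORECASE)),
    ("geo_series", re.compile(r"^GSE\d+$", re.IGNORECASE)),
    ("geo_sample", re.compile(r"^GSM\d+$", re.IGNORECASE)),
    # Loose DOI shape: "10.", a registrant code, a slash, then a suffix.
    ("doi", re.compile(r"^10\.\d{4,9}/\S+$")),
]


def classify_identifier(identifier: str) -> str:
    """Return the first category whose pattern matches, else "unknown"."""
    text = identifier.strip()
    for kind, pattern in PATTERNS:
        if pattern.match(text):
            return kind
    return "unknown"


# Usage with a run accession and a placeholder DOI (not a real reference).
for example in ["SRR9990627", "GSE115469", "10.1234/example", "not-an-id"]:
    print(example, "->", classify_identifier(example))
```

Dispatching on the identifier type first lets a tool decide which database, and which level of that database's hierarchy, to query next.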

Main Results:

ffq efficiently retrieves metadata and raw data links from genomic databases.
The tool is extensible to various genomic databases that offer programmatic access.
Metadata and data links are provided in a structured JSON format.
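
Because the results arrive as structured JSON, pulling out every data link can be a short recursive walk. The sketch below is a minimal example under loud assumptions: the nested record, its field names (`samples`, `runs`, `files`, `url`), the identifiers, and the `example.org` addresses are all made up for illustration and do not reproduce ffq's actual output schema.

```python
import json
from typing import Any, Iterator


def iter_urls(node: Any) -> Iterator[str]:
    """Yield every string stored under a "url" key, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "url" and isinstance(value, str):
                yield value
            else:
                yield from iter_urls(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_urls(item)


# Hypothetical study -> sample -> run -> files record (illustration only).
EXAMPLE = json.loads("""
{
  "STUDY0001": {
    "title": "placeholder study",
    "samples": {
      "SAMPLE0001": {
        "runs": {
          "RUN0001": {
            "files": [
              {"url": "ftp://example.org/RUN0001_1.fastq.gz"},
              {"url": "ftp://example.org/RUN0001_2.fastq.gz"}
            ]
          }
        }
      }
    }
  }
}
""")

urls = list(iter_urls(EXAMPLE))
for url in urls:
    print(url)
```

A walk like this does not depend on how deep the hierarchy is, so the same helper works whether the query started at a study, a sample, or a single run.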

Conclusions:

ffq provides a novel and efficient solution for accessing genomic data and metadata.
Its design facilitates integration with diverse genomic databases.
The tool enhances the accessibility and usability of large-scale genomic datasets.