



Large-scale metagenomic sequence clustering on map-reduce clusters.

Xiao Yang¹, Jaroslaw Zola, Srinivas Aluru

  • ¹ Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA. xiaoyang@broadinstitute.org

Journal of Bioinformatics and Computational Biology | February 23, 2013
Summary
This summary is machine-generated.

This study introduces a parallel algorithm for taxonomic clustering of metagenomic data, enabling efficient species identification from millions of DNA fragments. The method uses sketching and map-reduce for high-quality, scalable clustering of large biological datasets.


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Metagenomics analyzes DNA recovered directly from environmental samples, where fragments from many species are mixed together, which makes taxonomic clustering difficult.
  • Current methods for clustering millions of DNA fragments are computationally intensive and time-consuming.

Purpose of the Study:

  • To develop a parallel algorithm for efficient and accurate taxonomic clustering of large metagenomic datasets.
  • To enable the identification of overlapping clusters within complex microbial communities.

Main Methods:

  • Utilized sketching techniques to rapidly assess sequence similarities, avoiding costly all-versus-all comparisons.
  • Formulated the classification problem as maximal quasi-clique enumeration in a similarity graph.
  • Implemented the algorithm using the map-reduce framework for cloud-based scalability.
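
The first and third steps above can be illustrated with a minimal sketch. It assumes a bottom-s min-hash sketch over k-mers and a single-process imitation of the map and reduce phases; the function names, the hash function, and the parameters `k` and `size` are illustrative assumptions, not the authors' implementation. Reads that share a sketch value become candidate pairs, so only those pairs need a detailed comparison rather than every pair of reads.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def stable_hash(kmer):
    """Deterministic 64-bit hash (Python's built-in hash() is salted per run)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=4, size=4):
    """Bottom-s sketch: the `size` smallest k-mer hash values of seq."""
    return sorted(stable_hash(km) for km in kmers(seq, k))[:size]


def map_phase(reads, k=4, size=4):
    """Map: emit one (hash_value, read_id) pair per sketch entry."""
    for read_id, seq in reads.items():
        for h in sketch(seq, k, size):
            yield h, read_id


def reduce_phase(pairs):
    """Reduce: group read ids by hash value, then emit pairs sharing a value."""
    buckets = defaultdict(set)
    for h, read_id in pairs:
        buckets[h].add(read_id)
    candidates = set()
    for ids in buckets.values():
        candidates.update(combinations(sorted(ids), 2))
    return candidates


# Hypothetical toy reads: r1 and r2 are identical, r3 shares no 4-mer with them.
reads = {"r1": "ACGTACGTAC", "r2": "ACGTACGTAC", "r3": "TTTTGGGGCC"}
candidates = reduce_phase(map_phase(reads))
```

In a real map-reduce deployment the grouping by hash value would be done by the framework's shuffle step across machines; here a dictionary stands in for it.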

Main Results:

  • The parallel algorithm successfully performed high-quality taxonomic clustering on metagenomic samples with millions of reads.
  • Achieved reasonable computation times on a modest-sized cluster, demonstrating efficiency.
  • The framework supports overlapping clusters, providing a more nuanced view of community structure.
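
To illustrate how quasi-clique clusters can overlap, the following sketch checks whether a set of reads forms a quasi-clique in a similarity graph. It assumes one common degree-based definition (every member is adjacent to at least `gamma` times the number of other members); this summary does not state the exact definition used in the paper, and the code only tests a given set rather than enumerating maximal quasi-cliques. The graph and the threshold values are hypothetical.

```python
def is_quasi_clique(graph, nodes, gamma):
    """Return True if each node in `nodes` is adjacent to at least
    gamma * (len(nodes) - 1) of the other nodes in the set.

    `graph` maps each node to the set of its neighbors (no self-loops).
    """
    nodes = set(nodes)
    if len(nodes) < 2:
        return True
    needed = gamma * (len(nodes) - 1)
    return all(len(graph[v] & nodes) >= needed for v in nodes)


# Hypothetical similarity graph: two triangles that share the read "c".
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}

left = is_quasi_clique(graph, {"a", "b", "c"}, gamma=1.0)
right = is_quasi_clique(graph, {"c", "d", "e"}, gamma=1.0)
mixed = is_quasi_clique(graph, {"a", "b", "c", "d"}, gamma=0.5)
```

Read "c" belongs to both `{a, b, c}` and `{c, d, e}`, which is the kind of overlap a partition-based clustering cannot express.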

Conclusions:

  • The proposed parallel algorithm offers an efficient and scalable solution for taxonomic clustering in metagenomics.
  • This approach facilitates the analysis of complex microbial communities, advancing our understanding of biodiversity.
  • Because the implementation targets the map-reduce framework, it can run on cloud infrastructure, which makes large-scale metagenomic clustering more accessible.