Comparing clustering and pre-processing in taxonomy analysis.

Marc J Bonder¹, Sanne Abeln, Egija Zaura

  • ¹ Department of Preventive Dentistry, Academic Centre for Dentistry Amsterdam-ACTA, University of Amsterdam and VU University Amsterdam, VU University Amsterdam, The Netherlands. bonder.m.j@gmail.com

Bioinformatics (Oxford, England) | September 11, 2012
Summary
This summary is machine-generated.

Data pre-processing strongly affects the accuracy of 16S rRNA amplicon clustering. Denoising and chimera checking before clustering improved oral microbial community analysis, and the choice of pre-processing mattered more than the choice of clustering algorithm.

Area of Science:

  • Microbiology
  • Bioinformatics
  • Genomics

Background:

  • Massively parallel sequencing enables rapid analysis of large sequence datasets.
  • 16S ribosomal RNA (rRNA) amplicon sequencing is widely used for studying microbial communities.
  • Pre-processing steps before clustering are suggested to improve data accuracy.

Purpose of the Study:

  • To assess the impact of various data pre-processing steps on the accuracy of 16S rRNA sequence clustering.
  • To evaluate the performance of different clustering algorithms in analyzing oral microbial communities.
  • To determine optimal pre-processing strategies for 16S rRNA amplicon data.

Main Methods:

  • Applied combinations of data pre-processing techniques (denoising, chimera checking) to 16S rRNA sequence data.

  • Utilized various clustering algorithms to group sequences into operational taxonomic units.
  • Assessed cluster accuracy using metrics such as purity and normalized mutual information.

Main Results:

  • The number of clusters varied significantly (up to two orders of magnitude) based on pre-processing methods.
  • Pre-processing with both denoising and chimera checking yielded cluster counts closest to the actual species number in a mock dataset.
  • Differences in clustering accuracy among algorithms were minor compared to the impact of pre-processing steps.

Conclusions:

  • Data pre-processing is a critical step that significantly influences the accuracy of 16S rRNA amplicon sequencing.
  • Combined denoising and chimera checking represent an effective pre-processing strategy for oral microbial community analysis.
  • No single clustering algorithm consistently outperformed others; pre-processing has a greater impact on accuracy.
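
The two accuracy metrics named under Main Methods, purity and normalized mutual information (NMI), can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`purity`, `entropy`, `nmi`) and the toy data are hypothetical, and the NMI here is normalized by the geometric mean of the two entropies, which is one common convention and only an assumption about what the study used.

```python
from collections import Counter
from math import log, sqrt


def purity(clusters, labels):
    """Fraction of sequences that carry the majority label of their cluster."""
    by_cluster = {}
    for c, y in zip(clusters, labels):
        by_cluster.setdefault(c, Counter())[y] += 1
    return sum(max(cnt.values()) for cnt in by_cluster.values()) / len(labels)


def entropy(assignments):
    """Shannon entropy (natural log) of a list of assignments."""
    n = len(assignments)
    return -sum((k / n) * log(k / n) for k in Counter(assignments).values())


def nmi(clusters, labels):
    """Mutual information divided by sqrt(H(clusters) * H(labels)).

    The geometric-mean normalization is an assumption of this sketch.
    """
    n = len(labels)
    joint = Counter(zip(clusters, labels))
    n_c = Counter(clusters)
    n_y = Counter(labels)
    mi = sum(
        (n_cy / n) * log(n_cy * n / (n_c[c] * n_y[y]))
        for (c, y), n_cy in joint.items()
    )
    h_c, h_y = entropy(clusters), entropy(labels)
    if h_c == 0 or h_y == 0:
        # Degenerate case: a single cluster and/or a single true label.
        return 1.0 if h_c == h_y else 0.0
    return mi / sqrt(h_c * h_y)


# Hypothetical toy data: four reads from two known species.
labels = ["sp_a", "sp_a", "sp_b", "sp_b"]
exact = [0, 0, 1, 1]        # one cluster per species
oversplit = [0, 1, 2, 3]    # every read in its own cluster

print(purity(exact, labels), nmi(exact, labels))
print(purity(oversplit, labels), nmi(oversplit, labels))
```

On this toy input the over-split clustering still has a purity of 1.0, because every singleton cluster is trivially pure, while its NMI drops to about 0.71. This is why purity alone cannot flag an inflated cluster count of the kind reported in the results, and why the two metrics are useful together.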