Systematic evaluation of supervised machine learning for sample origin prediction using metagenomic sequencing data.

Julie Chih-Yu Chen (1), Andrea D Tyler (2)

  • (1) National Microbiology Laboratory, Public Health Agency of Canada, 1015 Arlington Street, Winnipeg, Manitoba, R3E 3R2, Canada. chih-yu.chen@canada.ca.

Biology Direct | December 11, 2020
Summary
This summary is machine-generated.

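Machine learning models trained on metagenomic sequencing data predict sample origin accurately when that origin is already represented in the training data. Predicting novel origins is much harder, but analyzing prediction ambiguity helps infer when a sample comes from an unseen origin. Technical and analytical choices also affect prediction accuracy.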

Keywords:
CAMDA, Lasso regularization, Machine learning, MetaSUB, Metagenomics, Microbiome, Multiclass classification, Multivariate regression

Area of Science:

  • Microbial Ecology
  • Bioinformatics
  • Machine Learning

Background:

  • Metagenomic sequencing reveals microbial patterns useful for sample origin prediction.
  • Machine learning models accurately predict sample origin when the origin has been sampled previously, i.e., is represented in the training data.
  • The 2019 CAMDA challenge datasets were used to assess prediction methods.

Purpose of the Study:

  • Evaluate the influence of technical, analytical, and machine learning approaches on sample origin prediction.
  • Assess the accuracy of predicting novel sample origins.
  • Compare regression and classification models for origin prediction.
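Comparing a regression model (which outputs coordinates) with a classification model (which outputs a city label) requires a shared error scale. One possible approach, sketched below, is to map each predicted label to that city's coordinates and measure the great-circle distance to the true location. This is an illustrative assumption, not the error metric confirmed by the summary; the function names, the Earth radius constant, and the `city_A`/`city_B` centroids are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def regression_error_km(predicted_coords, true_coords):
    """Distance error for a model that predicts (lat, lon) directly."""
    return haversine_km(*predicted_coords, *true_coords)


def classification_error_km(predicted_city, true_coords, city_centroids):
    """Distance error for a model that predicts a city label.

    The label is mapped to that city's (lat, lon) before measuring distance.
    """
    return haversine_km(*city_centroids[predicted_city], *true_coords)


# Hypothetical centroids, chosen only to make the arithmetic easy to follow.
centroids = {"city_A": (0.0, 0.0), "city_B": (0.0, 180.0)}
error_when_correct = classification_error_km("city_A", (0.0, 0.0), centroids)
error_when_wrong = classification_error_km("city_B", (0.0, 0.0), centroids)
```

With this convention, a correct classification has zero error, while a wrong one costs the distance between the two cities, so both model families can be scored in kilometres.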

Main Methods:

  • Compared 16S rRNA amplicon and shotgun sequencing, and metagenomic analytical tools (Kraken2, Bracken).
  • Employed Lasso-regularized multivariate regression for geographic coordinate prediction.
  • Utilized Leave-1-city-out and 10-fold cross-validation to assess model robustness.
  • Developed a strategy based on prediction ambiguity for novel origin inference.
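As a rough illustration of Lasso regularization for coordinate prediction, the sketch below fits one Lasso model per output (latitude and longitude) by cyclic coordinate descent. It is a simplified stand-in, not the authors' implementation: it assumes centered features and responses (no intercept), treats the two outputs independently, and uses hypothetical names (`lasso_fit`, `fit_coordinates`, `lam`) and toy data.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become exactly 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_fit(X, y, lam, n_iter=200):
    """Minimise (1/2n) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows (samples x features); y is a list of responses.
    Assumes X and y are already centered, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant-zero feature carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


def fit_coordinates(X, Y, lam):
    """Fit one Lasso model per output column of Y (e.g. latitude, longitude)."""
    n_outputs = len(Y[0])
    return [lasso_fit(X, [row[k] for row in Y], lam) for k in range(n_outputs)]


# Toy data: feature 0 drives the first output, feature 1 is nearly uninformative.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
Y = [[2.0, 0.0], [-2.0, 0.0], [0.1, 0.0], [-0.1, 0.0]]
coefficients = fit_coordinates(X, Y, lam=0.1)
```

The L1 penalty drives the coefficient of the weak feature to exactly zero, which is why Lasso is attractive when only a subset of the many available taxa is informative about location.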

Main Results:

  • Shotgun sequencing with Kraken2/Bracken showed higher detection sensitivity for microbial abundance.
  • Prediction errors were significantly higher in Leave-1-city-out validation, indicating challenges with novel origins.
  • Regression and classification models performed comparably on known origins but struggled with new ones.
  • Including data from different sequencing protocols increased prediction error.
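The gap between the two validation schemes follows from how the splits are built. The sketch below contrasts them under simplifying assumptions: the city labels are hypothetical, and the k-fold split is a deterministic interleaving rather than the shuffled 10-fold split a real analysis would typically use.

```python
from collections import defaultdict


def leave_one_city_out(cities):
    """Yield (held_out_city, train_idx, test_idx).

    Every sample from the held-out city goes to the test set, so the model
    never sees that city during training.
    """
    by_city = defaultdict(list)
    for idx, city in enumerate(cities):
        by_city[city].append(idx)
    for city in sorted(by_city):
        test_idx = by_city[city]
        train_idx = [i for i in range(len(cities)) if cities[i] != city]
        yield city, train_idx, test_idx


def k_fold(n_samples, k):
    """Yield (train_idx, test_idx) for k interleaved folds, ignoring city labels."""
    for fold in range(k):
        test_idx = [i for i in range(n_samples) if i % k == fold]
        train_idx = [i for i in range(n_samples) if i % k != fold]
        yield train_idx, test_idx


# Hypothetical city labels for six samples.
cities = ["A", "A", "B", "B", "C", "C"]

for city, train_idx, test_idx in leave_one_city_out(cities):
    print("held out", city, "train", train_idx, "test", test_idx)

for train_idx, test_idx in k_fold(len(cities), k=2):
    print("k-fold train", train_idx, "test", test_idx)
```

In the k-fold splits, every test city also appears in the training indices, so the model is only asked to recognize origins it has already seen. Leave-1-city-out removes that overlap, which is what makes it a test of novel-origin prediction.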

Conclusions:

  • Metagenomics enables accurate sample origin prediction for known origins.
  • Predicting novel origins remains a significant challenge for both regression and classification models.
  • Prediction ambiguity analysis offers a strategy to identify samples from new origins.
  • Sequencing techniques, protocols, and analytical/machine learning methods impact prediction accuracy.
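The summary does not state how prediction ambiguity is quantified. One plausible way to operationalize it, shown here purely as an assumption, is the normalized entropy of a classifier's class probabilities: a sample whose probability mass is spread across many known cities may come from a city the model has never seen. The function names and the 0.8 threshold are hypothetical.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of class probabilities, scaled to the range [0, 1].

    0 means all probability sits on one class; 1 means a uniform spread.
    """
    if len(probs) < 2:
        return 0.0
    total = sum(probs)
    nonzero = [p / total for p in probs if p > 0]
    entropy = -sum(p * math.log(p) for p in nonzero)
    return abs(entropy) / math.log(len(probs))  # abs() avoids returning -0.0


def flag_possible_novel_origin(probs, threshold=0.8):
    """Return True when the prediction is ambiguous enough to suspect a new origin.

    The threshold is an illustrative value and would need tuning on held-out data.
    """
    return normalized_entropy(probs) >= threshold


confident = [0.90, 0.05, 0.05]  # clearly resembles one known city
ambiguous = [0.34, 0.33, 0.33]  # resembles no known city in particular

print(flag_possible_novel_origin(confident))
print(flag_possible_novel_origin(ambiguous))
```

A threshold like this trades missed novel origins against false alarms, so in practice it could be calibrated on leave-1-city-out predictions, where the held-out samples are known to be novel.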