



Profile hidden Markov model sequence analysis can help remove putative pseudogenes from DNA barcoding and

T M Porter1, M Hajibabaei2

  • 1Department of Integrative Biology and Centre for Biodiversity Genomics, University of Guelph, 50 Stone Road East, Guelph, ON, Canada. terrimporter@gmail.com.

BMC Bioinformatics | May 20, 2021
Summary
This summary is machine-generated.

This study developed a method to filter pseudogenes, non-functional gene copies, from DNA barcoding and metabarcoding data. Incorporating this pseudogene screening improves the accuracy of COI sequence analysis in large datasets.

Keywords:
Bioinformatics; COI mtDNA; DNA barcode; Hidden Markov model; Metabarcode; NuMT; Nuclear encoded mitochondrial sequences; Pseudogene


Area of Science:

  • Genomics
  • Bioinformatics
  • Molecular Evolution

Background:

  • Pseudogenes, non-functional gene copies, can distort DNA barcoding and metabarcoding results.
  • Current bioinformatics pipelines lack specific methods to identify and remove pseudogenes from protein-coding marker genes.
  • Nuclear mitochondrial DNA segments (nuMTs) are a type of pseudogene that can interfere with COI-based analyses.

Purpose of the Study:

  • To develop and implement a method for screening nuclear mitochondrial DNA segments (nuMTs) in large COI datasets.
  • To assess the impact of pseudogene removal on metabarcoding data accuracy.
  • To integrate a pseudogene filtering step into existing bioinformatics pipelines for COI metabarcoding.

Main Methods:

  • Characterized gene and nuMT features using an artificial COI barcode dataset.
  • Simulated nuMTs in community datasets to evaluate pseudogene removal methods.
  • Utilized open reading frame (ORF) length and Hidden Markov Model (HMM) profile analysis for pseudogene detection.
  • Incorporated a pseudogene filtering step into an Illumina paired-end COI metabarcoding pipeline.
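The ORF-length idea in the methods above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `longest_orf_length`, the restriction to the three forward frames, and the choice of stop codons (TAA and TAG, as in the invertebrate mitochondrial genetic code) are all assumptions made for illustration. The function reports the longest stretch of codons uninterrupted by a stop codon, which is the quantity a length-based pseudogene screen would compare across sequences.

```python
# Stop codons assumed here: TAA and TAG (invertebrate mitochondrial code).
# This set is an illustrative assumption; adjust it for the taxa at hand.
STOP_CODONS = frozenset({"TAA", "TAG"})


def longest_orf_length(seq, stops=STOP_CODONS):
    """Return the length (in nucleotides) of the longest stop-free codon run
    found in any of the three forward reading frames of `seq`."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        run = 0
        # Walk the frame codon by codon, ignoring a trailing partial codon.
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in stops:
                best = max(best, run)
                run = 0  # a stop codon ends the current open stretch
            else:
                run += 3
        best = max(best, run)
    return best
```

A sequence carrying a premature stop codon or a frameshift would tend to yield a shorter value than intact COI sequences of the same amplicon, which is what makes the length usable as a screening signal.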

Main Results:

  • Identifying nuMTs is more challenging in short metabarcoding sequences than in full-length DNA barcodes.
  • High percentages of nuMTs in datasets further complicate their identification.
  • Existing pipelines remove some nuMTs, but an additional pseudogene filter can remove up to 5% more sequences.
  • ORF length and HMM analysis effectively identified pseudogenes in simulations.

Conclusions:

  • Open reading frame length filtering, with or without HMM analysis, effectively screens pseudogenes from large datasets.
  • Further research is needed on the frequency, taxonomic distribution, and evolution of COI nuMTs.
  • The authors encourage submission of verified COI nuMTs to public databases to aid future studies.
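As a rough illustration of how ORF length filtering might be applied across a large dataset, the sketch below flags sequences whose longest ORF length is an outlier relative to the rest. The summary does not state which threshold is used, so the interquartile-range rule, the multiplier `k=1.5`, and the function name `flag_orf_length_outliers` are assumptions chosen for illustration only.

```python
from statistics import quantiles


def flag_orf_length_outliers(orf_lengths, k=1.5):
    """Return one boolean per input length: True if the length lies outside
    [Q1 - k*IQR, Q3 + k*IQR]. The IQR rule is an assumed threshold,
    not one taken from the summarized study."""
    q1, _, q3 = quantiles(orf_lengths, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [not (low <= n <= high) for n in orf_lengths]


# Hypothetical ORF lengths (nt) for eight sequences from one amplicon.
lengths = [313, 313, 313, 313, 313, 313, 313, 120]
flags = flag_orf_length_outliers(lengths)
putative_pseudogenes = [n for n, f in zip(lengths, flags) if f]
```

Sequences flagged this way are only putative pseudogenes; a profile HMM score could then be used as a second, independent check on the retained sequences.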