



Listeria Genome Identification Using DNABERT Embedding With LightGBM and SHAP-Based Explainable Classification.

Sajeev Ram Arumugam¹, Ananth J P², Sankar Ganesh Karuppasamy¹

  • ¹ Department of CSE, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Chennai, India.

Bioinformatics and Biology Insights | June 3, 2026


Summary
This summary is machine-generated.

This study introduces an explainable genomic classification framework using DNABERT and LightGBM for accurate Listeria monocytogenes identification. The novel approach achieves 95% accuracy, enhancing food safety surveillance with interpretable results.

Keywords:
DNABERT, LightGBM, Listeria genomes, SHAP, explainable artificial intelligence (XAI), k-mer embeddings, transformer-based models


Area of Science:

  • Genomics and Bioinformatics
  • Computational Biology
  • Food Safety Science

Background:

  • Accurate whole-genome identification of Listeria monocytogenes is crucial for food safety and outbreak prevention.
  • Current methods (culture-based, PCR, NGS) are often slow or labor-intensive, and machine learning alternatives often rely on non-interpretable models.
  • There is a need for efficient, accurate, and explainable genomic identification tools for pathogen surveillance.

Purpose of the Study:

  • To develop an explainable genomic classification framework for distinguishing Listeria from non-Listeria bacterial genomes.
  • To couple transformer-based DNA embeddings with gradient boosting for high-performance genome classification.
  • To utilize SHapley Additive exPlanations (SHAP) for model interpretability and identification of genomic signatures.
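
To make the idea behind SHAP more concrete, the sketch below computes exact Shapley values for a tiny, made-up scoring function. It is only an illustration of the attribution principle: the feature names (`dim_a`, `dim_b`, `dim_c`) and the function `toy_score` are hypothetical, and the study itself would presumably rely on a SHAP library routine for tree models rather than brute-force enumeration.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average each feature's marginal
    contribution over every possible ordering of the features."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            totals[f] += after - before
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_score(present):
    """Hypothetical model output given the set of 'present' features."""
    score = 0.0
    if "dim_a" in present:
        score += 0.5
    if "dim_b" in present:
        score += 0.2
    if "dim_a" in present and "dim_c" in present:
        score += 0.2  # interaction term shared between dim_a and dim_c
    return score


phi = shapley_values(toy_score, ["dim_a", "dim_b", "dim_c"])
for name, value in phi.items():
    print(f"{name}: {value:.2f}")
```

The attributions sum to the difference between the score with all features present and the score with none, which is the additivity property that gives SHAP its name. Brute-force enumeration grows factorially with the number of features, so it is practical only for toy cases like this one.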

Main Methods:

  • Assembled and filtered a dataset of 700 bacterial genomes (350 Listeria monocytogenes, 350 non-Listeria).
  • Encoded genomes using DNABERT for contextual DNA embeddings based on 6-mer tokenization.
  • Classified embeddings using a LightGBM model and interpreted predictions with SHAP analysis.
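
The 6-mer tokenization step can be sketched in a few lines of plain Python. This is a minimal illustration, assuming overlapping k-mers with a stride of one base (the scheme used by the original DNABERT release). The helper names `kmer_tokens` and `windows`, the example fragment, and the `max_tokens` limit are assumptions for illustration; the summary does not say how whole genomes were split before embedding.

```python
def kmer_tokens(sequence, k=6):
    """Split a DNA sequence into overlapping k-mers (stride of 1)."""
    seq = sequence.upper()
    # A sequence shorter than k yields no tokens.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def windows(tokens, max_tokens=510):
    """Group tokens into consecutive windows of at most max_tokens.

    max_tokens is an assumed limit (hypothetical), standing in for
    whatever input length the embedding model accepts.
    """
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), max_tokens)]


fragment = "ATGCGTACGTTAG"  # made-up fragment, not taken from the study
tokens = kmer_tokens(fragment)
print(len(tokens), tokens[0], tokens[-1])

# A small window size just to show the chunking behaviour.
chunks = windows(tokens, max_tokens=3)
print([len(c) for c in chunks])
```

A sequence of length n produces n - k + 1 overlapping tokens, so the 13-base fragment above gives 8 six-mers. Each window of tokens would then be passed to the embedding model, and the per-window vectors combined into a single genome-level feature vector for the classifier.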

Main Results:

  • The DNABERT + LightGBM + SHAP pipeline achieved 95.00% corrected accuracy, classifying 665/700 genomes correctly.
  • Achieved high performance metrics: 94.37% precision, 95.71% recall, 95.03% F1-score, and 0.9976 AUC.
  • Outperformed conventional methods including k-mer based Random Forest, TF-IDF + SVM, CNN, XGBoost, and DNABERT + Logistic Regression.
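
As a consistency check on the reported figures, the sketch below recomputes the metrics from one confusion matrix that fits them. The counts are inferred, not reported in the summary: they assume Listeria is the positive class, a 350/350 split, and 665 correct predictions, which leads to 335 true positives, 20 false positives, 15 false negatives, and 330 true negatives.

```python
def metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Inferred counts (an assumption, not taken from the paper).
m = metrics(tp=335, fp=20, fn=15, tn=330)
for name, value in m.items():
    print(f"{name}: {value * 100:.2f}%")
```

Under these assumed counts, accuracy, precision, and recall match the reported 95.00%, 94.37%, and 95.71%. The F1-score comes out within 0.01 percentage points of the reported 95.03%, a gap that could be explained by rounding or truncation. The AUC cannot be recovered from a confusion matrix, since it depends on the ranking of predicted scores.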

Conclusions:

  • The proposed framework offers a high-performance and interpretable solution for genome-scale Listeria identification.
  • SHAP analysis identified discriminative sequence patterns potentially indicative of Listeria genomic signatures.
  • This approach provides a transferable template for explainable pathogen genomics in food safety and public health.