Benchmarking DNA large language models on quadruplexes.

Oleksandr Cherednichenko¹, Alan Herbert¹,², Maria Poptsova¹

  • ¹ International Laboratory of Bioinformatics, HSE University, Moscow, Russia.

Computational and Structural Biotechnology Journal | March 31, 2025
Summary
This summary is machine-generated.

This study benchmarks large language models (LLMs) for whole-genome G-quadruplex (GQ) annotation, finding that different LLM architectures excel at detecting distinct functional genomic elements.

Keywords:
Caduceus, DNABERT, Flipons, Foundation model, G-quadruplexes, HyenaDNA, Large language model, MAMBA-DNA, Non-B DNA

Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • Large language models (LLMs) show promise in predicting genomic elements.
  • Selecting the optimal LLM for specific tasks like whole-genome annotation remains challenging.
  • LLMs in genomics are broadly categorized into transformer-based, long convolution-based, and state-space models (SSMs).

Purpose of the Study:

  • To benchmark different large language model (LLM) architectures for whole-genome annotation of G-quadruplexes (GQ).
  • To evaluate the performance of transformer-based, long convolution-based, and state-space models in identifying GQ structures.
  • To determine which LLM architectures are best suited for specific downstream genomic tasks.

Main Methods:

  • Benchmarking three LLM architectures (transformer-based, long convolution-based, SSMs) for whole-genome G-quadruplex (GQ) mapping.
  • Evaluating model performance using F1 and Matthews Correlation Coefficient (MCC) metrics.
  • Analyzing whole-genome annotations to identify distinct functional elements recovered by each model type.
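
The two metrics named above can be illustrated with a minimal sketch. It assumes binary labels (1 for a quadruplex, 0 for background) and uses made-up toy labels; the function names are hypothetical and this is not the authors' evaluation code. F1 is the harmonic mean of precision and recall, while MCC also accounts for true negatives, which makes it more informative when quadruplex-positive examples are rare.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = quadruplex, 0 = background)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def f1_score(tp, fp, tn, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy labels (invented for illustration only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

counts = confusion_counts(y_true, y_pred)
f1_value = f1_score(*counts)
mcc_value = mcc(*counts)
print(counts, f1_value, mcc_value)
```

On these toy labels the counts are 3 true positives, 1 false positive, 3 true negatives, and 1 false negative, giving F1 = 0.75 and MCC = 0.5.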

Main Results:

  • All evaluated LLMs performed comparably overall, with DNABERT-2 and HyenaDNA achieving the highest F1 and MCC scores.
  • HyenaDNA demonstrated enhanced recovery of quadruplexes in distal enhancers and intronic regions.
  • Architectures differed in their patterns of de novo quadruplex generation, most notably HyenaDNA and Caduceus compared with transformer-based models.
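
One way to compare which functional elements each model recovers is to tally predicted quadruplex intervals by the annotated region they overlap. The sketch below is an illustration under stated assumptions, not the study's pipeline: the coordinates, region labels, model names, and helper functions are all hypothetical, and intervals are treated as half-open (start inclusive, end exclusive).

```python
from collections import Counter

def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) share a base."""
    return a_start < b_end and b_start < a_end

def count_by_region(predictions, regions):
    """Count predicted intervals per region label.

    A prediction overlapping several labels counts once toward each;
    a prediction overlapping none is counted as "unannotated".
    """
    counts = Counter()
    for chrom, start, end in predictions:
        labels = {
            label
            for r_chrom, r_start, r_end, label in regions
            if r_chrom == chrom and overlaps(start, end, r_start, r_end)
        }
        counts.update(labels or {"unannotated"})
    return counts

# Hypothetical annotation and predictions (invented for illustration only).
regions = [
    ("chr1", 100, 200, "intron"),
    ("chr1", 500, 600, "distal_enhancer"),
]
model_a = [("chr1", 150, 180), ("chr1", 590, 620), ("chr1", 900, 930)]
model_b = [("chr1", 120, 140), ("chr1", 160, 190)]

counts_a = count_by_region(model_a, regions)
counts_b = count_by_region(model_b, regions)
print(dict(counts_a))
print(dict(counts_b))
```

The nested loop is fine for toy inputs; at whole-genome scale an interval tree or a sorted sweep would be the usual choice.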

Conclusions:

  • LLM architectures with varying context lengths can detect distinct functional regulatory elements.
  • The choice of LLM architecture is crucial for specific genomic tasks, as different models offer complementary strengths.
  • This study highlights the importance of selecting appropriate LLMs for accurate and comprehensive whole-genome annotation.