Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling.

Yair Schiff (1), Chia-Hsiang Kao (1), Aaron Gokaslan (1)

  • (1) Department of Computer Science, Cornell University, New York, NY, USA.

Proceedings of Machine Learning Research | June 26, 2025
Summary
This summary is machine-generated.

We introduce Caduceus, a DNA language model architecture for long-range genomic sequences. Caduceus improves performance on downstream tasks and outperforms larger models that lack bi-directionality or equivariance.

Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • Large-scale sequence modeling has advanced rapidly and is now being applied to biology and genomics.
  • Genomic sequence modeling presents unique challenges, including long-range dependencies, upstream/downstream effects, and DNA's reverse complementarity (RC).
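
Reverse complementarity can be illustrated with a short Python sketch. DNA is double-stranded, so a sequence and its reverse complement describe the same molecule read from opposite strands. The helper name `reverse_complement` and the handling of the ambiguity code `N` are assumptions made for illustration, not part of the paper.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
# "N" (unknown base) is mapped to itself; this is an illustrative choice.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own 5'-to-3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


print(reverse_complement("ATGC"))  # GCAT
# Applying the operation twice returns the original sequence.
print(reverse_complement(reverse_complement("AACGTTG")))  # AACGTTG
```

Because both strands carry the same information, a model that treats a sequence and its reverse complement inconsistently is making an arbitrary choice of strand.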

Purpose of the Study:

  • To propose a novel sequence modeling architecture addressing the challenges of genomic data.
  • To develop the first family of RC-equivariant, bi-directional, long-range DNA language models.

Main Methods:

  • Extended the long-range Mamba block to create BiMamba for bi-directionality.
  • Developed MambaDNA, incorporating RC equivariance.
  • Introduced pre-training and fine-tuning strategies for Caduceus DNA foundation models.
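
The two properties named above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the authors' BiMamba or MambaDNA implementation: `causal_scan` is a hypothetical stand-in for a causal long-range block, and `bidirectional` and `rc_equivariant` are generic wrappers (run the operator in both directions and sum; symmetrize over the reverse complement) chosen because they are easy to verify. Inputs are one-hot arrays of shape (length, 4) with channels ordered A, C, G, T, so that reversing the channel axis swaps A with T and C with G.

```python
from typing import Callable, List

Seq = List[List[float]]  # shape (length, channels), channels ordered A, C, G, T


def causal_scan(x: Seq, decays=(0.9, 0.5, 0.3, 0.1)) -> Seq:
    """Toy causal recurrence h_t = decay_c * h_{t-1} + x_t, one decay per channel.
    Hypothetical stand-in for a causal sequence block: output t sees inputs <= t."""
    state = [0.0] * len(decays)
    out = []
    for row in x:
        state = [d * h + v for d, h, v in zip(decays, state, row)]
        out.append(list(state))
    return out


def rc(x: Seq) -> Seq:
    """Reverse complement on an (L, 4) array: flip the length and channel axes."""
    return [row[::-1] for row in reversed(x)]


def add(a: Seq, b: Seq) -> Seq:
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def bidirectional(f: Callable[[Seq], Seq]) -> Callable[[Seq], Seq]:
    """Run f left-to-right and right-to-left, re-align the second pass, and sum."""
    def wrapped(x: Seq) -> Seq:
        backward = f(x[::-1])[::-1]
        return add(f(x), backward)
    return wrapped


def rc_equivariant(f: Callable[[Seq], Seq]) -> Callable[[Seq], Seq]:
    """g(x) = f(x) + RC(f(RC(x))), which satisfies g(RC(x)) = RC(g(x))."""
    def wrapped(x: Seq) -> Seq:
        return add(f(x), rc(f(rc(x))))
    return wrapped


def one_hot(seq: str) -> Seq:
    return [[1.0 if base == b else 0.0 for b in "ACGT"] for base in seq]


def close(a: Seq, b: Seq, tol: float = 1e-9) -> bool:
    return all(abs(p - q) <= tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))


x = one_hot("AACGTTG")
model = rc_equivariant(bidirectional(causal_scan))
print(close(model(rc(x)), rc(model(x))))              # True: wrapper is RC-equivariant
print(close(causal_scan(rc(x)), rc(causal_scan(x))))  # False: bare operator is not
```

In this sketch the equivariance comes from parameter sharing: the same operator processes both strands, so no extra weights are needed and the output for one strand is the reverse complement of the output for the other.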

Main Results:

  • Caduceus demonstrates superior performance on downstream benchmarks compared to previous long-range models.
  • On a variant effect prediction task, Caduceus outperformed models 10x larger that lacked bi-directionality or equivariance.

Conclusions:

  • Caduceus advances DNA language modeling by combining bi-directionality, RC equivariance, and long-range context in a single model family.
  • The architecture is built around the characteristics of genomic sequences, and this design enables state-of-the-art performance on the benchmarks evaluated.