CodonMoE: DNA language models for codon-dependent mRNA prediction.

Shiyi Du¹, Litian Liang¹, Jiayi Li¹

  • ¹ Ray and Stephanie Lane Computational Biology Department, Carnegie Mellon University, PA 15213, United States.

Bioinformatics (Oxford, England) | May 11, 2026
Summary
This summary is machine-generated.

CodonMoE is a lightweight adapter that lets genomic language models pretrained on DNA analyze RNA sequences. Adding it to a DNA model improves performance on RNA tasks while keeping parameter counts and computational cost below those of specialized RNA models.

Area of Science:

  • Genomics
  • Computational Biology
  • Machine Learning

Background:

  • Genomic language models (gLMs) face efficiency challenges, requiring either separate models for DNA and RNA or large multi-modal architectures.
  • Both approaches lead to computational burdens, including redundant infrastructure or increased parameter counts and pretraining.
  • Existing methods struggle to efficiently bridge the gap between DNA and RNA genomic data analysis.

Purpose of the Study:

  • To develop a lightweight adapter, CodonMoE (Adaptive Mixture of Codon Reformative Experts), that enables DNA language models to analyze RNA sequences without specific RNA pretraining.
  • To establish CodonMoE as a universal approximator at the codon level for mapping codon sequences to RNA properties.
  • To improve the efficiency and performance of genomic language models for analyzing both DNA and RNA modalities.
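
To make "codon level" concrete, the sketch below splits a coding sequence into non-overlapping nucleotide triplets and maps each to one of the 4³ = 64 possible codons. It is an illustration only, not the authors' preprocessing: the function names, the U-to-T conversion (so that an mRNA sequence fits a DNA alphabet), and the choice to drop an incomplete trailing codon are all assumptions.

```python
from itertools import product

# All 64 codons over the DNA alphabet, in a fixed lexicographic order.
NUCLEOTIDES = "ACGT"
CODONS = ["".join(p) for p in product(NUCLEOTIDES, repeat=3)]
CODON_INDEX = {codon: i for i, codon in enumerate(CODONS)}


def rna_to_dna(seq):
    """Rewrite an mRNA sequence in the DNA alphabet (assumption: U -> T)."""
    return seq.upper().replace("U", "T")


def split_codons(seq):
    """Split a sequence, read from position 0, into complete triplets.

    Any trailing one or two nucleotides are dropped (a choice made for
    this sketch, not something stated in the source).
    """
    dna = rna_to_dna(seq)
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def codon_ids(seq):
    """Map each codon to an integer id in [0, 63].

    Raises KeyError for symbols outside A/C/G/T/U, such as N.
    """
    return [CODON_INDEX[c] for c in split_codons(seq)]
```

For example, `split_codons("AUGGCCUAA")` yields the three triplets `ATG`, `GCC`, and `TAA`, which is the unit a codon-level adapter would operate on.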

Main Methods:

  • Introduced CodonMoE, a lightweight adapter designed to transform existing DNA language models for RNA analysis.
  • Theoretically analyzed CodonMoE's capability as a universal codon-level approximator.
  • Augmented DNA models (e.g., HyenaDNA) with CodonMoE and evaluated performance on four RNA prediction tasks spanning stability, expression, and regulation.
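
The adapter idea above can be sketched as a small mixture-of-experts layer applied to codon windows of a backbone's per-nucleotide embeddings. This is a hedged, minimal illustration in plain Python, not the published CodonMoE architecture: the class name `CodonMoESketch`, the dense softmax gate, the linear experts, the residual connection, and the random initialization are all assumptions chosen to show the general technique.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class CodonMoESketch:
    """Hypothetical codon-level mixture-of-experts adapter (illustration only)."""

    def __init__(self, dim, num_experts=4, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # The gate scores each expert from the concatenated codon embedding.
        self.gate = random_matrix(num_experts, 3 * dim, rng)
        # Each expert is a linear map on the concatenated codon embedding.
        self.experts = [random_matrix(3 * dim, 3 * dim, rng)
                        for _ in range(num_experts)]

    def forward(self, embeddings):
        """Take per-nucleotide embeddings and return adapted embeddings."""
        d = self.dim
        usable = len(embeddings) - len(embeddings) % 3
        # Trailing nucleotides that do not fill a codon pass through unchanged.
        out = [list(e) for e in embeddings]
        for start in range(0, usable, 3):
            # Concatenate the three nucleotide embeddings of this codon.
            codon = [v for e in embeddings[start:start + 3] for v in e]
            weights = softmax(matvec(self.gate, codon))
            mixed = [0.0] * (3 * d)
            for w, expert in zip(weights, self.experts):
                y = matvec(expert, codon)
                mixed = [m + w * yi for m, yi in zip(mixed, y)]
            # Residual update, then split back into per-nucleotide vectors.
            for k in range(3):
                out[start + k] = [codon[k * d + j] + mixed[k * d + j]
                                  for j in range(d)]
        return out
```

In this sketch the backbone stays untouched and only the gate and expert weights would be trained, which is one way an adapter can add codon awareness at a small parameter cost. The embeddings passed to `forward` stand in for the output of a DNA model such as HyenaDNA; no real model API is called here.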

Main Results:

  • DNA models augmented with CodonMoE significantly outperformed unmodified DNA models on RNA prediction tasks.
  • The HyenaDNA+CodonMoE models achieved state-of-the-art results, using 80% fewer parameters than specialized RNA models.
  • CodonMoE demonstrated sub-quadratic complexity while enhancing performance, offering a unified approach to genomic language modeling.

Conclusions:

  • CodonMoE provides an efficient solution to the modality gap in genomic language modeling by enabling DNA models to analyze RNA.
  • This approach leverages abundant DNA data, reduces computational overhead, and achieves superior performance on RNA-specific tasks.
  • CodonMoE offers a principled method for unifying genomic language modeling, enhancing efficiency and performance across modalities.