ARCADE: Controllable Codon Design from Foundation Models via Activation Engineering.

Jiayi Li [1], Litian Liang [1], Shiyi Du [1]

  • [1] Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 15217, US.

bioRxiv: the preprint server for biology | September 2, 2025
Summary
This summary is machine-generated.

ARCADE offers flexible control over mRNA codon sequence design by using activation engineering and pretrained genomic models. This approach enhances programmable biological sequence design for applications like mRNA vaccines and gene editing therapies.

Keywords:
activation engineering; controllable sequence generation; genomic language model (gLM); mRNA design; test time adaptation


Area of Science:

  • Computational biology and synthetic genomics.
  • The intersection of deep learning foundation models and controllable codon design for therapeutic mRNA development.
  • Activation engineering techniques for programmable biological sequence design.

Background:

Designing functional messenger ribonucleic acid (mRNA) sequences requires arranging nucleotide triplets (codons) precisely so that the protein is expressed efficiently and the transcript remains stable in the cellular environment. Prior research has shown that codon selection significantly influences the translation rate and the proper folding of the resulting polypeptide chain by affecting transfer RNA (tRNA) availability and ribosomal movement. Traditional optimization strategies often rely on fixed algorithms that target a single metric, such as the codon adaptation index or sequence-wide guanine-cytosine (GC) content, to improve protein yield. These established techniques frequently struggle to balance several competing functional properties at once, which can yield sub-optimal sequences for demanding therapeutic applications. Existing computational frameworks also tend to require extensive retraining or fine-tuning whenever a new design objective is introduced, which consumes significant time and computational resources. These limitations motivated the development of a more adaptable system capable of steering sequence generation toward diverse biological targets without exhaustive model updates.
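To make the two metrics named above concrete, the following is a minimal sketch of how GC content and a codon adaptation index are commonly computed for a coding sequence. It is not taken from the ARCADE work: the function names and the `TOY_WEIGHTS` table are hypothetical, and the weights are made-up illustration values rather than measured codon usage.

```python
from math import exp, log


def gc_content(seq: str) -> float:
    """Return the fraction of G or C nucleotides in a sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def codon_adaptation_index(seq: str, weights: dict[str, float]) -> float:
    """Return the geometric mean of per-codon relative adaptiveness weights.

    Codons missing from `weights` are skipped; trailing bases that do not
    form a full codon are ignored.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    scored = [weights[c] for c in codons if c in weights]
    if not scored:
        raise ValueError("no codon in the sequence has a weight")
    return exp(sum(log(w) for w in scored) / len(scored))


# Hypothetical weights (illustration only): 1.0 marks the preferred codon
# for an amino acid, smaller values mark rarer synonymous codons.
TOY_WEIGHTS = {"CUG": 1.0, "CUA": 0.1, "AAG": 1.0, "AAA": 0.5}

preferred = "CUGAAGCUGAAG"  # Leu-Lys-Leu-Lys using preferred codons
rare = "CUAAAACUAAAA"       # same peptide using rarer codons

for name, seq in [("preferred", preferred), ("rare", rare)]:
    print(name, gc_content(seq), codon_adaptation_index(seq, TOY_WEIGHTS))
```

Both toy sequences encode the same peptide, yet they differ in both metrics, which is the trade-off space that codon design has to navigate.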

Purpose Of The Study:

This research introduces ARCADE to provide a flexible and controllable framework for generating optimized codon sequences from pretrained genomic foundation models that have learned biological syntax. The investigators sought to overcome the rigidity of current codon optimization methods by leveraging the internal representations of large-scale neural networks that already understand complex genomic patterns. The study focuses on enabling the modulation of continuous biological metrics, such as thermodynamic stability and sequence composition, without the need for model retraining or architectural modifications. By implementing activation engineering, the team aimed to manipulate the model's latent space to achieve specific functional outcomes in the output mRNA that are critical for therapeutic efficacy. The project specifically targets the improvement of mRNA vaccines and gene editing therapies through enhanced sequence programmability and the ability to meet diverse design constraints. The researchers intended to show that semantic steering vectors could effectively guide the generation process toward desired phenotypic traits by shifting the model's internal activations. This approach provides a versatile tool for synthetic biologists who require precise control over the genetic instructions they engineer for medical use.

Main Methods:

The researchers developed the ARCADE framework by applying activation engineering techniques to pretrained genomic foundation models that capture the underlying syntax of biological sequences across various organisms. The team defined biologically meaningful semantic steering vectors within the activation space of the neural network to represent specific functional directions for sequence optimization. These vectors were designed to modulate continuous-valued properties, including the Codon Adaptation Index (CAI), which measures the usage of preferred codons, and the Minimum Free Energy (MFE). The experimental setup also included the manipulation of GC content to assess the degree of control over sequence composition and its impact on mRNA half-life and stability. The methodology avoids the computationally expensive process of retraining the foundation model for each specific design task by directly intervening in the model's hidden layers during inference. The performance of ARCADE was evaluated by comparing its output against existing codon optimization approaches across multiple design objectives to ensure its utility in real-world biological engineering. This comparative analysis allowed the team to quantify the improvements in flexibility and precision offered by their novel activation-based steering method.
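The general idea of steering a model through its activations can be sketched in a few lines. The summary does not say how ARCADE derives its steering vectors, so the sketch below assumes one common recipe from the activation-engineering literature: take the difference between the mean hidden activations of examples that are high and low in a property, then add a scaled copy of that direction to the hidden state at inference. All names (`steering_vector`, `steer`, `toy_activations`) and the random data are hypothetical stand-ins, not the authors' implementation.

```python
import math
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_vector(high_acts, low_acts):
    """Unit-norm difference of mean activations (assumed recipe, see lead-in)."""
    diff = [h - l for h, l in zip(mean_vector(high_acts), mean_vector(low_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def steer(hidden, vector, alpha):
    """Shift every position's hidden state by alpha along the steering direction."""
    return [[h + alpha * v for h, v in zip(row, vector)] for row in hidden]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


random.seed(0)
D_MODEL = 8  # hypothetical hidden size


def toy_activations(n, offset):
    """Random stand-in activations; dimension 0 carries the property signal."""
    return [
        [random.gauss(offset if j == 0 else 0.0, 1.0) for j in range(D_MODEL)]
        for _ in range(n)
    ]


high = toy_activations(200, +2.0)  # e.g. activations for high-CAI sequences
low = toy_activations(200, -2.0)   # e.g. activations for low-CAI sequences
v = steering_vector(high, low)

hidden = toy_activations(5, 0.0)   # hidden states at 5 sequence positions
steered = steer(hidden, v, alpha=3.0)

# Each position moves by alpha along v; the model weights are never touched.
shifts = [dot(s, v) - dot(h, v) for s, h in zip(steered, hidden)]
print([round(x, 6) for x in shifts])
```

Because the intervention only edits activations, the strength `alpha` acts as a continuous knob at inference time, which matches the summary's description of modulating continuous-valued properties without retraining.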

Main Results:

ARCADE showed superior performance and significantly greater flexibility compared to traditional codon optimization methodologies across all tested biological metrics and sequence generation tasks. The implementation of semantic steering vectors allowed for the direct and precise modulation of the Codon Adaptation Index (CAI) within the generated sequences, facilitating optimal translation. The framework successfully controlled the Minimum Free Energy (MFE) of the mRNA, which is a fundamental factor for secondary structure stability and resistance to enzymatic degradation. The researchers observed that GC content could be precisely adjusted through the activation engineering process without degrading the overall sequence quality or functional potential of the mRNA. Experimental data confirmed that the foundation model's inherent knowledge could be effectively harnessed for programmable biological sequence design through simple vector-based interventions in the latent space. The results indicated that the proposed approach maintains high functional integrity while providing a broader range of controllable parameters than previous tools used in synthetic biology. These findings highlight the efficiency of using activation engineering to steer foundation models toward specific biological goals without the need for additional training data.

Conclusions:

The findings suggest that activation engineering represents a powerful paradigm for the precise design of therapeutic mRNA sequences with tailored functional properties for clinical use. The researchers conclude that ARCADE offers a scalable solution for developing novel mRNA vaccines with optimized translation and stability profiles that can be adapted to specific viral targets. The ability to control multiple biological metrics simultaneously may accelerate the creation of more effective gene editing therapies by ensuring high expression of editing enzymes in target cells. Future applications of this technology could extend to other areas of synthetic biology where sequence-to-function mapping is essential for the design of synthetic genes and regulatory elements. The study establishes a foundation for using genomic foundation models as highly adaptable tools for programmable molecular engineering without the need for task-specific fine-tuning or retraining. The authors propose that this framework will reduce the computational burden associated with designing complex biological systems by providing a more efficient path to sequence optimization. This technological advancement could significantly shorten the development timelines for new genetic medicines and improve the precision of synthetic biology interventions.

ARCADE defines semantic steering vectors within the activation space of pretrained genomic foundation models to modulate the Codon Adaptation Index (CAI). By shifting the model's internal representations, the framework guides the generation process toward sequences with optimized codon usage without requiring any retraining of the underlying neural network.

The ARCADE framework enables the direct modulation of the Codon Adaptation Index (CAI), Minimum Free Energy (MFE), and GC content. These continuous-valued properties are adjusted by defining specific vectors in the model's activation space, allowing for precise control over the functional characteristics of the designed mRNA.
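For reference, the three properties have standard textbook definitions, given below. The summary does not state which exact variants ARCADE uses, so treat the notation as an assumption: $c_1,\dots,c_L$ are the codons of a sequence $s$ with $N$ nucleotides, $f_c$ is the usage frequency of codon $c$ in a reference gene set, $\mathrm{syn}(c)$ is the set of codons synonymous with $c$, $n_G$ and $n_C$ count G and C nucleotides, and $\Delta G(S)$ is the folding free energy of secondary structure $S$.

```latex
\begin{align}
  w_c &= \frac{f_c}{\max_{c' \in \mathrm{syn}(c)} f_{c'}}, &
  \mathrm{CAI}(s) &= \Bigl(\prod_{i=1}^{L} w_{c_i}\Bigr)^{1/L}, \\
  \mathrm{GC}(s) &= \frac{n_G + n_C}{N}, &
  \mathrm{MFE}(s) &= \min_{S \in \mathcal{S}(s)} \Delta G(S).
\end{align}
```

CAI and GC content lie in $[0, 1]$, while MFE is typically negative, with lower values indicating a more stable predicted fold.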

Activation engineering was selected because it allows for flexible control over biological metrics like Minimum Free Energy (MFE) without the computational cost of retraining. This approach leverages the inherent knowledge of pretrained genomic foundation models, enabling rapid adaptation to various design objectives in mRNA vaccine development.

The authors developed ARCADE to address the design requirements of novel mRNA vaccines and gene editing therapies. The framework is specifically tailored to generate codon sequences with desired functional properties, though its flexibility allows it to adapt to various other programmable biological sequence design tasks.

The study's authors propose that ARCADE underscores the potential for advancing programmable biological sequence design by harnessing pretrained genomic foundation models. They conclude that this framework provides a far greater level of flexibility than existing codon optimization approaches, facilitating the development of complex synthetic biological systems.