Understanding the natural language of DNA using encoder-decoder foundation models with byte-level precision.

Aditya Malusare (1,2), Harish Kothandaraman (2), Dipesh Tamboli (3)

  • (1) School of Industrial Engineering, Purdue University, West Lafayette, IN 47907, United States.

Bioinformatics Advances | August 23, 2024
Summary
This summary is machine-generated.

The Ensemble Nucleotide Byte-level Encoder-Decoder (ENBED) foundation model analyzes DNA sequences at byte-level precision using an encoder-decoder Transformer. It achieves state-of-the-art results on genomic tasks such as enhancer identification and detection of base-call and insertion/deletion errors.

Area of Science:

  • Genomics
  • Bioinformatics
  • Machine Learning

Background:

  • Genomic language models traditionally rely on tokenization schemes that group several base pairs into one token (for example, k-mers).
  • Existing models often use encoder-only or decoder-only architectures.
  • Byte-level precision in DNA sequence analysis is crucial for detailed insights.
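
To make the contrast in the points above concrete, the following is a minimal sketch of byte-level versus multi-base tokenization. The function names, the choice of non-overlapping 3-mers, and the toy sequence are assumptions for illustration only; they are not taken from the ENBED paper or its code.

```python
def byte_tokenize(seq: str) -> list[int]:
    """One token per nucleotide: the ASCII byte value of each character."""
    return list(seq.encode("ascii"))


def kmer_tokenize(seq: str, k: int = 3) -> list[str]:
    """Non-overlapping k-mers; a trailing fragment shorter than k is kept."""
    return [seq[i:i + k] for i in range(0, len(seq), k)]


reference = "ACGTACGTAC"
# Simulate a single-base deletion at index 4 (the second 'A').
deleted = reference[:4] + reference[5:]

byte_ref, byte_del = byte_tokenize(reference), byte_tokenize(deleted)
kmer_ref, kmer_del = kmer_tokenize(reference), kmer_tokenize(deleted)

# Byte level: the two token lists differ by exactly one removed token.
print(len(byte_ref) - len(byte_del))
# k-mer level: the deletion shifts the reading frame, so every token
# after the first changes and the edit is no longer localized.
print(kmer_ref, kmer_del)
```

In this toy case the deletion removes a single byte-level token, whereas in the 3-mer view it rewrites all tokens downstream of the edit, which is the loss of precision the Background refers to.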

Purpose of the Study:

  • Introduce the Ensemble Nucleotide Byte-level Encoder-Decoder (ENBED) foundation model.
  • Develop an efficient sequence-to-sequence transformation model for genomic data.
  • Generalize and improve upon existing genomic foundation models.

Main Methods:

  • Utilized an encoder-decoder Transformer architecture with subquadratic attention.
  • Pretrained the model using Masked Language Modeling on reference genome sequences.
  • Applied the model to downstream tasks including functional annotation and error detection.
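
The Masked Language Modeling step can be sketched roughly as follows. This is a simplified per-token masking scheme, not the authors' implementation: the summary does not specify the masking strategy, and `MASK`, `IGNORE`, `mask_bytes`, and the masking probability are all hypothetical choices made for illustration.

```python
import random

MASK = ord("?")   # hypothetical mask byte; the real model's mask token is not given here
IGNORE = -1       # hypothetical label for positions that do not contribute to the loss


def mask_bytes(tokens: list[int], mask_prob: float = 0.15,
               seed: int = 0) -> tuple[list[int], list[int]]:
    """Replace a random subset of byte tokens with MASK.

    Returns (inputs, targets): targets hold the original byte at masked
    positions and IGNORE everywhere else.
    """
    rng = random.Random(seed)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(tok)
        else:
            inputs.append(tok)
            targets.append(IGNORE)
    return inputs, targets


tokens = list("ACGTTGCAACGT".encode("ascii"))
inputs, targets = mask_bytes(tokens, mask_prob=0.3, seed=1)
```

During pretraining, a model would be trained to predict the entries of `targets` that are not `IGNORE` from the corrupted `inputs`.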

Main Results:

  • Achieved state-of-the-art performance in identifying enhancers, promoters, and splice sites.
  • Successfully recognized sequences with base call mismatches and insertion/deletion errors.
  • Demonstrated significant improvements in biological function annotation and viral mutation generation.
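
The error classes named above (base-call mismatches, insertions, and deletions) can be illustrated with a small sketch that injects one such error into a clean sequence. How the authors constructed their error-detection data is not described in this summary, so `inject_error` and its parameters are hypothetical.

```python
import random

BASES = "ACGT"


def inject_error(seq: str, kind: str, pos: int, rng: random.Random) -> str:
    """Return a copy of seq with one synthetic error at index pos."""
    if kind == "mismatch":
        # Substitute a different base at pos; length is unchanged.
        alt = rng.choice([b for b in BASES if b != seq[pos]])
        return seq[:pos] + alt + seq[pos + 1:]
    if kind == "insertion":
        # Insert a random base before pos; length grows by one.
        return seq[:pos] + rng.choice(BASES) + seq[pos:]
    if kind == "deletion":
        # Drop the base at pos; length shrinks by one.
        return seq[:pos] + seq[pos + 1:]
    raise ValueError(f"unknown error kind: {kind}")


rng = random.Random(0)
clean = "ACGTACGT"
examples = {kind: inject_error(clean, kind, 3, rng)
            for kind in ("mismatch", "insertion", "deletion")}
```

Pairs of clean and corrupted sequences like these could serve as labelled inputs for a classifier that flags erroneous reads; each error changes exactly one position, which is why single-nucleotide resolution matters for this task.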

Conclusions:

  • ENBED analyzes genomic sequences at byte-level precision, which tokenization schemes spanning multiple base pairs cannot provide.
  • The foundation model generalizes effectively across diverse downstream tasks.
  • ENBED represents a significant advancement in computational genomics.