Fine-tuning a sentence transformer for DNA.

Mpho Mokoatle¹,², Vukosi Marivate³, Darlington Mapiye⁴

  • ¹Department of Computer Science, University of Pretoria, Pretoria, South Africa. mphomokoatle64@gmail.com.

BMC Bioinformatics | October 30, 2025
Summary
This summary is machine-generated.

Fine-tuning a natural language sentence transformer on DNA sequences shows promise. The proposed model offers a practical balance of computational efficiency and accuracy: it outperforms DNABERT on several tasks and is a viable, less costly alternative to the Nucleotide transformer.

Keywords:
BERT, DNABERT, Sentence transformers, SimCSE, The nucleotide transformer

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • The sentence-transformers library produces text embeddings for a range of applications.
  • Embeddings in a vector space make it possible to compare texts by similarity.
  • Existing DNA transformers such as DNABERT and the Nucleotide transformer are trained specifically for DNA.
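
To make the idea of similarity in an embedding space concrete, here is a minimal sketch of cosine similarity between two vectors. The three-dimensional vectors are made-up placeholders, not outputs of any model discussed here; real sentence embeddings typically have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, for illustration only.
emb_a = [1.0, 0.0, 1.0]
emb_b = [2.0, 0.0, 2.0]  # same direction as emb_a
emb_c = [0.0, 1.0, 0.0]  # orthogonal to emb_a

sim_ab = cosine_similarity(emb_a, emb_b)  # close to 1: same direction
sim_ac = cosine_similarity(emb_a, emb_c)  # close to 0: unrelated directions
```

Cosine similarity ignores vector length, which is why `emb_a` and `emb_b` count as maximally similar even though one is twice as long.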

Purpose of the Study:

  • Fine-tune a natural language sentence transformer for DNA sequences.
  • Evaluate its performance on eight benchmark tasks.
  • Compare its efficacy against DNABERT and the Nucleotide transformer.

Main Methods:

  • Fine-tuning a sentence transformer model on DNA text.
  • Evaluating the model across eight benchmark tasks.
  • Comparing the model with DNABERT and the Nucleotide transformer.
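
One common way to present a DNA sequence to a text model is to split it into overlapping k-mers and join them with spaces, so that each k-mer acts like a word. This summary does not state how the DNA text was built, so the function name `dna_to_sentence`, the default `k=6`, and the stride are assumptions chosen purely for illustration.

```python
def dna_to_sentence(sequence, k=6, stride=1):
    """Turn a DNA string into a space-separated 'sentence' of k-mers.

    k and stride are illustrative defaults, not values taken from the study.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = sequence.upper()
    # Slide a window of width k across the sequence, moving by `stride`.
    kmers = [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]
    return " ".join(kmers)


sentence = dna_to_sentence("ACGTACGT", k=6)
```

A sequence shorter than `k` yields an empty string in this sketch; a real pipeline would need to decide how to handle such inputs.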

Main Results:

  • The proposed model's DNA embeddings surpassed DNABERT in several tasks.
  • The Nucleotide transformer achieved higher classification accuracy, but at significant computational cost.
  • The proposed model demonstrated competitive performance and efficiency, especially in retrieval tasks.
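
A retrieval task of the kind mentioned above can be sketched as ranking a corpus of embeddings by cosine similarity to a query embedding. The helper `top_k`, the sequence identifiers, and the toy two-dimensional vectors are hypothetical; this is not the benchmark's actual evaluation protocol.

```python
import math


def _cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, corpus, k=2):
    """Return the k corpus ids most similar to the query, best first."""
    scored = [(_cosine(query, vec), doc_id) for doc_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings keyed by made-up sequence ids.
corpus = {
    "seq_a": [1.0, 0.0],
    "seq_b": [0.8, 0.6],
    "seq_c": [0.0, 1.0],
}
hits = top_k([1.0, 0.1], corpus, k=2)
```

With these toy vectors the query points almost along `seq_a`, so `seq_a` ranks first and `seq_b` second. For large corpora, this linear scan would usually be replaced by an approximate nearest-neighbour index.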

Conclusions:

  • The fine-tuned sentence transformer provides a viable option for DNA sequence analysis.
  • It balances computational efficiency and accuracy, offering a practical alternative for resource-constrained settings.
  • The proposed model outperforms DNABERT and presents a more computationally feasible option than the Nucleotide transformer.