

Updated: May 21, 2026


Workflow for Fine-Tuning and Evaluating DNA Language Models for Specific Genomics Issues.

Kazuki Nakamae¹,², Hidemasa Bono²,³,⁴

  • ¹PtBio Inc., Higashi-Hiroshima, Japan.

Bio-Protocol | May 20, 2026
Summary
This summary is machine-generated.

This study presents a computational protocol for fine-tuning DNA language models such as DNABERT-2 on genomic tasks. It covers data preparation, model fine-tuning, and performance evaluation for binary sequence classification.

Keywords:
Cytosine base editor; DNA language model; DNABERT-2; Deep learning; EPDnew; Fine-tuning; Promoter; RNA off-target


Area of Science:

  • Genomics
  • Computational Biology
  • Bioinformatics

Background:

  • DNA language models (e.g., DNABERT-2) show promise for predicting functional genomic elements.
  • Practical protocols for preparing data, fine-tuning, and evaluating these models are lacking.

Purpose of the Study:

  • To present a step-by-step computational protocol for preparing training data, fine-tuning DNABERT-2, and evaluating binary sequence classifiers.
  • To enable researchers to adapt DNA foundation models for new genomic applications.

Main Methods:

  • Developed a command-line protocol for creating DNABERT-2 compatible datasets from positive and negative sequence sets.
  • Fine-tuned the official DNABERT-2 model using prepared datasets.
  • Demonstrated protocol application using RNA off-target sites (PiCTURE pipeline) and core promoter prediction (EPDnew database).
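The dataset-preparation step can be illustrated with a minimal sketch. It is not the authors' command-line tool: the function names, the filtering rules, and the 80/10/10 split are assumptions made for illustration, and the `sequence,label` header reflects the CSV layout that the public DNABERT-2 fine-tuning script is understood to expect. Positive sequences are labeled 1 and negative sequences 0.

```python
import csv
import io
import random


def build_labeled_rows(positives, negatives):
    """Label positives 1 and negatives 0 (hypothetical helper).

    Sequences are upper-cased; empty sequences, sequences containing
    characters other than A/C/G/T, and repeats are skipped. A sequence
    present in both sets keeps its first (positive) label.
    """
    rows = []
    seen = set()
    for label, seqs in ((1, positives), (0, negatives)):
        for seq in seqs:
            seq = seq.strip().upper()
            if not seq or set(seq) - set("ACGT") or seq in seen:
                continue
            seen.add(seq)
            rows.append((seq, label))
    return rows


def split_rows(rows, dev_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle reproducibly, then cut into train/dev/test partitions."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_dev = int(len(rows) * dev_frac)
    n_test = int(len(rows) * test_frac)
    return {
        "dev": rows[:n_dev],
        "test": rows[n_dev:n_dev + n_test],
        "train": rows[n_dev + n_test:],
    }


def write_csv(rows, handle):
    """Write rows with a `sequence,label` header to an open text handle."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["sequence", "label"])
    writer.writerows(rows)


if __name__ == "__main__":
    # Toy sequences only; real inputs would come from the positive and
    # negative sequence sets described above.
    rows = build_labeled_rows(["ACGTACGT", "TTGACATA"], ["GGGGCCCC", "ATATATAT"])
    buffer = io.StringIO()
    write_csv(rows, buffer)
    print(buffer.getvalue())
```

In practice each partition returned by `split_rows` would be written to its own file (for example `train.csv`, `dev.csv`, and `test.csv`) by passing an open file handle to `write_csv`. Fixing the shuffle seed keeps the split reproducible across runs.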

Main Results:

  • Successfully prepared labeled DNA sequence datasets for DNABERT-2 fine-tuning.
  • Demonstrated that the DNABERT-2 fine-tuning workflow is reproducible for binary classification.
  • Applied the protocol to RNA off-target prediction and core promoter identification.
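Evaluating a binary sequence classifier usually comes down to comparing predicted labels against held-out true labels. The sketch below computes metrics commonly reported for such classifiers (accuracy, precision, recall, F1, and the Matthews correlation coefficient) in plain Python. The function names are hypothetical, and the summary does not state which metrics the protocol itself reports, so the selection here is an assumption.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for labels encoded as 0 and 1."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            raise ValueError("labels must be 0 or 1")
    return tp, fp, tn, fn


def binary_metrics(y_true, y_pred):
    """Compute common binary-classification metrics.

    Any ratio with a zero denominator is reported as 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Toy labels: 3 true positives, 1 false positive,
    # 3 true negatives, 1 false negative.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(y_true, y_pred))
```

The Matthews correlation coefficient is included because it stays informative when positive and negative sets differ greatly in size, a situation that can arise when negatives are sampled from genomic background.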

Conclusions:

  • The protocol facilitates the adaptation of DNA foundation models for diverse genomic tasks.
  • Enables safety assessment of genome editing tools and functional annotation of regulatory sequences.
  • Provides a reproducible framework for applying DNABERT-2 to new biological problems.