Genomic language models: opportunities and challenges.

Gonzalo Benegas1, Chengzhong Ye2, Carlos Albors1

  • 1Computer Science Division, University of California, Berkeley, CA, USA.

Trends in Genetics : TIG | January 3, 2025
Summary
This summary is machine-generated.

Genomic language models (gLMs), a type of large language model (LLM) trained on DNA, offer powerful tools for understanding genome function and interactions. Developing effective gLMs for complex genomes remains challenging but holds significant potential for biomedical research.

Keywords:
genomic language models, machine learning, sequence design, transfer learning, variant effect prediction

Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • Large language models (LLMs) are revolutionizing scientific research, including the biomedical sciences.
  • Understanding biological sequences, particularly DNA, is a central goal in biology.
  • Genomic language models (gLMs) apply LLM principles to DNA sequences for genomic analysis.

Purpose of the Study:

  • To highlight the potential of gLMs in advancing genomic understanding.
  • To showcase key applications of gLMs in functional constraint prediction, sequence design, and transfer learning.
  • To discuss challenges and considerations in developing and evaluating gLMs, especially for complex genomes.

Main Methods:

  • Training LLMs on large datasets of DNA sequences.
  • Applying gLMs to predict functional constraints within genomes.
  • Utilizing gLMs for DNA sequence design and generation.
  • Exploring transfer learning techniques with gLMs across different species or genomic contexts.
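
To make the functional constraint idea concrete, here is a minimal sketch of one common way such predictions are framed: score a variant by how much less (or more) likely the model finds the alternate base than the reference base in the same context. Everything here is an assumption for illustration. The count-based context model is only a toy stand-in for a trained gLM, and the names `train_context_model`, `base_probabilities`, and `variant_score` are hypothetical, not taken from the article or any library.

```python
import math
from collections import Counter, defaultdict

ALPHABET = "ACGT"


def train_context_model(sequences, k=2):
    """Count which base follows each k-base left context.

    Toy stand-in for gLM training (assumption): a real gLM learns these
    conditional distributions with a neural network, not a count table.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(k, len(seq)):
            counts[seq[i - k:i]][seq[i]] += 1
    return counts


def base_probabilities(counts, context, pseudocount=1.0):
    """Return smoothed P(base | context) for each base in ALPHABET.

    The pseudocount keeps every probability above zero, so the
    logarithms taken in variant_score are always defined.
    """
    ctx_counts = counts.get(context, Counter())
    total = sum(ctx_counts.values()) + pseudocount * len(ALPHABET)
    return {b: (ctx_counts[b] + pseudocount) / total for b in ALPHABET}


def variant_score(counts, seq, pos, alt, k=2):
    """Log-likelihood ratio: log P(alt | context) - log P(ref | context).

    A negative score means the model finds the alternate base less
    expected than the reference base. Requires pos >= k so that a full
    left context exists.
    """
    probs = base_probabilities(counts, seq[pos - k:pos])
    return math.log(probs[alt]) - math.log(probs[seq[pos]])


# Usage: train on a toy repeat, then score a G>T change at position 2.
model = train_context_model(["ACGTACGTACGT"], k=2)
score = variant_score(model, "ACGT", pos=2, alt="T")
print(round(score, 3))  # prints -1.386
```

In this toy example the context "AC" is always followed by G in the training sequence, so replacing G with T receives a negative score. With a real gLM the same comparison would be made on the model's predicted base probabilities at the variant position.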

Main Results:

  • gLMs demonstrate significant potential for deciphering genome function and DNA element interactions.
  • Applications in functional constraint prediction, sequence design, and transfer learning show promising results.
  • gLMs can help elucidate how DNA elements contribute to complex biological functions.

Conclusions:

  • gLMs represent a powerful emerging tool for genomic research with broad applicability.
  • Challenges remain in developing efficient and effective gLMs, particularly for large and complex genomes.
  • Further research into gLM development and evaluation is crucial for unlocking their full potential in genomics and the biomedical sciences.