



Deciphering genomic codes using advanced NLP techniques: a scoping review.

Shuyan Cheng1, Yishu Wei1, Yiliang Zhou1

  • 1Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065.

arXiv | December 9, 2024 | PubMed
Summary
This summary is machine-generated.

Natural Language Processing (NLP) techniques and Large Language Models (LLMs) are increasingly applied to genomic data analysis. Methods such as tokenization and transformer models help interpret genomic sequences and aid in predicting regulatory elements.

Keywords:
Large Language Models, Natural Language Processing, genomic sequencing data, regulatory annotations


Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • Human genomic sequencing data is vast and complex, posing analytical challenges.
  • Effective interpretation requires advanced computational methodologies.

Purpose of the Study:

  • To review Natural Language Processing (NLP) techniques, specifically Large Language Models (LLMs) and transformer architectures, for genomic data analysis.
  • To assess data and model accessibility in recent literature concerning NLP in genomics.
  • To understand the capabilities and limitations of NLP tools in processing genomic sequencing data.

Main Methods:

  • A scoping review following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.
  • Searches conducted across major scientific databases (PubMed, Medline, Scopus, Web of Science, Embase, ACM Digital Library).
  • Inclusion criteria focused on NLP methodologies applied to genomic sequencing data, with no date or article type restrictions.

Main Results:

  • 26 studies published between 2021 and April 2024 were reviewed.
  • Tokenization and transformer models significantly improve genomic data processing and comprehension.
  • Applications include predicting regulatory annotations such as transcription-factor binding sites and chromatin accessibility.
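
To make the tokenization step mentioned in the results concrete, here is a minimal Python sketch of overlapping k-mer tokenization, one scheme commonly used to turn DNA into model input. The summary does not say which tokenization schemes the reviewed studies used, so the choice of k-mers, the function names (`kmer_tokenize`, `build_vocab`, `encode`), the `[UNK]` token, and the example sequence are all illustrative assumptions, not the review's method.

```python
from __future__ import annotations

from itertools import product


def kmer_tokenize(sequence: str, k: int = 3, stride: int = 1) -> list[str]:
    """Split a DNA string into k-mers taken every `stride` bases (assumed scheme)."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(k: int = 3, alphabet: str = "ACGT") -> dict[str, int]:
    """Map every possible k-mer to an integer id; id 0 is reserved for unknown tokens."""
    vocab = {"[UNK]": 0}  # hypothetical placeholder for k-mers with ambiguous bases
    for idx, letters in enumerate(product(alphabet, repeat=k), start=1):
        vocab["".join(letters)] = idx
    return vocab


def encode(sequence: str, vocab: dict[str, int], k: int = 3, stride: int = 1) -> list[int]:
    """Convert a DNA string into the integer ids a transformer embedding layer would consume."""
    unk = vocab["[UNK]"]
    return [vocab.get(token, unk) for token in kmer_tokenize(sequence, k, stride)]


# Illustrative sequence; the trailing N is an ambiguous base and maps to [UNK].
tokens = kmer_tokenize("ATGCGTAN", k=3)
ids = encode("ATGCGTAN", build_vocab(3), k=3)
print(tokens)  # ['ATG', 'TGC', 'GCG', 'CGT', 'GTA', 'TAN']
print(ids)
```

With `stride=1` the k-mers overlap, which preserves local context at the cost of longer token sequences; setting `stride=k` gives non-overlapping tokens and a shorter input.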

Conclusions:

  • NLP and LLMs offer a promising approach to streamline large-scale genomic data interpretation.
  • These AI tools can advance personalized medicine through efficient and scalable genomic analysis.
  • Further research is necessary to address limitations, improve model transparency, and enhance applicability.