Biological Sequence Representation Methods and Recent Advances: A Review.

Hongwei Zhang¹, Yan Shi², Yapeng Wang¹

  • ¹Faculty of Applied Sciences, Macao Polytechnic University, Macau 999078, China.

Biology | September 27, 2025
Summary
This summary is machine-generated.

Biological sequence representation methods are evolving from computational and word embedding techniques to advanced large language models (LLMs). These LLM-based methods enhance machine learning for genomics, drug discovery, and disease prediction.

Keywords:
biological sequence, computational, large language model, machine learning, word embedding

Area of Science:

  • Computational biology
  • Bioinformatics
  • Machine learning in biology

Background:

  • Biological sequence representation is crucial for machine learning in computational biology.
  • Methods transform nucleotide and protein sequences for enhanced predictive modeling.

Purpose of the Study:

  • To review and categorize biological-sequence representation methods.
  • To detail principles, applications, and limitations of different methods.
  • To outline future directions in the field.

Main Methods:

  • Categorization into computational-based, word embedding-based, and LLM-based methods.
  • Analysis of k-mer counting, position-specific scoring matrices (PSSM), Word2Vec, GloVe, and Transformer-based models (ESM3, RNAErnie).
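
As an illustration of the simplest method in this list, k-mer counting can be sketched as follows. This is a minimal example, not the implementation used in the reviewed work: the function name `kmer_counts`, the default DNA alphabet, and the lexicographic ordering of the vocabulary are all assumptions made for the illustration.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k=3, alphabet="ACGT"):
    """Return a fixed-length k-mer count vector for one sequence.

    Hypothetical helper for illustration; the alphabet and the ordering
    of the output vector are assumptions, not taken from the review.
    """
    seq = sequence.upper()
    # Slide a window of width k across the sequence, one position at a time.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate every possible k-mer in a fixed order so that vectors from
    # different sequences are directly comparable. Windows containing
    # symbols outside the alphabet (e.g. N) are simply not reported.
    vocabulary = ["".join(p) for p in product(alphabet, repeat=k)]
    return [counts.get(kmer, 0) for kmer in vocabulary]


# A sequence of length 8 has 7 overlapping 2-mers, spread over 16 slots.
vector = kmer_counts("ACGTACGT", k=2)
print(len(vector), sum(vector))
```

The vector length grows as the alphabet size raised to the power k, so larger k values give finer resolution at the cost of sparser, higher-dimensional features.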

Main Results:

  • Computational methods capture statistical/evolutionary patterns.
  • Word embedding methods capture contextual relationships.
  • LLM-based methods model long-range dependencies for superior accuracy in tasks like RNA structure prediction.
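
To make the first result more concrete, a position-specific scoring matrix (PSSM) is one way a computational method can capture evolutionary patterns: each column of an alignment is turned into log-odds scores against a background distribution. The sketch below is a simplified illustration under stated assumptions (gap-free alignment, uniform background, a fixed pseudocount); the names `build_pssm` and `score` are hypothetical, and production tools apply more elaborate sequence weighting.

```python
import math


def build_pssm(aligned, alphabet="ACGT", pseudocount=1.0):
    """Build a log-odds PSSM from equal-length, gap-free aligned sequences.

    Assumptions for illustration only: uniform background frequencies
    and a constant pseudocount added to every symbol count.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(alphabet)
    total = len(aligned) + pseudocount * len(alphabet)
    pssm = []
    for pos in range(length):
        column = [s[pos] for s in aligned]
        scores = {}
        for symbol in alphabet:
            # Smoothed frequency of this symbol at this alignment column.
            freq = (column.count(symbol) + pseudocount) / total
            # Log-odds: positive if enriched relative to background.
            scores[symbol] = math.log2(freq / background)
        pssm.append(scores)
    return pssm


def score(pssm, sequence):
    """Sum the per-position log-odds scores for a candidate sequence."""
    return sum(col[ch] for col, ch in zip(pssm, sequence))


pssm = build_pssm(["ACGT", "ACGA", "ACGT"])
# A sequence matching the conserved columns scores higher than one that does not.
print(score(pssm, "ACGT") > score(pssm, "TTTT"))
```

Columns where one symbol dominates produce large positive scores for that symbol, which is how conservation across related sequences ends up encoded in the representation.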

Conclusions:

  • Challenges include computational complexity and interpretability.
  • Future work focuses on multimodal data integration and explainable AI.
  • Advancements promise transformative applications in drug discovery, disease prediction, and genomics.