Representation learning applications in biological sequence analysis.

Hitoshi Iuchi (1,2), Taro Matsutani (2,3), Keisuke Yamada (4)

  • (1) Waseda Research Institute for Science and Engineering, Waseda University, Tokyo 169-8555, Japan.

Computational and Structural Biotechnology Journal | June 18, 2021
Summary
This summary is machine-generated.

Natural language processing (NLP) techniques can convert biological sequences into numerical vectors for analysis. This review surveys representation learning methods for DNA, RNA, and protein sequences and describes how the resulting vectors support function and structure estimation.

Keywords:
BERT; Natural language processing; Representation learning; Sequence analysis; Word2vec

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • High-throughput sequencing generates vast biological data, posing analysis challenges.
  • Natural language processing (NLP) offers novel approaches for biological sequence analysis.
  • Biological sequences can be treated as text, with nucleotides or amino acids (or short runs of them) playing the role of words.
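
To make the "sequence as text" analogy concrete, here is a minimal sketch of one common way to cut a sequence into word-like tokens: overlapping k-mers. The function name `kmer_tokenize` and the defaults `k=3` and `stride=1` are illustrative assumptions, not choices taken from the review.

```python
def kmer_tokenize(sequence, k=3, stride=1):
    """Split a sequence into k-mers, which act as "words" for NLP-style models.

    k and stride are illustrative defaults, not values from the review.
    """
    sequence = sequence.upper()
    # Slide a window of width k along the sequence, moving `stride` positions each time.
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


# Overlapping 3-mers of a short DNA fragment.
print(kmer_tokenize("ATGCGTA", k=3))
# Non-overlapping 3-mers (codon-like reading) of the same fragment.
print(kmer_tokenize("ATGCGTA", k=3, stride=3))
```

With `stride=1` the tokens overlap, so every position contributes to several "words"; with `stride=k` the sequence is read in disjoint chunks. Which one suits a task is a modelling choice.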

Purpose of the Study:

  • To review existing knowledge on representation learning for biological sequence analysis.
  • To highlight the growing trend and importance of applying representation learning in biological research.
  • To provide a consolidated overview of current methods and applications.

Main Methods:

  • Literature review of representation learning techniques applied to biological sequences.
  • Analysis of how NLP concepts, such as word embedding, are adapted for DNA, RNA, and protein sequences.
  • Examination of the conversion of biological sequences into vector representations.
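
As a rough illustration of turning a sequence into a vector, the sketch below counts k-mers over a fixed DNA vocabulary (a bag-of-words representation). This is a deliberately simple, non-learned stand-in: it is not word2vec or BERT, and the function name `kmer_count_vector`, the alphabet, and `k=2` are assumptions made for the example.

```python
from itertools import product


def kmer_count_vector(sequence, k=2, alphabet="ACGT"):
    """Return (vocabulary, counts) for all k-mers over `alphabet`.

    A bag-of-k-mers vector: a simple, non-learned representation used here
    only to illustrate the sequence-to-vector step.
    """
    # Fixed vocabulary of every possible k-mer, in lexicographic order.
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(vocab)}
    counts = [0] * len(vocab)
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers with symbols outside the alphabet, e.g. "N"
            counts[j] += 1
    return vocab, counts


vocab, vec = kmer_count_vector("ACGTAC", k=2)
print(len(vec))  # dimensionality is len(alphabet) ** k
print({kmer: c for kmer, c in zip(vocab, vec) if c})  # only the k-mers that occur
```

Learned embeddings replace these sparse counts with dense vectors trained from sequence context, but the interface is the same: a sequence goes in, a fixed-length numeric vector comes out.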

Main Results:

  • Representation learning enables the transformation of biological sequences into numerical vectors.
  • Vectorized sequences serve as input for function and structure prediction models.
  • Various embedding techniques are applicable to biological sequence data.
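
The sketch below shows one way per-token embeddings might be pooled into a single sequence vector and then compared, a typical first step before feeding vectors to a downstream model. The embedding table `TOY_EMBEDDINGS` holds made-up two-dimensional values purely for illustration. In practice these would come from a trained model, and mean pooling is only one of several possible aggregation choices.

```python
import math

# Hypothetical 2-dimensional embeddings for a few 3-mers (made-up values).
TOY_EMBEDDINGS = {
    "ATG": [1.0, 0.0],
    "TGC": [0.0, 1.0],
    "GCA": [1.0, 1.0],
}


def mean_pool(tokens, table):
    """Average the embeddings of the tokens that appear in `table`."""
    vectors = [table[t] for t in tokens if t in table]
    if not vectors:
        raise ValueError("no token has an embedding")
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


seq_a = mean_pool(["ATG", "TGC"], TOY_EMBEDDINGS)
seq_b = mean_pool(["GCA"], TOY_EMBEDDINGS)
print(seq_a, seq_b, cosine_similarity(seq_a, seq_b))
```

The pooled vectors have the same length regardless of how many tokens each sequence contains, which is what lets them serve as input to a standard classifier or regressor.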

Conclusions:

  • Representation learning is a powerful approach for analyzing large-scale biological sequence data.
  • This methodology facilitates downstream tasks like function and structure estimation.
  • The integration of NLP and bioinformatics is crucial for advancing biological research.