Language model-based self-training reduces labeled data requirements by 99% for biological sequence classification.

Jingwen Liu¹, Danmo Gao¹, Yan Yuan²

¹School of Computer Science and Artificial Intelligence, Hubei University of Technology, 28 Nanli Road, Hongshan District, Wuhan 430068, China.

Briefings in Functional Genomics

June 15, 2026

Summary

This summary is machine-generated.

This study introduces a novel framework integrating pre-trained language models (PLMs) with semi-supervised learning (SSL) for biological sequence function prediction. The method significantly enhances accuracy with minimal labeled data, outperforming traditional approaches.

Keywords:

biological sequence classification; pre-trained language models; self-training

Area of Science:

Computational Biology
Bioinformatics
Genomics

Background:

Predicting biological sequence function is crucial for understanding disease mechanisms and genetic variation.
Existing methods face challenges due to limited labeled data and complex sequence context modeling.
Previous research has explored semi-supervised learning (SSL) and pre-trained language models (PLMs) separately, overlooking their combined potential.

Purpose of the Study:

To develop an integrated framework combining PLMs and SSL for improved biological sequence function prediction.
To leverage PLMs for feature extraction and SSL for decision boundary refinement.
To demonstrate the framework's effectiveness in low-resource settings for tasks like DNA-binding protein (DBP) and non-coding RNA (ncRNA) detection.

Main Methods:

Utilized PLMs as feature extractors to capture sequence semantics from large unlabeled datasets.
Employed SSL with confidence-weighted pseudo-label selection to constrain the model's decision boundary.
Applied the integrated framework to DNA-binding protein (DBP) and non-coding RNA (ncRNA) prediction tasks.
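
The self-training step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each sequence has already been turned into a fixed-length embedding by some PLM (the toy 2-D vectors below stand in for those embeddings), and it swaps in a simple weighted nearest-centroid classifier for whatever model the study actually uses. The function names, the 0.9 confidence threshold, and the number of rounds are all hypothetical choices made for the example.

```python
import math


def fit_centroids(X, y, w):
    """Weighted class centroids; a stand-in for any probabilistic classifier."""
    sums, totals = {}, {}
    for x, label, weight in zip(X, y, w):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += weight * v
        totals[label] = totals.get(label, 0.0) + weight
    return {c: [v / totals[c] for v in acc] for c, acc in sums.items()}


def predict_proba(centroids, x):
    """Softmax over negative Euclidean distances to each class centroid."""
    scores = {c: -math.dist(x, mu) for c, mu in centroids.items()}
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}


def self_train(X_lab, y_lab, X_unl, threshold=0.9, rounds=5):
    """Iteratively add confident pseudo-labels, weighted by their confidence."""
    X, y, w = list(X_lab), list(y_lab), [1.0] * len(X_lab)
    pool = list(X_unl)
    for _ in range(rounds):
        centroids = fit_centroids(X, y, w)
        remaining = []
        for x in pool:
            proba = predict_proba(centroids, x)
            label, conf = max(proba.items(), key=lambda kv: kv[1])
            if conf >= threshold:
                # Accept the pseudo-label; its confidence becomes its weight.
                X.append(x)
                y.append(label)
                w.append(conf)
            else:
                remaining.append(x)
        if len(remaining) == len(pool):
            break  # nothing new was confident enough; stop early
        pool = remaining
    return fit_centroids(X, y, w), len(X) - len(X_lab)


# Toy "embeddings" (assumed to come from a PLM): two labeled points per class.
X_lab = [[0.0, 0.0], [0.5, -0.5], [6.0, 6.0], [5.5, 6.5]]
y_lab = [0, 0, 1, 1]
# Unlabeled pool, including one ambiguous point midway between the classes.
X_unl = [[0.2, 0.1], [-0.3, 0.4], [6.1, 5.8], [5.9, 6.3], [3.0, 3.0]]

model, added = self_train(X_lab, y_lab, X_unl)
print("pseudo-labeled examples added:", added)
print("class probabilities for [0.1, 0.1]:", predict_proba(model, [0.1, 0.1]))
```

The threshold is what keeps the ambiguous point out of the training set: only predictions the current model is confident about are promoted to pseudo-labels, and even those count for slightly less than true labels because they are weighted by their confidence.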

Main Results:

Achieved performance competitive with fully supervised methods while using far fewer labeled samples (as few as 1%).
Outperformed traditional SSL methods such as the transductive support vector machine (TSVM) through a language model-based self-training approach.
Successfully identified novel biomolecules, highlighting the framework's efficacy in low-resource scenarios.

Conclusions:

The proposed framework offers an efficient solution for biological sequence classification, particularly in data-scarce environments.
Integrating PLMs and SSL provides a powerful methodology for deciphering biological sequence function.
This approach lays a foundation for advancing computational biology and discovering new biomolecules.