



SPRoBERTa: protein embedding learning with local fragment modeling.

Lijun Wu (1), Chengcan Yin (2), Jinhua Zhu (3)

  • (1) Microsoft Research Asia, No. 5 Dan Ling Street, Haidian District, 100080, Beijing, China.

Briefings in Bioinformatics | September 22, 2022 | PubMed
Summary
This summary is machine-generated.

This study introduces SPRoBERTa, a new method for protein embedding learning. It improves understanding of protein structure and function by considering local protein sequence patterns, outperforming existing methods.

Keywords:
local fragment representation, protein pre-training, protein tokenizer


Area of Science:

  • Computational biology
  • Bioinformatics
  • Structural biology

Background:

  • Understanding protein structure and function is crucial for human biology.
  • Limited annotated protein data necessitates advanced computational methods.
  • Current self-supervised methods often overlook local semantic patterns in protein sequences.

Purpose of the Study:

  • To develop a novel pre-training approach, SPRoBERTa, for enhanced protein embedding learning.
  • To address limitations in representing protein sequences by incorporating local fragment patterns.
  • To create a versatile framework for various protein-related prediction tasks.
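
The prediction tasks named in this summary work at three granularities: amino acid-level, amino acid pair-level, and protein-level. The sketch below shows one plausible way to derive features for each level from the same embeddings. It assumes a model that emits one vector per sequence position. The function names, the concatenation used for pairs, and the mean pooling used for whole proteins are illustrative assumptions, not details taken from the paper.

```python
# Illustrative only: how one set of per-position embeddings could feed
# three task granularities. All names and pooling choices are assumptions.

def residue_features(embeddings):
    """Amino acid-level: keep one feature vector per position."""
    return [list(vec) for vec in embeddings]


def pair_features(embeddings):
    """Pair-level: concatenate the vectors of positions i and j."""
    return [[list(a) + list(b) for b in embeddings] for a in embeddings]


def protein_features(embeddings):
    """Protein-level: mean-pool all position vectors into one vector."""
    n = len(embeddings)
    dim = len(embeddings[0])
    return [sum(vec[k] for vec in embeddings) / n for k in range(dim)]


# Toy embeddings for a 3-position sequence with 2-dimensional vectors.
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(len(residue_features(toy)))   # one vector per position
print(len(pair_features(toy)[0]))   # one vector per (i, j) pair in row 0
print(protein_features(toy))        # a single pooled vector
```

A task-specific prediction head would then be trained on top of whichever feature layout matches the task.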

Main Methods:

  • Developed an unsupervised protein tokenizer to capture local fragment patterns.
  • Introduced a deep pre-training framework for learning protein embeddings.
  • Fine-tuned the pre-trained model for amino acid-level, amino acid pair-level, and protein-level prediction tasks.
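
To make the idea of an unsupervised tokenizer that captures local fragment patterns concrete, here is a minimal sketch. The summary does not say which algorithm the authors use, so a byte-pair-encoding style merge procedure stands in purely for illustration. The function names (`learn_merges`, `apply_merge`, `tokenize`) and the toy sequences are hypothetical.

```python
from collections import Counter

# Illustrative stand-in, NOT the paper's tokenizer: a BPE-style procedure
# that repeatedly merges the most frequent adjacent pair of tokens.


def apply_merge(tokens, pair):
    """Replace each adjacent occurrence of `pair` with one merged token."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def learn_merges(sequences, num_merges):
    """Learn up to `num_merges` merge rules from unlabeled sequences."""
    corpus = [list(seq) for seq in sequences]  # start from single residues
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for tokens in corpus:
            pair_counts.update(zip(tokens, tokens[1:]))
        if not pair_counts:
            break
        best, count = pair_counts.most_common(1)[0]
        if count < 2:  # stop when no pair repeats
            break
        merges.append(best)
        corpus = [apply_merge(tokens, best) for tokens in corpus]
    return merges


def tokenize(sequence, merges):
    """Segment a sequence by applying the learned merges in order."""
    tokens = list(sequence)
    for pair in merges:
        tokens = apply_merge(tokens, pair)
    return tokens


# Toy, made-up sequences sharing a recurring local fragment.
toy_sequences = ["MKTAYIAKQR", "MKTAYLLKQR", "GGMKTAYGG"]
merges = learn_merges(toy_sequences, num_merges=4)
print(tokenize("MKTAYIAKQR", merges))
```

Because the merges are learned from raw sequences alone, recurring fragments become single tokens without any annotation, which is the property the summary attributes to the protein tokenizer.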

Main Results:

  • SPRoBERTa achieved significant improvements across diverse protein prediction tasks.
  • The proposed method outperformed existing state-of-the-art approaches.
  • Ablation studies validated the effectiveness of the protein tokenizer and training framework.

Conclusions:

  • SPRoBERTa offers a powerful new approach for protein representation learning.
  • Incorporating local sequence semantics enhances protein embedding quality.
  • The method demonstrates broad applicability and superior performance in computational biology.