Data augmentation algorithms for detecting conserved domains in protein sequences: a comparative study.

Chengpeng Bi¹

  • ¹Bioinformatics and Intelligent Computing Lab, Children's Mercy Hospitals and Clinics, Schools of Medicine, Computing and Engineering, University of Missouri, Kansas City, Missouri 64108, USA.

Journal of Proteome Research | December 18, 2007
Summary
This summary is machine-generated.

This study introduces a data augmentation (DA) framework for protein motif discovery. The framework unifies several algorithms for finding conserved protein domains, which in turn supports the analysis of molecular function.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Structural Biology

Background:

  • Conserved protein domains are crucial for molecular function, but their positions within protein sequences are often not directly observed.
  • Motif discovery methods are essential for identifying these hidden domains within protein sequences.

Purpose of the Study:

  • To present a unified data augmentation (DA) framework for motif discovery.
  • To enhance the detection of unobserved conserved protein domains.

Main Methods:

  • Developed a data augmentation (DA) framework to unify motif-finding algorithms.
  • Maximized the likelihood function by imputing the unobserved data.
  • Illustrated deterministic and stochastic maximum likelihood-based algorithms under the DA framework.
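
To make the deterministic strategy above more concrete, here is a minimal Python sketch of expectation maximization for motif finding. It is a generic textbook-style illustration, not the paper's implementation. It assumes exactly one motif occurrence per sequence, treats the unknown start positions as the unobserved data, and uses a simple composition-based background model. The names `em_motif`, `ALPHABET`, `IDX`, and all parameter defaults are hypothetical.

```python
import math
import random

# 20 standard amino acids; any other symbol raises KeyError (assumption).
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
IDX = {a: i for i, a in enumerate(ALPHABET)}


def em_motif(seqs, width, iters=50, pseudo=0.1, seed=0):
    """Toy EM motif finder: one motif of fixed width per sequence."""
    rng = random.Random(seed)
    k = len(ALPHABET)

    # Background model: overall residue composition, smoothed by pseudocounts.
    bg = [pseudo] * k
    for s in seqs:
        for ch in s:
            bg[IDX[ch]] += 1
    total = sum(bg)
    bg = [c / total for c in bg]

    # Initial motif model from one randomly chosen window per sequence.
    counts = [[pseudo] * k for _ in range(width)]
    for s in seqs:
        start = rng.randrange(len(s) - width + 1)
        for j in range(width):
            counts[j][IDX[s[start + j]]] += 1
    pwm = [[c / sum(col) for c in col] for col in counts]

    posteriors = []
    for _ in range(iters):
        counts = [[pseudo] * k for _ in range(width)]
        posteriors = []
        for s in seqs:
            # E-step: posterior over the unobserved start position,
            # computed in log space for numerical stability.
            logw = []
            for p in range(len(s) - width + 1):
                logw.append(sum(
                    math.log(pwm[j][IDX[s[p + j]]] / bg[IDX[s[p + j]]])
                    for j in range(width)))
            top = max(logw)
            w = [math.exp(x - top) for x in logw]
            z = sum(w)
            post = [x / z for x in w]
            posteriors.append(post)
            # Accumulate expected residue counts (the imputed data).
            for p, pr in enumerate(post):
                for j in range(width):
                    counts[j][IDX[s[p + j]]] += pr
        # M-step: re-estimate the motif model from the expected counts.
        pwm = [[c / sum(col) for c in col] for col in counts]

    # Report the most probable start position in each sequence.
    starts = [max(range(len(post)), key=post.__getitem__)
              for post in posteriors]
    return pwm, starts
```

Calling `em_motif(["MKTAYWWCHDLLE", "GGWWCHDPQRS", "LNWWCHDKKV"], width=5)` returns a position weight matrix and one predicted start per sequence. Because EM converges to a local optimum, the result depends on the random initialization, so in practice several restarts would likely be needed.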

Main Results:

  • Described and evaluated four DA motif discovery algorithms.
  • Compared algorithm performance on real and simulated protein sequences.
  • Demonstrated the effectiveness of DA in unifying and improving motif detection.

Conclusions:

  • The data augmentation framework provides a unified approach to motif discovery.
  • DA enhances the identification of conserved protein domains, aiding functional analysis.
  • The framework supports both deterministic and stochastic motif-finding strategies.
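
The stochastic strategy mentioned above can be sketched in a similar spirit. The following Python example is a simplified Gibbs-style site sampler that draws motif start positions at random and keeps track of the best-scoring configuration seen so far. It is an illustrative assumption, not the paper's algorithm: it assumes one motif occurrence per sequence, omits a background model, and the names `profile`, `score`, and `sample_motif` are hypothetical.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids (assumption)


def profile(seqs, starts, width, skip=None, pseudo=0.1):
    """Column-wise residue frequencies of the current motif windows."""
    cols = [{a: pseudo for a in ALPHABET} for _ in range(width)]
    for i, (s, st) in enumerate(zip(seqs, starts)):
        if i == skip:
            continue  # leave out the sequence being resampled
        for j in range(width):
            cols[j][s[st + j]] += 1
    return [{a: c / sum(col.values()) for a, c in col.items()}
            for col in cols]


def score(seqs, starts, width):
    """Log-likelihood of the chosen windows under their own profile."""
    prof = profile(seqs, starts, width)
    return sum(math.log(prof[j][s[st + j]])
               for s, st in zip(seqs, starts) for j in range(width))


def sample_motif(seqs, width, sweeps=100, seed=0):
    """Draw start positions by sampling; return the best set found."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    best, best_score = list(starts), score(seqs, starts, width)
    for _ in range(sweeps):
        for i, s in enumerate(seqs):
            # Impute the start in sequence i given all other sequences.
            prof = profile(seqs, starts, width, skip=i)
            weights = [math.prod(prof[j][s[p + j]] for j in range(width))
                       for p in range(len(s) - width + 1)]
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
        cur = score(seqs, starts, width)
        if cur > best_score:  # remember the best configuration so far
            best, best_score = list(starts), cur
    return best, best_score
```

For example, `sample_motif(["MKTAYWWCHDLLE", "GGWWCHDPQRS", "LNWWCHDKKV"], width=5, sweeps=20, seed=1)` returns a list of start positions together with their log-likelihood score. Unlike a deterministic update, the sampler can move away from a poor configuration, which is why the best state is recorded separately from the current one.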