vALId: validation of protein sequence quality based on multiple alignment data.

Laurent Bianchetti¹, Julie Dawn Thompson, Odile Lecompte

  • ¹Plate-Forme de Bioinformatique de Strasbourg, Laboratoire de Bioinformatique et Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire (CNRS/INSERM/ULP), Illkirch Cedex, France. Laurent.Bianchetti@igbmc.u-strasbg.fr

Journal of Bioinformatics and Computational Biology | August 4, 2005
Summary
This summary is machine-generated.

Protein sequence errors are common in public databases. We developed vALId, a software tool that detects these errors and proposes corrections, improving data quality for phylogenetic and structural analyses.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Molecular Biology

Background:

  • Thousands of protein sequences in public databases are predicted computationally (in silico) and often contain errors.
  • The lack of systematic quality verification for these predicted sequences hinders accurate downstream analyses such as phylogeny and structure/function studies.

Purpose of the Study:

  • To develop an automated quality control method for protein sequences.
  • To introduce vALId, an interactive web-based software for detecting and correcting sequence errors.

Main Methods:

  • vALId utilizes high-quality multiple alignments of complete protein sequences (MACS).
  • It identifies suspicious insertions, deletions (indels), and divergent segments.
  • Corrections are proposed using transcript and genome contig data.
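
The indel-detection idea above can be illustrated with a minimal Python sketch. It is not vALId's actual algorithm, which this summary does not describe in detail. It assumes a toy alignment stored as equal-length gapped strings, and the function name `flag_indels` and the thresholds `min_fraction` and `min_length` are hypothetical choices made for illustration only.

```python
from typing import Dict, List, Tuple


def flag_indels(
    alignment: Dict[str, str],
    target: str,
    min_fraction: float = 0.8,  # hypothetical threshold, not from the paper
    min_length: int = 3,        # hypothetical minimum segment length
) -> List[Tuple[str, int, int]]:
    """Flag candidate indels in `target` relative to the other aligned sequences.

    Returns (kind, start, end) tuples with 0-based, inclusive column indices.
    """
    others = [seq for name, seq in alignment.items() if name != target]
    labels: List[object] = []
    for col, residue in enumerate(alignment[target]):
        gap_fraction = sum(seq[col] == "-" for seq in others) / len(others)
        if residue != "-" and gap_fraction >= min_fraction:
            # Target has a residue where most other sequences have a gap.
            labels.append("insertion")
        elif residue == "-" and (1 - gap_fraction) >= min_fraction:
            # Target has a gap where most other sequences have a residue.
            labels.append("deletion")
        else:
            labels.append(None)

    # Merge runs of identically labelled columns into segments.
    segments: List[Tuple[str, int, int]] = []
    start = 0
    for col in range(1, len(labels) + 1):
        if col == len(labels) or labels[col] != labels[start]:
            if labels[start] is not None and col - start >= min_length:
                segments.append((labels[start], start, col - 1))
            start = col
    return segments


# Toy alignment (invented data): the query lacks two residues present in
# all references and carries four residues absent from all of them.
toy_alignment = {
    "query": "MKT--AAAAGLV",
    "ref1": "MKTLL----GLV",
    "ref2": "MKTLL----GLV",
    "ref3": "MKTIL----GLV",
}

print(flag_indels(toy_alignment, "query"))
print(flag_indels(toy_alignment, "query", min_length=2))
```

With the default `min_length` only the four-column insertion is reported; lowering it to 2 also reports the two-column deletion. A real tool would additionally need to weigh alignment quality and sequence similarity, which this sketch ignores.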

Main Results:

  • vALId demonstrated excellent sensitivity (0.96) and specificity (0.96) for indel detection.
  • Divergent segment detection showed variable performance (Sn 0.49, Sp 0.56) dependent on sequence similarity.
  • Analysis of 6195 sequences revealed that 44% of eukaryotic predicted proteins contained errors.
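
For readers unfamiliar with the Sn and Sp figures above, the following Python sketch shows how such values are commonly computed from confusion counts. It assumes the standard definitions, Sn = TP / (TP + FN) and Sp = TN / (TN + FP); the summary does not say which definition the authors used, and some sequence-analysis papers instead report Sp as TP / (TP + FP). The counts in the example are hypothetical and chosen only to reproduce a value of 0.96.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real errors that were detected: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of error-free cases left unflagged: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, not taken from the paper.
sn = sensitivity(true_pos=96, false_neg=4)
sp = specificity(true_neg=96, false_pos=4)
print(f"Sn = {sn:.2f}, Sp = {sp:.2f}")
```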

Conclusions:

  • vALId is an effective tool for identifying and correcting errors in protein sequence data.
  • Automated quality control is crucial for improving the reliability of public protein sequence databases.
  • The findings highlight how prevalent errors are in protein sequences predicted in silico.