Selection of representative protein data sets.

U Hobohm¹, M Scharf, R Schneider

¹European Molecular Biology Laboratory, Heidelberg, Germany.

Protein Science : a Publication of the Protein Society

March 1, 1992

Summary

This summary is machine-generated.

Researchers developed algorithms to create nonredundant protein datasets from the Protein Data Bank. This ensures diverse protein structures are represented for statistical analysis, aiding protein folding and structure studies.

Area of Science:

Structural Biology
Bioinformatics
Computational Biology

Background:

The Protein Data Bank (PDB) contains numerous 3D protein coordinate sets.
The PDB is highly redundant because many entries have identical or very similar sequences.
Statistical analyses require nonredundant datasets for accurate sequence-structure relation studies.

Purpose of the Study:

To develop algorithms for extracting representative, nonredundant protein chain sets from the PDB.
To maximize coverage of unique protein families while minimizing sequence redundancy.
To provide a valuable resource for statistical analyses in protein science.

Main Methods:

Developed two distinct data-reduction algorithms: one optimizes a chosen property of the selected chains, and the other maximizes the size of the selected set.
The first works by successively selecting a chain and excluding all of its neighbors; the second works by successively thinning out clusters of similar chains.
Applied both algorithms to the Protein Data Bank, with similarity defined by sequence identity thresholds.
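
The two strategies described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `select_by_priority` and `thin_clusters`, the neighbor dictionary, and the priority list are hypothetical, and the thinning rule used here (remove the chain with the most remaining neighbors first) is an assumption, since the summary only states that clusters are thinned. The sketch assumes a symmetric neighbor graph in which every chain appears as a key.

```python
from typing import Dict, Hashable, List, Set

# Hypothetical representation: chain id -> ids of chains judged "similar" to it.
Graph = Dict[Hashable, Set[Hashable]]


def select_by_priority(neighbors: Graph, priority: List[Hashable]) -> List[Hashable]:
    """Successive selection and exclusion.

    Walk the chains in priority order (best first, e.g. by some quality
    measure), keep each chain that has not been excluded yet, and exclude
    all neighbors of every chain that is kept.
    """
    excluded: Set[Hashable] = set()
    selected: List[Hashable] = []
    for chain in priority:
        if chain in excluded:
            continue
        selected.append(chain)
        excluded.update(neighbors.get(chain, set()))
    return selected


def thin_clusters(neighbors: Graph) -> List[Hashable]:
    """Successive thinning of clusters.

    Assumed rule: repeatedly remove the chain with the most remaining
    neighbors until no two remaining chains are neighbors.
    """
    remaining = {chain: set(ns) for chain, ns in neighbors.items()}
    while True:
        worst = max(
            remaining,
            key=lambda c: (len(remaining[c]), str(c)),  # ties broken by name
            default=None,
        )
        if worst is None or not remaining[worst]:
            break
        for other in remaining.pop(worst):
            remaining[other].discard(worst)
    return sorted(remaining, key=str)


if __name__ == "__main__":
    # Toy data: A is similar to B, C and D; B, C and D are mutually dissimilar.
    toy = {"A": {"B", "C", "D"}, "B": {"A"}, "C": {"A"}, "D": {"A"}}
    print(select_by_priority(toy, ["A", "B", "C", "D"]))
    print(thin_clusters(toy))
```

On this toy graph the first function keeps only the top-priority chain A, while the second drops A and keeps B, C and D, which shows the trade-off between favoring a preferred chain and maximizing set size.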

Main Results:

Successfully extracted nonredundant sets of protein chains, with the largest containing 155 chains.
Kept pairwise sequence similarity between selected chains below a cutoff (e.g., <30% identity over >80 residues).
Achieved representation of all structurally unique protein families within the selected sets.
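
The example cutoff quoted above can be turned into a simple check that a selected set is nonredundant. This is a sketch under stated assumptions: `too_similar`, `redundant_pairs`, and the `compare` callback (returning percent identity and aligned length for a pair of chains) are hypothetical names, and treating alignments of 80 residues or fewer as "not similar" is an assumption, because the summary does not say how shorter alignments are handled.

```python
from itertools import combinations
from typing import Callable, Hashable, List, Sequence, Tuple

# Hypothetical callback: (chain_a, chain_b) -> (percent identity, aligned length).
Compare = Callable[[Hashable, Hashable], Tuple[float, int]]


def too_similar(identity_pct: float, aligned_len: int,
                max_identity: float = 30.0, min_len: int = 80) -> bool:
    """True if a pair violates the example cutoff from the summary.

    A pair passes when identity is below 30% over more than 80 aligned
    residues. Shorter alignments are treated as passing (assumption).
    """
    return aligned_len > min_len and identity_pct >= max_identity


def redundant_pairs(chains: Sequence[Hashable],
                    compare: Compare) -> List[Tuple[Hashable, Hashable]]:
    """Return every pair of chains in the set that violates the cutoff."""
    return [
        (a, b)
        for a, b in combinations(chains, 2)
        if too_similar(*compare(a, b))
    ]


if __name__ == "__main__":
    # Made-up alignment results, for illustration only.
    table = {
        ("a", "b"): (25.0, 120),
        ("a", "c"): (45.0, 60),
        ("b", "c"): (35.0, 150),
    }
    print(redundant_pairs(["a", "b", "c"], lambda x, y: table[(x, y)]))
```

A set is nonredundant under this criterion when `redundant_pairs` returns an empty list.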

Conclusions:

The developed algorithms extract representative, nonredundant subsets from the redundant PDB.
These nonredundant sets are crucial for statistical approaches to protein folding and structure analysis.
Updated representative datasets are available by electronic mail.