The megaprior heuristic for discovering protein sequence patterns

T L Bailey1, M Gribskov

  • 1San Diego Supercomputer Center, San Diego, California 92186-9784, USA. tabailey@sdsc.edu

Proceedings. International Conference on Intelligent Systems for Molecular Biology | January 1, 1996
PubMed
Summary
This summary is machine-generated.

This study introduces the megaprior heuristic to solve the convex combination problem in protein sequence pattern discovery algorithms such as HMMs, MEME, and the Gibbs sampler. The heuristic strengthens the Dirichlet mixture prior used in the statistical model, which prevents the model from blending two or more distinct patterns into one.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Statistical Modeling

Background:

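  • Algorithms that discover patterns in groups of protein sequences, such as hidden Markov models (HMMs) for multiple sequence alignment and MEME and the Gibbs sampler for motif discovery, fit the parameters of a statistical model to a group of related sequences.
  • These methods sometimes produce incorrect models in which two or more distinct patterns have been combined; the resulting model is a convex combination (weighted average) of two or more different models.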
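
As a toy illustration of what a convex combination looks like in a single model column, the sketch below averages two letter distributions. The four-letter alphabet, the two example patterns, and the function name `convex_combination` are assumptions made for illustration; they do not come from the paper.

```python
def convex_combination(p, q, weight):
    """Return weight * p + (1 - weight) * q for two probability vectors."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return [weight * a + (1.0 - weight) * b for a, b in zip(p, q)]


# Hypothetical column distributions over a toy 4-letter alphabet.
pattern_a = [0.9, 0.1, 0.0, 0.0]  # column that strongly prefers letter 0
pattern_b = [0.0, 0.0, 0.1, 0.9]  # column that strongly prefers letter 3

# A model that has merged both patterns reports their weighted average,
# which matches neither underlying pattern.
blended = convex_combination(pattern_a, pattern_b, 0.5)
print(blended)  # approximately [0.45, 0.05, 0.05, 0.45]
```

The blended column is still a valid probability distribution, which is why a fitting procedure can settle on it even though it describes neither pattern.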

Purpose of the Study:

  • To address incorrect statistical models that arise when two or more patterns are combined into a single model (the convex combination problem).
  • To present a heuristic that keeps pattern discovery algorithms from producing such models.

Main Methods:

  • Introduction of the 'megaprior heuristic', which uses extremely low-variance Dirichlet mixture priors as part of the statistical model.
  • The heuristic increases the strength (that is, decreases the variance) of the prior in proportion to the size of the sequence dataset.
  • Mathematical analysis of the convex combination problem and the heuristic's mechanism.
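
The effect of scaling the prior with the dataset size can be sketched with the standard posterior-mean estimator for a Dirichlet mixture prior. This is a minimal sketch and not the authors' implementation: the two component means, the toy counts, the use of the column's total count as a stand-in for dataset size, and the factor `strength_per_count` are all hypothetical.

```python
import math


def log_beta(alpha):
    """Log of the multivariate Beta function B(alpha)."""
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))


def posterior_mean(counts, weights, components):
    """Posterior mean letter probabilities under a Dirichlet mixture prior."""
    # Posterior responsibility of each component, computed in log space.
    log_post = [
        math.log(w)
        + log_beta([a + c for a, c in zip(alpha, counts)])
        - log_beta(alpha)
        for w, alpha in zip(weights, components)
    ]
    top = max(log_post)
    unnorm = [math.exp(lp - top) for lp in log_post]
    total = sum(unnorm)
    resp = [u / total for u in unnorm]

    n = sum(counts)
    estimate = [0.0] * len(counts)
    for r, alpha in zip(resp, components):
        denom = n + sum(alpha)
        for i, (c, a) in enumerate(zip(counts, alpha)):
            estimate[i] += r * (c + a) / denom
    return estimate


def scale_prior(means, strength):
    """Dirichlet parameters with the given means and total strength."""
    return [[strength * m for m in mean] for mean in means]


# Hypothetical prior component means over a toy 4-letter alphabet.
means = [[0.85, 0.05, 0.05, 0.05], [0.05, 0.05, 0.05, 0.85]]
weights = [0.5, 0.5]

# Toy column counts that blend two patterns (60 vs. 40 observations).
counts = [60, 0, 0, 40]
n = sum(counts)

# Weak prior: the estimate follows the blended counts.
weak = posterior_mean(counts, weights, scale_prior(means, 1.0))

# Prior strength proportional to the amount of data (assumed factor).
strength_per_count = 10.0
strong = posterior_mean(counts, weights, scale_prior(means, strength_per_count * n))

print([round(x, 3) for x in weak])
print([round(x, 3) for x in strong])
```

With the weak prior, the estimate stays close to the blended counts. With the strength tied to the amount of data, the estimate moves toward the mean of the single component that best explains the counts.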

Main Results:

  • The megaprior heuristic effectively prevents the formation of erroneous convex combination models.
  • Each column in the final model strongly resembles the mean of a single component of the prior, regardless of the size of the dataset.
  • The authors show that the heuristic can effectively eliminate the convex combination problem in protein sequence pattern discovery.

Conclusions:

  • The megaprior heuristic offers a robust solution to a significant challenge in computational sequence analysis.
  • This method enhances the accuracy and reliability of motif and pattern discovery in protein sequences.
  • The approach helps keep distinct patterns separate when a sequence dataset contains more than one pattern.