Accelerating approximate subsequence search on large protein sequence databases.

Jiong Yang¹, Wei Wang, Yi Xia

¹T. J. Watson Research, IBM, USA. jiyang@us.ibm.com

Proceedings. IEEE Computer Society Bioinformatics Conference

April 20, 2005

Summary

This summary is machine-generated.

This study introduces the BASS-tree, an indexing structure for faster approximate subsequence search over large protein sequence databases. In the reported experiments, it outperforms both BLAST and suffix trees by about an order of magnitude.

Area of Science:

Bioinformatics
Computational Biology
Genomics

Background:

The volume of biological sequence data is rapidly increasing.
Current sequence retrieval tools like BLAST are computationally intensive for large databases.
Existing indexing methods, such as suffix trees, face memory limitations with large protein sequence datasets.
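
To make the cost concern concrete, here is a minimal sketch of unindexed approximate search: a sliding window compared under Hamming distance. It is not BLAST and not the method from the paper; the function names, the toy sequences, and the choice of Hamming distance as the similarity measure are all assumptions made for illustration. The point is only that, without an index, every window of every sequence is visited, so the work grows linearly with database size.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def linear_scan(
    database: List[str], query: str, max_mismatches: int
) -> List[Tuple[int, int, int]]:
    """Return (sequence id, start, mismatches) for every window of the
    database within max_mismatches of the query.

    Illustrative baseline only: every window is compared against the
    query, so the cost is proportional to the total database length.
    """
    m = len(query)
    hits = []
    for seq_id, seq in enumerate(database):
        for start in range(len(seq) - m + 1):
            d = hamming(seq[start:start + m], query)
            if d <= max_mismatches:
                hits.append((seq_id, start, d))
    return hits


if __name__ == "__main__":
    # Hypothetical toy database of two short protein fragments.
    db = ["MKTAYIAKQR", "GGGMKTAYLAKQ"]
    for hit in linear_scan(db, "MKTAYIAK", max_mismatches=1):
        print(hit)
```

An index is meant to avoid exactly this exhaustive pass by restricting comparisons to a small set of candidate positions.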

Purpose of the Study:

To develop an efficient indexing structure for large protein sequence databases.
To enable sublinear time approximate sequence matching.
To overcome the limitations of existing methods like BLAST and suffix trees.

Main Methods:

Implementation of the BASS-tree indexing structure for protein sequences.
Development of the sequence approximate match (SAM) index method.
Experimental evaluation of the SAM index method against BLAST and suffix trees.

Main Results:

The BASS-tree based SAM index method achieves sublinear time complexity for approximate matching.
Experimental results show an order of magnitude performance improvement over BLAST and suffix trees.
The proposed method effectively directs searches to relevant database portions for faster matching.
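
The idea of directing a search to relevant portions of the database can be illustrated with a much simpler, well-known technique: a q-gram inverted index combined with a pigeonhole filter. This is a swapped-in stand-in, not the BASS-tree or the SAM index, whose internals this summary does not describe. The names `build_qgram_index` and `indexed_search`, the parameter `q`, and the use of Hamming distance are assumptions for illustration. The filter relies on the fact that if a query is split into at least k+1 non-overlapping seeds and a window matches with at most k mismatches, then at least one seed must match exactly.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def build_qgram_index(database: List[str], q: int) -> Dict[str, List[Tuple[int, int]]]:
    """Map every length-q substring to the (sequence id, position) pairs where it occurs."""
    index: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for seq_id, seq in enumerate(database):
        for pos in range(len(seq) - q + 1):
            index[seq[pos:pos + q]].append((seq_id, pos))
    return index


def indexed_search(
    database: List[str],
    index: Dict[str, List[Tuple[int, int]]],
    q: int,
    query: str,
    max_mismatches: int,
) -> List[Tuple[int, int, int]]:
    """Seed-and-verify search: only windows sharing an exact seed with the query are checked."""
    m = len(query)
    # Pigeonhole condition: k+1 non-overlapping seeds are needed so that
    # at least one of them is free of mismatches.
    if m // q < max_mismatches + 1:
        raise ValueError("query too short for this q and mismatch budget")

    candidates: Set[Tuple[int, int]] = set()
    for offset in range(0, m - q + 1, q):  # non-overlapping seeds
        seed = query[offset:offset + q]
        for seq_id, pos in index.get(seed, ()):
            start = pos - offset  # where the full query window would begin
            if 0 <= start <= len(database[seq_id]) - m:
                candidates.add((seq_id, start))

    hits = []
    for seq_id, start in sorted(candidates):
        window = database[seq_id][start:start + m]
        d = sum(a != b for a, b in zip(window, query))
        if d <= max_mismatches:
            hits.append((seq_id, start, d))
    return hits


if __name__ == "__main__":
    # Hypothetical toy database of two short protein fragments.
    db = ["MKTAYIAKQR", "GGGMKTAYLAKQ"]
    idx = build_qgram_index(db, q=4)
    for hit in indexed_search(db, idx, 4, "MKTAYIAK", max_mismatches=1):
        print(hit)
```

In this sketch the index is built once, and each query touches only the positions listed under its seeds rather than every window. Whether that yields sublinear behavior in practice depends on how selective the seeds are, which is a property of the data and of `q`, not something this toy example demonstrates.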

Conclusions:

The BASS-tree offers a scalable and efficient solution for indexing large protein sequence databases.
The SAM index method significantly enhances the speed of approximate sequence matching.
This approach provides a valuable tool for bioinformatics research dealing with massive sequence data.