Finding Protein and Nucleotide Similarities with FASTA.

William R Pearson¹

¹University of Virginia School of Medicine, Charlottesville, Virginia.

Current Protocols in Bioinformatics

March 25, 2016

Summary

This summary is machine-generated.

The FASTA programs provide both rapid and optimal sequence similarity searches for protein and DNA data. Their empirical statistics improve alignment boundary accuracy and search sensitivity, which aids sequence characterization and analysis.

Keywords:

E()-value; alignment annotation; expectation; homology; scoring matrices; similarity

Area of Science:

Bioinformatics
Computational Biology
Genomics

Background:

Sequence similarity searching is crucial for understanding protein and DNA function.
Existing tools like BLAST offer rapid searching, but comprehensive options for optimal and specialized searches are also needed.
Accurate statistical significance estimation and flexible output formats are essential for integrating search results into analysis pipelines.

Purpose of the Study:

To introduce the FASTA programs as a comprehensive suite of tools for sequence similarity searching.
To highlight the advanced features of FASTA, including empirical statistical significance estimation and flexible output options.
To demonstrate the utility of FASTA for characterizing protein and DNA sequences through various comparison types.

Main Methods:

The FASTA programs encompass rapid similarity search tools (e.g., fasta36) and slower, optimal search programs (e.g., ssearch36).
An empirical strategy is employed for estimating statistical significance, accommodating diverse scoring matrices and gap penalties.
The software supports various database formats (including SQL) and offers strategies for integrating domain annotations and highlighting critical residues.
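
The idea of an empirical significance estimate can be sketched in a few lines. The summary says only that an empirical strategy is used, so everything below is an assumption made for illustration: raw scores from a search are normalized to z-scores, and the tail probability is taken from an extreme value distribution with mean 0 and standard deviation 1. The function names are hypothetical, and the sketch omits refinements a production program would need, such as correcting scores for library sequence length and excluding related sequences before normalizing.

```python
import math

# Constants for an extreme value (Gumbel) distribution scaled to
# mean 0 and standard deviation 1.
EULER_GAMMA = 0.5772156649015329
SCALE = math.pi / math.sqrt(6.0)  # about 1.2825


def z_scores(scores):
    """Normalize raw similarity scores to mean 0, sample sd 1.

    Simplifying assumption: all scores come from unrelated sequences
    and no length correction is applied.
    """
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)
    sd = math.sqrt(variance)
    return [(s - mean) / sd for s in scores]


def evd_tail_probability(z):
    """P(Z >= z) under the unit extreme value distribution.

    Equivalent to 1 - exp(-exp(-SCALE * z - EULER_GAMMA)); expm1 keeps
    precision when the probability is very small.
    """
    return -math.expm1(-math.exp(-SCALE * z - EULER_GAMMA))


def expectation(z, library_size):
    """Expected number of library sequences scoring >= z by chance."""
    return library_size * evd_tail_probability(z)
```

For example, `expectation(z, 500_000)` estimates how many of 500,000 unrelated library sequences would reach a given z-score by chance; the value falls rapidly as the z-score rises.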

Main Results:

FASTA provides a range of tools for rapid, optimal, local, and global similarity searches across protein and DNA sequences.
The empirical statistical method enhances alignment boundary accuracy and search sensitivity.
Output formats are compatible with existing pipelines, and the programs facilitate searching large datasets via representative subsets.
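
As an illustration of pipeline integration, the sketch below reads search results in the 12-column, tab-separated BLAST-style tabular layout and keeps hits below an E()-value threshold. It assumes the results were written in that layout (the FASTA programs are documented to produce it with `-m 8`). The `Hit` type, the function names, the sample rows, and the 0.001 threshold are all hypothetical choices made for the example.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class Hit(NamedTuple):
    """One row of 12-column BLAST-style tabular output."""
    query: str
    subject: str
    percent_identity: float
    alignment_length: int
    mismatches: int
    gap_opens: int
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    evalue: float
    bit_score: float


def parse_tabular(handle: TextIO) -> Iterator[Hit]:
    """Yield one Hit per data line, skipping blank and '#' comment lines."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        yield Hit(f[0], f[1], float(f[2]), int(f[3]), int(f[4]), int(f[5]),
                  int(f[6]), int(f[7]), int(f[8]), int(f[9]),
                  float(f[10]), float(f[11]))


def significant_hits(handle: TextIO, max_evalue: float = 1e-3) -> List[Hit]:
    """Return hits whose E()-value is at or below max_evalue."""
    return [hit for hit in parse_tabular(handle) if hit.evalue <= max_evalue]


# Made-up sample rows, for illustration only.
SAMPLE = (
    "# hypothetical search output\n"
    "query1\tsubjA\t45.2\t210\t100\t5\t1\t205\t3\t212\t1.2e-30\t120.5\n"
    "query1\tsubjB\t25.0\t80\t55\t5\t10\t85\t40\t119\t2.5\t28.1\n"
)

if __name__ == "__main__":
    for hit in significant_hits(io.StringIO(SAMPLE)):
        print(hit.subject, hit.evalue)
```

Returning named tuples rather than raw lists lets downstream code refer to fields such as `hit.evalue` by name instead of by column index.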

Conclusions:

The FASTA programs offer a powerful and flexible alternative for sequence similarity searching and characterization.
Their advanced statistical methods and integration capabilities enhance the analysis of biological sequence data.
FASTA supports diverse comparison types (protein:protein, protein:DNA, DNA:DNA) and database formats, making it a versatile bioinformatics resource.