EasyCluster: a fast and efficient gene-oriented clustering tool for large-scale transcriptome data.

Ernesto Picardi¹, Flavio Mignone, Graziano Pesole

¹Dipartimento di Biochimica e Biologia Molecolare "E. Quagliariello", Università degli Studi di Bari, 70126 Bari, Italy. e.picardi@biologia.uniba.it

BMC Bioinformatics

June 19, 2009

Summary

This summary is machine-generated.

EasyCluster software accurately clusters expressed sequence tags (ESTs) and full-length cDNAs to gene loci using a genome-based approach. This method improves gene structure inference and alternative splicing analysis, especially in newly sequenced genomes.

Area of Science:

Genomics
Bioinformatics
Computational Biology

Background:

Expressed Sequence Tags (ESTs) and full-length cDNAs are crucial for gene structure and alternative splicing discovery.
Challenges exist in clustering these sequences for newly sequenced genomes due to limited training sets.
Existing sequence similarity methods can lead to inaccurate clustering.

Purpose of the Study:

To develop a robust genome-based methodology for gene-oriented clustering of ESTs.
To improve the accuracy and efficiency of transcript clustering for downstream annotation.
To provide a reliable tool for analyzing gene structures and alternative splicing.

Main Methods:

Developed EasyCluster software implementing a genome-based clustering approach.
Utilized GMAP for rapid EST-to-genome mapping and splice site detection.
Refined clusters by grouping ESTs that share genomic coordinates and splice sites.
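
The refinement step above can be illustrated with a minimal sketch. This is not EasyCluster's actual code: the `Alignment` record, the function names, and the rule used here (two ESTs join the same cluster when they share at least one intron on the same chromosome and strand) are assumptions made for illustration. Handling of unspliced ESTs by coordinate overlap is left out, so they remain singletons in this sketch.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record for one spliced EST-to-genome alignment."""
    est_id: str
    chrom: str
    strand: str
    exons: tuple  # sorted (start, end) pairs, 1-based inclusive


def introns(aln):
    """Return the introns implied by consecutive exons as (start, end) pairs."""
    return [(left[1] + 1, right[0] - 1)
            for left, right in zip(aln.exons, aln.exons[1:])]


def cluster_by_shared_introns(alignments):
    """Group ESTs sharing at least one intron on the same chromosome and strand."""
    parent = {a.est_id: a.est_id for a in alignments}

    def find(x):
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Remember the first EST seen for each intron; later ESTs merge with it.
    first_seen = {}
    for aln in alignments:
        for intron in introns(aln):
            key = (aln.chrom, aln.strand, intron)
            if key in first_seen:
                union(first_seen[key], aln.est_id)
            else:
                first_seen[key] = aln.est_id

    clusters = defaultdict(set)
    for est_id in parent:
        clusters[find(est_id)].add(est_id)
    return sorted(sorted(members) for members in clusters.values())
```

Union-find is used so that clustering is transitive: if one EST shares an intron with a second, and the second shares a different intron with a third, all three land in one cluster even though the first and third have no intron in common.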

Main Results:

Validated EasyCluster's high accuracy using a manually curated human EST benchmark.
Demonstrated superior clustering performance compared to ASmodeler and BIPASS.
Generated the first gene-oriented clusters for Ricinus communis, enabling alternative splicing evaluation.

Conclusions:

EasyCluster offers a reliable and accurate method for gene-oriented EST clustering.
The software facilitates gene structure inference and alternative splicing analysis.
It is particularly valuable for organisms with newly sequenced genomes.