Parallel clustering algorithm for large data sets with applications in bioinformatics.

Victor Olman¹, Fenglou Mao, Hongwei Wu

¹Department of Biochemistry and Molecular Biology, Computational System Biology Laboratory, Institute of Bioinformatics, University of Georgia, Athens, Georgia 30602, USA. olman@csbl.bmb.uga.edu

IEEE/ACM Transactions on Computational Biology and Bioinformatics

May 2, 2009

Summary

This summary is machine-generated.

A parallel algorithm identifies dense clusters in large bioinformatical datasets. It represents the data as a graph, builds a minimum spanning tree (MST) of that graph in parallel, and achieves a nearly 100x speedup on 1,000,000 data points using 150 CPUs compared to a single CPU.

Area of Science:

Bioinformatics
Computer Science
Data Science

Background:

Large bioinformatical datasets present significant computational challenges for cluster identification.
Existing methods are often time-consuming, necessitating efficient parallel algorithms.

Purpose of the Study:

To develop and evaluate a parallel algorithm for identifying dense clusters in large, noisy bioinformatical datasets.
To address the computational bottleneck in cluster identification through parallel processing.

Main Methods:

The algorithm represents data as a graph and identifies clusters as densely intraconnected subgraphs.
A minimum spanning tree (MST) representation is employed for cluster identification.
The MST is constructed in parallel in three steps: partitioning the graph, computing an MST for each subgraph, and merging the results.
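The partition, subgraph MST, and merge steps can be illustrated with a minimal sketch. This is not the CLUMP implementation: the function names (`kruskal`, `all_edges`, `partitioned_mst`), the use of Kruskal's algorithm, the Euclidean distance, and the round-robin partitioning are all assumptions made for illustration, and the subproblems run sequentially here where a real system would distribute them across CPUs. The sketch relies on a standard property: an edge that is left out of a subgraph's MST cannot belong to the MST of the full graph, so the final MST can be computed from the union of the subgraph MSTs alone.

```python
import itertools
import math
import random


def kruskal(nodes, edges):
    """Return the MST (or spanning forest) edges using Kruskal's algorithm."""
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((w, u, v))
    return tree


def all_edges(points, idx):
    """Complete graph on the given point indices, weighted by Euclidean distance."""
    return [(math.dist(points[u], points[v]), u, v)
            for u, v in itertools.combinations(idx, 2)]


def partitioned_mst(points, k):
    """MST via partition, per-subgraph MST, and merge (hypothetical helper)."""
    n = len(points)
    if k < 2:
        return kruskal(range(n), all_edges(points, range(n)))
    # Step 1: partition the vertices into k groups (round-robin, an assumption).
    groups = [list(range(i, n, k)) for i in range(k)]
    candidates = set()
    # Step 2: each pair of groups is an independent subproblem, so these
    # iterations could run on separate CPUs; every edge lies in some pair.
    for a, b in itertools.combinations(range(k), 2):
        idx = sorted(groups[a] + groups[b])
        candidates.update(kruskal(idx, all_edges(points, idx)))
    # Step 3: merge by running Kruskal on the much smaller candidate edge set.
    return kruskal(range(n), candidates)


random.seed(0)
pts = [(random.random(), random.random()) for _ in range(40)]
direct = kruskal(range(40), all_edges(pts, range(40)))
merged = partitioned_mst(pts, 4)
```

In this sketch the merge step sees only the edges that survived a subgraph MST, not all n(n-1)/2 pairs, which is what makes the final sequential pass cheap relative to the distributed work.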

Main Results:

The parallel algorithm, implemented as CLUMP software, achieved nearly 100x speedup on 1,000,000 data points using 150 CPUs compared to single-CPU performance.
This result demonstrates that the algorithm handles very large-scale data clustering problems efficiently.
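To put the reported figures in context, parallel efficiency (speedup divided by the number of CPUs) can be computed directly. The sketch below treats "nearly 100x" as exactly 100 for illustration, so the result is an approximate upper bound, not a measured value; `parallel_efficiency` is a hypothetical helper name.

```python
def parallel_efficiency(speedup, cpus):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cpus


# Figures from the results above; 100 stands in for "nearly 100x" (assumption).
efficiency = parallel_efficiency(100, 150)
print(f"{efficiency:.2f}")  # prints 0.67
```

Under that assumption, each of the 150 CPUs contributes roughly two thirds of what it would under perfect linear scaling.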

Conclusions:

The developed parallel algorithm offers a highly efficient solution for cluster identification in massive bioinformatical datasets.
The CLUMP software provides a scalable and effective tool for big data clustering.