CRD: Fast Co-clustering on Large Datasets Utilizing Sampling-Based Matrix Decomposition.

Feng Pan¹, Xiang Zhang, Wei Wang

¹Dept. of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, US.

Proceedings. International Conference on Data Engineering

August 24, 2012

Summary

This summary is machine-generated.

This study introduces CRD, a fast co-clustering framework for large datasets. CRD offers competitive accuracy with significantly reduced computational cost, overcoming memory limitations of previous methods.

Area of Science:

Data Mining and Machine Learning
Bioinformatics
Recommender Systems

Background:

Co-clustering algorithms simultaneously group rows and columns of data matrices.
Existing co-clustering methods scale poorly: their computational complexity is O(m × n) for a matrix with m rows and n columns, and they are constrained by available memory.
These limitations hinder the analysis of large-scale datasets common in text mining, microarray analysis, and recommendation systems.
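
To make the idea of grouping rows and columns at the same time concrete, the following is a minimal sketch, not code from the paper. It assumes the row and column cluster labels are already known and only summarizes each resulting block by its mean; the function name `block_means` and the toy matrix are hypothetical. Note that even this single summary pass visits all m × n entries, which is the cost the Background refers to.

```python
def block_means(matrix, row_labels, col_labels, k_rows, k_cols):
    """Average the entries of each (row cluster, column cluster) block."""
    sums = [[0.0] * k_cols for _ in range(k_rows)]
    counts = [[0] * k_cols for _ in range(k_rows)]
    for i, row in enumerate(matrix):        # m rows ...
        for j, value in enumerate(row):     # ... times n columns
            r, c = row_labels[i], col_labels[j]
            sums[r][c] += value
            counts[r][c] += 1
    # Empty blocks are reported as 0.0 to avoid dividing by zero.
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else 0.0 for c in range(k_cols)]
        for r in range(k_rows)
    ]


# Hypothetical 3 x 4 matrix with two row clusters and two column clusters.
matrix = [
    [5, 5, 0, 0],
    [5, 5, 0, 0],
    [0, 0, 3, 3],
]
row_labels = [0, 0, 1]
col_labels = [0, 0, 1, 1]
means = block_means(matrix, row_labels, col_labels, 2, 2)
print(means)  # [[5.0, 0.0], [0.0, 3.0]]
```

A co-clustering whose blocks are nearly constant, as here, is what such algorithms try to find.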

Purpose of the Study:

To propose a novel, efficient framework, CRD, for fast co-clustering of large datasets.
To address the computational and memory limitations of traditional co-clustering algorithms.
To enable effective co-clustering for datasets that cannot fit entirely into main memory.

Main Methods:

The CRD framework uses sampling-based matrix decomposition techniques.
Achieves execution time that is linear in the number of rows (m) and columns (n).
Designed to handle datasets that exceed available main memory capacity.
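
The summary does not spell out the decomposition itself, so the following is only an illustrative sketch of how row and column sampling can make the work linear in m and n; it should not be read as the CRD algorithm. It assumes uniform sampling and a tiny k-means with farthest-point initialization, and the names `sampled_cocluster`, `kmeans`, and `sq_dist` are hypothetical. Rows are clustered using only their entries in a few sampled columns, and columns using only their entries in a few sampled rows, so for fixed sample sizes about m × c + n × r entries are read instead of m × n.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=10):
    """Tiny Lloyd's k-means; returns one cluster label per point."""
    # Farthest-point initialization keeps this toy example deterministic.
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def sampled_cocluster(matrix, k_rows, k_cols, n_sample_rows, n_sample_cols, seed=0):
    """Cluster rows and columns using only sampled columns and rows."""
    rng = random.Random(seed)
    m, n = len(matrix), len(matrix[0])
    rows = sorted(rng.sample(range(m), n_sample_rows))
    cols = sorted(rng.sample(range(n), n_sample_cols))
    # Each row is described only by its entries in the sampled columns.
    row_feats = [[matrix[i][j] for j in cols] for i in range(m)]
    # Each column is described only by its entries in the sampled rows.
    col_feats = [[matrix[i][j] for i in rows] for j in range(n)]
    return kmeans(row_feats, k_rows), kmeans(col_feats, k_cols)


# Hypothetical 6 x 6 matrix with two clean diagonal blocks.
matrix = [[5 if j < 3 else 0 for j in range(6)] for _ in range(3)]
matrix += [[0 if j < 3 else 3 for j in range(6)] for _ in range(3)]
row_labels, col_labels = sampled_cocluster(matrix, 2, 2, n_sample_rows=2, n_sample_cols=2)
print(row_labels, col_labels)
```

On noisy real data the sample sizes and the sampling distribution matter a great deal; uniform sampling is used here purely for brevity.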

Main Results:

CRD demonstrates competitive accuracy compared to existing co-clustering algorithms.
Achieves significantly reduced computational cost, making it suitable for large datasets.
Successfully operates without requiring the entire data matrix to be loaded into memory.
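
One way a sampling-based method can avoid loading the full matrix is to stream it once and retain only the sampled rows and columns. The sketch below illustrates that idea under stated assumptions and is not taken from the paper: the matrix is assumed to be stored as CSV text, the sample indices are assumed to be chosen beforehand, and `stream_sample` is a hypothetical helper. Memory use then grows with m × c + r × n (c sampled columns, r sampled rows) rather than m × n.

```python
import csv
import io


def stream_sample(lines, sample_rows, sample_cols):
    """Read a matrix row by row, retaining only the sampled rows and columns."""
    sample_rows = set(sample_rows)
    kept_rows = {}   # full copies of the sampled rows only
    kept_cols = []   # for every row, just its entries in the sampled columns
    for i, record in enumerate(csv.reader(lines)):
        values = [float(x) for x in record]
        kept_cols.append([values[j] for j in sample_cols])
        if i in sample_rows:
            kept_rows[i] = values
    return kept_rows, kept_cols


# A hypothetical 4 x 5 matrix standing in for a file too large to load at once.
text = "1,2,3,4,5\n6,7,8,9,10\n11,12,13,14,15\n16,17,18,19,20\n"
kept_rows, kept_cols = stream_sample(
    io.StringIO(text), sample_rows=[1, 3], sample_cols=[0, 4]
)
print(kept_cols)  # [[1.0, 5.0], [6.0, 10.0], [11.0, 15.0], [16.0, 20.0]]
```

In practice `lines` would be an open file handle, so only one full row is held in memory at a time besides the retained samples.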

Conclusions:

CRD provides an efficient and scalable solution for co-clustering large datasets.
Overcomes memory and computational bottlenecks of previous co-clustering approaches.
Offers a practical method for uncovering hidden structures in massive data matrices across various domains.