Detection and evaluation of clusters within sequential data.

Alexander Van Werde¹, Albert Senen-Cerda¹,², Gianluca Kosmella¹,³

¹Department of Mathematics & Computer Science, TU/e, Eindhoven, The Netherlands.

Data Mining and Knowledge Discovery

August 18, 2025

Summary

This summary is machine-generated.

New clustering algorithms for sequential data, based on Block Markov Chains, successfully extract low-dimensional representations from real-world, high-dimensional datasets. These models reveal insights into complex processes like animal movement and DNA sequences.

Area of Science:

Data Science
Computational Biology
Bioinformatics

Background:

Sequential data arise in many fields and are often high-dimensional, sparse, and noisy, which makes them difficult to analyze.
Extracting meaningful insights from complex sequential processes requires robust methods to handle data dependencies.

Purpose of the Study:

To evaluate novel clustering algorithms, derived from Block Markov Chains theory, on real-world sequential data.
To determine if these algorithms can effectively generate useful low-dimensional representations from sparse, high-dimensional sequences.

Main Methods:

Application of new clustering algorithms designed for sequential data.
Empirical study across diverse real-world datasets including animal movement (GPS), DNA sequences, text, and financial data.
Analysis of the extracted low-dimensional representations for their ability to encode sequential structure and reveal underlying process characteristics.
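
The summary does not spell out the algorithms themselves, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simplified Block Markov Chain in which the next cluster is drawn from a cluster-level transition matrix and the next state is then chosen uniformly within that cluster. For the clustering step it swaps in a plain spectral embedding of the transition-count matrix followed by k-means. All function names and parameter values are hypothetical.

```python
import numpy as np


def simulate_bmc(labels, p, length, rng):
    """Simulate a path of a (simplified, assumed) Block Markov Chain.

    labels[x] is the cluster of state x; p[k, l] is the probability of
    jumping from cluster k to cluster l. Within the target cluster the
    next state is drawn uniformly.
    """
    labels = np.asarray(labels)
    n_clusters = p.shape[0]
    members = [np.flatnonzero(labels == k) for k in range(n_clusters)]
    path = np.empty(length, dtype=int)
    path[0] = rng.integers(len(labels))
    for t in range(1, length):
        k = labels[path[t - 1]]
        l = rng.choice(n_clusters, p=p[k])
        path[t] = rng.choice(members[l])
    return path


def transition_counts(path, n_states):
    """Count observed transitions x -> y along the path."""
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (path[:-1], path[1:]), 1)
    return counts


def spectral_embedding(counts, rank):
    """Low-dimensional representation of each state from a truncated SVD.

    Each state is described by its outgoing profile (left factor) and its
    incoming profile (right factor), both scaled by the singular values.
    """
    u, s, vt = np.linalg.svd(counts, full_matrices=False)
    return np.hstack([u[:, :rank] * s[:rank], vt[:rank].T * s[:rank]])


def farthest_point_init(points, k):
    """Deterministic k-means initialisation: greedily pick far-apart points."""
    centers = [points[0]]
    for _ in range(k - 1):
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    return np.array(centers)


def kmeans(points, k, n_iter=20):
    """Plain Lloyd iterations; returns a cluster index for every point."""
    centers = farthest_point_init(points, k)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = points[assign == j].mean(axis=0)
    return assign


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_labels = np.repeat([0, 1], 10)          # 20 states, 2 clusters
    p = np.array([[0.9, 0.1], [0.1, 0.9]])       # cluster-level transitions
    path = simulate_bmc(true_labels, p, 20000, rng)
    embedding = spectral_embedding(transition_counts(path, 20), rank=2)
    print(kmeans(embedding, k=2))
```

Cluster indices recovered this way are only meaningful up to a relabelling, so any comparison with known labels should allow for permutations.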

Main Results:

The algorithms successfully extracted low-dimensional representations from diverse real-world sequential data.
These representations effectively captured the inherent sequential structure within the datasets.
The identified representations provided novel insights into the complex processes under study.

Conclusions:

The Block Markov Chain-based clustering algorithms are effective for extracting meaningful low-dimensional representations from complex, real-world sequential data.
The approach is a promising way to gain insight in fields that work with sequential data.
The study validates the utility of these algorithms beyond synthetic data, demonstrating their applicability in practical scenarios.