Detection and evaluation of clusters within sequential data.

Alexander Van Werde¹, Albert Senen-Cerda¹,², Gianluca Kosmella¹,³

  • ¹Department of Mathematics & Computer Science, TU/e, Eindhoven, The Netherlands.

Data Mining and Knowledge Discovery | August 18, 2025
Summary
This summary is machine-generated.

New clustering algorithms for sequential data, based on Block Markov Chains, successfully extract low-dimensional representations from real-world, high-dimensional datasets. These models reveal insights into complex processes like animal movement and DNA sequences.

Area of Science:

  • Data Science
  • Computational Biology
  • Bioinformatics

Background:

  • Sequential data is prevalent in various fields, presenting challenges due to high dimensionality, sparsity, and noise.
  • Extracting meaningful insights from complex sequential processes requires robust methods to handle data dependencies.

Purpose of the Study:

  • To evaluate novel clustering algorithms, derived from Block Markov Chains theory, on real-world sequential data.
  • To determine if these algorithms can effectively generate useful low-dimensional representations from sparse, high-dimensional sequences.
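
For orientation, a Block Markov Chain (BMC) is a Markov chain whose states are partitioned into a small number of clusters, with the probability of moving from one state to another depending only on the clusters of the two states. The Python sketch below simulates such a chain and shows how a sparse n × n matrix of observed transitions collapses to a K × K cluster-level matrix, which is one way to read "low-dimensional representation" here. The function names, cluster sizes, and transition probabilities are illustrative assumptions, not values from the study.

```python
import numpy as np

def simulate_bmc(p, labels, T, rng):
    """Simulate T steps of a Block Markov Chain (hypothetical helper).

    p[k, l] is the probability of moving from cluster k to cluster l;
    the next state is then drawn uniformly from the states in cluster l.
    """
    K = p.shape[0]
    members = [np.flatnonzero(labels == k) for k in range(K)]
    x = np.empty(T, dtype=int)
    x[0] = rng.integers(len(labels))
    for t in range(1, T):
        k = rng.choice(K, p=p[labels[x[t - 1]]])  # pick the next cluster
        x[t] = rng.choice(members[k])             # pick a state inside it
    return x

def transition_counts(x, n):
    """Count how often each state y directly follows each state x."""
    N = np.zeros((n, n), dtype=int)
    np.add.at(N, (x[:-1], x[1:]), 1)
    return N

# Assumed toy setup: 50 states in 2 clusters of sizes 30 and 20.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], [30, 20])
p = np.array([[0.8, 0.2],
              [0.3, 0.7]])
T = 5000

x = simulate_bmc(p, labels, T, rng)
N = transition_counts(x, len(labels))   # sparse, high-dimensional (50 x 50)

# Aggregate counts by cluster to get the low-dimensional (2 x 2) summary.
Z = np.eye(2, dtype=int)[labels]        # one-hot cluster membership, n x K
C = Z.T @ N @ Z                         # cluster-to-cluster transition counts
p_hat = C / C.sum(axis=1, keepdims=True)

print("fraction of zero entries in N:", np.mean(N == 0))
print("estimated cluster transition matrix:\n", p_hat)
```

In this toy setting the cluster labels are known in advance. With real data they are not, and recovering them from the observed sequence is the clustering problem.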

Main Methods:

  • Application of new clustering algorithms designed for sequential data.
  • Empirical study across diverse real-world datasets including animal movement (GPS), DNA sequences, text, and financial data.
  • Analysis of the extracted low-dimensional representations for their ability to encode sequential structure and reveal underlying process characteristics.
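
The summary does not spell out the clustering procedure, so the following Python sketch should not be read as the authors' algorithm. It swaps in a generic spectral approach that is a common starting point for this kind of problem: count transitions between states, embed each state using the leading singular vectors of the count matrix, then group the embedded states with k-means. The synthetic sequence, all function names, and the parameter values are assumptions made for illustration.

```python
import numpy as np

def transition_counts(x, n):
    """Count how often each state y directly follows each state x."""
    N = np.zeros((n, n))
    np.add.at(N, (x[:-1], x[1:]), 1)
    return N

def spectral_embedding(N, K):
    """Embed each state via the top-K singular vectors of the count matrix.

    Left vectors describe outgoing behaviour, right vectors incoming.
    """
    U, s, Vt = np.linalg.svd(N)
    return np.hstack([U[:, :K] * s[:K], Vt[:K].T * s[:K]])

def kmeans(X, K, iters=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, K):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):          # leave an empty cluster's center as is
                centers[k] = X[labels == k].mean(axis=0)
    return labels

# Assumed synthetic sequence: 20 states, 2 hidden clusters of 10 states each.
rng = np.random.default_rng(0)
K, n, T = 2, 20, 20000
truth = rng.permutation(np.repeat(np.arange(K), n // K))
p = np.array([[0.9, 0.1],
              [0.1, 0.9]])                   # cluster-to-cluster probabilities
x = np.empty(T, dtype=int)
x[0] = 0
for t in range(1, T):
    k = rng.choice(K, p=p[truth[x[t - 1]]])
    x[t] = rng.choice(np.flatnonzero(truth == k))

N = transition_counts(x, n)
labels = kmeans(spectral_embedding(N, K), K)

# Cluster ids are arbitrary, so compare against the truth up to relabelling.
accuracy = max(np.mean(labels == truth), np.mean(labels == 1 - truth))
print("fraction of states assigned to the correct cluster:", accuracy)
```

On real sequences the number of clusters K is not known and the count matrix is far sparser than in this toy example, which is part of what makes the empirical study non-trivial.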

Main Results:

  • The algorithms successfully extracted low-dimensional representations from diverse real-world sequential data.
  • These representations effectively captured the inherent sequential structure within the datasets.
  • The identified representations provided novel insights into the complex processes under study.

Conclusions:

  • The Block Markov Chain-based clustering algorithms are effective for extracting meaningful low-dimensional representations from complex, real-world sequential data.
  • This approach offers a promising method for gaining deeper understanding in fields dealing with sequential information.
  • The study validates the utility of these algorithms beyond synthetic data, demonstrating their applicability in practical scenarios.