



A Pattern Dictionary Method for Anomaly Detection.

Elyas Sabeti (1), Sehong Oh (2), Peter X K Song (3)

  • (1) Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI 48109, USA.

Entropy (Basel, Switzerland) | August 26, 2022
Summary
This summary is machine-generated.

This study introduces a compression-based anomaly detection method for time series and sequence data that uses a pattern dictionary. The method learns patterns from a training sequence and scores a test sequence by its complexity, measured as the number of parsed phrases or the codelength, so that unusual patterns stand out as sequences that are costly to encode.

Keywords:
Lempel–Ziv algorithm, anomaly detection, atypicality, lossless compression, pattern dictionary



Area of Science:

  • Data Science
  • Machine Learning
  • Signal Processing

Background:

  • Anomaly detection is crucial for identifying unusual patterns in time series and sequence data.
  • Existing methods may struggle with complex patterns and require robust baselines for accurate detection.

Purpose of the Study:

  • To propose a compression-based anomaly detection method using a pattern dictionary.
  • To develop a robust system for identifying anomalous patterns in sequential data.
  • To establish a framework for constructing a baseline of health against which anomalous deviations can be detected.

Main Methods:

  • Utilizing a pattern dictionary to learn complex patterns in training data.
  • Employing sequence complexity measures (parsed phrases, codelength) as anomaly scores.
  • Combining the pattern dictionary with universal source coders for atypicality detection.
  • Deriving a non-asymptotic upper bound on the number of distinct phrases produced by the LZ78 parser, expressed in terms of the Lambert W function.
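
The phrase-count complexity measure listed above can be illustrated with a short sketch. It is not the authors' implementation. The LZ78 incremental parser is standard, but the dictionary construction (all training substrings up to a hypothetical `max_len`), the greedy longest-match parser, and the sign convention of the score are assumptions made for illustration. The codelength measure is not shown.

```python
def lz78_phrase_count(seq):
    """Count phrases in the LZ78 incremental parsing of a string."""
    seen = set()
    phrase = ""
    count = 0
    for symbol in seq:
        phrase += symbol
        if phrase not in seen:      # shortest prefix not seen before
            seen.add(phrase)
            count += 1
            phrase = ""
    if phrase:                      # trailing phrase that was already seen
        count += 1
    return count


def build_pattern_dictionary(train, max_len):
    """Assumed construction: every substring of `train` up to `max_len`."""
    return {
        train[i:i + k]
        for k in range(1, max_len + 1)
        for i in range(len(train) - k + 1)
    }


def dictionary_phrase_count(test, patterns, max_len):
    """Greedy longest-match parse of `test` against the trained patterns."""
    i = 0
    count = 0
    while i < len(test):
        step = 1                    # unseen symbol becomes its own phrase
        for k in range(min(max_len, len(test) - i), 0, -1):
            if test[i:i + k] in patterns:
                step = k
                break
        i += step
        count += 1
    return count


def atypicality_score(test, patterns, max_len):
    """Assumed convention: trained-dictionary complexity minus universal
    (LZ78) complexity, so a larger score means a more anomalous sequence."""
    return dictionary_phrase_count(test, patterns, max_len) - lz78_phrase_count(test)


if __name__ == "__main__":
    patterns = build_pattern_dictionary("abababab", max_len=4)
    for test in ("abababab", "aabbaabb"):
        print(test, atypicality_score(test, patterns, max_len=4))
```

In this toy run the test sequence that repeats the training pattern needs fewer dictionary phrases than the one that breaks it, so its score is lower. A real application would need a threshold chosen from scores on held-out normal data.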

Main Results:

  • The pattern dictionary method effectively detects anomalies by assessing sequence complexity.
  • Combining with universal source coders creates a powerful atypicality detector.
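  • A novel non-asymptotic upper bound on the number of distinct LZ78 phrases was derived; it determines the range of the anomaly score.
  • As a concrete application, the framework was used to construct a baseline of health against which anomalous deviations can be detected.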

Conclusions:

  • The proposed pattern dictionary method offers a powerful and flexible approach to anomaly detection in sequential data.
  • The method provides a quantitative anomaly score and can be enhanced with universal source coders.
  • The derived non-asymptotic bound determines the range that the anomaly score can take.