MAIL: mining sequential patterns with wildcards.

Fei Xie¹, Xindong Wu, Xuegang Hu

  • ¹College of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China. xiefei9815057@sina.com

International Journal of Data Mining and Bioinformatics | July 20, 2013
PubMed
Summary
This summary is machine-generated.

This study introduces the MAIL algorithm for efficient sequential pattern mining with wildcards and flexible gap constraints. MAIL significantly outperforms existing methods in discovering frequent patterns in biological sequences.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Data Mining

Background:

  • Sequential pattern mining is crucial for analyzing biological data.
  • Existing methods struggle with wildcard characters and flexible gap constraints.
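
To make the terms concrete: a pattern with wildcards allows a bounded number of "don't care" positions between consecutive pattern characters, and the gap constraint sets that bound. The following is a minimal sketch of counting such occurrences, not the MAIL algorithm itself. The function name `count_occurrences`, the parameters `min_gap` and `max_gap`, and the choice to count every distinct position tuple as one occurrence are assumptions made for illustration.

```python
from typing import List


def count_occurrences(seq: str, pattern: str, min_gap: int, max_gap: int) -> int:
    """Count the position tuples at which `pattern` occurs in `seq` when
    between `min_gap` and `max_gap` wildcard positions (inclusive) separate
    each pair of consecutive pattern characters."""
    if not pattern:
        return 0
    # ways[i] = number of occurrences of the pattern prefix handled so far
    # whose last character sits at position i of seq.
    ways: List[int] = [1 if ch == pattern[0] else 0 for ch in seq]
    for symbol in pattern[1:]:
        nxt = [0] * len(seq)
        for i, ch in enumerate(seq):
            if ch != symbol:
                continue
            # The previous character may sit at any j with
            # min_gap <= i - j - 1 <= max_gap.
            lo = max(i - max_gap - 1, 0)
            hi = i - min_gap - 1
            for j in range(lo, hi + 1):
                nxt[i] += ways[j]
        ways = nxt
    return sum(ways)


# "AC" with 0 to 2 wildcards between A and C, in the toy sequence "AACC"
print(count_occurrences("AACC", "AC", 0, 2))
```

In the toy sequence, each of the two A positions can pair with each of the two C positions within the allowed gap, so several occurrences overlap on the same characters. Tightening the gap to exactly one wildcard (`min_gap=1, max_gap=1`) leaves only the pairs that are two positions apart.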

Purpose of the Study:

  • To develop an efficient algorithm for mining frequent sequential patterns with wildcards and flexible gap constraints.
  • To enhance pattern discovery in biological sequences.

Main Methods:

  • Designed the MAIL algorithm incorporating two pattern growth strategies: candidate occurrence pruning and occurrence graph.
  • Developed a random data generator for completeness testing.
  • Applied the algorithm to DNA sequences.
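
The general idea of growing patterns one character at a time, while reusing the positions where the prefix already occurs, can be sketched as follows. This is a simplified illustration under stated assumptions, not the MAIL algorithm or its pruning strategies: the names `random_dna`, `grow`, and `mine`, the raw-count threshold `min_count`, and the `max_len` cap are all hypothetical, and the sketch extends every prefix that still has at least one occurrence rather than pruning.

```python
import random
from typing import Dict


def random_dna(length: int, seed: int = 0) -> str:
    """Hypothetical test-data helper: a reproducible random DNA string."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


def grow(seq: str, ends: Dict[int, int], symbol: str,
         min_gap: int, max_gap: int) -> Dict[int, int]:
    """Extend a pattern by one symbol without rescanning the sequence.

    `ends` maps an end position to the number of occurrences of the current
    pattern ending there; the result is the same map for pattern + symbol.
    """
    grown: Dict[int, int] = {}
    for j, count in ends.items():
        first = j + min_gap + 1
        last = min(j + max_gap + 1, len(seq) - 1)
        for i in range(first, last + 1):
            if seq[i] == symbol:
                grown[i] = grown.get(i, 0) + count
    return grown


def mine(seq: str, min_count: int, min_gap: int, max_gap: int,
         max_len: int = 4, alphabet: str = "ACGT") -> Dict[str, int]:
    """Return every pattern of length <= max_len with >= min_count occurrences."""
    frequent: Dict[str, int] = {}
    frontier = {a: {i: 1 for i, ch in enumerate(seq) if ch == a} for a in alphabet}
    length = 1
    while frontier:
        next_frontier: Dict[str, Dict[int, int]] = {}
        for pattern, ends in frontier.items():
            total = sum(ends.values())
            if total >= min_count:
                frequent[pattern] = total
            if length < max_len:
                for a in alphabet:
                    grown = grow(seq, ends, a, min_gap, max_gap)
                    if grown:  # keep only prefixes that still occur somewhere
                        next_frontier[pattern + a] = grown
        frontier = next_frontier
        length += 1
    return frequent


print(mine("AACC", min_count=2, min_gap=0, max_gap=2, max_len=2))
print(mine(random_dna(200, seed=1), min_count=50, min_gap=0, max_gap=2, max_len=3))
```

In the toy run on "AACC", the pattern "AC" has more occurrences than its prefix "A", which shows that occurrence counts under gap constraints need not shrink as a pattern grows. That is why this sketch does not discard a prefix merely for falling below the threshold; the cost is that the search is exhaustive up to `max_len` and therefore slow on long patterns.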

Main Results:

  • MAIL efficiently mines frequent patterns with user-specified gap constraints and wildcards.
  • In experiments, MAIL discovers four times more patterns than one peer algorithm.
  • MAIL runs six times faster on average than another peer algorithm.

Conclusions:

  • MAIL offers a significant advancement in sequential pattern mining for biological data.
  • The algorithm's efficiency and completeness make it suitable for analyzing large biological sequence datasets.
  • MAIL facilitates the discovery of novel patterns in DNA sequences.