Locating previously unknown patterns in data-mining results: a dual data- and knowledge-mining method.

Mir S Siadaty¹, William A Knaus

  • ¹Department of Public Health Sciences, University of Virginia School of Medicine, Box 800717, Charlottesville, Virginia 22908, USA. MirSiadaty@virginia.edu

BMC Medical Informatics and Decision Making | March 9, 2006
Summary
This summary is machine-generated.

This study introduces a dual-mining method to automatically identify novel and interesting patterns from large datasets. The approach effectively prunes uninteresting associations, significantly reducing manual analysis and expert dependence.

Area of Science:

  • Data Mining and Knowledge Discovery
  • Bioinformatics
  • Computational Biology

Background:

  • Data mining generates numerous patterns, many lacking practical utility.
  • Current methods for identifying interesting patterns are inefficient, creating bottlenecks in knowledge acquisition.
  • Automating the discovery of novel and significant patterns is crucial for effective data analysis.

Purpose of the Study:

  • To develop and evaluate a novel method for automatically acquiring knowledge from data mining results.
  • To enhance the efficiency of identifying interesting patterns by reducing reliance on human experts.
  • To address the bottleneck in knowledge acquisition caused by uninteresting patterns.

Main Methods:

  • The dual-mining method compares pattern strengths from a database and a knowledgebase.
  • A "surprise score" is calculated when pattern strengths differ, indicating novelty.
  • Statistical significance is assessed using p-values to filter out noise.
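
The steps listed under Main Methods can be illustrated with a minimal Python sketch. This summary does not give the formula for the surprise score, so the sketch assumes it is the absolute difference between two association strengths scaled to [0, 1]. The names `Pattern`, `surprise_score`, `prune`, the thresholds `alpha` and `min_surprise`, and the sample values are all hypothetical. The study itself reports an implementation in Perl and R scripts, so this is an illustration of the idea rather than the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    db_strength: float  # association strength in the database (assumed scaled to [0, 1])
    kb_strength: float  # association strength in the knowledgebase (same assumed scale)
    p_value: float      # significance of the database association


def surprise_score(pattern: Pattern) -> float:
    # Assumption: surprise is the absolute gap between the two strengths.
    return abs(pattern.db_strength - pattern.kb_strength)


def prune(patterns, alpha=0.05, min_surprise=0.5):
    # Keep patterns that are statistically significant AND surprising,
    # ordered from most to least surprising.
    kept = [
        p for p in patterns
        if p.p_value < alpha and surprise_score(p) >= min_surprise
    ]
    return sorted(kept, key=surprise_score, reverse=True)


# Made-up example values, for illustration only.
patterns = [
    Pattern("A-B", 0.9, 0.875, 0.001),    # strong in both sources: already known
    Pattern("C-D", 0.75, 0.125, 0.01),    # strong in data, weak in literature
    Pattern("E-F", 0.875, 0.0, 0.40),     # large gap, but not significant
    Pattern("G-H", 0.125, 0.125, 0.03),   # weak in both sources
]

for p in prune(patterns):
    print(p.name, surprise_score(p))
```

With these made-up inputs only the pattern `C-D` survives: it is significant in the data and its strength differs sharply from what the knowledgebase suggests. Patterns that agree across both sources, or that fail the significance filter, are pruned.
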

Main Results:

  • The dual-mining method was implemented using Perl and R scripts.
  • Applied to a patient database and biomedical literature, it analyzed 50,000 patterns.
  • The method successfully identified novel patterns by comparing association scores and computing surprise scores.

Conclusions:

  • The dual-mining method effectively eliminates over 90% of uninteresting patterns.
  • Surprise score-based pruning aligns with biomedical evidence, validating its accuracy.
  • This automated approach significantly reduces the need for human expert knowledge in pattern discovery.