Classifying disease outbreak reports using n-grams and semantic features.

Mike Conway¹, Son Doan, Ai Kawazoe

¹National Institute of Informatics, Tokyo, Japan. mike@nii.ac.jp

International Journal of Medical Informatics

May 19, 2009

Summary

This summary is machine-generated.

Feature selection combined with n-grams and semantic features significantly improves the accuracy of disease outbreak report classification. This approach enhances the BioCaster text mining system.

Area of Science:

Natural Language Processing
Computational Linguistics
Epidemiology

Background:

The BioCaster system aims to mine disease outbreak reports for epidemiological surveillance.
Classifying these reports accurately is crucial for timely public health response.
Existing methods may not fully leverage diverse textual features.

Purpose of the Study:

To evaluate the effectiveness of n-grams and semantic features for classifying disease outbreak reports.
To investigate the contribution of a general-purpose semantic tagger (USAS) in this classification task.
To compare different machine learning algorithms and feature selection techniques.

Main Methods:

Utilized the BioCaster corpus (1000 documents) for classification experiments.
Employed feature sets including named entity recognition, n-grams (unigrams, bigrams, trigrams), and USAS semantic tags.
Applied Naïve Bayes, Support Vector Machine, and C4.5 decision tree algorithms.
Performed feature selection using the chi-squared (χ²) statistic.
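
The n-gram extraction and chi-squared feature selection named above can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: it assumes a binary relevant/not-relevant labelling, whitespace tokenisation, and presence/absence features, and the toy documents, labels, and function names (`ngrams`, `chi2_scores`, `select_top_k`) are all hypothetical.

```python
from collections import Counter


def ngrams(tokens, n_max=3):
    """Contiguous n-grams for n = 1..n_max, joined with spaces."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def chi2_scores(docs, labels):
    """Chi-squared statistic of each n-gram's presence against a binary label."""
    n = len(docs)
    n_pos = sum(labels)
    feature_sets = [set(ngrams(d.lower().split())) for d in docs]
    # Document frequency of each feature, overall and within the positive class.
    df = Counter(f for fs in feature_sets for f in fs)
    df_pos = Counter(f for fs, y in zip(feature_sets, labels) if y for f in fs)
    scores = {}
    for f, present in df.items():
        a = df_pos[f]          # feature present, label positive
        b = present - a        # feature present, label negative
        c = n_pos - a          # feature absent, label positive
        d = n - n_pos - b      # feature absent, label negative
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[f] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring features (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f for f, _ in ranked[:k]]


# Toy data, invented for illustration (1 = outbreak report, 0 = other).
docs = [
    "cholera outbreak reported in the region",
    "new outbreak of avian influenza confirmed",
    "vaccine research funding announced today",
    "conference on influenza research held today",
]
labels = [1, 1, 0, 0]

scores = chi2_scores(docs, labels)
selected = select_top_k(scores, 3)
```

In this toy set, n-grams that appear in only one class score highest, while a term such as "influenza", which occurs once in each class, scores zero and would be discarded.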

Main Results:

A combination of unigrams, bigrams, trigrams, and semantic features with the Naïve Bayes algorithm and feature selection achieved the highest classification accuracy and F-score.
This performance improvement was statistically significant compared to baseline and prior work.
Feature selection was identified as the primary driver of improved performance, more so than semantic tagging.
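
To make the reported metrics concrete, here is a hedged sketch of a Naïve Bayes classifier over a selected n-gram vocabulary, evaluated by accuracy and F-score. It assumes a multinomial model with add-one smoothing and the balanced F1 variant of the F-score; the summary does not state which variants the study used. The class name, vocabulary, and toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n_max=3):
    """Contiguous n-grams for n = 1..n_max, joined with spaces."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over n-gram counts."""

    def __init__(self, vocab):
        self.vocab = set(vocab)  # features kept after feature selection

    def _features(self, doc):
        return [f for f in ngrams(doc.lower().split()) if f in self.vocab]

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        for doc, y in zip(docs, labels):
            self.feat_counts[y].update(self._features(doc))
        return self

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())
        feats = self._features(doc)

        def log_posterior(y):
            total = sum(self.feat_counts[y].values())
            lp = math.log(self.class_counts[y] / n_docs)
            for f in feats:
                lp += math.log((self.feat_counts[y][f] + 1)
                               / (total + len(self.vocab)))
            return lp

        # Ties fall back to the smallest class label.
        return max(sorted(self.class_counts), key=log_posterior)


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy plus balanced F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, f1


# Toy data, invented for illustration (1 = outbreak report, 0 = other).
train_docs = [
    "cholera outbreak reported in the region",
    "new outbreak of avian influenza confirmed",
    "vaccine research funding announced today",
    "conference on influenza research held today",
]
train_labels = [1, 1, 0, 0]
vocab = ["outbreak", "research", "today"]  # assumed output of feature selection

test_docs = [
    "outbreak of measles confirmed",
    "research grant awarded today",
    "measles cases rising",
    "influenza conference opens",
]
test_labels = [1, 0, 1, 0]

model = NaiveBayes(vocab).fit(train_docs, train_labels)
preds = [model.predict(d) for d in test_docs]
accuracy, f1 = accuracy_and_f1(test_labels, preds)
```

The third test document contains none of the selected features, so the toy model falls back to the class prior and misses it. This shows how accuracy and F-score can diverge when a positive document is missed.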

Conclusions:

The study demonstrates that integrating bag-of-words, n-grams, and semantic features, coupled with feature selection, significantly enhances disease outbreak report classification.
This optimized approach offers a statistically validated improvement over previous methods in the domain.
The findings provide valuable insights for developing more effective text mining systems for public health surveillance.