Active learning for clinical text classification: is it better than random sampling?

Rosa L Figueroa¹, Qing Zeng-Treitler, Long H Ngo

¹Departamento de Ingeniería Eléctrica, Facultad de Ingeniería, Universidad de Concepción, Concepción, Chile.

Journal of the American Medical Informatics Association : JAMIA

June 19, 2012

Summary

This summary is machine-generated.

Active learning algorithms can substantially reduce the amount of labeled training data needed for medical text classification. The distance-based and combined algorithms outperformed passive learning (random sampling), and how well each algorithm performed depended on the diversity and uncertainty of the dataset.

Area of Science:

Medical informatics
Machine learning
Natural Language Processing

Background:

Large labeled datasets are crucial for training effective medical text classification models.
Active learning strategies aim to reduce the annotation burden by intelligently selecting informative data points for labeling.
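
The selection step that separates active from passive learning can be sketched as follows. This is a minimal illustration, not the paper's DIST algorithm: it assumes a linear classifier with hypothetical weights w and bias b, treats the unlabeled items closest to the decision boundary as the most informative, and uses uniform random sampling as the passive baseline. All function names and the toy pool are made up for the example.

```python
import math
import random

def margin_distance(x, w, b):
    """Distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def select_active(pool, w, b, k):
    """Indices of the k unlabeled items closest to the decision boundary."""
    ranked = sorted(range(len(pool)),
                    key=lambda i: margin_distance(pool[i], w, b))
    return ranked[:k]

def select_passive(pool, k, seed=0):
    """Indices of k items drawn uniformly at random (passive baseline)."""
    rng = random.Random(seed)
    return rng.sample(range(len(pool)), k)

# Toy unlabeled pool of 2-D feature vectors (illustrative values only).
pool = [(0.1, 0.0), (2.0, 2.0), (-3.0, 1.0), (0.0, -0.2), (5.0, 5.0)]
w, b = (1.0, 1.0), 0.0  # hypothetical current classifier

chosen = select_active(pool, w, b, k=2)    # items to send for annotation
baseline = select_passive(pool, k=2)       # random picks for comparison
```

In a full loop, the chosen items would be labeled, added to the training set, and the classifier retrained before the next selection round.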

Purpose of the Study:

To evaluate the efficacy of active learning algorithms in reducing training set requirements for medical text classification.
To compare the performance of distance-based (DIST), diversity-based (DIV), and combined (CMB) active learning algorithms against passive learning.
To investigate the influence of dataset characteristics (diversity, uncertainty) on active learning algorithm performance.

Main Methods:

Three active learning algorithms (DIST, DIV, CMB) were applied to five medical text datasets.
Performance was assessed using classification accuracy and Area Under the ROC Curve (AUC) at varying sample sizes.
Dataset diversity and uncertainty were quantified using relative entropy and correlated with algorithm performance.
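
Relative entropy (Kullback-Leibler divergence) between two word distributions can be computed as in the sketch below. This summary does not say how the authors built the distributions, so the whitespace tokenization, the add-one smoothing, and the function names here are assumptions made only for illustration.

```python
import math
from collections import Counter

def word_distribution(tokens, vocabulary, alpha=1.0):
    """Smoothed word probabilities over a shared vocabulary.

    alpha is an additive smoothing constant (an assumption of this
    sketch) that keeps every probability strictly positive.
    """
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}

def relative_entropy(p, q):
    """D(p || q) in bits; p and q must share the same keys."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p if p[w] > 0)

# Two toy text snippets (illustrative only, not from the study's datasets).
doc_a = "chest pain radiating to left arm".split()
doc_b = "no chest pain reported today".split()
vocab = sorted(set(doc_a) | set(doc_b))

p = word_distribution(doc_a, vocab)
q = word_distribution(doc_b, vocab)
divergence = relative_entropy(p, q)
```

A divergence of zero means the two distributions are identical; larger values indicate that the texts differ more in word usage. Note that relative entropy is not symmetric, so D(p || q) and D(q || p) generally differ.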

Main Results:

The DIST and CMB active learning algorithms outperformed passive learning across multiple datasets.
DIST demonstrated superior performance over passive learning in all five datasets.
Significant correlations were observed between dataset diversity and DIV performance, and between dataset uncertainty and DIST performance.

Conclusions:

Active learning algorithms can achieve performance comparable to passive learning with substantially smaller training sets in medical text classification.
The DIV algorithm is more effective on diverse datasets, while the DIST algorithm performs better on datasets with lower uncertainty.