Sample Size Considerations for Fine-Tuning Large Language Models for Named Entity Recognition Tasks: Methodological

Zoltan P Majdik¹, S Scott Graham², Jade C Shiva Edward²

¹Department of Communication, North Dakota State University, Fargo, ND, United States.

June 14, 2024

Summary

This summary is machine-generated.

Modest sample sizes are sufficient to fine-tune large language models (LLMs) for biomedical named entity recognition (NER). The entity density of the training data is a key factor, and data quality may outweigh data volume for optimal performance.

Keywords:

annotation; conflict of interest; disclosure; disclosures; expert annotation; fine-tuning; language model; large language models; machine learning; named-entity recognition; natural language processing; sample; sample size; statement; statements; transfer learning

Area of Science:

Health Informatics
Natural Language Processing
Biomedical Data Science

Background:

Large language models (LLMs) offer significant potential for health informatics applications.
However, there is a lack of practical data regarding sample size requirements for fine-tuning LLMs in biomedical and health policy contexts.

Purpose of the Study:

To evaluate sample size and selection techniques for fine-tuning LLMs.
To improve named entity recognition (NER) for conflict of interest disclosure statements.

Main Methods:

Annotated 490 conflict of interest disclosure statements to identify "PERSON" and "ORG" entities.
Drew 2500 stratified random samples of varying sizes for fine-tuning.
Trained Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT) models using these samples.
Assessed the impact of sample size (sentences) and entity density (entities per sentence [EPS]) on NER performance (F1-score) using multiple regression.
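
The sampling and regression steps listed under Main Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the toy corpus, the function names, and the synthetic F1 values are all hypothetical, simple random sampling stands in for the stratified sampling used in the study, and the regression is a plain ordinary least squares fit with sample size and EPS as the two predictors.

```python
import random

# Hypothetical toy corpus: (sentence, list of (entity text, label)) pairs.
corpus = [
    ("Dr. Smith received fees from Acme.", [("Dr. Smith", "PERSON"), ("Acme", "ORG")]),
    ("No conflicts declared.", []),
    ("Jones consults for Beta Corp.", [("Jones", "PERSON"), ("Beta Corp", "ORG")]),
    ("Funding came from Gamma Inc.", [("Gamma Inc", "ORG")]),
]

def entities_per_sentence(sample):
    """EPS: total annotated entities divided by the number of sentences."""
    return sum(len(entities) for _, entities in sample) / len(sample)

def draw_sample(sentences, size, rng):
    """Simple random sample without replacement (assumption: no stratification)."""
    return rng.sample(sentences, size)

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X)b = X'y.

    Each row of `rows` must start with 1.0 for the intercept.
    Solved with Gauss-Jordan elimination and partial pivoting.
    """
    k = len(rows[0])
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot_row = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot_row] = a[pivot_row], a[col]
        pivot = a[col][col]
        a[col] = [v / pivot for v in a[col]]
        for r in range(k):
            if r != col:
                factor = a[r][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][k] for i in range(k)]

# Synthetic (sample size, EPS) pairs with F1 generated from a known linear rule,
# so the fit can be checked; these are NOT results from the study.
design = [(100, 1.0), (200, 1.5), (300, 1.2), (400, 1.8), (500, 1.4), (250, 1.1)]
f1_scores = [0.5 + 0.0004 * size + 0.1 * eps for size, eps in design]
coef = ols([[1.0, float(size), eps] for size, eps in design], f1_scores)

sample = draw_sample(corpus, 3, random.Random(0))
print("EPS of full toy corpus:", entities_per_sentence(corpus))
print("EPS of drawn sample:", entities_per_sentence(sample))
print("intercept, size slope, EPS slope:", coef)
```

In a real replication, each row of `design` would come from one fine-tuning run, with the F1-score measured on held-out data rather than generated from a formula.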

Main Results:

Fine-tuned models achieved high NER performance (F1-score: 0.79–0.96).
Both sample size and EPS were significant predictors of model performance (P<.001).
Diminishing marginal returns appeared for both sample size (at 439–527 sentences) and EPS (at 1.36–1.38).
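
For readers unfamiliar with the F1-score reported above, the following Python sketch shows one common way to compute it for NER. It assumes exact-match, entity-level scoring, which the summary does not confirm, and the gold and predicted annotations are hypothetical.

```python
def entity_f1(gold, predicted):
    """Entity-level F1 under exact matching (an assumption, not the study's stated protocol).

    Each entity is a (sentence index, text, label) tuple; a prediction counts
    as correct only if all three fields match a gold entity.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)

# Hypothetical annotations using the two labels named in the study.
gold = [
    (0, "Dr. Smith", "PERSON"),
    (0, "Acme", "ORG"),
    (2, "Jones", "PERSON"),
    (2, "Beta Corp", "ORG"),
]
predicted = [
    (0, "Dr. Smith", "PERSON"),
    (0, "Acme", "ORG"),
    (2, "Jones", "ORG"),  # wrong label, so it counts as an error
]

score = entity_f1(gold, predicted)
print(round(score, 4))  # precision 2/3, recall 2/4, F1 = 4/7
```

Here two of three predictions are correct and two of four gold entities are found, so the score is the harmonic mean of 2/3 and 1/2.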

Conclusions:

Effective fine-tuning of LLMs for biomedical NER is achievable with modest sample sizes.
Training data entity density should align with production data.
Training data quality and the fit between a model's architecture and its intended use are critical factors, potentially more so than data volume or model size.