Automatically Extracting Numerical Results from Randomized Controlled Trials with Large Language Models.

Hye Sun Yun¹, David Pogrebitskiy¹, Iain J Marshall²

¹Northeastern University, Boston, MA, USA.

Proceedings of Machine Learning Research

September 22, 2025

Summary

This summary is machine-generated.

Large language models (LLMs) show promise for automating meta-analyses by extracting data from randomized controlled trials (RCTs). While effective for simple outcomes, LLMs struggle with complex data requiring inference.

Area of Science:

Medical Informatics
Natural Language Processing
Clinical Trials

Background:

Meta-analyses are crucial for robust treatment effectiveness estimates, synthesizing findings from multiple randomized controlled trials (RCTs).
Meta-analysis currently requires laborious manual extraction of data from individual trial reports, which limits efficiency and scalability.
Automating this data extraction using language technologies could enable on-demand meta-analyses.
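
To make concrete what a meta-analysis does with the extracted numbers, here is a minimal sketch of inverse-variance fixed-effect pooling. The choice of a fixed-effect model and the function name `pool_fixed_effect` are illustrative assumptions; the source does not say which pooling method the authors have in mind.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Pool per-trial effect estimates with inverse-variance weights.

    Hypothetical helper: each trial contributes an effect estimate
    (for example a log odds ratio) and its standard error.
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    # More precise trials (smaller standard error) get larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se
```

Every input to this calculation is a number that currently has to be read out of a trial report by hand, which is the step the study tries to automate.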

Purpose of the Study:

To evaluate the capability of modern large language models (LLMs) in reliably extracting numerical findings from clinical trial reports for meta-analysis.
To assess LLM performance in zero-shot conditional extraction of numerical results linked to interventions, comparators, and outcomes.

Main Methods:

Development and release of a granular evaluation dataset of clinical trial reports with annotated numerical findings.
Evaluation of seven large language models (LLMs) using a zero-shot approach on the annotated dataset.
Focus on extracting numerical results for interventions, comparators, and outcomes from trial reports.
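
As a rough illustration of zero-shot conditional extraction, the sketch below builds a prompt for one intervention, comparator, and outcome, then parses a JSON reply. The prompt wording, the JSON shape, the `"x"` marker for unreported values, and all names (`PROMPT_TEMPLATE`, `build_prompt`, `parse_counts`, `ArmCounts`) are assumptions made for this example; they are not the prompts or code used in the study, and no model is actually called.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical prompt; the study's actual prompts are not given in this summary.
PROMPT_TEMPLATE = (
    "Report the raw counts for the outcome below from the trial text.\n"
    "Intervention: {intervention}\n"
    "Comparator: {comparator}\n"
    "Outcome: {outcome}\n"
    'Answer with JSON only, shaped as {{"intervention": {{"events": ..., "total": ...}}, '
    '"comparator": {{"events": ..., "total": ...}}}}. Use "x" for any value not reported.\n\n'
    "Trial text:\n{report_text}"
)


@dataclass
class ArmCounts:
    events: Optional[int]  # participants with the outcome, None if not reported
    total: Optional[int]   # participants in the arm, None if not reported


def build_prompt(intervention, comparator, outcome, report_text):
    """Condition the request on one intervention, comparator, and outcome."""
    return PROMPT_TEMPLATE.format(
        intervention=intervention,
        comparator=comparator,
        outcome=outcome,
        report_text=report_text,
    )


def _to_int(value):
    """Return an int, or None when the value is missing or not a whole number."""
    if isinstance(value, bool):
        return None
    if isinstance(value, int):
        return value
    if isinstance(value, str) and value.strip().isdigit():
        return int(value.strip())
    return None


def parse_counts(response):
    """Parse the model's JSON reply into per-arm counts."""
    data = json.loads(response)
    result = {}
    for arm in ("intervention", "comparator"):
        fields = data.get(arm)
        if not isinstance(fields, dict):
            fields = {}
        result[arm] = ArmCounts(_to_int(fields.get("events")), _to_int(fields.get("total")))
    return result
```

Keeping unreported values as `None` rather than guessing lets a downstream analysis skip trials whose counts could not be extracted.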

Main Results:

The largest LLMs come close to supporting fully automatic meta-analysis, particularly for dichotomous outcomes such as mortality.
LLMs exhibit poor performance when outcome measures are complex and require inferential reasoning to tally results.
Performance limitations persist even for LLMs trained on biomedical texts.
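
One reason dichotomous outcomes are comparatively tractable is that four extracted counts fully determine a standard effect estimate. The sketch below computes a log odds ratio and its standard error from per-arm event counts. The odds ratio as the effect measure, the 0.5 continuity correction for zero cells, and the function name `log_odds_ratio` are assumptions for illustration, not details taken from the study.

```python
import math


def log_odds_ratio(events_t, total_t, events_c, total_c):
    """Log odds ratio and standard error from a 2x2 table of counts.

    events_t / total_t: events and participants in the intervention arm.
    events_c / total_c: events and participants in the comparator arm.
    """
    a, b = events_t, total_t - events_t
    c, d = events_c, total_c - events_c
    if min(a, b, c, d) < 0:
        raise ValueError("events cannot exceed the arm total or be negative")
    if 0 in (a, b, c, d):
        # Add 0.5 to every cell so the ratio and its variance stay finite.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    std_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, std_error
```

For example, 10 deaths among 100 treated participants versus 20 among 100 controls gives an odds ratio of 4/9, so the log odds ratio is negative, indicating lower odds of death in the intervention arm.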

Conclusions:

Large language models (LLMs) are approaching the goal of fully automatic meta-analysis of randomized controlled trials (RCTs).
Current LLMs face significant limitations in extracting and synthesizing complex numerical data from trial reports.
Further advancements in LLMs are needed to overcome challenges in inferential data processing for comprehensive meta-analysis.