Evaluating data extraction error by a large language model from randomised controlled trials: a large-scale empirical

Shiqi Fan¹, Ming Chen², Suhail A Doi³

¹Proof of Concept Center, Shanghai Eastern Hepatobiliary Surgery Hospital, Shanghai, China.

BMJ Evidence-Based Medicine | April 29, 2026

Summary

This summary is machine-generated.

In this study, the large language model (LLM) Claude 3.5 Sonnet showed a low data extraction error rate (6.6% overall) on randomized controlled trials (RCTs). Careful verification of LLM outputs nonetheless remains crucial for evidence synthesis applications.

Keywords:

Child Health; Drug-Related Side Effects and Adverse Reactions; Evidence-Based Practice

Area of Science:

Medical informatics
Artificial intelligence in healthcare
Clinical trial methodology

Background:

Large language models (LLMs) offer potential for automating data extraction from clinical research.
Evaluating the accuracy of LLMs in extracting data from randomized controlled trials (RCTs) is essential.

Purpose of the Study:

To assess the data extraction accuracy of Claude 3.5 Sonnet on RCTs.
To identify common error types and influencing factors in LLM-based data extraction.

Main Methods:

An empirical study compared Claude 3.5 Sonnet's extractions against a human-verified dataset of 664 RCTs.
Data extraction focused on basic trial information and adverse outcomes.
Error rates were calculated and analyzed by error type and trial reporting quality (CONSORT adherence).
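
The comparison described above can be illustrated with a minimal sketch. It assumes that each trial is stored as a dictionary of extracted fields and that the error rate is the share of reference fields the model got wrong; the summary does not state the study's exact denominator or data layout, so both are assumptions. The function names, field names, and toy records are hypothetical, and the two error labels used here ("omitted", "mismatch") are a simplification: categories such as misallocation need human judgment and cannot be derived from a plain value comparison.

```python
from collections import Counter


def extraction_error_rate(reference, extracted):
    """Compare extracted fields with a human-verified reference.

    Returns (error_rate, errors), where errors is a list of
    (trial_id, field, kind) tuples. Hypothetical helper, not the study's code.
    """
    errors = []
    total = 0
    for trial_id, ref_fields in reference.items():
        ext_fields = extracted.get(trial_id, {})
        for field, ref_value in ref_fields.items():
            total += 1
            ext_value = ext_fields.get(field)
            if ext_value is None:
                # Field missing from the model output
                errors.append((trial_id, field, "omitted"))
            elif ext_value != ref_value:
                # Field present but different from the reference
                errors.append((trial_id, field, "mismatch"))
    rate = len(errors) / total if total else 0.0
    return rate, errors


def error_type_shares(errors):
    """Return the share of each error kind among all errors."""
    counts = Counter(kind for _, _, kind in errors)
    n = sum(counts.values())
    return {kind: c / n for kind, c in counts.items()} if n else {}


# Toy records (invented for illustration, not data from the study)
reference = {
    "trial_A": {"sample_size": 120, "events_treatment": 8, "events_control": 5},
    "trial_B": {"sample_size": 64, "events_treatment": 2, "events_control": 3},
}
extracted = {
    "trial_A": {"sample_size": 120, "events_treatment": 5, "events_control": 5},
    "trial_B": {"sample_size": 64, "events_treatment": 2},
}

rate, errors = extraction_error_rate(reference, extracted)
shares = error_type_shares(errors)
print(f"error rate: {rate:.3f}")
print(shares)
```

In practice, each flagged field would still be reviewed by a person to assign the error type, which is consistent with the human-verified reference used in the study.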

Main Results:

The overall data extraction error rate for Claude 3.5 Sonnet was 6.6%.
Misallocation (57.1%) and omitted data (23.2%) were the most frequent error types.
Higher adherence to Consolidated Standards of Reporting Trials (CONSORT) guidelines correlated with lower LLM extraction errors.
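
One way to check a relationship like the one between CONSORT adherence and extraction errors is a rank correlation. The summary does not name the statistic the authors used, so the Spearman coefficient below is an assumption chosen for illustration, and the per-trial scores and error rates are invented toy values rather than results from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values (hypothetical): CONSORT items reported and error rate per trial
consort_scores = [10, 15, 20, 25, 30]
error_rates = [0.12, 0.10, 0.07, 0.05, 0.02]

rho = spearman(consort_scores, error_rates)
print(f"Spearman rho: {rho:.2f}")
```

A negative coefficient on real data would match the direction reported above: better-reported trials, fewer extraction errors.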

Conclusions:

Claude 3.5 Sonnet demonstrates a relatively low error rate for RCT data extraction.
LLM applications in evidence synthesis require rigorous human oversight and detailed checking of outputs.