

Accelerating real-world data collection using large language models in rare neoplasms: a bone sarcoma example.

P Teterycz1,2, S Rynkun1, B Szostakowski2,3

  • 1Digital Medicine Center, Maria Sklodowska-Curie National Research Institute of Oncology, Warsaw, Poland.

ESMO Real World Data and Digital Oncology | April 23, 2026
Summary
This summary is machine-generated.

Small large language models (LLMs) showed low single-model accuracy (17.5% to 30.3%) when extracting oncology data from Polish medical notes. An ensemble voting approach raised overall accuracy to 83.6%, suggesting that automated data extraction for clinical research is feasible.

Keywords:
LLMs, artificial intelligence, bone sarcoma, data extraction


Area of Science:

  • Medical Informatics
  • Natural Language Processing
  • Oncology Data Science

Background:

  • Real-world data collection in oncology is challenging due to unstructured medical notes.
  • Large language models (LLMs) show promise for extracting information from free-text data.
  • This study assesses small LLMs for information extraction from Polish medical notes.

Purpose of the Study:

  • To evaluate the performance of multiple small LLMs as information extractors on Polish medical notes.
  • To determine the effectiveness of different prompting techniques and ensemble methods for data extraction.
  • To assess the feasibility of automating data extraction from electronic health records (EHRs) in a non-English setting.

Main Methods:

  • Utilized EHRs from 302 bone sarcoma patients (2016-2022).
  • Annotated five key variables: pathology type, tumor size, localization, grade, and primary resection.
  • Employed four small LLMs with multiple prompting techniques and an ensemble voting strategy.
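
The summary does not say how the ensemble vote was taken, so the following is only a minimal sketch assuming plain majority voting over the answers produced by each model and prompt combination. The function names, the variable keys, and the normalization step (lowercasing and trimming) are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent non-empty answer, or None if there is none.

    Assumption: answers are compared after trimming and lowercasing, so
    trivial formatting differences do not split the vote.
    """
    cleaned = [a.strip().lower() for a in answers if a and a.strip()]
    if not cleaned:
        return None
    # On a tie, Counter.most_common keeps the answer encountered first.
    return Counter(cleaned).most_common(1)[0][0]


def ensemble_extract(runs, variables):
    """Combine several extraction runs into one record per patient note.

    `runs` is a list of dicts, one per model/prompt combination, each
    mapping a variable name to the extracted string (hypothetical format).
    """
    return {
        var: majority_vote([run.get(var) for run in runs])
        for var in variables
    }


# Toy example with invented values, for illustration only.
runs = [
    {"localization": "femur", "grade": "G2"},
    {"localization": "Femur ", "grade": "G3"},
    {"localization": "tibia", "grade": "G3"},
]
result = ensemble_extract(runs, ["localization", "grade"])
print(result)
```

Majority voting is the simplest option; weighted voting or agreement thresholds would be alternative choices, and the source does not indicate which was used.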

Main Results:

  • Single-model accuracy ranged from 17.5% to 30.3% and depended strongly on the prompt used.
  • Tumor localization was the easiest variable to extract (up to 36.2% accuracy).
  • The ensemble voting approach raised overall accuracy to 83.6%, reaching 90.0% for resection type.
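
The accuracy figures above can be read as the share of extracted values that match the manual annotation. The sketch below assumes exact-match scoring and an overall accuracy pooled across all variable-note pairs; the summary does not state either definition, and the records shown are invented toy data, not study results.

```python
def accuracy_by_variable(gold, predicted, variables):
    """Fraction of notes where the prediction equals the annotation, per variable."""
    if not gold:
        raise ValueError("gold annotations must not be empty")
    scores = {}
    for var in variables:
        # A missing prediction (None) counts as an error.
        correct = sum(1 for g, p in zip(gold, predicted) if p.get(var) == g[var])
        scores[var] = correct / len(gold)
    return scores


def overall_accuracy(gold, predicted, variables):
    """Accuracy pooled over every (note, variable) pair (assumed definition)."""
    if not gold or not variables:
        raise ValueError("gold annotations and variables must not be empty")
    correct = sum(
        1
        for g, p in zip(gold, predicted)
        for var in variables
        if p.get(var) == g[var]
    )
    return correct / (len(gold) * len(variables))


# Invented toy data: four annotated notes and one set of predictions.
variables = ["localization", "grade"]
gold = [
    {"localization": "femur", "grade": "g2"},
    {"localization": "tibia", "grade": "g3"},
    {"localization": "femur", "grade": "g1"},
    {"localization": "pelvis", "grade": "g3"},
]
predicted = [
    {"localization": "femur", "grade": "g2"},
    {"localization": "tibia", "grade": "g2"},
    {"localization": "humerus", "grade": "g1"},
    {"localization": "pelvis"},
]
per_variable = accuracy_by_variable(gold, predicted, variables)
pooled = overall_accuracy(gold, predicted, variables)
print(per_variable, pooled)
```

Exact matching is strict for free-text fields such as tumor size; a real evaluation might normalize units or accept synonyms before comparing.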

Conclusions:

  • Lightweight LLMs show potential for automating data extraction from medical notes, accelerating clinical research.
  • Individual small LLMs are insufficient for real-world, non-English applications.
  • Prompt engineering and ensemble methods are crucial for improving LLM performance in medical data extraction.