MIMIC-IV-Ext-22MCTS: A 22 Millions-Event Temporal Clinical Time-Series Dataset with Relative Timestamp for Risk

Jing Wang, Xing Niu, Juyong Kim

July 30, 2025

Summary

This summary is machine-generated.

This study introduces a new dataset of over 22 million clinical time series events extracted from discharge summaries. This dataset enhances machine learning models for improved healthcare applications, including medical question answering and clinical trial matching.

Area of Science:

Medical Informatics
Natural Language Processing
Machine Learning

Background:

High-quality clinical time series data are essential for accurate machine learning-based risk prediction in healthcare.
Existing datasets such as MIMIC-IV-Note contain unstructured discharge summaries that are long and lack explicit timestamps for clinical events, which makes them difficult to use directly for time series modeling.

Purpose of the Study:

To create a novel, large-scale dataset of clinical time series events (MIMIC-IV-Ext-22MCTS) from unstructured discharge summaries.
To develop a robust framework for extracting clinical events and their temporal information from lengthy medical texts.
To improve the performance of machine learning models in healthcare applications through the use of this new dataset.

Main Methods:

Developed a framework to process lengthy discharge summaries by segmenting them into smaller chunks.
Utilized contextual BM25 and semantic search to identify relevant text chunks containing clinical events.
Employed prompt engineering with the Llama-3.1-8B model to identify and infer timestamps for clinical events.
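
The first two steps above (chunking a long summary, then ranking chunks by lexical relevance) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses plain Okapi BM25 rather than the "contextual" variant named above, it omits semantic search and the Llama-3.1-8B prompting step entirely, and the function names, chunk sizes, and the k1 and b values are all hypothetical choices made for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into overlapping windows of words (sizes are assumptions)."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def tokenize(text):
    """Lowercase and keep alphanumeric runs; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, chunks, k1=1.5, b=0.75):
    """Score each chunk against the query with standard Okapi BM25."""
    docs = [tokenize(c) for c in chunks]
    n_docs = len(docs)
    if n_docs == 0:
        return []
    avg_len = sum(len(d) for d in docs) / n_docs or 1.0
    # Document frequency: number of chunks containing each term.
    df = Counter(term for d in docs for term in set(d))
    query_terms = tokenize(query)
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def top_chunks(query, chunks, k=2):
    """Return the k highest-scoring (chunk, score) pairs."""
    scores = bm25_scores(query, chunks)
    ranked = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)
    return [(chunks[i], scores[i]) for i in ranked[:k]]


if __name__ == "__main__":
    # Toy chunks standing in for segments of a discharge summary.
    toy_chunks = [
        "patient developed fever on day two of admission",
        "chest x-ray was unremarkable",
        "discharged home in stable condition",
    ]
    for chunk, score in top_chunks("fever", toy_chunks, k=1):
        print(round(score, 3), chunk)
```

In a fuller pipeline, the top-ranked chunks would presumably be passed to the language model together with a prompt asking for events and their timestamps; that step is left out here because the source does not describe the prompt.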

Main Results:

Created the MIMIC-IV-Ext-22MCTS dataset, comprising 22,588,586 clinical time series events.
Fine-tuning standard models on this dataset led to significant performance improvements in healthcare tasks.
BERT models achieved a 10% accuracy increase in medical question answering and a 3% increase in clinical trial matching.

Conclusions:

The MIMIC-IV-Ext-22MCTS dataset provides informative and transparent clinical time series data.
This dataset effectively enhances the performance of machine learning models for critical healthcare applications.
The proposed framework offers a scalable solution for extracting temporal clinical event data from unstructured medical records.