An Extensive Data Processing Pipeline for MIMIC-IV.

Mehak Gupta¹, Brennan Gallamoza¹, Nicolas Cutrona¹

  • ¹University of Delaware, Newark, Delaware, USA.

Proceedings of Machine Learning Research | January 23, 2023
Summary
This summary is machine-generated.

This study introduces a customizable pipeline for preprocessing the MIMIC-IV electronic health record (EHR) dataset. The tool standardizes data extraction and cleaning for machine learning, improving reproducibility and enabling clinical prediction tasks.

Keywords: Data preprocessing, Electronic Health Records, MIMIC

Area of Science:

  • Clinical Informatics
  • Machine Learning in Healthcare
  • Data Science

Background:

  • Machine learning on electronic health records (EHRs) is growing, but limited data accessibility and a lack of standardization hinder research.
  • The MIMIC (Medical Information Mart for Intensive Care) dataset is a valuable public resource, yet its raw format requires significant preprocessing.
  • Lack of standardized preprocessing limits reproducibility and comparability across studies using MIMIC data.

Purpose of the Study:

  • To develop a flexible and customizable data processing pipeline for the MIMIC-IV dataset.
  • To facilitate the application of machine learning methods to EHR data for clinical prediction.
  • To enhance the reproducibility and comparability of research utilizing the MIMIC-IV resource.

Main Methods:

  • Developed a customizable pipeline for extracting, cleaning, and preprocessing MIMIC-IV data.
  • Integrated an end-to-end package for predictive model creation and evaluation.
  • The pipeline supports diverse clinical prediction tasks including readmission, length of stay, mortality, and phenotype prediction.

Main Results:

  • A publicly available, customizable pipeline for MIMIC-IV data preprocessing has been created.
  • The tool standardizes data handling, addressing a key barrier to MIMIC dataset utilization.
  • The pipeline supports multiple clinical prediction tasks, enhancing research capabilities.

Conclusions:

  • The pipeline improves the accessibility and usability of the MIMIC-IV dataset for machine learning applications.
  • Standardized preprocessing enhances the reproducibility and comparability of clinical prediction models derived from EHR data.
  • The tool helps researchers use MIMIC-IV more effectively to advance clinical research and patient care.