ipd: an R package for conducting inference on predicted data.

Stephen Salerno¹, Jiacheng Miao², Awan Afiaz¹,³

¹Public Health Sciences, Biostatistics, Fred Hutchinson Cancer Center, Seattle, WA 98109, United States.

Bioinformatics (Oxford, England)

February 3, 2025

Summary

This summary is machine-generated.

Introducing ipd, an R package for downstream modeling with imputed data. It simplifies inference on data predicted by AI/ML algorithms and offers user-friendly functions for model inspection and analysis.

Area of Science:

Statistical Software
Machine Learning Applications
Data Science

Background:

The ipd package is open-source R software for downstream modeling.
It addresses challenges in handling outcome data imputed by AI/ML algorithms.
The package is available on CRAN and GitHub with comprehensive documentation.

Purpose of the Study:

To introduce the ipd R package for statistical modeling.
To provide a user-friendly tool for inference on data with AI/ML-imputed outcomes.
To demonstrate the basic usage and features of the ipd package.

Main Methods:

The ipd package implements recent methods for inference on predicted data.
It offers a single, user-friendly wrapper function named 'ipd'.
Custom methods (print, summary, tidy, glance, augment) are included for model inspection.
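To make "inference on predicted data" concrete, here is a minimal sketch in Python of one published approach from this literature: a prediction-powered estimate of a mean. It is not the ipd API, and this summary does not say which methods the package implements. The function name `ppi_mean`, its arguments, and the toy numbers are all hypothetical. The idea is that predictions on a large unlabeled set supply most of the information, while a small labeled set measures how far off the predictions are and corrects for it.

```python
import math
import statistics


def ppi_mean(y_labeled, f_labeled, f_unlabeled, z=1.96):
    """Prediction-powered-style estimate of a population mean.

    Hypothetical helper for illustration only (not part of ipd).
    y_labeled   : observed outcomes on the small labeled set
    f_labeled   : model predictions for those same labeled units
    f_unlabeled : model predictions for the large unlabeled set
    z           : normal quantile for the interval (1.96 gives about 95%)
    """
    if len(y_labeled) != len(f_labeled):
        raise ValueError("labeled outcomes and predictions must align")

    # Rectifier: prediction error measured where the true outcome is known.
    residuals = [y - f for y, f in zip(y_labeled, f_labeled)]

    # Mean of the predictions, shifted by the average prediction error.
    estimate = statistics.fmean(f_unlabeled) + statistics.fmean(residuals)

    # Uncertainty comes from both the unlabeled predictions and the rectifier.
    variance = (
        statistics.variance(f_unlabeled) / len(f_unlabeled)
        + statistics.variance(residuals) / len(residuals)
    )
    half_width = z * math.sqrt(variance)
    return estimate, (estimate - half_width, estimate + half_width)


# Toy data (made up): the predictions run 0.5 below the observed outcomes.
y_lab = [2.0, 4.0, 6.0, 8.0]
f_lab = [1.5, 3.5, 5.5, 7.5]
f_unl = [1.5, 3.5, 5.5, 7.5, 1.5, 3.5, 5.5, 7.5]

naive = statistics.fmean(f_unl)  # treats predictions as if they were observed
estimate, (lower, upper) = ppi_mean(y_lab, f_lab, f_unl)

print("naive mean of predictions:", naive)
print("corrected estimate:", estimate)
print("approximate 95% interval:", (lower, upper))
```

In this toy case the naive mean inherits the systematic error of the predictions, while the corrected estimate shifts by the average residual observed on the labeled units. A wrapper such as the one described above would presumably hide this bookkeeping behind a model formula, but that detail is an assumption here.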

Main Results:

The ipd package facilitates downstream modeling with imputed outcome data.
It enables straightforward inference on AI/ML-predicted data.
The package simplifies model inspection through custom S3 methods.

Conclusions:

ipd is a valuable open-source R package for researchers and data scientists.
It enhances the ability to perform reliable statistical modeling with imputed data.
The package promotes reproducible research and efficient data analysis.