Predictive Analytics with Strategically Missing Data.

Juheng Zhang¹, Xiaoping Liu², Xiao-Bai Li¹

  • ¹Department of Operations and Information Systems, University of Massachusetts, Lowell, Massachusetts 01854.

INFORMS Journal on Computing | September 27, 2021
Summary
This summary is machine-generated.

This study introduces a method for handling strategically missing data in predictive analytics. The approach uses Support Vector Regression (SVR) to derive imputation values for the missing entries and is designed to give data providers an incentive to disclose truthful information.

Keywords:
business analytics, data manipulation, information disclosure, strategic learning, support vector regression

Area of Science:

  • Data Science
  • Machine Learning
  • Predictive Analytics

Background:

  • Real-world data often contain strategically missing values: data providers conceal them intentionally.
  • Such strategic omission occurs in domains such as finance, admissions, and marketing, and it affects the decisions made from the data.
  • Existing methods struggle to address the incentive problem behind strategically missing data.
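
The incentive problem described above can be made concrete with a small toy simulation. This is only an illustrative sketch, not the model from the paper: it assumes a hypothetical population of providers with scores 1 to 10, a decision-maker who naively imputes a missing score with the mean of the disclosed scores, and providers who conceal whenever the imputed value would look better than their true value. All names (`best_response_concealers`, `naive_mean_imputation_dynamics`) are made up for this example.

```python
def best_response_concealers(true_values, imputed_value):
    """Providers hide a value when the imputed stand-in would look better."""
    return [v for v in true_values if v < imputed_value]


def naive_mean_imputation_dynamics(true_values, rounds=10):
    """Repeatedly impute with the mean of disclosed values (toy assumption).

    Returns the sequence of imputation values and the final set of
    concealed true values.
    """
    imputed = sum(true_values) / len(true_values)
    history = [imputed]
    concealed = []
    for _ in range(rounds):
        concealed = best_response_concealers(true_values, imputed)
        # The largest value is never below the imputed mean, so this list
        # is never empty.
        disclosed = [v for v in true_values if v >= imputed]
        new_imputed = sum(disclosed) / len(disclosed)
        if new_imputed == imputed:
            break  # fixed point: nobody changes their disclosure decision
        imputed = new_imputed
        history.append(imputed)
    return history, concealed


scores = list(range(1, 11))  # hypothetical provider scores 1..10
history, concealed = naive_mean_imputation_dynamics(scores)
true_mean_of_concealed = sum(concealed) / len(concealed)
print("imputation values per round:", history)
print("imputed:", history[-1], "true mean of concealed:", true_mean_of_concealed)
```

In this toy setting the imputed value drifts upward as more providers conceal, so the final imputation sits far above the true mean of the concealed scores. That gap is the kind of imputation error an incentive-aware method would aim to avoid.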

Purpose of the Study:

  • To develop a novel approach for handling strategically missing data in regression prediction.
  • To create a mechanism that incentivizes data providers to disclose truthful information.
  • To minimize imputation errors for missing values in predictive models.

Main Methods:

  • Utilizing Support Vector Regression (SVR) models to derive imputation values for missing data.
  • Developing a framework that aligns data provider incentives with accurate data disclosure.
  • Applying the proposed method to real-world datasets for validation.
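
To make the SVR building block more tangible, the following is a minimal sketch of a one-dimensional linear SVR used to impute a missing value. It is not the authors' formulation: the ε-insensitive loss is the standard SVR loss, but the subgradient-descent fit, the hyperparameters (`eps`, `lam`, `steps`, `lr0`), and the toy data are assumptions chosen for illustration. In practice a library implementation such as scikit-learn's `sklearn.svm.SVR` would normally be used.

```python
import math


def eps_insensitive_loss(residual, eps):
    """Standard SVR loss: errors inside the eps-tube cost nothing."""
    return max(0.0, abs(residual) - eps)


def fit_linear_svr(xs, ys, eps=0.1, lam=1e-3, steps=2000, lr0=0.1):
    """Fit y ~ w*x + b by subgradient descent on
    0.5*lam*w^2 + mean(eps_insensitive_loss(y - (w*x + b))).
    Illustrative only; hyperparameters are arbitrary assumptions.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for t in range(steps):
        lr = lr0 / math.sqrt(t + 1)  # decaying step size
        grad_w, grad_b = lam * w, 0.0
        for x, y in zip(xs, ys):
            r = y - (w * x + b)
            if abs(r) > eps:  # only points outside the tube contribute
                s = 1.0 if r > 0 else -1.0
                grad_w -= s * x / n
                grad_b -= s / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical records where both features were disclosed.
observed_x = [0.0, 1.0, 2.0, 3.0, 4.0]
observed_y = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear_svr(observed_x, observed_y)

# Impute the withheld y for a record that disclosed only x = 2.5.
imputed_y = w * 2.5 + b
print("imputed value:", round(imputed_y, 2))
```

The sketch fits the regression on complete records and then predicts the withheld feature from the disclosed one. How the paper chooses the imputation value so that disclosure incentives are aligned is not reproduced here.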

Main Results:

  • The proposed method imputes strategically missing values using Support Vector Regression models.
  • Imputation errors are minimized under specific conditions, as demonstrated by experiments.
  • The approach incentivizes data providers to reveal true information, improving data quality.

Conclusions:

  • The novel approach effectively addresses strategically missing data problems in predictive analytics.
  • Support Vector Regression provides a robust foundation for imputing missing values.
  • The method offers a practical solution for decision-makers facing data concealment.
  • Experimental validation confirms the approach's effectiveness on real-world data.