Boosting with missing predictors.

C Y Wang (1), Ziding Feng

  • (1) Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109-1024, USA. cywang@fhcrc.org

Biostatistics (Oxford, England)
|December 2, 2009
Summary
This summary is machine-generated.

This study introduces two conditional mean imputation methods for boosting-based classification with high-dimensional microarray data that contain missing values. In simulations, the methods outperform naive imputation, and they are applied to a pancreatic cancer classification study.

Area of Science:

  • Bioinformatics
  • Machine Learning
  • Statistical Modeling

Background:

  • Boosting is a powerful classification technique, particularly effective for high-dimensional data like antibody microarrays.
  • Microarray data frequently contain missing values, which complicates analysis and imputation.
  • Existing imputation methods can be computationally intensive for high-dimensional datasets.
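
For readers unfamiliar with boosting, a minimal sketch of the idea follows. It is a generic textbook AdaBoost with one-feature threshold rules ("stumps"), not the boosting procedure used in the study, which this summary does not specify. All function names and the toy data are illustrative assumptions.

```python
import math

def stump_predict(row, j, t, s):
    """Weak rule: predict s if feature j exceeds threshold t, otherwise -s."""
    return s if row[j] > t else -s

def fit_stump(X, y, w):
    """Exhaustively search for the stump with the smallest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for s in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, t, s) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, s)
    return best

def adaboost(X, y, rounds=3):
    """Fit `rounds` stumps, reweighting misclassified points after each one."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, t, s = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        model.append((alpha, j, t, s))
        # Increase weights of misclassified points, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, s))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    """Weighted majority vote of all fitted stumps; labels are +1 / -1."""
    score = sum(a * stump_predict(row, j, t, s) for a, j, t, s in model)
    return 1 if score >= 0 else -1

# Toy one-feature data (hypothetical): no single threshold separates the classes.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
fitted = [predict(model, row) for row in X]
```

On this toy data no single stump classifies every point correctly, but the weighted vote of three stumps does. The sketch assumes complete data, which is exactly the assumption that missing microarray values break.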

Purpose of the Study:

  • To propose novel conditional mean imputation methods for high-dimensional microarray data.
  • To address the challenge of missing data in classification tasks, even without a complete-case subset.
  • To evaluate the performance of these new imputation methods against existing techniques.

Main Methods:

  • Development of two conditional mean imputation techniques tailored for high-dimensional predictors.
  • Application of imputation methods to antibody microarray data, including serum protein data for pancreatic cancer studies.
  • Simulation studies comparing the proposed methods with naive imputation.
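
The general idea behind conditional mean imputation can be sketched as follows. This summary does not describe the two estimators proposed in the paper, so the code below is only a generic single-predictor regression imputation under simplifying assumptions: one predictor (`target`) has missing entries marked as `None`, and a second predictor (`helper`) is fully observed. The function names and data are hypothetical.

```python
def conditional_mean_impute(target, helper):
    """Fill None entries of `target` with the fitted value from a
    least-squares line of target on helper, estimated from the rows
    where target is observed."""
    pairs = [(h, t) for h, t in zip(helper, target) if t is not None]
    n = len(pairs)
    mean_h = sum(h for h, _ in pairs) / n
    mean_t = sum(t for _, t in pairs) / n
    sxx = sum((h - mean_h) ** 2 for h, _ in pairs)
    sxy = sum((h - mean_h) * (t - mean_t) for h, t in pairs)
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_t - slope * mean_h
    # Observed values are kept; missing ones get the estimated E[target | helper].
    return [t if t is not None else intercept + slope * h
            for h, t in zip(helper, target)]

def naive_mean_impute(target):
    """Baseline: fill None entries with the unconditional observed mean."""
    observed = [t for t in target if t is not None]
    mean_t = sum(observed) / len(observed)
    return [t if t is not None else mean_t for t in target]

# Hypothetical data in which target is roughly twice helper.
helper = [1.0, 2.0, 3.0, 4.0]
target = [2.0, 4.0, None, 8.0]
conditional = conditional_mean_impute(target, helper)
naive = naive_mean_impute(target)
```

Here the conditional fill for the missing entry uses the relationship with `helper`, while the naive fill ignores it and uses the overall observed mean. The requirement of a fully observed helper column is a simplification of this sketch; the source states that the proposed methods work even without a complete-case subset.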

Main Results:

  • The proposed conditional mean imputation methods demonstrate superior performance compared to naive imputation strategies.
  • The methods are effective even when no complete-case subset is available.
  • Successful application to a real-world pancreatic cancer classification study using serum protein microarrays.

Conclusions:

  • The novel conditional mean imputation methods offer an efficient and effective solution for handling missing data in high-dimensional classification.
  • These methods enhance the applicability of boosting for analyzing complex biological datasets like microarrays.
  • The approach shows promise for improving diagnostic accuracy in diseases such as pancreatic cancer.