Optimal and Safe Estimation for High-Dimensional Semi-Supervised Learning.

Siyi Deng¹, Yang Ning¹, Jiwei Zhao²

¹Department of Statistics and Data Science, Cornell University, Ithaca, NY 14850, USA.

Journal of the American Statistical Association

March 13, 2025

Summary

This summary is machine-generated.

This study examines when and how unlabeled data can improve parameter estimation in high-dimensional semi-supervised learning. The proposed optimal estimator uses unlabeled data to outperform supervised estimators, even when the linear model is misspecified.

Keywords:

High dimensionality, model aggregation, model misspecification, optimal estimation, safe estimation, semi-supervised learning

Area of Science:

Statistics
Machine Learning

Background:

High-dimensional data analysis presents challenges for accurate parameter estimation.
Linear models are frequently used but may be misspecified in real-world applications.
Semi-supervised learning offers a framework to leverage both labeled and unlabeled data.

Purpose of the Study:

To investigate the conditions and methods for effectively utilizing unlabeled data in high-dimensional semi-supervised learning.
To develop estimators that improve upon supervised methods when linear models are potentially misspecified.
To establish theoretical performance bounds for parameter estimation in this setting.

Main Methods:

Establishing the minimax lower bound for parameter estimation in the semi-supervised context.
Proposing an optimal semi-supervised estimator designed to achieve this lower bound.
Developing a 'safe' semi-supervised estimator that guarantees performance at least as good as supervised estimators.
Extending the methodology to aggregate multiple estimators for robustness against different model misspecifications.
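
The intuition behind a 'safe' use of unlabeled data can be illustrated with a deliberately simplified, low-dimensional analogue. The sketch below is not the paper's estimator: it estimates a population mean rather than a high-dimensional regression vector. The data-generating model, the sample sizes, and the function names (`simulate_once`, `mean_squared_errors`) are all assumptions made for illustration. It fits a linear working model that is misspecified (the true relationship is quadratic), then uses the unlabeled covariates only through a correction term, so the supervised estimate is adjusted rather than replaced.

```python
import random
import statistics


def simulate_once(rng, n_labeled=50, n_unlabeled=1000):
    """Return (supervised, semi_supervised) estimates of E[Y] for one dataset."""
    # Covariates: only the first n_labeled observations come with a label Y.
    x_lab = [rng.gauss(0.0, 1.0) for _ in range(n_labeled)]
    x_unl = [rng.gauss(0.0, 1.0) for _ in range(n_unlabeled)]
    # Assumed truth: Y is quadratic in X, so a linear working model is misspecified.
    y_lab = [2.0 * x + x * x + rng.gauss(0.0, 1.0) for x in x_lab]

    x_bar_lab = statistics.fmean(x_lab)
    y_bar_lab = statistics.fmean(y_lab)
    x_bar_all = statistics.fmean(x_lab + x_unl)

    # Least-squares slope of the linear working model, fit on labeled data only.
    sxy = sum((x - x_bar_lab) * (y - y_bar_lab) for x, y in zip(x_lab, y_lab))
    sxx = sum((x - x_bar_lab) ** 2 for x in x_lab)
    slope = sxy / sxx

    # Supervised estimate: the labeled sample mean of Y.
    supervised = y_bar_lab
    # Semi-supervised estimate: subtract the part of the labeled-sample error
    # that the working model attributes to the labeled X's being unrepresentative.
    # Both covariate means estimate the same E[X], so the correction is small on
    # average even though the working model is wrong.
    semi_supervised = y_bar_lab - slope * (x_bar_lab - x_bar_all)
    return supervised, semi_supervised


def mean_squared_errors(n_reps=1000, seed=0):
    """Monte Carlo mean squared error of both estimators around the true mean."""
    rng = random.Random(seed)
    true_mean = 1.0  # E[2X + X^2 + noise] = E[X^2] = 1 when X ~ N(0, 1)
    sup_err = 0.0
    ss_err = 0.0
    for _ in range(n_reps):
        sup, ss = simulate_once(rng)
        sup_err += (sup - true_mean) ** 2
        ss_err += (ss - true_mean) ** 2
    return sup_err / n_reps, ss_err / n_reps


if __name__ == "__main__":
    sup_mse, ss_mse = mean_squared_errors()
    print(f"supervised MSE:      {sup_mse:.4f}")
    print(f"semi-supervised MSE: {ss_mse:.4f}")
```

In this toy setting the semi-supervised estimate should show a smaller mean squared error, because the linear working model still explains part of the variation in Y even though it is wrong. The high-dimensional regression problem studied in the paper requires different tools, so this should be read only as intuition for why unlabeled covariates can help under misspecification.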

Main Results:

Supervised estimators alone cannot achieve the established minimax lower bound.
The proposed optimal semi-supervised estimator attains the lower bound under specific conditions.
The safe semi-supervised estimator is guaranteed to perform at least as well as supervised estimators.
Extensive simulations and a real data analysis illustrate the effectiveness of the proposed methods.

Conclusions:

Unlabeled data can significantly improve parameter estimation in high-dimensional semi-supervised learning, especially when models are misspecified.
The proposed optimal and safe semi-supervised estimators offer theoretical and practical advantages over purely supervised approaches.
The developed aggregation strategy enhances robustness in the presence of model uncertainty.