



Optimal and Safe Estimation for High-Dimensional Semi-Supervised Learning.

Siyi Deng¹, Yang Ning¹, Jiwei Zhao²

  • ¹ Department of Statistics and Data Science, Cornell University, Ithaca, NY 14850, USA.

Journal of the American Statistical Association | March 13, 2025
Summary
This summary is machine-generated.

This study examines how unlabeled data can improve parameter estimation in high-dimensional semi-supervised learning. The proposed optimal estimator uses unlabeled data to outperform supervised methods, even when the linear model is misspecified.

Keywords:
High dimensionality; model aggregation; model misspecification; optimal estimation; safe estimation; semi-supervised learning


Area of Science:

  • Statistics
  • Machine Learning

Background:

  • High-dimensional data analysis presents challenges for accurate parameter estimation.
  • Linear models are frequently used but may be misspecified in real-world applications.
  • Semi-supervised learning offers a framework to leverage both labeled and unlabeled data.
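
To make the role of unlabeled data concrete, here is a minimal Python sketch on a much simpler problem than the one the paper studies. It is not the paper's estimator. It uses a classical regression-adjusted estimator of the mean of Y in low dimension, swapped in only because it shows the same effect: extra covariates without labels can reduce estimation error even when the linear working model is wrong. The function names, the quadratic data-generating model, and the sample sizes are all illustrative assumptions.

```python
import numpy as np


def supervised_mean(y_labeled):
    """Baseline: average of the labeled responses only."""
    return float(np.mean(y_labeled))


def semi_supervised_mean(x_labeled, y_labeled, x_unlabeled):
    """Regression-adjusted mean (illustrative, not the paper's method).

    Fits a linear working model on the labeled data, then corrects the
    labeled average for the gap between the labeled covariate mean and
    the covariate mean over labeled plus unlabeled data.
    """
    x_all = np.vstack([x_labeled, x_unlabeled])
    x_centered = x_labeled - x_labeled.mean(axis=0)
    y_centered = y_labeled - y_labeled.mean()
    # Least-squares slope of the (possibly misspecified) linear working model.
    slope, *_ = np.linalg.lstsq(x_centered, y_centered, rcond=None)
    shift = x_labeled.mean(axis=0) - x_all.mean(axis=0)
    return float(y_labeled.mean() - shift @ slope)


def simulate(n_labeled=50, n_unlabeled=2000, dim=3, reps=500, seed=0):
    """Monte Carlo mean squared error of both estimators (true mean is 0)."""
    rng = np.random.default_rng(seed)
    coef = np.ones(dim)
    sq_err_sup, sq_err_semi = [], []
    for _ in range(reps):
        x = rng.normal(size=(n_labeled + n_unlabeled, dim))
        # Assumed truth: linear part plus a centered quadratic term, so the
        # linear working model is misspecified while E[Y] stays at 0.
        y = x @ coef + 0.5 * (x[:, 0] ** 2 - 1.0) + rng.normal(size=len(x))
        xl, yl, xu = x[:n_labeled], y[:n_labeled], x[n_labeled:]
        sq_err_sup.append(supervised_mean(yl) ** 2)
        sq_err_semi.append(semi_supervised_mean(xl, yl, xu) ** 2)
    return float(np.mean(sq_err_sup)), float(np.mean(sq_err_semi))


if __name__ == "__main__":
    mse_sup, mse_semi = simulate()
    print(f"supervised MSE: {mse_sup:.4f}  semi-supervised MSE: {mse_semi:.4f}")
```

In this toy setting the adjustment removes the part of the sampling noise in the labeled average that the covariates can explain linearly. The high-dimensional regression-parameter setting of the paper needs different tools, which this summary does not describe in detail.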

Purpose of the Study:

  • To investigate the conditions and methods for effectively utilizing unlabeled data in high-dimensional semi-supervised learning.
  • To develop estimators that improve upon supervised methods when linear models are potentially misspecified.
  • To establish theoretical performance bounds for parameter estimation in this setting.

Main Methods:

  • Establishing the minimax lower bound for parameter estimation in the semi-supervised context.
  • Proposing an optimal semi-supervised estimator designed to achieve this lower bound.
  • Developing a 'safe' semi-supervised estimator that guarantees performance at least as good as supervised estimators.
  • Extending the methodology to aggregate multiple estimators for robustness against different model misspecifications.
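
For readers unfamiliar with the term, a minimax lower bound in its generic textbook form can be written as below. This is the general definition only. The symbols (n labeled and N unlabeled observations, a distribution class P, a loss ℓ, and a rate r) are notation assumed here for illustration, and the specific class, loss, and rate used in the paper are not given in this summary.

```latex
% Generic minimax lower bound (notation assumed, not taken from the paper):
% no estimator built from n labeled and N unlabeled observations can have
% worst-case risk over the class \mathcal{P} below the rate r_{n,N}.
\inf_{\hat{\theta}} \; \sup_{P \in \mathcal{P}} \;
  \mathbb{E}_{P}\!\left[ \ell\bigl(\hat{\theta}, \theta(P)\bigr) \right]
  \;\ge\; r_{n,N}
```

An estimator is called optimal in this sense when its worst-case risk matches the bound up to constants.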

Main Results:

  • Demonstrating that supervised estimators alone cannot achieve the established minimax lower bound.
  • Showing that the proposed optimal semi-supervised estimator can attain the lower bound under specific conditions.
  • Confirming that the safe semi-supervised estimator provides a performance guarantee.
  • Illustrating the effectiveness of the proposed methods through extensive simulations and real data analysis.

Conclusions:

  • Unlabeled data can significantly improve parameter estimation in high-dimensional semi-supervised learning, especially when models are misspecified.
  • The proposed optimal and safe semi-supervised estimators offer theoretical and practical advantages over purely supervised approaches.
  • The developed aggregation strategy enhances robustness in the presence of model uncertainty.