Doubly robust inference when combining probability and non-probability samples with high dimensional data.

Shu Yang1, Jae Kwang Kim2, Rui Song1

  • 1North Carolina State University, Raleigh, USA.

Journal of the Royal Statistical Society. Series B, Statistical Methodology | November 9, 2020
Summary
This summary is machine-generated.

This study introduces a two-step method for combining a non-probability sample with a probability sample that carries high-dimensional covariate information representative of the target population. The method improves both variable selection and finite population inference.

Keywords:
Data integration; Double robustness; Generalizability; Penalized estimating equation; Variable selection

Area of Science:

  • Statistics
  • Survey Methodology
  • Data Science

Background:

  • Integrating non-probability and probability samples presents challenges in statistical inference.
  • High-dimensional covariate information from probability samples is valuable for target population analysis.
  • Existing methods may struggle with variable selection and ensuring robustness.

Purpose of the Study:

  • To develop a robust two-step approach for variable selection and finite population inference.
  • To effectively combine non-probability samples with probability samples containing rich covariate data.
  • To enhance the accuracy and reliability of statistical estimates from complex survey data.

Main Methods:

  • Step 1: penalized estimating equations with folded concave penalties select the important variables.
  • Step 2: a doubly robust estimator is used to estimate the finite population mean.
  • Within Step 2, the nuisance model parameters are re-estimated by minimizing the asymptotic squared bias of the doubly robust estimator.
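
To make "folded concave penalty" concrete, here is a minimal sketch of one well-known member of that family, the SCAD penalty of Fan and Li (2001). The summary does not say which penalty the authors use, so the choice of SCAD, the function names, and the default a = 3.7 are assumptions made for illustration only.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty, evaluated elementwise (illustrative choice of penalty)."""
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t                                               # |theta| <= lam
    quadratic = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))  # lam < |theta| <= a*lam
    constant = lam**2 * (a + 1) / 2                                # |theta| > a*lam
    return np.where(t <= lam, linear, np.where(t <= a * lam, quadratic, constant))

def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |theta|."""
    t = np.abs(np.asarray(theta, dtype=float))
    # Equals lam near zero (lasso-like shrinkage) and tapers to 0 beyond a*lam,
    # so large coefficients are left essentially unpenalized.
    return lam * ((t <= lam) + np.maximum(a * lam - t, 0.0) / ((a - 1) * lam) * (t > lam))

# Hypothetical coefficients: small, moderate, and large relative to lam = 1.
print(scad_penalty([0.5, 2.0, 10.0], lam=1.0))
print(scad_derivative([0.5, 2.0, 10.0], lam=1.0))
```

The penalty grows linearly near zero, then flattens, which is what lets it shrink small coefficients to zero while leaving large ones nearly unbiased.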

Main Results:

  • The first-step variable selection is shown to be consistent for general samples.
  • The doubly robust estimator is root-n consistent if either the sampling probability model or the outcome model is correctly specified.
  • Re-estimating the nuisance parameters mitigates possible first-step selection errors.
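
For orientation, a doubly robust estimator of a finite population mean in this kind of data-integration setting is commonly written in the form below. The notation is assumed here and does not come from the summary: A is the non-probability sample with outcomes y_i, B is the probability sample with design weights d_i, π_A(x; α) is a model for the probability of being in A, m(x; β) is an outcome model, and N is the population size.

```latex
\hat{\mu}_{\mathrm{dr}}
  = \frac{1}{N}\left\{
      \sum_{i \in A} \frac{y_i - m(x_i; \hat{\beta})}{\pi_A(x_i; \hat{\alpha})}
      + \sum_{i \in B} d_i \, m(x_i; \hat{\beta})
    \right\}
```

Roughly, the second sum is a design-weighted prediction of the population total, and the first sum corrects it with inverse-probability-weighted residuals from sample A.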

Conclusions:

  • The proposed two-step method offers a robust framework for integrating diverse data sources.
  • Variable selection and finite population inference are improved by combining penalized estimating equations and doubly robust estimation.
  • This approach provides reliable estimates as long as at least one of the two models, the sampling probability model or the outcome model, is correctly specified.
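
The double robustness property can be checked on toy numbers. The sketch below is a generic doubly robust mean estimator of the kind common in this literature, not the authors' implementation; the function name, argument names, and all values are hypothetical, and the inputs are assumed to be already-fitted model values (no model fitting is shown).

```python
import numpy as np

def doubly_robust_mean(y_a, m_a, pi_a, m_b, d_b, n_pop=None):
    """Generic doubly robust estimate of a finite population mean.

    y_a  : outcomes observed in the non-probability sample A
    m_a  : outcome-model predictions for the units in A
    pi_a : estimated probabilities of inclusion in A
    m_b  : outcome-model predictions for the probability sample B
    d_b  : design weights for the units in B
    n_pop: population size; if None, estimated by the sum of d_b
    """
    y_a, m_a, pi_a, m_b, d_b = (
        np.asarray(v, dtype=float) for v in (y_a, m_a, pi_a, m_b, d_b)
    )
    if n_pop is None:
        n_pop = d_b.sum()
    residual_term = np.sum((y_a - m_a) / pi_a)   # weighted residuals from A
    prediction_term = np.sum(d_b * m_b)          # design-weighted predictions from B
    return (residual_term + prediction_term) / n_pop

y_a = [1.0, 2.0, 3.0]
m_b = [2.0, 4.0]
d_b = [5.0, 5.0]  # estimated population size is 10

# If the outcome model fits A exactly, the residuals vanish and the
# inclusion probabilities no longer matter, however wrong they are.
est_1 = doubly_robust_mean(y_a=y_a, m_a=y_a, pi_a=[0.2, 0.5, 0.9], m_b=m_b, d_b=d_b)
est_2 = doubly_robust_mean(y_a=y_a, m_a=y_a, pi_a=[0.9, 0.9, 0.9], m_b=m_b, d_b=d_b)
print(float(est_1), float(est_2))  # 3.0 3.0
```

The example only illustrates one direction of the property (a correct outcome model with arbitrary inclusion probabilities); the other direction holds in expectation rather than exactly, so it is not shown with fixed numbers.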