



Evaluating Binary Outcome Classifiers Estimated from Survey Data

Adway S Wadekar¹, Jerome P Reiter

¹From the Department of Statistical Science, Duke University, Durham, NC.

Epidemiology (Cambridge, Mass.)

August 14, 2024

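Summary

This summary is machine-generated.

Using survey weights improves the evaluation of predictive models on complex survey data. Unlike unweighted metrics, weighted metrics accurately reflect performance in the population, particularly when class imbalance has been mitigated during training.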


Scientific fields:

Epidemiology
Health sciences
Social and behavioral sciences

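Background:

Surveys are important research tools, but they often use complex sampling designs rather than simple random sampling.
Survey respondents are typically assigned weights to account for unequal selection probabilities.
Evaluating predictive models on survey data requires careful attention to these complex designs.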
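
As a rough illustration of how weights account for unequal selection probabilities, the sketch below assumes the simplest case, in which the weight of respondent i is the inverse of that respondent's selection probability. The symbols w_i, π_i, y_i, and S are notation introduced here and do not come from the summary; weights in real surveys often also include nonresponse and calibration adjustments.

```latex
% Assumption: w_i is the survey weight of respondent i in the sample S,
% taken here as the inverse of the selection probability \pi_i.
% \hat{T}_y estimates a population total; \hat{\bar{Y}} estimates a population mean.
w_i = \frac{1}{\pi_i}, \qquad
\hat{T}_y = \sum_{i \in S} w_i \, y_i, \qquad
\hat{\bar{Y}} = \frac{\sum_{i \in S} w_i \, y_i}{\sum_{i \in S} w_i}
```

Under this reading, each respondent stands in for roughly w_i people in the population, so respondents sampled with low probability count for more in the estimate.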

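Study objectives:

Demonstrate the benefits of using survey weights when assessing the quality of predictive models.
Compare weighted and unweighted performance metrics on complex survey data.
Evaluate the effect of weights on models trained to mitigate class imbalance.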

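Main methods:

Describe model evaluation statistics (e.g., sensitivity, specificity) as finite population quantities.
Compute survey-weighted estimates of these statistics using a random subset of the original survey data held out for testing.
Illustrate through simulations using data from the National Survey on Drug Use and Health and the National Comorbidity Survey.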
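
The weighted estimates described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `weighted_sensitivity_specificity`, the toy labels, and the weights are all hypothetical, and the sketch assumes each test-set respondent carries a single final survey weight.

```python
import numpy as np

def weighted_sensitivity_specificity(y_true, y_pred, weights):
    """Estimate sensitivity and specificity with each respondent counted by its weight.

    Hypothetical helper for illustration; assumes binary 0/1 labels and
    one nonnegative survey weight per test-set respondent.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    w = np.asarray(weights, dtype=float)

    # Weighted cells of the confusion matrix: sums of weights, not counts.
    tp = w[y_true & y_pred].sum()
    fn = w[y_true & ~y_pred].sum()
    tn = w[~y_true & ~y_pred].sum()
    fp = w[~y_true & y_pred].sum()

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Toy test set (made-up values, not data from either survey).
y_true = [1, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0]
weights = [10.0, 30.0, 20.0, 5.0, 15.0]

sens_w, spec_w = weighted_sensitivity_specificity(y_true, y_pred, weights)
# Setting every weight to 1 recovers the usual unweighted metrics.
sens_u, spec_u = weighted_sensitivity_specificity(y_true, y_pred, np.ones(5))

print(f"weighted:   sensitivity={sens_w:.3f}, specificity={spec_w:.3f}")
print(f"unweighted: sensitivity={sens_u:.3f}, specificity={spec_u:.3f}")
```

In this toy example the missed positive case carries a large weight, so the weighted sensitivity is lower than the unweighted one. That is the kind of gap between sample and population performance that the summary attributes to ignoring the design.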

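Main results:

Unweighted metrics computed on the sampled test data may misrepresent performance in the population.
Weighted metrics appropriately adjust for the complex sampling design and provide accurate population estimates.
The benefits of weighted metrics persist even when models are trained with upsampling to address class imbalance.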

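Conclusions:

Survey weights are essential for accurately evaluating predictive model performance on complex survey data.
Weighted metrics provide a more reliable assessment of how well a model generalizes to the target population.
Researchers should adopt weighted metrics when evaluating models trained or tested on complex survey datasets.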