
Updated: Jul 14, 2025


Sequential Bayesian Ability Estimation Applied to Mixed-Format Item Tests

Jiawei Xiong¹, Allan S Cohen², Xinhui Maggie Xiong³

¹Pearson, Athens, GA, USA.

Applied psychological measurement

October 9, 2023

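Summary

This summary is machine-generated.

Compared with the traditional concurrent Bayesian (CB) method, the new sequential Bayesian (SB) method provides more accurate ability estimates for mixed-format tests, especially with smaller sample sizes. The approach improves overall assessment reliability.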

Keywords:

Bayesian, EAP, ability estimation, mixed-format data, prior


Scientific fields:

Educational measurement
Psychometrics
Statistical modeling

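Background:

Large-scale assessments often use mixed-format items, combining multiple-choice (MC) and constructed-response (CR) formats.
Analyzing mixed-format items simultaneously may not yield optimal ability estimates.
Existing methods, such as concurrent Bayesian (CB) calibration, may be limited in how accurately they estimate examinee ability.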

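Purpose of the study:

Explore a two-step sequential Bayesian (SB) analysis method for mixed-format item response models.
Compare the accuracy and reliability of the SB method with the traditional concurrent Bayesian (CB) method (EAPsum).
Evaluate the performance of the SB method across several factors, including sample size and test length.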

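Main methods:

Developed and applied a two-step sequential Bayesian (SB) method that integrates ability estimates from the MC and CR items.
Used individual-level, sample-dependent prior distributions estimated from the MC items to obtain the posterior ability estimates.
Conducted simulation studies to evaluate parameter recovery and to compare SB with CB (EAPsum) under different conditions.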
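
The two-step idea described above can be illustrated with a minimal sketch. The summary does not name the item response models or the estimation details, so everything specific here is an assumption made for illustration: a 2PL model for MC items, a graded response model for CR items, item parameters treated as known (the values are hypothetical), a standard normal starting prior, and a fixed grid for numerical integration. The function names are also hypothetical. The sketch shows only how the posterior from the MC items can serve as an examinee-specific prior for the CR items; it does not reproduce the paper's calibration or simulation design.

```python
import numpy as np

def p_2pl(theta, a, b):
    """2PL probability of a correct response at each grid point (assumed MC model)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def grm_category_probs(theta, a, thresholds):
    """Graded response model (assumed CR model): P(X = k | theta), k = 0..K."""
    # Cumulative curves P(X >= k), padded with 1 for k = 0 and 0 for k = K + 1.
    cum = [np.ones_like(theta)]
    cum += [p_2pl(theta, a, b) for b in thresholds]
    cum.append(np.zeros_like(theta))
    cum = np.array(cum)
    return cum[:-1] - cum[1:]  # shape (K + 1, len(theta))

def normalize(w):
    """Rescale non-negative weights so they sum to 1 over the grid."""
    return w / w.sum()

def sequential_eap(mc_resp, mc_items, cr_resp, cr_items, grid=None):
    """Two-step EAP: MC posterior becomes the prior for the CR items."""
    if grid is None:
        grid = np.linspace(-4.0, 4.0, 161)
    # Starting prior: standard normal evaluated on the grid (assumption).
    prior = normalize(np.exp(-0.5 * grid**2))

    # Step 1: update the prior with the MC responses.
    like_mc = np.ones_like(grid)
    for u, (a, b) in zip(mc_resp, mc_items):
        p = p_2pl(grid, a, b)
        like_mc *= p if u == 1 else (1.0 - p)
    post_mc = normalize(prior * like_mc)

    # Step 2: use the MC posterior as an individual-level prior for the CR items.
    like_cr = np.ones_like(grid)
    for x, (a, thresholds) in zip(cr_resp, cr_items):
        like_cr *= grm_category_probs(grid, a, thresholds)[x]
    post = normalize(post_mc * like_cr)

    eap = float((grid * post).sum())                       # posterior mean
    sd = float(np.sqrt(((grid - eap) ** 2 * post).sum()))  # posterior SD
    return eap, sd, post_mc, post

# Hypothetical item parameters: (a, b) for MC items, (a, ordered thresholds) for CR items.
mc_items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0)]
cr_items = [(1.1, [-0.5, 0.5]), (0.9, [0.0, 1.0])]

eap, sd, post_mc, post = sequential_eap([1, 1, 0], mc_items, [2, 1], cr_items)
print(f"EAP ability estimate: {eap:.3f}, posterior SD: {sd:.3f}")
```

The grid-based update keeps the example short and dependency-light; a real analysis would also need to estimate the item parameters, which this sketch sidesteps by treating them as given.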

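Main results:

The SB method estimated ability more accurately and more reliably than the CB method, especially with small sample sizes (N = 150, 500).
Both methods showed comparable recovery of multiple-choice item parameters.
The CB method was slightly better than the SB method at recovering constructed-response item parameters, although the SB method's posterior ability estimates showed higher reliability in the empirical example.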

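Conclusions:

Compared with the concurrent Bayesian (CB) method, the sequential Bayesian (SB) method provides more accurate and reliable ability estimates in mixed-format assessments.
The advantage of the SB method is especially clear when sample sizes are limited.
The proposed SB method yields more reliable posterior ability estimates for mixed-format assessments.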