Sequential Bayesian Ability Estimation Applied to Mixed-Format Item Tests.

Jiawei Xiong¹, Allan S Cohen², Xinhui Maggie Xiong³

¹Pearson, Athens, GA, USA.

Applied Psychological Measurement

October 9, 2023

Summary

This summary is machine-generated.

A new sequential Bayesian (SB) method offers more accurate ability estimation for mixed-format tests compared to traditional concurrent Bayesian (CB) methods, particularly with smaller sample sizes. This approach improves overall assessment reliability.

Keywords:

Bayesian EAP, ability estimation, mixed-format data, prior

Area of Science:

Educational Measurement
Psychometrics
Statistical Modeling

Background:

Large-scale assessments frequently employ mixed-format items, combining multiple-choice (MC) and constructed-response (CR) formats.
Simultaneous analysis of mixed-format items may not yield optimal ability estimates.
Existing methods like concurrent Bayesian (CB) calibration may have limitations in accurately estimating examinee abilities.

Purpose of the Study:

To explore a two-step sequential Bayesian (SB) analytic method for mixed item response models.
To compare the accuracy and reliability of the SB method against a traditional concurrent Bayesian (CB) method (EAPsum).
To evaluate the performance of the SB method across various factors including sample size and test length.

Main Methods:

Developed and applied a two-step sequential Bayesian (SB) method integrating ability estimates from MC and CR items.
Utilized individual-level sample-dependent prior distributions estimated from MC items for posterior ability estimation.
Conducted simulation studies to assess parameter recovery and compared SB with CB (EAPsum) under different conditions.
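
The two-step idea described above can be sketched in a few lines. This is only an illustration, not the authors' implementation: the summary does not name the item response models, so the sketch assumes a 2PL model for MC items, a generalized partial credit model for CR items, known item parameters, a fixed quadrature grid, and a normal individual prior built from the MC posterior mean and SD. All function names and parameter values are hypothetical.

```python
import numpy as np

# Quadrature grid over the latent ability scale (assumed range and resolution).
THETA = np.linspace(-4.0, 4.0, 161)


def normal_pdf(x, mean, sd):
    """Normal density, used for the population prior and the individual prior."""
    z = (x - mean) / sd
    return np.exp(-0.5 * z * z) / (sd * np.sqrt(2.0 * np.pi))


def mc_likelihood(responses, a, b):
    """2PL likelihood of 0/1 responses at every grid point (assumed MC model)."""
    p = 1.0 / (1.0 + np.exp(-a * (THETA[:, None] - b)))  # shape (n_grid, n_items)
    return np.prod(np.where(responses == 1, p, 1.0 - p), axis=1)


def cr_likelihood(scores, a, steps):
    """Generalized partial credit likelihood at every grid point (assumed CR model)."""
    like = np.ones_like(THETA)
    for score, a_j, steps_j in zip(scores, a, steps):
        # Category logits are cumulative sums of a_j * (theta - step); category 0 is 0.
        increments = a_j * (THETA[:, None] - np.asarray(steps_j, dtype=float))
        logits = np.concatenate(
            [np.zeros((THETA.size, 1)), np.cumsum(increments, axis=1)], axis=1
        )
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        like *= probs[:, score]
    return like


def eap(likelihood, prior):
    """Posterior mean (EAP) and posterior SD of ability on the grid."""
    post = likelihood * prior
    post /= post.sum()
    mean = float(np.sum(THETA * post))
    sd = float(np.sqrt(np.sum((THETA - mean) ** 2 * post)))
    return mean, sd


def sequential_eap(mc_resp, mc_a, mc_b, cr_scores, cr_a, cr_steps):
    """Two-step estimate: MC items first, then CR items with an individual prior."""
    # Step 1: EAP from MC items under a standard normal population prior.
    mc_mean, mc_sd = eap(mc_likelihood(mc_resp, mc_a, mc_b), normal_pdf(THETA, 0.0, 1.0))
    # Step 2: the MC summary becomes this examinee's prior for the CR items.
    individual_prior = normal_pdf(THETA, mc_mean, mc_sd)
    return eap(cr_likelihood(cr_scores, cr_a, cr_steps), individual_prior)


# Hypothetical examinee: five MC items and two three-category CR items.
mc_resp = np.array([1, 1, 0, 1, 0])
mc_a = np.array([1.2, 0.8, 1.5, 1.0, 0.9])
mc_b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
cr_a = [1.0, 1.1]
cr_steps = [[-0.5, 0.5], [0.0, 1.0]]

theta_hat, theta_sd = sequential_eap(mc_resp, mc_a, mc_b, [2, 1], cr_a, cr_steps)
print(round(theta_hat, 3), round(theta_sd, 3))
```

With known item parameters and an exact posterior carried forward, sequential and concurrent updating would coincide; the sketch differs because step 2 keeps only a normal summary of the MC posterior, which is one plausible reading of an "individual-level" prior.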

Main Results:

The SB method demonstrated more accurate and reliable ability estimation than the CB method, especially with small sample sizes (N = 150 and 500).
Both methods showed comparable recovery for multiple-choice item parameters.
The CB method slightly outperformed the SB method in recovering constructed-response item parameters, though the SB method's posterior ability estimates showed higher reliability in an empirical example.
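
Recovery comparisons of this kind are usually summarized with a few simple statistics. The summary does not say which metrics the authors reported, so the sketch below assumes three common ones (bias, RMSE, and the correlation between true and estimated values); the function name and the simulated data are hypothetical and do not reproduce any result from the study.

```python
import numpy as np


def recovery_metrics(true_values, estimates):
    """Bias, RMSE, and correlation between true and estimated values."""
    true_values = np.asarray(true_values, dtype=float)
    estimates = np.asarray(estimates, dtype=float)
    error = estimates - true_values
    return {
        "bias": float(error.mean()),
        "rmse": float(np.sqrt(np.mean(error ** 2))),
        "correlation": float(np.corrcoef(true_values, estimates)[0, 1]),
    }


# Illustrative data only: two sets of estimates with different noise levels.
rng = np.random.default_rng(0)
true_theta = rng.normal(size=150)
estimates_a = true_theta + rng.normal(scale=0.5, size=150)
estimates_b = true_theta + rng.normal(scale=0.8, size=150)

metrics_a = recovery_metrics(true_theta, estimates_a)
metrics_b = recovery_metrics(true_theta, estimates_b)
print(metrics_a)
print(metrics_b)
```

Lower RMSE and higher correlation indicate better recovery; in a simulation study the same function would be applied to each method's estimates within every sample-size and test-length condition.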

Conclusions:

The sequential Bayesian (SB) method provides a more accurate and reliable approach for ability estimation in mixed-format assessments compared to concurrent Bayesian (CB) methods.
The SB method's advantages are particularly pronounced in scenarios with limited sample sizes.
The proposed SB method offers improved posterior ability estimation reliability for mixed-item assessments.