



Sequential Bayesian Ability Estimation Applied to Mixed-Format Item Tests.

Jiawei Xiong1, Allan S Cohen2, Xinhui Maggie Xiong3

  • 1Pearson, Athens, GA, USA.

Applied Psychological Measurement | October 9, 2023
Summary
This summary is machine-generated.

A new sequential Bayesian (SB) method offers more accurate ability estimation for mixed-format tests compared to traditional concurrent Bayesian (CB) methods, particularly with smaller sample sizes. This approach improves overall assessment reliability.

Keywords:
Bayesian, EAP, ability estimation, mixed-format data, prior


Area of Science:

  • Educational Measurement
  • Psychometrics
  • Statistical Modeling

Background:

  • Large-scale assessments frequently employ mixed-format items, combining multiple-choice (MC) and constructed-response (CR) formats.
  • Simultaneous analysis of mixed-format items may not yield optimal ability estimates.
  • Existing methods like concurrent Bayesian (CB) calibration may have limitations in accurately estimating examinee abilities.

Purpose of the Study:

  • To explore a two-step sequential Bayesian (SB) analytic method for mixed item response models.
  • To compare the accuracy and reliability of the SB method against a traditional concurrent Bayesian (CB) method (EAPsum).
  • To evaluate the performance of the SB method across various factors including sample size and test length.

Main Methods:

  • Developed and applied a two-step sequential Bayesian (SB) method integrating ability estimates from MC and CR items.
  • Used individual-level, sample-dependent prior distributions, estimated from the MC items, as the priors for posterior ability estimation.
  • Conducted simulation studies to assess parameter recovery and compared SB with CB (EAPsum) under different conditions.
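
The two-step idea in the methods above can be sketched as a grid-based EAP update. This is a minimal illustration and not the authors' implementation: the 2PL model for MC items, the graded response model for CR items, the fixed item parameters, and every name below (`eap`, `mc_likelihood`, `cr_likelihood`, the example responses) are assumptions made for the sketch. Step 1 forms each examinee's posterior from the MC items under a standard normal prior; step 2 reuses that posterior as the prior for the CR items.

```python
import numpy as np

def eap(theta, prior, likelihood):
    """Return posterior weights, EAP estimate, and posterior SD on a grid."""
    post = prior * likelihood
    post = post / post.sum()
    mean = float(np.sum(theta * post))
    sd = float(np.sqrt(np.sum((theta - mean) ** 2 * post)))
    return post, mean, sd

def mc_likelihood(theta, responses, a, b):
    """2PL likelihood of 0/1 responses at each grid point (assumed model)."""
    p = 1.0 / (1.0 + np.exp(-a[:, None] * (theta[None, :] - b[:, None])))
    r = responses[:, None]
    return np.prod(np.where(r == 1, p, 1.0 - p), axis=0)

def cr_likelihood(theta, scores, a, thresholds):
    """Graded-response likelihood of ordinal scores (assumed model)."""
    like = np.ones_like(theta)
    for x, a_j, b_j in zip(scores, a, thresholds):
        # Rows are P(X >= k) for k = 1..K, padded with 1 (k = 0) and 0 (k = K + 1).
        cum = 1.0 / (1.0 + np.exp(-a_j * (theta[None, :] - np.asarray(b_j)[:, None])))
        cum = np.vstack([np.ones_like(theta), cum, np.zeros_like(theta)])
        like *= cum[x] - cum[x + 1]
    return like

# Ability grid and a standard normal starting prior (normalized on the grid).
theta = np.linspace(-4, 4, 81)
prior0 = np.exp(-0.5 * theta ** 2)
prior0 /= prior0.sum()

# Hypothetical item parameters and one examinee's responses.
mc_a = np.array([1.0, 1.2, 0.8, 1.5])
mc_b = np.array([-1.0, 0.0, 0.5, 1.0])
mc_resp = np.array([1, 1, 0, 0])
cr_a = np.array([1.1, 0.9])
cr_b = [[-0.5, 0.5], [0.0, 1.0]]   # ordered thresholds per CR item
cr_scores = np.array([2, 1])       # scores in 0..2

# Step 1: posterior from MC items only; it becomes the individual-level prior.
post_mc, eap_mc, sd_mc = eap(theta, prior0, mc_likelihood(theta, mc_resp, mc_a, mc_b))

# Step 2: update that prior with the CR items to get the final ability estimate.
post_sb, eap_sb, sd_sb = eap(theta, post_mc, cr_likelihood(theta, cr_scores, cr_a, cr_b))

print(f"EAP after MC items: {eap_mc:.3f} (SD {sd_mc:.3f})")
print(f"EAP after CR items: {eap_sb:.3f} (SD {sd_sb:.3f})")
```

With item parameters held fixed, as in this sketch, the two-step update gives the same posterior as a single update on all items at once. Any difference between sequential and concurrent estimation therefore has to come from how item parameters and priors are estimated from the sample, which this sketch does not model.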

Main Results:

  • The SB method demonstrated more accurate and reliable ability estimation than the CB method, especially at the smaller sample sizes (N = 150 and N = 500).
  • Both methods showed comparable recovery for multiple-choice item parameters.
  • The CB method slightly outperformed the SB method in recovering constructed-response item parameters. In an empirical example, however, the SB method's posterior ability estimates showed higher reliability.
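
Recovery comparisons like those above are usually summarized with a few simple statistics. The page does not say which criteria the study used, so bias, RMSE, and correlation are assumed here as common choices; the function name `recovery_metrics` and the simulated values are illustrative only and are not the study's data.

```python
import numpy as np

def recovery_metrics(true_values, estimates):
    """Summarize how closely estimates recover known generating values."""
    true_values = np.asarray(true_values, dtype=float)
    estimates = np.asarray(estimates, dtype=float)
    error = estimates - true_values
    return {
        "bias": float(error.mean()),                   # average signed error
        "rmse": float(np.sqrt(np.mean(error ** 2))),   # root mean squared error
        "correlation": float(np.corrcoef(true_values, estimates)[0, 1]),
    }

# Illustrative simulated abilities for N = 150 examinees (not the study's data).
rng = np.random.default_rng(0)
true_theta = rng.normal(size=150)
estimated_theta = true_theta + rng.normal(scale=0.3, size=150)

metrics = recovery_metrics(true_theta, estimated_theta)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a simulation study, the same function would be applied to each method's estimates within each condition (for example, each sample size and test length), with lower RMSE and absolute bias indicating better recovery.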

Conclusions:

  • The sequential Bayesian (SB) method provides a more accurate and reliable approach for ability estimation in mixed-format assessments compared to concurrent Bayesian (CB) methods.
  • The SB method's advantages are particularly pronounced in scenarios with limited sample sizes.
  • The proposed SB method offers improved posterior ability estimation reliability for mixed-item assessments.