Large Sample Confidence Intervals for Item Response Theory Reliability Coefficients.

Björn Andersson¹, Tao Xin¹

  • ¹Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University, Beijing, China.

Educational and Psychological Measurement | May 26, 2018
Summary
This summary is machine-generated.

This study derives standard errors for item response theory (IRT) reliability estimators, enabling confidence intervals. Simulations show test reliability intervals perform well, while marginal reliability intervals exhibit bias in small samples.

Keywords:
asymptotic variance, confidence intervals, item response theory, reliability

Area of Science:

  • Psychometrics
  • Statistical Modeling

Background:

  • Item response theory (IRT) applications often report reliability estimates for ability or sum scores.
  • However, the literature lacks analytical expressions for the standard errors of reliability coefficient estimators, which makes it hard to report the sampling variability of reliability estimates.

Purpose of the Study:

  • To derive asymptotic variances for IRT marginal and test reliability coefficient estimators.
  • To construct confidence intervals for these reliability coefficients.

Main Methods:

  • Derivation of asymptotic variances for dichotomous and polytomous IRT models.
  • Assumption of asymptotically normally distributed item parameter estimators.
  • Construction of confidence intervals using derived variances.
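
The steps above can be sketched generically in Python. This is a minimal illustration, not the authors' derivation: it assumes the reliability coefficient is a smooth function of the item parameters, so that a first-order (delta method) approximation gives its asymptotic variance, and it builds a Wald-type interval from the result. The function names, the gradient, the covariance matrix, and the point estimate of 0.85 are all hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist


def delta_method_se(gradient, covariance):
    """Approximate standard error sqrt(g' * Sigma * g).

    gradient:   partial derivatives of the coefficient with respect to
                each item parameter (hypothetical values here).
    covariance: asymptotic covariance matrix of the item parameter
                estimators (hypothetical values here).
    """
    k = len(gradient)
    variance = sum(
        gradient[i] * covariance[i][j] * gradient[j]
        for i in range(k)
        for j in range(k)
    )
    return math.sqrt(variance)


def wald_interval(estimate, std_error, level=0.95):
    """Large-sample interval: estimate +/- z * standard error."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical inputs, for illustration only.
gradient = [0.5, -0.2]
covariance = [[0.004, 0.001],
              [0.001, 0.009]]

std_error = delta_method_se(gradient, covariance)
lower, upper = wald_interval(0.85, std_error)
print(f"SE = {std_error:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

In a real application the gradient and covariance matrix would come from the fitted IRT model. The interval is only as good as the normal approximation behind it, which is why coverage in finite samples has to be checked separately.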

Main Results:

  • Confidence intervals for test reliability coefficients demonstrate good coverage in finite samples across various settings (e.g., generalized partial credit model, three-parameter logistic model).
  • The marginal reliability coefficient estimator shows finite-sample bias, so its confidence intervals fall short of the nominal coverage level when sample sizes are small.
  • This bias diminishes as sample size increases.
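
Coverage claims of this kind are usually checked by simulation: generate many data sets with a known true value, build an interval from each, and count how often the interval contains the truth. The sketch below shows that procedure on a deliberately simple stand-in, a Wald interval for the mean of skewed exponential data, not on an IRT model. Under-coverage in this toy case comes from skewness and a noisy standard error in small samples rather than from the estimator bias reported for marginal reliability, so it illustrates the method of checking coverage only. The function name and all settings are assumptions.

```python
import random
from statistics import NormalDist, mean, stdev


def empirical_coverage(sample_size, true_mean=1.0, level=0.95,
                       replications=2000, seed=0):
    """Share of simulated Wald intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    hits = 0
    for _ in range(replications):
        # Toy data: exponential draws with known mean (not an IRT model).
        data = [rng.expovariate(1.0 / true_mean) for _ in range(sample_size)]
        estimate = mean(data)
        std_error = stdev(data) / sample_size ** 0.5
        if estimate - z * std_error <= true_mean <= estimate + z * std_error:
            hits += 1
    return hits / replications


coverage_small = empirical_coverage(sample_size=5)
coverage_large = empirical_coverage(sample_size=200)
print(f"n=5: {coverage_small:.3f}, n=200: {coverage_large:.3f}")
```

Comparing the two printed proportions against the nominal 0.95 shows how far the small-sample interval falls short and how the gap narrows as the sample grows.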

Conclusions:

  • The derived methods provide a means to quantify uncertainty in IRT reliability estimates.
  • Confidence intervals for test reliability achieve good coverage in practice, even with moderate sample sizes.
  • Confidence intervals for marginal reliability need larger samples to reach nominal coverage because of finite-sample bias in the estimator.