



Combining Propensity Scores and Common Items for Test Score Equating

Inga Laukaityte1, Gabriel Wallin2, Marie Wiberg3

  • 1Department of Applied Educational Science, Umeå University, Sweden.

Applied Psychological Measurement | August 4, 2025
PubMed
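Summary
This summary is machine-generated.

This study introduces a new statistical method for fair test score comparison. Combining propensity scores with common-item data improves accuracy and reduces bias in educational testing.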

Keywords:
academic admissions; educational testing; equating; fairness; non-equivalent groups with anchor test (NEAT) design


Scientific fields:

  • Statistics
  • Educational measurement
  • Psychometrics

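Background:

  • Ensuring that scores are comparable across test forms and groups is a key challenge in educational testing.
  • Current test score equating methods typically rely either on common items or on the assumption that the groups are similar.
  • New methods are needed to improve the fairness and accuracy of test score comparisons.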

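Purpose of the study:

  • Develop and evaluate a new statistical method for test score equating.
  • Combine propensity scores based on background covariates with common-item information to improve score comparability.
  • Assess the performance of this combined approach through empirical and simulation studies.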

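Main methods:

  • Used propensity scores derived from test takers' background covariates.
  • Integrated the propensity scores with common-item information using kernel smoothing techniques.
  • Conducted an empirical analysis of a high-stakes college admissions test, together with a simulation study.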
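The two ingredients named in the methods above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' estimator: the summary does not say how the propensity scores were modelled or how they were merged with the common items. The sketch assumes a logistic regression for group membership (fitted by plain gradient ascent) and the standard Gaussian-kernel continuization of a discrete score distribution used in kernel equating. All function names, the synthetic covariate, the score distribution, and the bandwidth `h=0.6` are hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_propensity(covariates, group, lr=0.5, n_iter=500):
    """Fit P(group = 1 | covariates) by logistic regression (gradient ascent).

    Returns the weights as [intercept, coef_1, ..., coef_p].
    """
    n, p = len(covariates), len(covariates[0])
    w = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for x, g in zip(covariates, group):
            err = g - sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * x[j]
        # Step along the average log-likelihood gradient.
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def propensity_scores(w, covariates):
    """Estimated probability of belonging to group 1 for each test taker."""
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
            for x in covariates]


def kernel_cdf(x, scores, probs, h):
    """Gaussian-kernel continuized CDF of a discrete score distribution.

    The rescaling factor `a` keeps the mean and variance of the
    continuized distribution equal to those of the discrete one.
    """
    mu = sum(r * s for s, r in zip(scores, probs))
    var = sum(r * (s - mu) ** 2 for s, r in zip(scores, probs))
    a = math.sqrt(var / (var + h * h))
    total = 0.0
    for s, r in zip(scores, probs):
        z = (x - a * s - (1.0 - a) * mu) / (a * h)
        total += r * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return total


# Synthetic illustration only: one background covariate, two groups.
random.seed(0)
covariates = [[random.gauss(0.0, 1.0)] for _ in range(300)]
group = [1 if random.random() < sigmoid(0.8 * x[0]) else 0 for x in covariates]
weights = fit_propensity(covariates, group)
ps = propensity_scores(weights, covariates)

# Hypothetical symmetric score distribution on a 0-4 scale.
scores = [0, 1, 2, 3, 4]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]
cdf_at_mean = kernel_cdf(2.0, scores, probs, h=0.6)
```

In a full equating procedure, the estimated propensity scores would typically be used to condition or stratify the two groups, and the continuized CDFs of the two test forms would be composed to map scores from one form onto the other. Those steps are left out here because the summary gives no detail on them.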

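Main results:

  • The proposed method, which integrates propensity scores and common-item data, showed smaller standard errors and less bias than using either source alone.
  • Balancing the groups on test-taker covariates was shown to improve the fairness and accuracy of score comparisons.
  • The study highlights the benefit of using all available data to improve score comparability.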
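For readers unfamiliar with how bias and standard error are usually summarized in a simulation study, the following Python sketch shows one common convention: bias is the mean of the replicated estimates minus the true value, and the standard error is the standard deviation of the estimates across replications. The summary does not state which definitions the authors used, so this is an assumption. The function name and the replicate values are made up for illustration and are not results from the study.

```python
import math


def bias_and_se(estimates, true_value):
    """Return (bias, standard error) of replicated estimates of one quantity."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_value
    # Sample standard deviation across replications.
    se = math.sqrt(sum((e - mean_est) ** 2 for e in estimates) / (n - 1))
    return bias, se


# Made-up equated scores at a single score point over five replications.
replicates = [20.4, 19.8, 20.1, 20.3, 19.9]
bias, se = bias_and_se(replicates, true_value=20.0)
```

A method with smaller values of both quantities across the score range would be preferred, which is the sense in which the combined approach is reported to outperform either data source alone.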

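Conclusions:

  • The new method effectively improves test score comparability by integrating propensity scores and common-item information.
  • The approach offers a more robust and accurate way to ensure fairness in educational testing across different groups.
  • Considering all collected data, including background covariates, is essential for improving the precision of test score equating.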