Should candidate scores be adjusted for interviewer stringency or leniency in the multiple mini-interview?

Chris Roberts¹, Imogene Rothnie, Nathan Zoanetti

  • ¹Sydney Medical School-Northern, University of Sydney, Sydney, New South Wales, Australia. christopher.roberts@sydney.edu.au

Medical Education | July 20, 2010
Summary
This summary is machine-generated.

Multi-facet Rasch modelling (MFRM) can identify and adjust for interviewer bias in multiple mini-interview (MMI) scores. This approach offers a fairer assessment by accounting for interviewer stringency and question difficulty, potentially altering candidate rankings.

Area of Science:

  • Medical Education
  • Psychometrics
  • Assessment Science

Background:

  • Candidate scores in multiple mini-interviews (MMI) exhibit significant variation due to interviewer-related factors.
  • Multi-facet Rasch modelling (MFRM) offers a statistical approach to identify and mitigate these sources of measurement error.
  • MFRM can potentially create a fairer assessment model for candidates by adjusting for variability.
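
For orientation, the many-facet Rasch model is commonly written in the log-odds form below. This is the general textbook formulation rather than an equation taken from this summary, and the symbols $B_n$, $D_i$, $C_j$ and $F_k$ are notation assumed here for illustration.

```latex
\[
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
\]
```

Here $P_{nijk}$ is the probability that candidate $n$ receives rating category $k$ from interviewer $j$ on question $i$, $B_n$ is the candidate's ability, $D_i$ the question's difficulty, $C_j$ the interviewer's stringency, and $F_k$ the threshold between categories $k-1$ and $k$. Because every facet sits on the same logit scale, an interviewer's stringency can be separated from the ability of the candidates that interviewer happened to see.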

Purpose of the Study:

  • To apply Multi-facet Rasch modelling (MFRM) to analyze variance in multiple mini-interview (MMI) scores.
  • To quantify the contribution of candidate ability, interviewer stringency, and question difficulty to score variance.
  • To model adjusted candidate rankings based on MFRM analysis.

Main Methods:

  • A variance components analysis was performed using the FACETS software, in line with generalisability theory principles.
  • Fair average scores were calculated to adjust for interviewer stringency/leniency and question difficulty.
  • Candidate rankings were modelled based on these adjusted scores.
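
The idea behind a fair average can be sketched as follows: take a candidate's estimated ability and compute the score that candidate would be expected to receive from an interviewer of average stringency on a question of average difficulty. The code below is a minimal illustration under a rating-scale formulation, not the study's actual procedure. The function names, the 0 to 3 score range, and all numeric values are hypothetical.

```python
import math


def category_probabilities(ability, difficulty, stringency, thresholds):
    """Probabilities of scores 0..len(thresholds) under a rating-scale model.

    All arguments are in logits. `thresholds` holds one step threshold
    per score boundary (hypothetical values in the example below).
    """
    # Log-numerator for score k is the running sum of
    # (ability - difficulty - stringency - threshold) over steps 1..k.
    log_numerators = [0.0]
    for step_threshold in thresholds:
        step = ability - difficulty - stringency - step_threshold
        log_numerators.append(log_numerators[-1] + step)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(v - peak) for v in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(ability, difficulty, stringency, thresholds):
    """Model-expected score for one candidate/question/interviewer combination."""
    probs = category_probabilities(ability, difficulty, stringency, thresholds)
    return sum(score * p for score, p in enumerate(probs))


def fair_average(ability, thresholds, mean_difficulty=0.0, mean_stringency=0.0):
    """Expected score if the candidate met an average interviewer and question."""
    return expected_score(ability, mean_difficulty, mean_stringency, thresholds)


# Hypothetical example: a 0-3 rating scale and a candidate of ability 0.5 logits.
thresholds = [-1.0, 0.0, 1.0]
seen_by_stringent = expected_score(0.5, 0.0, 1.0, thresholds)
adjusted = fair_average(0.5, thresholds)
print(f"expected with stringent interviewer: {seen_by_stringent:.2f}")
print(f"fair average: {adjusted:.2f}")
```

In this sketch the fair average is higher than the score expected from the stringent interviewer, which is the direction of adjustment one would expect when a candidate has been assessed by a harsher-than-average interviewer.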

Main Results:

  • The multi-facet Rasch model showed acceptable fit to data from 207 interviewers.
  • Interviewer stringency/leniency accounted for 8.9% of score variance, and question difficulty for 2.6%.
  • Adjusting scores for these factors resulted in significant ranking changes for 11.5% of candidates, with leniency linked to interview volume.
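
To make the ranking result concrete, the sketch below compares candidate ranks before and after adjustment and reports the share of candidates whose rank moves. It is an illustration only: the scores are invented, and the `min_shift` cut-off is a hypothetical stand-in, since this summary does not state how a significant ranking change was defined.

```python
def ranks(scores):
    """Return 1-based ranks (1 = highest score); ties keep original order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    rank = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        rank[index] = position
    return rank


def share_with_rank_shift(raw, adjusted, min_shift=1):
    """Fraction of candidates whose rank changes by at least `min_shift` places."""
    raw_ranks = ranks(raw)
    adjusted_ranks = ranks(adjusted)
    moved = sum(
        1 for before, after in zip(raw_ranks, adjusted_ranks)
        if abs(before - after) >= min_shift
    )
    return moved / len(raw)


# Hypothetical raw and adjusted scores for four candidates.
raw_scores = [4.2, 3.9, 3.8, 3.1]
adjusted_scores = [4.0, 4.1, 3.7, 3.2]
print(share_with_rank_shift(raw_scores, adjusted_scores))  # prints 0.5
```

In a selection setting the rank changes that matter most are those that cross the admission cut-off, so a practical analysis would likely focus on candidates near that boundary rather than on every shift.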

Conclusions:

  • Interviewer stringency/leniency is a stable and significant factor influencing MMI scores.
  • MFRM provides a robust method for generating candidate scores that are adjusted for interviewer and question-specific effects.
  • This adjusted scoring approach promotes fairer candidate evaluation in selection processes.