A method for identifying extreme OSCE examiners.

Ilona Bartman¹, Sydney Smee, Marguerite Roy

  • ¹ Evaluation Bureau, Medical Council of Canada, Ottawa, Ontario K1G 5A2, Canada. ilona@mcc.ca

The Clinical Teacher | January 9, 2013
Summary
This summary is machine-generated.

A simple method identifies extreme raters in performance assessments, minimizing rater bias in high-stakes decisions. This quality assurance process ensures fair evaluation by managing leniency and harshness in scoring.

Area of Science:

  • Medical Education
  • Assessment and Evaluation
  • Psychometrics

Background:

  • Performance assessments are susceptible to rater effects, such as leniency or harshness, potentially compromising the validity of high-stakes decisions.
  • Managing rater effects is crucial for accurate inferences from performance ratings.
  • A straightforward method for identifying extreme raters has been developed for quality assurance in Objective Structured Clinical Examinations (OSCEs).

Purpose of the Study:

  • To introduce and validate a simple, statistically accessible method for detecting extreme raters in performance assessments.
  • To ensure the reliability and fairness of evaluations by mitigating rater bias.
  • To provide a quality assurance tool applicable to various assessment formats relying on human judgment.

Main Methods:

  • Identified extreme raters by comparing each rater's mean score to the overall mean and flagging those more than three standard deviations away.
  • Accounted for station effects by comparing each extreme rater's score distribution to the overall distribution for the same station.
  • Ruled out cohort effects by examining the cohort of candidates assessed by each extreme rater.
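
The first screening step above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `find_extreme_raters`, the data layout, and the example scores are all hypothetical. It also assumes that the "three standard deviations" refers to the spread of the rater means themselves, which the summary does not specify. The station and cohort checks are not shown.

```python
from statistics import mean, stdev


def find_extreme_raters(scores_by_rater, threshold=3.0):
    """Flag raters whose mean score is more than `threshold` standard
    deviations from the overall mean of all rater means.

    Assumption: the SD is taken over rater means (sample SD).
    Returns a dict mapping each flagged rater to a signed z-score
    (negative = harsher than average, positive = more lenient).
    """
    rater_means = {r: mean(s) for r, s in scores_by_rater.items()}
    if len(rater_means) < 2:
        return {}  # a spread cannot be estimated from a single rater

    values = list(rater_means.values())
    overall_mean = mean(values)
    spread = stdev(values)
    if spread == 0:
        return {}  # all raters have identical means; nobody is extreme

    z_scores = {r: (m - overall_mean) / spread for r, m in rater_means.items()}
    return {r: z for r, z in z_scores.items() if abs(z) > threshold}


# Hypothetical data: 29 raters scoring close to 7 on a 10-point scale,
# plus one rater who scores far lower than everyone else.
scores = {f"rater_{i:02d}": [6 + i % 3, 7, 8] for i in range(29)}
scores["rater_harsh"] = [2, 3, 2]

flagged = find_extreme_raters(scores)
print(flagged)  # only "rater_harsh" is flagged, with a negative z-score
```

A rule like this needs a reasonably large pool of raters: with a sample standard deviation, the largest z-score any single rater can reach among n raters is (n - 1)/√n, so with ten or fewer raters nobody can ever exceed three standard deviations.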

Main Results:

  • Fewer than 0.3% of more than 3,000 raters were identified as extreme using the defined criteria.
  • Rater performance is continuously monitored, and the impact of extreme raters on candidate results is assessed.
  • Interventions with extreme raters include performance review and, if necessary, removal from future assessment participation.
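
For context, the reported proportion is close to what a three-standard-deviation rule would flag by chance if rater means were approximately normally distributed. That normality assumption is ours, not something the summary states, and Z below denotes a standard normal variable:

```latex
P(|Z| > 3) = 2\,\bigl(1 - \Phi(3)\bigr) \approx 2\,(1 - 0.99865) = 0.0027 \approx 0.27\%
```

As a rough scale, 0.3% of exactly 3,000 raters would be 9 raters. The summary gives only "over 3000", so the actual number of flagged raters cannot be recovered from it.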

Conclusions:

  • The developed method effectively identifies extreme raters with minimal statistical complexity.
  • Regular monitoring and intervention strategies help manage rater effects and improve assessment quality.
  • Ongoing data collection will inform future improvements in addressing and mitigating extreme rater performance.