Reliability of the long case.

Tim J Wilkinson¹, Peter J Campbell, Stephen J Judd

  • ¹ Department of Medicine, Christchurch School of Medicine and Health Sciences, University of Otago, Dunedin, New Zealand. tim.wilkinson@otago.ac.nz

Medical Education | August 22, 2008
Summary
This summary is machine-generated.

Long cases, used alone within practical time limits, are not reliable enough for summative medical assessment. Supplementing them with short cases or improving examiner training has minimal impact on reliability, which suggests that longer examination times or workplace-based assessments are needed to evaluate clinical competence dependably.

Area of Science:

  • Medical Education
  • Clinical Competence Assessment
  • Psychometrics in Education

Background:

  • Summative assessment of clinical competence using long cases faces challenges due to reliability concerns.
  • The effectiveness of long cases as a sole assessment tool is questioned in high-stakes examinations.

Purpose of the Study:

  • To investigate the reliability of long cases in summative clinical competence assessments.
  • To determine how supplementing long cases with short cases impacts overall assessment reliability.

Main Methods:

  • Statistical analysis of Royal Australasian College of Physicians examinations from 2005 and 2006.
  • Examination of variance sources including candidate ability, case difficulty, and inter-examiner differences.
  • Comparison of reliability between long cases and combinations of short and long cases.

Main Results:

  • Candidate ability explained 38% of the variation in long case data in 2006, with candidate × case and candidate × examiner interactions also significant.
  • While a single short case is less reliable than a long case, three short cases offer comparable reliability to one long case when examiner time is considered.
  • Achieving a dependability greater than 0.7 requires 4 to 5 hours of testing time, regardless of the combination of short and long cases.
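
The link between a variance share and a dependability target can be illustrated with a small calculation. The sketch below is not the authors' analysis: it takes only the 38% candidate-ability figure reported for the 2006 long case data and assumes, as a simplification, that all remaining variance behaves as error that averages out across independent cases. The function names `dependability` and `cases_needed` are hypothetical, chosen for this example.

```python
def dependability(candidate_share: float, n_cases: int) -> float:
    """Dependability-style coefficient for the mean of n_cases cases.

    Assumption (not from the study): every source of variance other than
    candidate ability is error, and that error shrinks in proportion to
    the number of independent cases.
    """
    error_share = 1.0 - candidate_share
    return candidate_share / (candidate_share + error_share / n_cases)


def cases_needed(candidate_share: float, target: float, max_cases: int = 100) -> int:
    """Smallest number of cases whose coefficient exceeds the target."""
    for n in range(1, max_cases + 1):
        if dependability(candidate_share, n) > target:
            return n
    raise ValueError("target not reached within max_cases")


if __name__ == "__main__":
    share = 0.38  # candidate ability share reported for 2006 long cases
    for n in range(1, 6):
        print(n, round(dependability(share, n), 3))
    print(cases_needed(share, 0.7))  # 4
```

Under these assumptions a single case gives a coefficient equal to the candidate share itself, and four cases are needed to exceed 0.7. The result counts cases, not hours; converting it to testing time would need the duration of each case, which the summary does not state.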

Conclusions:

  • Optimizing long cases for reliability is possible, but time constraints limit their use as the sole summative assessment method.
  • Improving examiner training or case selection, or making greater use of short cases, has minimal impact on reliability.
  • Enhancing reliability necessitates increasing examination duration or incorporating additional assessment methods like workplace-based assessments.