Reliability of the long case.

Tim J Wilkinson¹, Peter J Campbell, Stephen J Judd

¹Department of Medicine, Christchurch School of Medicine and Health Sciences, University of Otago, Dunedin, New Zealand. tim.wilkinson@otago.ac.nz

Medical Education

August 22, 2008

Summary

This summary is machine-generated.

Long cases alone are not reliable enough for summative assessment of clinical competence. Adding short cases or more examiner training has minimal effect on reliability, so dependable evaluation requires longer examination times or additional methods such as workplace-based assessments.

Area of Science:

Medical Education
Clinical Competence Assessment
Psychometrics in Education

Background:

Summative assessment of clinical competence using long cases faces challenges due to reliability concerns.
The effectiveness of long cases as a sole assessment tool is questioned in high-stakes examinations.

Purpose of the Study:

To investigate the reliability of long cases in summative clinical competence assessments.
To determine how supplementing long cases with short cases impacts overall assessment reliability.

Main Methods:

Statistical analysis of Royal Australasian College of Physicians examinations from 2005 and 2006.
Examination of variance sources including candidate ability, case difficulty, and inter-examiner differences.
Comparison of reliability between long cases and combinations of short and long cases.

Main Results:

Candidate ability explained 38% of the variation in long case data in 2006, with candidate × case and candidate × examiner interactions also significant.
While a single short case is less reliable than a long case, three short cases offer comparable reliability to one long case when examiner time is considered.
Achieving a dependability of > 0.7 requires 4-5 hours of testing time, regardless of the combination of short and long cases.
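The link between the 38% figure and the testing time needed can be illustrated with a rough calculation. The sketch below is not the study's method: it assumes the 38% of variance explained by candidate ability can be treated as the reliability of a single long case, and it uses the standard Spearman-Brown projection in place of the full variance-component analysis. The function names are hypothetical.

```python
import math


def projected_reliability(single_case_reliability: float, n_cases: int) -> float:
    """Spearman-Brown projection: reliability of the mean of n parallel cases."""
    r = single_case_reliability
    return n_cases * r / (1 + (n_cases - 1) * r)


def cases_needed(single_case_reliability: float, target: float) -> int:
    """Smallest whole number of cases whose projected reliability reaches target."""
    r = single_case_reliability
    n = target * (1 - r) / (r * (1 - target))
    # Small tolerance so a value that is an integer up to rounding is not bumped up.
    return math.ceil(n - 1e-9)


# Assumption: 0.38 (candidate ability's share of variance in the 2006 data)
# stands in for the reliability of one long case.
single_long_case = 0.38

for n in range(1, 6):
    print(n, round(projected_reliability(single_long_case, n), 2))

print("cases needed for 0.7:", cases_needed(single_long_case, 0.7))
```

Under these assumptions the projection first reaches 0.7 at four long cases. If each long case took roughly an hour (an assumption, not stated in this summary), that would be in the region of the 4-5 hours reported above.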

Conclusions:

Optimizing long cases for reliability is possible, but time constraints limit their use as the sole summative assessment method.
Improving examiner training or case selection, or making greater use of short cases, has minimal impact on reliability.
Enhancing reliability necessitates increasing examination duration or incorporating additional assessment methods like workplace-based assessments.