A new permutation-based method for assessing agreement between two observers making replicated binary readings.

Yi Pan¹, Michael Haber, Huiman X Barnhart

¹Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA.

Statistics in Medicine

March 25, 2011

Summary

This summary is machine-generated.

A new permutation-based coefficient measures agreement between two observers for binary data. It compares observed disagreement to expected disagreement under individual equivalence, offering a novel approach to assess inter-observer reliability.

Area of Science:

Statistics
Biostatistics
Medical Imaging Analysis

Background:

Assessing agreement between observers is crucial in medical research.
Existing methods like kappa may have limitations in certain scenarios.
Binary observations require specific statistical approaches for agreement assessment.

Purpose of the Study:

Introduce a novel permutation-based coefficient for observer agreement.
Evaluate the performance of the new coefficient for binary data.
Compare the new coefficient with existing measures like kappa.

Main Methods:

Developed a new coefficient based on comparing observed and expected disagreement.
Utilized a permutation-based approach under the hypothesis of individual equivalence.
Derived methods for estimating the coefficient and its standard error.
Conducted simulation studies to validate the coefficient and its standard error.
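
The idea of comparing observed disagreement with its expected value under individual equivalence can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' estimator: the function names, the data layout (one list of 0/1 replicate readings per subject per observer), and the final form 1 - D_obs / D_exp are all assumptions made here, and no standard error is computed. Under individual equivalence it should not matter which observer made which reading on a subject, so the expected disagreement is taken as the average over all reassignments of that subject's pooled readings to the two observers, which has a closed form.

```python
from typing import Sequence


def observed_disagreement(x: Sequence[int], y: Sequence[int]) -> float:
    """Share of (observer X reading, observer Y reading) pairs that differ
    on one subject. x and y hold that subject's 0/1 replicate readings."""
    a, b = sum(x), sum(y)        # number of 1s from each observer
    kx, ky = len(x), len(y)      # number of replicates from each observer
    return (a * (ky - b) + (kx - a) * b) / (kx * ky)


def expected_disagreement(x: Sequence[int], y: Sequence[int]) -> float:
    """Average between-observer disagreement over all within-subject
    permutations of the pooled readings (assumed reference value).
    Equals the chance that two distinct pooled readings differ."""
    n = len(x) + len(y)          # pooled number of readings
    s = sum(x) + sum(y)          # pooled number of 1s
    return 2 * s * (n - s) / (n * (n - 1))


def agreement_sketch(readings_x: Sequence[Sequence[int]],
                     readings_y: Sequence[Sequence[int]]) -> float:
    """Hypothetical coefficient: 1 - (mean observed / mean expected)."""
    pairs = list(zip(readings_x, readings_y))
    d_obs = sum(observed_disagreement(x, y) for x, y in pairs) / len(pairs)
    d_exp = sum(expected_disagreement(x, y) for x, y in pairs) / len(pairs)
    if d_exp == 0:
        # Every subject's readings are identical, so the ratio is undefined.
        raise ValueError("expected disagreement is zero; coefficient undefined")
    return 1 - d_obs / d_exp


# Illustrative toy data only: three subjects, two replicates per observer.
obs_x = [[1, 1], [0, 0], [1, 0]]
obs_y = [[1, 1], [0, 0], [0, 1]]
print(agreement_sketch(obs_x, obs_y))
```

On this toy data the sketch returns roughly 0.25. A value near 0 means the observers disagree with each other about as much as would be expected if their readings were interchangeable within each subject; negative values mean they disagree more than that.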

Main Results:

The new coefficient effectively assesses agreement for binary observations.
Simulation studies confirmed the validity of the coefficient and its standard error.
The new coefficient provides an alternative to kappa and the coefficient of individual agreement.

Conclusions:

The proposed permutation-based coefficient offers an alternative way to evaluate agreement between two observers.
The method applies to replicated binary readings, with individual equivalence serving as the reference hypothesis against which observed disagreement is judged.
The approach was successfully illustrated using mammogram evaluation data.