Evaluation measure for group-based record linkage.

C Nanayakkara¹, P Christen¹, T Ranbaduge¹

  • ¹Research School of Computer Science, The Australian National University, Canberra, ACT 2601, Australia.

International Journal of Population Data Science | June 7, 2021
Summary
This summary is machine-generated.

Traditional record linkage evaluation measures like precision and recall are unsuitable for assessing grouped records. This study introduces a novel method for evaluating clustering quality in group-based record linkage, offering unambiguous and detailed insights.

Area of Science:

  • Data Science
  • Information Science
  • Computer Science

Background:

  • Robust evaluation of record linkage techniques is crucial for accurate data integration.
  • Existing measures (e.g., precision, recall) are inadequate for evaluating the quality of linked record groups.
  • Group-based record linkage requires specialized methods for assessing clustering performance.
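
For context, the traditional measures mentioned above are commonly computed over pairs of records. The following is a minimal sketch of pairwise precision and recall for clustered records; the function names and the toy record identifiers are illustrative assumptions, not taken from the study.

```python
from itertools import combinations


def cluster_pairs(clusters):
    """Return the set of unordered record pairs that share a cluster."""
    pairs = set()
    for cluster in clusters:
        # Sort so each pair has one canonical orientation.
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs


def pairwise_precision_recall(predicted, truth):
    """Pairwise precision and recall of predicted clusters against ground truth."""
    pred_pairs = cluster_pairs(predicted)
    true_pairs = cluster_pairs(truth)
    true_positives = len(pred_pairs & true_pairs)
    # Convention assumed here: an empty pair set scores 1.0.
    precision = true_positives / len(pred_pairs) if pred_pairs else 1.0
    recall = true_positives / len(true_pairs) if true_pairs else 1.0
    return precision, recall


# Toy data (hypothetical): record r3 is placed in the wrong group.
truth = [{"r1", "r2", "r3"}, {"r4", "r5"}]
predicted = [{"r1", "r2"}, {"r3", "r4", "r5"}]
precision, recall = pairwise_precision_recall(predicted, truth)
print(precision, recall)
```

Both scores summarize the whole clustering in a single number each, so they do not say which records were misplaced or how, which is the kind of detail a record-level evaluation aims to expose.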

Purpose of the Study:

  • To highlight the limitations of traditional evaluation metrics in group-based record linkage.
  • To propose and validate a novel method for evaluating the quality of record clusters.
  • To provide a more accurate and unambiguous assessment of linkage technique performance.

Main Methods:

  • Developed a novel evaluation method to assess record allocation within predicted clusters against ground-truth data.
  • Mapped predicted clusters to ground-truth clusters to categorize individual record assignments.
  • Assigned each record to one of seven categories that reflect how accurately the linkage technique grouped it.
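
The mapping step can be illustrated with a rough sketch. This summary does not state the mapping rule or name the seven categories, so the code below assumes a simple largest-overlap mapping and uses two placeholder labels only; it is not the authors' categorization scheme.

```python
def map_clusters(predicted, truth):
    """Map each predicted cluster index to the ground-truth cluster index
    with the largest record overlap (assumed rule; first match wins ties)."""
    mapping = {}
    for p_idx, p_cluster in enumerate(predicted):
        best_idx, best_overlap = None, 0
        for t_idx, t_cluster in enumerate(truth):
            overlap = len(p_cluster & t_cluster)
            if overlap > best_overlap:
                best_idx, best_overlap = t_idx, overlap
        mapping[p_idx] = best_idx
    return mapping


def label_records(predicted, truth):
    """Label each record by whether it belongs to the ground-truth cluster
    its predicted cluster was mapped to (placeholder labels)."""
    mapping = map_clusters(predicted, truth)
    labels = {}
    for p_idx, p_cluster in enumerate(predicted):
        t_idx = mapping[p_idx]
        for record in p_cluster:
            in_mapped = t_idx is not None and record in truth[t_idx]
            labels[record] = "in mapped cluster" if in_mapped else "outside mapped cluster"
    return labels


# Toy data (hypothetical record identifiers).
truth = [{"r1", "r2", "r3"}, {"r4", "r5"}]
predicted = [{"r1", "r2"}, {"r3", "r4", "r5"}]
mapping = map_clusters(predicted, truth)
labels = label_records(predicted, truth)
print(mapping)
print(labels["r3"])
```

A per-record labeling of this kind makes it possible to point at the specific records a linkage technique grouped incorrectly, rather than reporting only an aggregate score.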

Main Results:

  • Empirically validated the proposed method on real-world data, showing that it reflects cluster quality better than traditional measures.
  • Showed that traditional measures such as precision and recall yield ambiguous results for group linkage.
  • The proposed method provides unambiguous and detailed insights into linkage performance.

Conclusions:

  • The novel evaluation method offers unambiguous results for group-based record linkage.
  • The seven-category system provides detailed information on record prediction accuracy within clusters.
  • These insights support informed selection of record linkage techniques for specific applications.