Multivariate Top-Coding for Statistical Disclosure Limitation.

Anna Oganian¹, Ionut Iacob², Goran Lesaja²,³

¹National Center for Health Statistics, 3311 Toledo Rd, Hyattsville, MD, 20782, U.S.A.

Privacy in Statistical Databases. PSD (Conference : 2004- )

April 23, 2021

Summary

This summary is machine-generated.

This study introduces a new multivariate top-coding method for statistical disclosure limitation. It enhances data privacy by considering variable relationships, improving protection for subpopulations.

Keywords:

Statistical disclosure limitation (SDL), association rule mining, dimensionality reduction, genetic algorithm, hierarchical clustering, top-coding

Area of Science:

Statistics
Data Privacy
Computer Science

Background:

National statistical agencies face challenges in releasing microdata with many attributes while controlling disclosure risk.
Altering microdata for disclosure limitation requires considering variable relationships to maintain data quality.
Univariate Statistical Disclosure Limitation (SDL) methods may inadequately protect certain subpopulations.
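
To make the subpopulation concern concrete, here is a minimal Python sketch of univariate top-coding. The records, the age groups, and the 200,000 cutoff are hypothetical and are not taken from the study; the sketch only illustrates how a single global threshold can leave a value untouched even though it is extreme within its own subgroup.

```python
def top_code(values, threshold):
    """Univariate top-coding: replace every value above threshold with the threshold."""
    return [min(v, threshold) for v in values]


# Hypothetical records (age_group, income); illustrative only, not data from the study.
records = [
    ("18-24", 21_000), ("18-24", 24_000), ("18-24", 19_000), ("18-24", 150_000),
    ("45-54", 140_000), ("45-54", 155_000), ("45-54", 160_000), ("45-54", 400_000),
]

threshold = 200_000  # assumed global cutoff, chosen for illustration
incomes = [income for _, income in records]
coded = top_code(incomes, threshold)

# The 400,000 income is capped, but the 150,000 income in the 18-24 group
# stays as it is, although it is far above every other value in that group.
for (group, original), released in zip(records, coded):
    print(group, original, released)
```

In this toy example only the last record is altered. The fourth record remains easy to single out within its age group, which is the kind of gap a rule that considers several variables together is meant to close.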

Purpose of the Study:

To propose a multivariate top-coding method for enhanced statistical disclosure limitation.
To address the limitations of univariate top-coding in protecting subpopulations.
To improve the quality and privacy of public microdata sets.

Main Methods:

Developed a multivariate top-coding approach by clustering variables based on a closeness metric.
Utilized Association Rule Mining techniques within variable clusters to formulate top-coding rules.
Extended the methodology for a similar multivariate bottom-coding procedure.
Illustrated the method using a realistic, large-scale multivariate data set.
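
The variable-clustering step can be sketched as follows. The summary does not state which closeness metric the authors use, so this sketch assumes a distance of 1 - |r| (r being the Pearson correlation) and single-linkage grouping, which is equivalent to cutting a single-linkage hierarchical clustering at a fixed height. The variable names, the data, and the 0.3 cutoff are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_variables(columns, max_distance):
    """Group variables whose distance 1 - |r| is at most max_distance.

    Linked variables are merged with a union-find structure, so the result
    is the set of connected components (single-linkage clusters).
    """
    names = list(columns)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            distance = 1 - abs(pearson(columns[a], columns[b]))
            if distance <= max_distance:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical microdata columns; illustrative only.
columns = {
    "income": [20, 35, 50, 80, 120, 300],
    "house_value": [90, 150, 210, 330, 500, 1200],
    "num_children": [2, 0, 3, 1, 0, 2],
    "household_size": [4, 1, 5, 3, 2, 4],
}

clusters = cluster_variables(columns, max_distance=0.3)
print(clusters)
```

With these toy columns the strongly correlated pairs end up together, so any rule search can then be restricted to variables inside the same cluster rather than run over all attributes at once.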

Main Results:

The proposed multivariate top-coding method offers improved disclosure control compared to univariate methods.
Clustering variables and applying association rule mining within clusters effectively identifies and manages extreme values.
The approach is demonstrated on a realistic, large-scale multivariate data set.
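
As a rough illustration of rule mining over extreme values inside one variable cluster, the sketch below computes the standard support and confidence of a rule of the form "high income implies high house value". The records, both cutoffs, and the function names are hypothetical; the summary does not describe how the authors turn mined rules into top-coding rules, so that step is not shown.

```python
def rule_stats(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent.

    support    = share of all records satisfying both predicates
    confidence = share of antecedent records that also satisfy the consequent
    """
    n = len(records)
    n_antecedent = sum(1 for r in records if antecedent(r))
    n_both = sum(1 for r in records if antecedent(r) and consequent(r))
    support = n_both / n
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical records for one variable cluster; illustrative only.
records = [
    {"income": 20, "house_value": 90},
    {"income": 35, "house_value": 150},
    {"income": 50, "house_value": 210},
    {"income": 80, "house_value": 330},
    {"income": 250, "house_value": 1000},
    {"income": 300, "house_value": 1200},
    {"income": 280, "house_value": 400},
    {"income": 60, "house_value": 250},
]


def high_income(record):
    return record["income"] > 200  # assumed cutoff


def high_house_value(record):
    return record["house_value"] > 900  # assumed cutoff


support, confidence = rule_stats(records, high_income, high_house_value)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

A rule with high confidence suggests that the two extremes tend to occur together, so treating the variables jointly may protect records that a per-variable cutoff would handle inconsistently.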

Conclusions:

Multivariate top-coding, by considering inter-variable relationships, provides stronger privacy protection for microdata than univariate top-coding.
This method improves public-use microdata by better balancing data utility and privacy.
The proposed approach offers a robust framework for advanced Statistical Disclosure Limitation.