Multivariate Top-Coding for Statistical Disclosure Limitation.

Anna Oganian (1), Ionut Iacob (2), Goran Lesaja (2, 3)

  • (1) National Center for Health Statistics, 3311 Toledo Rd, Hyattsville, MD 20782, U.S.A.

Privacy in Statistical Databases. PSD (Conference : 2004- ) | April 23, 2021
Summary
This summary is machine-generated.

This study introduces a new multivariate top-coding method for statistical disclosure limitation. It enhances data privacy by considering variable relationships, improving protection for subpopulations.

Keywords:
Statistical disclosure limitation (SDL), association rule mining, dimensionality reduction, genetic algorithm, hierarchical clustering, top-coding

Area of Science:

  • Statistics
  • Data Privacy
  • Computer Science

Background:

  • National statistical agencies face challenges in releasing microdata with many attributes while controlling disclosure risk.
  • Altering microdata for disclosure limitation requires considering variable relationships to maintain data quality.
  • Univariate Statistical Disclosure Limitation (SDL) methods may inadequately protect certain subpopulations.
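
To make the last point concrete, here is a minimal sketch of univariate top-coding, which caps each value of a single variable at a fixed threshold. The records, the group labels, and the 200,000 cutoff are hypothetical and are not taken from the study; the sketch only illustrates how a single global cutoff can leave a value that is extreme within a subpopulation unchanged.

```python
def top_code(values, threshold):
    """Replace every value above `threshold` with `threshold` itself."""
    return [min(v, threshold) for v in values]


# Hypothetical records (age_group, income); illustrative only.
records = [
    ("under_25", 18_000), ("under_25", 22_000), ("under_25", 25_000),
    ("under_25", 140_000),  # unusual within its group, unremarkable overall
    ("over_25", 90_000), ("over_25", 150_000), ("over_25", 160_000),
    ("over_25", 400_000),
]

GLOBAL_THRESHOLD = 200_000  # assumed univariate cutoff

incomes = [income for _, income in records]
coded = top_code(incomes, GLOBAL_THRESHOLD)

# Incomes of the under_25 group after the global rule was applied.
under_25_after = [
    value for (group, _), value in zip(records, coded) if group == "under_25"
]

print(coded)
print(under_25_after)
```

The 400,000 income is capped, but the 140,000 income in the under_25 group passes through untouched even though it stands out sharply among its peers. A rule that also looks at related variables could catch such a record.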

Purpose of the Study:

  • To propose a multivariate top-coding method for enhanced statistical disclosure limitation.
  • To address the limitations of univariate top-coding in protecting subpopulations.
  • To improve the quality and privacy of public microdata sets.

Main Methods:

  • Developed a multivariate top-coding approach by clustering variables based on a closeness metric.
  • Utilized Association Rule Mining techniques within variable clusters to formulate top-coding rules.
  • Extended the methodology for a similar multivariate bottom-coding procedure.
  • Illustrated the method using a realistic, large-scale multivariate data set.
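
The first two steps above can be sketched roughly as follows. This is not the authors' implementation: the summary does not specify the closeness metric or the rule-mining procedure, so the sketch assumes a distance of 1 - |Pearson correlation|, a simple single-linkage grouping with a cutoff, and a plain confidence calculation for one candidate rule. All variable names, data values, and cutoffs are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_variables(data, max_distance):
    """Single-linkage grouping of variables.

    Two variables are linked when 1 - |r| <= max_distance
    (an assumed closeness metric, not the one from the paper).
    """
    clusters = [{name} for name in data]
    for a, b in combinations(data, 2):
        if 1 - abs(pearson(data[a], data[b])) <= max_distance:
            ca = next(c for c in clusters if a in c)
            cb = next(c for c in clusters if b in c)
            if ca is not cb:
                # Merge cb into ca and drop cb from the list.
                clusters = [c for c in clusters if c is not cb]
                ca |= cb
    return clusters


def rule_confidence(data, antecedent, consequent):
    """Confidence of the rule 'antecedent high => consequent high'.

    Each side is a (variable, cutoff) pair; 'high' means above the cutoff.
    """
    (var_a, cut_a), (var_c, cut_c) = antecedent, consequent
    rows = [i for i, v in enumerate(data[var_a]) if v > cut_a]
    if not rows:
        return 0.0
    return sum(1 for i in rows if data[var_c][i] > cut_c) / len(rows)


# Hypothetical microdata, one list per variable (values in thousands).
data = {
    "income": [20, 35, 50, 80, 120, 300],
    "assets": [10, 30, 45, 90, 150, 500],
    "age":    [60, 25, 47, 33, 52, 41],
}

clusters = cluster_variables(data, max_distance=0.2)
confidence = rule_confidence(data, ("income", 100), ("assets", 100))

print(clusters)
print(confidence)
```

In this toy data, income and assets end up in one cluster and age in another, so a joint rule such as "income above 100 implies assets above 100" would be examined only within the income and assets cluster. A bottom-coding variant would flip the comparisons to look at values below a cutoff.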

Main Results:

  • The proposed multivariate top-coding method offers improved disclosure control compared to univariate methods.
  • Clustering variables and applying association rule mining within clusters effectively identifies and manages extreme values.
  • The approach is demonstrated on a realistic, large-scale multivariate data set.

Conclusions:

  • Multivariate top-coding, by considering inter-variable relationships, provides superior privacy protection for microdata.
  • The method improves public-use microdata by striking a better balance between data utility and privacy.
  • The proposed approach offers a robust framework for advanced Statistical Disclosure Limitation.