Differential privacy based on importance weighting.

Zhanglong Ji¹, Charles Elkan¹

¹Department of Computer Science and Engineering 0404, University of California, San Diego, USA.

Machine Learning

February 1, 2014

Summary

This summary is machine-generated.

This study introduces a privacy-preserving data publishing method based on weighted datasets. The method guarantees differential privacy for statistical queries and remains accurate even when the privacy budget is small or when the public and private datasets come from different populations.

Keywords:

Differential privacy
Importance weighting
Privacy

Area of Science:

Computer Science
Data Privacy
Statistical Analysis

Background:

Data publishing is essential for research and analytics.
Protecting sensitive information in datasets is a critical challenge.
Existing methods may not offer robust privacy guarantees or may limit data utility.

Purpose of the Study:

To develop a novel method for publishing data while ensuring strong privacy.
To enable accurate statistical queries on sensitive datasets.
To provide provable differential privacy guarantees.
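
For reference, the guarantee named in the last item is usually stated as follows. This is the standard textbook definition of ε-differential privacy, not a formula quoted from this summary; the symbols $M$ (the randomized release mechanism), $D$ and $D'$ (two datasets differing in one record), $S$ (any set of possible outputs), and $\varepsilon$ (the privacy budget) are introduced here for illustration.

```latex
% A randomized mechanism M is \varepsilon-differentially private if,
% for all neighboring datasets D, D' and all measurable output sets S:
\[
  \Pr\bigl[M(D) \in S\bigr] \;\le\; e^{\varepsilon}\,\Pr\bigl[M(D') \in S\bigr].
\]
```

A smaller $\varepsilon$ means the output distribution changes less when one record changes, so privacy is stronger but more noise is typically needed.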

Main Methods:

Computing importance sampling weights that make a public dataset analogous to the private dataset.
Regularizing weights and adding noise for privacy protection.
Deriving an expression for the asymptotic variance of approximate answers.
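
The first two steps above can be pictured with a minimal sketch. It is not the authors' implementation: estimating the weights with an L2-regularized logistic regression, perturbing its coefficients with Laplace noise, and all names and parameters (`fit_logistic`, `noisy_importance_weights`, `weighted_answer`, `l2`, `noise_scale`) are assumptions made for illustration. In particular, `noise_scale` is not calibrated to any privacy budget here, so this code on its own provides no differential privacy guarantee.

```python
import numpy as np


def fit_logistic(X, y, l2=0.1, lr=0.5, steps=2000):
    """L2-regularized logistic regression fitted by plain gradient descent.

    Returns the coefficient vector with the intercept as its last entry.
    The intercept is not penalized.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append intercept column
    theta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ theta))      # predicted P(private | x)
        grad = Xb.T @ (p - y) / len(y)             # gradient of mean log-loss
        grad[:-1] += l2 * theta[:-1]               # regularization term
        theta -= lr * grad
    return theta


def noisy_importance_weights(public, private, l2=0.1, noise_scale=0.05, rng=None):
    """Return normalized weights for the public rows.

    A classifier separates private rows (label 1) from public rows (label 0);
    its odds exp(theta . x) are proportional to the density ratio
    private(x) / public(x). Laplace noise is added to the coefficients
    before the weights are computed (hypothetical, uncalibrated scale).
    """
    rng = np.random.default_rng() if rng is None else rng
    X = np.vstack([public, private])
    y = np.concatenate([np.zeros(len(public)), np.ones(len(private))])
    theta = fit_logistic(X, y, l2=l2)
    theta = theta + rng.laplace(scale=noise_scale, size=theta.shape)
    logits = np.hstack([public, np.ones((len(public), 1))]) @ theta
    w = np.exp(logits)
    return w / w.sum()                             # weights sum to 1


def weighted_answer(public, weights, query):
    """Approximate the private-data mean of query(x) using weighted public rows."""
    return float(np.sum(weights * query(public)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    public = rng.normal(0.0, 1.0, size=(1000, 1))   # synthetic stand-in data
    private = rng.normal(1.0, 1.0, size=(1000, 1))
    w = noisy_importance_weights(public, private, rng=rng)
    print("unweighted public mean:", public[:, 0].mean())
    print("weighted estimate of private mean:", weighted_answer(public, w, lambda X: X[:, 0]))
```

Only the weights, which are attached to public rows, would be released in such a scheme; the private rows are used solely to fit the noisy coefficients. Stronger regularization and larger noise pull the weighted answer back toward the unweighted public answer.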

Main Results:

The proposed mechanism achieves provable differential privacy for statistical queries.
Experimental results demonstrate good performance even with small privacy budgets.
The method is effective when public and private datasets originate from different populations.

Conclusions:

The novel weighting method offers a viable solution for privacy-preserving data publishing.
This approach balances data utility with robust differential privacy.
The technique is applicable in diverse data scenarios, including cross-population analysis.