



Differential privacy based on importance weighting.

Zhanglong Ji¹, Charles Elkan¹

  • ¹ Department of Computer Science and Engineering 0404, University of California, San Diego, USA.

Machine Learning | February 1, 2014
Summary
This summary is machine-generated.

This study introduces a privacy-preserving data publishing method based on weighted datasets. The mechanism provides provable differential privacy for statistical queries and performs well even when the privacy budget is small or when the public and private datasets come from different populations.

Keywords: Differential privacy, Importance weighting, Privacy


Area of Science:

  • Computer Science
  • Data Privacy
  • Statistical Analysis

Background:

  • Data publishing is essential for research and analytics.
  • Protecting sensitive information in datasets is a critical challenge.
  • Existing methods may not offer robust privacy guarantees or may limit data utility.

Purpose of the Study:

  • To develop a novel method for publishing data while ensuring strong privacy.
  • To enable accurate statistical queries on sensitive datasets.
  • To provide provable differential privacy guarantees.
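
For reference, the guarantee named in the last point is usually stated as follows. This is the standard textbook definition of ε-differential privacy, given here as background and not quoted from the study. The symbols are assumptions of this note: M is a randomized mechanism, D and D' are any two datasets that differ in a single record, S is any set of possible outputs, and ε is the privacy budget.

```latex
\Pr\bigl[\mathcal{M}(D) \in S\bigr] \;\le\; e^{\varepsilon}\,\Pr\bigl[\mathcal{M}(D') \in S\bigr]
```

A smaller ε forces the two output distributions closer together, which means stronger privacy but typically noisier answers.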

Main Methods:

  • Computing importance sampling weights that make a public dataset analogous to the private dataset.
  • Regularizing weights and adding noise for privacy protection.
  • Deriving an expression for the asymptotic variance of approximate answers.
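
The steps listed above can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes the weights are estimated as a density ratio from an L2-regularized logistic regression that separates private records from public ones, that the weights are clipped, and that Laplace noise is added directly to them. The function names, the `clip` bound, and `noise_scale` are all hypothetical. In particular, `noise_scale` is an arbitrary placeholder and is not calibrated to any sensitivity bound, so this sketch carries no differential privacy guarantee.

```python
import numpy as np


def fit_logistic(X, y, l2=1.0, lr=0.1, steps=2000):
    """L2-regularized logistic regression fitted by plain gradient descent."""
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])  # append an intercept column
    beta = np.zeros(d + 1)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ beta)))
        # Mean log-loss gradient plus an L2 penalty on every coefficient
        # (the intercept is penalized too, to keep the sketch short).
        grad = Xb.T @ (p - y) / n + l2 * beta / n
        beta -= lr * grad
    return beta


def noisy_importance_weights(public, private, l2=1.0, clip=10.0,
                             noise_scale=0.05, rng=None):
    """Return one noisy, normalized weight per public record.

    ASSUMPTION: noise_scale is a placeholder, not a calibrated value,
    so the output is NOT differentially private as written.
    """
    rng = np.random.default_rng() if rng is None else rng
    X = np.vstack([public, private])
    y = np.concatenate([np.zeros(len(public)), np.ones(len(private))])
    beta = fit_logistic(X, y, l2=l2)

    # The classifier odds p / (1 - p) = exp(logit) are proportional to the
    # density ratio private(x) / public(x); the constant cancels below.
    Pb = np.hstack([public, np.ones((len(public), 1))])
    logits = np.clip(Pb @ beta, -np.log(clip), np.log(clip))
    w = np.exp(logits)

    # Perturb the weights, drop negatives, and normalize to sum to 1.
    w = w + rng.laplace(0.0, noise_scale, size=w.shape)
    w = np.maximum(w, 0.0)
    return w / w.sum()


def answer_query(public, weights, f):
    """Approximate the private-data mean of f using weighted public records."""
    return float(np.sum(weights * f(public)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    public = rng.normal(0.0, 1.0, size=(1000, 1))   # synthetic public data
    private = rng.normal(1.0, 1.0, size=(1000, 1))  # synthetic private data
    w = noisy_importance_weights(public, private, rng=rng)
    print("unweighted public mean:", public.mean())
    print("weighted estimate:     ", answer_query(public, w, lambda X: X[:, 0]))
```

In the demo the two synthetic samples are drawn from different populations, so the weighted estimate should move away from the unweighted public mean and toward the private mean. Where the study actually injects noise and how it calibrates the noise to the privacy budget are not described in this summary.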

Main Results:

  • The proposed mechanism achieves provable differential privacy for statistical queries.
  • Experimental results demonstrate good performance even with small privacy budgets.
  • The method is effective when public and private datasets originate from different populations.

Conclusions:

  • The novel weighting method offers a viable solution for privacy-preserving data publishing.
  • This approach balances data utility with robust differential privacy.
  • The technique is applicable in diverse data scenarios, including cross-population analysis.