

Preserving Missing Data Distribution in Synthetic Data.

Xinyue Wang1, Hafiz Asif1, Jaideep Vaidya1

  • 1Rutgers University, Newark, USA.

Proceedings of the ... International World-Wide Web Conference. International WWW Conference
|January 28, 2025
PubMed
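Summary
This summary is machine-generated.

This study introduces new methods for generating synthetic data that retain the informational value of missing data points. By preserving the key missing-data distributions, the approach strengthens privacy-preserving data analysis.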

Keywords:
missing data; privacy; synthetic data generation

Scientific fields:

  • Computer science
  • Data science
  • Statistics

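Background:

  • Web data are often sensitive and call for privacy-preserving analysis methods.
  • Synthetic data generation is a key technique for protecting sensitive information.
  • Missing data in web artifacts carry valuable information that is usually lost during conventional data preprocessing.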

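Purpose of the study:

  • Develop and evaluate methods for generating synthetic data that preserve both the observed and the missing data distributions.
  • Address the information loss caused by imputing or deleting missing data before synthetic data generation.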
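The information loss named in the second aim can be illustrated with a small sketch. It is a toy example and not taken from the paper: the record layout (`group`, `income`), the missingness probabilities, and all function names are assumptions made for illustration. It shows that once rows are deleted or values are mean-imputed, the fact that one group has far more missing values than the other is no longer visible in the data.

```python
import random


def make_toy_records(n=1000, seed=0):
    """Toy data (assumed): 'income' is missing more often in group 'B' than in 'A'."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        group = rng.choice(["A", "B"])
        income = rng.gauss(50.0, 10.0)
        p_missing = 0.1 if group == "A" else 0.5
        if rng.random() < p_missing:
            income = None  # None marks a missing value
        records.append({"group": group, "income": income})
    return records


def missing_rate_by_group(records):
    """Fraction of records whose 'income' is missing, per group."""
    totals, missing = {}, {}
    for r in records:
        totals[r["group"]] = totals.get(r["group"], 0) + 1
        if r["income"] is None:
            missing[r["group"]] = missing.get(r["group"], 0) + 1
    return {g: missing.get(g, 0) / totals[g] for g in totals}


def listwise_delete(records):
    """Drop every record that has a missing 'income'."""
    return [r for r in records if r["income"] is not None]


def mean_impute(records):
    """Replace each missing 'income' with the mean of the observed values."""
    observed = [r["income"] for r in records if r["income"] is not None]
    mean = sum(observed) / len(observed)
    return [{**r, "income": mean if r["income"] is None else r["income"]}
            for r in records]


records = make_toy_records()
rates_raw = missing_rate_by_group(records)
rates_deleted = missing_rate_by_group(listwise_delete(records))
rates_imputed = missing_rate_by_group(mean_impute(records))

print("raw:    ", rates_raw)      # group B is missing far more often than A
print("deleted:", rates_deleted)  # every rate is 0.0
print("imputed:", rates_imputed)  # every rate is 0.0
```

Any generator trained on the deleted or imputed records cannot reproduce the group-dependent missingness, because that signal has already been removed from its input.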

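Main methods:

  • Proposes new methods for synthetic data generation.
  • Focuses on preserving the distributions of both observed and missing data.
  • Conducts an extensive empirical evaluation on synthetic and real-world datasets.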
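The summary does not describe how the proposed methods work, so the following is only a naive baseline sketch of the general idea of modeling missingness explicitly, not the authors' method. It assumes a toy record layout (`group`, `income`) and hypothetical function names, fits a per-group missingness probability together with a Gaussian for the observed values, and then samples synthetic records that keep both. It offers no privacy guarantee of any kind.

```python
import random
import statistics


def make_toy_records(n=2000, seed=0):
    """Toy data (assumed): 'income' is missing more often in group 'B' than in 'A'."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        group = rng.choice(["A", "B"])
        missing = rng.random() < (0.1 if group == "A" else 0.5)
        income = None if missing else rng.gauss(50.0, 10.0)
        records.append({"group": group, "income": income})
    return records


def missing_rate_by_group(records):
    """Fraction of records whose 'income' is missing, per group."""
    by_group = {}
    for r in records:
        by_group.setdefault(r["group"], []).append(r["income"] is None)
    return {g: sum(flags) / len(flags) for g, flags in by_group.items()}


def fit_naive_generator(records):
    """Estimate P(group), P(income missing | group), and a Gaussian for observed income."""
    by_group = {}
    for r in records:
        by_group.setdefault(r["group"], []).append(r["income"])
    model = {}
    for group, incomes in by_group.items():
        observed = [x for x in incomes if x is not None]
        model[group] = {
            "weight": len(incomes) / len(records),
            "p_missing": 1 - len(observed) / len(incomes),
            "mean": statistics.mean(observed),
            "stdev": statistics.stdev(observed),
        }
    return model


def sample_synthetic(model, n, seed=1):
    """Draw synthetic records; missingness is sampled first, then the value."""
    rng = random.Random(seed)
    groups = list(model)
    weights = [model[g]["weight"] for g in groups]
    synthetic = []
    for _ in range(n):
        group = rng.choices(groups, weights=weights)[0]
        params = model[group]
        if rng.random() < params["p_missing"]:
            income = None  # keep the value missing in the synthetic record
        else:
            income = rng.gauss(params["mean"], params["stdev"])
        synthetic.append({"group": group, "income": income})
    return synthetic


real = make_toy_records()
synthetic = sample_synthetic(fit_naive_generator(real), n=5000)
real_rates = missing_rate_by_group(real)
synthetic_rates = missing_rate_by_group(synthetic)

print("real missing rates:     ", real_rates)
print("synthetic missing rates:", synthetic_rates)
```

Comparing the per-group missing rates of the real and synthetic records is one simple check that the missingness distribution survived generation; the paper's own evaluation criteria are not stated in this summary.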

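Main results:

  • Demonstrates the effectiveness of the proposed methods in preserving the missing data distribution.
  • Shows that the synthetic data can retain the information content carried by missingness.
  • Empirical evaluation confirms the utility of the approach across a variety of datasets.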

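Conclusions:

  • The proposed methods represent a significant advance in privacy-preserving synthetic data generation.
  • Preserving the missing data distribution is essential for maintaining data utility in the analysis of sensitive web data.
  • The approach makes the analysis of data from web artifacts more robust and more informative.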