Word Embeddings as Statistical Estimators

Neil Dey¹, Matthew Singer¹, Jonathan P Williams²

¹Department of Statistics, North Carolina State University.

Sankhya. Series B. [Methodological.]

December 19, 2025

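Summary

This summary is machine-generated.

This study introduces a statistical framework for word embeddings, interpreting Word2Vec through pointwise mutual information (PMI). A new missing-value estimator offers a statistically principled alternative whose performance is comparable to that of Word2Vec.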

Scientific fields:

Natural language processing
Statistical theory
Machine learning

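Background:

Word embeddings are essential in NLP but lack theoretical understanding.
Current evaluation relies on empirical performance rather than rigorous properties.
Formal inference and uncertainty quantification require a theoretical foundation.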

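Purpose of the study:

Provide a statistical-theoretic perspective on word embeddings.
Interpret classical methods such as Word2Vec within a formal statistical model.
Develop a new, statistically tractable alternative to existing word embedding techniques.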

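Main methods:

Proposed a copula-based statistical model for text data.
Interpreted Word2Vec as an estimator of the theoretical pointwise mutual information (PMI).
Developed a missing-value-based estimator building on prior work.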
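
To make the PMI quantity concrete, here is a minimal sketch of the standard plug-in estimate, PMI(w, c) = log[p(w, c) / (p(w) p(c))], computed from a word-context co-occurrence count table. It is not the authors' copula-based model or their estimator; the function name `empirical_pmi` and the toy counts are hypothetical and chosen only for illustration.

```python
import math

def empirical_pmi(counts):
    """Plug-in PMI from a co-occurrence count table.

    counts[i][j] is the number of times word i appeared with context j.
    Pairs that never co-occur get -inf, since log(0) is undefined.
    """
    total = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]        # word marginals
    col_totals = [sum(col) for col in zip(*counts)]  # context marginals
    pmi = []
    for i, row in enumerate(counts):
        pmi_row = []
        for j, c in enumerate(row):
            if c == 0:
                pmi_row.append(-math.inf)
            else:
                # log[(c/total) / ((row/total) * (col/total))]
                pmi_row.append(math.log(c * total / (row_totals[i] * col_totals[j])))
        pmi.append(pmi_row)
    return pmi

# Hypothetical 2-word by 2-context count table.
toy_counts = [[4, 0], [1, 3]]
toy_pmi = empirical_pmi(toy_counts)
```

The zero count in the toy table yields an entry of negative infinity, which is the kind of unobserved pair that a truncation-based or missing-value-based treatment has to deal with.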

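Main results:

Established the connection between Word2Vec and estimation of the theoretical PMI.
The proposed missing-value estimator showed estimation error similar to that of Word2Vec.
The new estimator outperformed the truncation-based method.
Achieved performance comparable to Word2Vec on an IMDb sentiment analysis task.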
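
The contrast between truncation and a missing-value treatment can be sketched as follows. This is an illustration under stated assumptions, not the estimator from the paper: "truncation" is assumed to mean clipping PMI at zero (the positive-PMI convention), and the missing-value view is assumed to mean excluding unobserved pairs from a fitting objective rather than imputing them. All function names and the toy PMI values are hypothetical.

```python
import math

def truncate_pmi(pmi):
    """Truncation (assumed: positive PMI): raise every entry below 0, including -inf, to 0."""
    return [[max(v, 0.0) for v in row] for row in pmi]

def mask_missing(pmi):
    """Missing-value view (assumed): keep finite entries, mark -inf entries as None."""
    return [[v if math.isfinite(v) else None for v in row] for row in pmi]

def observed_squared_error(target, word_vecs, context_vecs):
    """Squared error between word-context dot products and the target,
    summed only over entries that are not marked None."""
    err = 0.0
    for i, row in enumerate(target):
        for j, t in enumerate(row):
            if t is None:
                continue  # unobserved pair: contributes nothing to the fit
            dot = sum(a * b for a, b in zip(word_vecs[i], context_vecs[j]))
            err += (dot - t) ** 2
    return err

# Hypothetical PMI table with one never-observed pair.
pmi = [[math.log(1.6), -math.inf],
       [math.log(0.4), math.log(2.0)]]
truncated = truncate_pmi(pmi)
masked = mask_missing(pmi)
```

Under truncation, the unobserved pair and the negative entry are both forced to zero, so an embedding is fitted to values that were never estimated. Under the masked objective, the fit is driven only by pairs with a finite PMI estimate.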

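Conclusions:

The copula-based model provides a theoretical foundation for word embeddings.
The missing-value estimator offers a statistically interpretable and effective alternative.
This work bridges the gap between empirical success and theoretical understanding in word embeddings.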