



X-Factor: Quality Is a Dataset-Intrinsic Property

Josiah Couch1, Miao Li1, Rima Arnaout2,3,4

  • 1Department of Pathology, BIDMC.

ArXiv | June 10, 2025
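Summary
This summary is machine-generated.

Dataset quality has a significant impact on machine learning classifier performance, regardless of dataset size and model architecture. This intrinsic property, which arises from the quality of the dataset's constituent classes, offers a new optimization target for better model performance.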


Scientific fields:

  • Machine learning
  • Computer science
  • Data science

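Background:

  • Model architecture, dataset size, and class balance are known factors that affect the performance of machine learning classifiers.
  • An additional factor, dataset quality, has been suggested previously, but whether it is intrinsic to the dataset has remained unclear.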

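Study objectives:

  • Determine whether dataset quality is an intrinsic property, independent of the other factors.
  • Investigate the relationship between dataset quality and classifier performance across different model architectures.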

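Main methods:

  • Thousands of datasets were created, controlling for size and class balance.
  • Classifiers of various architectures (random forests, SVMs, deep networks) were trained on these datasets.
  • Classifier performance was analyzed across datasets and architectures.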
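
The dataset-construction step can be illustrated with a minimal sketch. This is not the authors' code: the function name `sample_dataset`, the per-class `pools` structure, and the largest-remainder rounding rule are all assumptions made for illustration. It shows one way to draw a dataset whose size and class balance are fixed in advance, so that any remaining differences in classifier performance cannot be attributed to those two factors.

```python
import random
from collections import Counter


def sample_dataset(pools, size, balance, seed=0):
    """Draw `size` labeled examples whose class proportions follow `balance`.

    pools:   dict mapping class label -> list of candidate examples (hypothetical)
    balance: dict mapping class label -> fraction of the dataset (must sum to 1)
    """
    if abs(sum(balance.values()) - 1.0) > 1e-9:
        raise ValueError("class fractions must sum to 1")
    rng = random.Random(seed)
    labels = sorted(balance)

    # Round each class count down, then give the leftover slots to the
    # classes with the largest fractional remainders.
    raw = {c: balance[c] * size for c in labels}
    counts = {c: int(raw[c]) for c in labels}
    leftover = size - sum(counts.values())
    by_remainder = sorted(labels, key=lambda c: raw[c] - counts[c], reverse=True)
    for c in by_remainder[:leftover]:
        counts[c] += 1

    dataset = []
    for c in labels:
        if counts[c] > len(pools[c]):
            raise ValueError(f"pool for class {c!r} is too small")
        # Sample without replacement from this class's pool.
        dataset.extend((x, c) for x in rng.sample(pools[c], counts[c]))
    rng.shuffle(dataset)
    return dataset


# Toy pools of integers stand in for real examples.
pools = {"a": list(range(100)), "b": list(range(100, 200))}
ds = sample_dataset(pools, size=20, balance={"a": 0.75, "b": 0.25}, seed=1)
class_counts = Counter(label for _, label in ds)
```

Varying `seed` while holding `size` and `balance` fixed yields many datasets that differ only in which examples they contain, which is the kind of controlled comparison the methods describe.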

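Main results:

  • Classifier performance was strongly correlated across architectures (R² = 0.79).
  • This indicates that dataset quality is an intrinsic property, independent of dataset size, class balance, and model architecture.
  • Dataset quality was found to be an emergent property of the quality of the dataset's constituent classes.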
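
As a rough illustration of how a cross-architecture correlation such as the reported R² could be computed, the sketch below takes the accuracy of two architectures on the same set of datasets and returns the squared Pearson correlation. The summary does not state exactly how R² was calculated, so treating it as a squared Pearson correlation is an assumption, and the accuracy values are made-up toy numbers, not results from the paper.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical accuracies of two architectures on the same five datasets.
acc_random_forest = [0.62, 0.71, 0.80, 0.85, 0.93]
acc_deep_network = [0.60, 0.75, 0.78, 0.88, 0.95]
score = r_squared(acc_random_forest, acc_deep_network)
```

A value near 1 means that datasets on which one architecture does well are also those on which the other does well, which is the pattern the results attribute to dataset quality rather than to the model.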

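Conclusions:

  • Dataset quality is an independent correlate of machine learning classifier performance.
  • Quality joins dataset size, class balance, and model architecture as a key optimization target.
  • Focusing on intrinsic dataset quality and class quality can improve the optimization of machine learning models.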