



Updated: Jan 9, 2026


DCC: A Model-Free Framework for Evaluating Dataset Quality

Chunhui Xie1, Yunqi Li1

  • 1Department of Polymer Materials and Engineering, College of Materials and Metallurgy, Guizhou University, Guiyang 550025, P.R. China.

Journal of chemical information and modeling
|December 9, 2025
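Summary
This summary is machine-generated.

This work introduces Data Correlation Convergence (DCC), a new framework for evaluating dataset quality. DCC quantifies how stable a dataset remains under perturbation, offering a computationally efficient alternative to traditional methods for assessing data completeness and representativeness.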


Scientific fields:

  • Data science
  • Materials science
  • Statistical modeling

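Background:

  • Evaluating dataset quality is essential for reliable analysis and model performance.
  • Existing data quality assessment methods are often computationally intensive and model-dependent.
  • A theoretically grounded, broadly applicable framework is needed to assess data completeness and representativeness.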

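Purpose of the study:

  • Propose the Data Correlation Convergence (DCC) framework for evaluating dataset quality.
  • Provide an alternative to traditional methods that are computationally intensive and model-dependent.
  • Quantify dataset stability under perturbation as a reflection of completeness and representativeness.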

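Main methods:

  • DCC integrates multiple correlation functions to quantify both numerical correlation and distributional similarity.
  • The framework assumes that high-quality datasets exhibit stable correlation patterns under perturbation.
  • The DCC framework was validated on hypothetical and benchmark datasets.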
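
The idea of stable correlation patterns under perturbation can be illustrated with a minimal sketch. This is not the DCC formula from the paper, which the summary does not give. It assumes random row subsampling as the perturbation, Pearson and Spearman coefficients as the "multiple correlation functions", and a hypothetical score, `correlation_instability`, defined as the mean absolute change in the correlation matrices. All function names and parameters are illustrative.

```python
import numpy as np


def rank_columns(x):
    """Replace each column by its ranks (ties are not averaged; fine for continuous data)."""
    return np.argsort(np.argsort(x, axis=0), axis=0).astype(float)


def correlation_matrices(x):
    """Return the Pearson and Spearman correlation matrices for the columns of x."""
    pearson = np.corrcoef(x, rowvar=False)
    spearman = np.corrcoef(rank_columns(x), rowvar=False)
    return pearson, spearman


def correlation_instability(x, keep_fraction=0.8, n_trials=50, seed=0):
    """Mean absolute change in correlation matrices under random row subsampling.

    Hypothetical stability score for illustration only, NOT the paper's DCC value.
    Lower values mean the correlation pattern is more stable.
    """
    rng = np.random.default_rng(seed)
    n_rows = x.shape[0]
    n_keep = max(3, int(keep_fraction * n_rows))
    reference = correlation_matrices(x)
    deviations = []
    for _ in range(n_trials):
        # Perturb the dataset by keeping a random subset of rows.
        rows = rng.choice(n_rows, size=n_keep, replace=False)
        perturbed = correlation_matrices(x[rows])
        for ref, per in zip(reference, perturbed):
            deviations.append(np.mean(np.abs(ref - per)))
    return float(np.mean(deviations))


# Two synthetic datasets: a large one with a near-deterministic relationship,
# and a small one whose two columns are unrelated.
rng = np.random.default_rng(1)
a = rng.normal(size=500)
structured = np.column_stack([a, 2.0 * a + 0.05 * rng.normal(size=500)])
noisy = np.column_stack([a[:20], rng.normal(size=20)])

score_structured = correlation_instability(structured)
score_noisy = correlation_instability(noisy)
print(f"large, strongly correlated set: {score_structured:.4f}")
print(f"small, uncorrelated set:        {score_noisy:.4f}")
```

Under these assumptions, the large, strongly correlated dataset should give a much smaller instability score than the small, uncorrelated one, which matches the framework's premise that well-sampled data keep their correlation structure when perturbed.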

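Main results:

  • The lowest DCC values were observed at 10-20% linear correlation, and DCC values increased as correlations became more deterministic.
  • DCC values effectively predict machine learning model performance metrics (e.g., accuracy, R-squared) and feature importance (SHAP values).
  • By capturing inherent correlation patterns, DCC enables effective dataset compression.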

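Conclusions:

  • The DCC framework provides a theoretically grounded, broadly applicable, and scalable approach to dataset quality assessment.
  • DCC offers insight into data completeness, representativeness, and potential bias.
  • The approach can support better data annotation and selection for scientific research and machine learning applications.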