Updated: Jul 25, 2025

BenchMetrics Prob: Benchmarking of Probabilistic Error/Loss Performance Evaluation Instruments for Binary Classification Problems

Gürol Canbek¹

  • ¹ Pointr, Ankara, Turkey.

International Journal of Machine Learning and Cybernetics | June 26, 2023
Summary
This summary is machine-generated.

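This study evaluates probabilistic performance metrics for binary classification. It finds that mean absolute error (MAE) is the most robust choice for general use, while root mean square error (RMSE) is the better choice when large errors matter most. Less reliable metrics such as LogLoss and MAPE should be avoided.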

Keywords:
Binary classification; Performance metrics; Probabilistic error/loss; Regression; Squared error; Time series forecasting


Fields of science:

  • Machine learning
  • Statistical modeling
  • Computer science

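Background:

  • Probabilistic error/loss metrics that are common in regression are increasingly used for binary classification.
  • Existing approaches lack a systematic evaluation of how suitable these metrics are for classification tasks.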

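Aim of the study:

  • Systematically evaluate probabilistic instruments for assessing binary classification performance.
  • Identify the weaknesses of current metrics and determine the most robust options.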

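Main methods:

  • A two-stage benchmarking method, BenchMetrics Prob, was developed.
  • The method uses five criteria and fourteen simulation cases with synthetic datasets.
  • 31 instruments and instrument variants were tested.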

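Main results:

  • Mean absolute error (MAE) and root mean square error (RMSE) were identified as the most robust metrics.
  • MAE is recommended for general use because of its interpretability and its [0, 1] range.
  • RMSE is the best choice when larger errors should be emphasized.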
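As a minimal sketch of how the two metrics differ, the Python below computes MAE and RMSE from 0/1 class labels and predicted probabilities of class 1. The function names, the data, and the two "classifiers" are hypothetical illustrations, not taken from the study or from the BenchMetrics Prob tool. Both sets of predictions have the same total absolute error, but one concentrates it in a single large mistake.

```python
import math

def mae(labels, probs):
    """Mean absolute error between 0/1 labels and predicted probabilities."""
    return sum(abs(y - p) for y, p in zip(labels, probs)) / len(labels)

def rmse(labels, probs):
    """Root mean square error; squaring weights larger errors more heavily."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(labels, probs)) / len(labels))

# Hypothetical data: true classes and predicted probabilities of class 1.
labels = [1, 0, 1, 0]
even_probs = [0.8, 0.2, 0.8, 0.2]    # four moderate errors of 0.2 each
spiky_probs = [1.0, 0.0, 1.0, 0.8]   # three perfect predictions, one error of 0.8

for name, probs in [("even", even_probs), ("spiky", spiky_probs)]:
    print(f"{name}: MAE={mae(labels, probs):.3f}  RMSE={rmse(labels, probs):.3f}")
```

In this example MAE is about 0.2 for both sets of predictions, while RMSE is about 0.2 for the evenly spread errors and about 0.4 when the error is concentrated in one prediction. Because labels and probabilities both lie in [0, 1], each per-sample error does too, so both metrics stay within [0, 1].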

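Conclusions:

  • Researchers should carefully select reliable probabilistic metrics when evaluating binary classification performance.
  • Metrics such as LogLoss, MAPE, sMAPE, and MRAE showed lower robustness and should be avoided.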
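One plausible way to see why some of these metrics can behave poorly on 0/1 labels is sketched below. This is an illustration of well-known properties of the textbook formulas, not a reproduction of the study's criteria or simulation cases. The function names and data are hypothetical, and the LogLoss here deliberately applies no probability clipping, which some libraries add to avoid infinite values.

```python
import math

def log_loss(labels, probs):
    """Binary cross-entropy (LogLoss) with no probability clipping."""
    total = 0.0
    for y, p in zip(labels, probs):
        # Probability the model assigned to the class that actually occurred.
        p_true = p if y == 1 else 1.0 - p
        total += -math.log(p_true) if p_true > 0 else math.inf
    return total / len(labels)

def mape(labels, probs):
    """Mean absolute percentage error; divides by the actual value."""
    return sum(abs(y - p) / abs(y) for y, p in zip(labels, probs)) / len(labels)

labels = [1, 0]  # hypothetical: one positive and one negative sample

print(log_loss(labels, [0.9, 0.1]))       # both predictions reasonable
print(log_loss(labels, [0.9, 0.999999]))  # one confident mistake dominates
print(log_loss(labels, [0.9, 1.0]))       # fully confident mistake: unbounded

try:
    mape(labels, [0.9, 0.1])
    mape_defined = True
except ZeroDivisionError:
    # Any negative-class sample has an actual value of 0, so the ratio is undefined.
    mape_defined = False
print("MAPE defined for this data:", mape_defined)
```

Unlike MAE and RMSE, LogLoss has no upper bound, so a single confidently wrong prediction can dominate the score, and MAPE cannot be evaluated as written whenever the actual label is 0.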