Updated: Jul 25, 2025

BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems.

Gürol Canbek¹

¹Pointr, Ankara, Turkey.

International journal of machine learning and cybernetics

June 26, 2023

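Summary

This summary is machine-generated.

This study evaluates probabilistic performance metrics for binary classification. It finds that mean absolute error (MAE) is the most robust metric for general use, while root mean squared error (RMSE) is the best choice when large errors matter most. Less reliable metrics such as LogLoss and MAPE should be avoided.

Keywords:

Binary classification; Performance metrics; Probabilistic error/loss; Regression; Squared error; Time series forecasting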

Scientific fields:

Machine learning
Statistical modeling
Computer science

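Background:

Probabilistic error/loss metrics that are common in regression are increasingly used for binary classification.
Existing approaches lack a systematic evaluation of how suitable these metrics are for classification tasks.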

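Purpose of the study:

Systematically evaluate probabilistic instruments for assessing binary classification performance.
Identify the weaknesses of current metrics and determine the most robust options.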

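Main methods:

A two-stage benchmarking method, BenchMetrics Prob, was developed.
The method uses five criteria and fourteen simulation cases with synthetic datasets.
Thirty-one instruments and instrument variants were tested.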

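Main results:

Mean absolute error (MAE) and root mean squared error (RMSE) were identified as the most robust metrics.
MAE is recommended for general use because of its interpretability and its [0, 1] range.
RMSE is the best choice when larger errors should be emphasized.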
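
The contrast between the two recommended metrics can be sketched as follows. This is a minimal illustration that assumes the standard definitions of MAE and RMSE, applied to 0/1 labels and predicted positive-class probabilities; the function names and sample values are hypothetical and are not taken from the paper.

```python
import math

def mae(labels, probs):
    """Mean absolute error between 0/1 labels and predicted probabilities."""
    return sum(abs(y - p) for y, p in zip(labels, probs)) / len(labels)

def rmse(labels, probs):
    """Root mean squared error; squaring weights large errors more heavily."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(labels, probs)) / len(labels))

# Hypothetical example data (not from the paper).
labels = [1, 0, 1, 0]
probs_even = [0.7, 0.3, 0.7, 0.3]  # four moderate errors of 0.3 each
probs_skew = [0.9, 0.1, 0.9, 0.9]  # three small errors and one large error of 0.9

for name, probs in [("even", probs_even), ("skew", probs_skew)]:
    print(name, round(mae(labels, probs), 3), round(rmse(labels, probs), 3))
```

Both sets of predictions have the same average absolute error, so MAE cannot tell them apart, whereas RMSE is higher for the set containing one large error. With labels in {0, 1} and probabilities in [0, 1], every per-sample error lies in [0, 1], so both metrics stay within that range.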

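Conclusions:

Researchers should carefully choose reliable probabilistic metrics when evaluating binary classification performance.
Metrics such as LogLoss, MAPE, sMAPE, and MRAE showed lower stability and should be avoided.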
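
One plausible way to see why LogLoss and MAPE can behave poorly on binary labels is sketched below. It assumes the textbook definitions of the two metrics and is not the paper's own analysis; the helper names are hypothetical, and returning infinity or NaN for the degenerate cases is a choice made here for illustration (practical implementations often clip with a small epsilon instead).

```python
import math

def log_loss(labels, probs):
    """Binary cross-entropy; unbounded when a fully confident prediction is wrong."""
    total = 0.0
    for y, p in zip(labels, probs):
        p_true = p if y == 1 else 1.0 - p  # probability assigned to the true class
        total += -math.log(p_true) if p_true > 0 else math.inf
    return total / len(labels)

def mape(labels, probs):
    """Mean absolute percentage error; divides by the label, so y = 0 is undefined."""
    if any(y == 0 for y in labels):
        return math.nan  # assumption for this sketch: report undefined as NaN
    return sum(abs(y - p) / abs(y) for y, p in zip(labels, probs)) / len(labels)

print(log_loss([1, 0], [0.5, 0.5]))  # finite: equals ln 2
print(log_loss([1], [0.0]))          # infinite: a single confident miss dominates
print(mape([1, 0], [0.9, 0.1]))      # NaN: the negative class has label 0
```

Under these definitions a single confidently wrong prediction makes LogLoss unbounded, and MAPE is undefined for every negative-class sample, which is consistent with the recommendation to prefer bounded metrics such as MAE and RMSE.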