
Updated: May 8, 2025

Clustering and classification for dry bean feature imbalanced data

Chou-Yuan Lee¹, Wei Wang², Jian-Qiong Huang³

¹School of Big Data, Fuzhou University of International Studies and Trade, Fuzhou, 350202, China. lqy@fzfu.edu.cn.

Scientific Reports, December 27, 2024

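Summary

This summary is machine-generated.

This study introduces a new algorithm that combines the Borderline Synthetic Minority Oversampling Technique (BLSMOTE) with K-means clustering to improve machine learning classification accuracy on imbalanced datasets. The proposed method significantly improves performance metrics such as precision and recall.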

Keywords:

BLSMOTE; decision tree; imbalanced data; K-means; random forest; support vector machine

Fields of science:

Machine learning
Data science
Computer science

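Background:

Traditional machine learning models such as decision trees (DT), random forests (RF), and support vector machines (SVM) show limited classification performance on imbalanced datasets.
Imbalanced data, in which one class substantially outnumbers the others, makes model training and accurate prediction difficult.
Existing methods often struggle to handle class imbalance effectively, which leads to biased models and poor generalization.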
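To make the bias concrete, here is a minimal sketch on made-up labels (the 95/5 split and the class names "A" and "B" are assumptions for illustration, not data from the study). A baseline that always predicts the majority class scores high accuracy while never finding a minority sample.

```python
from collections import Counter

# Hypothetical labels: 95 samples of class "A" and 5 of class "B".
labels = ["A"] * 95 + ["B"] * 5

counts = Counter(labels)
majority_class, majority_count = counts.most_common(1)[0]
# Ratio of the largest class to the smallest class.
imbalance_ratio = majority_count / min(counts.values())

# A baseline that ignores the features and always predicts the majority class.
predictions = [majority_class] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, predictions)) / len(labels)
# Share of true "B" samples that the baseline labels as "B".
minority_recall = sum(t == p == "B" for t, p in zip(labels, predictions)) / counts["B"]

print(imbalance_ratio, accuracy, minority_recall)
```

The accuracy looks strong only because the majority class dominates the count, which is why accuracy alone is a weak yardstick for this kind of data.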

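Study objectives:

Develop and evaluate a new hybrid algorithm to improve classification accuracy on imbalanced datasets.
Address the limitations of traditional machine learning algorithms when handling datasets with uneven class distributions.
Improve key performance metrics such as precision, recall, F1 score, and area under the curve (AUC).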
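The metrics named above can be sketched in plain Python for the binary case. The function names and the toy predictions are assumptions for illustration; the summary does not say how the study computed them, and a multi-class dataset such as dry beans would additionally need per-class averaging.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int],
                        positive: int = 1) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predictions: 95 majority (0) and 5 minority (1) samples;
# the classifier finds 4 of the 5 minority samples and raises 2 false alarms.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 93 + [1] * 2 + [1] * 4 + [0] * 1
precision, recall, f1 = precision_recall_f1(y_true, y_pred)

# Hypothetical scores for four samples, used only to exercise auc().
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Precision and recall are computed for the minority class here because that is the class an imbalanced model tends to neglect.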

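Main methods:

The proposed algorithm integrates the Borderline Synthetic Minority Oversampling Technique (BLSMOTE) with K-means clustering.
BLSMOTE generates synthetic samples at the boundary of the minority class to mitigate noise and improve class representation.
K-means clusters data points by similarity, which further aids data partitioning and model training.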
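The two building blocks can be sketched in plain Python as follows. This is not the authors' implementation: the summary does not say how the oversampled data and the clusters are passed to the classifier, so the sketch only oversamples and then clusters. The toy points, the single neighbour count `k` used for both steps, and all function names are assumptions.

```python
import math
import random


def borderline_smote(X, y, minority=1, k=3, n_new=3, seed=0):
    """Create n_new synthetic minority points near the class boundary."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    danger = []
    for i in min_idx:
        rest = [j for j in range(len(X)) if j != i]
        nearest = sorted(rest, key=lambda j: math.dist(X[i], X[j]))[:k]
        n_major = sum(1 for j in nearest if y[j] != minority)
        # "Danger" points: at least half, but not all, neighbours are majority.
        if k / 2 <= n_major < k:
            danger.append(i)
    synthetic = []
    if not danger or len(min_idx) < 2:
        return synthetic
    for _ in range(n_new):
        i = rng.choice(danger)
        peers = sorted((j for j in min_idx if j != i),
                       key=lambda j: math.dist(X[i], X[j]))[:k]
        j = rng.choice(peers)
        gap = rng.random()  # position along the segment from X[i] to X[j]
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
    return synthetic


def kmeans(points, n_clusters=2, n_iter=20, seed=0):
    """Plain Lloyd iterations; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_clusters)
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(n_clusters), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers


# Hypothetical 2-D data: 8 majority points (0) and 5 minority points (1).
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 2), (2, 0), (1, 2), (2, 1),
     (2, 2), (2.5, 2.5), (3, 3), (3, 2.5), (2.5, 3)]
y = [0] * 8 + [1] * 5

synthetic = borderline_smote(X, y, n_new=y.count(0) - y.count(1))
balanced_X = X + synthetic
balanced_y = y + [1] * len(synthetic)
cluster_labels, centers = kmeans(balanced_X, n_clusters=2)
```

In this toy set only the minority point (2, 2) sits next to majority points, so every synthetic sample is interpolated from it toward a nearby minority neighbour. A classifier such as an SVM, decision tree, or random forest would then be trained on the balanced data.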

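Main results:

Compared with traditional methods, the BLSMOTE + K-means + SVM algorithm showed superior classification performance on the dry bean and obesity level datasets.
BLSMOTE + K-means + DT successfully generated decision rules for both datasets, providing interpretable insights.
BLSMOTE + K-means + RF effectively ranked the importance of the explanatory variables, providing useful information for feature selection.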

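Conclusions:

The proposed BLSMOTE + K-means hybrid method offers a robust solution for improving machine learning classification on imbalanced data.
The approach improves overall prediction accuracy and provides insights through decision rules and variable importance rankings.
These findings provide scientific evidence to support decision-making in fields that deal with imbalanced datasets.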