Optimizing high dimensional data classification with a hybrid AI driven feature selection framework and machine learning schema
Summary
This summary is machine-generated. Feature selection (FS) can improve classification accuracy while reducing model complexity and training time. In this study, the TMGWO hybrid algorithm outperformed the other methods tested at identifying key features and improving classification outcomes.
Area Of Science
- Machine Learning
- Data Science
- Bioinformatics
Background
- Feature selection (FS) is crucial for high-dimensional datasets to improve classification accuracy.
- FS minimizes model complexity, reduces training time, and enhances generalization.
- The curse of dimensionality necessitates effective feature selection strategies.
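One common symptom of the curse of dimensionality is distance concentration: as dimensions are added, the nearest and farthest neighbors of a point become almost equally far away, which weakens distance-based classifiers such as KNN. The sketch below is an illustration on uniform random points, not data or code from the study; the function name `relative_contrast` and all parameter values are assumptions chosen for the example.

```python
import math
import random

def relative_contrast(n_points, n_dims, rng):
    """Return (max distance - min distance) / min distance from a random query point."""
    points = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    query = [rng.random() for _ in range(n_dims)]
    dists = [math.dist(query, p) for p in points]
    return (max(dists) - min(dists)) / min(dists)

rng = random.Random(0)  # fixed seed so the run is repeatable
contrast_by_dim = {d: relative_contrast(200, d, rng) for d in (2, 20, 200, 2000)}

for d, c in contrast_by_dim.items():
    print(f"{d:>5} dims: relative contrast = {c:.2f}")
```

The printed contrast should shrink as the dimension grows, which is one reason removing irrelevant features can help a classifier.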
Purpose Of The Study
- To evaluate and compare various classification algorithms with and without feature selection.
- To introduce and assess novel hybrid algorithms for enhanced feature identification.
- To demonstrate the impact of feature selection on classification performance metrics.
Main Methods
- Experiments were conducted on the Wisconsin Breast Cancer Diagnostic, Sonar, and Differentiated Thyroid Cancer datasets.
- Evaluated standard classifiers: K-Nearest Neighbors (KNN), Random Forest (RF), Multi-Layer Perceptron (MLP), Logistic Regression (LR), and Support Vector Machines (SVM).
- Introduced and tested hybrid algorithms: TMGWO (Two-phase Mutation Grey Wolf Optimization), ISSA (Improved Salp Swarm Algorithm), and BBPSO (Binary Black Particle Swarm Optimization).
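To make the idea of wrapper-based feature selection with a swarm optimizer concrete, here is a minimal sketch of a plain binary grey wolf optimizer. It is not the study's TMGWO: it has no two-phase mutation, it runs on synthetic data rather than the datasets listed above, and it scores feature subsets with a leave-one-out 1-nearest-neighbor classifier. Every function name, the sigmoid transfer function, and the weight `alpha` are assumptions made for illustration.

```python
import math
import random

def make_data(rng, n_samples=40, n_informative=2, n_noise=8):
    """Synthetic two-class data: the first columns carry signal, the rest are noise."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        signal = [rng.gauss(3.0 * label, 1.0) for _ in range(n_informative)]
        noise = [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
        X.append(signal + noise)
        y.append(label)
    return X, y

def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier on the kept features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: sum((xi[c] - X[j][c]) ** 2 for c in cols))
        correct += y[nearest] == y[i]
    return correct / len(X)

def fitness(X, y, mask, alpha=0.99):
    """Wrapper fitness: mostly accuracy, plus a small reward for fewer features."""
    acc = loo_1nn_accuracy(X, y, mask)
    return alpha * acc + (1 - alpha) * (1 - sum(mask) / len(mask))

def binary_gwo(X, y, rng, n_wolves=8, n_iter=15):
    """Simplified binary grey wolf search over feature masks (no mutation phases)."""
    n_feat = len(X[0])
    wolves = [[rng.random() < 0.5 for _ in range(n_feat)] for _ in range(n_wolves)]
    best_mask, best_fit = None, -1.0
    for t in range(n_iter + 1):
        scored = sorted(((fitness(X, y, w), w) for w in wolves),
                        key=lambda fw: fw[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best_mask = scored[0][0], list(scored[0][1])
        if t == n_iter:
            break  # the last pass only scores the final pack
        leaders = [w for _, w in scored[:3]]  # alpha, beta, and delta wolves
        a = 2.0 * (1 - t / n_iter)  # exploration coefficient, decays from 2 towards 0
        next_pack = []
        for wolf in wolves:
            new_wolf = []
            for d in range(n_feat):
                pos = 0.0
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    pos += (leader[d] - A * abs(C * leader[d] - wolf[d])) / 3
                # Sigmoid transfer turns the continuous position into a bit probability.
                prob = 1 / (1 + math.exp(-10 * (pos - 0.5)))
                new_wolf.append(rng.random() < prob)
            next_pack.append(new_wolf)
        wolves = next_pack
    return best_mask, best_fit

rng = random.Random(42)
X, y = make_data(rng)
best_mask, best_fit = binary_gwo(X, y, rng)
selected = [j for j, keep in enumerate(best_mask) if keep]
print("selected feature indices:", selected)
print("fitness:", round(best_fit, 3))
```

In this toy setup only feature indices 0 and 1 carry signal, so a useful search would tend to keep those and drop most of the noise columns. A real experiment would swap in one of the listed classifiers and a held-out test split.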
Main Results
- The TMGWO hybrid approach achieved superior results in both feature selection and classification accuracy.
- Comparative analysis showed significant improvements in accuracy, precision, and recall with FS.
- TMGWO outperformed other experimental methods in identifying significant features for classification.
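For reference, the three metrics named above can be computed from the counts of true and false positives and negatives. The sketch below uses made-up labels, not results from the study; the function name `classification_metrics` and the choice of `1` as the positive class are assumptions.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted or present.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical labels: 3 true positives, 2 false positives, 1 false negative, 2 true negatives.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.625, precision 0.6, recall 0.75
```

Precision and recall can move in different directions, so reporting both alongside accuracy gives a fuller picture than accuracy alone, especially on imbalanced medical datasets.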
Conclusions
- Hybrid feature selection algorithms, particularly TMGWO, offer significant advantages for classification tasks.
- Effective feature selection is vital for improving model performance and avoiding the curse of dimensionality.
- The study highlights the importance of advanced FS techniques in machine learning applications.

