Feature selection for descriptor based classification models. 1. Theory and GA-SEC algorithm.

Jörg K Wegner¹, Holger Fröhlich, Andreas Zell

  • ¹ Zentrum für Bioinformatik Tübingen (ZBIT), Universität Tübingen, Sand 1, D-72076 Tübingen, Germany. wegnerj@informatik.uni-tuebingen.de

Journal of Chemical Information and Computer Sciences | May 25, 2004
Summary
This summary is machine-generated.

This study explores feature selection for molecular data classification models, focusing on improving model quality and preventing overfitting. A modified Genetic Algorithm based on Shannon Entropy Cliques (GA-SEC) with boosting is presented.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Machine Learning

Background:

  • Classification models are crucial for analyzing molecular data.
  • Feature selection is vital for improving model performance and generalization.
  • Overfitting remains a significant challenge in molecular data analysis.

Purpose of the Study:

  • To investigate feature selection methods for molecular data classification.
  • To enhance model quality and mitigate overfitting.
  • To present an improved Genetic Algorithm for classification tasks.

Main Methods:

  • Exploration of standard feature selection approaches.
  • Presentation of modifications to the Genetic Algorithm based on Shannon Entropy Cliques (GA-SEC).
  • Extension of GA-SEC for classification using boosting techniques.
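To make the ingredients named above more concrete, here is a minimal sketch of two of them: the Shannon entropy of a single descriptor, and a genetic algorithm that searches over feature subsets. It is not the GA-SEC algorithm from the paper. The equal-width binning, the toy fitness (entropy minus a per-feature penalty), and all names (`shannon_entropy`, `entropy_fitness`, `select_features`, `penalty`, `mutation_rate`) are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def shannon_entropy(values, bins=4):
    """Shannon entropy (in bits) of one descriptor after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant descriptor carries no information
    width = (hi - lo) / bins
    # Map each value to a bin index; clamp the maximum into the last bin.
    idx = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(idx).values())


def entropy_fitness(columns, penalty=0.5):
    """Build a toy fitness: reward high-entropy descriptors, penalize subset size.

    Hypothetical scoring for illustration; the paper's fitness is not shown here.
    """
    entropies = [shannon_entropy(col) for col in columns]

    def fitness(mask):
        return sum(h - penalty for h, keep in zip(entropies, mask) if keep)

    return fitness


def select_features(n_features, fitness, generations=30, pop_size=20,
                    mutation_rate=0.1, seed=0):
    """Toy genetic algorithm over boolean feature masks (needs n_features >= 2)."""
    rng = random.Random(seed)
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by fitness and keep the better half unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

A call such as `select_features(len(columns), entropy_fitness(columns))` returns one boolean mask per descriptor column. In a classification setting the fitness would more plausibly be a cross-validated model score, which is where a boosting-based classifier could be plugged in.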

Main Results:

  • Demonstration of improved model quality through effective feature selection.
  • Evidence of reduced variance and less overfitting when models are applied to unseen data.
  • Validation of the GA-SEC algorithm's efficacy in classification.
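One common way to quantify claims like these is to compare training and held-out accuracy across cross-validation folds. The sketch below assumes k-fold cross-validation, which the summary does not state was the evaluation protocol; the function names `kfold_indices` and `generalization_summary` are hypothetical helpers, not part of the paper.

```python
import statistics


def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def generalization_summary(train_scores, test_scores):
    """Summarize per-fold accuracies.

    mean_gap is the average of (training score - held-out score); a large
    positive gap suggests overfitting. test_variance measures how much the
    held-out score fluctuates between folds.
    """
    gaps = [tr - te for tr, te in zip(train_scores, test_scores)]
    return {
        "mean_test": statistics.mean(test_scores),
        "test_variance": statistics.pvariance(test_scores),
        "mean_gap": statistics.mean(gaps),
    }
```

Running the same summary with and without feature selection would show whether the selected subset narrows the gap and lowers the fold-to-fold variance.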

Conclusions:

  • Feature selection significantly impacts molecular data classification model performance.
  • The proposed GA-SEC algorithm with boosting offers a robust solution for enhancing classification accuracy and generalization.
  • This approach aids in building more reliable predictive models from complex biological datasets.