Improved feature-selection method considering the imbalance problem in text categorization.

Jieming Yang¹, Zhaoyang Qu¹, Zhiying Liu¹

¹College of Information Engineering, Northeast Dianli University, Jilin, Jilin 132012, China.

The Scientific World Journal | June 28, 2014

Summary

This summary is machine-generated.

This study introduces a new scheme to improve feature selection algorithms for text categorization by addressing dataset imbalance. The enhanced methods significantly boost performance in text classification tasks.

Area of Science:

Computer Science
Artificial Intelligence
Natural Language Processing

Background:

Filter-based feature-selection algorithms are crucial for dimensionality reduction in text categorization.
Existing algorithms often overlook dataset imbalance, negatively impacting performance.
Dataset imbalance is a significant challenge in text classification.
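
To make the imbalance problem concrete, here is a minimal sketch, assuming a plain Document Frequency filter with no correction of any kind. The toy corpus, the category names, and the helper names `document_frequency` and `top_k_terms` are hypothetical and chosen only for illustration; they do not come from the study.

```python
from collections import Counter

# Hypothetical toy corpus: (category, tokens).
# "sports" has four documents, "finance" has only one.
corpus = [
    ("sports", ["match", "team", "goal"]),
    ("sports", ["team", "coach", "match"]),
    ("sports", ["goal", "team", "league"]),
    ("sports", ["match", "league", "team"]),
    ("finance", ["stock", "bond", "market"]),
]


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for _, tokens in docs:
        df.update(set(tokens))  # set() so a term counts once per document
    return df


def top_k_terms(docs, k):
    """Keep the k terms with the highest document frequency.

    Ties are broken alphabetically so the result is deterministic.
    """
    df = document_frequency(docs)
    ranked = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


selected = top_k_terms(corpus, 3)
print(selected)
```

In this toy setting every selected term comes from the larger category, so the minority category keeps no features at all. That is the kind of effect the study sets out to weaken; how the study's scheme achieves this is not reproduced here.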

Purpose of the Study:

To propose a novel scheme that mitigates the adverse effects of dataset imbalance on feature selection.
To enhance the performance of established feature selection methods in imbalanced text data.

Main Methods:

Proposed a new scheme to address dataset imbalance in feature selection.
Evaluated nine improved feature-selection methods: Information Gain, Chi statistic, Document Frequency, Orthogonal Centroid Feature Selection, DIA association factor, Comprehensive Measurement Feature Selection, Deviation from Poisson Feature Selection, improved Gini index, and Mutual Information.
Utilized naïve Bayes and support vector machine classifiers on three benchmark datasets: 20-Newsgroups, Reuters-21578, and WebKB.
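
As a rough illustration of what a filter method in this list computes, the sketch below scores a term against a category with the standard, unimproved Chi statistic from a 2x2 contingency table. It is not the improved scheme evaluated in the study; the toy corpus and the function name `chi_square` are assumptions made for the example.

```python
# Hypothetical toy corpus: (category, tokens), deliberately imbalanced.
corpus = [
    ("sports", ["match", "team", "goal"]),
    ("sports", ["team", "coach", "match"]),
    ("sports", ["goal", "team", "league"]),
    ("sports", ["match", "league", "team"]),
    ("finance", ["stock", "bond", "market"]),
]


def chi_square(docs, term, category):
    """Standard chi-square score of a term for one category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    chi2 = N * (a*d - c*b)^2 / ((a+c) * (b+d) * (a+b) * (c+d))
    """
    a = b = c = d = 0
    for label, tokens in docs:
        has_term = term in tokens
        if label == category:
            if has_term:
                a += 1
            else:
                c += 1
        else:
            if has_term:
                b += 1
            else:
                d += 1
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        return 0.0  # term or category is absent or universal; no signal
    return n * (a * d - c * b) ** 2 / denominator


score_stock = chi_square(corpus, "stock", "finance")
score_match = chi_square(corpus, "match", "sports")
print(score_stock, score_match)
```

A filter method ranks all terms by a score like this and keeps the highest-scoring ones before training a classifier such as naïve Bayes or a support vector machine.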

Main Results:

The improved scheme significantly enhanced the performance of the evaluated feature-selection methods.
The proposed approach effectively weakens the adverse impact of dataset imbalance.
Demonstrated superior performance of enhanced algorithms on benchmark document collections.

Conclusions:

The developed scheme offers a robust solution for handling imbalanced datasets in text categorization.
The improved feature selection methods show substantial gains in text classification accuracy.
This work contributes to more effective dimensionality reduction techniques for imbalanced text data.