Improved feature-selection method considering the imbalance problem in text categorization.

Jieming Yang¹, Zhaoyang Qu¹, Zhiying Liu¹

  • ¹College of Information Engineering, Northeast Dianli University, Jilin, Jilin 132012, China.

TheScientificWorldJournal | June 28, 2014
Summary
This summary is machine-generated.

This study introduces a new scheme to improve feature selection algorithms for text categorization by addressing dataset imbalance. The enhanced methods significantly boost performance in text classification tasks.

Area of Science:

  • Computer Science
  • Artificial Intelligence
  • Natural Language Processing

Background:

  • Filter-based feature-selection algorithms are a key approach to dimensionality reduction in text categorization.
  • Existing algorithms often overlook dataset imbalance, which degrades their performance.
  • Dataset imbalance is a significant challenge in text classification.
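
To make the imbalance problem concrete, here is a minimal Python sketch. It is an illustration only and is not the scheme proposed in the paper: the toy corpus, the labels, and the helper names `raw_df` and `normalized_df` are all assumptions. It compares a raw per-class document frequency with one divided by class size, and shows how raw counts can hide a term that is characteristic of a small class.

```python
from collections import Counter

# Hypothetical toy corpus of (label, terms) pairs.
# "sports" has 8 documents, "finance" has only 2.
docs = (
    [("sports", {"match", "team"})] * 6
    + [("sports", {"team", "market"})] * 2
    + [("finance", {"market", "bond"})] * 2
)

def raw_df(corpus, term):
    """Count the documents in each class that contain the term."""
    counts = Counter()
    for label, terms in corpus:
        if term in terms:
            counts[label] += 1
    return counts

def normalized_df(corpus, term):
    """Divide each per-class count by the size of that class."""
    sizes = Counter(label for label, _ in corpus)
    counts = raw_df(corpus, term)
    return {label: counts[label] / sizes[label] for label in sizes}

raw = raw_df(docs, "market")
rate = normalized_df(docs, "market")
print(dict(raw))  # raw counts are tied between the two classes
print(rate)       # per-class rates separate the two classes
```

In this toy data, "market" appears in two documents of each class, so the raw counts cannot tell the classes apart. The per-class rates show that it occurs in every "finance" document but in only a quarter of the "sports" documents.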

Purpose of the Study:

  • To propose a novel scheme that mitigates the adverse effects of dataset imbalance on feature selection.
  • To enhance the performance of established feature selection methods in imbalanced text data.

Main Methods:

  • Proposed a new scheme to address dataset imbalance in feature selection.
  • Evaluated improved versions of nine feature-selection methods: Information Gain, chi-square (χ²) statistic, Document Frequency, Orthogonal Centroid Feature Selection, DIA association factor, Comprehensive Measurement Feature Selection, Deviation from Poisson Feature Selection, improved Gini index, and Mutual Information.
  • Used naïve Bayes and support vector machine (SVM) classifiers on three benchmark datasets: 20-Newsgroups, Reuters-21578, and WebKB.
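
As a rough illustration of how a filter method scores terms, the sketch below computes the standard, unmodified chi-square statistic for each term against one target class. It does not reproduce the paper's improved variants. The four-document corpus and the function names `chi_square` and `score_terms` are assumptions made for this example.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic from a 2x2 term/class contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # term or class is degenerate; no evidence either way
    return n * (a * d - c * b) ** 2 / denom

def score_terms(corpus, target):
    """Score every vocabulary term against the target class."""
    vocab = set().union(*(terms for _, terms in corpus))
    scores = {}
    for term in vocab:
        a = sum(1 for lab, ts in corpus if lab == target and term in ts)
        b = sum(1 for lab, ts in corpus if lab != target and term in ts)
        c = sum(1 for lab, ts in corpus if lab == target and term not in ts)
        d = sum(1 for lab, ts in corpus if lab != target and term not in ts)
        scores[term] = chi_square(a, b, c, d)
    return scores

# Hypothetical toy corpus of (label, terms) pairs.
corpus = [
    ("sports", {"match", "team"}),
    ("sports", {"team", "goal"}),
    ("finance", {"market", "bond"}),
    ("finance", {"market", "team"}),
]

scores = score_terms(corpus, "finance")
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0], scores[ranked[0]])
```

A filter method keeps the top-ranked terms and discards the rest before a classifier such as naïve Bayes or an SVM is trained. Here "market" ranks first because it occurs in every "finance" document and in no "sports" document.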

Main Results:

  • The improved scheme significantly enhanced the performance of the evaluated feature-selection methods.
  • The proposed approach effectively weakens the adverse impact of dataset imbalance.
  • Demonstrated superior performance of enhanced algorithms on benchmark document collections.

Conclusions:

  • The developed scheme offers a robust solution for handling imbalanced datasets in text categorization.
  • The improved feature selection methods show substantial gains in text classification accuracy.
  • This work contributes to more effective dimensionality reduction techniques for imbalanced text data.