Generalized Term Similarity for Feature Selection in Text Classification Using Quadratic Programming.

Hyunki Lim¹, Dae-Won Kim²

¹Image and Media Research Center, Korea Institute of Science and Technology, 5 Hwarang-Ro 14-gil, Seongbuk-Gu, Seoul 02792, Korea.

Entropy (Basel, Switzerland)

December 8, 2020

Summary

This summary is machine-generated.

This study introduces a novel feature selection method for text classification that incorporates term similarity to reduce redundancy. This approach enhances accuracy compared to traditional methods by balancing term ranking and similarity.

Keywords:

chi-square statistic, information gain, mutual information, quadratic programming, text categorization

Area of Science:

Computer Science
Information Science
Data Science

Background:

The proliferation of internet technologies has resulted in a massive increase in electronic documents.
Text categorization is crucial for managing and organizing big data from unstructured documents.
The bag-of-words model is a common, simple representation for text classification, but it leads to a large feature space.
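To make the feature-space point concrete, here is a minimal sketch of a bag-of-words representation. The function name `bag_of_words`, the whitespace tokenizer, and the two example sentences are illustrative assumptions, not part of the study. Every distinct term becomes one feature, so the number of columns grows with the vocabulary.

```python
from collections import Counter


def bag_of_words(docs):
    """Build a term-count matrix; one column per distinct term (hypothetical helper)."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {term: i for i, term in enumerate(vocab)}
    rows = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for term, count in Counter(toks).items():
            row[index[term]] = count
        rows.append(row)
    return vocab, rows


# Illustrative documents only.
vocab, rows = bag_of_words(["the cat sat", "the dog sat down"])
print(len(vocab), vocab)
```

Even these two short sentences yield five features; a real corpus typically yields many thousands, which is what motivates feature selection.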

Purpose of the Study:

To propose a new feature selection method for text categorization.
To address the issue of redundant terms in the bag-of-words model.
To improve the accuracy of text classification by considering term similarity.

Main Methods:

A novel feature selection method is proposed, incorporating term similarity alongside term ranking.
Term similarity is quantified using measures such as mutual information.
A quadratic programming-based numerical optimization approach is used to balance term ranking and similarity.
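The steps above can be sketched as follows. This is not the authors' formulation, which the summary does not give. It is a generic illustration in the spirit of quadratic-programming feature selection: minimize `0.5 * x^T Q x - c^T x` over nonnegative weights `x` that sum to 1, where `Q` penalizes pairs of similar terms and `c` rewards highly ranked terms. The names `mutual_information`, `project_to_simplex`, `qp_term_weights`, the trade-off parameter `alpha`, the projected-gradient solver, and the toy numbers are all assumptions made for the example.

```python
import numpy as np


def mutual_information(a, b):
    """Mutual information (in nats) between two binary term-presence vectors."""
    a, b = np.asarray(a), np.asarray(b)
    mi = 0.0
    for va in (0, 1):
        for vb in (0, 1):
            p_ab = np.mean((a == va) & (b == vb))
            p_a, p_b = np.mean(a == va), np.mean(b == vb)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi


def project_to_simplex(v):
    """Euclidean projection onto {x : x >= 0, sum(x) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def qp_term_weights(relevance, similarity, alpha=0.5, steps=2000):
    """Projected gradient descent on 0.5 * x^T Q x - c^T x over the simplex.

    Q = (1 - alpha) * similarity penalizes redundancy;
    c = alpha * relevance rewards term ranking scores.
    If the similarity matrix is not positive semidefinite, this only
    reaches a stationary point, not necessarily the global minimum.
    """
    relevance = np.asarray(relevance, dtype=float)
    Q = (1.0 - alpha) * np.asarray(similarity, dtype=float)
    c = alpha * relevance
    step = 1.0 / (np.linalg.norm(Q, 2) + 1e-12)  # 1 / Lipschitz constant
    x = np.full(len(relevance), 1.0 / len(relevance))
    for _ in range(steps):
        x = project_to_simplex(x - step * (Q @ x - c))
    return x


# Toy input (made up): terms 0 and 1 are highly similar, term 2 is unrelated.
relevance = [1.0, 1.0, 0.8]
similarity = [[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.0],
              [0.0, 0.0, 1.0]]
weights = qp_term_weights(relevance, similarity)
print(weights)
```

In this toy case the two near-duplicate terms share their weight, so the slightly less relevant but non-redundant third term ends up with the largest weight. In practice the similarity matrix could be filled with `mutual_information` values computed on term-presence columns, and the top-weighted terms kept as features.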

Main Results:

The proposed feature selection method effectively reduces redundant terms.
Experimental results show higher accuracy compared to conventional feature selection methods.
The experiments indicate that considering term similarity during feature selection is effective.

Conclusions:

The developed feature selection technique offers improved performance in text categorization.
Balancing term ranking and term similarity is key to enhancing classification accuracy.
This method provides a more efficient way to manage and classify large volumes of text data.