



Regularized F-measure maximization for feature selection and classification.

Zhenqiu Liu¹, Ming Tan, Feng Jiang

¹Division of Biostatistics, University of Maryland Greenebaum Cancer Center, Baltimore, MD 21201, USA. zliu@umm.edu

Journal of Biomedicine & Biotechnology

May 8, 2009

Summary

This summary is machine-generated.

This study introduces a novel regularized F-measure maximization method for classification, enhancing performance in unbalanced datasets. The approach integrates feature selection and prediction, offering a robust alternative for diagnostic tests and biological marker evaluation.


Area of Science:

Biostatistics
Machine Learning
Bioinformatics

Background:

Receiver Operating Characteristic (ROC) analysis is widely used to assess classification performance, particularly in medical diagnostics and biological marker evaluation.
ROC analysis and utility functions such as the F-measure are valuable when misclassification costs are unknown, as is common in real-world problems.
The F-measure combines precision and recall into a single global performance metric.
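
For reference, the standard textbook definitions of precision, recall, and the F-measure are given below. The symbols TP, FP, and FN (true-positive, false-positive, and false-negative counts) and the weight β are notation assumed here, not taken from the summary; the paper's own formulation may differ in detail.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_\beta = \frac{(1 + \beta^2)\, P\, R}{\beta^2 P + R}
```

With β = 1 this reduces to the harmonic mean of precision and recall, F_1 = 2PR / (P + R). Values of β above 1 weight recall more heavily, and values below 1 weight precision more heavily.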

Purpose of the Study:

To propose a novel method for classification performance assessment using regularized F-measure maximization.
To address challenges in highly unbalanced datasets and scenarios with missing labels for samples.
To integrate simultaneous feature selection and prediction within the proposed framework.

Main Methods:

A novel method based on regularized F-measure maximization is proposed.
The method incorporates differential costs for positive and negative samples.
Simultaneous feature selection and prediction are achieved using an L1 penalty.
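
The general idea behind these points can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: it assumes a logistic scoring model, replaces the hard confusion counts with "soft" counts so the F-measure becomes differentiable, uses β as a stand-in for differential costs on positive and negative samples, and applies the L1 penalty through a soft-thresholding (proximal gradient) step. The function names and all default parameter values are hypothetical.

```python
import numpy as np


def _sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def soft_f_beta(w, b, X, y, beta=1.0):
    """Differentiable surrogate of F-beta using predicted probabilities as soft counts.

    With p_i = sigmoid(x_i . w + b), soft TP = sum(y * p), and the F-beta
    denominator simplifies to beta^2 * (number of positives) + sum(p).
    """
    p = _sigmoid(X @ w + b)
    b2 = beta ** 2
    return (1 + b2) * np.sum(y * p) / (b2 * np.sum(y) + np.sum(p))


def fit_l1_soft_f(X, y, lam=0.01, beta=1.0, lr=0.5, n_iter=1000):
    """Maximize soft F-beta minus lam * ||w||_1 by proximal gradient ascent.

    Assumption: this optimizer is chosen for simplicity; it is not claimed
    to match the optimization scheme used in the paper.
    """
    n_features = X.shape[1]
    w = np.zeros(n_features)
    b = 0.0
    b2 = beta ** 2
    for _ in range(n_iter):
        p = _sigmoid(X @ w + b)
        A = np.sum(y * p)                    # soft true positives
        D = b2 * np.sum(y) + np.sum(p)       # soft F-beta denominator
        dF_dp = (1 + b2) * (y / D - A / D ** 2)
        dF_dz = dF_dp * p * (1 - p)          # chain rule through the sigmoid
        # Gradient ascent step on the smooth part of the objective.
        w = w + lr * (X.T @ dF_dz)
        b = b + lr * np.sum(dF_dz)
        # Soft-thresholding handles the L1 penalty and can set weights exactly to zero.
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
    return w, b
```

Features whose weights end at exactly zero are effectively dropped from the model, which is how an L1 penalty combines feature selection with prediction in a single fit. Increasing `lam` yields sparser solutions at the possible expense of the F-measure.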

Main Results:

Experimental results on benchmark, methylation, and high-dimensional microarray data are presented.
The proposed algorithm performs as well as or better than other popular classifiers in the limited experiments reported.
The method is particularly effective for highly unbalanced datasets and datasets with missing negative or positive sample labels.
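
A small example may help show why an F-measure objective matters for highly unbalanced data. The sketch below uses made-up labels (10 positives among 1,000 samples, an assumption for illustration only) and a trivial classifier that always predicts the negative class; the function name is hypothetical.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # F1 is defined as 0 when there are no true positives.
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, f1


# Illustrative, made-up data: 10 positives and 990 negatives.
y_true = [1] * 10 + [0] * 990
always_negative = [0] * 1000

print(accuracy_and_f1(y_true, always_negative))  # (0.99, 0.0)
```

The always-negative classifier reaches 99% accuracy while detecting no positive sample, so its F1 is zero. A criterion built on precision and recall does not reward this behavior, which is consistent with the emphasis on unbalanced datasets above.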

Conclusions:

The proposed regularized F-measure maximization method offers an effective approach for classification, especially in challenging data scenarios.
The integration of feature selection and prediction enhances its utility for complex biological and medical data.
This method provides a valuable tool for evaluating diagnostic tests and biological markers.