Updated: Aug 5, 2025

Efficient Selection of Gaussian Kernel SVM Parameters for Imbalanced Data.

Chen-An Tsai¹, Yu-Jing Chang¹

¹Division of Biometry, Department of Agronomy, National Taiwan University, Taipei 106216, Taiwan.

March 29, 2023

Summary

This summary is machine-generated.

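This study introduces two methods for Support Vector Machine (SVM) classification on imbalanced medical data: b-SVM, which adjusts the SVM cutoff threshold, and Min-max gamma selection, which optimizes SVM parameters without extensive k-fold cross-validation. On simulated and real imbalanced datasets, the methods outperformed standard SVM and over-sampling techniques, and Min-max gamma selection ran at least 10 times faster than cross-validation-based selection.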

Keywords:

imbalanced datasets; parameter selection; support vector machine; threshold adjustment

Area of Science:

Bioinformatics
Machine Learning
Computational Biology

Background:

Medical data mining frequently uses class prediction models for data classification.
High-dimensional gene expression datasets present challenges due to skewed class distributions, leading to poor minority class prediction.
Support Vector Machines (SVMs) are powerful classifiers but require careful parameter optimization, especially with imbalanced data.
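
To make the minority-class problem concrete, the short sketch below scores a trivial classifier that always predicts the majority class. The 95:5 class split and all function names are hypothetical and chosen only for illustration; they are not taken from the study. The sketch shows why overall accuracy can look good while sensitivity and the geometric mean (G-mean) of sensitivity and specificity collapse to zero.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, tn, fp), treating `positive` as the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp, fn, tn, fp


def summarize(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and G-mean for a binary problem."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": (sensitivity * specificity) ** 0.5,
    }


# Hypothetical imbalanced sample: 95 majority (0) and 5 minority (1) cases.
y_true = [0] * 95 + [1] * 5
always_majority = [0] * len(y_true)  # a classifier that ignores the minority

report = summarize(y_true, always_majority)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

Here the accuracy is 0.95 even though no minority case is detected, which is why imbalance-aware measures are generally preferred when comparing classifiers on skewed data.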

Purpose of the Study:

To address class imbalance and parameter selection in SVM classifiers for binary classification problems.
To improve the generalization ability and classification performance of SVMs on imbalanced medical datasets.
To develop novel methods for adjusting SVM cutoff thresholds and optimizing SVM parameters efficiently.
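
For context on the parameter being optimized: the Gaussian (RBF) kernel is commonly written as K(x, z) = exp(-gamma * ||x - z||^2), so gamma sets how quickly similarity decays with distance. The sketch below is a minimal illustration with hypothetical points and gamma values, not the selection rule proposed in the study.

```python
import math


def gaussian_kernel(x, z, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Two hypothetical points whose squared distance is 2.
x = [0.0, 0.0]
z = [1.0, 1.0]

# Hypothetical gamma values spanning several orders of magnitude.
kernel_by_gamma = {g: gaussian_kernel(x, z, g) for g in (0.01, 0.1, 1.0, 10.0)}

for gamma, value in kernel_by_gamma.items():
    print(f"gamma={gamma:<5} K(x, z)={value:.6f}")
```

With a very small gamma, every pair of points looks almost equally similar. With a very large gamma, each point is similar only to itself. Both extremes tend to hurt generalization, which is why the choice of gamma matters.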

Main Methods:

Proposed a novel adjustment method, b-SVM, for modifying the SVM cutoff threshold.
Introduced Min-max gamma selection, a fast approach for optimizing SVM model parameters without extensive k-fold cross-validation.
Evaluated algorithms using simulated and real imbalanced medical datasets, comparing against standard SVM and existing methods.
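
The summary does not spell out the b-SVM adjustment rule, so the sketch below is only a generic stand-in for the idea of moving an SVM cutoff threshold. It assumes decision values f(x) are already available from a trained SVM and picks the cutoff that maximizes the G-mean. The scores, labels, and function names are hypothetical. Predicting the minority class when f(x) >= c is equivalent to replacing the bias b with b - c.

```python
def g_mean_at(scores, labels, cutoff):
    """G-mean when predicting the minority class (+1) for score >= cutoff."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == -1 and s < cutoff)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    return ((tp / n_pos) * (tn / n_neg)) ** 0.5


def best_cutoff(scores, labels):
    """Search the default cutoff 0 and midpoints between sorted scores."""
    distinct = sorted(set(scores))
    candidates = [0.0] + [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(candidates, key=lambda c: g_mean_at(scores, labels, c))


# Hypothetical SVM decision values: 8 majority (-1) and 3 minority (+1) cases.
# Most minority scores fall below 0, as often happens with imbalanced training.
scores = [-2.1, -1.7, -1.5, -1.2, -1.0, -0.9, -0.8, -0.4, -0.6, -0.3, 0.2]
labels = [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1]

default_g = g_mean_at(scores, labels, 0.0)
cutoff = best_cutoff(scores, labels)
adjusted_g = g_mean_at(scores, labels, cutoff)

print(f"default cutoff 0.0 -> G-mean {default_g:.3f}")
print(f"adjusted cutoff {cutoff:.2f} -> G-mean {adjusted_g:.3f}")
```

In this toy data the default cutoff detects one of three minority cases, while the shifted cutoff detects all three at the cost of one false positive. In practice the cutoff would be chosen on data held out from training, since tuning it on the training scores risks overfitting.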

Main Results:

The proposed b-SVM and Min-max gamma selection algorithms demonstrated superior performance compared to standard SVM and over-sampling techniques.
Based on average running times across six real datasets, Min-max gamma selection was at least 10 times faster than cross-validation-based selection.
The developed methods effectively improved classification accuracy for minority classes in imbalanced datasets.

Conclusions:

The novel b-SVM and Min-max gamma selection methods offer significant improvements for SVM classification on imbalanced medical data.
Min-max gamma selection provides a computationally efficient alternative for SVM parameter optimization.
These algorithms hold promise for enhancing the identification of marker genes and improving cancer cell classification.