Efficient Selection of Gaussian Kernel SVM Parameters for Imbalanced Data.

Chen-An Tsai¹, Yu-Jing Chang¹

  • ¹ Division of Biometry, Department of Agronomy, National Taiwan University, Taipei 106216, Taiwan.

Genes | March 29, 2023
Summary
This summary is machine-generated.

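This study introduces two methods to improve Support Vector Machine (SVM) classification on imbalanced medical data: b-SVM, which adjusts the SVM cutoff threshold, and Min-max gamma selection, which optimizes SVM parameters without extensive k-fold cross-validation. Both improved classification of the minority class, and Min-max gamma selection ran at least 10 times faster than cross-validation selection on average across six real datasets.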

Keywords:
imbalanced datasets; parameter selection; support vector machine; threshold adjustment

Area of Science:

  • Bioinformatics
  • Machine Learning
  • Computational Biology

Background:

  • Medical data mining frequently uses class prediction models for data classification.
  • High-dimensional gene expression datasets present challenges due to skewed class distributions, leading to poor minority class prediction.
  • Support Vector Machines (SVMs) are powerful classifiers but require careful parameter optimization, especially with imbalanced data.
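
To make the role of the kernel parameter concrete, the following is a minimal Python sketch of the Gaussian (RBF) kernel in its standard form, k(x, y) = exp(-gamma * ||x - y||^2). The two vectors and the gamma values are made-up illustrations, not data or settings from the study.

```python
import math

def gaussian_kernel(x, y, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Two hypothetical feature vectors (illustrative numbers only).
x = [1.0, 2.0, 0.5]
y = [1.5, 1.0, 0.0]

# A small gamma makes distinct points look similar (kernel near 1);
# a large gamma makes them look unrelated (kernel near 0).
for gamma in (0.01, 1.0, 100.0):
    print(gamma, gaussian_kernel(x, y, gamma))
```

Because the kernel value swings from nearly 1 to nearly 0 as gamma grows, a poorly chosen gamma can underfit or overfit, which is why gamma selection matters for the classifier.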

Purpose of the Study:

  • To address class imbalance and parameter selection in SVM classifiers for binary classification problems.
  • To improve the generalization ability and classification performance of SVMs on imbalanced medical datasets.
  • To develop novel methods for adjusting SVM cutoff thresholds and optimizing SVM parameters efficiently.

Main Methods:

  • Proposed a novel adjustment method, b-SVM, for modifying the SVM cutoff threshold.
  • Introduced Min-max gamma selection, a fast approach for optimizing SVM model parameters without extensive k-fold cross-validation.
  • Evaluated algorithms using simulated and real imbalanced medical datasets, comparing against standard SVM and existing methods.
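
The general idea of adjusting an SVM cutoff threshold can be sketched as follows. This is not the authors' b-SVM, whose exact rule is not given in this summary; it is a generic illustration that scans candidate cutoffs on the decision values f(x) and keeps the one with the highest G-mean (the geometric mean of sensitivity and specificity). The scores, the labels, and the choice of G-mean as the criterion are all assumptions made for the example.

```python
def predict_with_cutoff(scores, cutoff):
    """Label a sample +1 if its decision value exceeds the cutoff, else -1."""
    return [1 if s > cutoff else -1 for s in scores]

def g_mean(labels, preds):
    """Geometric mean of sensitivity and specificity (labels are +1 / -1)."""
    pairs = list(zip(labels, preds))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return (sens * spec) ** 0.5

def best_cutoff(scores, labels):
    """Try the default cutoff 0 and every midpoint between sorted scores."""
    s = sorted(set(scores))
    candidates = [0.0] + [(a + b) / 2 for a, b in zip(s, s[1:])]
    return max(candidates,
               key=lambda c: g_mean(labels, predict_with_cutoff(scores, c)))

# Hypothetical decision values: 3 minority (+1) and 7 majority (-1) samples.
# The boundary is biased toward the majority, so two positives score below 0.
labels = [1, 1, 1, -1, -1, -1, -1, -1, -1, -1]
scores = [0.4, -0.2, -0.3, -0.6, -0.8, -0.9, -1.1, -1.2, -1.5, -2.0]

cutoff = best_cutoff(scores, labels)
print("G-mean at cutoff 0:", g_mean(labels, predict_with_cutoff(scores, 0.0)))
print("chosen cutoff:", cutoff)
print("G-mean at chosen cutoff:",
      g_mean(labels, predict_with_cutoff(scores, cutoff)))
```

In this toy data the default cutoff of 0 misses two of the three minority samples, while a shifted cutoff recovers them. In practice the cutoff would be chosen on held-out data rather than on the samples used to evaluate it.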

Main Results:

  • The proposed b-SVM and Min-max gamma selection algorithms demonstrated superior performance compared to standard SVM and over-sampling techniques.
  • On average running time across six real datasets, Min-max gamma selection was at least 10 times faster than cross-validation selection.
  • The developed methods effectively improved classification accuracy for minority classes in imbalanced datasets.
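
Why minority-class accuracy needs to be reported separately can be seen from a small worked example. The sketch below computes overall accuracy, sensitivity (minority-class accuracy), and specificity from confusion-matrix counts. The counts are hypothetical and are not results from the study.

```python
def rates(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion counts.

    Assumes both classes are present, so no denominator is zero.
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # accuracy on the minority (positive) class
    specificity = tn / (tn + fp)   # accuracy on the majority (negative) class
    return accuracy, sensitivity, specificity

# Hypothetical dataset: 5 minority and 95 majority samples.
# Classifier A predicts "majority" for every sample.
always_majority = rates(tp=0, fn=5, tn=95, fp=0)

# Classifier B finds 4 of the 5 minority samples at the cost of 10 false alarms.
balanced = rates(tp=4, fn=1, tn=85, fp=10)

print(always_majority)  # (0.95, 0.0, 1.0)
print(balanced)
```

Classifier A has the higher overall accuracy yet never detects the minority class, so overall accuracy alone can hide exactly the failure that imbalanced-data methods aim to fix.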

Conclusions:

  • The novel b-SVM and Min-max gamma selection methods offer significant improvements for SVM classification on imbalanced medical data.
  • Min-max gamma selection provides a computationally efficient alternative for SVM parameter optimization.
  • These algorithms hold promise for enhancing the identification of marker genes and improving cancer cell classification.
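
For a rough sense of why cross-validation selection is costly, the sketch below counts the model fits needed by an exhaustive grid search with k-fold cross-validation. The grid sizes and fold count are hypothetical, and the summary does not state how many fits Min-max gamma selection requires, so this only illustrates the baseline cost.

```python
def cv_fit_count(n_c_values, n_gamma_values, k_folds):
    """Model fits for an exhaustive (C, gamma) grid with k-fold CV."""
    return n_c_values * n_gamma_values * k_folds

# Hypothetical setup: 10 values of C, 10 values of gamma, 5 folds.
n_fits = cv_fit_count(n_c_values=10, n_gamma_values=10, k_folds=5)
print(n_fits)  # prints 500
```

The count grows multiplicatively with each grid dimension and with the number of folds, so any selection rule that avoids repeated refitting can save substantial time.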