Optimizing Sample Size for Supervised Machine Learning with Bulk Transcriptomic Sequencing: A Learning Curve

Yunhui Qi¹,², Xinyi Wang¹,³, Li-Xuan Qin¹

¹Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, United States.

September 24, 2024

Summary

This summary is machine-generated.

Determining optimal sample size for transcriptomics studies is key for personalized medicine. This study introduces a novel computational method using data augmentation and learning curves to establish the power-versus-sample-size relationship for machine learning classification.

Keywords:

Bulk Sequencing, Machine Learning, Sample Size, Transcriptomics

Area of Science:

Bioinformatics
Computational Biology
Genomics

Background:

Accurate sample classification using transcriptomics data is vital for personalized medicine.
Current sample size calculation methods may not be suitable for supervised machine learning (ML) classification.
A methodological gap exists in determining adequate sample size for ML-based transcriptomics analysis.

Purpose of the Study:

To develop and evaluate a novel computational approach for establishing the power-versus-sample-size relationship in transcriptomics studies.
To address the limitations of existing methods for sample size determination in the context of ML classification.
To facilitate the use of ML in transcriptomics for personalized medicine.

Main Methods:

A novel computational approach that combines data augmentation with learning-curve fitting to establish the power-versus-sample-size relationship.
Comprehensive performance evaluation using microRNA and RNA sequencing data.
Consideration of diverse data characteristics and algorithm configurations.
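
The learning-curve step can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not state which curve family they fit, so the inverse power law acc(n) = a - b * n^(-c) used here is an assumption (it is a common choice for learning curves), classification accuracy stands in for "power", and the function names and pilot numbers are hypothetical. The data augmentation step is not shown.

```python
import math

# Illustrative sketch only. The inverse power law form and every name below
# are assumptions, not the published method or its API.


def fit_learning_curve(sizes, accuracies, c_grid=None):
    """Fit acc(n) = a - b * n**(-c) by grid search over the exponent c.

    For a fixed c the model is linear in (a, b), so ordinary least squares
    on the regressor x = n**(-c) gives a and b in closed form.
    """
    if c_grid is None:
        c_grid = [k / 100 for k in range(5, 201)]  # c from 0.05 to 2.00
    m = len(sizes)
    y_mean = sum(accuracies) / m
    best = None
    for c in c_grid:
        x = [n ** (-c) for n in sizes]
        x_mean = sum(x) / m
        sxx = sum((xi - x_mean) ** 2 for xi in x)
        sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, accuracies))
        slope = sxy / sxx            # slope of acc on x, equal to -b
        a = y_mean - slope * x_mean  # plateau accuracy as n grows
        b = -slope
        sse = sum((a - b * xi - yi) ** 2 for xi, yi in zip(x, accuracies))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    _, a, b, c = best
    return a, b, c


def required_sample_size(a, b, c, target):
    """Smallest n whose predicted accuracy reaches target, or None."""
    if b <= 0 or target >= a:
        # Curve is not increasing, or target is at or above the plateau.
        return None
    return math.ceil((b / (a - target)) ** (1 / c))


# Synthetic pilot accuracies generated from a known curve (not real data).
pilot_sizes = [20, 40, 80, 160, 320]
pilot_acc = [0.9 - n ** -0.5 for n in pilot_sizes]

a, b, c = fit_learning_curve(pilot_sizes, pilot_acc)
print("fitted plateau, scale, exponent:", a, b, c)
print("samples needed for 0.85 accuracy:", required_sample_size(a, b, c, 0.85))
```

The grid search keeps the sketch dependency-free. A nonlinear least-squares routine would be the more usual tool in practice, and the fitted plateau `a` shows at a glance whether a target accuracy is reachable at any sample size.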

Main Results:

The developed approach effectively establishes the power-versus-sample-size relationship for transcriptomics data.
Performance was validated across various data types (miRNA, RNA-seq) and ML algorithms.
The method provides a robust framework for sample size estimation in ML-driven transcriptomics.

Conclusions:

The novel computational approach bridges a critical methodological gap in sample size determination for ML-based transcriptomics.
By informing sample size planning, this method helps transcriptomics studies achieve adequate statistical power and allocate resources efficiently.
The code is available on GitHub, which promotes accessibility and reproducibility and may accelerate the translation of transcriptomics findings into clinical applications for personalized treatment.