scValue: value-based subsampling of large-scale single-cell transcriptomic data for machine and deep learning tasks.

Li Huang1, Weikang Gong1,2, Dongsheng Chen1

  • 1State Key Laboratory of Common Mechanism Research for Major Diseases, Suzhou Institute of Systems Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, 100 Chongwen Road, Suzhou Industrial Park, Suzhou, Jiangsu Province 215123, China.

Briefings in Bioinformatics | June 14, 2025
Summary
This summary is machine-generated.

scValue is a new method for subsampling large single-cell RNA sequencing (scRNA-seq) datasets. It prioritizes high-value cells, improving machine learning and deep learning tasks while maintaining biological signals.

Keywords:
cell type analysis; data valuation; machine and deep learning; single-cell transcriptomics; subsampling

Area of Science:

  • Computational Biology
  • Genomics
  • Machine Learning

Background:

  • Large single-cell RNA sequencing (scRNA-seq) datasets provide deep biological insights but pose significant computational challenges.
  • Existing subsampling techniques can improve efficiency but may compromise performance in downstream machine learning and deep learning (ML/DL) analyses.

Purpose of the Study:

  • To introduce scValue, a novel cell-ranking approach for efficient and effective subsampling of large scRNA-seq data.
  • To enhance ML/DL workflows by preserving critical biological signals and improving cell-type representation in subsamples.

Main Methods:

  • Developed scValue, a method that ranks cells by 'data value' using random forest out-of-bag estimates.
  • Prioritized high-value cells and oversampled cell types with greater data value variability.
  • Benchmarked scValue on cell-type annotation, label transfer learning, cross-study label harmonization, and bulk RNA-seq deconvolution tasks.
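
The ranking and sampling idea in the methods above can be illustrated with a minimal sketch. This is not the scValue package or its API; it is an approximation built on scikit-learn under stated assumptions. Here the "data value" of a cell is taken to be the out-of-bag probability that a random forest assigns to the cell's own label, and the per-type quota is made proportional to cell count times the standard deviation of those values. Both choices, along with the function names `oob_data_values` and `value_based_subsample` and the synthetic data, are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def oob_data_values(X, y, n_trees=200, seed=0):
    """Proxy data value: out-of-bag probability of each cell's own label.

    Assumption: this stands in for the paper's data value; it is not
    the scValue implementation.
    """
    forest = RandomForestClassifier(
        n_estimators=n_trees, bootstrap=True, oob_score=True, random_state=seed
    )
    forest.fit(X, y)
    # oob_decision_function_ has shape (n_cells, n_classes);
    # its columns follow the sorted labels in forest.classes_.
    class_index = np.searchsorted(forest.classes_, y)
    values = forest.oob_decision_function_[np.arange(len(y)), class_index]
    # A cell that was never out of bag yields NaN; treat it as zero value.
    return np.nan_to_num(values, nan=0.0)


def value_based_subsample(values, y, n_keep):
    """Select roughly n_keep cells, favouring types with more variable values.

    Hypothetical allocation rule: quota proportional to count * std(value),
    with at least one cell kept per type. Within a type, the highest-value
    cells are taken first.
    """
    types = np.unique(y)
    counts = np.array([(y == t).sum() for t in types])
    spread = np.array([values[y == t].std() for t in types]) + 1e-8
    weights = counts * spread
    quota = np.floor(n_keep * weights / weights.sum()).astype(int)
    quota = np.clip(quota, 1, counts)
    chosen = []
    for t, q in zip(types, quota):
        idx = np.flatnonzero(y == t)
        order = np.argsort(values[idx])[::-1]  # highest value first
        chosen.append(idx[order[:q]])
    return np.sort(np.concatenate(chosen))


# Synthetic stand-in for an expression matrix: 3 cell types, 60 cells each, 20 genes.
rng = np.random.default_rng(0)
y = np.repeat(np.array([0, 1, 2]), 60)
X = rng.normal(size=(180, 20)) + y[:, None] * 1.5

values = oob_data_values(X, y, n_trees=100)
selected = value_based_subsample(values, y, n_keep=45)
print(len(selected), np.bincount(y[selected]))
```

Because every type is guaranteed at least one cell, the subsample can slightly exceed `n_keep` when some types have near-zero value variability. A real workflow would replace the synthetic matrix with normalized expression values and annotated cell-type labels.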

Main Results:

  • scValue consistently outperformed existing subsampling methods in automatic cell-type annotation tasks, achieving performance close to full-data analysis.
  • Demonstrated superior preservation of T-cell annotations and accurate reproduction of T-cell subtype relationships in case studies.
  • Evaluated on 16 public datasets, scValue showed fast execution, balanced cell-type representation, and distributional properties similar to uniform sampling.

Conclusions:

  • scValue offers a robust and scalable solution for subsampling large scRNA-seq datasets in ML/DL applications.
  • The method effectively preserves biological signals and enhances the performance of downstream analyses.
  • scValue is available as an open-source Python package.