A sparse negative binomial mixture model for clustering RNA-seq count data.

Yujia Li (1), Tanbin Rahman (1), Tianzhou Ma (2)

  • (1) Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261, USA.

Biostatistics (Oxford, England) | August 7, 2021
Summary
This summary is machine-generated.

This study introduces a clustering method for RNA sequencing count data based on a sparse negative binomial mixture model, which clusters samples and selects informative genes at the same time in high-dimensional data. The authors report higher clustering accuracy than existing methods and results that are easier to interpret biologically.

Keywords:
Cluster analysis, Feature selection, Gaussian mixture model, Sparse K-means

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Statistical Genetics

Background:

  • Clustering small-n-large-p data (few samples, many features) is crucial for analyzing modern biological datasets.
  • Current methods often rely on Gaussian assumptions, which are unsuitable for RNA sequencing count data.
  • Normalization of count data can lead to loss of information and biased results.
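
To make the point about count data concrete, the following is a minimal sketch of the negative binomial distribution in a mean and dispersion parametrization. The parametrization, the function names, and the values `mu = 50` and `phi = 0.2` are assumptions chosen for illustration and are not taken from the paper. The sketch checks numerically that the variance is `mu + phi * mu**2`, which exceeds the mean, unlike a Poisson model where the variance equals the mean.

```python
import math

def nb_logpmf(y, mu, phi):
    """Log-probability of count y under a negative binomial with mean mu
    and dispersion phi, so that variance = mu + phi * mu**2."""
    r = 1.0 / phi  # "size" parameter of the negative binomial
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def moments(logpmf, upper=3000):
    """Mean and variance by direct summation over counts 0..upper-1."""
    probs = [math.exp(logpmf(y)) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

mu, phi = 50.0, 0.2  # illustrative values, not from the paper
nb_mean, nb_var = moments(lambda y: nb_logpmf(y, mu, phi))
print(f"mean = {nb_mean:.3f}, variance = {nb_var:.3f}")
```

With these values the variance is about eleven times the mean, which is the kind of overdispersion that a symmetric, constant-variance Gaussian model does not capture.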

Purpose of the Study:

  • To develop a novel clustering method specifically designed for RNA sequencing count data.
  • To address the limitations of existing methods that assume continuous data.
  • To enable accurate sample clustering and simultaneous feature selection in high-dimensional transcriptomic studies.

Main Methods:

  • Development of a negative binomial mixture model for count data clustering.
  • Incorporation of lasso and fused lasso regularization for gene selection.
  • Use of a modified Expectation-Maximization (EM) algorithm for model inference.
  • Use of the Bayesian Information Criterion (BIC) to select tuning parameters.
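
The methods listed above can be sketched, in a heavily simplified form, as a plain EM algorithm for a negative binomial mixture. This is not the authors' modified EM: it omits the lasso and fused lasso penalties, fixes the dispersion `phi` at an assumed value, and starts from user-chosen samples. All names (`em_nb_mixture`, `init_rows`, `bic`) and the toy matrix `Y` are hypothetical. Because the sketch has no penalty to tune, BIC is evaluated across candidate numbers of clusters purely to show how the criterion is computed.

```python
import math

def nb_logpmf(y, mu, phi):
    """Negative binomial log-probability with mean mu and dispersion phi."""
    r = 1.0 / phi
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def e_step(Y, pi, mu, phi):
    """Return responsibilities and the observed-data log-likelihood."""
    resp, loglik = [], 0.0
    for row in Y:
        logw = [math.log(pi[k])
                + sum(nb_logpmf(y, m, phi) for y, m in zip(row, mu[k]))
                for k in range(len(pi))]
        top = max(logw)
        lse = top + math.log(sum(math.exp(v - top) for v in logw))  # log-sum-exp
        resp.append([math.exp(v - lse) for v in logw])
        loglik += lse
    return resp, loglik

def em_nb_mixture(Y, init_rows, phi=0.1, n_iter=30):
    """Unpenalized EM; init_rows are sample indices used as starting means."""
    n, p, K = len(Y), len(Y[0]), len(init_rows)
    mu = [[Y[i][j] + 0.5 for j in range(p)] for i in init_rows]
    pi = [1.0 / K] * K
    for _ in range(n_iter):
        resp, _ = e_step(Y, pi, mu, phi)
        for k in range(K):
            nk = max(sum(r[k] for r in resp), 1e-12)  # guard against empty clusters
            pi[k] = max(nk / n, 1e-12)
            # With phi fixed, the weighted sample mean maximizes the NB likelihood.
            mu[k] = [max(sum(r[k] * row[j] for r, row in zip(resp, Y)) / nk, 1e-8)
                     for j in range(p)]
    resp, loglik = e_step(Y, pi, mu, phi)
    labels = [max(range(K), key=lambda k: r[k]) for r in resp]
    return labels, loglik

def bic(loglik, n, p, K):
    """BIC with K*p cluster means and K-1 free mixing proportions."""
    return -2.0 * loglik + (K * p + K - 1) * math.log(n)

# Toy counts: 6 samples (rows) by 4 genes (columns), two obvious groups.
Y = [[10, 12, 100, 95], [8, 15, 110, 90], [12, 9, 105, 102],
     [100, 98, 11, 9], [95, 110, 8, 12], [105, 92, 13, 10]]

labels, _ = em_nb_mixture(Y, init_rows=(0, 3))
inits = {1: (0,), 2: (0, 3), 3: (0, 3, 5)}
scores = {K: bic(em_nb_mixture(Y, rows)[1], len(Y), len(Y[0]), K)
          for K, rows in inits.items()}
best_K = min(scores, key=scores.get)
print(labels, best_K)
```

Working in log space with a log-sum-exp step keeps the responsibilities numerically stable when a sample is extremely unlikely under some cluster, which is common with high-count genes.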

Main Results:

  • The proposed negative binomial mixture model demonstrates superior clustering accuracy compared to existing methods.
  • Effective feature selection of relevant genes is achieved through regularization techniques.
  • The method provides enhanced biological interpretation of results, particularly in pathway analysis.
  • Successful application to real-world transcriptomic data from rat brain and breast cancer studies.
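
How a lasso penalty can lead to gene selection is easiest to see in one dimension, where the penalized least-squares problem has a closed-form solution known as soft-thresholding. The sketch below assumes a hypothetical parametrization in which each gene has cluster-specific offsets on the log scale; the summary does not state the authors' exact parametrization or update rule, and the fused lasso penalty is not covered. Gene names, offsets, and the value of `lam` are made up for illustration.

```python
def soft_threshold(z, lam):
    """Minimizer over b of 0.5 * (b - z)**2 + lam * abs(b)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

# Hypothetical unpenalized cluster offsets (log scale) for three genes.
offsets = {
    "geneA": [1.10, -1.05],
    "geneB": [0.08, -0.06],   # nearly identical across clusters
    "geneC": [-0.90, 0.95],
}
lam = 0.2  # illustrative tuning parameter

shrunk = {g: [soft_threshold(b, lam) for b in bs] for g, bs in offsets.items()}

# A gene is kept only if at least one cluster offset survives shrinkage.
selected = [g for g, bs in shrunk.items() if any(b != 0.0 for b in bs)]
print(selected)
```

Genes whose offsets are all shrunk to exactly zero no longer distinguish the clusters, so they drop out of the model; a larger `lam` removes more genes.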

Conclusions:

  • The developed negative binomial mixture model is a powerful tool for clustering RNA sequencing count data.
  • This approach overcomes the limitations of Gaussian-based methods for count data.
  • The method facilitates robust sample classification, accurate gene selection, and meaningful biological insights in transcriptomics.