Updated: Oct 25, 2025

A sparse negative binomial mixture model for clustering RNA-seq count data.

Yujia Li¹, Tanbin Rahman¹, Tianzhou Ma²

¹Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261, USA.

Biostatistics (Oxford, England)

August 7, 2021

Summary

This summary is machine-generated.

This study introduces a new clustering method for RNA sequencing count data, improving sample classification and gene selection for high-dimensional biological data. The negative binomial mixture model offers superior accuracy and interpretation for transcriptomic studies.

Keywords:

Cluster analysis
Feature selection
Gaussian mixture model
Sparse K-means

Area of Science:

Bioinformatics
Computational Biology
Statistical Genetics

Background:

Clustering small-n-large-p data (few samples, many features) is central to analyzing modern biological datasets.
Current methods often rely on Gaussian assumptions, which are unsuitable for RNA sequencing count data.
Normalization of count data can lead to loss of information and biased results.
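
To illustrate why a Gaussian assumption fits RNA-seq counts poorly, the sketch below writes out a negative binomial (NB) log-probability in the common mean/dispersion parameterization, where the variance is mu + phi * mu^2 and so grows faster than the mean. The function names and the example values of mu and phi are assumptions chosen for illustration; they are not taken from the study.

```python
import math

def nb_log_pmf(y, mu, phi):
    """Log-probability of count y under an NB with mean mu and dispersion phi."""
    r = 1.0 / phi  # NB "size" parameter
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def nb_variance(mu, phi):
    """NB variance: the Poisson variance (mu) plus an overdispersion term."""
    return mu + phi * mu ** 2

# Hypothetical gene with mean count 100 and dispersion 0.5.
mu, phi = 100.0, 0.5
print("Poisson variance:", mu)
print("NB variance:     ", nb_variance(mu, phi))
```

Because the variance depends on the mean, a single Gaussian variance per cluster cannot describe low- and high-expression genes equally well, which is the motivation the background points to.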

Purpose of the Study:

To develop a novel clustering method specifically designed for RNA sequencing count data.
To address the limitations of existing methods that assume continuous data.
To enable accurate sample clustering and simultaneous feature selection in high-dimensional transcriptomic studies.

Main Methods:

A negative binomial mixture model for clustering count data.
Lasso and fused lasso regularization for gene selection.
A modified Expectation-Maximization (EM) algorithm for model inference.
The Bayesian Information Criterion (BIC) for tuning parameter selection.
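
The ingredients listed above can be sketched in a much-simplified form. The code below is an unpenalized EM fit of an NB mixture, not the authors' modified algorithm: it assumes a fixed dispersion shared by all genes, omits library-size normalization and the lasso and fused lasso penalties, and uses BIC to compare numbers of clusters rather than to tune a penalty. All function names, defaults, and the toy counts are hypothetical.

```python
import math

def nb_log_pmf(y, mu, phi):
    """Log-probability of count y under NB(mean=mu, dispersion=phi)."""
    r = 1.0 / phi
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def fit_nb_mixture(counts, n_clusters, phi=0.1, n_iter=50):
    """Unpenalized EM for an NB mixture; phi is fixed and shared (assumption)."""
    n, p = len(counts), len(counts[0])
    # Deterministic start: rank samples by total count, cut into equal groups.
    order = sorted(range(n), key=lambda i: sum(counts[i]))
    resp = [[0.0] * n_clusters for _ in range(n)]
    for rank, i in enumerate(order):
        resp[i][rank * n_clusters // n] = 1.0
    for _ in range(n_iter):
        # M-step: with phi fixed, the weighted mean maximizes the NB likelihood.
        weight = [max(sum(row[k] for row in resp), 1e-12)
                  for k in range(n_clusters)]
        pi = [w / n for w in weight]
        mu = [[max(sum(resp[i][k] * counts[i][j] for i in range(n)) / weight[k],
                   1e-8)
               for j in range(p)] for k in range(n_clusters)]
        # E-step: posterior cluster probabilities, via log-sum-exp for stability.
        loglik = 0.0
        for i in range(n):
            logs = [math.log(pi[k])
                    + sum(nb_log_pmf(counts[i][j], mu[k][j], phi) for j in range(p))
                    for k in range(n_clusters)]
            top = max(logs)
            lse = top + math.log(sum(math.exp(v - top) for v in logs))
            loglik += lse
            resp[i] = [math.exp(v - lse) for v in logs]
    labels = [max(range(n_clusters), key=lambda k: row[k]) for row in resp]
    return {"pi": pi, "mu": mu, "labels": labels, "loglik": loglik}

def bic(loglik, n_samples, n_genes, n_clusters):
    """BIC with (K - 1) mixing weights and K * p cluster means as free parameters."""
    n_params = (n_clusters - 1) + n_clusters * n_genes
    return -2.0 * loglik + n_params * math.log(n_samples)

# Toy data: 6 samples x 2 genes, with two visibly different expression profiles.
counts = [[5, 100], [6, 110], [4, 95], [50, 10], [55, 12], [48, 9]]
for k in (1, 2):
    fit = fit_nb_mixture(counts, k)
    print(k, fit["labels"], bic(fit["loglik"], len(counts), 2, k))
```

A lower BIC indicates a better trade-off between fit and model size. In a penalized version, the same criterion would be evaluated across candidate penalty strengths instead of across cluster counts.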

Main Results:

The proposed negative binomial mixture model demonstrates superior clustering accuracy compared to existing methods.
Effective feature selection of relevant genes is achieved through regularization techniques.
The method provides enhanced biological interpretation of results, particularly in pathway analysis.
The method was applied successfully to real-world transcriptomic data from rat brain and breast cancer studies.
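
The gene selection reported above relies on regularization. As a minimal illustration of how a lasso penalty can switch genes off, the sketch below soft-thresholds each gene's cluster-specific deviation from its overall log-mean; a gene whose deviations all shrink to zero no longer separates the clusters. This shows the general mechanism only. It is not the update used in the study, and `soft_threshold`, `select_genes`, the penalty value, and the example means are hypothetical.

```python
import math

def soft_threshold(x, lam):
    """Lasso shrinkage: move x toward zero by lam, clipping at zero."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def select_genes(cluster_log_means, lam):
    """Return genes whose centered cluster effects survive the penalty.

    cluster_log_means maps a gene name to its per-cluster log-mean expression.
    """
    selected = []
    for gene, log_means in cluster_log_means.items():
        center = sum(log_means) / len(log_means)
        shrunk = [soft_threshold(v - center, lam) for v in log_means]
        if any(s != 0.0 for s in shrunk):
            selected.append(gene)
    return selected

# Hypothetical per-cluster mean counts for two genes across two clusters.
log_means = {
    "geneA": [math.log(5.0), math.log(50.0)],   # strongly cluster-specific
    "geneB": [math.log(20.0), math.log(21.0)],  # nearly constant
}
print(select_genes(log_means, lam=0.5))
```

Raising `lam` removes more genes; a criterion such as BIC is one way to choose a value.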

Conclusions:

The developed negative binomial mixture model is a powerful tool for clustering RNA sequencing count data.
This approach overcomes the limitations of Gaussian-based methods for count data.
The method facilitates robust sample classification, accurate gene selection, and meaningful biological insights in transcriptomics.