Bayesian nonparametric variable selection as an exploratory tool for discovering differentially expressed genes.

Babak Shahbaba¹, Wesley O Johnson

¹Department of Statistics, University of California at Irvine, CA, USA. babaks@uci.edu

Statistics in Medicine

November 23, 2012

Summary

This summary is machine-generated.

This study introduces a Bayesian nonparametric variable selection model for high-throughput genomic studies. The method identifies potentially relevant genes by clustering their regression effects with a Dirichlet process mixture, and in simulations it compares well with alternative approaches.

Area of Science:

Genomics
Biostatistics
Bioinformatics

Background:

High-throughput studies are often conducted without specific hypotheses, so they need methods that can explore many factors, such as genes, at once.
Identifying relevant genes in large-scale genomic datasets is crucial for follow-up investigations.

Purpose of the Study:

To develop a statistical model for identifying potentially relevant genes in large-scale genomic studies without a priori hypotheses.
To cluster genes based on their potential effect sizes related to disease outcomes.

Main Methods:

A hierarchical linear regression model with random coefficients, related to Bayesian variable selection, is employed for case-control data.
A Dirichlet process mixture model is used for regression coefficients to group genes by relevance.
The model identifies clusters of genes with varying degrees of association with the outcome of interest.
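
The clustering behavior of a Dirichlet process prior on regression coefficients can be illustrated with a minimal sketch. This is not the authors' implementation: it only draws per-gene coefficients from a truncated stick-breaking approximation of a Dirichlet process, and the function names, the normal base distribution, and all parameter values (`alpha`, `base_sd`, `n_atoms`) are assumptions chosen for illustration.

```python
import numpy as np


def stick_breaking_weights(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a Dirichlet process (illustrative)."""
    fractions = rng.beta(1.0, alpha, size=n_atoms)
    fractions[-1] = 1.0  # close the stick so the truncated weights sum to 1
    # Length of stick remaining before each break.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - fractions[:-1])))
    return fractions * remaining


def sample_dp_coefficients(n_genes, alpha=1.0, base_sd=2.0, n_atoms=50, seed=0):
    """Draw one coefficient per gene from a truncated DP prior.

    Assumption: the base distribution is N(0, base_sd^2). Genes that pick the
    same atom share exactly the same coefficient, which is what produces
    clusters of genes with similar effect sizes.
    """
    rng = np.random.default_rng(seed)
    weights = stick_breaking_weights(alpha, n_atoms, rng)
    atoms = rng.normal(0.0, base_sd, size=n_atoms)  # candidate effect sizes
    labels = rng.choice(n_atoms, size=n_genes, p=weights)  # cluster per gene
    return atoms[labels], labels, weights


if __name__ == "__main__":
    coefs, labels, _ = sample_dp_coefficients(n_genes=1000)
    print("distinct coefficient values:", np.unique(coefs).size)
    print("occupied clusters:", np.unique(labels).size)
```

With a small concentration parameter `alpha`, most genes tend to fall into a few clusters, so many genes share one effect size. A full analysis would also need a likelihood for the case-control data and posterior sampling, which this sketch omits.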

Main Results:

The proposed method effectively clusters regression effects, distinguishing genes with minimal, moderate, and high relevance.
In simulations, the approach is effective at identifying relevant genes when compared with alternative methods.
The model was successfully applied to transcriptome data for human cytomegalovirus infection and leukemia gene expression studies.

Conclusions:

The Dirichlet process mixture model provides a robust framework for gene relevance discovery in large-scale, hypothesis-free genomic studies.
This approach aids in prioritizing genes for more focused, in-depth analysis in subsequent research phases.
The method has practical utility in analyzing complex biological datasets, such as those from infectious diseases and cancer research.