Bayesian Variable Selection in Linear Regression in One Pass for Large Data Sets.

Carlos Ordonez¹, Carlos Garcia-Alvarado¹, Veerabhadran Baladandayuthapani²

  • ¹ University of Houston.

ACM Transactions on Knowledge Discovery From Data
April 4, 2023
Summary
This summary is machine-generated.

This study introduces a faster Bayesian approach for variable selection in linear regression using an optimized Gibbs sampler. The new method significantly speeds up computation, making Bayesian variable selection more efficient for large datasets.

Area of Science:

  • Computational Statistics
  • Machine Learning
  • Database Systems

Background:

  • Bayesian models often rely on Markov Chain Monte Carlo (MCMC) methods for computation.
  • MCMC methods require numerous iterations, posing challenges for large datasets.
  • Variable selection in linear regression is computationally intensive due to its combinatorial nature.
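To make the combinatorial growth concrete: if there are p candidate predictors (the symbol p is notation introduced here, not taken from the source), each one is either included or excluded, so the number of candidate models is

```latex
\[
  |\mathcal{M}| = 2^{p}, \qquad 2^{20} = 1{,}048{,}576, \qquad 2^{30} = 1{,}073{,}741{,}824 .
\]
```

Exhaustive enumeration therefore becomes impractical even for a few dozen predictors, which is why sampling-based searches such as MCMC are used instead.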

Purpose of the Study:

  • To accelerate Bayesian model computation for variable selection in linear regression.
  • To develop an efficient algorithm that overcomes the limitations of traditional MCMC methods.
  • To integrate Bayesian variable selection into database management systems.

Main Methods:

  • Developed a fast Gibbs sampler algorithm with optimizations for Bayesian variable selection.
  • Utilized non-informative and conjugate prior distributions for efficient data summarization.
  • Employed sparse binary vectors for efficient matrix projections and hash tables for variable subset probabilities.
  • Integrated the algorithm into a database management system (DBMS) using User-Defined Functions and stored procedures.
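The ideas in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a Zellner g-prior with g = n and a uniform prior over models (the source only says "non-informative and conjugate" priors), uses the standard closed-form Bayes factor against the intercept-only model, and all function names (`summarize`, `log_score`, `gibbs_select`) are hypothetical. The data are reduced once to cross-product summaries, each candidate subset is projected out of those summaries through a binary inclusion vector, and subset scores are cached in a hash table (a Python dict).

```python
import numpy as np

def summarize(X, y):
    """Reduce the data to count, sums, and cross-products, then center.

    n, L, and Q are plain sums over rows, so they could be accumulated
    in a single scan of the data set.
    """
    Z = np.column_stack([X, y])
    n = Z.shape[0]
    L = Z.sum(axis=0)                 # column sums
    Q = Z.T @ Z                       # cross-products
    S = Q - np.outer(L, L) / n        # centered scatter matrix
    return n, S[:-1, :-1], S[:-1, -1], S[-1, -1]

def log_score(gamma, n, Sxx, Sxy, Syy, g):
    """Log Bayes factor of subset gamma vs. the null model (g-prior, assumed)."""
    idx = np.flatnonzero(gamma)       # positions of the included variables
    if idx.size == 0:
        return 0.0
    # Project the summaries onto the selected variables only.
    beta = np.linalg.solve(Sxx[np.ix_(idx, idx)], Sxy[idx])
    r2 = min(float(Sxy[idx] @ beta) / Syy, 1.0)
    return (0.5 * (n - 1 - idx.size) * np.log1p(g)
            - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2)))

def gibbs_select(X, y, iters=500, burn=100, seed=0):
    """Gibbs sampler over inclusion indicators; returns inclusion frequencies."""
    rng = np.random.default_rng(seed)
    n, Sxx, Sxy, Syy = summarize(X, y)
    p = X.shape[1]
    g = float(n)                      # assumption: unit-information choice
    cache = {}                        # hash table: subset -> log score

    def score(gamma):
        key = gamma.tobytes()
        if key not in cache:
            cache[key] = log_score(gamma, n, Sxx, Sxy, Syy, g)
        return cache[key]

    gamma = np.zeros(p, dtype=np.uint8)
    counts = np.zeros(p)
    for t in range(iters):
        for j in range(p):
            gamma[j] = 1
            l1 = score(gamma)
            gamma[j] = 0
            l0 = score(gamma)
            # Conditional probability that variable j is included.
            prob1 = 1.0 / (1.0 + np.exp(np.clip(l0 - l1, -700.0, 700.0)))
            gamma[j] = rng.random() < prob1
        if t >= burn:
            counts += gamma
    return counts / (iters - burn), cache
```

After the summaries are built, no step in the sampler touches the raw rows again, so each iteration costs the same regardless of the number of observations. The cache pays off because the chain tends to revisit the same high-probability subsets.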

Main Results:

  • The proposed algorithm achieves accuracy comparable to that of existing methods.
  • Demonstrated linear scalability with respect to dataset size.
  • Achieved orders-of-magnitude speedup compared with an R package for Bayesian variable selection.
  • Showcased efficient parallel data summarization and matrix manipulation within a DBMS.
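One plausible reason summarization parallelizes well and scales linearly is that count, sum, and cross-product summaries are additive across partitions of the rows. The sketch below illustrates that property only; the function names are hypothetical, and it does not reproduce the User-Defined Functions or stored procedures described in the source.

```python
import numpy as np

def partial_summary(Z):
    """Summaries of one partition: row count, column sums, cross-products."""
    return Z.shape[0], Z.sum(axis=0), Z.T @ Z

def merge(a, b):
    """Combine two partial summaries by element-wise addition."""
    return a[0] + b[0], a[1] + b[1], a[2] + b[2]

def summarize_in_chunks(Z, n_chunks):
    """Summarize each row partition independently, then merge the results."""
    parts = [partial_summary(chunk) for chunk in np.array_split(Z, n_chunks)]
    total = parts[0]
    for part in parts[1:]:
        total = merge(total, part)
    return total
```

Each partition is read once and the merge step is independent of the number of rows, so total work grows in proportion to the data size, and the partitions could be processed by separate workers.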

Conclusions:

  • The optimized Gibbs sampler significantly accelerates Bayesian model computation for variable selection.
  • Integrating the algorithm into a DBMS enhances performance and scalability.
  • This approach offers a practical and efficient solution for variable selection in large-scale Bayesian analyses.