SIGHR: Side information guided high-dimensional regression.

Yuan Yang¹, Christopher S McMahan¹, Yu-Bo Wang¹

  • ¹School of Mathematical and Statistical Sciences, Clemson University, Clemson, SC, USA.

Statistical Methods in Medical Research | October 12, 2023
Summary
This summary is machine-generated.

This study introduces a new Bayesian regression method for variable selection in high-dimensional data. It effectively uses side information to improve the identification of important genetic markers for nicotine dependence.

Keywords:
Biomarker; conditional means prior; nicotine metabolite ratio; side information; spike and slab prior

Area of Science:

  • Statistics
  • Genetics
  • Bioinformatics

Background:

  • In high-dimensional data, where predictors can far outnumber observations, variable selection is difficult.
  • Existing methods may not fully utilize available side information.
  • Identifying genetic markers for nicotine dependence is crucial for smoking cessation.

Purpose of the Study:

  • To develop a novel Bayesian regression framework for variable selection in high-dimensional settings.
  • To incorporate side information into the sparsity structure of regression coefficients.
  • To identify genetic markers associated with the nicotine metabolite ratio.
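One generic way to write down a regression model whose sparsity structure depends on side information is sketched below. The notation is assumed for illustration and is not taken from the paper: y_i is the response, x_i the predictor vector, z_j the side information attached to predictor j, gamma_j a binary inclusion indicator, and g a binary-regression link such as the probit. The paper's actual prior (the keywords mention a conditional means prior) may differ in its details.

```latex
\begin{align*}
y_i &= x_i^{\top}\beta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2), \\
\beta_j \mid \gamma_j &\sim \gamma_j \, N(0, \tau^2) + (1 - \gamma_j)\, \delta_0, \\
\Pr(\gamma_j = 1 \mid z_j, \theta) &= g^{-1}\!\left(z_j^{\top}\theta\right).
\end{align*}
```

Here \delta_0 is a point mass at zero (the "spike") and N(0, \tau^2) is the "slab", so predictors whose side information z_j points toward relevance receive a higher prior probability of entering the model.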

Main Methods:

  • A Bayesian regression framework using a spike and slab prior.
  • Incorporation of side information via a binary regression model for inclusion probabilities.
  • Development of a computationally efficient Markov chain Monte Carlo (MCMC) algorithm.
  • Data augmentation steps for efficient posterior sampling.
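The ingredients listed above can be illustrated with a small Gibbs sampler. This is a minimal sketch under stated assumptions, not the authors' algorithm: it assumes a probit link for the inclusion probabilities, a single scalar side-information value per predictor, a known noise variance, and fixed hyperparameters. The function names `gibbs_spike_slab` and `simulate` and all parameter names are hypothetical. The data augmentation step shown is the standard latent-normal device for probit regression, which may or may not match the augmentation used in the paper.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: provides cdf and inv_cdf


def clamp(prob, eps=1e-12):
    """Keep a probability strictly inside (0, 1) for log and inv_cdf calls."""
    return min(max(prob, eps), 1.0 - eps)


def gibbs_spike_slab(X, y, z, n_iter=400, burn_in=100,
                     sigma2=1.0, tau2=4.0, s2=4.0, seed=0):
    """Return posterior inclusion probabilities for each column of X.

    Hypothetical simplifications: known noise variance sigma2, slab variance
    tau2, one scalar side-information value z[j] per predictor, probit link.
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    beta, gamma, w = [0.0] * p, [False] * p, [0.0] * p
    theta0 = theta1 = 0.0
    resid = list(y)  # residual y - X @ beta, with beta = 0 initially
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    z_ss = sum(zj * zj for zj in z)
    counts = [0] * p

    for it in range(n_iter):
        for j in range(p):
            # Remove predictor j's current contribution from the fit.
            for i in range(n):
                resid[i] += X[i][j] * beta[j]
            b = sum(X[i][j] * resid[i] for i in range(n))
            v = 1.0 / (col_ss[j] / sigma2 + 1.0 / tau2)  # slab posterior variance
            m = v * b / sigma2                           # slab posterior mean
            # Side information sets the prior inclusion probability.
            mu = theta0 + theta1 * z[j]
            prior = clamp(STD_NORMAL.cdf(mu))
            # Posterior log-odds of inclusion with beta[j] integrated out.
            log_odds = (math.log(prior) - math.log1p(-prior)
                        + 0.5 * math.log(v / tau2) + m * m / (2.0 * v))
            log_odds = max(min(log_odds, 700.0), -700.0)  # avoid exp overflow
            gamma[j] = rng.random() < 1.0 / (1.0 + math.exp(-log_odds))
            beta[j] = rng.gauss(m, math.sqrt(v)) if gamma[j] else 0.0
            for i in range(n):
                resid[i] -= X[i][j] * beta[j]
            # Data augmentation: latent w[j] ~ N(mu, 1), truncated to be
            # positive when gamma[j] is True and non-positive otherwise.
            p_neg = STD_NORMAL.cdf(-mu)
            u = rng.random()
            q = p_neg + u * (1.0 - p_neg) if gamma[j] else u * p_neg
            w[j] = mu + STD_NORMAL.inv_cdf(clamp(q))

        # Given w, each theta has a conjugate normal update (prior N(0, s2)).
        prec0 = p + 1.0 / s2
        mean0 = sum(w[j] - theta1 * z[j] for j in range(p)) / prec0
        theta0 = rng.gauss(mean0, math.sqrt(1.0 / prec0))
        prec1 = z_ss + 1.0 / s2
        mean1 = sum(z[j] * (w[j] - theta0) for j in range(p)) / prec1
        theta1 = rng.gauss(mean1, math.sqrt(1.0 / prec1))

        if it >= burn_in:
            for j in range(p):
                counts[j] += gamma[j]

    return [c / (n_iter - burn_in) for c in counts]


def simulate(n=60, p=20, n_signal=4, seed=1):
    """Toy data: the first n_signal predictors matter; z flags likely signals."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    true_beta = [2.0 if j < n_signal else 0.0 for j in range(p)]
    y = [sum(X[i][j] * true_beta[j] for j in range(p)) + rng.gauss(0.0, 1.0)
         for i in range(n)]
    z = [1.0 if j < n_signal + 2 else 0.0 for j in range(p)]  # imperfect side info
    return X, y, z


if __name__ == "__main__":
    X, y, z = simulate()
    for j, prob in enumerate(gibbs_spike_slab(X, y, z)):
        print(f"predictor {j:2d}: inclusion probability {prob:.2f}")
```

The sketch updates one coefficient at a time so that it needs no linear algebra library. A practical implementation for genetic data with many thousands of predictors would likely use vectorized or blocked updates and would also sample the noise variance.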

Main Results:

  • The proposed method effectively leverages side information for variable selection.
  • Numerical simulations demonstrate strong finite-sample performance.
  • Successful identification of genetic markers linked to the nicotine metabolite ratio.

Conclusions:

  • The novel Bayesian framework offers an improved approach to variable selection in high-dimensional data.
  • The method's ability to integrate side information enhances the identification of relevant predictors.
  • This approach has significant potential for applications in genetic association studies, such as those for nicotine dependence.