



Regression without regrets - initial data analysis is a prerequisite for multivariable regression.

Georg Heinze¹, Mark Baillie², Lara Lusa³,⁴

  • ¹ Center for Medical Data Science, Institute of Clinical Biometrics, Medical University of Vienna, Spitalgasse 23, 1090 Vienna, Austria. georg.heinze@meduniwien.ac.at

BMC Medical Research Methodology | August 8, 2024
Summary
This summary is machine-generated.

Initial data analysis (IDA) should precede regression modeling so that analysts understand the properties of their data and avoid errors. A preplanned, thoroughly documented IDA supports reproducible and accurate statistical inference and better model interpretation.

Keywords:
Data screening, Functional form, IDA framework, Initial data analysis, Regression models, Reporting, STRATOS Initiative, Variable selection, Variable transformation


Area of Science:

  • Statistics
  • Data Science
  • Biostatistics

Background:

  • Regression models are widely used for prediction and describing associations between variables.
  • Standard software makes regression models easy to fit, which increases the risk of misuse.
  • Insufficient understanding of data properties can lead to flawed analysis, interpretation, and presentation of regression results.

Purpose of the Study:

  • To emphasize the prerequisite role of Initial Data Analysis (IDA) for regression modeling.
  • To guide the development of a preplanned IDA strategy for data screening in regression contexts.
  • To improve the clarity, accuracy, and reproducibility of regression modeling outcomes.

Main Methods:

  • Advocating for a preplanned Initial Data Analysis (IDA) integrated into the overall statistical analysis plan.
  • Recommending specific aspects for data screening within an IDA plan for regression modeling.
  • Illustrating the IDA plan with a diagnostic modeling project example, including data visualization recommendations.
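
As a rough illustration of what a data screening step might look like, the sketch below summarizes predictors without ever reading the outcome column. It is not the authors' IDA plan: the function names `screen_predictors` and `pearson`, the column names, the toy records, and the choice of three checks (missing values, univariate summaries, pairwise predictor correlations) are all assumptions made for this example.

```python
import math
import statistics


def pearson(pairs):
    """Pearson correlation for a list of (x, y) pairs; NaN if either variable is constant."""
    xs, ys = zip(*pairs)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def screen_predictors(rows, outcome):
    """Screen predictors only; the outcome column is excluded from every summary."""
    predictors = [name for name in rows[0] if name != outcome]
    report = {"missing": {}, "summary": {}, "correlation": {}}

    for name in predictors:
        values = [row[name] for row in rows]
        observed = [v for v in values if v is not None]
        # Missing values are counted, not silently dropped.
        report["missing"][name] = len(values) - len(observed)
        # Univariate description of the observed values.
        report["summary"][name] = {
            "min": min(observed),
            "median": statistics.median(observed),
            "max": max(observed),
        }

    # Predictor-predictor correlations on complete pairs; no outcome involved.
    for i, a in enumerate(predictors):
        for b in predictors[i + 1:]:
            pairs = [(r[a], r[b]) for r in rows
                     if r[a] is not None and r[b] is not None]
            report["correlation"][(a, b)] = pearson(pairs)

    return report


# Hypothetical toy records; "disease" plays the role of the outcome.
rows = [
    {"age": 50, "crp": 10.0, "disease": 1},
    {"age": 60, "crp": None, "disease": 0},
    {"age": 70, "crp": 30.0, "disease": 1},
    {"age": 80, "crp": 40.0, "disease": 0},
]
report = screen_predictors(rows, outcome="disease")
print(report["missing"])
```

Excluding the outcome inside the function, rather than relying on the caller to drop it, is one way to make the outcome-blind rule hard to break by accident. A real project would fix the list of checks in the statistical analysis plan before looking at the data.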

Main Results:

  • IDA provides essential data knowledge to validate or refine regression model building strategies.
  • Proper IDA facilitates correct interpretation and clear presentation of modeling results.
  • Adhering to IDA principles, such as not evaluating associations between the outcome and the predictors during IDA, reduces the risk of biased statistical inference.

Conclusions:

  • Initial Data Analysis is a critical prerequisite for robust and reproducible regression modeling.
  • A well-documented and preplanned IDA strategy enhances the reliability of statistical inference.
  • Implementing IDA best practices leads to more accurate interpretation and clearer communication of regression model findings.