CEDAR: communication efficient distributed analysis for regressions.

Changgee Chang¹, Zhiqi Bu¹, Qi Long¹

  • ¹Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA.

Biometrics | October 28, 2022
Summary
This summary is machine-generated.

This study introduces a novel distributed learning method for electronic health records (EHRs) that avoids sharing patient data. The approach enhances parameter estimation efficiency and ensures differential privacy for secure statistical inference.

Keywords:
communication efficient; differential privacy; distributed learning; distributed statistical inference

Area of Science:

  • Health Informatics
  • Biostatistics
  • Machine Learning

Background:

  • Electronic health records (EHRs) hold significant potential for precision medicine.
  • Sharing patient-level data across institutions is restricted by regulations and policies.
  • Distributed learning over multiple EHR databases without data sharing is a growing area of interest.

Purpose of the Study:

  • To propose a communication-efficient distributed learning method for EHRs.
  • To enable statistical inference without sharing raw patient-level data.
  • To ensure differential privacy and reduce information leakage risks.

Main Methods:

  • Framing the distributed learning problem as a missing data problem.
  • Aggregating optimal estimates from external sites.
  • Incorporating posterior samples from remote sites for improved efficiency.
  • Theoretical investigation of asymptotic properties for statistical inference and differential privacy.
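
The general idea of combining site-level estimates without moving patient rows can be sketched as follows. This is not the CEDAR algorithm, which this summary does not specify; it is a textbook inverse-variance weighted average, shown only to illustrate what "aggregating estimates from external sites" can look like. The one-covariate model y = beta * x, the simulated data, and the function names `site_estimate`, `aggregate`, and `simulate_site` are all assumptions made for the example.

```python
import random

def site_estimate(xs, ys):
    """Fit y = beta * x by least squares on one site's local data.

    Returns only (beta_hat, variance); raw rows never leave the site.
    """
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    beta = sxy / sxx
    resid_ss = sum((y - beta * x) ** 2 for x, y in zip(xs, ys))
    sigma2 = resid_ss / (len(xs) - 1)  # one fitted parameter
    return beta, sigma2 / sxx

def aggregate(estimates):
    """Inverse-variance weighted average of site-level estimates."""
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    return beta, 1.0 / total

def simulate_site(n, beta, rng):
    """Hypothetical site data: standard normal covariate and noise."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [beta * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys

rng = random.Random(0)
sites = [simulate_site(n, 2.0, rng) for n in (200, 500, 300)]
estimates = [site_estimate(xs, ys) for xs, ys in sites]  # computed locally
beta_hat, var_hat = aggregate(estimates)                  # computed centrally
print(f"pooled slope: {beta_hat:.3f} (s.e. {var_hat ** 0.5:.3f})")
```

Each site transmits two numbers, so the communication cost does not grow with the number of patients. The pooled variance is always smaller than any single site's variance, which is the efficiency gain that motivates combining sites in the first place.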

Main Results:

  • The proposed method supports valid statistical inference without sharing raw data.
  • Incorporating posterior samples from remote sites improves the efficiency of parameter estimates.
  • The method achieves differential privacy, mitigating the risk of information leakage.
  • Performance was evaluated through simulations and real-world data analyses.
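
For readers unfamiliar with differential privacy, the sketch below shows the standard Laplace mechanism applied to a single site-level summary. The summary above does not describe how the paper obtains its privacy guarantee, so this is a generic textbook construction swapped in for illustration, not the authors' method. The function name `private_mean`, the clipping bounds, the privacy budget `epsilon`, and the simulated ages are all hypothetical.

```python
import random

def private_mean(values, lower, upper, epsilon, rng):
    """Release an epsilon-differentially-private mean via the Laplace mechanism.

    Assumes the sample size n is public and neighbouring datasets differ in
    one record, so the clipped mean has sensitivity (upper - lower) / n.
    """
    clipped = [min(max(v, lower), upper) for v in values]
    n = len(clipped)
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon
    # Difference of two exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return sum(clipped) / n + noise

rng = random.Random(1)
ages = [rng.uniform(20, 90) for _ in range(1000)]  # hypothetical site data
true_mean = sum(ages) / len(ages)
released = private_mean(ages, lower=0.0, upper=100.0, epsilon=1.0, rng=rng)
print(f"true mean {true_mean:.2f}, released {released:.2f}")
```

The noise scale shrinks as the site's sample size grows, so larger sites pay a smaller accuracy cost for the same privacy budget. A smaller `epsilon` gives stronger privacy and a noisier released value.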

Conclusions:

  • The novel distributed learning approach effectively addresses EHR data sharing challenges.
  • The method provides a secure and efficient framework for multi-institutional EHR analysis.
  • This work advances precision medicine through privacy-preserving distributed statistical inference.