



RandomForestsGLS: An R package for Random Forests for dependent data.

Arkajyoti Saha1, Sumanta Basu2, Abhirup Datta3

  • 1Department of Statistics, University of Washington.

Journal of Open Source Software | April 20, 2023
Summary
This summary is machine-generated.

RandomForestsGLS improves non-linear modeling for spatial and temporal data by integrating Random Forests (RF) with generalized least squares (GLS). This approach effectively handles data dependence, enhancing regression function estimation and predictions.



Area of Science:

  • Geospatial statistics
  • Machine learning
  • Data science

Background:

  • Modern datasets often exhibit spatial or serial dependence, requiring specialized modeling techniques.
  • Standard machine learning methods, including Random Forests, typically treat observations as independent, so they ignore this dependence and can estimate the regression function poorly.
  • Traditional spatial/temporal software models the correlation explicitly but offers little flexibility in how the mean function is estimated.

Purpose of the Study:

  • To bridge the gap between flexible non-linear modeling and accurate dependence structure handling in data.
  • To introduce a novel method, RandomForestsGLS, that integrates Random Forests with spatial/serial correlation modeling.
  • To improve the estimation of the mean function in the presence of dependent observations.

Main Methods:

  • Developed RandomForestsGLS, a novel rendition of Random Forests (RF) that explicitly models spatial/serial data correlation.
  • Incorporated generalized least squares (GLS) principles into the RF fitting procedure.
  • Utilized kriging for making predictions at new locations, particularly for geo-spatial data.
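
As a rough illustration of what bringing GLS principles into tree fitting can mean, the Python sketch below scores a candidate split with the quadratic form (y - Zb)' Σ⁻¹ (y - Zb) instead of the usual sum of squares, where Z assigns each observation to one of the two child nodes and b holds the GLS node means. This is not the RandomForestsGLS package API (the package is written for R); the function names, the exponential covariance, and its parameter values are assumptions chosen only for illustration.

```python
import numpy as np

def exponential_cov(coords, sigma2=1.0, phi=3.0, nugget=0.1):
    """Exponential covariance between locations (illustrative parameter values)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d) + nugget * np.eye(len(coords))

def gls_loss(y, membership, cov):
    """Return the GLS loss and the GLS node means for a given node assignment.

    membership is an (n, k) 0/1 matrix Z placing each observation in one of k nodes.
    """
    chol = np.linalg.cholesky(cov)
    # Whitening y and Z turns the GLS problem into ordinary least squares.
    y_w = np.linalg.solve(chol, y)
    z_w = np.linalg.solve(chol, membership)
    beta, *_ = np.linalg.lstsq(z_w, y_w, rcond=None)
    resid = y_w - z_w @ beta
    return float(resid @ resid), beta

def best_split(x, y, cov):
    """Scan thresholds on one covariate; return (loss, threshold) with the lowest GLS loss."""
    best = (np.inf, None)
    xs = np.unique(x)  # sorted distinct values
    for t in (xs[:-1] + xs[1:]) / 2.0:  # midpoints between neighbouring values
        left = (x <= t).astype(float)
        z = np.column_stack([left, 1.0 - left])
        loss, _ = gls_loss(y, z, cov)
        if loss < best[0]:
            best = (loss, t)
    return best
```

In this sketch, passing an identity matrix as `cov` reduces the criterion to the ordinary regression-tree sum of squares, so the correlation-aware version generalises the usual split rule rather than replacing it.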

Main Results:

  • RandomForestsGLS substantially improves the estimation of the mean function by explicitly modeling data correlation.
  • The method effectively captures complex interactions among variables while accounting for spatial or temporal dependencies.
  • Predictions at new locations are enhanced through the integration of kriging.
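
The kriging step can be sketched in Python as follows, under stated assumptions: a fitted mean function (here a hypothetical `mean_fn`, standing in for whatever the forest returns) gives the covariate-driven part of the prediction, and the training residuals are interpolated to the new location using the spatial covariance. The exponential covariance, the parameter values, and all function names are illustrative and are not taken from the RandomForestsGLS package.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential cross-covariance between two sets of locations (illustrative)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d)

def krige_predict(mean_fn, X_train, coords_train, y_train, X_new, coords_new,
                  sigma2=1.0, phi=3.0, nugget=0.1):
    """Mean prediction plus kriged residual: m(x0) + k0' K^{-1} (y - m(X))."""
    resid = y_train - mean_fn(X_train)
    K = exp_cov(coords_train, coords_train, sigma2, phi)
    K = K + nugget * np.eye(len(y_train))  # nugget models measurement noise
    k_new = exp_cov(coords_new, coords_train, sigma2, phi)
    return mean_fn(X_new) + k_new @ np.linalg.solve(K, resid)
```

Two properties of this sketch are worth noting: with `nugget=0.0` the prediction at a training location reproduces the observed value, and far from all training locations the residual correction vanishes so the prediction falls back to `mean_fn` alone.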

Conclusions:

  • RandomForestsGLS offers a powerful and flexible approach for analyzing dependent data in fields utilizing geographical information systems and remote sensing.
  • Unlike standard machine learning tools or traditional spatial/temporal software, the method combines flexible non-linear mean estimation with explicit correlation modeling.
  • This approach enhances the accuracy and reliability of statistical modeling for complex, correlated datasets.