Updated: Jul 16, 2025

Forward variable selection for random forest models.

Jasper Velthoen¹, Juan-Juan Cai², Geurt Jongbloed¹

¹Department of Applied Mathematics, Delft University of Technology, Delft, The Netherlands.

Journal of applied statistics

September 18, 2023

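Summary

This summary is machine-generated.

This study introduces a new interpretable prediction method that combines forward variable selection with the continuous ranked probability score (CRPS). The method markedly reduces false positives in variable selection for high-dimensional data.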

Keywords:

CRPS; random forests; relevant covariates; forward selection; variable selection

Scientific fields:

Statistics
Machine learning
Environmental science

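Background:

Random forests are effective for high-dimensional data but lack interpretability.
Interpretable predictive models are essential for understanding complex relationships.

Purpose of the study:

To develop a forward variable selection method for interpretable predictive modeling.
To minimize the continuous ranked probability score (CRPS) in order to achieve optimal variable selection.
To provide a statistically rigorous way of selecting relevant covariates.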
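
For reference, the score named in these objectives has a standard definition, reproduced here as a general fact about the CRPS rather than as notation taken from the paper. For a predictive distribution function F and an observed outcome y, with X and X' independent draws from F (assumed to have a finite first moment):

```latex
\mathrm{CRPS}(F, y)
  = \int_{-\infty}^{\infty} \bigl( F(z) - \mathbf{1}\{ y \le z \} \bigr)^{2} \, dz
  = \mathbb{E}_{F}\lvert X - y \rvert - \tfrac{1}{2}\, \mathbb{E}_{F}\lvert X - X' \rvert
```

Lower values indicate a better probabilistic forecast, and for a point forecast the score reduces to the absolute error.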

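Main methods:

A stepwise forward selection procedure that minimizes the CRPS.
A stopping criterion based on an estimate of the CRPS risk difference.
A mathematical proof of optimality in the population sense.
A simulation study comparing performance with existing methods.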
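
The selection loop listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `crps_ensemble`, `forward_select`, `risk`, and `min_gain` are hypothetical, the risk function is left as a callback (in practice it might be the held-out mean CRPS of a random forest fitted on the chosen covariates), and the simple `min_gain` threshold is only a stand-in for the paper's stopping criterion based on an estimated CRPS risk difference.

```python
from typing import Callable, FrozenSet, List, Sequence


def crps_ensemble(members: Sequence[float], y: float) -> float:
    """Empirical CRPS of an ensemble forecast: E|X - y| - 0.5 * E|X - X'|."""
    m = len(members)
    term_obs = sum(abs(x - y) for x in members) / m
    term_spread = sum(abs(a - b) for a in members for b in members) / (m * m)
    return term_obs - 0.5 * term_spread


def forward_select(
    n_features: int,
    risk: Callable[[FrozenSet[int]], float],
    min_gain: float = 0.0,
) -> List[int]:
    """Greedy forward selection over feature indices 0..n_features-1.

    At each step, add the feature whose inclusion lowers `risk` the most.
    Stop when the best improvement is not larger than `min_gain`
    (an assumed, simplified stopping rule).
    """
    selected: List[int] = []
    current = risk(frozenset())
    while len(selected) < n_features:
        candidates = [j for j in range(n_features) if j not in selected]
        scores = {j: risk(frozenset(selected + [j])) for j in candidates}
        best = min(scores, key=scores.get)
        if current - scores[best] <= min_gain:
            break
        selected.append(best)
        current = scores[best]
    return selected


# Toy risk (hypothetical): features 0 and 2 are informative,
# and every extra feature costs a small penalty.
def toy_risk(subset: FrozenSet[int]) -> float:
    return 1.0 - 0.5 * (0 in subset) - 0.3 * (2 in subset) + 0.01 * len(subset)


print(forward_select(4, toy_risk))          # [0, 2]
print(crps_ensemble([0.0, 2.0], 1.0))       # 0.5
```

Passing the risk as a callback keeps the greedy loop independent of the forecasting model, so the same loop could wrap any estimator that returns a predictive distribution or ensemble.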

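Main results:

Compared with existing techniques, the proposed method achieves a lower false positive rate.
It proved effective in the statistical post-processing of temperature forecasts.
It selected roughly 10% of the covariates while maintaining predictive power.

Conclusions:

The method offers an interpretable alternative for high-dimensional prediction.
It provides a robust approach to variable selection in statistical modeling.
It is applicable to real-world prediction problems and improves model transparency.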