Updated: Jan 15, 2026

Inhibiting Error Exacerbation in Offline Reinforcement Learning With Data Sparsity

Fan Zhang, Malu Zhang, Wenyu Chen

IEEE transactions on neural networks and learning systems

October 9, 2025

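Summary

This summary is machine-generated.

Offline reinforcement learning (RL) agents can be improved by addressing data sparsity, a key factor in estimation error. The IEEDS method uses V-nets and a state-aware sparsity Markov decision process (MDP) to mitigate these errors and achieve better performance.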

Scientific fields:

Artificial intelligence
Machine learning
Reinforcement learning

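Background:

Offline reinforcement learning (RL) learns from a fixed dataset, avoiding risky real-time interaction.
Approximation errors on out-of-distribution (OOD) data can degrade the performance of offline RL.
Data sparsity significantly affects estimation error, yet it is often overlooked.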
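
One way to see why sparsity matters is a simplified setting that is an assumption of this note, not taken from the paper: suppose the value of a state s is estimated by averaging n independent sampled returns G_1, ..., G_n, each with mean V(s) and variance σ². The estimate and its mean squared error are then:

```latex
\hat{V}_n(s) = \frac{1}{n}\sum_{i=1}^{n} G_i,
\qquad
\mathbb{E}\!\left[\big(\hat{V}_n(s) - V(s)\big)^2\right] = \frac{\sigma^2}{n}.
```

Under this assumption, a state that appears only a few times in the dataset (small n) carries a larger estimation error than a densely covered one.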

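Objectives:

Propose a new offline RL method, IEEDS, that inhibits the error exacerbation caused by data sparsity.
Develop a value estimation method that accounts for the effect of data sparsity.
Improve the stability and performance of offline RL agents.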

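Main methods:

Implemented an offline RL method (IEEDS) that focuses on data sparsity.
Introduced a new value estimation method that uses V-nets instead of Q-nets, taking advantage of the denser state space.
Designed a state-aware sparsity Markov decision process (MDP) to incorporate state sparsity into training.
Proved theoretically that IEEDS converges under the proposed MDP framework.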
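
The intuition behind the methods listed above can be sketched on a toy tabular dataset. This is an illustrative sketch only, not the IEEDS algorithm: the dataset, the helper `mean_samples_per_entry`, and the count-based `sparsity_weight` with its parameter `k` are all hypothetical names and choices introduced here. It shows that the same data covers a state-indexed estimator (V-style) more densely than a state-action-indexed one (Q-style), and one simple way a rarely seen state could be given less influence.

```python
from collections import Counter

# Hypothetical toy offline dataset of (state, action) pairs; not from the paper.
dataset = [
    ("s0", "left"), ("s0", "right"), ("s0", "left"),
    ("s1", "left"), ("s1", "right"),
    ("s2", "right"),
]


def mean_samples_per_entry(keys):
    """Average number of dataset samples backing each distinct table entry."""
    counts = Counter(keys)
    return len(keys) / len(counts)


# A V-style estimator is indexed by state only.
v_density = mean_samples_per_entry([state for state, _ in dataset])
# A Q-style estimator is indexed by (state, action) pairs.
q_density = mean_samples_per_entry(dataset)


def sparsity_weight(count, k=1.0):
    """Assumed count-based weight in [0, 1): rarely seen states get less weight.

    This weighting rule is an illustration, not the rule used by IEEDS.
    """
    return count / (count + k)


state_counts = Counter(state for state, _ in dataset)
weights = {state: sparsity_weight(n) for state, n in state_counts.items()}

print(f"samples per V entry: {v_density}")
print(f"samples per Q entry: {q_density}")
print(f"per-state weights:   {weights}")
```

In this toy data each state is backed by 2.0 samples on average, while each state-action pair is backed by only 1.2, and the least-visited state `s2` receives the smallest weight.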

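Main results:

By accounting for data sparsity, IEEDS effectively inhibits error exacerbation.
Using V-nets yields more accurate value estimates because the data are concentrated in a smaller state space.
The state-aware sparsity MDP reduces the influence of sparse states during training.
Extensive experiments on offline RL benchmarks show that IEEDS outperforms existing methods.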

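Conclusions:

Data sparsity is a key factor affecting estimation error in offline RL.
The proposed IEEDS method offers a robust way to mitigate error exacerbation in offline RL.
By managing data sparsity effectively and improving the accuracy of value estimation, IEEDS improves agent performance.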