Inhibiting Error Exacerbation in Offline Reinforcement Learning With Data Sparsity.

Fan Zhang, Malu Zhang, Wenyu Chen

IEEE Transactions on Neural Networks and Learning Systems

October 9, 2025

Summary

This summary is machine-generated.

Offline reinforcement learning (RL) agents can be improved by addressing data sparsity, a key factor in estimation errors. The IEEDS approach uses V-nets and a state-aware-sparsity Markov decision process (MDP) to mitigate these errors and improve performance.

Area of Science:

Artificial Intelligence
Machine Learning
Reinforcement Learning

Background:

Offline reinforcement learning (RL) learns from fixed datasets, avoiding risky real-time interaction.
Out-of-distribution (OOD) approximation errors can lead to performance degradation in offline RL.
Data sparsity, an often overlooked factor, significantly affects estimation errors.
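
To make the background points concrete, the following is a minimal sketch of how coverage of a fixed dataset could be inspected. It is an illustration only, not part of IEEDS: the toy transitions, the names DATASET and coverage_report, and the min_count threshold are all assumptions introduced here. Pairs that never appear are the out-of-distribution ones, and pairs seen fewer than min_count times are treated as sparse.

```python
from collections import Counter
from itertools import product

# Toy offline dataset of (state, action, reward, next_state) transitions.
# All values are invented for illustration; they do not come from the paper.
DATASET = [
    ("s0", "left", 0.0, "s1"),
    ("s0", "left", 0.0, "s1"),
    ("s0", "right", 1.0, "s2"),
    ("s1", "left", 0.0, "s0"),
    ("s1", "left", 0.0, "s0"),
    ("s1", "left", 0.0, "s0"),
    ("s2", "right", 1.0, "s2"),
]
STATES = ["s0", "s1", "s2"]
ACTIONS = ["left", "right"]


def coverage_report(dataset, states, actions, min_count=2):
    """Classify every (state, action) pair as unseen, sparse, or covered.

    min_count is a hypothetical threshold: pairs seen at least once but
    fewer than min_count times are labeled sparse.
    """
    counts = Counter((s, a) for s, a, _, _ in dataset)
    report = {"unseen": [], "sparse": [], "covered": []}
    for pair in product(states, actions):
        n = counts[pair]
        if n == 0:
            report["unseen"].append(pair)   # out-of-distribution for this dataset
        elif n < min_count:
            report["sparse"].append(pair)   # present, but weakly supported
        else:
            report["covered"].append(pair)
    return report


report = coverage_report(DATASET, STATES, ACTIONS)
for label, pairs in report.items():
    print(label, pairs)
```

An agent that bootstraps from the unseen pairs has no data to correct its estimates there, and estimates for the sparse pairs rest on very few samples.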

Purpose of the Study:

To propose a novel offline RL approach, IEEDS, to inhibit error exacerbation caused by data sparsity.
To develop a value estimation method that accounts for the influence of data sparsity.
To improve the stability and performance of offline RL agents.

Main Methods:

Implemented an offline RL approach (IEEDS) focusing on data sparsity.
Introduced a novel value estimation method that uses V-nets instead of Q-nets, since data are denser in the smaller state space.
Designed a state-aware-sparsity Markov decision process (MDP) to incorporate state sparsity into training.
Theoretically proved the convergence of IEEDS under the proposed MDP framework.
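
The general idea behind these methods can be sketched in tabular form. The summary does not give the formulation of the state-aware-sparsity MDP, so everything below is an assumption made for illustration: a dictionary stands in for the V-net, the update is ordinary TD(0) on state values, and sparsity_weight is a hypothetical count-based factor that shrinks the bootstrapped term when the successor state is rarely seen. It should not be read as the authors' algorithm.

```python
from collections import Counter


def sparsity_weight(count, k=1.0):
    """Hypothetical weight: 0 for unseen states, approaching 1 for frequent ones."""
    if count == 0:
        return 0.0
    return count / (count + k)


def fit_state_values(dataset, gamma=0.9, alpha=0.1, sweeps=200, k=1.0):
    """Tabular TD(0) on state values from a fixed dataset.

    dataset holds (state, action, reward, next_state) tuples. The bootstrapped
    term is scaled by sparsity_weight of the successor state's visit count,
    which is an assumption of this sketch, not the published method.
    """
    counts = Counter(s for s, _, _, _ in dataset)
    values = {}
    for s, _, _, s_next in dataset:
        values.setdefault(s, 0.0)
        values.setdefault(s_next, 0.0)
    for _ in range(sweeps):
        for s, _, r, s_next in dataset:
            w = sparsity_weight(counts[s_next], k)
            target = r + gamma * w * values[s_next]
            values[s] += alpha * (target - values[s])
    return values


# Toy data: state "b" appears only once as a source state, so it is sparse.
toy = [("a", "go", 0.0, "b"), ("b", "go", 1.0, "b")]
weighted = fit_state_values(toy, k=1.0)   # sparse successor is down-weighted
plain = fit_state_values(toy, k=0.0)      # k=0 gives unweighted TD(0)
print(weighted["b"], plain["b"])
```

With k set to 0 the weight is 1 for every visited state, which recovers plain TD(0). Larger k makes the estimate more conservative wherever the data are thin.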

Main Results:

The IEEDS approach effectively inhibits error exacerbation by considering data sparsity.
Using V-nets leads to more accurate value estimation due to concentrated data in smaller state spaces.
The state-aware-sparsity MDP successfully lessens the impact of sparse states during training.
Extensive experiments on offline RL benchmarks demonstrated IEEDS's superior performance compared to existing methods.
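
The density argument behind the V-net result can be illustrated with simple counting. This is a sketch under assumptions, not an experiment from the paper: the toy transitions and the helper samples_per_entry are hypothetical, and a table stands in for a network. Every visit to a state-action pair is also a visit to its state, so the same dataset is spread over fewer entries when values are indexed by state alone.

```python
from collections import Counter

# Invented transitions (state, action, reward, next_state) for illustration.
TOY_DATASET = [
    ("A", 0, 0.0, "B"),
    ("A", 1, 0.0, "B"),
    ("A", 2, 1.0, "A"),
    ("B", 0, 0.0, "A"),
    ("B", 1, 1.0, "B"),
    ("B", 1, 1.0, "B"),
]


def samples_per_entry(dataset):
    """Return average samples per visited state and per visited (state, action) pair."""
    state_counts = Counter(s for s, _, _, _ in dataset)
    pair_counts = Counter((s, a) for s, a, _, _ in dataset)
    n = len(dataset)
    return n / len(state_counts), n / len(pair_counts)


per_state, per_pair = samples_per_entry(TOY_DATASET)
print(f"samples per state: {per_state:.2f}, per state-action pair: {per_pair:.2f}")
```

In this toy dataset each visited state is backed by more samples on average than each visited state-action pair, which is the sense in which the state space is the denser one.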

Conclusions:

Data sparsity is a critical factor influencing estimation errors in offline RL.
The proposed IEEDS method offers a robust solution for mitigating error exacerbation in offline RL.
IEEDS enhances agent performance by effectively managing data sparsity and improving value estimation accuracy.