
Robust Losses for Learning Value Functions.

Andrew Patterson, Victor Liao, Martha White

IEEE Transactions on Pattern Analysis and Machine Intelligence

October 13, 2022

Summary

This summary is machine-generated.

This study introduces robust losses for reinforcement learning to address the sensitivity of the mean squared Bellman error to outliers. The resulting algorithms learn value functions more stably and are less sensitive to parameter choices.


Area of Science:

Machine Learning
Artificial Intelligence
Optimization

Background:

Reinforcement learning (RL) often uses mean squared Bellman error, which is sensitive to outliers.
Outliers cause skewed solutions and high-variance gradients, necessitating clipping or rescaling strategies.
Existing RL methods that apply these strategies rely on semi-gradient update rules, which do not minimize any well-defined loss.
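
The outlier sensitivity described above can be illustrated with a small numerical sketch. The target values below are hypothetical and are not taken from the paper. The sketch relies only on two standard facts: the minimizer of a sum of squared errors is the mean, and the minimizer of a sum of absolute errors is the median.

```python
import numpy as np

# Hypothetical bootstrapped targets observed for one state.
# The last entry plays the role of an outlier.
targets = np.array([1.0, 1.1, 0.9, 1.0, 50.0])

# The squared-error minimizer is the mean, which the outlier drags upward.
mean_fit = targets.mean()

# The absolute-error minimizer is the median, which ignores the outlier's size.
median_fit = np.median(targets)

# Per-sample gradient of the squared error (v - y)^2 at the estimate v = 1.0.
# Its magnitude grows linearly with the error, so one outlier dominates.
v = 1.0
squared_grads = 2.0 * (v - targets)
largest_grad = np.abs(squared_grads).max()

print(mean_fit, median_fit, largest_grad)
```

A single outlier moves the squared-error solution far from the bulk of the data and produces a gradient far larger than the others, which is the situation that clipping or rescaling is meant to handle.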

Purpose of the Study:

Reformulate Bellman errors using robust losses like Huber and Absolute Bellman error.
Develop sound gradient-based algorithms for online, off-policy prediction and control.
Analyze the benefits of robust losses over mean squared Bellman error.
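
For reference, the two robust losses named above can be written down in a few lines. This is a minimal sketch using the standard textbook definitions. The threshold name `tau` and the function names are assumptions for illustration, and the scaling may differ from the convention used in the paper.

```python
import numpy as np

def huber(delta, tau=1.0):
    """Standard Huber loss: quadratic near zero, linear beyond tau."""
    delta = np.asarray(delta, dtype=float)
    quadratic = 0.5 * delta ** 2
    linear = tau * (np.abs(delta) - 0.5 * tau)
    return np.where(np.abs(delta) <= tau, quadratic, linear)

def huber_grad(delta, tau=1.0):
    """Derivative of the Huber loss: the error clipped to [-tau, tau]."""
    return np.clip(np.asarray(delta, dtype=float), -tau, tau)

def absolute(delta):
    """Absolute loss; its (sub)gradient is the sign of the error."""
    return np.abs(np.asarray(delta, dtype=float))

errors = np.array([-0.5, 0.5, 3.0])
print(huber(errors), huber_grad(errors), absolute(errors))
```

Because the Huber derivative is exactly the clipped error, its gradient magnitude is bounded by `tau` no matter how large the error becomes.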

Main Methods:

Reformulated squared Bellman errors as a saddlepoint optimization problem.
Derived gradient-based algorithms for Huber Bellman error and Absolute Bellman error.
Formalized robust loss functions and analyzed their properties.
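
One way to read the saddlepoint reformulation is through standard convex-conjugate identities, sketched below. The notation (the TD error \delta, an auxiliary function h, a state weighting d, and a Huber threshold \tau) is assumed for illustration, and the constants may differ from the paper's exact formulation.

```latex
% TD error for a transition (S, R, S') under value estimate v
\delta = R + \gamma\, v(S') - v(S)

% Conjugate (variational) forms of the three scalar losses
x^2 = \max_{h \in \mathbb{R}} \left( 2xh - h^2 \right), \qquad
|x| = \max_{h \in [-1,\,1]} xh, \qquad
\mathrm{Huber}_\tau(x) = \max_{h \in [-\tau,\,\tau]} \left( xh - \tfrac{1}{2}h^2 \right)

% Applying the first identity state by state to the squared Bellman error
\min_v \sum_s d(s)\, \mathbb{E}[\delta \mid S = s]^2
  = \min_v \max_h \sum_s d(s) \left( 2\, \mathbb{E}[\delta \mid S = s]\, h(s) - h(s)^2 \right)
```

In this form the conditional expectation appears linearly, so the objective can be estimated from single sampled transitions. Swapping in the constrained identities gives the corresponding absolute and Huber variants.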

Main Results:

Proposed saddlepoint reformulations for robust Bellman errors.
Derived gradient-based algorithms for prediction and control settings.
Characterized solutions, showing advantages over mean squared Bellman error in specific scenarios.
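
To make the idea of a gradient-based saddlepoint algorithm concrete, here is a minimal tabular sketch for the prediction setting. It is not the paper's algorithm: it is plain projected gradient descent-ascent on the Huber conjugate objective delta * h - h^2 / 2, with h constrained to [-tau, tau]. The function name, step sizes, and toy chain are all assumptions for illustration. The toy transitions are deterministic, so each sampled TD error equals its conditional expectation.

```python
import numpy as np

def huber_saddlepoint_td(transitions, n_states, gamma=0.9, tau=1.0,
                         alpha=0.05, beta=0.2, sweeps=3000):
    """Tabular projected gradient descent-ascent (illustrative sketch).

    transitions: list of (state, reward, next_state) tuples.
    Returns the learned value table v.
    """
    v = np.zeros(n_states)  # primal variable: value estimates
    h = np.zeros(n_states)  # auxiliary variable: one entry per state
    for _ in range(sweeps):
        for s, r, s_next in transitions:
            delta = r + gamma * v[s_next] - v[s]
            # Ascent step on h for delta*h - h^2/2, then projection onto
            # [-tau, tau], which is what makes the loss Huber-like.
            h[s] = np.clip(h[s] + beta * (delta - h[s]), -tau, tau)
            # Descent step on v: the gradient of delta*h with respect to v
            # is h * (gamma * e_{s'} - e_s), so we move against it.
            v[s] += alpha * h[s]
            v[s_next] -= alpha * gamma * h[s]
    return v

# Hypothetical two-state chain: state 0 moves to state 1 with reward 0,
# and state 1 loops on itself with reward 1.
toy_transitions = [(0, 0.0, 1), (1, 1.0, 1)]
values = huber_saddlepoint_td(toy_transitions, n_states=2, gamma=0.5)
print(values)
```

Unlike a semi-gradient update, the step on v here includes the term through the next state's value, so both updates follow the gradient of a single well-defined objective.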

Conclusions:

Robust Bellman error algorithms demonstrate improved stability in RL.
The proposed methods are less sensitive to meta-parameters than traditional approaches.
This work provides a theoretical foundation and practical algorithms for robust value function learning.