Discovering and Exploiting Sparse Rewards in a Learned Behavior Space.

Giuseppe Paolo¹, Miranda Coninx², Alban Laflaquière³

¹AI Lab, SoftBank Robotics Europe; Sorbonne Université, CNRS, Institut des Systèmes Intelligents et de Robotique (ISIR), Paris, France. giuseppe.paolo@softbankrobotics.com

Evolutionary Computation | October 4, 2023

Summary

This summary is machine-generated.

STAX is an algorithm for reinforcement learning in sparse reward settings that learns a behavior space on the fly and explores it to discover rewards. Unlike prior methods, it does not need a pre-defined behavior space.

Keywords:

Sparse rewards; emitters; evolutionary algorithms; novelty search; quality diversity

Area of Science:

Artificial Intelligence
Machine Learning
Robotics

Background:

Reinforcement learning in sparse reward environments is challenging due to limited feedback.
Effective exploration strategies are crucial for discovering reward signals.

Purpose of the Study:

Introduce STAX, an algorithm that learns the behavior space on the fly in sparse reward settings.
Enable agents to explore and exploit rewards without pre-defined behavior spaces.

Main Methods:

STAX employs a two-step alternating process: policy exploration and reward exploitation.
It learns a low-dimensional behavior representation from high-dimensional observations.
Diverse policies are generated and evaluated in the learned behavior space.
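
The alternation described above can be illustrated with a small, self-contained Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the toy `rollout` environment, the `TARGET` and `RADIUS` reward region, a linear PCA projection (`fit_encoder`) standing in for whatever representation learner the paper uses, a k-nearest-neighbor `novelty` score, and a plain hill climb standing in for the exploitation step.

```python
import math
import random

OBS_DIM = 8            # size of the toy high-dimensional observation (assumption)
TARGET = (0.6, 0.6)    # hypothetical rewarding region, unknown to the agent
RADIUS = 0.3


def rollout(theta):
    """Toy environment: map policy parameters to (observation, reward)."""
    x, y = theta
    obs = [x * math.cos(i) + y * math.sin(i) for i in range(OBS_DIM)]
    d = math.dist(theta, TARGET)
    reward = 1.0 - d / RADIUS if d < RADIUS else 0.0   # zero outside the region
    return obs, reward


def fit_encoder(observations, dims=2, iters=20):
    """Learn a linear projection (PCA by power iteration) as a stand-in encoder."""
    n = len(observations)
    mean = [sum(col) / n for col in zip(*observations)]
    data = [[v - m for v, m in zip(o, mean)] for o in observations]
    rng = random.Random(0)
    comps = []
    for _ in range(dims):
        w = [rng.uniform(-1, 1) for _ in mean]
        for _ in range(iters):
            proj = [sum(a * b for a, b in zip(row, w)) for row in data]
            w = [sum(p * row[j] for p, row in zip(proj, data)) for j in range(len(mean))]
            for c in comps:  # deflate: remove directions already found
                dot = sum(a * b for a, b in zip(w, c))
                w = [a - dot * b for a, b in zip(w, c)]
            norm = math.sqrt(sum(a * a for a in w)) or 1.0
            w = [a / norm for a in w]
        comps.append(w)

    def encode(obs):
        centered = [v - m for v, m in zip(obs, mean)]
        return tuple(sum(a * b for a, b in zip(centered, c)) for c in comps)
    return encode


def novelty(behavior, reference, k=5):
    """Mean distance to the k nearest behaviors; `reference` includes `behavior`."""
    dists = sorted(math.dist(behavior, other) for other in reference)
    nearest = dists[1:k + 1]          # skip the zero distance to itself
    return sum(nearest) / len(nearest) if nearest else 0.0


def mutate(theta, sigma, rng):
    """Gaussian perturbation, clipped to the parameter range [-1, 1]."""
    return tuple(min(1.0, max(-1.0, t + rng.gauss(0.0, sigma))) for t in theta)


def run(iterations=15, pop_size=10, seed=0):
    rng = random.Random(seed)
    population = [(rng.uniform(-0.2, 0.2), rng.uniform(-0.2, 0.2)) for _ in range(pop_size)]
    archive = []                      # observations of policies kept for novelty
    best_theta, best_reward = None, 0.0
    for _ in range(iterations):
        # Step 1: exploration, driven by novelty in the learned behavior space.
        candidates = population + [mutate(t, 0.2, rng) for t in population]
        results = [rollout(t) for t in candidates]
        cand_obs = [obs for obs, _ in results]
        encode = fit_encoder(archive + cand_obs)        # re-learn the space each round
        behaviors = [encode(o) for o in cand_obs]
        reference = behaviors + [encode(o) for o in archive]
        scores = [novelty(b, reference) for b in behaviors]
        ranked = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
        population = [candidates[i] for i in ranked[:pop_size]]
        archive.extend(cand_obs[i] for i in ranked[:2])
        for theta, (_, reward) in zip(candidates, results):
            if reward > best_reward:
                best_theta, best_reward = theta, reward
        # Step 2: exploitation, a local hill climb around the best rewarding policy.
        if best_theta is not None:
            for _ in range(pop_size):
                trial = mutate(best_theta, 0.05, rng)
                _, reward = rollout(trial)
                if reward > best_reward:
                    best_theta, best_reward = trial, reward
    return best_theta, best_reward, archive
```

Calling `run()` returns the best policy parameters found (or `None` if no reward was discovered), the associated reward, and the archive of stored observations. The encoder is refit every iteration, so archived observations are re-encoded each round rather than stored as behavior descriptors; that is one simple way to keep novelty scores consistent while the learned space changes.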

Main Results:

STAX performs comparably to existing baselines on sparse reward tasks.
It does so while requiring much less task-specific prior information than those baselines.
STAX autonomously constructs the necessary behavior space for exploration.

Conclusions:

STAX offers an effective solution for reinforcement learning in challenging sparse reward scenarios.
Its ability to learn behavior spaces dynamically enhances exploration efficiency.
The method shows promise for advancing autonomous agents in complex environments.