Safe Reinforcement Learning With Dual Robustness.

Zeyang Li, Chuxiong Hu, Yunan Wang

IEEE Transactions on Pattern Analysis and Machine Intelligence

August 15, 2024

Summary

This summary is machine-generated.

This study introduces a unified framework for reinforcement learning (RL) agents, enabling them to be both safe and robust against adversarial disturbances. The novel dual policy iteration scheme ensures agents maintain performance and safety even under worst-case scenarios.

Area of Science:

Artificial Intelligence
Machine Learning
Robotics

Background:

Reinforcement learning (RL) agents are susceptible to adversarial attacks, which can compromise both performance and safety.
Existing safe RL and robust RL methods address these issues in isolation, leaving a gap for unified solutions.
The challenge lies in balancing feasibility and optimality under adversarial conditions.

Purpose of the Study:

To develop a systematic framework unifying safe and robust reinforcement learning.
To address the intertwined challenges of feasibility and optimality in adversarial environments.
To create a reinforcement learning agent that remains safe and performs optimally under adversarial disturbances.

Main Methods:

Formulation of the problem as constrained two-player zero-sum Markov games.
Proposal of a dual policy iteration scheme to simultaneously optimize task and safety policies.
Development of a practical deep reinforcement learning algorithm, dually robust actor-critic (DRAC), using adversarial networks.
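
The idea of optimizing a task policy while keeping safety under a worst-case adversary can be illustrated on a small tabular game. The following is a minimal sketch, not the DRAC algorithm or the paper's dual policy iteration: it assumes a finite game with known transitions, a pure-strategy adversary that responds to the agent's action, and a hard state constraint. The names `robust_safe_value_iteration` and `toy_game`, the array layout, and the toy numbers are all hypothetical and chosen only for illustration.

```python
import numpy as np


def robust_safe_value_iteration(P, R, unsafe, gamma=0.9, iters=200):
    """Worst-case (max-min) value iteration restricted to robustly safe actions.

    Illustrative assumptions (not taken from the paper):
      P[s, a, d, s2] : transition probability for agent action a, adversary action d
      R[s, a, d]     : task reward
      unsafe[s]      : boolean mask, True for constraint-violating states
    """
    # Step 1 (feasibility): shrink the set of states until, from every
    # remaining state, some action stays inside the set for every disturbance.
    feasible = ~unsafe
    while True:
        stay = P @ feasible.astype(float)              # shape (S, A, D)
        safe_action = stay.min(axis=2) >= 1.0 - 1e-12  # worst case over d
        shrunk = feasible & safe_action.any(axis=1)
        if np.array_equal(shrunk, feasible):
            break
        feasible = shrunk

    # Step 2 (optimality): the agent maximizes over safe actions only,
    # the adversary minimizes over disturbances.
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        Q = R + gamma * (P @ V)                        # shape (S, A, D)
        worst = np.where(safe_action, Q.min(axis=2), -np.inf)
        V = np.where(feasible, worst.max(axis=1), 0.0)
    policy = np.where(feasible, worst.argmax(axis=1), -1)  # -1: no safe action
    return feasible, V, policy


def toy_game():
    """Hypothetical 3-state game: 0 = start, 1 = goal, 2 = unsafe."""
    S, A, D = 3, 2, 2
    P = np.zeros((S, A, D, S))
    R = np.zeros((S, A, D))
    P[0, 0, :, 1] = 1.0   # cautious action: reaches the goal for any disturbance
    R[0, 0, :] = 1.0
    P[0, 1, 0, 1] = 1.0   # risky action: reaches the goal if undisturbed
    P[0, 1, 1, 2] = 1.0   # risky action: pushed into the unsafe state
    R[0, 1, :] = 2.0      # risky action pays more
    P[1, :, :, 1] = 1.0   # goal is absorbing
    P[2, :, :, 2] = 1.0   # unsafe state is absorbing
    unsafe = np.array([False, False, True])
    return P, R, unsafe


if __name__ == "__main__":
    feasible, V, policy = robust_safe_value_iteration(*toy_game())
    print(feasible, V, policy)
```

In this toy game the risky action earns a higher reward but is excluded because the adversary can push the agent into the unsafe state, so the agent takes the cautious action from the start state. A deep RL method would replace the tables with learned networks; this sketch only shows how the feasibility and optimality steps interact.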

Main Results:

The dual policy iteration scheme converges to optimal policies for both task performance and safety.
DRAC demonstrates high performance and persistent safety across various adversarial scenarios (no adversary, safety adversary, performance adversary).
DRAC significantly outperforms existing baseline methods on safety-critical benchmarks.

Conclusions:

The proposed framework effectively unifies safe and robust reinforcement learning.
DRAC offers a practical and effective solution for developing agents that are both safe and robust in adversarial settings.
This work advances the development of reliable AI systems in safety-critical applications.