PAC Reinforcement Learning Algorithm for General-Sum Markov Games.

Ashkan Zehfroosh¹, Herbert G Tanner¹

  • ¹ Department of Mechanical Engineering, University of Delaware, Newark, DE 19716, USA.

IEEE Transactions on Automatic Control | November 2, 2023
PubMed
Summary
This summary is machine-generated.

This study introduces a framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) in Markov games. It presents a novel PAC MARL algorithm for general-sum games, enhancing existing methods and enabling PAC verification.

Keywords:
Markov Game, Multi-agent system, Nash Equilibrium, Probably approximately correct, Reinforcement Learning

Area of Science:

  • Artificial Intelligence
  • Machine Learning
  • Game Theory

Background:

  • Multi-agent reinforcement learning (MARL) is crucial for complex decision-making.
  • Markov games are standard models for strategic interactions.
  • Existing MARL algorithms often lack theoretical performance guarantees.

Purpose of the Study:

  • To develop a theoretical framework for probably approximately correct (PAC) MARL algorithms.
  • To introduce a novel PAC MARL algorithm for general-sum Markov games.
  • To provide a method for verifying the PAC property of MARL algorithms.
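
For orientation, the PAC property is often stated in the single-agent setting (PAC-MDP) roughly as below. This is the commonly used single-agent form, given here only as background; the definition the paper adopts for Markov games may differ. The symbols are assumptions introduced for this sketch: $\mathcal{A}_t$ is the learner's policy at step $t$, $V^{*}$ the optimal value function, $\epsilon$ the accuracy, $\delta$ the failure probability, $\gamma$ the discount factor, and $p(\cdot)$ some polynomial.

```latex
% Single-agent PAC-MDP condition (background sketch, not the paper's definition):
% with probability at least 1 - \delta, the number of steps on which the learner
% is more than \epsilon worse than optimal is polynomially bounded.
\Pr\Big[\,
  \big|\{\, t : V^{\mathcal{A}_t}(s_t) < V^{*}(s_t) - \epsilon \,\}\big|
  \;\le\; p\big(|S|,\, |A|,\, \tfrac{1}{\epsilon},\, \tfrac{1}{\delta},\, \tfrac{1}{1-\gamma}\big)
\Big] \;\ge\; 1 - \delta
```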

Main Methods:

  • Extension of Nash Q-learning using delayed Q-learning principles.
  • Development of a theoretical PAC framework for MARL.
  • Comparative numerical simulations to evaluate algorithm performance.
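
The delayed Q-learning idea mentioned in the methods can be illustrated with a minimal single-agent sketch. This is not the paper's algorithm: it omits the per-pair "learn" flags of full delayed Q-learning, and every name and parameter here (`DelayedQSketch`, `m`, `eps1`, `r_max`) is hypothetical. The point it shows is the "delay": a Q-value is only changed after `m` samples have been averaged, and only if the change is large enough. In a Nash Q-learning setting, the `value` function below would instead return an equilibrium value of the stage game at the next state.

```python
from collections import defaultdict


class DelayedQSketch:
    """Simplified delayed-Q-style learner for one agent (illustrative only)."""

    def __init__(self, actions, gamma=0.9, m=5, eps1=0.05, r_max=1.0):
        self.actions = list(actions)
        self.gamma = gamma      # discount factor
        self.m = m              # samples averaged before an update attempt
        self.eps1 = eps1        # minimum-improvement margin
        q_init = r_max / (1.0 - gamma)  # optimistic initial value
        self.q = defaultdict(lambda: q_init)
        self.acc = defaultdict(float)   # accumulated targets per (state, action)
        self.count = defaultdict(int)   # samples gathered since last attempt

    def value(self, state):
        # Greedy state value; a Nash-style variant would use an equilibrium value.
        return max(self.q[(state, a)] for a in self.actions)

    def act(self, state):
        # Act greedily with respect to the (optimistic) Q-values.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def observe(self, s, a, r, s_next):
        """Record one transition; return True if Q(s, a) was updated."""
        key = (s, a)
        self.acc[key] += r + self.gamma * self.value(s_next)
        self.count[key] += 1
        if self.count[key] < self.m:
            return False  # not enough samples yet: the update is delayed
        target = self.acc[key] / self.m
        updated = False
        # Only accept the update if it lowers Q(s, a) by at least 2 * eps1.
        if self.q[key] - target >= 2 * self.eps1:
            self.q[key] = target + self.eps1
            updated = True
        # Reset the batch whether or not the update was accepted.
        self.acc[key] = 0.0
        self.count[key] = 0
        return updated
```

A caller would alternate `act` and `observe` inside an environment loop; the environment itself is left out because the source does not describe one.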

Main Results:

  • A new PAC MARL algorithm for general-sum Markov games is proposed.
  • The theoretical framework allows for PAC verification of MARL algorithms.
  • Numerical results validate the algorithm's performance and robustness.

Conclusions:

  • The proposed framework advances PAC MARL theory.
  • The novel algorithm offers provable PAC guarantees.
  • The framework facilitates the design and analysis of reliable MARL systems.