


Updated: Jun 28, 2025

Adaptive Individual Q-Learning: A Multiagent Reinforcement Learning Method for Coordination Optimization.

Zhen Zhang, Dongqing Wang

    IEEE Transactions on Neural Networks and Learning Systems
    |April 16, 2024
    PubMed
    Summary
    This summary is machine-generated.

    We introduce adaptive individual Q-learning (A-IQL), a cooperative multiagent reinforcement learning (MARL) algorithm. A-IQL is designed to adapt to switched environments, such as fluctuating traffic flow, while optimizing coordination among agents.


    Area of Science:

    • Artificial Intelligence
    • Machine Learning
    • Robotics

    Background:

    • Multiagent reinforcement learning (MARL) is used for coordination optimization because it scales well and distributes tasks across agents.
    • Existing MARL convergence results are largely limited to repeated games, neglecting adaptation to dynamic environments.
    • Few MARL algorithms address environmental shifts, such as fluctuating traffic or unexpected obstacles for automated guided vehicles.

    Purpose of the Study:

    • To propose a novel cooperative MARL algorithm, adaptive individual Q-learning (A-IQL), designed for adaptation to switched environments.
    • To analyze the convergence properties of A-IQL in stochastic games with chronologically ordered deterministic state transitions.
    • To investigate the impact of the update period (T) on A-IQL's convergence.
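
For orientation, the individual Q-learning rule that this family of methods builds on can be written in its standard textbook form. This is a sketch, not the A-IQL update from the paper: the symbols Q_i, \alpha, \gamma, r_t, and f are notation assumed here, and this summary does not specify how A-IQL applies the period T.

```latex
% Standard independent Q-learning update for agent i (textbook form, assumed notation)
Q_i(s_t, a_t^i) \leftarrow Q_i(s_t, a_t^i)
  + \alpha \Big[ r_t + \gamma \max_{a'} Q_i(s_{t+1}, a') - Q_i(s_t, a_t^i) \Big]

% Deterministic state transition: the next state is a function of the joint action
s_{t+1} = f(s_t, a_t^1, \dots, a_t^n)
```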

    Main Methods:

    • The adaptive individual Q-learning (A-IQL) algorithm is proposed, where each agent updates its Q-function with a period T.
    • Convergence analysis is performed for stochastic games with deterministic state transitions in chronological order.
    • A fictitious stochastic game is used to study the influence of period T on convergence.
    • The algorithm's efficacy is validated through simulations in two distinct switched environments: distributed sensor network (DSN) and target transportation tasks.
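
To make the idea of a periodic update concrete, here is a minimal Python sketch of an independent Q-learner that buffers experience and applies its updates once every T steps. It is one possible reading of "updates its Q-function with a period T", not the A-IQL rule from the paper. The class `PeriodicQAgent`, the functions `shared_reward` and `run`, the two-phase reward, and all hyperparameter values are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


class PeriodicQAgent:
    """Independent Q-learner that applies buffered updates every `period` steps.

    Illustrative assumption only: this is NOT the A-IQL update rule.
    """

    def __init__(self, n_actions, period, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.period = period          # T: number of steps between updates
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.epsilon = epsilon        # exploration probability
        self.rng = random.Random(seed)
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.buffer = []              # transitions waiting for the next update

    def act(self, state):
        """Epsilon-greedy choice over this agent's own actions."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[state]
        return values.index(max(values))

    def observe(self, state, action, reward, next_state):
        """Store a transition; apply all stored updates once `period` is reached."""
        self.buffer.append((state, action, reward, next_state))
        if len(self.buffer) >= self.period:
            for s, a, r, s2 in self.buffer:
                target = r + self.gamma * max(self.q[s2])
                self.q[s][a] += self.alpha * (target - self.q[s][a])
            self.buffer.clear()


def shared_reward(joint_action, phase):
    """Hypothetical switched environment: the rewarded joint action
    depends on the current phase."""
    best = (0, 0) if phase == 0 else (1, 1)
    return 1.0 if joint_action == best else 0.0


def run(steps_per_phase=500, period=5):
    """Train two agents on a shared reward that switches between phases."""
    agents = [PeriodicQAgent(2, period, seed=i) for i in range(2)]
    state = 0  # single-state game, so the state transition is trivially deterministic
    for phase in (0, 1):
        for _ in range(steps_per_phase):
            joint = tuple(agent.act(state) for agent in agents)
            reward = shared_reward(joint, phase)
            for agent, action in zip(agents, joint):
                agent.observe(state, action, reward, state)
    return agents
```

Calling `run()` returns the two trained agents, whose `q` tables can then be inspected. Each agent sees only its own action and the shared reward, which is what makes the learner "individual"; the buffer is the only place where the period T enters this sketch.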

    Main Results:

    • A-IQL can learn optimal joint strategies in stochastic games whose state transitions are deterministic and chronologically ordered.
    • The study analyzes the relationship between the update period T and the algorithm's convergence behavior.
    • Empirical validation confirms A-IQL's effectiveness in dynamic scenarios, including DSN and target transportation tasks.

    Conclusions:

    • The proposed A-IQL algorithm is a viable approach to coordination optimization in multiagent systems that face dynamic, switched environments.
    • A-IQL gives agents a way to adapt their strategies as the environment changes, improving overall system performance.
    • The findings underline the need for adaptive mechanisms when MARL is applied to real-world problems.