



Updated: Aug 4, 2025


Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain.

Jianye Hao, Tianpei Yang, Hongyao Tang

    IEEE Transactions on Neural Networks and Learning Systems
    |April 6, 2023
    Summary
    This summary is machine-generated.

    Deep reinforcement learning (DRL) and multiagent reinforcement learning (MARL) face sample inefficiency due to the exploration problem. This survey categorizes and compares exploration methods to improve DRL and MARL efficiency.


    Area of Science:

    • Artificial Intelligence
    • Machine Learning
    • Robotics

    Background:

    • Deep reinforcement learning (DRL) and deep multiagent reinforcement learning (MARL) show promise in AI, autonomous vehicles, and robotics.
    • A key limitation is sample inefficiency: these methods can require millions of interactions with the environment, which hinders real-world deployment.
    • The exploration problem, that is, how to efficiently gather informative experiences, is a major bottleneck, especially in complex environments.
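
To make the exploration bottleneck concrete, here is a minimal sketch in Python. It is not taken from the survey: the epsilon-greedy rule is a standard textbook baseline, and the chain task (the hypothetical `random_episodes_reaching_goal` helper, where action 1 moves right and action 0 resets to the start) is an assumption chosen only to show how rarely undirected random actions reach a distant reward.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Return a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def random_episodes_reaching_goal(chain_length, episodes, rng):
    """Count episodes in which purely random actions reach the end of a chain.

    Illustrative toy task (an assumption, not from the survey): action 1 moves
    one state to the right, action 0 resets to the start, and each episode
    lasts exactly chain_length steps, so success needs chain_length
    consecutive "right" actions.
    """
    successes = 0
    for _ in range(episodes):
        state = 0
        for _ in range(chain_length):
            # epsilon = 1.0 means every action is chosen uniformly at random.
            action = epsilon_greedy([0.0, 0.0], 1.0, rng)
            state = state + 1 if action == 1 else 0
        if state == chain_length:
            successes += 1
    return successes


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (2, 5, 10):
        hits = random_episodes_reaching_goal(n, 10_000, rng)
        print(f"chain length {n}: {hits} of 10000 random episodes reached the goal")
```

In this toy task the chance of success per episode is 0.5 raised to the chain length, so the number of interactions needed before the first reward grows exponentially with the distance to the goal. That is one simple way to see why undirected exploration leads to sample inefficiency.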

    Purpose of the Study:

    • To provide a comprehensive survey of exploration methods in single-agent and multiagent reinforcement learning.
    • To systematically classify existing exploration approaches.
    • To empirically compare different exploration methods for DRL and identify future research directions.

    Main Methods:

    • Categorization of exploration methods into uncertainty-oriented and intrinsic motivation-oriented approaches.
    • Inclusion of other notable exploration techniques.
    • Algorithmic analysis and a unified empirical comparison of DRL exploration methods on standard benchmarks.
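
The two categories above can be illustrated with one small Python sketch each. These are well-known textbook techniques picked for illustration, not algorithms this summary attributes to the survey: `ucb_action` is a UCB1-style rule standing in for the uncertainty-oriented family, and `count_bonus` is a count-based novelty bonus standing in for the intrinsic motivation-oriented family. The function names and the parameters `c` and `beta` are hypothetical.

```python
import math


def ucb_action(mean_rewards, counts, total_steps, c=2.0):
    """Uncertainty-oriented choice: estimated mean plus an optimism bonus.

    Actions that have never been tried are selected first; otherwise the
    bonus sqrt(c * ln(total_steps) / count) shrinks as an action is tried more.
    """
    for action, n in enumerate(counts):
        if n == 0:
            return action
    scores = [
        mean + math.sqrt(c * math.log(total_steps) / n)
        for mean, n in zip(mean_rewards, counts)
    ]
    return max(range(len(scores)), key=lambda a: scores[a])


def count_bonus(visit_counts, state, beta=0.1):
    """Intrinsic-motivation bonus: larger for rarely visited states.

    visit_counts maps a state to how often it has been seen; the bonus is
    beta / sqrt(count + 1) and would be added to the environment reward.
    """
    n = visit_counts.get(state, 0)
    return beta / math.sqrt(n + 1)


if __name__ == "__main__":
    # The rarely tried action wins despite a lower estimated mean.
    print(ucb_action([0.5, 0.4], [100, 1], total_steps=101))
    # A never-visited state earns a larger bonus than a familiar one.
    print(count_bonus({"s0": 3}, "s1"), count_bonus({"s0": 3}, "s0"))
```

Both sketches steer the agent toward what it knows least about. The first does so through uncertainty in its value estimates, the second through an extra reward signal for novelty.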

    Main Results:

    • Identification of key challenges in efficient exploration for DRL and MARL.
    • Systematic review and classification of existing exploration strategies.
    • Empirical evaluation highlighting the performance of various exploration techniques.

    Conclusions:

    • Exploration remains a critical challenge in DRL and MARL.
    • The survey provides a foundation for understanding and advancing exploration strategies.
    • Future research should focus on addressing open problems and developing more efficient exploration techniques.