A multi-agent continual reinforcement learning framework with multi-timescale replay and dynamic task classification.

Yang Liu¹, Xiang Feng¹, Huiqun Yu¹

  • ¹Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China.

Neural networks: the official journal of the International Neural Network Society | March 4, 2026
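Summary
This summary is machine-generated.

This study introduces a novel multi-agent continual reinforcement learning (MACRL) framework. It improves learning and decision-making in dynamic systems by reducing forgetting and improving knowledge transfer.

Keywords:
Multi-agent continual reinforcement learning; catastrophic forgetting; dynamic task classification; multi-timescale replay.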

Scientific fields:

  • Artificial intelligence
  • Machine learning
  • Multi-agent systems

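Background:

  • Conventional reinforcement learning struggles with catastrophic forgetting and poor knowledge transfer in non-stationary, multi-agent environments.
  • Dynamic environments pose significant challenges for sequential task learning in multi-agent systems.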

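Purpose of the study:

  • Propose a novel multi-agent continual reinforcement learning (MACRL) framework.
  • Address catastrophic forgetting and strengthen cross-task knowledge transfer in dynamic multi-agent systems.
  • Enable scalable cooperative decision-making in complex, changing environments.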

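Main methods:

  • Introduced a multi-timescale replay (MTR) buffer that stores experience hierarchically across timescales.
  • Developed a dynamic task classification mechanism that uses an attention-based context encoder to measure task similarity.
  • Implemented adaptive policy routing to minimize inter-task interference.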
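
The summary does not say how the MTR buffer is organized internally, so the following is only a minimal sketch of one plausible reading: a cascade of FIFO sub-buffers in which overflow from a recent level is passed, with some probability, to a level that holds older experience. The class name `MultiTimescaleReplay` and the parameters `num_levels`, `level_capacity`, and `pass_prob` are hypothetical and are not taken from the paper.

```python
import random
from collections import deque


class MultiTimescaleReplay:
    """Illustrative cascade of FIFO sub-buffers (an assumption, not the paper's design).

    Level 0 holds the most recent transitions; deeper levels hold
    progressively older experience that survived earlier evictions.
    """

    def __init__(self, num_levels=3, level_capacity=4, pass_prob=0.5, seed=0):
        self.levels = [deque() for _ in range(num_levels)]
        self.level_capacity = level_capacity
        self.pass_prob = pass_prob
        self.rng = random.Random(seed)

    def add(self, transition):
        item = transition
        for level in self.levels:
            level.append(item)
            if len(level) <= self.level_capacity:
                return
            # Level overflowed: evict its oldest item.
            item = level.popleft()
            # Keep the evicted item for the next level only with pass_prob.
            if self.rng.random() >= self.pass_prob:
                return
        # An item evicted from the last level is discarded.

    def sample(self, batch_size):
        # Uniform sampling over all levels mixes recent and old experience.
        pool = [t for level in self.levels for t in level]
        return self.rng.sample(pool, min(batch_size, len(pool)))

    def __len__(self):
        return sum(len(level) for level in self.levels)
```

With `pass_prob` below 1, deeper levels fill more slowly and therefore span longer stretches of the agent's history, which is the property that lets replayed batches include experience from earlier tasks.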

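Main results:

  • Compared with baselines, the MACRL framework achieved higher average returns in sequential task learning on the cooperative benchmarks LBF and PP.
  • Demonstrated superior zero-shot generalization performance.
  • Ablation studies confirmed the effectiveness of MTR and task classification in mitigating catastrophic forgetting.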

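Conclusions:

  • The proposed MACRL framework offers a scalable solution for continual learning in dynamic multi-agent systems.
  • The MTR buffer and dynamic task classification are essential for retaining knowledge and minimizing interference.
  • The approach improves cooperative decision-making in complex, changing environments.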
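
The task classification and policy routing steps listed under the main methods can be illustrated roughly as follows. This is a sketch under strong assumptions: the paper's context encoder is learned, whereas here `attention_pool` uses a fixed, hand-picked query vector, and task similarity is taken to be cosine similarity against stored prototypes. The names `attention_pool`, `cosine`, `TaskRouter`, and `threshold` are hypothetical.

```python
import math


def attention_pool(features, query):
    """Pool per-transition feature vectors into one context vector
    using softmax attention weights against a query vector."""
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) / total
            for d in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


class TaskRouter:
    """Route a context vector to an existing policy slot if it is similar
    enough to a stored task prototype; otherwise open a new slot."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.prototypes = []  # one context vector per known task

    def route(self, context):
        if self.prototypes:
            sims = [cosine(context, p) for p in self.prototypes]
            best = max(range(len(sims)), key=sims.__getitem__)
            if sims[best] >= self.threshold:
                return best  # reuse the policy for the most similar task
        self.prototypes.append(list(context))
        return len(self.prototypes) - 1  # index of the newly created slot
```

Giving dissimilar tasks separate policy slots is one simple way to limit inter-task interference, while reusing a slot for similar contexts allows knowledge to carry over between related tasks.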