A fully value distributional deep reinforcement learning framework for multi-agent cooperation.

Mingsheng Fu¹, Liwei Huang¹, Fan Li²

¹School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, Sichuan, China.

Neural networks : the official journal of the International Neural Network Society

|December 18, 2024

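Summary

This summary is machine-generated.

This study introduces a novel framework for fully distributional multi-agent reinforcement learning (RL) that guarantees the Individual-Global-Max (IGM) principle. The proposed Fully Distributional Multi-Agent Cooperation (FDMAC) model significantly improves performance on complex cooperative tasks.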

Keywords:

Deep reinforcement learning; Distributional reinforcement learning; Multi-agent cooperation; Neural networks

Scientific fields:

Artificial intelligence
Machine learning
Multi-agent systems

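Background:

Distributional reinforcement learning (RL) models the full distribution of returns, which provides richer information than the expected value alone.
Existing distributional multi-agent systems struggle to satisfy the Individual-Global-Max (IGM) principle when they rely on conventional value decomposition.
A fully distributional multi-agent system requires both the individual and the global value functions to take distributional form.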
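
To make the first point concrete, here is a minimal sketch of why a return distribution can say more than its expected value. The action names and return samples are hypothetical numbers chosen for illustration, not data from the study, and the nearest-rank quantile rule is just one simple convention.

```python
import math
from statistics import fmean, pstdev

# Hypothetical sampled returns for two actions (illustrative numbers only).
RETURNS = {
    "safe": [4.0, 5.0, 5.0, 6.0],
    "risky": [0.0, 0.0, 10.0, 10.0],
}

def quantile(samples, tau):
    """Empirical tau-quantile by the nearest-rank rule, for 0 < tau <= 1."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(tau * len(ordered)))
    return ordered[rank - 1]

def summarize(samples):
    """Return the mean plus two distribution-aware statistics."""
    return {
        "mean": fmean(samples),
        "std": pstdev(samples),
        "q25": quantile(samples, 0.25),
    }

if __name__ == "__main__":
    for action, samples in RETURNS.items():
        print(action, summarize(samples))
```

Both actions have the same mean return, so an expected-value learner cannot tell them apart, while the spread and the lower quantile differ sharply. A distributional learner keeps that extra information available.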

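Study objectives:

Propose a novel fully value distributional multi-agent framework that guarantees the IGM principle.
Build on this framework to introduce a practical deep reinforcement learning model, Fully Distributional Multi-Agent Cooperation (FDMAC).
Validate the effectiveness of FDMAC in complex multi-agent cooperation scenarios.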

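Main methods:

Developed a new value decomposition framework for fully distributional multi-agent systems.
Proved that the proposed framework satisfies the IGM principle.
Implemented the Fully Distributional Multi-Agent Cooperation (FDMAC) deep reinforcement learning model.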
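
The IGM principle requires that the greedy joint action equals the tuple of each agent's greedy individual action. The sketch below checks this condition on a toy example. It is not the FDMAC decomposition, which this summary does not describe: it assumes a simple additive mixer that sums the agents' quantile values position by position, and it ranks distributions by their mean. All agent data and function names are hypothetical.

```python
from itertools import product
from statistics import fmean

# Hypothetical per-agent return distributions, stored as equally weighted
# quantile values: AGENT_QUANTILES[agent][action] -> list of returns.
AGENT_QUANTILES = [
    {"left": [0.0, 1.0, 2.0], "right": [0.5, 0.5, 5.0]},  # agent 0
    {"stay": [1.0, 1.0, 1.0], "move": [-2.0, 0.0, 3.0]},  # agent 1
]

def greedy_individual(quantiles):
    """Each agent picks the action whose distribution has the highest mean."""
    return tuple(max(q, key=lambda a: fmean(q[a])) for q in quantiles)

def joint_quantiles(quantiles, joint_action):
    """Assumed additive mixer: sum the agents' quantile values position by position."""
    chosen = [q[a] for q, a in zip(quantiles, joint_action)]
    return [sum(values) for values in zip(*chosen)]

def greedy_joint(quantiles):
    """Search all joint actions for the highest mean of the joint distribution."""
    joint_actions = product(*(q.keys() for q in quantiles))
    return max(joint_actions, key=lambda ja: fmean(joint_quantiles(quantiles, ja)))

def satisfies_igm(quantiles):
    """True when decentralized greedy choices match the centralized greedy choice."""
    return greedy_individual(quantiles) == greedy_joint(quantiles)

if __name__ == "__main__":
    print("individual greedy:", greedy_individual(AGENT_QUANTILES))
    print("joint greedy:     ", greedy_joint(AGENT_QUANTILES))
    print("IGM holds:        ", satisfies_igm(AGENT_QUANTILES))
```

With an additive mixer the mean of the joint distribution is the sum of the individual means, so the check passes here. The brute-force search over joint actions grows exponentially with the number of agents and is only practical for toy cases like this one.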

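Main results:

The proposed framework guarantees the IGM principle in fully distributional multi-agent systems.
The FDMAC model showed superior performance on the StarCraft Multi-Agent Challenge.
Compared with the best baseline, FDMAC improved the median test win rate by 10.47% on average.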

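Conclusions:

The new framework effectively addresses the limitations of existing distributional multi-agent RL.
FDMAC represents a significant advance in cooperative multi-agent reinforcement learning.
The results highlight the benefits of fully distributional value functions in complex cooperative tasks.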