


Coordinating multi-agent reinforcement learning via dual collaborative constraints.

Chao Li¹, Shaokang Dong¹, Shangdong Yang²

¹State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China.

Neural networks : the official journal of the International Neural Network Society

November 17, 2024

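Summary

This summary is machine-generated.

This study introduces Dual Collaborative Constraints (DCC), a new algorithm for multi-agent reinforcement learning. DCC effectively coordinates agents in tasks with a nearly decomposable structure, improving learning efficiency.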

Keywords:

cooperative tasks; coordination; multi-agent reinforcement learning; nearly decomposable structure


Scientific fields:

Artificial intelligence
Machine learning
Robotics

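Background:

Real-world multi-agent tasks often have a nearly decomposable structure.
Coordinating agents in such tasks is critical to the learning efficiency of multi-agent reinforcement learning (MARL).
Existing MARL algorithms struggle to model and exploit this decomposable structure effectively.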
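
To make "nearly decomposable" concrete, the following is a minimal sketch, not the method from the paper. It assumes a hypothetical pairwise interaction-strength matrix and a hypothetical threshold, and groups agents into candidate subtasks by taking the connected components of the strong-interaction graph. The names `group_agents`, `interaction`, and `threshold` are illustrative only.

```python
from typing import List


def group_agents(interaction: List[List[float]], threshold: float) -> List[List[int]]:
    """Group agents whose pairwise interaction strength reaches `threshold`.

    `interaction[i][j]` is a hypothetical, non-negative measure of how strongly
    agent i and agent j affect each other. Agents linked by a chain of strong
    interactions end up in the same group (a connected component).
    """
    n = len(interaction)
    seen = [False] * n
    groups: List[List[int]] = []
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first search over strong edges, starting from an unvisited agent.
        seen[start] = True
        stack = [start]
        group: List[int] = []
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(n):
                strong = max(interaction[i][j], interaction[j][i]) >= threshold
                if not seen[j] and strong:
                    seen[j] = True
                    stack.append(j)
        groups.append(sorted(group))
    return groups


# Four agents: strong coupling inside {0, 1} and {2, 3}, weak coupling across.
example = [
    [0.0, 0.9, 0.05, 0.0],
    [0.9, 0.0, 0.0, 0.1],
    [0.05, 0.0, 0.0, 0.8],
    [0.0, 0.1, 0.8, 0.0],
]
subtasks = group_agents(example, threshold=0.5)
```

In this toy matrix the weak cross-group entries fall below the threshold, so the agents split into two groups. Lowering the threshold far enough merges everything into a single group, which is the sense in which the structure is only nearly decomposable.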

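Purpose of the study:

Propose a new algorithm, Dual Collaborative Constraints (DCC), for cooperative multi-agent tasks.
Address the limitations of existing methods in handling nearly decomposable task structures.
Increase learning efficiency by improving agent coordination.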

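Main methods:

DCC identifies interaction sets as subtasks.
It uses a bi-level structure that groups agents into subtasks.
Mutual-information-based local and global collaborative constraints are proposed for coordination within and across subtasks.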
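
The summary does not say how the mutual-information constraints are defined, so the following is only an illustrative sketch of the underlying quantity. It computes a plug-in estimate of the mutual information between two agents' discrete actions from paired samples. The function name and the sample data are assumptions, and the paper's actual constraints may be formulated quite differently.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    count_x = Counter(xs)         # marginal counts of x
    count_y = Counter(ys)         # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y).
        mi += p_xy * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


# Two agents that always pick the same action share log(2) nats of information.
coordinated = mutual_information([0, 1, 0, 1], [0, 1, 0, 1])
# Two agents whose actions are empirically independent share none.
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

A higher value means one agent's action tells more about the other's, which is why mutual information is a natural candidate for measuring consensus inside a subtask.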

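Main results:

DCC achieves consensus within subtasks and high-level joint actions across subtasks.
Agents within a subtask reach consensus on their local actions.
The algorithm maximizes overall task performance.
Experimental evaluations show superior performance over state-of-the-art baselines.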

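Conclusions:

DCC effectively models the nearly decomposable structure in multi-agent tasks.
The algorithm improves coordination and learning efficiency in cooperative MARL.
DCC shows significant improvements over existing methods.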