Updated: Sep 16, 2025


Robust Multi-Agent Reinforcement Learning by Mutual Information Regularization.

Simin Li, Ruixiao Xu, Jingqiao Xiu

IEEE transactions on neural networks and learning systems

July 9, 2025

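Summary

This summary is machine-generated.

This study introduces Mutual Information Regularization as Robust Regularization (MIR3) for robust multi-agent reinforcement learning (MARL). MIR3 makes agents more cautious, improving both robustness against adversarial behavior in complex systems and training efficiency.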


Scientific fields:

Artificial intelligence
Robotics
Control theory

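Background:

Cooperative multi-agent reinforcement learning (MARL) faces robustness challenges from unpredictable or adversarial agent actions.
Existing robust MARL methods struggle with computational complexity and insufficient robustness as the number of agents grows.
Human decision-making offers a robust behavioral model: it relies on general caution rather than exhaustive preparation for every threat.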

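Purpose of the study:

Develop a new robust MARL method, inspired by human decision-making, that addresses these computational and robustness limitations.
Introduce Mutual Information Regularization as Robust Regularization (MIR3) to implicitly optimize worst-case robustness.
Make MARL agents more cautious and align their policies with robust action priors.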

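Main methods:

Frame robust MARL as a control-as-inference problem.
Use off-policy evaluation to implicitly optimize worst-case robustness.
Introduce MIR3, a mutual information regularization technique, to maximize a lower bound on robustness during training.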
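
To make the idea of a mutual information penalty concrete, here is a minimal sketch. It is not the authors' implementation: the summary does not say which variables the mutual information is taken between or how it is estimated. Purely for illustration, the sketch assumes a penalty on the empirical mutual information between an agent's discrete observed history and its discrete action. The names `mutual_information`, `regularized_objective`, and the weight `lam` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Empirical mutual information I(X; Y) in nats from (x, y) samples."""
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one sample")
    joint = Counter(pairs)              # counts of each (x, y) pair
    px = Counter(x for x, _ in pairs)   # marginal counts of x
    py = Counter(y for _, y in pairs)   # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def regularized_objective(mean_reward, pairs, lam=0.1):
    """Hypothetical penalized objective: reward minus lam * I(history; action)."""
    return mean_reward - lam * mutual_information(pairs)


# Actions fully determined by the history: I equals the action entropy, log 2.
dependent = [("h0", "a0"), ("h1", "a1")] * 50
# Actions independent of the history: I equals 0.
independent = [(h, a) for h in ("h0", "h1") for a in ("a0", "a1")] * 25

print(mutual_information(dependent))
print(mutual_information(independent))
print(regularized_objective(1.0, dependent))
```

Under this assumed form, a policy whose actions depend strongly on what it has observed pays a larger penalty than one whose actions are nearly independent of it. That is one way to read the "cautious" behavior the summary describes, though the exact objective used by MIR3 should be taken from the paper itself.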

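Main results:

MIR3 clearly outperforms baseline methods in MARL in both robustness and training efficiency.
The method maintains cooperative performance in complex simulations such as StarCraft II and swarm control tasks.
In a real-world robot swarm control deployment, MIR3 improved reward by 14.29% over the best baseline.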

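Conclusions:

MIR3 offers an effective and efficient route to robust MARL by acting as an information bottleneck and promoting cautious agent behavior.
The proposed method provides a scalable solution for real-world MARL applications that require resilience under adverse conditions.
Compared with existing methods, MIR3 shows superior performance and robustness across a variety of cooperative MARL scenarios.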