Reinforcement
Incentive Theory: Pull Theory of Motivation
Gradient and Del Operator
Primary and Secondary Reinforcers
Reinforcement Schedules
Generalization, Discrimination, and Extinction
You may also read
Articles related to this one through co-authors, journals, and the citation graph.
Updated: Sep 17, 2025

Pavlovian Conditioned Approach Training in Rats
Published on: February 4, 2016
Jacopo Castellini¹, Sam Devlin², Frans A Oliehoek³
¹Department of Computer Science, University of Liverpool, Liverpool, UK.
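By combining difference rewards with policy gradients, Dr.Reinforce offers a new solution for multi-agent reinforcement learning. The approach effectively addresses the multi-agent credit assignment problem for decentralized policies, even when the reward function is unknown.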
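To make the idea of pairing difference rewards with a policy gradient concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a one-step cooperative game, tabular softmax policies, and a known reward function. The names `team_reward`, `difference_reward`, and `train` are hypothetical. The baseline used here averages the team reward over the agent's own action under its current policy while the other agents' actions stay fixed, which is one common form of difference reward. When the reward function is unknown, a learned estimate would have to stand in for `reward_fn`; that part is not shown.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()


def team_reward(joint_action):
    """Hypothetical shared reward: 1 only if every agent picks action 1."""
    return 1.0 if all(a == 1 for a in joint_action) else 0.0


def difference_reward(reward_fn, joint_action, agent, policy_probs):
    """Team reward minus a counterfactual baseline for one agent.

    The baseline is the expected team reward when `agent` redraws its
    action from its own policy and all other actions are held fixed.
    """
    baseline = 0.0
    for alt_action, prob in enumerate(policy_probs):
        counterfactual = list(joint_action)
        counterfactual[agent] = alt_action
        baseline += prob * reward_fn(tuple(counterfactual))
    return reward_fn(tuple(joint_action)) - baseline


def train(reward_fn, n_agents=2, n_actions=2, steps=2000, lr=0.1, seed=0):
    """REINFORCE-style updates, each agent weighted by its difference reward."""
    rng = np.random.default_rng(seed)
    logits = np.zeros((n_agents, n_actions))  # one decentralized policy per agent
    for _ in range(steps):
        probs = [softmax(logits[i]) for i in range(n_agents)]
        joint = [int(rng.choice(n_actions, p=probs[i])) for i in range(n_agents)]
        for i in range(n_agents):
            dr = difference_reward(reward_fn, joint, i, probs[i])
            # Gradient of log softmax w.r.t. the logits: one-hot(action) - probs.
            grad_log = -probs[i]
            grad_log[joint[i]] += 1.0
            logits[i] += lr * dr * grad_log
    return logits


if __name__ == "__main__":
    learned = train(team_reward)
    for i, row in enumerate(learned):
        print(f"agent {i} action probabilities: {softmax(row)}")
```

In this toy game an agent receives a nonzero learning signal only when its own choice changes the team outcome, which is the credit assignment effect the summary describes.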
Scientific field:
Background:
Research objective:
Main methods:
Main results:
Conclusion: