



A distributional code for value in dopamine-based reinforcement learning

Will Dabney1, Zeb Kurth-Nelson2,3, Naoshige Uchida4

  • 1DeepMind, London, UK. wdabney@google.com.

Nature | January 17, 2020
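Summary
This summary is machine-generated.

Dopamine-based reinforcement learning may represent reward as a probability distribution rather than as a single value. This study provides neural evidence for such a distributional reinforcement learning model in the brain.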


Scientific fields:

  • Neuroscience
  • Computational neuroscience
  • Artificial intelligence

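Background:

  • The canonical reward prediction error theory of dopamine explains how reward and value are represented in the brain.
  • This theory assumes that a reward prediction is represented as a single scalar: the mean of a stochastic outcome.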
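
The scalar assumption above can be illustrated with a minimal sketch. It is not taken from the study: the update rule V <- V + alpha * (r - V), the function name learn_scalar_value, the learning rate, and the two-outcome reward are all assumptions chosen for illustration. A single value estimate driven by reward prediction errors settles near the mean of the outcomes and carries no information about their spread.

```python
import random


def learn_scalar_value(rewards, alpha=0.005, v0=0.0):
    """Hypothetical scalar learner: one value estimate, one learning rate."""
    v = v0
    for r in rewards:
        delta = r - v        # reward prediction error
        v += alpha * delta   # same scaling for positive and negative errors
    return v


# Assumed toy reward: 1 or 9 with equal probability, so the mean is 5.
rng = random.Random(0)
rewards = [rng.choice([1.0, 9.0]) for _ in range(20000)]

v = learn_scalar_value(rewards)
print(f"learned scalar value: {v:.2f}")
```

The estimate hovers around 5 even though a reward of 5 never occurs in this toy example, which is the limitation the scalar account carries.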

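Aims of the study:

  • Propose and test a new account of dopamine-based reinforcement learning, inspired by distributional reinforcement learning in artificial intelligence.
  • Investigate whether the brain represents possible future rewards as a probability distribution rather than as a single mean.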

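Main methods:

  • Single-unit recordings were taken from the mouse ventral tegmental area.
  • Empirical predictions derived from the distributional reinforcement learning hypothesis were tested against these recordings.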

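Main results:

  • The findings provide strong evidence for a neural basis of distributional reinforcement learning.
  • They indicate that dopamine neurons may encode a distribution over future rewards.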

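Conclusions:

  • The brain's representation of reward may be more complex than previously thought, involving a distribution rather than a single value.
  • The study offers a new framework for understanding the role of dopamine in reinforcement learning, connecting neuroscience with advances in artificial intelligence.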
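
One way a population of value predictors could carry a distribution rather than a single mean is sketched here, as an illustration only and not the authors' analysis code. It assumes that each hypothetical unit scales positive and negative prediction errors differently; the tau parameter, the function name learn_value_population, the rates, and the two-outcome reward are all assumptions. Units that weight positive errors more settle on optimistic predictions, units that weight negative errors more settle on pessimistic ones, and together the learned values span the range of outcomes.

```python
import random


def learn_value_population(rewards, taus, base_rate=0.002):
    """Hypothetical population of learners with asymmetric error scaling.

    Each unit i has an asymmetry tau in (0, 1): positive prediction errors
    are scaled by tau, negative ones by (1 - tau).
    """
    values = [0.0 for _ in taus]
    for r in rewards:
        for i, tau in enumerate(taus):
            delta = r - values[i]  # unit-specific prediction error
            scale = tau if delta > 0 else 1.0 - tau
            values[i] += base_rate * scale * delta
    return values


# Assumed toy reward: 1 or 9 with equal probability (mean 5).
rng = random.Random(0)
rewards = [rng.choice([1.0, 9.0]) for _ in range(50000)]

taus = [0.1, 0.5, 0.9]  # pessimistic, balanced, optimistic units
values = learn_value_population(rewards, taus)

# For this particular reward, the expected update of a unit is zero at
# 1 + 8 * tau, so the estimates fluctuate around roughly 1.8, 5.0 and 8.2.
for tau, v in zip(taus, values):
    print(f"tau={tau:.1f} -> learned value {v:.2f}")
```

With tau = 0.5 the unit behaves like an ordinary mean-tracking learner; the spread between the pessimistic and optimistic units is what carries information about the shape of the reward distribution in this sketch.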