Related Concept Videos

Reinforcement Schedules

Positive reinforcement is a powerful method for teaching new behaviors to both animals and humans. B.F. Skinner demonstrated this with his experiments using rats in a Skinner box. When a rat pressed a lever, it received a food pellet. This immediate reward encouraged the rat to repeat the behavior. This method, where a reward follows every instance of the behavior, is known as continuous reinforcement. It is highly effective for establishing new behaviors quickly.
Once a behavior is learned,...

Operant Conditioning

Operant conditioning, a key concept in behavioral psychology, uses reinforcement and punishment to alter the likelihood of a behavior being repeated. B.F. Skinner introduced this type of conditioning. He focused on voluntary behaviors and the consequences that follow them, which determine whether these behaviors are strengthened or diminished.
Reinforcement in operant conditioning can be positive or negative, both of which serve to increase the likelihood of a behavior. Positive...

Reinforcement

Positive and negative reinforcement are key concepts in operant conditioning, a learning process where the consequences of a behavior affect the likelihood of that behavior being repeated.
Positive reinforcement occurs when a behavior is followed by the presentation of a rewarding stimulus, increasing the frequency of that behavior. For example:

Primary and Secondary Reinforcers

In psychology, reinforcement is a key concept in behavior modification. B.F. Skinner demonstrated this with his experiments involving rats in what is known as a Skinner box. The rats learned to press a lever to receive food, a primary reinforcer that fulfilled their innate need for nourishment.
Effective reinforcers for humans vary depending on the individual and the context. Primary reinforcers, such as food, water, sleep, shelter, and pleasure, have inherent value and satisfy basic biological...

Observational Learning

Albert Bandura's observational learning, also known as imitation or modeling, occurs when a person observes and imitates another's behavior. It is a quicker process than operant conditioning. A well-known example is the Bobo doll study, where children who saw an adult acting aggressively towards the doll were more likely to act aggressively when left alone, compared to those who observed a nonaggressive adult. Many psychologists view observational learning as a form of latent learning...

Decision Making: P-value Method

Hypothesis testing with the P-value method involves calculating the P-value from the sample data and interpreting it.
First, a specific claim about the population parameter is proposed. The claim is based on the research question and is stated in a simple form. Further, an opposing statement to the claim is also stated. These statements can act as null and alternative hypotheses: a null hypothesis would be a neutral statement while the alternative hypothesis can...

Related Experiment Video

Updated: Dec 30, 2025

Studying Food Reward and Motivation in Humans

Published on: March 19, 2014

A distributional code for value in dopamine-based reinforcement learning

Will Dabney¹, Zeb Kurth-Nelson²,³, Naoshige Uchida⁴

¹DeepMind, London, UK. wdabney@google.com.

January 17, 2020

Summary

This summary is machine-generated.

Dopamine-based reinforcement learning may represent rewards as probability distributions, not just as single values. This study provides neural evidence supporting this distributional reinforcement learning model in the brain.

More Related Videos

A Fully Automated and Highly Versatile System for Testing Multi-cognitive Functions and Recording Neuronal Activities in Rodents

Published on: May 3, 2012

Pavlovian Conditioned Approach Training in Rats

Published on: February 4, 2016

Area of Science:

Neuroscience
Computational neuroscience
Artificial intelligence

Background:

The canonical reward prediction error theory of dopamine explains how reward and value are represented in the brain.
This theory posits that reward predictions are represented as a single scalar quantity, which corresponds to the mean of stochastic outcomes.
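As a rough illustration of the scalar view, the sketch below updates one value estimate with a reward prediction error. It is a minimal sketch, not the study's model: the function name `learn_scalar_value`, the learning rate, and the 50/50 reward task are all assumptions made for this example. Under these assumptions the single estimate settles near the mean reward and carries no information about the spread of outcomes.

```python
import random


def learn_scalar_value(rewards, learning_rate=0.01, initial_value=0.0):
    """Track one scalar reward prediction with a prediction-error update.

    Hypothetical helper for illustration only; the name and parameters
    are assumptions, not taken from the study.
    """
    value = initial_value
    for reward in rewards:
        prediction_error = reward - value  # reward prediction error
        value += learning_rate * prediction_error
    return value


# Hypothetical task: the reward is 0 or 10 with equal probability.
rng = random.Random(0)
sampled_rewards = [rng.choice([0.0, 10.0]) for _ in range(20000)]
estimate = learn_scalar_value(sampled_rewards)
print(f"scalar estimate: {estimate:.2f}")  # hovers near the mean reward of 5
```

A task that always pays 5 and a task that pays 0 or 10 at random would drive this estimate to roughly the same value, which is the limitation the distributional account addresses.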

Study objective:

To propose and test a novel account of dopamine-based reinforcement learning inspired by distributional reinforcement learning in artificial intelligence.
To investigate whether the brain represents possible future rewards as a probability distribution rather than as a single mean value.
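One way to picture a distributional code is a population of value predictors that weight positive and negative prediction errors differently, so that each settles at a different point of the reward distribution. The sketch below is a simplified illustration under that assumption. The function name `learn_value_population`, the asymmetry values, the base learning rate, and the 50/50 reward task are hypothetical and are not taken from the study.

```python
import random


def learn_value_population(rewards, asymmetries, base_rate=0.02):
    """Update several value predictors with asymmetric learning rates.

    Each asymmetry lies in (0, 1) and sets the share of the learning rate
    applied to positive prediction errors; the remainder is applied to
    negative ones. Hypothetical helper for illustration only.
    """
    values = [0.0] * len(asymmetries)
    for reward in rewards:
        for i, asymmetry in enumerate(asymmetries):
            error = reward - values[i]
            # Optimistic units (high asymmetry) learn more from positive errors.
            weight = asymmetry if error > 0 else 1.0 - asymmetry
            values[i] += base_rate * weight * error
    return values


# Hypothetical task: the reward is 0 or 10 with equal probability.
rng = random.Random(0)
sampled_rewards = [rng.choice([0.0, 10.0]) for _ in range(20000)]
population = learn_value_population(sampled_rewards, [0.1, 0.5, 0.9])
print([round(v, 2) for v in population])  # pessimistic, balanced, optimistic
```

In this sketch the balanced unit behaves like a scalar mean estimate, while the pessimistic and optimistic units settle below and above it. Taken together, the spread of the learned values reflects the spread of possible rewards rather than the mean alone.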

Main methods:

Single-unit recordings from the ventral tegmental area in mice were used.
Empirical predictions derived from the distributional reinforcement learning hypothesis were tested.

Main results:

The findings provide strong evidence for a neural basis of distributional reinforcement learning.
Dopamine neurons were shown to be capable of encoding a distribution of possible future rewards.

Conclusions:

The brain's representation of reward may be more complex than previously thought, involving distributions rather than single values.
This study offers a new framework for understanding the role of dopamine in reinforcement learning, aligning neuroscience with advances in artificial intelligence.