Fast Value Tracking for Deep Reinforcement Learning

Frank Shih1, Faming Liang1

  • 1Department of Statistics, Purdue University, West Lafayette, IN 47907, USA.

... International Conference on Learning Representations | February 23, 2026
Summary
This summary is machine-generated.

This study presents Langevinized Kalman Temporal-Difference (LKTD), a novel reinforcement learning (RL) algorithm. LKTD quantifies uncertainty in deep reinforcement learning by combining Kalman filtering with stochastic gradient Markov chain Monte Carlo (SGMCMC) sampling.

Keywords:
reinforcement learning, uncertainty quantification, Kalman filtering, stochastic gradient Markov chain Monte Carlo sampling, deep learning

Area of Science:

  • Artificial Intelligence
  • Machine Learning
  • Control Theory

Background:

  • In reinforcement learning (RL), agents interact with environments to make sequential decisions.
  • Current RL algorithms often overlook environmental stochasticity and uncertainty quantification.
  • Static models focus on point estimates and neglect dynamic interactions.

Purpose of the study:

  • Introduce a novel, scalable sampling algorithm for deep reinforcement learning.
  • Address the limitations of existing RL methods with respect to uncertainty quantification.
  • Develop a method to quantify and monitor uncertainties during RL training.
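
As a rough illustration of what tracking a value together with its uncertainty can look like, the sketch below runs a textbook scalar Kalman filter over a stream of noisy observations of a single state value. It is not the LKTD algorithm: the random-walk state model, the function name `kalman_track`, and all parameter values are assumptions chosen for illustration.

```python
def kalman_track(observations, process_var=0.01, obs_var=1.0,
                 mean0=0.0, var0=100.0):
    """Track a scalar value with a textbook Kalman filter.

    Assumed (hypothetical) model: the value follows a random walk with
    variance process_var, and each observation adds noise with variance
    obs_var. Returns the list of (mean, variance) pairs after each update.
    """
    mean, var = mean0, var0
    history = []
    for y in observations:
        # Predict: the random-walk model only inflates the variance.
        var_pred = var + process_var
        # Update: blend prediction and observation by their precisions.
        gain = var_pred / (var_pred + obs_var)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * var_pred
        history.append((mean, var))
    return history


# Toy usage: fifty identical observations of a value equal to 3.0.
history = kalman_track([3.0] * 50)
final_mean, final_var = history[-1]
```

The variance returned at each step is the quantity one would monitor: it shrinks as evidence accumulates and settles at a floor set by `process_var`, so the estimate never becomes a bare point estimate.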

Main methods:

  • Build on the Kalman filtering paradigm.
  • Introduce the Langevinized Kalman Temporal-Difference (LKTD) algorithm.
  • Use stochastic gradient Markov chain Monte Carlo (SGMCMC) to draw posterior samples of the neural network parameters.
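
To make the idea of SGMCMC posterior sampling concrete, here is a minimal sketch of stochastic gradient Langevin dynamics (SGLD), one well-known member of the SGMCMC family, applied to a toy problem: the posterior over a single state's value given noisy returns. This is a stand-in under stated assumptions, not the LKTD update from the paper; the Gaussian prior and likelihood, the function names, and the step size are all hypothetical.

```python
import math
import random


def sgld_value_samples(returns, n_steps=5000, batch_size=20, step=1e-3,
                       prior_sd=10.0, noise_sd=1.0, seed=0):
    """Draw approximate posterior samples of a scalar value with SGLD.

    Assumed toy model: value ~ N(0, prior_sd^2) and each return is
    value + N(0, noise_sd^2) noise.
    """
    rng = random.Random(seed)
    n = len(returns)
    v = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(returns, batch_size)
        # Gradient of the log prior at the current value.
        grad_prior = -v / prior_sd ** 2
        # Minibatch gradient of the log likelihood, rescaled to full data.
        grad_lik = (n / batch_size) * sum(
            (g - v) / noise_sd ** 2 for g in batch)
        # Langevin step: half-step along the gradient plus Gaussian noise
        # whose variance equals the step size.
        v += 0.5 * step * (grad_prior + grad_lik)
        v += rng.gauss(0.0, math.sqrt(step))
        samples.append(v)
    return samples


def exact_posterior_mean(returns, prior_sd=10.0, noise_sd=1.0):
    """Closed-form posterior mean for the same conjugate Gaussian model."""
    precision = 1.0 / prior_sd ** 2 + len(returns) / noise_sd ** 2
    return (sum(returns) / noise_sd ** 2) / precision


# Toy usage: 200 synthetic returns around a true value of 2.0.
data_rng = random.Random(1)
returns = [2.0 + data_rng.gauss(0.0, 1.0) for _ in range(200)]
samples = sgld_value_samples(returns)[1000:]  # discard burn-in
sgld_mean = sum(samples) / len(samples)
```

Because the toy model is conjugate, `exact_posterior_mean` gives a reference against which the sampler's average can be checked. For a neural network no such closed form exists, which is why a sampling approach is attractive there.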

Main results:

  • LKTD posterior samples are shown to converge to a stationary distribution under mild conditions.
  • Uncertainties in the value functions and model parameters can be quantified.
  • Uncertainties can be monitored during policy updates in deep reinforcement learning.
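
One simple way to turn posterior samples of a value into an uncertainty statement is an empirical credible interval. The helper below is a generic sketch, not code from the paper; the name `credible_interval`, the order-statistic rule it uses, and the sample values are assumptions made for illustration.

```python
import math


def credible_interval(samples, level=0.95):
    """Return an empirical central credible interval from samples.

    Uses order statistics, rounding outward so the interval is slightly
    conservative.
    """
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(math.floor(tail * (n - 1)))]
    upper = ordered[int(math.ceil((1.0 - tail) * (n - 1)))]
    return lower, upper


# Hypothetical posterior samples of one state's value.
value_samples = [1.8, 2.1, 1.9, 2.0, 2.3, 1.7, 2.2, 2.0, 1.95, 2.05]
low, high = credible_interval(value_samples, level=0.8)
width = high - low  # a wider interval signals more uncertainty
```

Recomputing the interval width after each policy update is one way to monitor how uncertainty evolves over training.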

Conclusions:

  • The LKTD algorithm provides a robust approach to uncertainty quantification in RL.
  • LKTD supports more adaptable and reliable reinforcement learning systems.
  • The method improves the understanding and management of uncertainty in agent-environment interactions.