


Updated: Feb 24, 2026


Fast Value Tracking for Deep Reinforcement Learning

Frank Shih¹, Faming Liang¹

¹Department of Statistics, Purdue University, West Lafayette, IN 47907, USA.

... International Conference on Learning Representations

February 23, 2026

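Summary

This summary is machine-generated.

This study introduces Langevinized Kalman Temporal-Difference (LKTD), a new reinforcement learning (RL) algorithm. By combining the Kalman filtering paradigm with stochastic gradient Markov chain Monte Carlo methods, LKTD quantifies uncertainty in deep reinforcement learning.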


Fields of science:

Artificial intelligence
Machine learning
Control theory

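Background:

Reinforcement learning (RL) agents interact with an environment to make sequential decisions.
Current RL algorithms often overlook environmental stochasticity and uncertainty quantification.
Static models focus on point estimates and neglect the dynamics of agent-environment interaction.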

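Purpose of the study:

Introduce a new, scalable sampling algorithm for deep reinforcement learning.
Address the limitations of existing RL methods in uncertainty quantification.
Develop a method to quantify and monitor uncertainty during RL training.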

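Main methods:

Leverage the Kalman filtering paradigm.
Introduce the Langevinized Kalman Temporal-Difference (LKTD) algorithm.
Use stochastic gradient Markov chain Monte Carlo (SGMCMC) to draw posterior samples of the neural network parameters.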
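
To make the idea of SGMCMC posterior sampling more concrete, the following is a minimal sketch of stochastic gradient Langevin dynamics (SGLD), one well-known member of the SGMCMC family. It is not the LKTD algorithm described in the paper. It assumes a hypothetical toy setting in which a linear value model V(s) = features(s) @ theta is fitted to noisy targets with a Gaussian prior. The function names, the data, and all hyperparameters are illustrative assumptions.

```python
import numpy as np


def make_grad_log_post(features, targets, noise_var, prior_var, batch_size):
    """Build a minibatch estimator of the log-posterior gradient for a
    linear model targets ~ N(features @ theta, noise_var), theta ~ N(0, prior_var)."""
    n_total = features.shape[0]

    def grad_log_post(theta, rng):
        # Subsample a minibatch and rescale so the estimate is unbiased.
        idx = rng.choice(n_total, size=batch_size, replace=False)
        residual = targets[idx] - features[idx] @ theta
        grad_lik = (n_total / batch_size) * features[idx].T @ residual / noise_var
        grad_prior = -theta / prior_var
        return grad_lik + grad_prior

    return grad_log_post


def sgld_sample(grad_log_post, theta0, step_size, n_steps, rng):
    """SGLD update: theta <- theta + (step/2) * grad + N(0, step)."""
    theta = np.array(theta0, dtype=float)
    samples = np.empty((n_steps, theta.size))
    for t in range(n_steps):
        noise = rng.normal(0.0, np.sqrt(step_size), size=theta.shape)
        theta = theta + 0.5 * step_size * grad_log_post(theta, rng) + noise
        samples[t] = theta
    return samples


# Hypothetical toy data: two features, known "true" weights, noisy targets.
rng = np.random.default_rng(0)
theta_true = np.array([1.0, -0.5])
features = rng.normal(size=(200, 2))
targets = features @ theta_true + rng.normal(scale=0.5, size=200)

grad_fn = make_grad_log_post(features, targets, noise_var=0.25,
                             prior_var=10.0, batch_size=32)
samples = sgld_sample(grad_fn, np.zeros(2), step_size=1e-4,
                      n_steps=5000, rng=rng)
posterior = samples[1000:]  # discard an (assumed) burn-in period

print("posterior mean:", posterior.mean(axis=0))
print("posterior std: ", posterior.std(axis=0))
```

Each iterate is kept as a sample rather than only the final value, so the result is a cloud of plausible parameter vectors instead of a single point estimate. For a deep value network the same update would be applied to the network weights, with the gradient supplied by automatic differentiation.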

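Main results:

Prove that, under mild conditions, the posterior samples generated by LKTD converge to a stationary distribution.
Enable uncertainty quantification for both the value function and the model parameters.
Allow uncertainty to be monitored during policy updates in deep reinforcement learning.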
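
As an illustration of how posterior samples can be turned into an uncertainty statement about a value estimate, the sketch below computes a mean and an equal-tailed credible interval for a linear value V(s) = theta @ features(s). It is an assumption-laden toy, not the paper's procedure: the samples are stand-ins drawn from a Gaussian, and `value_credible_interval` is a hypothetical helper name.

```python
import numpy as np


def value_credible_interval(theta_samples, state_features, level=0.95):
    """Return (mean, lower, upper) of V(s) = theta @ state_features,
    where the spread comes from the posterior samples of theta."""
    values = theta_samples @ state_features  # one value estimate per sample
    alpha = 1.0 - level
    lower, upper = np.quantile(values, [alpha / 2, 1.0 - alpha / 2])
    return values.mean(), lower, upper


# Stand-in posterior samples (hypothetical); in practice these would come
# from a sampler run during training.
rng = np.random.default_rng(1)
theta_samples = rng.normal(loc=[1.0, -0.5], scale=0.05, size=(4000, 2))
state_features = np.array([1.0, 2.0])

mean, lower, upper = value_credible_interval(theta_samples, state_features)
print(f"V(s) estimate: {mean:.3f}, 95% interval: [{lower:.3f}, {upper:.3f}]")
```

Recomputing such an interval at intervals during training is one simple way to watch whether uncertainty about a state's value shrinks as more data arrive.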

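Conclusions:

The LKTD algorithm provides a robust approach to uncertainty quantification in RL.
LKTD supports more adaptive and reliable reinforcement learning systems.
The approach improves the understanding and management of uncertainty in agent-environment interactions.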