Transformer Decoder-Based Enhanced Exploration Method to Alleviate Initial Exploration Problems in Reinforcement Learning.

Dohyun Kyoung¹, Yunsick Sung²

¹Department of Autonomous Things Intelligence, Graduate School, Dongguk University-Seoul, Seoul 04620, Republic of Korea.

Sensors (Basel, Switzerland)

September 9, 2023

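Summary

This summary is machine-generated.

This study introduces a new reinforcement learning method that uses a pretrained Transformer decoder to significantly reduce initial exploration. The approach speeds up learning and improves performance, achieving higher rewards and win rates than conventional strategies.

Keywords:

exploration; machine learning; pretraining; reinforcement learning; Transformer decoder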

Scientific fields:

Artificial intelligence
Machine learning
Reinforcement learning

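Background:

The epsilon-greedy strategy is a common exploration technique in reinforcement learning.
This strategy tends to cause extensive initial exploration and long training times.
Existing ways of reducing exploration, such as using expert data, are limited in how far they can narrow the initial exploration range.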
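
The epsilon-greedy strategy mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the linear annealing schedule, and its default values (`start`, `end`, `decay_steps`) are assumptions chosen only to show why early training is dominated by random actions.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action with probability epsilon, else the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: uniform random action
    # exploit: index of the largest estimated action value
    return max(range(len(q_values)), key=lambda a: q_values[a])


def linear_epsilon(step, start=1.0, end=0.05, decay_steps=10_000):
    """Hypothetical schedule: anneal epsilon linearly from start to end."""
    frac = min(step / decay_steps, 1.0)
    return start + frac * (end - start)


if __name__ == "__main__":
    q = [0.1, 0.9, 0.3]
    for step in (0, 5_000, 10_000):
        eps = linear_epsilon(step)
        print(step, round(eps, 3), epsilon_greedy(q, eps))
```

With a schedule like this, epsilon is close to 1 at the start, so almost every early action is random. That is the extensive initial exploration the summary describes.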

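Study objectives:

Propose a new method for reducing the initial exploration range in reinforcement learning.
Improve learning efficiency and agent performance by guiding early actions.
Improve on existing exploration techniques in reinforcement learning.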

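Main methods:

Train a Transformer decoder on extensive expert data.
Use the pretrained model to guide the agent's actions during the initial learning phase.
Once a learning threshold is reached, switch to the epsilon-greedy strategy.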
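
The three steps above amount to a two-phase action-selection rule, sketched below under stated assumptions. The summary does not say how the switching threshold is defined, so a simple step count (`guided_steps`) is assumed here. `pretrained_policy` is a hypothetical callable standing in for the pretrained Transformer decoder, and `epsilon` is an illustrative default rather than a value from the paper.

```python
import random


def select_action(step, state, q_values, pretrained_policy,
                  guided_steps=5_000, epsilon=0.1, rng=random):
    """Two-phase action selection (illustrative sketch).

    Phase 1 (step < guided_steps): follow the pretrained model's suggestion.
    Phase 2: fall back to ordinary epsilon-greedy over the agent's Q-values.
    """
    if step < guided_steps:
        # Assumed interface: the pretrained model maps a state to an action index.
        return pretrained_policy(state)
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


if __name__ == "__main__":
    # Stub that always proposes action 2, in place of a real pretrained decoder.
    stub_policy = lambda state: 2
    q = [1.0, 0.0, 0.0]
    print(select_action(0, None, q, stub_policy))                    # guided phase
    print(select_action(5_000, None, q, stub_policy, epsilon=0.0))   # greedy phase
```

The point of the sketch is that actions in the early phase come from a model trained on expert data instead of uniform random sampling, so the agent's first experiences are already plausible ones.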

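Main results:

In the FreeStyle1 basketball game, the proposed method achieved about 2.5 times the average reward.
Compared with a conventional deep Q-network (DQN) using the epsilon-greedy strategy, it achieved a 26% higher win rate.
It effectively reduced the initial exploration range and shortened learning time.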

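Conclusions:

The pretrained Transformer decoder approach significantly improves reinforcement learning performance.
It is a marked improvement over conventional exploration techniques.
The method balances exploration and exploitation effectively, enabling faster and more efficient learning.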