Model-Based Offline Reinforcement Learning with Adversarial Data Augmentation

Hongye Cao, Fan Feng, Jing Huo

IEEE Transactions on Neural Networks and Learning Systems

December 2, 2025

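Summary

This summary is machine-generated.

MORAL, a model-based offline reinforcement learning (RL) method, uses adversarial data augmentation to improve policy optimization. By dynamically selecting models, it enriches the training data and improves performance across a range of tasks.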

Scientific fields:

Artificial intelligence
Machine learning
Robotics

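Background:

Model-based offline reinforcement learning (RL) aims to optimize a policy using a pre-collected dataset.
Current methods struggle with static data and with extrapolation errors from fixed models.
Offline agents cannot interact with the environment to collect new data.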
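
To make the extrapolation problem concrete, here is a minimal sketch, not taken from the paper. It assumes a toy 1-D system, a hypothetical "true" dynamics function, and an ensemble of simple linear models fitted on bootstrap resamples of one static dataset. The names `fit_line`, `train_ensemble`, and `disagreement` are illustrative only. The point is that ensemble members tend to agree inside the region the data covers and diverge outside it.

```python
import random
import statistics

def fit_line(xs, ys):
    """Closed-form least-squares fit of y = w * x + b."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return w, my - w * mx

def train_ensemble(data, n_models, rng):
    """Fit each member on a bootstrap resample of the same static dataset."""
    members = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]
        xs, ys = zip(*sample)
        members.append(fit_line(xs, ys))
    return members

def disagreement(members, x):
    """Standard deviation of the members' predictions at state x."""
    return statistics.pstdev([w * x + b for w, b in members])

rng = random.Random(0)
# Static dataset: states only cover [0, 1]. The dynamics x -> x**2 are a
# hypothetical stand-in for an unknown environment.
states = [rng.random() for _ in range(200)]
data = [(x, x ** 2 + rng.gauss(0, 0.05)) for x in states]
members = train_ensemble(data, n_models=5, rng=rng)

in_dist = disagreement(members, 0.5)   # inside the data coverage
out_dist = disagreement(members, 5.0)  # far outside the data coverage
print(f"disagreement at 0.5: {in_dist:.4f}, at 5.0: {out_dist:.4f}")
```

Because the agent cannot query the environment at the out-of-coverage state, it has no way to check which member, if any, is right there.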

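Purpose of the study:

Introduce a new method, Model-based Offline Reinforcement learning with AdversariaL data augmentation (MORAL), to address the limitations of offline RL.
Strengthen policy optimization by enriching the training data through adversarial augmentation.
Improve the stability and applicability of model-based offline RL.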

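Main methods:

MORAL uses adversarial data augmentation: it replaces fixed-horizon rollouts with alternating sampling from ensemble models.
A dynamic adversarial process selects ensemble models against the policy, which mitigates optimistic bias.
A differential factor (DF) is integrated for regularization and to minimize error during extrapolation.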
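
The idea of selecting ensemble models against the policy can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the 1-D linear models, the toy `policy` and `value` functions, and the rule "pick the member whose predicted next state has the lowest value" are all hypothetical. The differential factor is not modeled, since the summary does not define it.

```python
import random

def make_ensemble(n_models, seed=0):
    """Hypothetical ensemble: each member predicts s' = s + a + bias."""
    rng = random.Random(seed)
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_models)]
    # Bind each bias as a default argument so every lambda keeps its own value.
    return [lambda s, a, b=b: s + a + b for b in biases]

def policy(s):
    """Toy policy (assumption): move halfway toward the goal state 0."""
    return -0.5 * s

def value(s):
    """Toy value estimate (assumption): states closer to 0 are better."""
    return -abs(s)

def adversarial_step(ensemble, s, a):
    """The adversary returns the prediction that the value estimate rates lowest."""
    return min((model(s, a) for model in ensemble), key=value)

def augment(ensemble, start_states, n_steps):
    """Alternate policy actions and adversarial model choices to build synthetic transitions."""
    transitions = []
    for s in start_states:
        for _ in range(n_steps):
            a = policy(s)
            s_next = adversarial_step(ensemble, s, a)
            transitions.append((s, a, s_next))
            s = s_next
    return transitions

ensemble = make_ensemble(n_models=5)
synthetic = augment(ensemble, start_states=[1.0, -2.0], n_steps=3)
for s, a, s_next in synthetic:
    print(f"s={s:+.3f}  a={a:+.3f}  s'={s_next:+.3f}")
```

Choosing the least favorable member at every step means the policy is never trained on the most optimistic prediction of a single fixed model, which is the bias the method sets out to reduce.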

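Main results:

In practice, MORAL effectively enriches the training data, enabling robust policy optimization without manual tuning of the rollout horizon.
The method adapts to a variety of offline tasks.
Experiments on the D4RL benchmark show that MORAL outperforms existing model-based offline RL methods.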

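Conclusions:

MORAL is a significant advance in model-based offline reinforcement learning.
The adversarial data augmentation strategy improves policy learning and sample efficiency.
MORAL offers a robust and broadly applicable solution to the challenges of offline RL.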