Inter-module credit assignment in modular reinforcement learning.

Kazuyuki Samejima¹, Kenji Doya, Mitsuo Kawato

¹Human Information Science Laboratories, ATR International, 2-2-2 Hikaridai, Seika, Soraku, Kyoto 619-0288, Japan. samejima@atr.co.jp

Neural Networks: the Official Journal of the International Neural Network Society

December 25, 2003

Summary

This summary is machine-generated.

We introduce a novel modular reward method to improve hierarchical reinforcement learning (RL). This approach enhances sub-task independence and overall policy optimality in complex tasks.

Area of Science:

Artificial Intelligence
Machine Learning
Robotics

Background:

Modular or hierarchical reinforcement learning (RL) faces challenges in task decomposition, sub-task independence, and composite policy optimality.
Optimality and sub-task independence in hierarchical RL are often in conflict.

Purpose of the Study:

To propose a novel method for propagating task rewards between modules in hierarchical RL.
To address the trade-off between sub-task independence and overall policy optimality.

Main Methods:

Introduced a 'modular reward' calculated from the temporal difference of module gating signals and succeeding module values.
Implemented the modular reward within a multiple model-based reinforcement learning (MMRL) architecture.
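
The summary does not give the formula, so the following is only a minimal sketch of one plausible reading of the modular reward: a module receives the task reward plus a hand-off term, computed as the change in the other modules' gating signals weighted by those modules' values at the next state. The function names (`responsibility`, `modular_reward`), the softmax form of the gate, and the way the hand-off term is added to the task reward are all assumptions made for illustration, not the authors' published equations.

```python
import numpy as np


def responsibility(pred_errors, sigma=1.0):
    """Gating signal per module as a softmax over negative squared prediction errors.

    Assumption: an MMRL-style gate; the summary does not specify its form.
    """
    scores = -np.asarray(pred_errors, dtype=float) ** 2 / (2.0 * sigma ** 2)
    scores -= scores.max()  # shift for numerical stability; softmax is unchanged
    weights = np.exp(scores)
    return weights / weights.sum()


def modular_reward(task_reward, gate_now, gate_next, values_next, i):
    """Hypothetical modular reward for module i.

    gate_now, gate_next: gating signals of all modules at times t and t+1.
    values_next: each module's value estimate at the state reached at t+1.
    """
    gate_now = np.asarray(gate_now, dtype=float)
    gate_next = np.asarray(gate_next, dtype=float)
    values_next = np.asarray(values_next, dtype=float)

    d_gate = gate_next - gate_now          # temporal difference of the gating signals
    others = np.arange(d_gate.size) != i   # only the succeeding (other) modules count
    handoff = np.sum(d_gate[others] * values_next[others])
    return task_reward + handoff


# Usage: module 0 hands control over to module 1, which values the new state highly.
gate_t = np.array([1.0, 0.0])
gate_t1 = np.array([0.2, 0.8])
v_t1 = np.array([0.5, 2.0])
r0 = modular_reward(0.0, gate_t, gate_t1, v_t1, i=0)
```

In this sketch, module 0 is rewarded for steering the system into a state where module 1 takes over and expects high value, even though the task itself has paid no reward yet. That is the sense in which a reward can propagate between modules.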

Main Results:

Demonstrated the effectiveness of the modular reward in simulations.
Successfully applied the method to a pursuit task with hidden states.
Validated the approach in a continuous-time non-linear control task.

Conclusions:

The proposed modular reward effectively propagates task achievement rewards between modules.
This method helps overcome the inherent trade-offs in hierarchical reinforcement learning.
The approach shows promise for complex control and decision-making tasks in AI.