

Inter-module credit assignment in modular reinforcement learning.

Kazuyuki Samejima¹, Kenji Doya, Mitsuo Kawato

  • ¹ Human Information Science Laboratories, ATR International, 2-2-2 Hikaridai, Seika, Soraku, Kyoto 619-0288, Japan. samejima@atr.co.jp

Neural Networks: the Official Journal of the International Neural Network Society
December 25, 2003
PubMed
Summary
This summary is machine-generated.


Related Articles

Articles linked to this work by shared authors, journal, and citation graph.

Same author

Consensus Paper: Models of Cerebellar Functions.

Cerebellum (London, England)·2026
Same author

Extraction of robust functional connectivity patterns across psychiatric disorders using principal component analysis-based feature selection.

Imaging neuroscience (Cambridge, Mass.)·2026
Same author

A computational model of canonical cortical microcircuits for dynamic Bayesian inference and control as inference.

Neuroscience research·2025
Same author

Generalizable stratification based on thalamo-somatomotor functional connectivity predicts responses to antidepressants in patients with depression.

Molecular psychiatry·2025
Same author

Computational mechanisms of neuroimaging biomarkers uncovered by multicenter resting-state fMRI connectivity variation profile.

Molecular psychiatry·2025
Same author

Enhancement of the left frontoparietal network through real-time functional magnetic resonance imaging functional connectivity-informed neurofeedback and its impact on working memory in schizophrenia: A pilot study.

Psychiatry and clinical neurosciences·2025
Same journal

Dynamic analysis and reliable mechanical optimization application of ring HNN effected with a memristive neuron.

Neural networks : the official journal of the International Neural Network Society·2026
Same journal

DAFF-Net: A detection and search method for small-scale low surface brightness galaxies.

Neural networks : the official journal of the International Neural Network Society·2026
Same journal

Quasi-synchronization for complex networks with hybrid pinning intermittent control.

Neural networks : the official journal of the International Neural Network Society·2026
Same journal

Physics-encoded convolutional neural operators for parametric PDEs: A convergence-guaranteed framework via pre-computed kernel fields.

Neural networks : the official journal of the International Neural Network Society·2026
Same journal

Exploiting audio-visual modalities in videos: Object detection via multi-stage bilateral coupling network.

Neural networks : the official journal of the International Neural Network Society·2026
Same journal

Reliability-aware modality completion with cross-modal distillation for federated learning with missing modalities.

Neural networks : the official journal of the International Neural Network Society·2026

The authors introduce a "modular reward" that propagates task rewards between modules in hierarchical reinforcement learning (RL). The method targets the trade-off between sub-task independence and the optimality of the composite policy in complex tasks.

Area of Science:

  • Artificial Intelligence
  • Machine Learning
  • Robotics

Background:

  • Modular or hierarchical reinforcement learning (RL) faces challenges in task decomposition, sub-task independence, and composite policy optimality.
  • Optimality and sub-task independence in hierarchical RL are often in conflict.

Purpose of the Study:

  • To propose a novel method for propagating task rewards between modules in hierarchical RL.
  • To address the trade-off between sub-task independence and overall policy optimality.

Main Methods:

  • Introduced a 'modular reward' calculated from the temporal difference of module gating signals and succeeding module values.
  • Implemented the modular reward within a multiple model-based reinforcement learning (MMRL) architecture.
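To make the idea of a reward built from gating-signal differences and successor values more concrete, here is a minimal Python sketch. It is not the paper's equation, which this summary does not give. It assumes one plausible form: a module is credited when its gating signal drops between two time steps (it hands over control), in proportion to the value estimate of the module that takes over. The function name `modular_rewards`, its parameters, and the rule for picking the successor (largest gating signal at the next step) are all hypothetical.

```python
from typing import List, Sequence


def modular_rewards(
    gate_now: Sequence[float],
    gate_next: Sequence[float],
    values_next: Sequence[float],
) -> List[float]:
    """Hypothetical modular reward for each module (illustrative only).

    gate_now[i], gate_next[i]: gating signal of module i at steps t and t+1.
    values_next[j]: value estimate of module j at the state reached at t+1.
    """
    n = len(gate_now)
    # Assumption: the "succeeding module" is the one with the largest
    # gating signal at the next step.
    successor = max(range(n), key=lambda j: gate_next[j])
    rewards = []
    for i in range(n):
        # Temporal difference of the gating signal; only a decrease
        # (module i releasing control) earns credit in this sketch.
        handoff = max(0.0, gate_now[i] - gate_next[i])
        if i == successor:
            handoff = 0.0  # a module is not rewarded for handing off to itself
        rewards.append(handoff * values_next[successor])
    return rewards


# Example: module 0 hands control to module 1, whose value estimate is high.
example = modular_rewards([0.9, 0.1], [0.2, 0.8], [0.0, 5.0])
```

In this sketch each module's reward depends only on its own gating change and on a single scalar from its successor, which is one way to keep sub-tasks loosely coupled while still passing information about downstream task achievement back to the earlier module.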


Main Results:

  • Demonstrated the effectiveness of the modular reward in simulations.
  • Successfully applied the method to a pursuit task with hidden states.
  • Validated the approach in a continuous-time non-linear control task.

Conclusions:

  • The proposed modular reward effectively propagates task achievement rewards between modules.
  • This method helps overcome the inherent trade-offs in hierarchical reinforcement learning.
  • The approach shows promise for complex control and decision-making tasks in AI.