Takayuki Akiyama¹, Hirotaka Hachiya, Masashi Sugiyama
¹Department of Computer Science, Tokyo Institute of Technology, 2-12-1 O-okayama, Meguro-ku, Tokyo 152-8552, Japan. akiyama@sg.cs.titech.ac.jp
Designing effective sampling policies is crucial for reinforcement learning control. This study introduces active policy iteration (API), a method for efficient exploration that aims to improve reinforcement learning outcomes, especially when reward sampling is costly.
Area of Science:
Background:
Purpose of the Study:
Main Methods:
Main Results:
Conclusions: