Reinforcement Learning-Based Intelligent Path Planning for Optimal Navigation in Dynamic Environments.

Anil Kumar Yadav (1), Purushottam Sharma (2), Xiaochun Cheng (3)

  • (1) VIT Bhopal University, Bhopal-Indore Highway, Bhopal, India.

Neural Processing Letters | January 23, 2026
Summary
This summary is machine-generated.

Optimizing reward functions in reinforcement learning (RL) significantly improves autonomous mobile robot navigation. This enhanced RL approach reduces path distance and learning time in dynamic environments.

Keywords:
Navigation, Path optimization, Policy iteration, Q-learning (QL), Reinforcement learning (RL), Reward function, Trajectory planning

Area of Science:

  • Robotics
  • Artificial Intelligence
  • Machine Learning

Background:

  • Path selection and planning are critical for autonomous mobile robots (AMRs) to navigate efficiently and avoid obstacles.
  • Traditional methods often use analytical search for shortest paths, but reinforcement learning (RL) offers enhanced performance through action sequence optimization.
  • Q-learning, a common RL algorithm, struggles with environment generalization in dynamic systems due to its reliance on cumulative rewards.
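For readers unfamiliar with Q-learning, the standard tabular update can be sketched as follows. This is the textbook rule, not code from the study; the action set, the grid-cell state tuples, and the default values of `alpha` and `gamma` are assumptions chosen for illustration.

```python
from collections import defaultdict

ACTIONS = ("up", "down", "left", "right")  # hypothetical action set for a grid robot


def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new Q-value."""
    # Value of the best action available from the next state (0.0 if unseen).
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    # Temporal-difference target: immediate reward plus discounted future value.
    target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (target - q_table[(state, action)])
    return q_table[(state, action)]


# Usage: one update after moving right from cell (0, 0) to cell (0, 1).
q_table = defaultdict(float)
new_value = q_learning_update(q_table, (0, 0), "right", 1.0, (0, 1), alpha=0.5)
print(new_value)
```

Because every update folds the discounted value of the next state into the current estimate, the learned table is tied to the reward signal of the environment it was trained in, which is one way to read the generalization concern raised above.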

Purpose of the Study:

  • To optimize reward functions for efficient navigation and obstacle avoidance in RL-based path planning for AMRs.
  • To enhance the generalization capabilities of RL algorithms in dynamic environments.
  • To evaluate the impact of optimized reward mechanisms on path planning efficiency and learning performance.

Main Methods:

  • The study proposes an optimized reward function for RL-based path planning that accounts for total steps, counted steps, and discount rates in dynamic environments.
  • State reward values were implemented and analyzed across different environments using the optimized reward mechanism.
  • The effect on Q-learning and Deep Q-learning was evaluated by comparing performance based on state-action pairs.
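The summary does not give the reward formula itself, so the following is only a hypothetical sketch of how a reward might depend on total steps, counted steps, and a discount rate. The function name, the constants, and the specific combination of terms are all assumptions and should not be read as the authors' method.

```python
def step_aware_reward(reached_goal, hit_obstacle, counted_steps, total_steps,
                      gamma=0.9, goal_reward=100.0,
                      obstacle_penalty=-100.0, step_penalty=-1.0):
    """Hypothetical reward that favors reaching the goal in fewer steps.

    counted_steps: steps taken so far in the episode.
    total_steps:   step budget for the episode (must be positive).
    gamma:         discount rate applied once per counted step.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if hit_obstacle:
        # Collisions are penalized regardless of progress.
        return obstacle_penalty
    if reached_goal:
        # Fraction of the step budget still unused, clamped at zero.
        remaining_fraction = max(1.0 - counted_steps / total_steps, 0.0)
        # Assumed form: discount the goal reward by steps used and scale it
        # by the unused share of the budget.
        return goal_reward * (gamma ** counted_steps) * remaining_fraction
    # Small cost for every ordinary move, to discourage wandering.
    return step_penalty


# Usage: a goal reached in 4 steps earns more than one reached in 8.
print(step_aware_reward(True, False, 4, 10))
print(step_aware_reward(True, False, 8, 10))
```

The design choice illustrated here is that shorter successful paths earn a larger terminal reward, which gives the learner a direct incentive to reduce trajectory length.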

Main Results:

  • The optimized reward function significantly decreased the number of iterations and episodes required for learning.
  • The approach achieved a 30% to 70% reduction in overall trajectory distance compared with traditional methods.
  • It also improved path optimization, learning rate, episode completion, and decision accuracy.
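As a reading aid, a percentage reduction in trajectory distance is conventionally computed relative to the baseline path length. The helper below is a minimal sketch of that arithmetic; the function name and the example distances are illustrative assumptions, not figures from the study.

```python
def percent_reduction(baseline_distance, optimized_distance):
    """Return the reduction of optimized_distance relative to baseline, in percent."""
    if baseline_distance <= 0:
        raise ValueError("baseline_distance must be positive")
    return 100.0 * (baseline_distance - optimized_distance) / baseline_distance


# Usage with made-up path lengths (in grid cells): a 40-cell baseline path
# shortened to 28 cells, and the same baseline shortened to 12 cells.
print(percent_reduction(40, 28))
print(percent_reduction(40, 12))
```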

Conclusions:

  • Optimized reward functions enhance the effectiveness of RL for AMR path planning in dynamic environments.
  • The proposed method shows significant improvements in navigation efficiency and obstacle avoidance.
  • Combining multiple agents and advanced techniques like federated and transfer learning can further improve convergence and performance on larger maps.