Reinforcement Learning-Based Intelligent Path Planning for Optimal Navigation in Dynamic Environments.

Anil Kumar Yadav¹, Purushottam Sharma², Xiaochun Cheng³

¹VIT Bhopal University, Bhopal-Indore Highway, Bhopal, India.

Neural Processing Letters

January 23, 2026

Summary

This summary is machine-generated.

Optimizing reward functions in reinforcement learning (RL) significantly improves autonomous mobile robot navigation. This enhanced RL approach reduces path distance and learning time in dynamic environments.

Keywords:

Navigation; Path optimization; Policy iteration; Q-learning (QL); Reinforcement learning (RL); Reward function; Trajectory planning

Area of Science:

Robotics
Artificial Intelligence
Machine Learning

Background:

Path selection and planning are critical for autonomous mobile robots (AMRs) to navigate efficiently and avoid obstacles.
Traditional methods often use analytical search for shortest paths, but reinforcement learning (RL) offers enhanced performance through action sequence optimization.
Q-learning, a common RL algorithm, struggles with environment generalization in dynamic systems due to its reliance on cumulative rewards.
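
For readers unfamiliar with Q-learning, the following is a minimal sketch of the standard tabular algorithm on a small grid with obstacles. It is a generic textbook version, not the authors' implementation. The grid layout, the reward values (+10 at the goal, -1 per step), and the hyperparameters are all assumptions chosen for illustration.

```python
import random

# Hypothetical 4x4 map: S = start, G = goal, # = obstacle (illustrative only).
GRID = ["S..#",
        ".#..",
        "...#",
        "#..G"]
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Move one cell; bumping into a wall or obstacle leaves the robot in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < 4 and 0 <= nc < 4) or GRID[nr][nc] == "#":
        nr, nc = r, c
    done = GRID[nr][nc] == "G"
    reward = 10.0 if done else -1.0  # assumed reward values
    return (nr, nc), reward, done


def q_update(q, s, a, reward, s_next, alpha=0.5, gamma=0.9):
    """One Q-learning update: Q(s,a) += alpha * (r + gamma * max_b Q(s',b) - Q(s,a))."""
    best_next = max(q.get((s_next, b), 0.0) for b in ACTIONS)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (reward + gamma * best_next - old)


def train(episodes=2000, epsilon=0.2, seed=0):
    """Learn a Q-table with epsilon-greedy exploration from a fixed start cell."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        s = (0, 0)
        for _ in range(100):  # cap on steps per episode
            if rng.random() < epsilon:
                a = rng.choice(list(ACTIONS))
            else:
                a = max(ACTIONS, key=lambda b: q.get((s, b), 0.0))
            s_next, reward, done = step(s, a)
            q_update(q, s, a, reward, s_next)
            s = s_next
            if done:
                break
    return q


def greedy_path(q, start=(0, 0), max_steps=50):
    """Follow the highest-valued action from each cell until the goal or the step cap."""
    path, s = [start], start
    for _ in range(max_steps):
        a = max(ACTIONS, key=lambda b: q.get((s, b), 0.0))
        s, _, done = step(s, a)
        path.append(s)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the sequence of cells the learned policy visits. Because the table is indexed by grid cell, it has to be relearned whenever the map changes, which is one way to see the generalization problem noted above.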

Purpose of the Study:

To optimize reward functions for efficient navigation and obstacle avoidance in RL-based path planning for AMRs.
To enhance the generalization capabilities of RL algorithms in dynamic environments.
To evaluate the impact of optimized reward mechanisms on path planning efficiency and learning performance.

Main Methods:

The study proposes an optimized reward function for RL-based path planning, considering total steps, counted steps, and discount rates in dynamic environments.
State reward values were implemented and analyzed across different environments using the optimized reward mechanism.
The effect on Q-learning and deep Q-learning was evaluated by comparing the algorithms' performance on state-action pairs.
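
The summary does not state the reward formula itself. Purely as an illustration of how total steps, counted steps, and a discount rate could enter a reward, the sketch below scales a base reward by a discount factor and by the fraction of the step budget still unused. The function name, its parameters, and the functional form are all hypothetical and should not be read as the paper's method.

```python
def shaped_reward(base_reward, counted_steps, total_steps, discount=0.9):
    """Hypothetical step-aware reward (assumed form, not taken from the paper).

    base_reward   -- reward for reaching the state with no steps spent
    counted_steps -- steps taken so far in the episode
    total_steps   -- step budget for the episode
    discount      -- discount rate applied once per counted step
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not 0 <= counted_steps <= total_steps:
        raise ValueError("counted_steps must lie in [0, total_steps]")
    remaining_fraction = 1.0 - counted_steps / total_steps
    return base_reward * (discount ** counted_steps) * remaining_fraction
```

Under this assumed form, a state reached early keeps most of its reward and a state reached only at the end of the step budget earns nothing, so shorter trajectories are favored.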

Main Results:

The optimized reward function significantly decreased the number of iterations and episodes required for learning.
The approach reduced overall trajectory distance by 30% to 70% compared with traditional methods.
It also improved path optimization, learning rate, episode completion, and decision accuracy.
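
A percentage reduction in trajectory distance of this kind can be computed as sketched below. This is a minimal illustration, assuming trajectories are lists of 2D waypoints; the helper names are hypothetical, and the summary does not say how the authors measured distance.

```python
import math


def path_length(waypoints):
    """Sum of straight-line distances between consecutive waypoints."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))


def percent_reduction(baseline, optimized):
    """Reduction of `optimized` relative to `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - optimized) / baseline
```

For example, a made-up baseline route of length 16 replaced by a route of length 8 gives `percent_reduction(16.0, 8.0)`, which is a 50% reduction.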

Conclusions:

Optimized reward functions enhance the effectiveness of RL for AMR path planning in dynamic environments.
The proposed method shows significant improvements in navigation efficiency and obstacle avoidance.
Combining multiple agents and advanced techniques like federated and transfer learning can further improve convergence and performance on larger maps.