Minibatch Recursive Least Squares Q-Learning.

Chunyuan Zhang¹, Qi Song¹, Zeng Meng¹

¹School of Computer Science and Technology, Hainan University, Haikou, Hainan 570228, China.

Computational Intelligence and Neuroscience

October 18, 2021

Summary

This summary is machine-generated.

We introduce minibatch recursive least squares Q-learning (MRLS-Q), a novel algorithm that combines linear function approximation with the advantages of the deep Q-network (DQN). MRLS-Q offers improved convergence and stability on reinforcement learning tasks.

Area of Science:

Artificial Intelligence
Machine Learning
Reinforcement Learning

Background:

The deep Q-network (DQN) is a successful reinforcement learning algorithm but suffers from slow convergence and instability.
Traditional linear function approximation methods converge faster and more stably but struggle with high-dimensional problems.
Existing improvements to DQN rarely draw on the strengths of these traditional methods.

Purpose of the Study:

To propose a novel Q-learning algorithm, minibatch recursive least squares Q-learning (MRLS-Q), that integrates the benefits of linear function approximation with DQN.
To enhance convergence speed and stability in reinforcement learning.
To create a versatile algorithm applicable to both low- and high-dimensional problems.

Main Methods:

Developed MRLS-Q, a Q-learning algorithm with linear function approximation.
Modeled MRLS-Q's learning mechanism and structure on those of DQN: it takes states as inputs and uses experience replay and minibatch training.
Implemented an average recursive least squares (RLS) optimization technique for improved convergence.
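
The ingredients listed above (a linear Q-function, minibatches drawn from a replay buffer, and an RLS-style optimizer) can be illustrated with a minimal sketch. This is not the authors' MRLS-Q: the summary does not specify the averaged RLS variant, so a standard block RLS update with a forgetting factor is used in its place. The function names, the block feature map, and the default values of `gamma` and `lam` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def rls_minibatch_update(w, P, X, y, lam=1.0):
    """One standard block RLS step on a minibatch.

    w: (d,) weights, P: (d, d) symmetric inverse-correlation estimate,
    X: (m, d) minibatch features, y: (m,) regression targets,
    lam: forgetting factor in (0, 1]. Assumed generic RLS, not the paper's variant.
    """
    PXt = P @ X.T                                # (d, m)
    S = lam * np.eye(X.shape[0]) + X @ PXt       # (m, m), symmetric
    K = np.linalg.solve(S, PXt.T).T              # gain K = P X^T S^{-1}, shape (d, m)
    w = w + K @ (y - X @ w)                      # correct weights by the minibatch error
    P = (P - K @ PXt.T) / lam                    # X P = (P X^T)^T because P is symmetric
    return w, P

def block_features(state, action, n_actions):
    """Hypothetical feature map: place the state vector in the chosen action's slot."""
    d = state.shape[0]
    phi = np.zeros(n_actions * d)
    phi[action * d:(action + 1) * d] = state
    return phi

def q_values(w, state, n_actions):
    """Linear Q-values for every action in the given state."""
    return np.array([w @ block_features(state, a, n_actions)
                     for a in range(n_actions)])

def q_learning_step(w, P, batch, n_actions, gamma=0.99, lam=1.0):
    """Fit Q-learning targets on one minibatch sampled from a replay buffer.

    batch: list of (state, action, reward, next_state, done) tuples.
    """
    X = np.stack([block_features(s, a, n_actions) for s, a, _, _, _ in batch])
    y = np.array([r + (0.0 if done else gamma * q_values(w, s2, n_actions).max())
                  for _, _, r, s2, done in batch])
    return rls_minibatch_update(w, P, X, y, lam)
```

In this sketch `P` would typically be initialized to a large multiple of the identity and `w` to zeros. With `lam=1.0`, repeated calls to `rls_minibatch_update` reproduce the ridge-regularized least-squares solution over all data seen so far, which is why RLS-type updates tend to converge in fewer samples than plain gradient steps.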

Main Results:

MRLS-Q proved effective on the CartPole problem and four Atari games.
The algorithm showed improved convergence performance, both standalone and when integrated with DQN.
Experimental analysis investigated the impact of MRLS-Q's hyperparameters.

Conclusions:

MRLS-Q successfully combines linear function approximation with DQN principles.
The proposed algorithm offers enhanced convergence and stability for reinforcement learning.
MRLS-Q applies to both low- and high-dimensional problems and can be integrated into existing DQN architectures.