Minibatch Recursive Least Squares Q-Learning.

Chunyuan Zhang, Qi Song, Zeng Meng

School of Computer Science and Technology, Hainan University, Haikou, Hainan 570228, China.

Computational Intelligence and Neuroscience | October 18, 2021
Summary
This summary is machine-generated.

We introduce minibatch recursive least squares Q-learning (MRLS-Q), a Q-learning algorithm that uses linear function approximation while adopting the training setup of the deep Q-network (DQN): experience replay and minibatch updates. MRLS-Q is reported to improve convergence and stability on reinforcement learning tasks.

Area of Science:

  • Artificial Intelligence
  • Machine Learning
  • Reinforcement Learning

Background:

  • Deep Q-network (DQN) is a successful reinforcement learning algorithm but suffers from slow convergence and instability.
  • Traditional linear function approximation methods offer faster convergence and stability but struggle with high-dimensional problems.
  • Existing DQN improvements rarely leverage traditional methods' strengths.

Purpose of the Study:

  • To propose a novel Q-learning algorithm, minibatch recursive least squares Q-learning (MRLS-Q), that integrates the benefits of linear function approximation with DQN.
  • To enhance convergence speed and stability in reinforcement learning.
  • To create a versatile algorithm applicable to both low- and high-dimensional problems.

Main Methods:

  • Developed MRLS-Q, a Q-learning algorithm with linear function approximation.
  • Modeled MRLS-Q's learning mechanism and structure on those of DQN: states serve as inputs, and training uses experience replay and minibatches.
  • Implemented an average recursive least squares (RLS) optimization technique for improved convergence.
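
The ingredients listed above (a linear Q-function, a replay buffer, minibatch sampling, and an RLS weight update) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the summary does not spell out the averaged RLS update, so the code substitutes the textbook per-sample RLS recursion and applies it to each transition in a sampled minibatch. The class name `RLSQLearner`, the per-action weight layout, and the parameters `forgetting` and `delta` are assumptions made for the example.

```python
import numpy as np


class RLSQLearner:
    """Linear Q-function Q(s, a) = w[a] . phi(s), fitted with textbook RLS.

    Hypothetical illustration; not the MRLS-Q update from the paper.
    """

    def __init__(self, n_features, n_actions, gamma=0.99, forgetting=1.0, delta=1.0):
        self.gamma = gamma          # discount factor
        self.lam = forgetting       # RLS forgetting factor (1.0 = no forgetting)
        self.w = np.zeros((n_actions, n_features))
        # One inverse-correlation matrix per action, initialised to I / delta.
        self.P = np.stack([np.eye(n_features) / delta for _ in range(n_actions)])

    def q_values(self, phi):
        """Return the vector of Q-values for every action given features phi."""
        return self.w @ phi

    def update(self, phi, action, reward, phi_next, done):
        """Apply one RLS step toward the Q-learning target; return the TD error."""
        if done:
            target = reward
        else:
            target = reward + self.gamma * np.max(self.q_values(phi_next))
        P = self.P[action]
        P_phi = P @ phi
        gain = P_phi / (self.lam + phi @ P_phi)      # RLS gain vector
        error = target - self.w[action] @ phi        # error before the update
        self.w[action] += gain * error
        # P <- (P - gain * phi^T P) / lam, using the symmetry of P.
        self.P[action] = (P - np.outer(gain, P_phi)) / self.lam
        return error

    def train_minibatch(self, replay, batch_size, rng):
        """Sample a minibatch from the replay buffer and update on each item."""
        size = min(batch_size, len(replay))
        for i in rng.choice(len(replay), size=size, replace=False):
            self.update(*replay[i])
```

In use, transitions would be stored in `replay` as `(phi, action, reward, phi_next, done)` tuples, with `phi` a NumPy feature vector for the state, and `train_minibatch` would be called with a generator such as `np.random.default_rng(0)` after each environment step or at a fixed interval.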

Main Results:

  • MRLS-Q demonstrated effectiveness on the CartPole problem and four Atari games.
  • The algorithm showed improved convergence performance, both standalone and when integrated with DQN.
  • Experimental analysis investigated the impact of MRLS-Q's hyperparameters.

Conclusions:

  • MRLS-Q successfully combines linear function approximation with DQN principles.
  • The proposed algorithm offers enhanced convergence and stability for reinforcement learning.
  • MRLS-Q applies to both low- and high-dimensional problems and can be integrated into existing DQN architectures.