Actor-Critic Learning Control Based on ℓ2-Regularized Temporal-Difference Prediction With Gradient Correction

Luntong Li, Dazi Li, Tianheng Song

    IEEE Transactions on Neural Networks and Learning Systems | July 12, 2018
    Summary
    This summary is machine-generated.

    This study introduces Critic-Iteration Policy Gradient (CIPG), a novel actor-critic framework for learning control problems. CIPG keeps the policy fixed within each iteration and evaluates it on-policy with a regularized recursive least-squares temporal-difference (RLS-TD) critic, which improves data efficiency and convergence.

    Area of Science:

    • Reinforcement Learning
    • Machine Learning
    • Control Theory

    Background:

    • Actor-critic (AC) methods based on policy gradient (PG-based AC) are prevalent for learning control problems.
    • To improve the data efficiency of the critic in PG-based AC, recent work has used recursive least-squares temporal-difference (RLS-TD) algorithms for policy evaluation.
    • Existing RLS-TD critics evaluate a mixed policy produced by a sequence of changing actors rather than one fixed policy, so convergence to the optimal fixed point cannot be proved.
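
To make the background concrete, here is a minimal sketch of a conventional one-step actor-critic update, in which the critic and the actor change together at every step. This is a generic textbook form, not the method of the paper: the tabular value table `w`, the softmax preferences `theta`, the function name `actor_critic_step`, and the step sizes are all assumptions made for illustration.

```python
import math


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(theta, w, s, a, r, s_next,
                      gamma=0.9, alpha_actor=0.1, alpha_critic=0.1):
    """One-step tabular actor-critic update (illustrative, hypothetical names).

    theta[s][a] holds action preferences (the actor);
    w[s] holds state-value estimates (the critic).
    Both are updated in place; the TD error is returned.
    """
    # TD error computed from the critic's current estimates
    delta = r + gamma * w[s_next] - w[s]
    # Critic update: move V(s) toward the one-step target
    w[s] += alpha_critic * delta
    # Actor update: policy-gradient step for a softmax policy,
    # using d/d theta[s][b] of log pi(a|s) = 1[b == a] - pi(b|s)
    probs = softmax(theta[s])
    for b in range(len(theta[s])):
        grad_log_pi = (1.0 if b == a else 0.0) - probs[b]
        theta[s][b] += alpha_actor * delta * grad_log_pi
    return delta


# Usage: two states, two actions, one observed transition
theta = [[0.0, 0.0], [0.0, 0.0]]
w = [0.0, 0.0]
delta = actor_critic_step(theta, w, s=0, a=1, r=1.0, s_next=1)
```

Because the actor moves after every transition, the samples seen by the critic come from a drifting sequence of policies, which is the situation the third bullet describes.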

    Purpose of the Study:

    • To propose a new actor-critic framework, Critic-Iteration Policy Gradient (CIPG), that addresses the convergence limitations of existing RLS-TD critic methods.
    • To enable on-policy learning of the state-value function for the current policy.
    • To achieve gradient ascent towards maximizing the discounted total reward.

    Main Methods:

    • CIPG maintains fixed policy parameters within each iteration.
    • It employs an RLS-TD critic with ℓ2-regularization to evaluate the fixed policy.
    • The convergence analysis of PG with function approximation is extended to cover the RLS-TD critic.
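
As a rough illustration of evaluating one fixed policy with a regularized RLS-TD critic, the sketch below implements a plain recursive least-squares TD(0) recursion with linear features. It is a sketch under stated assumptions, not the CIPG critic: the function name `rls_td`, the parameter `lam`, and the choice to introduce the ℓ2 term through the initial matrix `P = I / lam` are all assumptions. With that initialization the recursion solves `(lam * I + A) w = b`, where `A` and `b` are the accumulated TD statistics, so the influence of the regularizer shrinks as data accumulates; the paper's regularization may be handled differently.

```python
def rls_td(transitions, n_features, gamma=0.9, lam=1.0):
    """Recursive least-squares TD(0) for a single fixed policy (illustrative).

    transitions: iterable of (phi, reward, phi_next), with phi and phi_next
                 given as lists of n_features floats.
    Returns the weight vector w of the linear value estimate V(s) = w . phi(s).
    """
    n = n_features
    # P tracks (lam*I + sum_t phi_t (phi_t - gamma*phi_next_t)^T)^-1
    P = [[(1.0 / lam) if i == j else 0.0 for j in range(n)] for i in range(n)]
    w = [0.0] * n
    for phi, r, phi_next in transitions:
        # d = phi - gamma * phi_next
        d = [p - gamma * q for p, q in zip(phi, phi_next)]
        # P @ phi and d^T @ P, needed for the Sherman-Morrison update
        P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
        dT_P = [sum(d[i] * P[i][j] for i in range(n)) for j in range(n)]
        denom = 1.0 + sum(d[i] * P_phi[i] for i in range(n))
        gain = [x / denom for x in P_phi]
        # Prediction error of the current weights on this transition
        err = r - sum(di * wi for di, wi in zip(d, w))
        w = [wi + gi * err for wi, gi in zip(w, gain)]
        # Rank-one downdate of P
        P = [[P[i][j] - gain[i] * dT_P[j] for j in range(n)] for i in range(n)]
    return w


# Usage: a deterministic two-state cycle with one-hot features.
# State 0 -> state 1 with reward 1; state 1 -> state 0 with reward 0.
cycle = [([1.0, 0.0], 1.0, [0.0, 1.0]),
         ([0.0, 1.0], 0.0, [1.0, 0.0])]
w = rls_td(cycle * 2000, n_features=2, gamma=0.5, lam=1.0)
```

For this toy chain with `gamma=0.5`, the true values are 4/3 and 2/3, and the estimate approaches them as more transitions are processed. In a CIPG-style iteration, such an evaluation would be run while the policy parameters stay fixed, and only afterwards would the actor take a gradient step.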

    Main Results:

    • The ℓ2-regularization term in the CIPG critic remains active throughout the learning process.
    • CIPG demonstrates superior learning efficiency compared to conventional AC methods.
    • Simulation results indicate a faster convergence rate for CIPG.

    Conclusions:

    • CIPG provides a theoretically sound and practically effective framework for policy gradient actor-critic methods.
    • The proposed method overcomes convergence issues associated with RLS-TD critics evaluating non-fixed policies.
    • CIPG offers improved performance in learning control tasks.