Discovering state-of-the-art reinforcement learning algorithms.

Junhyuk Oh¹, Greg Farquhar², Iurii Kemaev²

¹Google DeepMind, London, UK. junhyuk@google.com.

October 22, 2025

Summary

This summary is machine-generated.

Machines can now discover reinforcement learning (RL) rules that outperform human-designed ones. The rule is found by meta-learning from the collective experience of a population of agents.

Area of Science:

Artificial Intelligence
Machine Learning
Reinforcement Learning

Background:

Biological systems utilize evolved reinforcement learning (RL) mechanisms.
Current artificial agents rely on manually designed learning rules.
Autonomously discovering RL algorithms has been a long-standing challenge.

Purpose of the Study:

To demonstrate that machines can autonomously discover state-of-the-art reinforcement learning rules.
To develop a method for discovering RL rules through meta-learning.

Main Methods:

Meta-learning from the collective experiences of a population of agents.
Training agents across a diverse range of complex environments.
Discovering the specific RL rule governing policy and prediction updates.
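
The outer-loop idea in the methods above can be illustrated with a deliberately small sketch. It is not the authors' method: the summary does not describe their procedure, so this example swaps in a plain grid search over two step sizes. The bandit environments, the candidate grid, and every function name (softmax, run_agent, score_rule, discover_rule) are hypothetical. A "rule" here is just a pair of step sizes, one for the policy update and one for the prediction update. Each candidate rule is scored by the mean reward of a small population of agents trained with it across several environments, and the best-scoring rule is kept.

```python
import math
import random

# Hypothetical toy environments: each bandit is a list of mean rewards per arm.
ENVS = [[0.1, 0.9], [0.8, 0.2, 0.5], [0.3, 0.4, 0.9, 0.1]]
# Hypothetical search space: (policy step size, prediction step size).
CANDIDATES = [(p, v) for p in (0.0, 0.1, 0.5) for v in (0.0, 0.1, 0.5)]


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    top = max(prefs)
    weights = [math.exp(p - top) for p in prefs]
    total = sum(weights)
    return [w / total for w in weights]


def run_agent(rule, arm_means, steps, rng):
    """Train one agent with the given rule and return its mean reward."""
    policy_step, prediction_step = rule
    prefs = [0.0] * len(arm_means)  # policy parameters
    prediction = 0.0                # predicted reward, used as a baseline
    total = 0.0
    for _ in range(steps):
        probs = softmax(prefs)
        action = rng.choices(range(len(probs)), weights=probs)[0]
        reward = arm_means[action] + rng.gauss(0.0, 0.1)
        total += reward
        error = reward - prediction
        # Policy update: shift probability toward actions that beat the prediction.
        for a in range(len(prefs)):
            indicator = 1.0 if a == action else 0.0
            prefs[a] += policy_step * error * (indicator - probs[a])
        # Prediction update: move the prediction toward the observed reward.
        prediction += prediction_step * error
    return total / steps


def score_rule(rule, envs, agents_per_env=4, steps=200, seed=0):
    """Mean reward of a population of agents trained with `rule` on all envs."""
    rng = random.Random(seed)  # fixed seed so every rule sees comparable noise
    results = [run_agent(rule, env, steps, rng)
               for env in envs for _ in range(agents_per_env)]
    return sum(results) / len(results)


def discover_rule(envs, candidates):
    """Outer loop: keep the candidate rule with the best population score."""
    return max(candidates, key=lambda rule: score_rule(rule, envs))


if __name__ == "__main__":
    best = discover_rule(ENVS, CANDIDATES)
    print("selected rule:", best, "score:", round(score_rule(best, ENVS), 3))
```

Running the script prints the selected step-size pair and its population score. The candidate (0.0, 0.0) never updates the policy or the prediction, so it serves as a no-learning baseline against which the selected rule can be compared.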

Main Results:

The discovered RL rule outperformed all existing rules on the Atari benchmark.
The discovered rule also outperformed a number of state-of-the-art RL algorithms on challenging benchmarks it had not seen during discovery.
This represents a significant advancement in reinforcement learning algorithm discovery.

Conclusions:

It is possible for machines to autonomously discover powerful reinforcement learning algorithms.
Future artificial intelligence may rely on automatically discovered RL algorithms.
This approach shifts from manual design to experience-driven discovery of AI learning rules.