Discovering state-of-the-art reinforcement learning algorithms.

Junhyuk Oh¹, Greg Farquhar², Iurii Kemaev²

  • ¹Google DeepMind, London, UK. junhyuk@google.com.

Nature | October 22, 2025

Summary
This summary is machine-generated.

Machines can now discover reinforcement learning (RL) rules that outperform human-designed ones. The rule was found by meta-learning from the experiences of a population of agents trained across many environments.

Area of Science:

  • Artificial Intelligence
  • Machine Learning
  • Reinforcement Learning

Background:

  • Biological systems learn through reinforcement learning (RL) mechanisms that were shaped by evolution.
  • Current artificial agents rely on manually designed learning rules.
  • Autonomously discovering RL algorithms has been a long-standing challenge.

Purpose of the Study:

  • To demonstrate that machines can autonomously discover state-of-the-art reinforcement learning rules.
  • To develop a method for discovering RL rules through meta-learning.

Main Methods:

  • Meta-learning from the collective experiences of a population of agents.
  • Training agents across a diverse range of complex environments.
  • Discovering the RL rule by which each agent's policy and predictions are updated.
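
The outer-loop idea in the list above can be illustrated with a deliberately tiny sketch. It is not the study's method: the environment (a multi-armed bandit), the single tunable step size, and the names `run_agent` and `meta_search` are all assumptions made for illustration. The sketch only shows the general pattern of scoring a candidate update rule by the pooled returns of a population of agents, each in its own environment, and keeping the rule that scores best.

```python
import random


def run_agent(step_size, seed, n_arms=5, n_steps=200, epsilon=0.1):
    """Run one epsilon-greedy bandit agent (hypothetical toy environment).

    `step_size` is the only meta-parameter of the agent's update rule.
    Returns the average reward collected over the run.
    """
    rng = random.Random(seed)
    # Each seed defines a different environment (different arm means).
    true_means = [rng.gauss(0.0, 1.0) for _ in range(n_arms)]
    estimates = [0.0] * n_arms
    total = 0.0
    for _ in range(n_steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(true_means[arm], 1.0)
        # The "learning rule" under search: a prediction update whose
        # behaviour is controlled by the meta-parameter.
        estimates[arm] += step_size * (reward - estimates[arm])
        total += reward
    return total / n_steps


def meta_search(candidates, population_seeds):
    """Score each candidate rule by the mean return of a population of agents."""
    scores = {}
    for step_size in candidates:
        returns = [run_agent(step_size, seed) for seed in population_seeds]
        scores[step_size] = sum(returns) / len(returns)
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, scores = meta_search([0.0, 0.05, 0.1, 0.3, 1.0], range(30))
    for step_size, score in scores.items():
        print(f"step_size={step_size:<5} mean return={score:.3f}")
    print("selected step_size:", best)
```

Here the search is an exhaustive comparison over a handful of candidates. A realistic system would search a much larger space of rules with a learned, gradient-based, or evolutionary outer loop.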

Main Results:

  • The discovered RL rule outperformed all existing rules on the Atari benchmark.
  • It also outperformed state-of-the-art RL algorithms on challenging benchmarks that it had not seen during discovery.
  • This represents a significant advancement in reinforcement learning algorithm discovery.

Conclusions:

  • Machines can autonomously discover powerful reinforcement learning algorithms.
  • Future artificial intelligence may rely on automatically discovered RL algorithms.
  • This approach shifts from manual design to experience-driven discovery of AI learning rules.