Research on automatic pilot repetition generation method based on deep reinforcement learning.

Weijun Pan¹, Peiyuan Jiang¹, Yukun Li¹

¹Air Traffic Control Automation Laboratory, College of Air Traffic Management, Civil Aviation Flight University of China, Deyang, China.

Frontiers in Neurorobotics

October 27, 2023

Summary

This summary is machine-generated.

A new RoBERTa-RL model uses deep reinforcement learning to generate realistic pilot communications for air traffic control simulation. This approach enhances training efficiency and reduces costs by improving model generalization for text generation tasks.

Keywords:

controller training; generalization; reinforcement learning; text generation; transfer learning

Area of Science:

Artificial Intelligence
Natural Language Processing
Air Traffic Management

Background:

Air traffic control (ATC) simulation training traditionally relies on pilot seats (human operators who play the role of pilots), which are costly and inefficient.
Developing automated methods for generating realistic pilot communications is crucial for advancing ATC simulation.

Purpose of the Study:

To propose and evaluate a deep reinforcement learning model, RoBERTa-RL, for generating pilot repetitions in ATC simulations.
To enhance the efficiency and reduce the cost of ATC controller training through automated communication generation.

Main Methods:

Utilized RoBERTa, a pre-trained language model, enhanced with transfer learning to address data scarcity in the ATC domain.
Employed reinforcement learning algorithms to optimize the RoBERTa model, improving generalization capabilities.
Trained and tested the model on real-world area control and simulated tower control datasets.
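
The summary does not say which reinforcement learning algorithm was used. As a rough illustration only, a generic policy-gradient (REINFORCE-style) update for a text generator can be written as below. All symbols are assumptions introduced here, not taken from the paper: x is the controller instruction, y is the generated pilot repetition, \pi_\theta is the fine-tuned language model, R(y) is a scalar reward (for example, a text-overlap score against the reference repetition), and b is a baseline that reduces variance.

```latex
\nabla_\theta J(\theta)
  = \mathbb{E}_{y \sim \pi_\theta(\cdot \mid x)}
    \left[ \bigl( R(y) - b \bigr) \, \nabla_\theta \log \pi_\theta(y \mid x) \right]
```

Under this formulation the reward is computed on the whole generated sequence, which allows a non-differentiable metric to guide training. Whether RoBERTa-RL uses this exact form is not stated in the summary.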

Main Results:

RoBERTa-RL achieved high ROUGE scores (e.g., 0.996 for ROUGE-L on area control data).
Keyword-based evaluation demonstrated high accuracy (98.8% on area control, 81.8% on tower control).
Significant improvements in generalization were observed, with a 56% increase over the baseline model.
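
The two kinds of metric reported above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: ROUGE-L is computed here as the F1 score of the longest common subsequence over whitespace tokens, and the keyword check assumes that a generated repetition counts as correct only if it contains every expected keyword. The function names, the tokenization, and the sample phrases are all hypothetical, and the paper's exact ROUGE variant and keyword rules are not given in the summary.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 over lowercase whitespace tokens (assumed tokenization)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def contains_keyword(text, keyword):
    """True if the keyword (one or more words) occurs as whole words in text."""
    padded = " " + " ".join(text.lower().split()) + " "
    return " " + " ".join(keyword.lower().split()) + " " in padded


def keyword_accuracy(examples):
    """Fraction of examples whose generated text contains all expected keywords.

    `examples` is a list of (expected_keywords, generated_text) pairs;
    this scoring rule is an assumption made for illustration.
    """
    if not examples:
        return 0.0
    hits = sum(
        all(contains_keyword(generated, kw) for kw in keywords)
        for keywords, generated in examples
    )
    return hits / len(examples)
```

For example, `rouge_l_f1("a b c d", "a c")` gives precision 1.0 and recall 0.5, so the F1 score is 2/3, while an exact match scores 1.0.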

Conclusions:

Deep reinforcement learning effectively enhances deep learning models for text generation, mitigating generalization issues.
The RoBERTa-RL model shows significant promise for improving ATC simulation and training.
The proposed approach has potential applications in other related text generation domains.