



Research on automatic pilot repetition generation method based on deep reinforcement learning.

Weijun Pan¹, Peiyuan Jiang¹, Yukun Li¹

  • ¹ Air Traffic Control Automation Laboratory, College of Air Traffic Management, Civil Aviation Flight University of China, Deyang, China.

Frontiers in Neurorobotics | October 27, 2023
Summary
This summary is machine-generated.

A new RoBERTa-RL model uses deep reinforcement learning to generate realistic pilot communications for air traffic control simulation. This approach enhances training efficiency and reduces costs by improving model generalization for text generation tasks.

Keywords:
controller training, generalization, reinforcement learning, text generation, transfer learning


Area of Science:

  • Artificial Intelligence
  • Natural Language Processing
  • Air Traffic Management

Background:

  • Air traffic control (ATC) simulation training traditionally relies on human-staffed pilot seats, which are costly and inefficient.
  • Developing automated methods for generating realistic pilot communications is crucial for advancing ATC simulation.

Purpose of the Study:

  • To propose and evaluate a deep reinforcement learning model, RoBERTa-RL, for generating pilot repetitions in ATC simulations.
  • To enhance the efficiency and reduce the cost of ATC controller training through automated communication generation.

Main Methods:

  • Utilized RoBERTa, a pre-trained language model, enhanced with transfer learning to address data scarcity in the ATC domain.
  • Employed reinforcement learning algorithms to optimize the RoBERTa model, improving generalization capabilities.
  • Trained and tested the model on real-world area control and simulated tower control datasets.
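
The summary does not say which reinforcement learning algorithm was used, so the following is only a minimal sketch of the general idea, assuming a REINFORCE-style policy-gradient update with a moving-average baseline. A toy softmax policy over a fixed list of candidate readbacks stands in for the language model; the reward function, the candidate strings, and all names (`token_overlap_reward`, `reinforce_step`, `train`) are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_overlap_reward(generated, reference):
    """Hypothetical reward: fraction of reference tokens found in the output."""
    ref_tokens = reference.split()
    gen_tokens = set(generated.split())
    return sum(tok in gen_tokens for tok in ref_tokens) / len(ref_tokens)


def reinforce_step(logits, candidates, reference, lr, baseline, rng):
    """Sample one candidate, score it, and apply one policy-gradient update."""
    probs = softmax(logits)
    idx = rng.choices(range(len(candidates)), weights=probs)[0]
    reward = token_overlap_reward(candidates[idx], reference)
    advantage = reward - baseline
    # For a softmax policy, d log pi(idx) / d logit_j = 1[j == idx] - probs[j].
    new_logits = [
        logit + lr * advantage * ((1.0 if j == idx else 0.0) - probs[j])
        for j, logit in enumerate(logits)
    ]
    return new_logits, reward


def train(candidates, reference, steps=2000, lr=0.5, seed=0):
    """Run repeated updates and return the final candidate probabilities."""
    rng = random.Random(seed)
    logits = [0.0] * len(candidates)
    baseline = 0.0  # moving average of rewards, used to reduce variance
    for _ in range(steps):
        logits, reward = reinforce_step(
            logits, candidates, reference, lr, baseline, rng
        )
        baseline = 0.9 * baseline + 0.1 * reward
    return softmax(logits)


if __name__ == "__main__":
    reference = "climbing flight level three two zero"
    candidates = [
        "climbing flight level three two zero",
        "descending flight level two eight zero",
        "roger",
    ]
    for text, p in zip(candidates, train(candidates, reference)):
        print(f"{p:.3f}  {text}")
```

In a real system the policy would be the fine-tuned language model and the reward would come from a task-specific score, but the same pattern applies: sample an output, score it, and shift probability toward outputs that score above the baseline.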

Main Results:

  • RoBERTa-RL achieved high ROUGE scores (e.g., 0.996 for ROUGE-L on area control data).
  • Keyword-based evaluation demonstrated high accuracy (98.8% on area control, 81.8% on tower control).
  • Significant improvements in generalization were observed, with a 56% increase over the baseline model.
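
To make the two reported metrics concrete, here is a rough sketch of how they could be computed. ROUGE-L is based on the longest common subsequence (LCS) between a generated text and a reference; this version uses whitespace tokens and the balanced F1 form, which may differ from the exact variant used in the study. The keyword check is an assumption: the summary does not define the keyword-based evaluation, so the hypothetical `keyword_accuracy` below simply counts a readback as correct when it contains every required keyword.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L as the F1 of LCS-based precision and recall (whitespace tokens)."""
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def keyword_accuracy(pairs):
    """Assumed metric: share of outputs that contain all required keywords.

    `pairs` is a list of (generated_text, required_keywords) tuples.
    """
    hits = sum(
        all(keyword in text.split() for keyword in keywords)
        for text, keywords in pairs
    )
    return hits / len(pairs)


if __name__ == "__main__":
    reference = "climbing flight level three two zero"
    print(rouge_l_f1("climb flight level three two zero", reference))
    print(keyword_accuracy([
        ("climbing flight level three two zero", ["climbing", "three", "two", "zero"]),
        ("descending flight level three two zero", ["climbing", "three"]),
    ]))
```

Because ROUGE-L rewards any long shared subsequence, a readback can score highly while still getting a safety-critical word wrong, which is one reason to report a keyword-level check alongside it.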

Conclusions:

  • Deep reinforcement learning effectively enhances deep learning models for text generation, mitigating generalization issues.
  • The RoBERTa-RL model shows significant promise for improving ATC simulation and training.
  • The proposed approach has potential applications in other related text generation domains.