Aversion to external feedback suffices to ensure agent alignment.

¹International School of Engineering, Chulalongkorn University, Bangkok, Thailand. paulo.g@chula.ac.th.

Scientific Reports, September 10, 2024

Summary

This summary is machine-generated.

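Apprehensive agents, a new AI agent architecture, align with human values by anticipating negative feedback rather than requiring it. Unlike rational agents that maximize a single utility function, which may diverge from human values as they become more capable, apprehensive agents become better aligned as their intelligence increases.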

Area of Science:

Artificial Intelligence
AI Alignment
Machine Ethics

Background:

The AI alignment challenge seeks to ensure artificial intelligence (AI) systems operate in accordance with human values.
Rational agents maximizing utility functions may diverge from human values, especially at higher intelligence levels.
A single utility function is insufficient; holistic alignment approaches are necessary.

Purpose of the Study:

To introduce and evaluate a novel AI agent architecture, termed 'apprehensive agents'.
To demonstrate how these agents can achieve better alignment with human values without external feedback.
To show that this alignment improves with increasing agent intelligence.

Main Methods:

Architecting agents with an effective utility function combining a designer-defined partial utility and an expectation of negative feedback.
Implementing temporal reasoning to approximate designer intentions under environmental evolution.
Evaluating apprehensive agents in simulated environments designed to reveal misalignment opportunities.
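
The first method above, an effective utility that combines a designer-defined partial utility with an expectation of negative feedback, can be illustrated with a minimal sketch. This summary does not give the paper's actual formulation, so everything here is an assumption made for illustration: the additive penalty form, the `aversion` weight, the `Action` fields, and the numeric values are all hypothetical. The temporal reasoning step is not modeled.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Action:
    name: str
    partial_utility: float      # designer-defined score (hypothetical value)
    p_negative_feedback: float  # agent's own estimate that the designer would object


def effective_utility(action: Action, aversion: float) -> float:
    """Assumed form: partial utility minus an aversion-weighted feedback expectation."""
    return action.partial_utility - aversion * action.p_negative_feedback


def choose(actions: Sequence[Action], aversion: float) -> Action:
    """Pick the action with the highest effective utility."""
    return max(actions, key=lambda a: effective_utility(a, aversion))


# Hypothetical environment with a misalignment opportunity: the "shortcut"
# scores higher on the partial utility but is likely to draw an objection.
ACTIONS = [
    Action("shortcut", partial_utility=10.0, p_negative_feedback=0.9),
    Action("intended", partial_utility=8.0, p_negative_feedback=0.05),
]

baseline = choose(ACTIONS, aversion=0.0)      # ignores anticipated feedback
apprehensive = choose(ACTIONS, aversion=5.0)  # penalizes anticipated feedback

print(baseline.name, apprehensive.name)
```

In this toy setting the baseline agent takes the shortcut, while the agent that penalizes anticipated feedback picks the intended action. No feedback is ever delivered; only the agent's internal estimate enters the decision.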

Main Results:

Apprehensive agents demonstrated superior alignment compared to baseline agents in simulated environments.
The alignment of apprehensive agents improved as their intelligence increased.
This strategy achieved alignment without requiring direct external feedback.

Conclusions:

Apprehensive agents offer a promising solution to the AI alignment challenge.
Anticipating negative feedback through internal reasoning is a viable mechanism for enhancing AI alignment.
The proposed architecture shows potential for robust AI alignment as intelligence scales.