Search research articles

ABOUT JoVE

Overview Leadership Blog JoVE Help Center

AUTHORS

Publishing Process Editorial Board Scope & Policies Peer Review FAQ Submit

LIBRARIANS

Testimonials Subscriptions Access Resources Library Advisory Board FAQ

RESEARCH

JoVE Journal Methods Collections JoVE Encyclopedia of Experiments Archive

EDUCATION

JoVE Core JoVE Business JoVE Science Education JoVE Lab Manual Faculty Resource Center Faculty Site

Terms & Conditions of Use

Related Concept Videos

Types of Errors: Detection and Minimization

Types of Errors: Detection and Minimization

Error is the deviation of the obtained result from the true, expected value or the estimated central value. Errors are expressed in absolute or relative terms.
Absolute error in a measurement is the numerical difference from the true or central value. Relative error is the ratio between absolute error and the true or central value, expressed as a percentage.
Errors can be classified by source, magnitude, and sign. There are three types of errors: systematic, random, and gross.
Systematic or...

Avoidance Learning and Learned Helplessness

Avoidance Learning and Learned Helplessness

Avoidance learning and learned helplessness are critical concepts in understanding behavioral responses to negative stimuli.
Avoidance learning occurs when an organism learns that a specific behavior can prevent an unpleasant outcome. For example, a student who receives a bad grade may start studying harder to avoid future poor grades. This behavior persists even when the negative outcome is no longer present. Avoidance learning is powerful because it maintains behavior in the absence of the...

Observational Learning

Observational Learning

Albert Bandura's observational learning, also known as imitation or modeling, occurs when a person observes and imitates another's behavior. It is a quicker process than operant conditioning. A well-known example is the Bobo doll study, where children who saw an adult acting aggressively towards the doll were more likely to act aggressively when left alone, compared to those who observed a nonaggressive adult. Many psychologists view observational learning as a form of latent learning because...

Improving Translational Accuracy

Improving Translational Accuracy

Base complementarity between the three base pairs of mRNA codon and the tRNA anticodon is not a failsafe mechanism. Inaccuracies can range from a single mismatch to no correct base pairing at all. The free energy difference between the correct and nearly correct base pairs can be as small as 3 kcal/ mol. With complementarity being the only proofreading step, the estimated error frequency would be one wrong amino acid in every 100 amino acids incorporated. However, error frequencies observed in...

Improving Translational Accuracy

Improving Translational Accuracy

Base complementarity between the three base pairs of mRNA codon and the tRNA anticodon is not a failsafe mechanism. Inaccuracies can range from a single mismatch to no correct base pairing at all. The free energy difference between the correct and nearly correct base pairs can be as small as 3 kcal/ mol. With complementarity being the only proofreading step, the estimated error frequency would be one wrong amino acid in every 100 amino acids incorporated. However, error frequencies observed in...

You might also read

Related Articles

Articles linked to this work by shared authors, journal, and citation graph.

Sort by

Same author

Advancing AI negotiations: A large-scale autonomous negotiation competition.

Proceedings of the National Academy of Sciences of the United States of America·2026

Same author

Providing normative information increases intentions to accept a COVID-19 vaccine.

Nature communications·2023

Same author

Identity effects in social media.

Nature human behaviour·2022

Same author

A causal test of the strength of weak ties.

Science (New York, N.Y.)·2022

Same author

Reinforcement learning with associative or discriminative generalization across states and actions: fMRI at 3 T and 7 T.

Human brain mapping·2022

Same author

Global survey on COVID-19 beliefs, behaviours and norms.

Nature human behaviour·2022

Same journal

Confident judgments of (mis)information veracity are more, rather than less, accurate.

PNAS nexus·2026

Same journal

Can AI help reduce prejudice? Evaluating the effectiveness of AI-powered personalized persuasion on support for transgender rights.

PNAS nexus·2026

Same journal

A cultural explanation for parole decisions in the United States.

PNAS nexus·2026

Same journal

A transformer-based language model reveals developmental constraint and network complexity during zebrafish embryogenesis.

PNAS nexus·2026

Same journal

Dual phosphoregulatory mechanisms of condensin I revealed by biochemical reconstitution.

PNAS nexus·2026

Same journal

Vanin-1 deficiency enhances host tolerance to influenza infection by modulating cellular redox status.

PNAS nexus·2026

See all related articles

Search research articles

Related Experiment Videos

Teaching AI to handle exceptions: Supervised fine-tuning with human-aligned judgment.

Matthew DosSantos DiSorbo¹, Harang Ju², Sinan Aral³

¹Harvard Business School, Harvard University, 20 N Harvard Street, Cambridge, MA 02163, USA.

|May 28, 2026

Summary

This summary is machine-generated.

Large language models (LLMs) struggle with real-world decision-making, often failing to handle exceptions like humans. Supervised fine-tuning with human explanations significantly improves their ability to make human-aligned judgments, even in new situations.

Keywords:

agentic AI decision-making large language models supervised fine-tuning transfer learning

Related Experiment Videos

Area of Science:

Artificial Intelligence
Human-Computer Interaction
Cognitive Science

Background:

Large language models (LLMs) are transitioning from generative tasks to agentic decision-making in complex environments.
The decision-making processes of LLMs, particularly their handling of exceptions, remain poorly understood.
Strict adherence to policies by LLMs can lead to impractical or suboptimal decisions, diverging from human judgment.

Purpose of the Study:

To investigate how LLMs handle exceptions in decision-making.
To evaluate different tuning methods for improving LLM exception handling and human alignment.
To determine the impact of human explanations versus labels in supervised fine-tuning.

Main Methods:

Evaluating LLM decision-making against human judgments on exception handling.
Comparing three tuning approaches: ethical framework prompting, chain-of-thought (CoT) reasoning, and supervised fine-tuning.
Analyzing the effectiveness of supervised fine-tuning with and without human explanations.

Main Results:

LLMs deviate from human judgments due to rigid policy adherence, even when impractical.
Ethical framework prompting was ineffective; CoT prompting offered minor improvements.
Supervised fine-tuning, especially with human explanations, significantly enhanced LLM decision-making and alignment.
Fine-tuning with explanations enabled generalization of human-like decision-making to novel scenarios.

Conclusions:

Aligning LLMs with human judgment requires training on the reasoning process ('how') not just the outcome ('which').
Supervised fine-tuning with human explanations is critical for developing adaptable and human-aligned agentic AI.
Addressing LLMs' exception-handling deficits is crucial for advancing AI that effectively aligns with human values and adapts to new contexts.