Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach.

Jiancong Xiao¹, Bojian Hou¹, Zhanliang Wang¹

  • ¹University of Pennsylvania, PA, USA.

Proceedings of Machine Learning Research | March 23, 2026
Summary
This summary is machine-generated.

Preference alignment degrades the calibration of Large Language Models (LLMs), making them overconfident. This study shows that domain-specific fine-tuning reduces this overconfidence and introduces calibration-aware methods that improve LLM calibration without sacrificing performance.

Area of Science:

  • Artificial Intelligence
  • Machine Learning
  • Natural Language Processing

Background:

  • Preference alignment is a key technique behind the success of Large Language Models (LLMs).
  • Aligned LLMs are often poorly calibrated: they tend to be overconfident, reporting higher confidence than their accuracy warrants.
  • Pre-trained models are typically well calibrated, but their calibration degrades after alignment.
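
As a rough illustration of what overconfidence means here, the sketch below compares a model's average stated confidence with its actual accuracy on a set of answers. The function name `overconfidence_gap` and the sample numbers are hypothetical, chosen only for illustration; they are not taken from the study.

```python
def overconfidence_gap(confidences, correct):
    """Return mean confidence minus accuracy.

    A positive value suggests overconfidence, a negative value
    underconfidence, and a value near zero is consistent with good
    calibration on average (it does not prove calibration per bin).
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_confidence = sum(confidences) / len(confidences)
    accuracy = sum(1 for ok in correct if ok) / len(correct)
    return mean_confidence - accuracy


# Hypothetical example: the model claims about 90% confidence on average
# but only half of its answers are right.
gap = overconfidence_gap([0.9, 0.95, 0.85, 0.9], [True, False, True, False])
print(f"overconfidence gap: {gap:.2f}")
```

This single number is coarse: a model can be overconfident on some inputs and underconfident on others while the averages cancel out, which is one reason binned metrics are commonly used instead.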

Purpose of the Study:

  • Investigate the reasons behind calibration degradation in LLMs post-preference alignment.
  • Develop methods to address and mitigate poor calibration in aligned LLMs.
  • Analyze the impact of calibration on LLM performance and propose solutions for different model regimes.

Main Methods:

  • Observed that preference collapse during alignment generalizes to calibration issues, causing overconfidence.
  • Demonstrated the effectiveness of fine-tuning with domain-specific knowledge to reduce overconfidence.
  • Categorized models into 'calibratable' and 'non-calibratable' based on Expected Calibration Error (ECE).
  • Proposed a calibration-aware fine-tuning approach for the calibratable regime.
  • Developed an EM-algorithm-based ECE regularization for the non-calibratable regime.
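
The methods above are organized around Expected Calibration Error (ECE). The following is a minimal sketch of the commonly used binned ECE estimator, assuming equal-width confidence bins. It is not the authors' implementation, and the function name, the default of 10 bins, and the half-open binning convention are assumptions made for illustration. The study's calibration-aware fine-tuning and EM-based regularization are not reproduced here.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average of |accuracy - mean confidence| per bin.

    confidences: predicted probabilities in [0, 1] for the chosen answers.
    correct:     booleans, True where the chosen answer was right.
    n_bins:      number of equal-width bins (an assumed default).
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Bins are [k/n_bins, (k+1)/n_bins); a confidence of 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if ok else 0.0))
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(a for _, a in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Hypothetical example: two very confident answers and two hesitant ones,
# each pair half right.
ece = expected_calibration_error([0.95, 0.95, 0.55, 0.55],
                                 [True, False, True, False])
print(f"ECE: {ece:.3f}")
```

Lower values indicate better calibration. The estimate depends on the number of bins and on how many samples fall in each one, so comparisons between models are only meaningful under the same binning.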

Main Results:

  • Preference alignment leads to overconfidence and poor calibration in LLMs.
  • Domain-specific fine-tuning alleviates overconfidence.
  • A calibration-aware fine-tuning approach maintains performance in the calibratable regime.
  • ECE regularization effectively reduces calibration error in the non-calibratable regime.
  • Proposed methods were validated through extensive experiments.

Conclusions:

  • Preference alignment negatively impacts LLM calibration due to preference collapse.
  • Domain-specific knowledge and calibration-aware fine-tuning are crucial for improving LLM calibration.
  • Tailored methods for calibratable and non-calibratable models effectively address overconfidence and maintain performance.