



Improving Minimum Bayes Risk Decoding with Multi-Prompt.

David Heineman¹, Yao Dou¹, Wei Xu¹

¹School of Interactive Computing, Georgia Institute of Technology.

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

July 4, 2025

Summary

This summary is machine-generated.

Instruction fine-tuned large language models (LLMs) benefit from multi-prompt decoding, which generates candidates from a bank of prompts rather than from a single prompt. The resulting candidate set is more diverse, which improves Minimum Bayes Risk (MBR) decoding and gives more stable, higher-quality text generation across a range of tasks.


Area of Science:

Natural Language Processing
Artificial Intelligence
Machine Learning

Background:

Instruction fine-tuned large language models (LLMs) demonstrate strong text generation capabilities but suffer from performance instability due to prompt sensitivity.
A single prompt may not encompass all optimal strategies for a given generation task, leading to sub-optimal outcomes.

Purpose of the Study:

To introduce and evaluate a novel multi-prompt decoding strategy to enhance the stability and performance of LLM text generation.
To investigate if generating multiple candidate outputs from a prompt bank improves downstream task performance.

Main Methods:

Proposing multi-prompt decoding, which generates numerous candidate text outputs from a curated bank of prompts at inference time.
Employing Minimum Bayes Risk (MBR) decoding to ensemble these candidates, selecting the final output based on a trained value metric.
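
The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`multi_prompt_candidates`, `mbr_select`, `unigram_f1`) and the `generate` callable are hypothetical, and the utility here is a simple unigram-overlap F1 standing in for the trained value metric the study uses. Each candidate is scored by its average utility against the other candidates, and the highest-scoring one is returned.

```python
from collections import Counter
from typing import Callable, List


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Stand-in utility (assumption): unigram-overlap F1 between two strings."""
    hyp = Counter(hypothesis.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def multi_prompt_candidates(
    prompt_bank: List[str],
    source: str,
    generate: Callable[[str, str], str],
) -> List[str]:
    """Produce one candidate per prompt in the bank.

    `generate(prompt, source)` is a hypothetical wrapper around an LLM call.
    """
    return [generate(prompt, source) for prompt in prompt_bank]


def mbr_select(
    candidates: List[str],
    utility: Callable[[str, str], float] = unigram_f1,
) -> str:
    """Return the candidate with the highest mean utility against the others."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(i: int) -> float:
        # Treat every other candidate as a pseudo-reference.
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], o) for o in others) / len(others)

    best_index = max(range(len(candidates)), key=expected_utility)
    return candidates[best_index]
```

In use, `mbr_select(multi_prompt_candidates(prompt_bank, source, generate))` would return the candidate that agrees most with the rest of the pool. Swapping `unigram_f1` for a learned metric only requires passing a different `utility` callable.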

Main Results:

Multi-prompt decoding significantly improves MBR decoding performance across a wide range of conditional text generation tasks.
The enhanced performance is attributed to the creation of a more diverse and higher-quality candidate solution space compared to single-prompt methods.
Further experiments validate the effectiveness of multi-prompt decoding across different LLM architectures, tasks, and evaluation metrics.

Conclusions:

Multi-prompt decoding offers a robust method to overcome prompt sensitivity issues in instruction fine-tuned LLMs.
This technique yields more stable, higher-quality, and more diverse text generation, improving the utility of LLMs in conditional generation settings.