Lightweight Language Models Are Prone to Reasoning Errors for Complex Computational Phenotyping Tasks

Shashank Yadav¹, David Maughan¹, Vignesh Subbian¹

¹College of Engineering, The University of Arizona, Tucson, AZ.

August 6, 2025

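Summary

This summary is machine-generated.

Large language models (LLMs) exhibit reasoning errors in complex computational phenotyping tasks. Strengthening LLM evaluation frameworks such as PHEONA is essential for identifying and addressing these errors in AI development.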

Keywords:

Computational phenotyping, computer reasoning, electronic phenotyping, generative artificial intelligence, large language models

Scientific fields:

Biomedical informatics
Artificial intelligence

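Background:

Computational phenotyping is essential for cohort identification, but it is time-consuming because it relies on manual data review.
Previous studies have shown that LLMs have limitations in complex phenotyping tasks, particularly those involving multiple therapies.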

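Study objectives:

Evaluate the reasoning capabilities of lightweight LLMs in computational phenotyping.
Enhance the PHEONA framework to evaluate faulty reasoning in LLMs.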

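Main methods:

Three lightweight LLMs (DeepSeek-r1, Mistral Small, Phi-4) were evaluated for phenotyping accuracy.
Prompt modifications were used to identify explanation correctness and unfaithfulness errors.
The PHEONA framework was extended to include evaluation of faulty reasoning.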
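
As an illustration only, the idea of comparing accuracy before and after a prompt modification, and flagging answers whose explanation does not account for the change, can be sketched in a few lines of Python. This is not the authors' implementation or the PHEONA framework itself: the `Response` record, its fields, the flagging rule, and the toy labels "A" and "B" are all hypothetical assumptions chosen to make the sketch runnable.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One model answer to one record (all fields are hypothetical)."""
    gold: str           # reference phenotype label
    base_answer: str    # answer under the original prompt
    mod_answer: str     # answer under the modified prompt
    cites_change: bool  # does the explanation acknowledge the modification?


def accuracy(pairs):
    """Fraction of (predicted, gold) pairs that match."""
    pairs = list(pairs)
    return sum(p == g for p, g in pairs) / len(pairs)


def summarize(responses):
    """Compare accuracy across prompts and count possibly unfaithful answers."""
    base = accuracy((r.base_answer, r.gold) for r in responses)
    mod = accuracy((r.mod_answer, r.gold) for r in responses)
    # Assumed flagging rule: the answer changed after the prompt
    # modification, yet the explanation never mentions the modification.
    flagged = [r for r in responses
               if r.base_answer != r.mod_answer and not r.cites_change]
    return {"base_acc": base, "mod_acc": mod,
            "acc_delta": mod - base, "n_flagged": len(flagged)}


# Toy, made-up records purely to exercise the functions; not study data.
toy = [
    Response("A", "A", "A", False),
    Response("B", "B", "A", False),
    Response("A", "B", "A", True),
    Response("B", "B", "B", False),
]
result = summarize(toy)
```

In this sketch, a small `acc_delta` corresponds to a model whose accuracy is little affected by the prompt modification, while `n_flagged` counts answers that changed without the explanation saying why. A real evaluation would need a carefully defined rule for what counts as an explanation acknowledging the modification.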

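Main results:

Reasoning errors, including explanation correctness and unfaithfulness errors, were prevalent across all tested LLMs.
Compared with Mistral and Phi, DeepSeek showed the smallest impact on accuracy after prompt modifications.
The enhanced PHEONA framework successfully detected the prevalent reasoning errors.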

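Conclusions:

Reasoning errors are pervasive in LLM responses to complex tasks such as computational phenotyping.
The enhanced PHEONA framework is essential for LLM evaluation and underscores the need for improved explainability methods.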