
Updated: Jun 24, 2025


Evaluating language models for mathematics through interactions.

Katherine M Collins1, Albert Q Jiang1, Simon Frieder2

  • 1University of Cambridge, Cambridge CB2 1TN, United Kingdom.

Proceedings of the National Academy of Sciences of the United States of America
|June 3, 2024
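Summary
This summary is machine-generated.

Evaluating large language models (LLMs) for interactive problem solving requires more than static tests. Our study shows that although models such as GPT-4 perform well on mathematics, human interaction reveals nuances in helpfulness and correctness.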

Keywords:
AI. Human-computer interaction. Language models. Theorem proving.



Scientific fields:

  • Artificial intelligence
  • Human-computer interaction
  • Mathematics education

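Background:

  • Large language models (LLMs) show promise as problem-solving assistants.
  • Current LLM evaluation methods, which rely on static input-output pairs, are insufficient for interactive settings.
  • Understanding LLM capabilities in dynamic, real-world applications is essential.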

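Purpose of the study:

  • Introduce CheckMate, an interactive LLM evaluation platform.
  • Evaluate InstructGPT, ChatGPT, and GPT-4 as assistants for mathematical problem solving.
  • Analyze human interaction patterns and LLM performance in a mathematical context.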

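Main methods:

  • Developed and used the CheckMate platform for human-LLM interaction.
  • Conducted a study involving undergraduate mathematics students and professors.
  • Collected interaction data and ratings, forming the MathConverse dataset.
  • Conducted case studies of GPT-4's mathematical problem-solving ability.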

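Main results:

  • Derived a taxonomy of human query behaviors during LLM interaction.
  • Observed a divergence between the correctness of LLM outputs and their perceived helpfulness.
  • Identified specific strengths and weaknesses of GPT-4 in mathematical proofs.
  • Released the MathConverse dataset for further research.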
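
One way to picture the divergence between correctness and perceived helpfulness is to count how often the two ratings of a single response fall on opposite sides of a threshold. The sketch below is illustrative only: the record layout, the rating scales, the thresholds, and the sample values are all assumptions made for this example, not the MathConverse schema or results from the study.

```python
from dataclasses import dataclass


@dataclass
class RatedResponse:
    """One rated model response (hypothetical layout, not the MathConverse schema)."""
    model: str
    correctness: int   # assumed integer rating, higher = more correct
    helpfulness: int   # assumed integer rating, higher = more helpful


def divergence_rate(responses, correct_min=4, helpful_min=4):
    """Fraction of responses whose thresholded correctness and
    helpfulness labels disagree (correct but unhelpful, or the reverse)."""
    if not responses:
        return 0.0
    disagreements = sum(
        (r.correctness >= correct_min) != (r.helpfulness >= helpful_min)
        for r in responses
    )
    return disagreements / len(responses)


# Invented toy ratings, for illustration only.
sample = [
    RatedResponse("model_a", correctness=6, helpfulness=5),  # correct and helpful
    RatedResponse("model_a", correctness=2, helpfulness=5),  # helpful despite errors
    RatedResponse("model_b", correctness=5, helpfulness=1),  # correct but unhelpful
    RatedResponse("model_b", correctness=1, helpfulness=0),  # neither
]

print(divergence_rate(sample))
```

A rate near zero would mean the two ratings track each other closely. A higher rate would indicate that correctness alone is a poor proxy for how useful raters found a response.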

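Conclusions:

  • Interactive evaluation is essential to understanding the practical usefulness of LLMs.
  • LLMs that communicate uncertainty and accept correction make better assistants.
  • Mathematicians and ML practitioners should be aware of the limitations and potential fallibility of LLMs.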