Hardware-Oriented Approximations of Softmax and RMSNorm for Efficient Transformer Inference

Yiwen Kang1,2, Dong Wang1,2

  • 1Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China.

Micromachines | January 28, 2026 | PubMed
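Summary
This summary is machine-generated.

This study presents hardware-efficient methods that accelerate Transformer inference by optimizing nonlinear operators such as Softmax and RMSNorm. These techniques reduce resource cost and latency while preserving the accuracy of large language models (LLMs).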

Keywords:
FPGA; RMSNorm; Softmax; hardware accelerator; Transformer inference

Scientific fields:

  • Computer engineering
  • Artificial intelligence
  • Software engineering

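Background:

  • Transformer-based large language models (LLMs) are increasingly used in software engineering for tasks such as code generation and NFR classification.
  • Existing LLM optimization research focuses mainly on linear operations, leaving nonlinear operators underexplored.
  • Nonlinear operators such as Softmax and RMSNorm are critical to Transformer performance but computationally expensive.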
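
For reference, the standard textbook definitions of the two operators are given below. They are not quoted from the paper; the symbols (input vector x of length d, learned scale γ, small constant ε) are the usual conventions and are assumptions here. The cost comes from the exponential and division in Softmax and from the square root and division in RMSNorm.

```latex
% Numerically "safe" Softmax: subtracting the maximum keeps every exponent <= 0
\mathrm{Softmax}(x)_i = \frac{e^{\,x_i - m}}{\sum_{j=1}^{d} e^{\,x_j - m}},
\qquad m = \max_{j} x_j

% RMSNorm: scale each element by the reciprocal root mean square
\mathrm{RMSNorm}(x)_i = \gamma_i \, \frac{x_i}{\sqrt{\tfrac{1}{d}\sum_{j=1}^{d} x_j^{2} + \varepsilon}}
```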

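Objectives:

  • To propose hardware-efficient approximation and acceleration methods for the Softmax and RMSNorm operators in Transformer models.
  • To reduce resource cost and speed up Transformer inference.
  • To preserve LLM accuracy while optimizing hardware utilization.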

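Main methods:

  • Developed a SafeSoftmax technique with range reduction, enabling bipartite lookup-table (LUT) approximation and acceleration.
  • Optimized the bit-width configuration using Pareto-frontier analysis and applied error compensation to preserve numerical accuracy.
  • Optimized RMSNorm by reformulating division as logarithmic subtraction using LOD-driven LUTs, and by using the LOD for parallel computation.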
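
The summary gives no implementation details, so the following Python sketch only illustrates the general ideas named above and is not the authors' design. It assumes that "LOD" means a leading-one detector, that the exponential is range-reduced to a power of two plus a fractional part looked up in two small bipartite tables, and that RMSNorm's division is carried out as a subtraction in the log2 domain. All bit widths, table sizes, and function names are hypothetical, and the ε term and learned scale of RMSNorm are omitted for brevity.

```python
import math

# --- Softmax: SafeSoftmax + range reduction + bipartite tables (assumed) ---
FRAC_BITS = 9            # assumed width of the reduced fractional argument
N0, N1, N2 = 3, 3, 3     # assumed split of those bits into three fields

def build_bipartite_tables():
    """Build two small tables approximating g(f) = 2**(-f) on [0, 1)."""
    scale = 1 << FRAC_BITS
    mid2 = ((1 << N2) - 1) / 2.0              # centre of the low field
    tiv, to = [], []                          # initial values, offsets
    for x0 in range(1 << N0):
        row = []
        for x1 in range(1 << N1):
            base = ((x0 << N1) | x1) << N2    # x2 bits set to zero
            row.append(2.0 ** (-(base + mid2) / scale))
        tiv.append(row)
        f_mid = (x0 + 0.5) / (1 << N0)        # centre of segment x0
        slope = -math.log(2.0) * 2.0 ** (-f_mid)   # derivative g'(f_mid)
        to.append([slope * (x2 - mid2) / scale for x2 in range(1 << N2)])
    return tiv, to

TIV, TO = build_bipartite_tables()

def exp_approx(z):
    """Approximate e**z for z <= 0 as 2**(-k) * table(fraction)."""
    t = -z * math.log2(math.e)                # e**z == 2**(-t), t >= 0
    k = int(t)                                # integer part: a right shift
    u = int((t - k) * (1 << FRAC_BITS))       # truncated fractional bits
    x0 = u >> (N1 + N2)
    x1 = (u >> N2) & ((1 << N1) - 1)
    x2 = u & ((1 << N2) - 1)
    return (TIV[x0][x1] + TO[x0][x2]) * 2.0 ** (-k)

def softmax_approx(xs):
    """SafeSoftmax: subtract the maximum so every exponent is <= 0."""
    m = max(xs)
    exps = [exp_approx(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# --- RMSNorm: division as log2 subtraction with a leading-one detector -----
LOG_BITS = 6             # assumed number of mantissa bits fed to the LUTs
LOG2_LUT = [math.log2(1.0 + (i + 0.5) / (1 << LOG_BITS))
            for i in range(1 << LOG_BITS)]
POW2_LUT = [2.0 ** ((i + 0.5) / (1 << LOG_BITS))
            for i in range(1 << LOG_BITS)]

def log2_lod(v):
    """Approximate log2 of a positive integer: leading-one position + LUT."""
    k = v.bit_length() - 1                    # leading-one detector
    frac = ((v << LOG_BITS) >> k) & ((1 << LOG_BITS) - 1)
    return k + LOG2_LUT[frac]

def pow2_lut(e):
    """Approximate 2**e: integer part is a shift, fraction comes from a LUT."""
    k = math.floor(e)
    idx = min(int((e - k) * (1 << LOG_BITS)), (1 << LOG_BITS) - 1)
    return POW2_LUT[idx] * 2.0 ** k

def rmsnorm_approx(xs_fixed):
    """xs_fixed: integers sharing one fixed-point format (scale cancels)."""
    sum_sq = sum(x * x for x in xs_fixed)
    if sum_sq == 0:
        return [0.0] * len(xs_fixed)
    # log2(rms) = 0.5 * (log2(sum of squares) - log2(d))
    log_rms = 0.5 * (log2_lod(sum_sq) - log2_lod(len(xs_fixed)))
    out = []
    for x in xs_fixed:
        if x == 0:
            out.append(0.0)
            continue
        mag = pow2_lut(log2_lod(abs(x)) - log_rms)   # x / rms as a subtraction
        out.append(math.copysign(mag, x))
    return out
```

In this sketch the two Softmax tables hold 64 entries each instead of the 512 a single direct table would need for the same 9-bit argument, which is the usual motivation for a bipartite split. Calling `softmax_approx([1.0, 2.0, 3.0])` or `rmsnorm_approx([64, -128, 192, 256])` and comparing against `math.exp` and `math.sqrt` references gives a feel for the approximation error at these assumed widths.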

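Main results:

  • Implemented an FPGA-based pipelined accelerator that demonstrated low operator-level latency and low power consumption.
  • Achieved a significant reduction in hardware resource usage.
  • Model accuracy was maintained despite the approximation and acceleration applied to Softmax and RMSNorm.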

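Conclusions:

  • The proposed hardware-efficient methods effectively accelerate Transformer inference by optimizing key nonlinear operators.
  • The FPGA-based accelerator offers a practical solution for deploying LLMs with a smaller resource footprint and improved performance.
  • This work highlights the potential of hardware-level optimization of nonlinear operators for advancing LLM applications.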