EmbBERT: Attention under 2 MB memory.

Riccardo Bravin¹, Massimo Pavan¹, Hazem Hesham Yousef Shalby¹

¹Department of Electronics, Information and Bioengineering, Politecnico di Milano, Via Ponzio 34/5, Milano, 20133, Italy.

Neural Networks : the Official Journal of the International Neural Network Society

March 12, 2026

Summary

This summary is machine-generated.

EmbBERT is a tiny language model (TLM) designed for deployment on memory-constrained devices. This compact transformer model needs only 2 MB of memory, reaches accuracy comparable to state-of-the-art models that require 10x more memory, and outperforms similarly sized BERT and MAMBA variants.

Keywords:

Efficient deep learning
Hardware acceleration
Language models
Model compression
Natural language processing
Tiny machine learning

Area of Science:

Artificial Intelligence
Natural Language Processing
Edge Computing

Background:

Transformer architectures, while powerful for Natural Language Processing (NLP), have significant memory and computational demands.
Deploying advanced NLP models on ultra-constrained devices such as wearables and IoT units is challenging because these devices offer only a few megabytes of memory.

Purpose of the Study:

To introduce EmbBERT, a tiny language model (TLM) architecturally optimized for extreme efficiency on edge devices.
To demonstrate that simplified transformer architectures can maintain high performance under strict memory constraints.

Main Methods:

Designed EmbBERT with a compact embedding layer, streamlined feed-forward blocks, and an efficient attention mechanism.
Evaluated EmbBERT on the TinyNLP benchmark and GLUE suite, comparing its performance against larger state-of-the-art (SotA) models and similarly sized BERT and MAMBA variants.
Assessed the model's resilience to 8-bit quantization and its scalability across memory budgets ranging from under one megabyte to tens of megabytes.
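
A back-of-the-envelope estimate shows why the embedding layer is a natural target when the memory budget is a couple of megabytes. The sketch below is illustrative only: the vocabulary size, the embedding widths, the 32-bit storage, and the reading of 2 MB as 2,000,000 bytes are all assumptions rather than values from the paper, and the helper names are hypothetical.

```python
def tensor_bytes(num_params: int, bits_per_param: int = 32) -> int:
    """Bytes needed to store num_params values at the given bit width."""
    return (num_params * bits_per_param + 7) // 8  # round up to whole bytes


def embedding_bytes(vocab_size: int, embed_dim: int, bits_per_param: int = 32) -> int:
    """Size of a dense embedding table with vocab_size rows of embed_dim values."""
    return tensor_bytes(vocab_size * embed_dim, bits_per_param)


def fits(budget_bytes: int, *component_bytes: int) -> bool:
    """True if the listed components together stay within the budget."""
    return sum(component_bytes) <= budget_bytes


# Assumption: the 2 MB budget is read as 2,000,000 bytes.
BUDGET = 2 * 1000 * 1000

# Hypothetical configurations, not taken from the paper.
wide = embedding_bytes(vocab_size=8192, embed_dim=128)   # 4,194,304 bytes
narrow = embedding_bytes(vocab_size=8192, embed_dim=32)  # 1,048,576 bytes

print(wide, fits(BUDGET, wide))
print(narrow, fits(BUDGET, narrow))
```

Under these assumed numbers, the wider table alone is more than twice the whole budget, while the narrower one leaves roughly half of it for the remaining layers and activations.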

Main Results:

EmbBERT requires only 2 MB of memory while achieving accuracy comparable to SotA models that require 10x more memory.
Outperformed downsized BERT and MAMBA models of similar size on NLP tasks.
Demonstrated resilience to 8-bit quantization, reducing memory footprint to 781 kB.
Showcased scalability of the EmbBERT architecture across various memory constraints.
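
To illustrate what 8-bit quantization does to stored weights, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is a generic textbook scheme chosen for illustration, not necessarily the scheme used for EmbBERT; the function names and example weights are hypothetical, and the sketch does not attempt to reproduce the reported 781 kB figure.

```python
from array import array


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return [v * scale for v in q]


# Hypothetical weights, for illustration only.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Storage comparison: one signed byte per weight versus a 32-bit float.
int8_size = len(array("b", q).tobytes())
float32_size = len(array("f", weights).tobytes())

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(int8_size, float32_size, max_error <= scale / 2 + 1e-12)
```

In this scheme each weight shrinks from four bytes to one, at the cost of a rounding error of at most half a quantization step; the one extra scale value per tensor is negligible for large tensors.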

Conclusions:

Highly simplified transformer architectures are effective for edge NLP tasks under tight resource constraints.
EmbBERT offers a viable solution for deploying advanced NLP capabilities on memory-limited edge devices.
The proposed architecture and pre-training strategy contribute to efficient and accurate edge AI.