


Updated: Mar 27, 2026


Discrete Tokenization for Multimodal LLMs: A Comprehensive Survey.

Jindong Li, Yali Fu, Jiahong Liu

IEEE Transactions on Pattern Analysis and Machine Intelligence

March 24, 2026

Summary

This summary is machine-generated.

This survey presents the first structured taxonomy of discrete tokenization methods for large language models (LLMs), with a focus on vector quantization (VQ). It analyzes VQ variants and their effect on multimodal LLM performance, and it discusses key challenges and future directions.


Area of Science:

Artificial Intelligence
Machine Learning
Natural Language Processing

Background:

Large language models (LLMs) require discrete data representations for efficient processing.
Vector quantization (VQ) is a key technique for transforming continuous multimodal data into discrete tokens.
Existing literature lacks a systematic survey of VQ methods tailored for LLM integration.
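As an illustration of the idea behind vector quantization, the sketch below maps each continuous feature vector to the index of its nearest codebook entry, which is the discrete token an LLM could then consume. This is a minimal sketch of plain nearest-neighbor VQ, not any specific variant from the survey; the function names, the toy codebook, and the feature values are all assumptions made for this example.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vectors: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each vector, the index of the nearest codebook entry."""
    tokens = []
    for v in vectors:
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(v, codebook[k]))
        tokens.append(nearest)
    return tokens


def dequantize(tokens: Sequence[int],
               codebook: Sequence[Sequence[float]]) -> List[List[float]]:
    """Look up the codebook vector for each token (lossy reconstruction)."""
    return [list(codebook[t]) for t in tokens]


# Hypothetical 3-entry codebook and 4 continuous feature vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
features = [[0.1, -0.2], [0.9, 1.2], [-0.8, 0.7], [0.6, 0.6]]

tokens = quantize(features, codebook)
print(tokens)  # [0, 1, 2, 1]
```

In practice the codebook is learned rather than fixed by hand, and the lookup runs over batched tensors; the list-based version here only shows the continuous-to-discrete mapping.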

Purpose of the Study:

To present the first structured taxonomy and analysis of discrete tokenization methods, specifically VQ, for LLM applications.
To systematically categorize and analyze representative VQ variants within the context of LLM pipelines.
To bridge the gap between VQ techniques and modern LLM development for multimodal systems.

Main Methods:

Categorization of 8 representative VQ variants, spanning classical and modern approaches.
Analysis of algorithmic principles, training dynamics, and integration challenges of VQ methods with LLMs.
Review of existing research across classical, single-modality, and multimodal LLM systems.

Main Results:

Identification of how quantization strategies influence alignment, reasoning, and generation in multimodal LLMs.
Identification of key challenges, such as codebook collapse and unstable gradient estimation.
Discussion of emerging research directions including dynamic quantization and unified tokenization frameworks.
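Codebook collapse, one of the challenges listed above, refers to a situation where only a small part of the codebook is ever selected. One common way to monitor it is to track the fraction of codes in use and the perplexity of the token distribution. The sketch below is an assumed diagnostic for illustration, not a procedure taken from the survey; the function name and the sample token streams are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def codebook_usage(tokens: Sequence[int], codebook_size: int) -> Tuple[float, float]:
    """Return (fraction of codes used, perplexity of the code distribution).

    Perplexity is exp(entropy) of the empirical token frequencies. It equals
    the codebook size when usage is uniform and 1.0 when a single code
    absorbs every input.
    """
    if not tokens:
        raise ValueError("tokens must be non-empty")
    counts = Counter(tokens)
    total = len(tokens)
    used_fraction = len(counts) / codebook_size
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return used_fraction, math.exp(entropy)


# Hypothetical token streams for a codebook with 4 entries.
healthy = [0, 1, 2, 3] * 4   # every code used equally often
collapsed = [0] * 16         # one code absorbs every input

healthy_stats = codebook_usage(healthy, codebook_size=4)
collapsed_stats = codebook_usage(collapsed, codebook_size=4)
print(healthy_stats, collapsed_stats)
```

A perplexity far below the codebook size suggests that much of the codebook is going unused, which is the symptom that collapse-mitigation techniques aim to remove.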

Conclusions:

This survey provides a foundational reference for developing efficient and generalizable multimodal LLM systems.
The structured analysis of VQ techniques addresses a critical need in the rapidly advancing field of LLMs.
Understanding VQ is crucial for optimizing the performance of LLM-based multimodal applications.