



How Far Are We from Generating Missing Modalities with Foundation Models?

Guanzhou Ke, Bo Wang, Guoqing Chao

IEEE Transactions on Pattern Analysis and Machine Intelligence

May 25, 2026


Summary

This summary is machine-generated.

Current multimodal foundation models struggle to reconstruct missing modalities. A new agentic framework improves semantic extraction and validates generated outputs, yielding more accurate reconstructions.


Area of Science:

Artificial Intelligence
Machine Learning
Computer Vision

Background:

Multimodal foundation models show promise but are underexplored for missing data reconstruction.
Current models face challenges in semantic extraction and generation validation.

Purpose of the Study:

To evaluate existing paradigms for missing modality reconstruction.
To propose a novel agentic framework to enhance reconstruction accuracy and adaptability.

Main Methods:

Formalized three paradigms for missing modality reconstruction.
Evaluated 42 model variants on reconstruction accuracy and downstream task performance.
Developed an agentic framework with modality-aware mining and self-refinement mechanisms.
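
The mining and self-refinement steps named above can be illustrated with a toy generate, validate, and retry loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: the summary does not describe the framework's interfaces, so every name here (mine_cues, self_refine, toy_generate, toy_validate) is hypothetical, and the "model" is a stand-in function rather than a real foundation model.

```python
from typing import Callable, List, Tuple


def mine_cues(available_text: str, vocabulary: List[str]) -> List[str]:
    """Hypothetical 'mining' step: keep vocabulary terms present in the available modality."""
    words = set(available_text.lower().split())
    return [term for term in vocabulary if term in words]


def self_refine(
    cues: List[str],
    generate: Callable[[List[str], str], str],
    validate: Callable[[str, List[str]], Tuple[float, str]],
    threshold: float = 1.0,
    max_rounds: int = 3,
) -> Tuple[str, float, int]:
    """Generate a candidate, score it, and retry with feedback until it passes."""
    feedback = ""
    best, best_score, rounds_used = "", float("-inf"), 0
    for round_idx in range(1, max_rounds + 1):
        rounds_used = round_idx
        candidate = generate(cues, feedback)
        score, feedback = validate(candidate, cues)
        if score > best_score:
            best, best_score = candidate, score
        if score >= threshold:
            break  # candidate passed validation
    return best, best_score, rounds_used


def toy_generate(cues: List[str], feedback: str) -> str:
    # Stand-in for a generator: the first attempt drops all but the first cue.
    if not feedback:
        return "a photo of " + cues[0]
    return "a photo of " + " ".join(cues)


def toy_validate(candidate: str, cues: List[str]) -> Tuple[float, str]:
    # Stand-in validator: fraction of mined cues that appear in the candidate.
    missing = [c for c in cues if c not in candidate]
    score = 1 - len(missing) / len(cues)
    message = "missing: " + ", ".join(missing) if missing else ""
    return score, message


if __name__ == "__main__":
    cues = mine_cues("A red car parked near a tree", ["car", "tree", "dog"])
    print(self_refine(cues, toy_generate, toy_validate))
```

In this toy run the first candidate omits one mined cue, the validator reports it, and the second round passes. The sketch assumes a non-empty cue list; a real system would replace the stand-in functions with model calls and a learned or rule-based validator.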

Main Results:

Identified limitations in semantic extraction and validation in current foundation models.
The proposed agentic framework reduced FID (Fréchet Inception Distance) for image reconstruction by at least 14% and MER for text reconstruction by at least 10%.
Demonstrated improved fine-grained semantic feature extraction and robust generation validation.
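
To make the percentage figures above concrete, the following sketch shows how a reduction in a lower-is-better metric such as FID or MER could be checked against a stated threshold. It assumes the reported percentages are relative reductions against a baseline, which the summary does not state explicitly. The function names and the numbers in the usage lines are hypothetical illustrations, not values from the paper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Percent reduction of a lower-is-better metric relative to its baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100 * (baseline - improved) / baseline


def meets_claim(baseline: float, improved: float, min_percent: float) -> bool:
    """True if the reduction is at least min_percent of the baseline."""
    return relative_reduction(baseline, improved) >= min_percent


if __name__ == "__main__":
    # Made-up scores for illustration only (not results from the paper).
    print(relative_reduction(50.0, 43.0))
    print(meets_claim(50.0, 43.0, 14.0))
    print(meets_claim(50.0, 45.0, 14.0))
```

Under this reading, a drop from a baseline score of 50.0 to 43.0 would just meet a 14% threshold, while a drop to 45.0 would not.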

Conclusions:

Current foundation models require enhancements for effective missing modality reconstruction.
The proposed agentic framework offers a promising solution for accurate and robust multimodal data generation.
Future work should explore further refinements of agentic approaches in multimodal AI.