How Far Are We from Generating Missing Modalities with Foundation Models?

Guanzhou Ke, Bo Wang, Guoqing Chao

    IEEE Transactions on Pattern Analysis and Machine Intelligence
    |May 25, 2026

    Summary
    This summary is machine-generated.

    Multimodal foundation models struggle with reconstructing missing data. A new agentic framework improves semantic extraction and generation quality for more accurate multimodal AI.

    Area of Science:

    • Artificial Intelligence
    • Machine Learning
    • Computer Vision

    Background:

    • Multimodal foundation models show promise but are underexplored for missing data reconstruction.
    • Current models face challenges in semantic extraction and generation validation.

    Purpose of the Study:

    • To evaluate existing paradigms for missing modality reconstruction.
    • To propose a novel agentic framework to enhance reconstruction accuracy and adaptability.

    Main Methods:

    • Formalized three paradigms for missing modality reconstruction.
    • Evaluated 42 model variants on reconstruction accuracy and downstream task performance.
    • Developed an agentic framework with modality-aware mining and self-refinement mechanisms.
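
The summary does not describe how the self-refinement mechanism is implemented. As a rough illustration only, a generate-validate-refine loop could be sketched as below. Every name here (`self_refine`, `toy_generate`, `toy_validate`, the `threshold` and `max_rounds` parameters) is hypothetical, and the toy generator and validator are stand-ins for foundation-model calls, not the authors' components.

```python
from typing import Callable, Tuple

# Hypothetical interfaces: a generator takes (context, feedback) and returns a
# candidate; a validator takes (context, candidate) and returns (score, feedback).
Generator = Callable[[str, str], str]
Validator = Callable[[str, str], Tuple[float, str]]


def self_refine(
    generate: Generator,
    validate: Validator,
    context: str,
    max_rounds: int = 3,
    threshold: float = 0.8,
) -> Tuple[str, float]:
    """Regenerate a candidate until the validator accepts it or rounds run out."""
    feedback = ""
    best, best_score = "", float("-inf")
    for _ in range(max_rounds):
        candidate = generate(context, feedback)
        score, feedback = validate(context, candidate)
        if score > best_score:
            best, best_score = candidate, score
        if score >= threshold:
            break  # validator is satisfied, stop refining
    return best, best_score


def toy_generate(context: str, feedback: str) -> str:
    # Stand-in for a generative model: starts generic, then adds whatever
    # the validator reported as missing.
    base = "a photo"
    return base if not feedback else f"{base} of {feedback}"


def toy_validate(context: str, candidate: str) -> Tuple[float, str]:
    # Stand-in for a validation step: the score is the fraction of context
    # keywords present in the candidate; the feedback lists the missing ones.
    keywords = context.split()
    words = candidate.split()
    missing = [w for w in keywords if w not in words]
    return 1 - len(missing) / len(keywords), " ".join(missing)


result, score = self_refine(toy_generate, toy_validate, "red bicycle")
print(result, score)
```

In this toy run the first candidate misses both keywords, the validator's feedback is passed into the second round, and the loop stops once the score reaches the threshold.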

    Main Results:

    • Identified limitations in semantic extraction and validation in current foundation models.
    • The proposed agentic framework reduced FID (Fréchet Inception Distance) for image reconstruction by at least 14% and MER for text reconstruction by at least 10%.
    • Demonstrated improved fine-grained semantic feature extraction and robust generation validation.
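
The reported gains read as relative reductions in metrics where lower is better. A minimal sketch of that arithmetic follows, assuming the percentages are measured against a baseline score. The function name `relative_reduction` and the numeric values are hypothetical illustrations, not figures from the paper.

```python
import math


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100 * (baseline - improved) / baseline


# Hypothetical scores chosen only to illustrate the calculation.
fid_drop = relative_reduction(baseline=50.0, improved=43.0)
mer_drop = relative_reduction(baseline=0.20, improved=0.18)
print(f"FID reduction: {fid_drop:.1f}%")
print(f"MER reduction: {mer_drop:.1f}%")
```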

    Conclusions:

    • Current foundation models require enhancements for effective missing modality reconstruction.
    • The proposed agentic framework offers a promising solution for accurate and robust multimodal data generation.
    • Future work should explore further refinements of agentic approaches in multimodal AI.