Leveraging a Vision-Language Model with Natural Text Supervision for MRI Retrieval, Captioning, Classification, and Visual Question Answering.

Nikhil J Dhinagar, Sophia I Thomopoulos, Paul M Thompson

    bioRxiv: the Preprint Server for Biology | March 3, 2025

    Summary
    This summary is machine-generated.

    This study introduces a novel framework for brain MRI analysis using natural language supervision, enabling versatile tasks like retrieval and question answering for Alzheimer's disease research.

    Area of Science:

    • Artificial Intelligence
    • Medical Imaging
    • Neuroscience

    Background:

    • Large multimodal models are widely used but raise concerns about data quality, domain relevance, and privacy in medical applications.
    • Current deep learning models in radiology are often task-specific and lack natural language interaction capabilities.

    Purpose of the Study:

    • To develop a versatile framework for learning visual brain MRI concepts using natural language supervision.
    • To enable multiple downstream tasks including MRI retrieval, captioning, classification, and visual question answering.
    • To investigate the identification of factors affecting Alzheimer's disease (AD) in brain MRIs.

    Main Methods:

    • Utilized vector retrieval and contrastive learning for efficient concept learning.
    • Employed self-supervised learning to pre-train separate text and image encoders.
    • Jointly fine-tuned encoders to create a shared embedding space for cross-modal learning.
    • Developed a retrieval and re-ranking mechanism with a transformer decoder for visual question answering.
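
The methods listed above can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the summary does not specify the encoders, the exact loss, or the re-ranker. The sketch assumes a CLIP-style symmetric contrastive (InfoNCE) loss over paired image and text embeddings, followed by cosine-similarity retrieval with an optional second-stage re-ranking function. The names `contrastive_loss`, `retrieve`, `rerank`, the temperature value, and the toy three-dimensional embeddings are all hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def log_softmax_at(row, idx):
    """Log-probability of entry idx under a softmax over row (numerically stable)."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return row[idx] - log_sum

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss: the i-th image and i-th text form the positive pair.

    Assumption: this mirrors the common CLIP-style objective, not necessarily
    the loss used in the paper.
    """
    n = len(image_embs)
    logits = [[cosine(i, t) / temperature for t in text_embs] for i in image_embs]
    # Image-to-text direction: each row should peak on its diagonal entry.
    img_to_txt = -sum(log_softmax_at(logits[k], k) for k in range(n)) / n
    # Text-to-image direction: same check over the columns.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    txt_to_img = -sum(log_softmax_at(columns[k], k) for k in range(n)) / n
    return (img_to_txt + txt_to_img) / 2

def retrieve(query_emb, index, k=2, rerank=None):
    """Return ids of the top-k indexed items by cosine similarity.

    `index` maps an item id to its embedding. `rerank` is a hypothetical
    second-stage scorer applied only to the k shortlisted ids.
    """
    ranked = sorted(index, key=lambda item: cosine(query_emb, index[item]),
                    reverse=True)[:k]
    if rerank is not None:
        ranked = sorted(ranked, key=rerank, reverse=True)
    return ranked

# Toy embeddings (made up): row i of each list describes the same scan.
image_embs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2]]
text_embs = [[0.9, 0.0, 0.1], [0.1, 0.8, 0.3]]

aligned = contrastive_loss(image_embs, text_embs)
shuffled = contrastive_loss(image_embs, text_embs[::-1])

# Text-to-image retrieval in the shared space.
index = {"scan_a": image_embs[0], "scan_b": image_embs[1]}
top_hit = retrieve(text_embs[0], index, k=1)
```

With correctly paired embeddings the loss is lower than with the pairs shuffled, which is the signal joint fine-tuning would optimize. The retrieval step only shortlists candidates by similarity, so a heavier model (for example a transformer decoder) can be limited to the shortlist in the re-ranking stage.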

    Main Results:

    • The framework successfully learns to identify factors influencing Alzheimer's disease (AD) through joint embedding and natural language supervision.
    • Demonstrated the model's capability to perform multiple tasks: MRI retrieval, captioning, and classification.
    • Showcased versatility through a retrieval/re-ranking mechanism and transformer decoder for visual question answering.

    Conclusions:

    • The proposed framework offers a general and versatile tool for radiologic research by integrating medical imaging with text.
    • Enables diagnostic and prognostic assessments in AD research and assists clinicians by detecting radiologic features described in text.
    • Represents a novel approach to radiologic research, enhancing the utility of multimodal models in healthcare.