Multi-view Chest X-Ray Vision-Language Pre-training via Semantic-Aware Masked Language Modeling and High-order

Lihong Qiao, Jingya Gong, Yucheng Shu

IEEE Transactions on Medical Imaging

June 5, 2026

Summary

This summary is machine-generated.

This study introduces a novel vision-language pretraining (VLP) framework for chest X-rays that improves multi-view analysis and mitigates false negatives in image-text alignment.

Area of Science:

Medical Imaging
Artificial Intelligence
Computer Vision

Background:

Chest X-Ray Vision-Language Pretraining (VLP) shows promise for medical image diagnosis by learning joint image-text representations.
Existing VLP methods often neglect the multi-view nature of chest X-rays and may suffer from ineffective feature fusion and alignment issues.
Random cross-modal Masked Language Modeling (MLM) and global alignment can hinder representation learning and introduce false negatives.
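
The false-negative problem mentioned above can be illustrated with a small sketch. In global contrastive alignment, only the matched image-report pair in a batch is usually treated as a positive, so two studies with the same finding end up as negatives for each other. The code below is an illustration under assumptions, not the authors' implementation: the function names and the one-label-per-study setup are hypothetical.

```python
def contrastive_targets(batch_size):
    """Global alignment targets: only the matched pair (i, i) is a positive."""
    return [[i == j for j in range(batch_size)] for i in range(batch_size)]


def false_negative_pairs(finding_labels):
    """Return pairs (i, j), i < j, that share a label but are treated as negatives."""
    n = len(finding_labels)
    targets = contrastive_targets(n)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if finding_labels[i] == finding_labels[j] and not targets[i][j]
    ]


# Hypothetical batch of four studies, each labeled by its main reported finding.
labels = ["effusion", "normal", "effusion", "pneumonia"]
pairs = false_negative_pairs(labels)
print(pairs)  # [(0, 2)]
```

Studies 0 and 2 describe the same finding, yet the identity targets push their representations apart. This is the kind of pair that a semantics-aware alignment scheme would need to handle differently.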

Purpose of the Study:

To propose a novel medical VLP framework that addresses limitations in existing multi-view approaches.
To enhance representation learning by incorporating key semantics, view-specific features, and higher-order semantic alignment.
To improve the accuracy and robustness of VLP models for chest X-ray analysis.

Main Methods:

A Key Semantics-enhanced Multi-view MLM module aggregates pathology-relevant patches across views for semantically rich supervision.
A Frontal-Lateral Alignment module extracts view-specific pathological features to ensure consistency and preserve information.
A High-order Semantic Alignment approach mitigates false negatives by aligning features with semantically consistent clusters.
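
Two of the ideas above can be sketched in simplified form. The summary does not give implementation details, so everything here is an assumption made for illustration: the keyword list `KEY_TERMS`, the text-only masking (the actual module also aggregates image patches across views), and the precomputed cluster IDs are all hypothetical. The first part prefers pathology-related tokens over random ones when masking a report. The second part builds soft alignment targets in which every sample in the same semantic cluster counts as a positive.

```python
import random

# Hypothetical keyword list; how key semantics are actually identified is not
# described in the summary.
KEY_TERMS = {"effusion", "pneumonia", "cardiomegaly", "opacity", "pneumothorax"}


def choose_mask_positions(tokens, num_masks, rng):
    """Pick positions to mask, taking key-term tokens before any others."""
    key = [i for i, t in enumerate(tokens) if t.lower() in KEY_TERMS]
    other = [i for i in range(len(tokens)) if i not in key]
    rng.shuffle(key)
    rng.shuffle(other)
    return sorted((key + other)[:num_masks])


def apply_mask(tokens, positions, mask_token="[MASK]"):
    """Replace the tokens at the chosen positions with the mask token."""
    chosen = set(positions)
    return [mask_token if i in chosen else t for i, t in enumerate(tokens)]


def cluster_targets(cluster_ids):
    """Row-normalized soft targets: same-cluster samples are all positives."""
    n = len(cluster_ids)
    rows = []
    for i in range(n):
        same = [1.0 if cluster_ids[i] == cluster_ids[j] else 0.0 for j in range(n)]
        total = sum(same)  # at least 1.0, since a sample matches itself
        rows.append([v / total for v in same])
    return rows


rng = random.Random(0)
report = "small left pleural effusion with basilar opacity".split()
positions = choose_mask_positions(report, num_masks=2, rng=rng)
masked = apply_mask(report, positions)
print(" ".join(masked))  # small left pleural [MASK] with basilar [MASK]

# Hypothetical cluster assignment for a batch of four studies.
targets = cluster_targets([0, 1, 0, 2])
print(targets[0])  # [0.5, 0.0, 0.5, 0.0]
```

With cluster-level targets, studies 0 and 2 share the positive mass instead of being pushed apart, which is one plausible way to reduce false negatives. The targets could be used in a cross-entropy loss over image-text similarity scores, though whether the paper does this is not stated in the summary.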

Main Results:

In experiments on seven public datasets covering four downstream tasks, the proposed framework outperforms state-of-the-art methods.
The method effectively handles multi-view information and improves representation alignment.

Conclusions:

The novel VLP framework significantly advances chest X-ray analysis by effectively leveraging multi-view information and semantic alignment.
The proposed approach offers a more robust and accurate solution for medical image diagnosis.
The framework's components collectively enhance the quality of learned representations for improved diagnostic capabilities.