Learned representation-guided diffusion models for large-image generation.

Alexandros Graikos¹, Srikar Yellapragada¹, Minh-Quan Le¹

¹Stony Brook University.

Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition

November 28, 2024

Summary

This summary is machine-generated.

This study introduces a novel method using self-supervised learning (SSL) embeddings to guide diffusion models for generating high-quality histopathology and satellite images. This approach bypasses the need for extensive manual annotations, improving image synthesis and downstream classification tasks.

Area of Science:

Computer Vision
Machine Learning
Medical Imaging
Remote Sensing

Background:

Diffusion models typically require auxiliary conditioning data, such as labels, to synthesize high-fidelity samples, but collecting the necessary annotations is often impractical in specialized domains such as histopathology and satellite imagery.
Self-supervised learning (SSL) representations capture rich semantic and visual information, potentially serving as proxies for fine-grained human labels.

Purpose of the Study:

To develop a novel approach for training diffusion models conditioned on SSL embeddings.
To demonstrate the efficacy of SSL embeddings as substitutes for manual annotations in image generation.
To enable the synthesis of large, spatially consistent images and explore text-to-image generation.
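
Conditioning on an SSL embedding can be written down as follows. This is a sketch that assumes the standard noise-prediction objective for conditional diffusion models; the summary does not state the exact loss, and all symbols are notation chosen here for illustration. Here x_0 is a training image patch, f_SSL is a frozen self-supervised encoder whose output replaces a human label, \bar{\alpha}_t is the cumulative noise schedule, and \epsilon_\theta is the denoising network.

```latex
x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

\mathcal{L}(\theta) = \mathbb{E}_{x_0,\, \epsilon,\, t}
\left[ \left\| \epsilon - \epsilon_\theta\!\left(x_t,\, t,\, f_{\mathrm{SSL}}(x_0)\right) \right\|_2^2 \right]
```

Under this reading, the only change from label-conditioned training is that the conditioning input is computed from the image itself, so no manual annotation is needed.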

Main Methods:

Trained diffusion models using embeddings from self-supervised learning (SSL) models.
Generated high-quality histopathology and remote sensing images from SSL features.
Assembled spatially consistent patches from SSL embeddings to construct larger images, preserving long-range dependencies.
Demonstrated text-to-large image synthesis by generating pathology and satellite images from text descriptions.
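
The patch-assembly step above can be illustrated with a toy sketch. It assumes a grid of embeddings, one per patch location, and a hypothetical `generate_patch` function standing in for a trained conditional diffusion sampler. Averaging the overlapping regions is an assumption made here to keep neighbouring patches consistent; it is not necessarily how the authors combine patches. The function names, patch size, and stride are all illustrative.

```python
from statistics import fmean


def generate_patch(embedding, size):
    """Hypothetical stand-in for a conditional diffusion sampler.

    A real sampler would denoise random noise while conditioning on the
    embedding. Here the patch is filled with the embedding mean so the
    example runs without a trained model.
    """
    value = fmean(embedding)
    return [[value] * size for _ in range(size)]


def assemble_large_image(embedding_grid, patch_size=8, stride=6):
    """Place one generated patch per grid cell and average the overlaps.

    Requires stride <= patch_size so that every pixel is covered.
    """
    rows, cols = len(embedding_grid), len(embedding_grid[0])
    height = (rows - 1) * stride + patch_size
    width = (cols - 1) * stride + patch_size
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for r in range(rows):
        for c in range(cols):
            patch = generate_patch(embedding_grid[r][c], patch_size)
            for i in range(patch_size):
                for j in range(patch_size):
                    y, x = r * stride + i, c * stride + j
                    total[y][x] += patch[i][j]
                    count[y][x] += 1
    # Dividing by the coverage count averages pixels shared by several patches.
    return [[total[y][x] / count[y][x] for x in range(width)]
            for y in range(height)]


# A 3x4 grid of 4-dimensional toy embeddings; cell (r, c) holds the value r + c.
grid = [[[float(r + c)] * 4 for c in range(4)] for r in range(3)]
large = assemble_large_image(grid)
print(len(large), len(large[0]))  # prints: 20 26
```

Because neighbouring patches share a two-pixel border in this configuration, values in the overlap are a blend of both patches, which is the simplest way to avoid visible seams in a sketch like this.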

Main Results:

Diffusion models conditioned on SSL embeddings successfully generated high-fidelity histopathology and remote sensing images.
Generated images augmented real data, improving downstream classifier accuracy for both patch-level and image-scale classification.
The models exhibited robustness and generalizability, performing well on unseen datasets.
The method also synthesized large images from text descriptions, demonstrating a new text-to-large-image paradigm.

Conclusions:

SSL representations are effective proxies for fine-grained labels, enabling diffusion model training without manual annotations.
The proposed method enhances image synthesis quality and improves downstream classification performance.
The approach is generalizable and robust, with potential applications in various domains requiring high-fidelity image generation.