Learned representation-guided diffusion models for large-image generation.

Alexandros Graikos¹, Srikar Yellapragada¹, Minh-Quan Le¹

  • ¹Stony Brook University.

Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition | November 28, 2024
Summary
This summary is machine-generated.

This study conditions diffusion models on self-supervised learning (SSL) embeddings to generate high-quality histopathology and satellite images. Because the embeddings stand in for manual annotations, the approach avoids extensive labeling while improving both image synthesis and downstream classification.

Area of Science:

  • Computer Vision
  • Machine Learning
  • Medical Imaging
  • Remote Sensing

Background:

  • Diffusion models typically need auxiliary conditioning data to synthesize high-fidelity samples, and collecting that data is often impractical in specialized domains because of the annotation effort involved.
  • Self-supervised learning (SSL) representations capture rich semantic and visual information, potentially serving as proxies for fine-grained human labels.

Purpose of the Study:

  • To develop a novel approach for training diffusion models conditioned on SSL embeddings.
  • To demonstrate the efficacy of SSL embeddings as substitutes for manual annotations in image generation.
  • To enable the synthesis of large, spatially consistent images and explore text-to-image generation.
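The first aim, training a diffusion model conditioned on SSL embeddings, can be illustrated with a minimal sketch of how one training example might be built. It assumes a standard DDPM-style noise-prediction objective with a linear beta schedule, which this summary does not confirm; `linear_alpha_bars` and `make_training_example` are hypothetical names, and the patch is a flat list of floats rather than a real image.

```python
import math
import random


def linear_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def make_training_example(patch, ssl_embedding, alpha_bars, rng):
    """Build one (inputs, target) pair for a conditional noise-prediction loss.

    The SSL embedding of the clean patch takes the place a class label
    or caption would normally occupy as the conditioning signal.
    """
    t = rng.randrange(len(alpha_bars))
    noise = [rng.gauss(0.0, 1.0) for _ in patch]
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    # Forward diffusion: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps
    noisy = [signal_scale * x + noise_scale * e for x, e in zip(patch, noise)]
    inputs = {
        "noisy_patch": noisy,
        "timestep": t,
        "condition": list(ssl_embedding),
    }
    # A denoiser would be trained to predict `noise` from `inputs`.
    return inputs, noise
```

In a real pipeline the embedding would come from a pretrained SSL encoder applied to the clean patch, and a neural denoiser would consume `inputs`; both are left out here to keep the sketch self-contained.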

Main Methods:

  • Trained diffusion models using embeddings from self-supervised learning (SSL) models.
  • Generated high-quality histopathology and remote sensing images from SSL features.
  • Assembled spatially consistent patches from SSL embeddings to construct larger images, preserving long-range dependencies.
  • Demonstrated text-to-large image synthesis by generating pathology and satellite images from text descriptions.
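The patch-assembly step above can be sketched as follows. The summary does not say how spatial consistency is enforced, so this example assumes a simple stand-in: each grid location has its own SSL embedding, patches are generated with overlap, and overlapping pixels are averaged. `stitch_patches` and `toy_generate_patch` are hypothetical names, and the toy generator merely fills a patch with the mean of its embedding in place of a diffusion sampler.

```python
def toy_generate_patch(embedding, patch_size):
    """Stand-in for a conditional sampler: a constant patch per embedding."""
    value = sum(embedding) / len(embedding)
    return [[value] * patch_size for _ in range(patch_size)]


def stitch_patches(embedding_grid, patch_size, stride, generate_patch):
    """Assemble a large image from overlapping, embedding-conditioned patches.

    embedding_grid[r][c] is the conditioning vector for the patch whose
    top-left corner sits at (r * stride, c * stride). Overlaps are averaged.
    """
    if not 0 < stride <= patch_size:
        raise ValueError("stride must be in (0, patch_size] to cover the canvas")
    rows, cols = len(embedding_grid), len(embedding_grid[0])
    height = (rows - 1) * stride + patch_size
    width = (cols - 1) * stride + patch_size
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for r in range(rows):
        for c in range(cols):
            patch = generate_patch(embedding_grid[r][c], patch_size)
            top, left = r * stride, c * stride
            for i in range(patch_size):
                for j in range(patch_size):
                    total[top + i][left + j] += patch[i][j]
                    count[top + i][left + j] += 1
    # Every pixel is covered at least once because stride <= patch_size.
    return [[total[y][x] / count[y][x] for x in range(width)]
            for y in range(height)]
```

Because neighboring patches share pixels, a smoothly varying embedding grid yields a canvas without hard seams in this toy setting; how the actual method preserves long-range dependencies may differ.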

Main Results:

  • Diffusion models conditioned on SSL embeddings successfully generated high-fidelity histopathology and remote sensing images.
  • Generated images augmented real data, improving downstream classifier accuracy for both patch-level and image-scale classification.
  • The models exhibited robustness and generalizability, performing well on unseen datasets.
  • Successfully synthesized large images from text descriptions, showcasing a new text-to-large image paradigm.
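The augmentation result above can be pictured with a small sketch. It assumes, without confirmation from this summary, that each synthetic image is generated from the SSL embedding of a real labeled image and inherits that image's label. `augment_with_synthetic`, `toy_embed`, and `toy_generate` are hypothetical stand-ins for a real encoder and a trained conditional diffusion sampler.

```python
import random


def toy_embed(image):
    """Stand-in SSL encoder: a one-dimensional embedding (the pixel mean)."""
    return [sum(image) / len(image)]


def toy_generate(embedding, seed):
    """Stand-in sampler: small random variations around the embedding value."""
    rng = random.Random(seed)
    return [embedding[0] + rng.gauss(0.0, 0.1) for _ in range(2)]


def augment_with_synthetic(real_samples, embed, generate, per_sample=1):
    """Return real samples plus synthetic ones that inherit the source label.

    Each item is (image, label, origin) so that downstream code can tell
    real and synthetic samples apart.
    """
    augmented = [(image, label, "real") for image, label in real_samples]
    for image, label in real_samples:
        embedding = embed(image)
        for k in range(per_sample):
            synthetic = generate(embedding, seed=k)
            augmented.append((synthetic, label, "synthetic"))
    return augmented
```

Keeping the `origin` tag makes it straightforward to evaluate a classifier on real data only while training on the mixed set.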

Conclusions:

  • SSL representations are effective proxies for fine-grained labels, enabling diffusion model training without manual annotations.
  • The proposed method enhances image synthesis quality and improves downstream classification performance.
  • The approach is generalizable and robust, with potential applications in various domains requiring high-fidelity image generation.