
Updated: May 7, 2026

Leveraging Vision Transformers in Multimodal Models for Retinal OCT Analysis.

Georgios Feretzakis¹, Christina Karakosta², Aris Gkoulalas-Divanis³

¹School of Science and Technology, Hellenic Open University, Patras, Greece.

Studies in Health Technology and Informatics | May 17, 2025

Summary

This summary is machine-generated.

Deep learning models, including Vision Transformers (ViTs), accurately classify retinal diseases from Optical Coherence Tomography (OCT) images. Integrating patient metadata with imaging data enhances diagnostic performance for conditions like AMD and DME.

Keywords:

Machine Learning; Multimodal Deep Learning; Optical Coherence Tomography; Retinal Disease Classification; Vision Transformers

Area of Science:

Ophthalmology
Medical Imaging
Artificial Intelligence

Background:

Optical Coherence Tomography (OCT) provides high-resolution retinal imaging crucial for diagnosing eye diseases.
Accurate classification of OCT images aids in identifying conditions such as Age-related Macular Degeneration (AMD) and Diabetic Macular Edema (DME).

Purpose of the Study:

To evaluate the effectiveness of deep learning models (CNNs, ViTs) for classifying OCT images.
To assess the impact of incorporating patient metadata into OCT image classification, including when some metadata values are missing.

Main Methods:

Comparison of various deep learning architectures, including Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs).
Development and evaluation of multimodal models integrating OCT images with patient metadata (age, sex, eye laterality, year).
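
As a rough illustration of how metadata with gaps might be prepared for a multimodal model of this kind, the sketch below encodes the four metadata fields named above into a fixed-length vector, with one missing-value flag per field, and joins it to an image feature vector by concatenation. The names `encode_metadata` and `fuse`, the normalisation ranges, the category spellings, and the flag-based handling of missing values are all assumptions made for illustration; the summary does not describe how the study encoded metadata, handled missingness, or fused the two modalities.

```python
from typing import List, Optional, Sequence

# Hypothetical normalisation ranges (assumption, not taken from the study).
AGE_RANGE = (0.0, 100.0)
YEAR_RANGE = (2000.0, 2025.0)


def _scale(value: Optional[float], lo: float, hi: float) -> List[float]:
    """Return [scaled_value, missing_flag]; a missing value becomes [0.0, 1.0]."""
    if value is None:
        return [0.0, 1.0]
    clipped = min(max(float(value), lo), hi)
    return [(clipped - lo) / (hi - lo), 0.0]


def _binary(value: Optional[str], positive: str, negative: str) -> List[float]:
    """Return [indicator, missing_flag] for a two-valued categorical field."""
    if value is None:
        return [0.0, 1.0]
    v = value.strip().lower()
    if v == positive:
        return [1.0, 0.0]
    if v == negative:
        return [0.0, 0.0]
    return [0.0, 1.0]  # unrecognised spellings are treated as missing


def encode_metadata(
    age: Optional[float] = None,
    sex: Optional[str] = None,
    laterality: Optional[str] = None,
    year: Optional[float] = None,
) -> List[float]:
    """Encode age, sex, eye laterality, and year as 4 (value, missing_flag) pairs."""
    return (
        _scale(age, *AGE_RANGE)
        + _binary(sex, "female", "male")
        + _binary(laterality, "left", "right")
        + _scale(year, *YEAR_RANGE)
    )


def fuse(image_features: Sequence[float], metadata: Sequence[float]) -> List[float]:
    """Late fusion by simple concatenation of image and metadata features."""
    return list(image_features) + list(metadata)


# Example: eye laterality is unknown for this (made-up) patient record.
vector = encode_metadata(age=50, sex="female", laterality=None, year=2025)
print(vector)  # [0.5, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
```

Keeping an explicit flag next to each value lets a downstream classifier distinguish "value is zero" from "value is unknown", which is one common way to let a model train on records with incomplete metadata without discarding them.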

Main Results:

DenseNet121 and the multimodal ResNet18 tied for the highest accuracy (95.16%).
DenseNet121 exhibited a superior F1-score (0.9313).
A multimodal ViT-based model reached 93.22% accuracy, showcasing ViT potential in multimodal medical data analysis.
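
For readers less familiar with the two metrics reported above, the following sketch shows how accuracy and an F1-score can be computed from true and predicted class labels. It uses macro-averaging across classes, which is an assumption: the summary does not state which F1 averaging scheme the study used. The label names and the toy predictions are invented for illustration and are unrelated to the study's data.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_class(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for one class treated as 'positive'."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy labels (illustrative only).
y_true = ["AMD", "AMD", "DME", "DME", "NORMAL", "NORMAL"]
y_pred = ["AMD", "DME", "DME", "DME", "NORMAL", "AMD"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")  # accuracy = 0.6667
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")  # macro F1 = 0.6556
```

Because accuracy counts only overall hits while F1 balances precision and recall per class, two models can tie on accuracy and still differ on F1, as in the results listed above.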

Conclusions:

Multimodal deep learning models integrating OCT images and metadata demonstrate competitive diagnostic performance.
Vision Transformers (ViTs) show promise for complex multimodal medical image analysis, particularly in ophthalmology.