Vision-Language Model for Visual Question Answering in Medical Imagery.

Yakoub Bazi¹, Mohamad Mahmoud Al Rahhal², Laila Bashmal¹

  • ¹ Computer Engineering Department, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia.

Bioengineering (Basel, Switzerland) | March 29, 2023
Summary
This summary is machine-generated.

This study introduces a transformer-based encoder-decoder approach for medical visual question answering (VQA). The model shows promising results on the VQA-RAD radiology dataset and the PathVQA pathology dataset, with the aim of improving diagnostic support.

Keywords:
medical visual question answering; transformer; vision–language encoders

Area of Science:

  • Artificial Intelligence
  • Medical Imaging Analysis
  • Natural Language Processing

Background:

  • Medical images are crucial in clinical diagnosis.
  • Medical Visual Question Answering (VQA) systems can enhance diagnostic accuracy.
  • Current VQA technology for medical applications is underdeveloped.

Purpose of the Study:

  • To introduce an advanced transformer encoder-decoder architecture for medical VQA.
  • To improve the performance of VQA systems in analyzing radiology images.
  • To bridge the gap between current VQA capabilities and practical clinical application.

Main Methods:

  • Image features extracted using Vision Transformer (ViT).
  • Questions embedded using a textual encoder transformer.
  • Concatenated visual and textual representations fed into a multi-modal decoder.
  • Answer generation using an autoregressive approach.
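
The flow of the methods listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: `image_to_patches`, `concat_modalities`, `greedy_decode`, and `toy_step` are hypothetical names, the "decoder" is a hard-coded stand-in for a trained multi-modal transformer, and greedy selection is assumed because the page does not say which decoding strategy was used.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def image_to_patches(image: List[List[float]], patch: int) -> List[Vector]:
    """Split an H x W grayscale image into flattened patch x patch tokens (ViT-style)."""
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    tokens = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            tokens.append(
                [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            )
    return tokens


def concat_modalities(
    visual: Sequence[Vector], textual: Sequence[Vector]
) -> List[Vector]:
    """Concatenate along the sequence axis: visual tokens first, then question tokens."""
    return list(visual) + list(textual)


def greedy_decode(
    memory: List[Vector],
    step_fn: Callable[[List[Vector], List[str]], Dict[str, float]],
    bos: str,
    eos: str,
    max_len: int = 10,
) -> List[str]:
    """Autoregressive loop: each token is chosen given the memory and all tokens so far."""
    out = [bos]
    for _ in range(max_len):
        scores = step_fn(memory, out)      # score for every candidate next token
        nxt = max(scores, key=scores.get)  # greedy choice (assumption)
        if nxt == eos:
            break
        out.append(nxt)
    return out[1:]                         # drop the start token


def toy_step(memory: List[Vector], prefix: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a trained decoder: follows a fixed script."""
    script = ["left", "lung", "<eos>"]
    target = script[min(len(prefix) - 1, len(script) - 1)]
    return {tok: (1.0 if tok == target else 0.0) for tok in script}


# A 4 x 4 toy image yields four 2 x 2 patches; two made-up question embeddings follow.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
question = [[0.0] * 4, [1.0] * 4]  # placeholder vectors, not real text embeddings
memory = concat_modalities(image_to_patches(image, 2), question)
answer = greedy_decode(memory, toy_step, "<bos>", "<eos>")
```

In a real model, the patch tokens and question tokens would each pass through their own transformer encoder and be projected to a shared width before concatenation; the toy vectors here are simply chosen to have the same length.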

Main Results:

  • The proposed model was validated on VQA-RAD and PathVQA datasets.
  • On VQA-RAD, the model achieved 84.99% accuracy on closed-ended questions and 72.97% on open-ended questions.
  • On PathVQA, the model achieved 83.86% accuracy on closed-ended questions and 62.37% on open-ended questions.
  • Reported BLEU scores indicate good alignment between predicted and true answers.
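
As a rough illustration of how such figures are typically computed, the sketch below reports exact-match accuracy separately for closed-ended and open-ended questions and scores an answer with unigram BLEU (BLEU-1). The function names and the sample records are hypothetical, and the page does not state the paper's exact matching rule or which BLEU order it reports, so both are assumptions.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple


def split_accuracy(records: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """Exact-match accuracy (%) per question type.

    Each record is (question_type, predicted_answer, reference_answer).
    Matching ignores case and surrounding whitespace (assumption).
    """
    hits, totals = Counter(), Counter()
    for qtype, pred, ref in records:
        totals[qtype] += 1
        hits[qtype] += int(pred.strip().lower() == ref.strip().lower())
    return {q: 100.0 * hits[q] / totals[q] for q in totals}


def bleu1(pred: str, ref: str) -> float:
    """Unigram BLEU: clipped unigram precision times the brevity penalty."""
    p, r = pred.lower().split(), ref.lower().split()
    if not p:
        return 0.0
    ref_counts = Counter(r)
    # Each predicted word counts at most as often as it occurs in the reference.
    clipped = sum(min(c, ref_counts[t]) for t, c in Counter(p).items())
    precision = clipped / len(p)
    # Penalize predictions shorter than the reference.
    bp = 1.0 if len(p) >= len(r) else math.exp(1 - len(r) / len(p))
    return bp * precision


# Made-up records for illustration only; these are not data from the study.
records = [
    ("closed", "yes", "Yes"),
    ("closed", "no", "yes"),
    ("open", "left lung", "left lung"),
    ("open", "right lung", "left lung"),
]
accuracy = split_accuracy(records)
partial = bleu1("right lung", "left lung")
```

Exact match is strict for open-ended answers, which is one reason an overlap measure such as BLEU is reported alongside accuracy: a prediction that shares some words with the reference still receives partial credit.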

Conclusions:

  • The transformer-based VQA model demonstrates strong performance on medical image datasets.
  • The approach shows significant potential for improving diagnostic support in healthcare.
  • Further development of this VQA system could lead to practical clinical tools.