A Survey on Efficient Vision Transformers: Algorithms, Techniques, and Performance Benchmarking.

Lorenzo Papa, Paolo Russo, Irene Amerini

    IEEE Transactions on Pattern Analysis and Machine Intelligence
    |April 24, 2024
    Summary
    This summary is machine-generated.

    This survey explores efficient methodologies for Vision Transformer (ViT) models, addressing their computational costs. It analyzes compact architectures, pruning, knowledge distillation, and quantization to improve performance in resource-constrained environments.

    Area of Science:

    • Computer Vision
    • Deep Learning
    • Artificial Intelligence

    Background:

    • Vision Transformers (ViTs) extract global information through self-attention and have outperformed earlier Convolutional Neural Networks (CNNs) on computer vision tasks.
    • ViT performance has grown with model size, number of trainable parameters, and operation count, which drives up computational and memory demands.
    • The computational and memory cost of self-attention grows quadratically with the number of image tokens, and therefore with image resolution, which makes real-world deployment difficult on constrained hardware.
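The quadratic growth noted above can be made concrete with a little arithmetic. The following is a minimal sketch, not taken from the survey: the function name `attention_cost` is hypothetical, and the defaults (16-pixel patches, embedding dimension 768) are assumed values typical of a ViT-Base configuration. It counts only the two matrix products inside one attention layer and ignores the class token, multi-head splitting, and the linear projections.

```python
def attention_cost(image_size: int, patch_size: int = 16, dim: int = 768) -> dict:
    """Rough self-attention cost for one layer on a square image.

    Assumed simplifications: no class token, heads not modeled separately,
    Q/K/V/output projections excluded.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    # Each non-overlapping patch becomes one token.
    tokens = (image_size // patch_size) ** 2
    # Q @ K^T and attention @ V each need about tokens^2 * dim multiply-adds.
    macs = 2 * tokens * tokens * dim
    # The attention matrix itself stores tokens x tokens entries.
    attn_entries = tokens * tokens
    return {"tokens": tokens, "macs": macs, "attn_entries": attn_entries}


low = attention_cost(224)
high = attention_cost(448)
print(low["tokens"], high["tokens"])   # token counts at the two resolutions
print(high["macs"] / low["macs"])      # cost ratio after doubling the side length
```

Doubling the side length from 224 to 448 pixels quadruples the token count (196 to 784), so the attention products and the attention matrix both grow by a factor of 16 under these assumptions.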

    Purpose of the Study:

    • To investigate efficient methodologies for Vision Transformer (ViT) architectures.
    • To identify methods that keep estimation performance acceptable, close to that of full-size models, despite hardware and environmental restrictions.
    • To analyze strategies for making ViTs suitable for real-world applications.

    Main Methods:

    • Analysis of four categories of efficiency techniques: compact architectures, pruning, knowledge distillation, and quantization.
    • Introduction of a new metric, Efficient Error Rate, for comparing models based on inference-time hardware impact.
    • Mathematical definition and discussion of state-of-the-art efficient ViT methodologies.
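Two of the categories listed above, pruning and quantization, can be illustrated on a plain list of weights. This is a generic sketch under stated assumptions, not the survey's formulation or any specific method it reviews: it uses unstructured magnitude pruning and symmetric uniform quantization, and the function names and the example weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    k = int(len(weights) * sparsity)  # number of weights to remove
    # Indices of the k smallest-magnitude weights.
    drop = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_uniform(weights, bits=8):
    """Symmetric uniform quantization to signed integers; returns (ints, scale)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map integer levels back to approximate real-valued weights."""
    return [v * scale for v in q]


weights = [0.8, -0.05, 0.3, -1.27, 0.01, 0.6]   # illustrative values only
pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_uniform(pruned, bits=8)
restored = dequantize(q, scale)
print(pruned)
print(q)
```

Pruning reduces the number of nonzero parameters, while quantization reduces the bits stored per parameter; the two are often combined. The round-trip error of each restored weight is bounded by half the quantization step `scale`.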

    Main Results:

    • Detailed mathematical definitions of efficiency strategies for Vision Transformers.
    • Comprehensive description and discussion of current state-of-the-art efficient methodologies.
    • Performance analysis of these methodologies across various application scenarios.
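As an illustration of what a mathematical definition of one such strategy can look like, the widely used knowledge distillation objective is shown below. This is the standard textbook form, given here as an assumption; the survey's own notation and definitions may differ.

```latex
\[
\mathcal{L}_{\mathrm{KD}}
  = (1-\alpha)\,\mathrm{CE}\bigl(y,\ \sigma(z_s)\bigr)
  + \alpha\,T^{2}\,\mathrm{KL}\bigl(\sigma(z_t/T)\ \big\|\ \sigma(z_s/T)\bigr)
\]
```

Here $z_s$ and $z_t$ are the student and teacher logits, $\sigma$ is the softmax, $y$ is the ground-truth label, $T$ is the temperature that softens both distributions, and $\alpha \in [0, 1]$ balances the supervised cross-entropy term against the teacher-matching term.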

    Conclusions:

    • Efficient methodologies are crucial for deploying Vision Transformers in resource-limited settings.
    • The Efficient Error Rate metric provides a standardized way to evaluate model efficiency.
    • Further research into open challenges and promising directions can advance efficient ViT development.