VOLO: Vision Outlooker for Visual Recognition.

Li Yuan, Qibin Hou, Zihang Jiang

    IEEE Transactions on Pattern Analysis and Machine Intelligence | September 12, 2022
    Summary
    This summary is machine-generated.

    Vision Outlooker (VOLO) improves on Vision Transformers (ViTs) by introducing outlook attention, which efficiently encodes fine-level features. It reaches up to 87.1% top-1 accuracy on ImageNet-1K classification and transfers well to downstream tasks such as semantic segmentation.

    Area of Science:

    • Computer Vision
    • Deep Learning
    • Machine Learning

    Background:

    • Vision Transformers (ViTs) show promise in visual recognition but struggle with fine-level feature encoding, lagging behind CNNs when trained from scratch.
    • Simple tokenization gives existing ViTs low training sample efficiency, and redundant backbone designs limit the richness of the features they learn.

    Purpose of the Study:

    • To address the limitations of Vision Transformers in encoding fine-level features and improve their performance on visual recognition tasks.
    • To introduce a novel, efficient, and generic architecture called Vision Outlooker (VOLO) that overcomes the drawbacks of current ViT designs.

    Main Methods:

    • Developed Vision Outlooker (VOLO), built around a novel outlook attention mechanism that dynamically aggregates local features in a sliding-window manner across the input image.
    • Outlook attention encodes fine-level features, whereas self-attention models global dependencies at a coarse level. It is more memory efficient because it avoids self-attention's computation cost, which scales quadratically with the input's spatial dimension.
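The sliding-window aggregation described above can be sketched roughly as follows. This is a minimal single-head illustration in NumPy, not the authors' implementation: the function name `outlook_attention`, the weight names `w_v` and `w_a`, the window size `k=3`, stride 1, and zero padding are all assumptions made for the example, and the multi-head split and final output projection are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def outlook_attention(x, w_v, w_a, k=3):
    """Hypothetical single-head sketch over an (H, W, C) feature map.

    w_v: (C, C) value projection (assumed name).
    w_a: (C, k**4) projection that produces the attention weights (assumed name).
    """
    h, w, c = x.shape
    pad = k // 2
    v = x @ w_v  # values, shape (H, W, C)
    # Attention weights come from a linear layer on each token,
    # with no query-key dot products; softmax over the last axis.
    a = softmax((x @ w_a).reshape(h, w, k * k, k * k), axis=-1)

    v_pad = np.pad(v, ((pad, pad), (pad, pad), (0, 0)))  # zero padding
    out_pad = np.zeros_like(v_pad)
    for i in range(h):
        for j in range(w):
            # The k*k value vectors in the window centred at (i, j)
            window = v_pad[i:i + k, j:j + k].reshape(k * k, c)
            mixed = a[i, j] @ window  # (k*k, C) re-weighted window
            # Fold: add the result back onto the same window positions,
            # so overlapping windows are summed
            out_pad[i:i + k, j:j + k] += mixed.reshape(k, k, c)
    return out_pad[pad:pad + h, pad:pad + w]


rng = np.random.default_rng(0)
features = rng.normal(size=(6, 6, 4))
out = outlook_attention(features, rng.normal(size=(4, 4)), rng.normal(size=(4, 81)))
print(out.shape)
```

The explicit loops are kept for readability; a practical version would vectorize the window extraction and the fold step.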

    Main Results:

    • VOLO with 26.6M parameters achieved 84.2% top-1 accuracy on ImageNet-1K, 2.7 percentage points higher than T2T-ViT with a comparable number of parameters.
    • Scaling VOLO to 296M parameters raised top-1 accuracy to 87.1%, setting a new record for ImageNet-1K classification.
    • Pretrained VOLO models demonstrated superior performance on downstream tasks, including semantic segmentation.

    Conclusions:

    • Vision Outlooker (VOLO) offers a more efficient and effective approach to visual recognition by improving fine-level feature encoding.
    • The outlook attention mechanism provides a scalable and memory-efficient alternative to self-attention, enabling state-of-the-art performance on benchmark datasets and downstream applications.
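To make the memory-efficiency claim concrete, the following sketch counts attention-weight entries per head for one feature map. It is a back-of-the-envelope illustration under stated assumptions, not a figure from the paper: the helper name `attention_weight_counts`, the grid sizes, and the window size `k=3` are chosen for the example. Self-attention stores one weight per token pair, so the count grows with the square of the token count, while an outlook-style scheme stores a `k*k` by `k*k` block per token, so the count grows linearly.

```python
def attention_weight_counts(height, width, k=3):
    """Return (self_attention, outlook) attention-weight counts per head.

    Hypothetical helper for illustration only.
    """
    n = height * width          # number of tokens
    self_attn = n * n           # one weight per token pair
    outlook = n * k ** 4        # one (k*k x k*k) block per token
    return self_attn, outlook


for side in (14, 28, 56):
    sa, oa = attention_weight_counts(side, side)
    print(f"{side}x{side}: self-attention={sa:,} outlook={oa:,} ratio={sa / oa:.1f}")
```

For an assumed 28x28 token grid with `k=3`, this gives 614,656 self-attention weights against 63,504 outlook weights, and the gap widens as the grid grows.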