



MetaFormer Baselines for Vision.

Weihao Yu, Chenyang Si, Pan Zhou

    IEEE Transactions on Pattern Analysis and Machine Intelligence
    |November 1, 2023
    PubMed
    Summary
    This summary is machine-generated.

    The MetaFormer architecture achieves strong performance even with basic token mixers, which points to the strength of the framework itself. Simple mixers such as identity mapping or a random matrix still yield over 80% accuracy on ImageNet-1K, and more advanced models such as CAFormer set a new record of 85.5%.


    Area of Science:

    • Computer Vision
    • Deep Learning Architectures
    • Neural Network Design

    Background:

    • The MetaFormer architecture, an abstraction of the Transformer, has shown promise in achieving competitive performance.
    • Previous research focused on token mixer designs within these architectures.
    • This study shifts focus to the inherent capacity of the MetaFormer framework itself.

    Purpose of the Study:

    • To explore the performance potential of the MetaFormer architecture independent of complex token mixer designs.
    • To validate MetaFormer's effectiveness with basic and conventional token mixers.
    • To introduce and evaluate a novel activation function, StarReLU.

    Main Methods:

    • Developed baseline MetaFormer models using simple token mixers: identity mapping (IdentityFormer) and random matrices (RandFormer).
    • Implemented ConvFormer using depthwise separable convolutions, comparing it to ConvNeXt.
    • Created CAFormer by using depthwise separable convolutions as the token mixer in the bottom stages and vanilla self-attention in the top stages.
    • Introduced and tested StarReLU, a variant of Squared ReLU, as an activation function.
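
The idea behind the baseline models above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: a MetaFormer-style block keeps the residual structure fixed (normalize, mix tokens, add; normalize, apply a channel MLP, add) and leaves the token mixer as a pluggable function. The names `metaformer_block`, `identity_mixer`, `make_random_mixer`, and `channel_mlp` are hypothetical, the channel MLP is reduced to a weight-free ReLU, and row-normalizing the random matrix is an assumption made only to keep the toy numerically tame.

```python
import random

def layer_norm(token, eps=1e-6):
    """Normalize one token vector to zero mean and unit variance."""
    mean = sum(token) / len(token)
    var = sum((v - mean) ** 2 for v in token) / len(token)
    return [(v - mean) / (var + eps) ** 0.5 for v in token]

def identity_mixer(tokens):
    """IdentityFormer-style mixer: no information moves between tokens."""
    return [list(t) for t in tokens]

def make_random_mixer(num_tokens, seed=0):
    """RandFormer-style mixer: a fixed (never trained) random matrix mixes tokens."""
    rng = random.Random(seed)
    raw = [[rng.random() for _ in range(num_tokens)] for _ in range(num_tokens)]
    # Assumption: row-normalize so each output token is a weighted average of inputs.
    weights = [[w / sum(row) for w in row] for row in raw]

    def mixer(tokens):
        dim = len(tokens[0])
        return [[sum(weights[i][j] * tokens[j][d] for j in range(num_tokens))
                 for d in range(dim)] for i in range(num_tokens)]
    return mixer

def channel_mlp(token):
    """Toy stand-in for the channel MLP: elementwise ReLU with no learned weights."""
    return [max(0.0, v) for v in token]

def metaformer_block(tokens, token_mixer):
    """Apply x = x + mixer(norm(x)), then x = x + mlp(norm(x))."""
    normed = [layer_norm(t) for t in tokens]
    mixed = token_mixer(normed)
    tokens = [[a + b for a, b in zip(t, m)] for t, m in zip(tokens, mixed)]
    normed = [layer_norm(t) for t in tokens]
    return [[a + b for a, b in zip(t, channel_mlp(n))] for t, n in zip(tokens, normed)]

# Three tokens with four channels each; only the mixer differs between the two calls.
tokens = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0], [3.0, 3.5, -2.0, 1.0]]
out_identity = metaformer_block(tokens, identity_mixer)
out_random = metaformer_block(tokens, make_random_mixer(num_tokens=len(tokens)))
```

Swapping `identity_mixer` for a convolution or self-attention function would, under the same assumptions, give the ConvFormer-like and CAFormer-like variants without touching `metaformer_block`.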

    Main Results:

    • IdentityFormer achieved over 80% accuracy on ImageNet-1K, establishing a solid performance baseline.
    • RandFormer surpassed IdentityFormer with over 81% accuracy, showcasing MetaFormer's compatibility with arbitrary mixers.
    • ConvFormer outperformed ConvNeXt, and CAFormer set a new record of 85.5% accuracy on ImageNet-1K at 224×224 resolution under normal supervised training, without external data or distillation.
    • StarReLU reduced activation FLOPs by 71% compared to GELU while improving performance.
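
The StarReLU result can be made concrete with a short sketch. It assumes the form StarReLU(x) = s·(ReLU(x))² + b, with s and b chosen so that the output has zero mean and unit variance when x is standard normal (using E[ReLU(x)²] = 0.5 and Var[ReLU(x)²] = 1.25). The per-element FLOP counts of 4 for StarReLU and 14 for GELU are assumptions about the accounting, picked because they are consistent with the 71% figure reported above; the function and constant names are hypothetical.

```python
import math
import random

# Assumed constants: normalize ReLU(x)^2 for x ~ N(0, 1).
SCALE = 1.0 / math.sqrt(1.25)   # about 0.8944
BIAS = -0.5 / math.sqrt(1.25)   # about -0.4472

def star_relu(x, scale=SCALE, bias=BIAS):
    """Squared ReLU followed by a scale and a bias."""
    r = max(0.0, x)
    return scale * r * r + bias

# Empirical check: feed standard normal samples and measure the output statistics.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
outputs = [star_relu(x) for x in samples]
mean = sum(outputs) / len(outputs)
var = sum((y - mean) ** 2 for y in outputs) / len(outputs)

# Assumed per-element costs: ReLU, square, scale, bias = 4 FLOPs; GELU = 14 FLOPs.
STAR_RELU_FLOPS = 4
GELU_FLOPS = 14
reduction = 1.0 - STAR_RELU_FLOPS / GELU_FLOPS

print(f"output mean {mean:.3f}, output variance {var:.3f}")
print(f"activation FLOP reduction vs. GELU: {reduction:.1%}")
```

In a trainable model the scale and bias would typically be learnable parameters rather than fixed constants; they are fixed here only so the statistics can be checked directly.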

    Conclusions:

    • MetaFormer provides a robust framework that guarantees a high lower bound on performance, even with minimal token mixers.
    • The architecture demonstrates versatility, performing well with diverse and even random token mixers.
    • MetaFormer, when combined with conventional or novel components like StarReLU, can achieve state-of-the-art results efficiently.