Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks.

Yongming Rao, Zuyan Liu, Wenliang Zhao

    IEEE Transactions on Pattern Analysis and Machine Intelligence | April 8, 2023

    Summary
    This summary is machine-generated.

    This study introduces dynamic spatial sparsification, which accelerates vision models by pruning redundant visual tokens according to the input. The method substantially reduces computation and raises throughput with minimal accuracy loss, making input-dependent spatial sparsity an additional dimension for model acceleration.

    Area of Science:

    • Computer Vision
    • Artificial Intelligence
    • Machine Learning

    Background:

    • Vision Transformers (ViTs) rely on a subset of informative regions for accurate image recognition.
    • Existing acceleration methods often lack adaptivity to input-specific visual data characteristics.

    Purpose of the Study:

    • To develop a dynamic token sparsification framework for accelerating vision Transformer models.
    • To propose a generalizable approach for accelerating various deep learning architectures using adaptive computation.

    Main Methods:

    • A dynamic token sparsification framework that progressively prunes redundant tokens based on input.
    • A lightweight prediction module integrated into different layers to estimate token importance hierarchically.
    • A generic dynamic spatial sparsification framework with progressive sparsification and asymmetric computation for structured feature maps.
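
The methods listed above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the learned prediction module is replaced by a hypothetical linear scorer (`score_tokens`), and `prune_tokens` and `asymmetric_update` are illustrative names. The sketch assumes that token pruning keeps the highest-scoring tokens, and that asymmetric computation means sending informative positions through a heavier function and the rest through a cheaper one, so the feature map keeps its grid shape.

```python
import numpy as np


def score_tokens(tokens, w):
    """Hypothetical stand-in for a learned prediction module: a linear scorer."""
    return tokens @ w  # one importance score per token, shape (N,)


def prune_tokens(tokens, scores, keep_ratio):
    """Keep the highest-scoring fraction of tokens, preserving their order."""
    n_keep = max(1, int(round(keep_ratio * tokens.shape[0])))
    top = np.argpartition(scores, -n_keep)[-n_keep:]  # indices of the n_keep largest scores
    keep_idx = np.sort(top)
    return tokens[keep_idx], keep_idx


def asymmetric_update(feature_map, scores, keep_ratio, heavy, light):
    """Apply `heavy` to informative positions and `light` to the rest.

    The output has the same (H, W, D) shape as the input, which is what
    structured models such as CNNs need (assumption for this sketch).
    """
    h, w, d = feature_map.shape
    flat = feature_map.reshape(h * w, d)
    _, keep_idx = prune_tokens(flat, scores, keep_ratio)
    mask = np.zeros(h * w, dtype=bool)
    mask[keep_idx] = True
    out = np.empty_like(flat)
    out[mask] = heavy(flat[mask])
    out[~mask] = light(flat[~mask])
    return out.reshape(h, w, d)


# Progressive pruning: score and prune again at each of three stages.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(196, 64))   # e.g. a 14 x 14 patch grid, 64-dim features
w = rng.normal(size=64)               # hypothetical scorer weights
counts = [tokens.shape[0]]
for _ in range(3):
    tokens, _ = prune_tokens(tokens, score_tokens(tokens, w), keep_ratio=0.7)
    counts.append(tokens.shape[0])
print(counts)
```

In a trained model the scores would come from the lightweight prediction module at each stage rather than from fixed random weights, and the keep ratio of 0.7 is an illustrative choice.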

    Main Results:

    • Hierarchically pruning 66% of input tokens reduced computations by 31%-35% and improved throughput by over 40% with <0.5% accuracy drop in ViTs.
    • Similar acceleration achieved on Convolutional Neural Networks (CNNs) and Swin Transformers using asymmetric computation.
    • Promising results demonstrated on complex tasks like semantic segmentation and object detection.
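
The relation between the fraction of tokens pruned and the compute saved can be checked with a short back-of-the-envelope calculation. The numbers below are assumptions for illustration, not taken from the summary: a keep ratio of 0.7 applied at three stages, a 12-layer model with pruning before layers 3, 6, and 9 (zero-indexed), and a per-layer cost that scales linearly with the number of tokens.

```python
def remaining_fraction(keep_ratio, num_stages):
    """Fraction of tokens left after applying the same keep ratio at every stage."""
    return keep_ratio ** num_stages


def relative_cost(keep_ratio, num_layers, prune_layers):
    """Average per-layer token fraction, assuming cost is linear in token count."""
    frac, total = 1.0, 0.0
    for layer in range(num_layers):
        if layer in prune_layers:
            frac *= keep_ratio  # tokens are pruned just before this layer runs
        total += frac
    return total / num_layers


pruned = 1.0 - remaining_fraction(0.7, 3)   # about 0.657, i.e. roughly 66% of tokens pruned
cost = relative_cost(0.7, 12, {3, 6, 9})    # about 0.633 of the dense cost
print(f"pruned tokens: {pruned:.1%}, estimated compute saved: {1.0 - cost:.1%}")
```

Under these assumptions the final stage sees about 34% of the tokens, yet the estimated saving is only about 37%, because early layers still process every token. This simplified model ignores the cost of the prediction modules and the non-linear attention term, so it indicates only why pruning 66% of tokens does not yield a 66% reduction in computation; it does not reproduce the reported 31%-35%.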

    Conclusions:

    • Dynamic spatial sparsification is an effective approach for accelerating deep learning models.
    • Adaptive and asymmetric computation offers a general solution for enhancing model efficiency across diverse architectures.
    • The proposed framework significantly reduces computational load while maintaining high accuracy, paving the way for more efficient visual data processing.