Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks.

Yongming Rao, Zuyan Liu, Wenliang Zhao

IEEE Transactions on Pattern Analysis and Machine Intelligence

April 8, 2023

Summary

This summary is machine-generated.

This study introduces dynamic spatial sparsification, which accelerates vision models by progressively removing redundant visual tokens based on each input. In Vision Transformers, pruning 66% of input tokens cuts computation by 31%-35% and raises throughput by over 40% with less than a 0.5% drop in accuracy, making input-adaptive sparsity a new dimension for model acceleration.

Area of Science:

Computer Vision
Artificial Intelligence
Machine Learning

Background:

Vision Transformers (ViTs) rely on a subset of informative regions for accurate image recognition.
Existing acceleration methods often lack adaptivity to input-specific visual data characteristics.

Purpose of the Study:

To develop a dynamic token sparsification framework for accelerating Vision Transformer (ViT) models.
To propose a generalizable approach for accelerating various deep learning architectures using adaptive computation.

Main Methods:

A dynamic token sparsification framework that progressively prunes redundant tokens based on input.
A lightweight prediction module integrated into different layers to estimate token importance hierarchically.
A generic dynamic spatial sparsification framework with progressive sparsification and asymmetric computation for structured feature maps.
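
The progressive pruning idea in the methods above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the scorer is a hypothetical dot product standing in for the lightweight prediction module, the keep ratio of 0.7 and the three pruning stages are assumptions chosen so that about 66% of tokens end up pruned, and the token features are random. In a real model the features would also be updated by the network layers between stages.

```python
import random

def score_tokens(tokens, weights):
    """Hypothetical lightweight scorer: one dot product per token."""
    return [sum(x * w for x, w in zip(tok, weights)) for tok in tokens]

def prune_tokens(indices, tokens, weights, keep_ratio):
    """Keep the highest-scoring fraction of surviving tokens, in original order."""
    num_keep = max(1, round(keep_ratio * len(indices)))
    scores = score_tokens([tokens[i] for i in indices], weights)
    ranked = sorted(zip(scores, indices), reverse=True)[:num_keep]
    return sorted(i for _, i in ranked)

def progressive_sparsify(tokens, stage_weights, keep_ratio=0.7):
    """Prune at each stage; return indices of tokens that survive every stage."""
    alive = list(range(len(tokens)))
    for weights in stage_weights:
        alive = prune_tokens(alive, tokens, weights, keep_ratio)
    return alive

# Assumed toy setup: 196 tokens (a 14 x 14 patch grid) with 16 features each.
rng = random.Random(0)
tokens = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(196)]
stage_weights = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(3)]

alive = progressive_sparsify(tokens, stage_weights)
print(len(alive), "of", len(tokens), "tokens remain")  # 67 of 196 tokens remain
```

Because the selection depends on the scores of each input, different images keep different tokens, which is what distinguishes this kind of pruning from a fixed, input-independent reduction.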

Main Results:

In ViTs, hierarchically pruning 66% of input tokens reduced computation by 31%-35% and improved throughput by over 40%, with an accuracy drop of less than 0.5%.
Similar acceleration was achieved on Convolutional Neural Networks (CNNs) and Swin Transformers using asymmetric computation.
Promising results were also demonstrated on more complex tasks such as semantic segmentation and object detection.
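
A rough calculation suggests why pruning about two thirds of the tokens yields a computation saving closer to one third: the pruning is progressive, so early layers still process every token. The sketch below assumes a 12-layer model, pruning before the 4th, 7th, and 10th layers, a keep ratio of 0.7, and per-layer cost proportional to the number of tokens. All four are illustrative assumptions, and the estimate ignores the prediction module's own cost and the quadratic attention term, so it is a ballpark figure rather than a reproduction of the reported numbers.

```python
def average_token_fraction(num_layers, prune_layers, keep_ratio):
    """Mean fraction of tokens processed per layer under progressive pruning."""
    fraction, total = 1.0, 0.0
    for layer in range(num_layers):
        if layer in prune_layers:  # pruning is applied before this layer runs
            fraction *= keep_ratio
        total += fraction
    return total / num_layers

# Assumed schedule: 12 layers, pruning before layers with index 3, 6, and 9.
avg = average_token_fraction(num_layers=12, prune_layers={3, 6, 9}, keep_ratio=0.7)
print(f"final tokens kept: {0.7 ** 3:.1%}")       # final tokens kept: 34.3%
print(f"average tokens per layer: {avg:.1%}")     # average tokens per layer: 63.3%
```

Under these assumptions the model processes about 63% of the token-layer pairs of the unpruned model, a saving of roughly 37%, which is in the same range as the reported 31%-35%.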

Conclusions:

Dynamic spatial sparsification is an effective approach for accelerating deep learning models.
Adaptive and asymmetric computation offers a general solution for enhancing model efficiency across diverse architectures.
The proposed framework significantly reduces computational load while maintaining high accuracy, paving the way for more efficient visual data processing.