



RetinaViT: Efficient Visual Backbone for Online Video Streams.

Tomoyuki Suzuki¹, Yoshimitsu Aoki¹

  • ¹Department of Electronics and Electrical Engineering, Faculty of Science and Technology, Keio University, 3-14-1, Hiyoshi, Kohoku-ku, Yokohama 223-8522, Kanagawa, Japan.

Sensors (Basel, Switzerland) | September 14, 2024
Summary
This summary is machine-generated.

RetinaViT enhances online video understanding by efficiently extracting frame-level visual features. This method significantly speeds up tasks like action recognition, improving both accuracy and efficiency.

Keywords:
Vision Transformer, efficient computation, online video understanding


Area of Science:

  • Computer Vision
  • Artificial Intelligence
  • Machine Learning

Background:

  • Online video understanding is critical for many applications.
  • Frame-level visual feature extraction is a major bottleneck in video processing.
  • Existing methods struggle with real-time inference speed requirements.

Purpose of the Study:

  • To propose RetinaViT, an efficient method for online video understanding.
  • To enhance the speed and accuracy of frame-level visual feature extraction.
  • To improve the overall efficiency of online video understanding tasks.

Main Methods:

  • RetinaViT uses approximated Transformer blocks with event tokens as queries.
  • It reuses previously processed tokens and restricts keys/values to spatial neighborhoods.
  • Model parameters are tuned via multi-step black-box optimization during training.
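
As a rough illustration of the first two points, the sketch below recomputes attention only for "event" tokens and reuses cached outputs everywhere else. It is not the authors' implementation. Several things are assumptions made for the example: an event token is taken to be one whose feature vector changed by more than a threshold since the previous frame, attention is single-head with no learned projections, and the names `select_event_tokens`, `sparse_local_attention`, `threshold`, and `radius` are hypothetical.

```python
import numpy as np


def select_event_tokens(prev_tokens, curr_tokens, threshold):
    """Return an (H, W) boolean mask of tokens that changed by more than `threshold`.

    Assumption: "event" means a large L2 change relative to the previous frame.
    """
    change = np.linalg.norm(curr_tokens - prev_tokens, axis=-1)
    return change > threshold


def sparse_local_attention(curr_tokens, cached_output, event_mask, radius=1):
    """Recompute attention only at event positions; reuse the cache elsewhere.

    curr_tokens:   (H, W, D) tokens of the current frame
    cached_output: (H, W, D) block output computed for the previous frame
    event_mask:    (H, W) bool, True where the token must be recomputed
    radius:        half-width of the square neighborhood used for keys/values
    """
    H, W, D = curr_tokens.shape
    output = cached_output.copy()  # non-event tokens keep their cached result
    for i, j in zip(*np.nonzero(event_mask)):
        # Keys and values come only from the spatial neighborhood of the query.
        i0, i1 = max(i - radius, 0), min(i + radius + 1, H)
        j0, j1 = max(j - radius, 0), min(j + radius + 1, W)
        neighbors = curr_tokens[i0:i1, j0:j1].reshape(-1, D)
        query = curr_tokens[i, j]
        scores = neighbors @ query / np.sqrt(D)
        weights = np.exp(scores - scores.max())  # numerically stable softmax
        weights /= weights.sum()
        output[i, j] = weights @ neighbors
    return output


# Toy usage: an 8x8 grid of 16-dimensional tokens where one token changes.
rng = np.random.default_rng(0)
prev = rng.normal(size=(8, 8, 16))
curr = prev.copy()
curr[2, 3] += 1.0
mask = select_event_tokens(prev, curr, threshold=0.5)
cached = np.zeros_like(prev)  # stand-in for the previous frame's block output
out = sparse_local_attention(curr, cached, mask, radius=1)
```

In this toy setup the cost per frame scales with the number of event tokens and the neighborhood size rather than with the full grid, which is the kind of saving the bullets describe.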

Main Results:

  • RetinaViT significantly improves the speed/accuracy trade-off on various tasks.
  • For action recognition, it reduces inference time by up to 61.9% (CPU) and 50.8% (GPU).
  • Accuracy is maintained or slightly improved compared to baseline models.
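
To put the reported reductions in perspective, a time reduction can be converted into an equivalent speedup factor. The helper below is a small arithmetic sketch; the function name is hypothetical, and it assumes the percentages are reductions relative to the baseline inference time.

```python
def speedup_from_time_reduction(reduction_percent):
    """Convert a percentage reduction in inference time into a speedup factor."""
    remaining_fraction = 1.0 - reduction_percent / 100.0
    return 1.0 / remaining_fraction


cpu_speedup = speedup_from_time_reduction(61.9)  # roughly 2.6x
gpu_speedup = speedup_from_time_reduction(50.8)  # roughly 2.0x
```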

Conclusions:

  • RetinaViT offers a substantial efficiency improvement for online video understanding.
  • The method effectively addresses the bottleneck of frame-level feature extraction.
  • RetinaViT demonstrates practical benefits for real-world video analysis applications.