Variable Temporal Length Training for Action Recognition CNNs.

Tan-Kun Li¹, Kwok-Leung Chan¹, Tardi Tjahjadi²

  • ¹Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China.

Sensors (Basel, Switzerland) | June 19, 2024
Summary
This summary is machine-generated.

Deep learning models struggle with variable video lengths. Variable Length Training (VLT) for 3D-CNNs enables flexible processing of videos with diverse temporal dimensions, improving action recognition performance.

Keywords:
action recognition, deep learning, representation learning, video classification

Area of Science:

  • Computer Vision
  • Deep Learning
  • Artificial Intelligence

Background:

  • Current deep learning models, particularly for computer vision, exhibit limited flexibility regarding input shape, often requiring fixed dimensions for optimal performance.
  • Video analysis tasks face challenges due to the inherent variability in video lengths (number of frames), necessitating frame sampling techniques that can degrade feature quality and hinder adaptability.
  • Standard training methods can damage features in longer videos and prevent models from flexibly adapting to variable lengths for on-demand inference.
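
The frame sampling mentioned above can be illustrated with a minimal sketch. It assumes uniform sampling to a fixed clip length, which is one common convention and not necessarily the scheme used in the paper; the function name `uniform_sample_indices` and the clip length of 8 are hypothetical.

```python
def uniform_sample_indices(num_frames, clip_len):
    """Pick clip_len frame indices spread evenly over num_frames frames.

    Assumption: uniform sampling, shown only to illustrate why a fixed
    clip length discards most of a long video.
    """
    if num_frames <= 0 or clip_len <= 0:
        raise ValueError("num_frames and clip_len must be positive")
    # Take the centre of each of clip_len equal-width segments.
    return [
        min(num_frames - 1, int((i + 0.5) * num_frames / clip_len))
        for i in range(clip_len)
    ]


short_clip = uniform_sample_indices(20, 8)
long_clip = uniform_sample_indices(300, 8)
print(short_clip)  # [1, 3, 6, 8, 11, 13, 16, 18]
print(long_clip)   # [18, 56, 93, 131, 168, 206, 243, 281]
```

With the same clip length, the 20-frame video keeps 8 of 20 frames while the 300-frame video keeps 8 of 300, so the gap between sampled frames grows with video length.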

Purpose of the Study:

  • To propose a novel training paradigm, Variable Length Training (VLT), for 3D Convolutional Neural Networks (3D-CNNs).
  • To enable 3D-CNNs to effectively process videos with variable temporal lengths without performance degradation.
  • To enhance the flexibility and adaptability of deep learning models for video-related tasks.

Main Methods:

  • Introduced Variable Length Training (VLT) for 3D-CNNs, incorporating three additional training operations: sampling twice, temporal packing, and subvideo-independent 3D convolution.
  • Integrated these efficient operations into existing 3D-CNN architectures.
  • Implemented a consistency loss to regularize the representation space, further enhancing model robustness.
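
The summary does not describe how these operations are implemented, so the following is only a toy sketch of one plausible reading. It assumes that temporal packing concatenates subvideos along the time axis, that a subvideo-independent convolution ignores frames belonging to a different subvideo, and that the consistency loss is a mean squared difference between two feature vectors. Frames are reduced to single numbers and the convolution is 1D over time; all function names are hypothetical.

```python
def temporal_pack(subvideos):
    """Concatenate subvideos along time; also return a per-frame subvideo id."""
    frames, seg_ids = [], []
    for seg, clip in enumerate(subvideos):
        frames.extend(clip)
        seg_ids.extend([seg] * len(clip))
    return frames, seg_ids


def subvideo_independent_conv1d(frames, seg_ids, kernel):
    """Temporal convolution (odd kernel, same output length) that treats
    frames from a different subvideo as zero padding."""
    half = len(kernel) // 2
    out = []
    for t in range(len(frames)):
        acc = 0.0
        for k, w in enumerate(kernel):
            s = t + k - half
            # Only accumulate neighbours inside the same subvideo.
            if 0 <= s < len(frames) and seg_ids[s] == seg_ids[t]:
                acc += w * frames[s]
        out.append(acc)
    return out


def consistency_loss(feat_a, feat_b):
    """Assumed form: mean squared difference between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)) / len(feat_a)


clips = [[1, 2, 3], [10, 20]]
kernel = [1, 1, 1]

frames, seg_ids = temporal_pack(clips)
packed_out = subvideo_independent_conv1d(frames, seg_ids, kernel)

# Reference: convolve each clip on its own and concatenate the results.
separate_out = []
for clip in clips:
    separate_out.extend(subvideo_independent_conv1d(clip, [0] * len(clip), kernel))

print(packed_out)  # [3.0, 6.0, 5.0, 30.0, 30.0]
print(packed_out == separate_out)  # True
print(consistency_loss([1.0, 2.0], [1.0, 4.0]))  # 2.0
```

The point of the sketch is the equality check: packing clips of different lengths into one sequence gives the same result as processing each clip alone, provided the convolution never mixes frames across subvideo boundaries.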

Main Results:

  • The proposed VLT method allows trained models to process videos of varying temporal lengths during inference without any architectural modifications.
  • Experiments on popular action recognition datasets demonstrated superior performance compared to conventional training paradigms.
  • The method showed improved results over other state-of-the-art training approaches for variable length video processing.
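
A minimal sketch of why a network can accept different temporal lengths without architectural changes, assuming it ends in global average pooling over time (a common design in 3D-CNNs; the summary does not state which mechanism the paper's models use). Per-frame features are plain lists here and the function name is hypothetical.

```python
def temporal_global_average_pool(features):
    """Average a T x C list of per-step feature vectors into one C-vector."""
    if not features:
        raise ValueError("need at least one time step")
    channels = len(features[0])
    return [
        sum(step[c] for step in features) / len(features)
        for c in range(channels)
    ]


# Two inputs with different temporal lengths (T=2 and T=5), same channel count.
short_feats = [[1.0, 2.0], [3.0, 4.0]]
long_feats = [[float(t), 1.0] for t in range(5)]

print(temporal_global_average_pool(short_feats))  # [2.0, 3.0]
print(temporal_global_average_pool(long_feats))   # [2.0, 1.0]
```

Both outputs have the same size regardless of T, so a classifier placed after the pooling step sees a fixed-length vector for any video length.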

Conclusions:

  • Variable Length Training (VLT) offers a simple yet effective solution for deep learning models to handle variable length video inputs.
  • The VLT paradigm enhances model flexibility, adaptability, and performance in video analysis tasks, particularly action recognition.
  • This approach overcomes the limitations of fixed-length input requirements in current deep learning models for video processing.