Updated: May 24, 2025

A Review of Deep Learning for Video Captioning.

Moloud Abdar, Meenakshi Kollati, Swaraja Kuraparthi

IEEE Transactions on Pattern Analysis and Machine Intelligence

March 3, 2025

Summary

This summary is machine-generated.

This survey reviews deep learning methods for video captioning (VC), a technique that describes video content in natural language. It covers architectures, datasets, and future research directions for advancing VC applications.

Area of Science:

Artificial Intelligence
Computer Vision
Natural Language Processing

Background:

Video captioning (VC) is a multidisciplinary research area that combines computer vision and natural language processing.
VC aims to generate natural language descriptions for video content.
Applications include accessibility, video retrieval, and question answering.

Purpose of the Study:

To provide a comprehensive review of deep learning-based video captioning methods.
To categorize and discuss various VC approaches.
To identify research gaps and future directions in the field.

Main Methods:

Overview of problem formulation, evaluation metrics, and training losses.
Categorization of VC methods including attention-based architectures, graph networks, reinforcement learning, adversarial networks, and dense video captioning.
Review of existing datasets for video captioning.
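
The summary does not say which evaluation metrics the survey covers. As an illustration only, the sketch below computes clipped n-gram precision, the core ingredient of BLEU, a metric commonly reported for captioning. The function names and the example captions are hypothetical and are not taken from the survey.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, references, n):
    """Clipped n-gram precision of one candidate caption against references.

    Each candidate n-gram is credited at most as many times as it occurs
    in any single reference caption.
    """
    cand_counts = Counter(ngrams(candidate.split(), n))
    if not cand_counts:
        return 0.0

    # Largest count of each n-gram over all reference captions.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref.split(), n)).items():
            max_ref[gram] = max(max_ref[gram], count)

    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


# Hypothetical captions for a single video clip.
candidate = "a man is playing a guitar"
references = ["a man plays the guitar", "someone is playing a guitar"]

p1 = clipped_precision(candidate, references, 1)  # unigram precision
p2 = clipped_precision(candidate, references, 2)  # bigram precision
print(f"unigram precision: {p1:.3f}, bigram precision: {p2:.3f}")
```

Full BLEU additionally combines several n-gram orders and applies a brevity penalty. This sketch omits both and uses whitespace tokenization for simplicity.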

Main Results:

Detailed discussion of different deep learning-based VC architectures.
Analysis of current datasets and their suitability for VC tasks.
Identification of key research gaps and emerging trends.

Conclusions:

Deep learning has significantly advanced video captioning capabilities.
Further research is needed in areas like complex scene understanding and long-form video description.
This survey serves as a guide for researchers in video captioning and related fields.