Video Captioning by Adversarial LSTM.

Yang Yang, Jie Zhou, Jiangbo Ai

IEEE Transactions on Image Processing : a Publication of the IEEE Signal Processing Society

July 17, 2018

Summary

This summary is machine-generated.

This study introduces an LSTM-GAN model for video captioning that uses adversarial learning to improve caption accuracy and reduce the error accumulation common in Long Short-Term Memory (LSTM) networks.

Area of Science:

Artificial Intelligence
Computer Vision
Natural Language Processing

Background:

Long Short-Term Memory (LSTM) networks are effective for video captioning because they model temporal dependencies in sequential data.
However, LSTM-based methods often suffer from significant error accumulation.
Existing video captioning techniques require improvement in accuracy and robustness.

Purpose of the Study:

To propose a novel video captioning approach that addresses the limitations of current LSTM-based methods.
To reduce error accumulation in generated video captions.
To enhance the accuracy and quality of automated video descriptions.

Main Methods:

A Generative Adversarial Network (GAN) architecture was adopted, comprising a generator and a discriminator.
The generator utilizes an LSTM network for caption generation from video content.
A novel discriminator was developed, accepting both sentences and video features to improve accuracy.
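The adversarial setup described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear discriminator, the standard GAN log-losses, and every name and number here (`discriminator_score`, `weights`, the toy vectors) are hypothetical. The LSTM generator is treated as a black box that has already produced a sentence embedding. The point it shows is that the discriminator scores a caption jointly with the video features, so a fluent caption that does not match the video can still be rejected.

```python
import math

EPS = 1e-12  # guards against log(0); an illustrative choice


def sigmoid(x):
    """Squash a real-valued logit into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def discriminator_score(video_feat, sentence_emb, weights, bias):
    """Return P(caption is real | video) from a toy linear discriminator.

    Both inputs are used: the video feature vector and the sentence
    embedding are concatenated before scoring (hypothetical design).
    """
    joint = list(video_feat) + list(sentence_emb)
    if len(joint) != len(weights):
        raise ValueError("weights must match the joint feature length")
    logit = sum(w * x for w, x in zip(weights, joint)) + bias
    return sigmoid(logit)


def discriminator_loss(p_real, p_fake):
    """Standard GAN discriminator loss: reward high p_real and low p_fake."""
    return -(math.log(p_real + EPS) + math.log(1.0 - p_fake + EPS))


def generator_loss(p_fake):
    """Standard (non-saturating) generator loss: reward fooling the discriminator."""
    return -math.log(p_fake + EPS)


# Hypothetical toy inputs; real systems would use CNN video features and
# learned sentence encodings of much higher dimension.
video = [0.2, -0.1, 0.4]
real_caption = [0.9, 0.3]        # embedding of a human-written caption
generated_caption = [-0.5, 0.1]  # embedding of the generator's output
weights = [0.5, 0.5, 0.5, 2.0, 1.0]
bias = 0.0

p_real = discriminator_score(video, real_caption, weights, bias)
p_fake = discriminator_score(video, generated_caption, weights, bias)

print(f"D(video, real caption)      = {p_real:.3f}")
print(f"D(video, generated caption) = {p_fake:.3f}")
print(f"discriminator loss          = {discriminator_loss(p_real, p_fake):.3f}")
print(f"generator loss              = {generator_loss(p_fake):.3f}")
```

In training, the two losses would be minimized in alternation: the discriminator learns to separate real from generated captions for a given video, and the generator is updated so that its captions receive higher scores.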

Main Results:

In experiments on public datasets, the proposed LSTM-GAN architecture significantly outperforms existing video captioning methods.
The adversarial approach effectively mitigates error accumulation.

Conclusions:

The LSTM-GAN model offers a significant advancement in video captioning technology.
Adversarial learning provides a robust mechanism for improving caption accuracy.
This novel approach sets a new benchmark for automated video description generation.