
Updated: Jul 29, 2025


Video Captioning Using Global-Local Representation.

Liqi Yan¹, Siqi Ma², Qifan Wang³

¹Fudan University, China; Westlake University, China; Rochester Institute of Technology, USA.

IEEE Transactions on Circuits and Systems for Video Technology: a Publication of the Circuits and Systems Society

May 22, 2023

Summary

This summary is machine-generated.

This study introduces a global-local representation (GLR) framework for improved video captioning. The GLR framework enhances sentence generation by effectively modeling global and local visual information, outperforming existing methods.

Keywords:

Computer vision; natural language processing; video captioning; video representation; visual analysis


Area of Science:

Artificial Intelligence
Computer Vision
Natural Language Processing

Background:

Video captioning requires transforming visual data into coherent text.
Current methods struggle to integrate global and local visual features for effective sentence generation.

Purpose of the Study:

To propose a novel Global-Local Representation (GLR) framework for video captioning.
To enhance the modeling of vision-language connections in video understanding.

Main Methods:

Developed a GLR framework that draws on visual representations taken from different temporal ranges of a video.
Introduced a novel global-local encoder to process long-range, short-range, and keyframe video data.
Implemented a progressive training strategy for optimized feature learning.
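
To make the three kinds of input named above more concrete, here is a minimal sketch of how a clip might be split into long-range, short-range, and keyframe views before encoding. It is an illustration only, not the authors' implementation: the function name `sample_ranges`, the parameters `num_long` and `short_window`, and the choice of the middle frame as the keyframe are all assumptions made for this example.

```python
from typing import Dict, List


def sample_ranges(num_frames: int, num_long: int = 8,
                  short_window: int = 4) -> Dict[str, List[int]]:
    """Pick frame indices for three hypothetical views of one clip."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")

    # Long-range view (assumption): indices spread evenly over the whole clip.
    n = min(num_long, num_frames)
    long_range = [(i * num_frames) // n for i in range(n)]

    # Keyframe view (assumption): the middle frame stands in for a real
    # keyframe selector, which the summary does not describe.
    key = num_frames // 2

    # Short-range view (assumption): a contiguous window around the keyframe,
    # clipped to the bounds of the clip.
    start = max(0, key - short_window // 2)
    end = min(num_frames, start + short_window)
    short_range = list(range(start, end))

    return {"long_range": long_range,
            "short_range": short_range,
            "keyframe": [key]}


views = sample_ranges(num_frames=32)
print(views)
```

In a full pipeline, each list of indices would select frames for a visual feature extractor, and the resulting features would be passed to the encoder; those stages are outside the scope of this sketch.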

Main Results:

The GLR framework significantly outperforms state-of-the-art methods on MSR-VTT and MSVD datasets.
Achieved superior performance compared to a well-tuned SA-LSTM baseline.
Required shorter training schedules than existing approaches.

Conclusions:

The proposed GLR framework offers a simple yet effective approach to video captioning.
GLR provides a richer semantic understanding of video content across frames.
The framework shows potential as a strong baseline for various video understanding tasks.