Video Captioning Using Global-Local Representation.

Liqi Yan1, Siqi Ma2, Qifan Wang3

  • 1 Fudan University, China; Westlake University, China; Rochester Institute of Technology, USA.

IEEE Transactions on Circuits and Systems for Video Technology: a Publication of the Circuits and Systems Society | May 22, 2023
Summary
This summary is machine-generated.

This study introduces a global-local representation (GLR) framework for improved video captioning. The GLR framework enhances sentence generation by effectively modeling global and local visual information, outperforming existing methods.

Keywords:
Computer vision, natural language processing, video captioning, video representation, visual analysis

Area of Science:

  • Artificial Intelligence
  • Computer Vision
  • Natural Language Processing

Background:

  • Video captioning requires transforming visual data into coherent text.
  • Current methods struggle to integrate global and local visual features for effective sentence generation.

Purpose of the Study:

  • To propose a novel Global-Local Representation (GLR) framework for video captioning.
  • To enhance the modeling of vision-language connections in video understanding.

Main Methods:

  • Developed a GLR framework that draws on vision representations taken from different ranges of a video.
  • Introduced a novel global-local encoder to process long-range, short-range, and keyframe video data.
  • Implemented a progressive training strategy for optimized feature learning.
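
The idea of feeding an encoder long-range, short-range, and keyframe views of the same video can be illustrated with a minimal frame-sampling sketch. This is not the authors' implementation: the function name `sample_global_local`, the parameters `num_long` and `short_window`, and the choice of the middle frame as the keyframe are all hypothetical and chosen only for illustration.

```python
def sample_global_local(num_frames, num_long=8, short_window=4):
    """Return frame indices for three illustrative views of one video.

    All names and defaults here are assumptions, not taken from the paper.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")

    # Long-range view: indices spread evenly over the whole clip.
    num_long = min(num_long, num_frames)
    step = num_frames / num_long
    long_range = [int(step * i + step / 2) for i in range(num_long)]

    # Keyframe: the middle frame, a stand-in for a real keyframe selector.
    keyframe = num_frames // 2

    # Short-range view: a contiguous window around the keyframe,
    # clipped to the bounds of the clip.
    start = max(0, keyframe - short_window // 2)
    end = min(num_frames, start + short_window)
    short_range = list(range(start, end))

    return {
        "long_range": long_range,
        "short_range": short_range,
        "keyframe": keyframe,
    }


views = sample_global_local(num_frames=64)
print(views)
```

In a full pipeline, each set of indices would select frames whose features are passed to the encoder. Keeping the three views separate makes it easy to vary how much global versus local context the model sees.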

Main Results:

  • The GLR framework significantly outperforms state-of-the-art methods on MSR-VTT and MSVD datasets.
  • Achieved superior performance compared to a well-tuned SA-LSTM baseline.
  • Required shorter training schedules than existing approaches.

Conclusions:

  • The proposed GLR framework offers a simple yet effective approach to video captioning.
  • GLR provides a richer semantic understanding of video content across frames.
  • The framework shows potential as a strong baseline for various video understanding tasks.