Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos.

Zongmeng Zhang, Xianjing Han, Xuemeng Song

IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society

September 24, 2021

Summary

This summary is machine-generated.

This study introduces a Multi-modal Interaction Graph Convolutional Network (MIGCN) for temporal language localization in videos. By modeling intra-modal relationships alongside inter-modal interactions, the model improves localization accuracy and strengthens the semantic matching between video content and language queries.

Area of Science:

Computer Science
Artificial Intelligence
Machine Learning

Background:

Temporal language localization in videos requires understanding video content and natural language queries.
Accurate semantic correspondence between video moments and text queries is challenging.
Existing methods often overlook intra-modal relationships within videos and text.

Purpose of the Study:

To propose a novel Multi-modal Interaction Graph Convolutional Network (MIGCN) for temporal language localization.
To jointly model intra-modal relations and inter-modal interactions for better video-text understanding.
To develop an adaptive, context-aware localization method for precise moment identification.

Main Methods:

Utilizing a Multi-modal Interaction Graph Convolutional Network (MIGCN).
Incorporating intra-modal semantic similarities (video clips) and syntactic dependencies (query words).
Employing an adaptive context-aware localization strategy with multi-scale fully connected layers for boundary refinement.
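
The graph construction described in the methods above can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`build_adjacency`, `graph_conv`), the similarity threshold, the shared embedding dimension for clips and words, the cosine weighting of clip-word edges, and the use of plain neighbor averaging in place of learned graph-convolution weights are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def build_adjacency(clips, words, dep_edges, sim_threshold=0.5):
    """Build one graph over clip nodes (indexed first) and word nodes (indexed after).

    clips, words: lists of feature vectors, assumed to share one dimension.
    dep_edges: (i, j) pairs of word indices linked by a syntactic dependency.
    sim_threshold: hypothetical cutoff for clip-clip edges (assumed positive).
    """
    n_c, n_w = len(clips), len(words)
    n = n_c + n_w
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0  # self-loop so each node keeps its own feature
    # Intra-modal (video): link clips whose features are semantically similar.
    for i in range(n_c):
        for j in range(i + 1, n_c):
            s = cosine(clips[i], clips[j])
            if s >= sim_threshold:
                adj[i][j] = adj[j][i] = s
    # Intra-modal (query): link words that share a syntactic dependency.
    for a, b in dep_edges:
        adj[n_c + a][n_c + b] = adj[n_c + b][n_c + a] = 1.0
    # Inter-modal: link every clip to every word, weighted by similarity.
    for i in range(n_c):
        for j in range(n_w):
            s = max(0.0, cosine(clips[i], words[j]))
            adj[i][n_c + j] = adj[n_c + j][i] = s
    return adj


def graph_conv(adj, feats):
    """One propagation step: each node becomes the weighted mean of its neighbors."""
    dim = len(feats[0])
    out = []
    for row in adj:
        total = sum(row)  # at least 1.0 because of the self-loop
        out.append([sum(w * f[d] for w, f in zip(row, feats)) / total
                    for d in range(dim)])
    return out


# Toy example: three clips, two query words, one dependency between the words.
clips = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
words = [[1.0, 0.0], [0.0, 1.0]]
adj = build_adjacency(clips, words, dep_edges=[(0, 1)])
updated = graph_conv(adj, clips + words)
```

In this toy graph the first two clips are connected to each other because their features are similar, and the two words are connected through their dependency edge, so a single propagation step mixes information both within and across modalities.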

Main Results:

Demonstrated promising performance on the Charades-STA and ActivityNet datasets.
Achieved superior efficiency compared to existing models.
The MIGCN effectively captures complex relationships for accurate temporal localization.
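
The summary does not state how localization accuracy was measured. As a hedged illustration only, the sketch below uses temporal intersection-over-union (IoU) between a predicted moment and a ground-truth moment, a measure commonly used for this kind of task. The function names, the 0.5 threshold, and the toy timestamps are assumptions and not results from the study.

```python
def temporal_iou(pred, gt):
    """IoU of two time intervals given as (start, end) in seconds."""
    ps, pe = pred
    gs, ge = gt
    inter = max(0.0, min(pe, ge) - max(ps, gs))
    union = (pe - ps) + (ge - gs) - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(preds, gts, threshold=0.5):
    """Fraction of queries whose top prediction reaches the IoU threshold."""
    if not gts:
        return 0.0
    hits = sum(1 for p, g in zip(preds, gts) if temporal_iou(p, g) >= threshold)
    return hits / len(gts)


# Toy example: one partially overlapping prediction and one exact match.
preds = [(0.0, 10.0), (2.0, 8.0)]
gts = [(5.0, 15.0), (2.0, 8.0)]
score = recall_at_1(preds, gts)
```

Here the first prediction overlaps its ground truth for 5 of 15 seconds of combined extent and misses the assumed threshold, while the second matches exactly, so half of the queries count as correctly localized.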

Conclusions:

The proposed MIGCN model significantly advances temporal language localization in videos.
The integration of intra-modal and inter-modal information is crucial for robust video-text understanding.
The adaptive localization method refines moment boundaries effectively, showcasing the model's practical applicability.