Updated: Sep 12, 2025

Global and Local Semantic Completion Learning for Vision-Language Pre-Training.

Rong-Cheng Tu, Yatai Ji, Jie Jiang

IEEE Transactions on Pattern Analysis and Machine Intelligence

August 6, 2025

Summary

This summary is machine-generated.

This study introduces Global and Local Semantic Completion Learning (GLSCL) for vision-language pre-training (VLP) models. GLSCL enhances cross-modal alignment by simultaneously learning global and local features, improving performance on various benchmarks.

Area of Science:

Artificial Intelligence
Computer Vision
Natural Language Processing

Background:

Cross-modal alignment is vital for vision-language pre-training (VLP) models.
Existing masked modeling tasks in VLP primarily focus on local-local alignment, neglecting global semantic features.
This limitation hinders the alignment of global representations with local features from other modalities.

Purpose of the Study:

To propose a novel Global and Local Semantic Completion Learning (GLSCL) task for VLP.
To simultaneously facilitate global-local and local-local alignment.
To enhance the cross-modal alignment capabilities of VLP models.

Main Methods:

Introduced the GLSCL task, comprising masked global semantic completion (MGSC) and masked local token completion (MLTC).
MGSC focuses on learning representative global features, while MLTC reconstructs modal-fusion local tokens.
Developed a flexible vision encoder for simultaneous image-text and video-text tasks and introduced the ALIGN-BENCH validation benchmark.
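
The two objectives above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the summary does not say which loss is used, so an InfoNCE-style contrastive loss is assumed here, and `toy_fuse`, `mask_tokens`, `mean_pool`, and `glscl_losses` are hypothetical names. Features recovered from a masked input are pulled toward the features of the complete input, once at the global level (MGSC) and once at the masked token positions (MLTC).

```python
import math
import random


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(recovered, targets, temperature=0.07):
    """Assumed contrastive loss: recovered[i] should match targets[i] and no other target."""
    rec = [l2_normalize(v) for v in recovered]
    tgt = [l2_normalize(v) for v in targets]
    total = 0.0
    for i, r in enumerate(rec):
        logits = [sum(a * b for a, b in zip(r, t)) / temperature for t in tgt]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum - logits[i]
    return total / len(rec)


def mean_pool(tokens):
    """Average token vectors into one global feature vector."""
    dim = len(tokens[0])
    return [sum(tok[d] for tok in tokens) / len(tokens) for d in range(dim)]


def mask_tokens(tokens, ratio, rng):
    """Zero out a random subset of tokens; return the masked copy and the masked indices."""
    n_mask = max(1, int(len(tokens) * ratio))
    idx = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = [[0.0] * len(t) if i in idx else list(t) for i, t in enumerate(tokens)]
    return masked, idx


def toy_fuse(text_tokens, image_tokens):
    """Hypothetical stand-in for a fusion encoder: add pooled image context to each text token."""
    context = mean_pool(image_tokens)
    return [[t + c for t, c in zip(tok, context)] for tok in text_tokens]


def glscl_losses(batch, ratio=0.4, seed=0):
    """Return (mgsc_loss, mltc_loss) for a batch of (text_tokens, image_tokens) pairs."""
    rng = random.Random(seed)
    global_rec, global_tgt, local_rec, local_tgt = [], [], [], []
    for text_tokens, image_tokens in batch:
        complete = toy_fuse(text_tokens, image_tokens)
        masked_text, idx = mask_tokens(text_tokens, ratio, rng)
        recovered = toy_fuse(masked_text, image_tokens)
        global_rec.append(mean_pool(recovered))  # MGSC: global feature of the masked input
        global_tgt.append(mean_pool(complete))   # target: global feature of the complete input
        for i in idx:                            # MLTC: compare masked positions only
            local_rec.append(recovered[i])
            local_tgt.append(complete[i])
    return info_nce(global_rec, global_tgt), info_nce(local_rec, local_tgt)


if __name__ == "__main__":
    rng = random.Random(1)
    batch = [
        ([[rng.gauss(0, 1) for _ in range(8)] for _ in range(5)],
         [[rng.gauss(0, 1) for _ in range(8)] for _ in range(6)])
        for _ in range(4)
    ]
    mgsc, mltc = glscl_losses(batch)
    print(f"MGSC loss: {mgsc:.4f}  MLTC loss: {mltc:.4f}")
```

In this sketch only the text side is masked and the image side supplies the context for recovery. A symmetric variant would mask image or video tokens and recover them from the text. The masking ratio, temperature, and zero-vector mask are illustrative choices rather than values taken from the study.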

Main Results:

The proposed GLSCL task effectively completes missing semantics and recovers global and local features through cross-modal interactions.
MGSC strengthens global feature representations, which in turn improves downstream task performance.
MLTC improves the comprehension of multimodal data by reconstructing local tokens.

Conclusions:

GLSCL significantly improves cross-modal alignment by addressing both global and local feature learning.
The method achieves state-of-the-art performance on diverse vision-language benchmarks, including visual question answering and image-text retrieval.
The flexible vision encoder and ALIGN-BENCH contribute to evaluating and advancing VLP models.