
Vision-Language Collaborative Representation Learning for Action Quality Assessment.

Kumie Gedamu, Yanli Ji, Wangmeng Zuo

    IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society | April 17, 2026

    Summary
    This summary is machine-generated.

    This study introduces Vision-Language Collaborative Representation Learning (VLC-Net) for Action Quality Assessment (AQA). By unifying vision and language features, VLC-Net improves fine-grained action understanding and score-prediction accuracy.


    Area of Science:

    • Computer Vision and Machine Learning
    • Multimodal AI
    • Action Recognition and Understanding

    Background:

    • Action Quality Assessment (AQA) is crucial for real-world applications requiring detailed action sequence comprehension.
    • Existing multimodal approaches for AQA often suffer from instability and suboptimal performance due to directional bias in vision-language embedding spaces.
    • Reliance solely on textual information from language models limits the effectiveness of current AQA methods.

    Purpose of the Study:

    • To propose a novel Vision-Language Collaborative Representation Learning approach (VLC-Net) for accurate AQA score prediction.
    • To develop a unified feature representation that captures temporal dependencies in fine-grained action sequences.
    • To overcome the limitations of existing methods by addressing directional bias and improving multimodal feature integration.

    Main Methods:

    • Implemented a bidirectional knowledge distillation operation for collaborative learning between pre-trained vision-language models and visual action knowledge.
    • Designed vision-language alignment guidance to explicitly align action features with shared semantic meanings across modalities.
    • Applied multimodal contrastive learning to the aligned features to strengthen the correspondence between modalities and between subactions and their textual descriptions.
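
The distillation and contrastive steps listed above can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the summary gives no loss formulas, so the symmetric KL divergence, the InfoNCE-style contrastive loss, the temperature values, and every function name below are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def bidirectional_distillation_loss(vision_logits, language_logits, temperature=2.0):
    """Assumed form: symmetric KL, so each branch acts as teacher for the other."""
    p_v = softmax(vision_logits, temperature)
    p_l = softmax(language_logits, temperature)
    return 0.5 * (kl_divergence(p_v, p_l) + kl_divergence(p_l, p_v))


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def symmetric_contrastive_loss(vision_feats, text_feats, temperature=0.07):
    """Assumed form: InfoNCE averaged over both directions.

    Item i of vision_feats and item i of text_feats are treated as a matched pair.
    """
    n = len(vision_feats)
    sims = [[cosine(v, t) / temperature for t in text_feats] for v in vision_feats]
    loss = 0.0
    for i in range(n):
        row = softmax(sims[i])                         # vision -> text direction
        col = softmax([sims[j][i] for j in range(n)])  # text -> vision direction
        loss -= 0.5 * (math.log(row[i]) + math.log(col[i]))
    return loss / n
```

Averaging the two directions in both losses is one simple way to avoid favoring either modality, which is the kind of directional bias the summary says existing methods suffer from; how VLC-Net actually handles it is not specified here.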

    Main Results:

    • VLC-Net demonstrated superior performance in fine-grained action sequence understanding and AQA score prediction.
    • The proposed methods effectively unified joint representations by aligning features across vision and language modalities.
    • Experiments on four datasets (FineDiving, MTL-AQA, FineFS, Fis-V) showed significant improvements over state-of-the-art methods.
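
The summary does not say which metric these improvements were measured with. AQA benchmarks are commonly scored with Spearman's rank correlation between predicted scores and judges' scores, so the sketch below assumes that metric; the function names are hypothetical and no figures from the paper are reproduced.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window across all entries tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(predicted, judged):
    """Pearson correlation computed on the ranks of the two score lists."""
    rp, rj = average_ranks(predicted), average_ranks(judged)
    mp, mj = sum(rp) / len(rp), sum(rj) / len(rj)
    cov = sum((a - mp) * (b - mj) for a, b in zip(rp, rj))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_j = sum((b - mj) ** 2 for b in rj)
    return cov / (var_p * var_j) ** 0.5
```

A value near 1 means the model ranks performances in almost the same order as the judges, even when the absolute scores differ.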

    Conclusions:

    • VLC-Net effectively addresses the challenges of directional bias in vision-language embedding spaces for AQA.
    • The approach successfully learns unified representations of fine-grained actions by integrating visual and textual information.
    • The proposed method offers a robust and effective solution for accurate Action Quality Assessment, outperforming existing techniques.