
Context Disentangling and Prototype Inheriting for Robust Visual Grounding.

Wei Tang, Liang Li, Xuejing Liu

    IEEE Transactions on Pattern Analysis and Machine Intelligence
    |December 5, 2023
    Summary
    This summary is machine-generated.

    This study introduces a novel framework for robust visual grounding (VG) that improves target discrimination in images. The method enhances generalization to open-vocabulary scenes by disentangling context and inheriting prototypes.

    Area of Science:

    • Computer Vision
    • Artificial Intelligence
    • Machine Learning

    Background:

    • Visual grounding (VG) identifies image targets from language queries, but struggles with context discrimination and open-vocabulary generalization.
    • Existing methods often overlook contextual information, limiting performance when targets share categories or appear in novel scenes.

    Purpose of the Study:

    • To develop a robust visual grounding framework capable of handling both standard and open-vocabulary scenes.
    • To improve the discrimination of target objects by effectively utilizing contextual information.
    • To enhance generalization to novel objects and categories not seen during training.

    Main Methods:

    • The proposed framework combines two components for visual grounding: context disentangling and prototype inheriting.
    • Context disentangling separates referent and context features for improved discrimination.
    • Prototype inheriting utilizes a prototype bank to leverage seen data, particularly for open-vocabulary scenarios.
    • Fused features from disentangled linguistic and visual prototypes are processed by a Vision Transformer for bounding box regression.
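    The summary does not say how the prototype bank is queried, so the following is only a toy sketch of one plausible reading: look up the stored prototype most similar to a feature vector and blend the two. The function names, the cosine-similarity lookup, the blending weight `alpha`, and the tiny three-dimensional vectors are all assumptions made for illustration; they are not taken from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def inherit_prototype(feature, bank, alpha=0.5):
    """Hypothetical helper: blend a feature with its nearest prototype.

    bank maps a label to a prototype vector learned from seen data.
    alpha is an assumed mixing weight, not a value from the paper.
    """
    # Pick the prototype with the highest cosine similarity to the feature.
    name, proto = max(bank.items(), key=lambda kv: cosine(feature, kv[1]))
    # Linear blend of the input feature and the retrieved prototype.
    blended = [alpha * f + (1 - alpha) * p for f, p in zip(feature, proto)]
    return name, blended

# Toy prototype bank and query feature (made-up values).
bank = {"dog": [1.0, 0.0, 0.0], "car": [0.0, 1.0, 0.0]}
query = [0.8, 0.2, 0.0]
name, blended = inherit_prototype(query, bank)
print(name, blended)
```

    In this sketch, a feature from an unseen category would still be pulled toward the closest prototype learned from seen data, which is one simple way to picture how a prototype bank could help in open-vocabulary scenes.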

    Main Results:

    • The proposed method demonstrates superior performance in distinguishing targets, even those of the same category.
    • Context disentangling effectively enhances the discriminative power of visual features.
    • Prototype inheriting significantly improves performance in open-vocabulary visual grounding tasks.
    • Extensive experiments show that the method outperforms state-of-the-art approaches on both standard and open-vocabulary scenes.

    Conclusions:

    • The novel framework offers a robust solution for visual grounding across diverse scenes, including those with novel objects.
    • Context disentangling and prototype inheriting are key components for achieving high performance and generalization.
    • The approach advances the state-of-the-art in visual grounding, particularly for challenging open-vocabulary scenarios.