Context Disentangling and Prototype Inheriting for Robust Visual Grounding.

Wei Tang, Liang Li, Xuejing Liu

IEEE Transactions on Pattern Analysis and Machine Intelligence

December 5, 2023

Summary

This summary is machine-generated.

This study introduces a novel framework for robust visual grounding (VG) that improves target discrimination in images. The method enhances generalization to open-vocabulary scenes by disentangling context and inheriting prototypes.

Area of Science:

Computer Vision
Artificial Intelligence
Machine Learning

Background:

Visual grounding (VG) identifies image targets from language queries, but struggles with context discrimination and open-vocabulary generalization.
Existing methods often overlook contextual information, limiting performance when targets share categories or appear in novel scenes.

Purpose of the Study:

To develop a robust visual grounding framework capable of handling both standard and open-vocabulary scenes.
To improve the discrimination of target objects by effectively utilizing contextual information.
To enhance generalization to novel objects and categories not seen during training.

Main Methods:

The authors propose a framework that combines context disentangling and prototype inheriting for visual grounding.
Context disentangling separates referent and context features for improved discrimination.
Prototype inheriting utilizes a prototype bank to leverage seen data, particularly for open-vocabulary scenarios.
Fused features from disentangled linguistic and visual prototypes are processed by a Vision Transformer for bounding box regression.
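The idea of a prototype bank can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `PrototypeBank`, the running-mean update, the cosine-similarity lookup, and the toy two-dimensional features are all assumptions chosen for illustration, since the summary does not describe how the bank is built or queried.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class PrototypeBank:
    """Hypothetical bank that keeps one running-mean prototype per seen label."""

    def __init__(self):
        self.protos = {}  # label -> mean feature vector
        self.counts = {}  # label -> number of features averaged so far

    def update(self, label, feature):
        """Fold a new feature from seen data into the label's prototype."""
        n = self.counts.get(label, 0)
        if n == 0:
            self.protos[label] = list(feature)
        else:
            old = self.protos[label]
            self.protos[label] = [(o * n + f) / (n + 1) for o, f in zip(old, feature)]
        self.counts[label] = n + 1

    def inherit(self, feature):
        """Return (label, prototype) of the stored prototype most similar to the feature."""
        label = max(self.protos, key=lambda k: cosine(self.protos[k], feature))
        return label, self.protos[label]


# Toy usage: features from seen data populate the bank,
# then a new feature inherits the closest stored prototype.
bank = PrototypeBank()
bank.update("dog", [1.0, 0.0])
bank.update("dog", [0.8, 0.2])
bank.update("car", [0.0, 1.0])
label, proto = bank.inherit([0.9, 0.3])
print(label, proto)
```

In this sketch, a feature from an unseen scene reuses the nearest prototype learned from seen data, which is one plausible reading of how inherited prototypes could help in open-vocabulary settings.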

Main Results:

The proposed method demonstrates superior performance in distinguishing targets, even those of the same category.
Context disentangling effectively enhances the discriminative power of visual features.
Prototype inheriting significantly improves performance in open-vocabulary visual grounding tasks.
Extensive experiments show that the method outperforms state-of-the-art approaches on both standard and open-vocabulary scenes.
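The summary does not state which metric the experiments use. Because the method regresses a bounding box, one common way to score a prediction against a ground-truth box is intersection over union (IoU). The sketch below assumes boxes in `(x1, y1, x2, y2)` format; the function name `iou` and the example boxes are illustrative and are not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (0.0, 0.0, 10.0, 10.0)
ground_truth = (5.0, 0.0, 15.0, 10.0)
score = iou(predicted, ground_truth)
print(score)
```

A prediction is often counted as correct when its IoU with the ground truth exceeds a chosen threshold; the threshold, like the metric itself, is an assumption here.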

Conclusions:

The novel framework offers a robust solution for visual grounding across diverse scenes, including those with novel objects.
Context disentangling and prototype inheriting are key components for achieving high performance and generalization.
The approach advances the state-of-the-art in visual grounding, particularly for challenging open-vocabulary scenarios.