Updated: Feb 18, 2026

MTRAG: Multi-Target Referring and Grounding via Hybrid Semantic-Spatial Integration.

Yili Ren, Jinyang Du, Xi Liu

IEEE Transactions on Image Processing : a Publication of the IEEE Signal Processing Society

February 16, 2026

Summary

This summary is machine-generated.

This study introduces MTRAG, a novel framework for pixel-level multi-target referring and grounding. MTRAG enhances scene understanding by effectively combining semantic and spatial information for improved vision-language tasks.

Area of Science:

Computer Vision
Natural Language Processing
Artificial Intelligence

Background:

Fine-grained visual referring and grounding are essential for scene understanding and vision-language applications.
Existing multimodal large language models (MLLMs) struggle with fine-grained multi-target scenarios.

Purpose of the Study:

To propose MTRAG, a pixel-level framework for multi-target referring and grounding that addresses limitations in current MLLMs.
To enhance semantic-spatial collaboration for improved performance in complex visual tasks.

Main Methods:

Introduced a Channel Extension Mechanism (CEM) that extracts global and multi-region features without additional region extractors.
Developed a grounding branch for pixel-level grounding and a Hybrid Adapter (HA) to fuse semantic and spatial features.
Curated the MTRAG-D dataset and the MTR-Bench benchmark for systematic evaluation of multi-target referring.

Main Results:

MTRAG consistently outperforms strong baselines on both multi-target and single-target referring and grounding tasks.
The framework maintains competitive performance in image-level captioning.
The Hybrid Adapter demonstrates effective semantic-spatial alignment.

Conclusions:

MTRAG offers a robust solution for pixel-level multi-target referring and grounding.
The proposed methods significantly advance the capabilities of MLLMs in fine-grained visual understanding.
The accompanying MTR-Bench benchmark supports future research in multi-target visual tasks.