Task-aware cross-modal refinement and liquid fusion for text-visual grounding.

Zhirong Li¹, Changliang Wang², Yongheng Pang¹

  • ¹ Criminal Investigation Police University of China, Shenyang, China.

Frontiers in Artificial Intelligence | June 15, 2026
Summary
This summary is machine-generated.

This study introduces a Task-aware Liquid Cross-modal Network (TLCN) for visual grounding that narrows the semantic gap between visual and textual features while reducing the number of model parameters. The authors report superior performance and present the TLCN as an efficient, lightweight solution for object localization tasks.

Keywords:
cross-modal, human-robot interaction, liquid neural networks, multilevel grounding module, visual grounding

Area of Science:

  • Computer Vision
  • Artificial Intelligence
  • Machine Learning

Background:

  • Visual grounding, which is central to autonomous driving and human-robot interaction, faces three challenges: a semantic gap between modalities, large parameter counts, and insufficient cross-modal attention.
  • Existing models often process visual and textual data independently, leading to feature discrepancies and hindering performance on lightweight devices.
  • Single-level attention mechanisms limit the ability to capture complex interactions between image and text features.

Purpose of the Study:

  • To propose an efficient and lightweight Task-aware Liquid Cross-modal Network (TLCN) to address the limitations of current visual grounding models.
  • To reduce the semantic gap between visual and textual features through guided feature extraction.
  • To reduce the number of model parameters so the model is easier to deploy on resource-constrained devices.

Main Methods:

  • The TLCN utilizes a Feature Extraction Module (FEM) where text guides visual feature extraction, minimizing the semantic gap.
  • A Liquid Fusion Module (LFM) built on Liquid Neural Networks (LNNs) captures temporal dependencies and reduces the number of model parameters.
  • A Task-aware Cross-modal Refinement Module (TCRM), which combines second-level attention with Conv-Trans Blocks (CTBs), deepens the feature representation and captures cross-modal interactions; it is optimized with a KL divergence loss.
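
Two of the ingredients named above, a liquid neural network update and a KL divergence loss, can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the summary does not say which LNN variant the LFM uses, so the sketch assumes the commonly cited liquid time-constant (LTC) formulation with a fused Euler step, and every name in it (`ltc_step`, `kl_divergence`, `w_in`, `w_rec`, `tau`, `A`, `dt`) is hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function used as the cell's gating nonlinearity."""
    return 1.0 / (1.0 + math.exp(-z))


def ltc_step(x, inp, w_in, w_rec, bias, tau, A, dt=0.1):
    """One fused (semi-implicit) Euler step of a liquid time-constant cell.

    Assumed dynamics (LTC formulation, not confirmed by the source):
        dx/dt = -(1/tau + f) * x + f * A
    where f = sigmoid(W_in @ inp + W_rec @ x + bias) depends on the input,
    so the effective time constant changes with the data ("liquid").
    """
    n = len(x)
    new_x = []
    for i in range(n):
        pre = bias[i]
        pre += sum(w_in[i][j] * inp[j] for j in range(len(inp)))
        pre += sum(w_rec[i][j] * x[j] for j in range(n))
        f = sigmoid(pre)
        # Fused update: the decay term is treated implicitly for stability.
        new_x.append((x[i] + dt * f * A[i]) / (1.0 + dt * (1.0 / tau[i] + f)))
    return new_x


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


if __name__ == "__main__":
    # A single neuron with zero weights, stepped once from rest.
    state = ltc_step([0.0], [0.0], [[0.0]], [[0.0]], [0.0], [1.0], [1.0])
    print(state)
    # Divergence between a one-hot target and a uniform prediction.
    print(kl_divergence([1.0, 0.0], [0.5, 0.5]))
```

The fused step keeps the state bounded without a very small step size, which is one reason this cell is often described as compact; whether the LFM relies on the same property is not stated in the summary.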

Main Results:

  • The TLCN demonstrated superior performance on the RefCOCO, RefCOCO+, and RefCOCOg benchmarks.
  • The model also achieved excellent results on a specialized text localization task.
  • Experimental validation confirmed the effectiveness of the proposed modules in improving visual grounding accuracy.
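
The summary does not state how accuracy was measured. As a hedged illustration only, visual grounding results on RefCOCO-style benchmarks are commonly scored by counting a prediction as correct when its box overlaps the ground-truth box with an intersection over union (IoU) of at least 0.5. The sketch below assumes that convention and boxes given as `(x1, y1, x2, y2)`; the function names and the threshold are assumptions, not details taken from the paper.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes contribute no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def grounding_accuracy(preds, gts, threshold=0.5):
    """Fraction of predicted boxes whose IoU with the target meets the threshold."""
    if not preds:
        return 0.0
    hits = sum(1 for p, g in zip(preds, gts) if iou(p, g) >= threshold)
    return hits / len(preds)


if __name__ == "__main__":
    preds = [(0, 0, 2, 2), (0, 0, 2, 2)]
    gts = [(0, 0, 2, 2), (1, 1, 3, 3)]
    print(grounding_accuracy(preds, gts))
```

Some evaluation scripts use a strict inequality (IoU greater than 0.5) instead; the difference only matters for boxes that land exactly on the threshold.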

Conclusions:

  • The TLCN effectively bridges the semantic gap via text-guided visual feature extraction.
  • Integrating LNNs significantly reduces the number of model parameters, enabling lightweight deployment.
  • The proposed architecture successfully captures deep cross-modal interactions, providing a robust solution for visual grounding.