Semantics-Aware Spatial-Temporal Binaries for Cross-Modal Video Retrieval.

Mengshi Qi, Jie Qin, Yi Yang

    IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society | February 9, 2021
    PubMed
    Summary
    This summary is machine-generated.

    This study introduces Semantics-aware Spatial-temporal Binaries (STBin), a binary representation learning framework for retrieving videos with natural-language queries. STBin accounts for both spatial-temporal context and the semantic relationships between text and video to improve retrieval accuracy.

    Area of Science:

    • Computer Science
    • Artificial Intelligence
    • Information Retrieval

    Background:

    • Video-based social networks are growing exponentially, increasing demand for effective natural language video retrieval.
    • Existing methods often overlook temporal dynamics and semantic links between text queries and video actions.
    • The semantic correspondence between natural language and person-centric actions in videos remains underexplored.

    Purpose of the Study:

    • To propose a novel binary representation learning framework, Semantics-aware Spatial-temporal Binaries (STBin), for cross-modal video retrieval.
    • To address limitations in current approaches by incorporating spatial-temporal context and semantic relationships.
    • To improve the efficiency and effectiveness of generating binary codes for both videos and texts.

    Main Methods:

    • Developed a binary representation learning framework (STBin) that integrates spatial-temporal context and semantic relationships.
    • Employed an iterative optimization scheme with attribute-guided stochastic training to learn deep encoding functions.
    • Generated binary codes for videos and texts by exploiting semantic relationships between the two modalities.
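
To make the idea of retrieving with binary codes concrete, here is a minimal sketch of the general technique, not the STBin implementation. The summary does not say how the codes are compared at query time; Hamming distance is assumed here because it is the usual choice for binary codes. Thresholding at zero stands in for the learned deep encoding functions, and the function names (`binarize`, `hamming_rank`) and the toy embeddings are hypothetical.

```python
import numpy as np


def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Map real-valued embeddings to {0, 1} codes by thresholding at zero.

    Assumption: a simple sign threshold replaces the learned encoders.
    """
    return (embeddings > 0).astype(np.uint8)


def hamming_rank(query_code: np.ndarray, database_codes: np.ndarray) -> np.ndarray:
    """Return database indices ordered by Hamming distance to the query."""
    # Count the bit positions where each database code differs from the query.
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    # A stable sort keeps the original order among equally distant items.
    return np.argsort(distances, kind="stable")


# Hypothetical toy embeddings: three videos and one text query, 4 bits each.
video_embeddings = np.array([
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.8, -0.1, 0.3],
    [0.6, -0.4, -0.3, -0.9],
])
text_embedding = np.array([0.7, -0.1, 0.2, -0.5])

video_codes = binarize(video_embeddings)
text_code = binarize(text_embedding)
ranking = hamming_rank(text_code, video_codes)
print(ranking)  # video indices, closest code first
```

Because both modalities are mapped into the same binary space, a text query can be compared against every video code with cheap bitwise operations, which is where the efficiency of binary representations comes from.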

    Main Results:

    • The proposed STBin framework outperformed the methods it was compared against in cross-modal video retrieval tasks.
    • Experimental results on three video datasets confirmed the effectiveness of STBin over state-of-the-art methods.
    • STBin successfully captured visual pattern consistencies and temporal relationships across video frames.

    Conclusions:

    • STBin offers a significant advancement in cross-modal video retrieval by effectively combining spatial-temporal and semantic information.
    • The framework provides an efficient and effective solution for generating binary codes for video and text.
    • Future research can build upon STBin to further enhance video understanding and retrieval capabilities.