



Learning Local and Global Temporal Contexts for Video Semantic Segmentation.

Guolei Sun, Yun Liu, Henghui Ding

    IEEE Transactions on Pattern Analysis and Machine Intelligence | April 10, 2024
    Summary
    This summary is machine-generated.

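    This study introduces two methods for video semantic segmentation (VSS): Coarse-to-Fine Feature Mining (CFFM), which learns local temporal contexts from neighboring frames, and CFFM++, which additionally incorporates global temporal contexts drawn from frames sampled across the video. Both mine cross-frame features to improve segmentation of the target frame.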


    Area of Science:

    • Computer Vision
    • Artificial Intelligence
    • Machine Learning

    Background:

    • Contextual information is crucial for accurate video semantic segmentation (VSS).
    • Existing methods often focus on either local temporal contexts (LTC) or global temporal contexts (GTC), but rarely both.
    • Simultaneously learning static and motional contexts within LTC offers complementary benefits.

    Purpose of the Study:

    • To propose a unified approach for learning local temporal contexts (LTC) in VSS.
    • To introduce a method for incorporating global temporal contexts (GTC) to further improve VSS performance.
    • To enhance the accuracy and effectiveness of video semantic segmentation.

    Main Methods:

    • Coarse-to-Fine Feature Mining (CFFM) technique to learn a unified representation of LTC.
    • CFFM comprises Coarse-to-Fine Feature Assembling (CFFA) for abstracting static and motional contexts, and Cross-frame Feature Mining (CFM) for enhancing features from neighboring frames.
    • CFFM++ extends CFFM by incorporating GTC through k-means clustering of sampled frames and CFM for prototype refinement.
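The global-context step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it reads the summary as "cluster features from sampled frames into prototypes with k-means, then let the target frame's features attend to those prototypes". The function names `kmeans` and `cross_attend`, the tensor sizes, and the use of plain single-head attention without learned projections are all hypothetical choices made for the example.

```python
import numpy as np


def kmeans(features, k, iters=10, seed=0):
    """Plain Lloyd's k-means over (n, d) features; returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct feature vectors (fancy indexing copies).
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every feature to every centroid: shape (n, k).
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def cross_attend(queries, context):
    """Single-head scaled dot-product attention with a residual connection.

    Assumption: a stand-in for cross-frame mining; a real model would use
    learned query/key/value projections.
    """
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return queries + weights @ context


# Hypothetical sizes: sampled frames, tokens per frame, channels, prototypes.
T, N, D, K = 8, 16, 32, 4
rng = np.random.default_rng(0)
sampled = rng.normal(size=(T, N, D))  # features of frames sampled across the video
target = rng.normal(size=(N, D))      # features of the frame being segmented

prototypes = kmeans(sampled.reshape(-1, D), K)  # global temporal prototypes
refined = cross_attend(target, prototypes)      # target features mine the prototypes
print(prototypes.shape, refined.shape)
```

Clustering compresses the sampled frames into a handful of prototypes, so the attention cost scales with the number of prototypes rather than with the number of sampled tokens; that is the design motivation this sketch is meant to illustrate.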

    Main Results:

    • The proposed CFFM method effectively learns a unified representation of local temporal contexts.
    • CFFM++ demonstrates improved performance by additionally leveraging global temporal contexts.
    • Both CFFM and CFFM++ achieve state-of-the-art results on popular video semantic segmentation benchmarks.

    Conclusions:

    • Simultaneously learning static and motional contexts via CFFM significantly enhances VSS.
    • Integrating global temporal contexts with CFFM++ further boosts segmentation accuracy.
    • The proposed methods offer a more comprehensive approach to exploiting temporal information for VSS.