Scanpath Prediction in Panoramic Videos Via Expected Code Length Minimization.

Mu Li, Kanglong Fan, Kede Ma

    IEEE Transactions on Pattern Analysis and Machine Intelligence | May 25, 2026
    PubMed
    Summary
    This summary is machine-generated.

    This study introduces a scanpath prediction method for panoramic videos based on principles from lossy data compression. The approach models the uncertainty and diversity of human scanpaths and does not rely on imitation learning from ground-truth scanpaths.

    Area of Science:

    • Computer Vision
    • Human-Computer Interaction
    • Machine Learning

    Background:

    • Scanpath prediction in panoramic videos is complex due to spherical geometry, multimodality, and diverse human visual behavior.
    • Existing methods often struggle to capture the inherent uncertainty and variability in human gaze patterns.
    • Accurate and perceptually realistic scanpath prediction is crucial for applications such as adaptive interfaces and content summarization.

    Purpose of the Study:

    • To develop a robust scanpath prediction model for panoramic videos that accounts for spherical geometry and output diversity.
    • To propose a novel criterion for scanpath prediction based on minimizing expected code length, inspired by lossy data compression.
    • To generate realistic, human-like scanpaths without relying on imitation learning from ground-truth data.
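The link between code length and likelihood can be illustrated with a minimal sketch. It assumes a uniform quantizer and a simple categorical model over the resulting bins; the function names and the sample offsets are hypothetical and are not taken from the study. Under an ideal entropy coder, a symbol with model probability p costs -log2(p) bits, so the model that minimizes the average code length on a set of quantized samples is the maximum-likelihood fit.

```python
import math
from collections import Counter

def quantize(values, step):
    """Uniform quantizer: map continuous values to integer bin indices."""
    return [round(v / step) for v in values]

def expected_code_length_bits(symbols, model):
    """Average ideal code length, -log2 model[s], over the observed symbols."""
    return sum(-math.log2(model[s]) for s in symbols) / len(symbols)

# Hypothetical 1-D gaze offsets in degrees (illustrative only, not study data).
offsets = [0.1, 0.4, -0.2, 1.1, 0.9, 0.0, -0.1, 1.0]
symbols = quantize(offsets, step=1.0)

counts = Counter(symbols)
# Maximum-likelihood categorical model: empirical bin frequencies.
ml_model = {s: c / len(symbols) for s, c in counts.items()}
# A mismatched alternative: equal probability for every observed bin.
uniform_model = {s: 1 / len(counts) for s in counts}

ml_bits = expected_code_length_bits(symbols, ml_model)
uniform_bits = expected_code_length_bits(symbols, uniform_model)
print(f"ML model: {ml_bits:.4f} bits/symbol, uniform model: {uniform_bits:.4f} bits/symbol")
```

The maximum-likelihood model yields the shorter average code, which is the sense in which minimizing expected code length and fitting by maximum likelihood coincide.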

    Main Methods:

    • A novel criterion for scanpath prediction is proposed, minimizing the expected code length of quantized scanpaths via maximum likelihood estimation.
    • A conditional probability model is developed, integrating viewport sequences (visual input) and projected past scanpaths (path input).
    • Discretized Gaussian mixture models parameterize the probability model to capture scanpath uncertainty and diversity.
    • A PID (proportional-integral-derivative) controller-based sampler generates scanpaths from the learned model.

    Main Results:

    • The proposed method achieves superior quantitative scanpath prediction accuracy compared to baseline methods across various prediction horizons.
    • Experimental results demonstrate enhanced perceptual realism of generated scanpaths, validated through machine discrimination and psychophysical experiments.
    • The model shows strong generalization capabilities on unseen panoramic video datasets.

    Conclusions:

    • The data compression-inspired approach effectively addresses the challenges of scanpath prediction in panoramic videos.
    • The method successfully models scanpath uncertainty and diversity, enabling realistic gaze prediction without ground-truth imitation.
    • The findings advance the prediction of human visual attention in immersive media.