Spherical Vision Transformers for Audio-Visual Saliency Prediction in 360° Videos

Mert Cokelek, Halit Ozsoy, Nevrez Imamoglu

    IEEE Transactions on Pattern Analysis and Machine Intelligence | August 29, 2025
    Summary
    This summary is machine-generated.

    This study introduces new models for predicting visual attention in 360-degree videos, incorporating spatial audio. Integrating audio cues significantly improves the accuracy of saliency prediction in omnidirectional videos.

    Area of Science:

    • Computer Vision
    • Virtual Reality
    • Human-Computer Interaction

    Background:

    • Omnidirectional videos (ODVs) offer immersive virtual reality (VR) experiences with a full field-of-view (FOV).
    • Predicting visual saliency in 360° environments presents unique challenges due to spherical distortion and the integration of spatial audio.
    • Existing datasets lack comprehensive audio-visual data for 360° saliency prediction.
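
To make the spherical-distortion point above concrete, here is a minimal sketch that assumes the 360° frame is stored in the common equirectangular layout (the summary does not state the format used in the study). In that layout each pixel row covers a solid angle proportional to the cosine of its latitude, so rows near the poles are heavily over-represented. The function name `equirect_row_weights` is hypothetical and is not taken from the paper.

```python
import math


def equirect_row_weights(height: int) -> list[float]:
    """Relative solid angle of one pixel in each row of an equirectangular frame.

    Assumption: row 0 is the top of the frame (north pole side) and rows are
    evenly spaced in latitude. Weights are normalized to sum to 1.
    """
    weights = []
    for row in range(height):
        # Latitude of the row center, running from +pi/2 (top) to -pi/2 (bottom).
        lat = math.pi / 2 - (row + 0.5) * math.pi / height
        # The solid angle of a pixel shrinks with cos(latitude).
        weights.append(math.cos(lat))
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    w = equirect_row_weights(8)
    # Rows near the equator carry more weight than rows near the poles.
    print([round(x, 3) for x in w])
```

Weights like these are one common way to keep polar rows from dominating a per-pixel average on an equirectangular saliency map; whether the study uses such a weighting is not stated in the summary.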

    Purpose of the Study:

    • To extend saliency prediction to 360° video environments by addressing spherical distortion and spatial audio integration.
    • To develop and evaluate novel models for audio-visual saliency prediction in ODVs.
    • To introduce a new dataset, YT360-EyeTracking, for training and evaluating 360° saliency prediction models.

    Main Methods:

    • Curated the YT360-EyeTracking dataset comprising 81 ODVs with varying audio-visual conditions.
    • Proposed SalViT360, a vision-transformer model with spherical geometry-aware attention for ODVs.
    • Developed SalViT360-AV, an extension incorporating transformer adapters conditioned on audio input.
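
As a rough illustration of what "spherical geometry-aware" can mean, the sketch below maps equirectangular pixel coordinates to points on the unit sphere and measures the great-circle angle between them. It is not the SalViT360 attention mechanism, whose details are not given in this summary; the equirectangular layout and the helper names `pixel_to_unit_vector` and `angular_distance` are assumptions made for the example.

```python
import math


def pixel_to_unit_vector(u: int, v: int, width: int, height: int) -> tuple[float, float, float]:
    """Map the center of pixel (u, v) in an equirectangular frame to a unit vector.

    Assumption: u spans longitude -pi..pi left to right, v spans latitude
    +pi/2..-pi/2 top to bottom.
    """
    lon = (u + 0.5) / width * 2 * math.pi - math.pi
    lat = math.pi / 2 - (v + 0.5) / height * math.pi
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def angular_distance(p: tuple[float, float, float], q: tuple[float, float, float]) -> float:
    """Great-circle angle in radians between two unit vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot)))


if __name__ == "__main__":
    width, height = 360, 180
    left = pixel_to_unit_vector(0, 90, width, height)
    right = pixel_to_unit_vector(359, 90, width, height)
    center = pixel_to_unit_vector(180, 90, width, height)
    # The left and right image edges are neighbors on the sphere,
    # while the image center is almost opposite the left edge.
    print(math.degrees(angular_distance(left, right)))
    print(math.degrees(angular_distance(left, center)))
```

The usage lines show why flat 2D pixel distance is misleading for ODVs: two pixels at opposite edges of the frame are far apart in the image but adjacent on the sphere.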

    Main Results:

    • SalViT360 and SalViT360-AV significantly outperform existing methods on benchmark datasets, including YT360-EyeTracking.
    • Demonstrated the effectiveness of incorporating spatial audio cues for enhanced saliency prediction accuracy.
    • Validated the models' ability to predict viewer attention in complex 360° scenes.

    Conclusions:

    • Integrating spatial audio is crucial for accurate saliency prediction in omnidirectional videos.
    • The proposed SalViT360 and SalViT360-AV models represent significant advancements in 360° visual attention prediction.
    • The YT360-EyeTracking dataset facilitates further research in audio-visual saliency for immersive media.