Applying artificial vision models to human scene understanding.

Elissa M Aminoff¹, Mariya Toneva², Abhinav Shrivastava³

¹Center for the Neural Basis of Cognition, Carnegie Mellon University, Pittsburgh, PA, USA; Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, USA.

Frontiers in Computational Neuroscience

February 21, 2015

Summary

This summary is machine-generated.

Artificial vision models explain neural activity during scene understanding better than human behavioral judgments do. Computer vision models that incorporate mid- and high-level scene attributes showed the highest correlations with activity in scene-selective brain regions such as the parahippocampal/lingual region (PPA).

Area of Science:

Neuroscience
Computer Vision
Cognitive Science

Background:

Scene understanding relies on a network of scene-selective brain regions, including the parahippocampal/lingual region (PPA), the retrosplenial complex (RSC), and the occipital place area/transverse occipital sulcus (TOS).
Previous research often focused on single visual dimensions, neglecting the high-dimensional feature space crucial for neural representation of scenes.

Purpose of the Study:

To investigate how scenes are encoded in the scene-selective brain network using advanced artificial vision systems.
To compare the explanatory power of different computer vision models and behavioral judgments in accounting for neural activity patterns.

Keywords:

computer vision
parahippocampal place area
retrosplenial cortex
scene processing
transverse occipital sulcus

Main Methods:

Correlated similarity matrices derived from BOLD activity in scene-selective regions with those from behavioral judgments and various computer vision models.
Evaluated models based on their ability to capture neural patterns, particularly focusing on mid- and high-level scene attributes.
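
The similarity-matrix comparison described under Main Methods can be illustrated with a minimal sketch. This is not the authors' pipeline: the use of Pearson correlation both to build each similarity matrix and to compare matrices, the function names, and the randomly generated stand-in data are all assumptions made for illustration only.

```python
import numpy as np


def similarity_matrix(features):
    """Scene-by-scene Pearson correlation matrix.

    `features` has one row per scene and one column per measurement
    (e.g. voxels for BOLD data, feature dimensions for a vision model).
    """
    return np.corrcoef(features)


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries, so each scene pair is counted once."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def compare_representations(features_a, features_b):
    """Correlate two similarity matrices built over the same set of scenes."""
    a = upper_triangle(similarity_matrix(features_a))
    b = upper_triangle(similarity_matrix(features_b))
    return float(np.corrcoef(a, b)[0, 1])


# Stand-in data (assumption): 20 scenes, 50 "voxels", 30 model features.
rng = np.random.default_rng(0)
n_scenes = 20
voxel_patterns = rng.normal(size=(n_scenes, 50))
model_features = rng.normal(size=(n_scenes, 30))

score = compare_representations(voxel_patterns, model_features)
print(f"model-to-region similarity-matrix correlation: {score:.3f}")
```

In a setup like this, one score would be computed for each pairing of a brain region with a model or with behavioral judgments, and the scores compared across pairings. Only the unique scene pairs above the diagonal enter the comparison, because the diagonal is trivially 1 and the lower triangle mirrors the upper one.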

Main Results:

Computer vision models utilizing mid- and high-level scene attributes demonstrated the highest correlations with neural activity in the scene-selective network.
The NEIL and SUN models best explained activity in the PPA and TOS, while the GIST model was optimal for the RSC.
The top-performing models surpassed behavioral judgments in explaining neural data variance.
The NEIL model showed significant correlations across all three regions and was a top performer for PPA and TOS.

Conclusions:

Artificial vision systems, especially those incorporating learned statistical regularities from large datasets (like NEIL), offer a powerful tool for understanding neural scene encoding.
These findings represent a significant advancement in developing detailed models of neural scene understanding and clarifying the functional roles within the scene-selective brain network.