SAVE: Self-Attention on Visual Embedding for Zero-Shot Generic Object Counting.

Ahmed Zgaren¹,², Wassim Bouachir², Nizar Bouguila¹

¹Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montréal, QC H3G 1M8, Canada.

Journal of Imaging

February 25, 2025

Summary

This summary is machine-generated.

This study introduces an automated zero-shot counting method that surpasses existing zero-shot and few-shot techniques. The novel approach enhances visual object counting accuracy for diverse applications.

Keywords:

class-agnostic, object counting, transformers, visual attention, zero-shot

Area of Science:

Computer Vision
Machine Learning
Artificial Intelligence

Background:

Generic Visual Object Counting aims to identify and quantify objects within images.
Zero-shot counting enables object counting for arbitrary classes without prior examples, contrasting with few-shot methods that require exemplars.
Existing methods often require exemplars or lack the automation needed for rapid processing.

Purpose of the Study:

To propose a fully automated zero-shot counting method that outperforms current zero-shot and few-shot approaches.
To enhance the accuracy and efficiency of visual object counting across various domains.

Main Methods:

Exploiting feature maps from a pre-trained detection-based backbone.
Introducing a Visual Embedding Module to generate semantic embeddings with object contextual information.
Utilizing a Self-Attention Matching Module to create an encoded representation for the head counter.
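The matching step can be illustrated with a generic scaled dot-product self-attention. This is a minimal, dependency-free sketch and not the authors' module: the summary does not specify the architecture, so the identity query/key/value projections, the token shapes, and the names `self_attention`, `feature_tokens`, and `embedding_token` are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    tokens: list of n vectors, each a list of d floats.
    Returns (output, weights); weights[i][j] is how strongly token i
    attends to token j. A real module would use learned projections.
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    weights = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights.append(softmax(scores))
    output = [
        [sum(w * v[c] for w, v in zip(row, tokens)) for c in range(d)]
        for row in weights
    ]
    return output, weights


# Hypothetical inputs: three flattened feature-map locations and one
# semantic embedding token, concatenated into a single sequence.
feature_tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
embedding_token = [1.0, 0.0]
encoded, attn = self_attention(feature_tokens + [embedding_token])
```

In this toy input the embedding token attends more strongly to the feature locations that resemble it, which is the general idea behind matching an object embedding against image features before passing the encoded sequence to a counting head.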

Main Results:

Achieved state-of-the-art performance in zero-shot counting on the FSC147 dataset.
Obtained the best Mean Absolute Error (MAE) of 8.89 and Root Mean Square Error (RMSE) of 35.83.
Demonstrated competitive results compared to few-shot methods.
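For reference, the two reported metrics can be computed as follows. This is a small sketch using the standard definitions of MAE and RMSE over per-image counts; the function names and the sample counts are made up for illustration and are not taken from FSC147.

```python
import math


def mae(predicted, actual):
    """Mean Absolute Error: average of |predicted - actual| per image."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root Mean Square Error: square root of the mean squared error."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


# Hypothetical counts: three exact predictions and one large miss.
predicted_counts = [10, 20, 30, 60]
true_counts = [10, 20, 30, 40]

print(mae(predicted_counts, true_counts))   # 5.0
print(rmse(predicted_counts, true_counts))  # 10.0
```

RMSE is never smaller than MAE and grows faster when a few images have large errors, so a gap such as 8.89 versus 35.83 is consistent with most images being counted closely and a small number being badly miscounted.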

Conclusions:

The proposed method offers a significant advancement in automated zero-shot visual object counting.
The approach shows promise for applications in tree counting, wildlife monitoring, and medical image analysis (e.g., blood cell counting).
Because it requires no exemplars, the method supports more efficient and accurate automated counting.