The Curious Layperson: Fine-Grained Image Recognition Without Expert Labels.

Subhabrata Choudhury¹, Iro Laina¹, Christian Rupprecht¹

¹Visual Geometry Group, University of Oxford, Oxford, OX1 3PJ, UK.

International Journal of Computer Vision

February 2, 2024

Summary

This summary is machine-generated.

This study introduces a method for fine-grained image recognition that needs no expert annotations and instead draws on web encyclopedias as its knowledge base. The approach learns to describe images from non-expert descriptions and uses sentence-level textual similarity to match those descriptions against encyclopedia entries.

Keywords:

Clever; Fine-grained classification; Multimodal retrieval; Non-expert annotations

Area of Science:

Computer Science
Artificial Intelligence
Machine Learning

Background:

Humans possess innate abilities to interpret images and language, enabling knowledge expansion without expert supervision.
Current machine learning models struggle with fine-grained recognition without extensive, specialized training data.
Accessing and utilizing expert-curated knowledge bases remains a significant challenge for AI.

Purpose of the Study:

To address the challenge of fine-grained image recognition using readily available web knowledge.
To develop a method for image recognition that does not rely on expert annotations.
To enable machines to learn from vast, unstructured online information.

Main Methods:

Learning a visual description model from non-expert image descriptions.
Training a fine-grained textual similarity model for sentence-level image-text matching.
Leveraging web encyclopedias as a source of knowledge.
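
The matching step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the learned textual similarity model is replaced here by a plain bag-of-words cosine similarity, the image description is assumed to have been produced already by some captioning model, and the function names (`cosine`, `rank_classes`) and the two toy encyclopedia entries are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity between two sentences.

    Stand-in (assumption) for a learned sentence-level similarity model.
    """
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def rank_classes(description_sentences, encyclopedia):
    """Rank encyclopedia entries by how well they match an image description.

    For each description sentence, keep its best-matching sentence in the
    entry, then average those best scores to get one score per entry.
    """
    scores = {}
    for name, sentences in encyclopedia.items():
        best = [
            max((cosine(d, s) for s in sentences), default=0.0)
            for d in description_sentences
        ]
        scores[name] = sum(best) / len(best) if best else 0.0
    return sorted(scores, key=scores.get, reverse=True)


# Toy knowledge base (hypothetical text, not taken from a real encyclopedia).
encyclopedia = {
    "Northern cardinal": [
        "The male is bright red with a black face mask.",
        "It has a prominent crest and a thick orange beak.",
    ],
    "Blue jay": [
        "The plumage is blue with a white chest.",
        "It has a blue crest and a black collar.",
    ],
}

# Non-expert style description, assumed to come from a captioning model.
description = [
    "this bird is red with a black face",
    "it has a thick orange beak",
]

ranking = rank_classes(description, encyclopedia)
print(ranking[0])
```

Scoring at the sentence level, rather than comparing the description with a whole article at once, lets a few distinctive sentences in a long entry drive the match; the averaging rule used here is one simple aggregation choice among several possible ones.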

Main Results:

The proposed method demonstrates effective fine-grained image recognition.
Performance is evaluated on the CUB-200 and Oxford-102 Flowers datasets.
The approach shows competitive results compared to strong baselines and state-of-the-art cross-modal retrieval methods.

Conclusions:

Fine-grained image recognition is achievable without expert annotations by utilizing web-scale knowledge.
The developed method offers a promising direction for AI systems to learn and recognize objects with limited supervision.
This work facilitates broader application of AI in domains requiring detailed visual understanding.