

The Curious Layperson: Fine-Grained Image Recognition Without Expert Labels.

Subhabrata Choudhury¹, Iro Laina¹, Christian Rupprecht¹

  • ¹Visual Geometry Group, University of Oxford, Oxford, OX1 3PJ, UK.

International Journal of Computer Vision | February 2, 2024
Summary
This summary is machine-generated.

This study introduces a method for fine-grained image recognition that needs no expert annotations and instead draws on web encyclopedias as its knowledge base. The approach learns to describe images from non-expert descriptions, then uses sentence-level textual similarity to match each image against encyclopedia articles.

Keywords:
Clever; Fine-grained classification; Multimodal retrieval; Non-expert annotations


Area of Science:

  • Computer Science
  • Artificial Intelligence
  • Machine Learning

Background:

  • Humans possess innate abilities to interpret images and language, enabling knowledge expansion without expert supervision.
  • Current machine learning models struggle with fine-grained recognition without extensive, specialized training data.
  • Accessing and utilizing expert-curated knowledge bases remains a significant challenge for AI.

Purpose of the Study:

  • To address the challenge of fine-grained image recognition using readily available web knowledge.
  • To develop a method for image recognition that does not rely on expert annotations.
  • To enable machines to learn from vast, unstructured online information.

Main Methods:

  • Learning a visual description model from non-expert image descriptions.
  • Training a fine-grained textual similarity model for sentence-level image-text matching.
  • Leveraging web encyclopedias as a source of knowledge.
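
The matching step described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the visual description model has already turned an image into a few sentences, and it substitutes a plain bag-of-words cosine similarity for the trained textual similarity model. The function names (`score_article`, `rank_articles`), the aggregation rule (best-matching sentence, then mean), and the toy knowledge base are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_article(description, article_sentences):
    """Score one article against an image description.

    Assumed aggregation: each description sentence is matched to its most
    similar article sentence, and the best-match scores are averaged.
    """
    desc_vecs = [Counter(tokenize(s)) for s in description]
    art_vecs = [Counter(tokenize(s)) for s in article_sentences]
    if not desc_vecs or not art_vecs:
        return 0.0
    best = [max(cosine(d, a) for a in art_vecs) for d in desc_vecs]
    return sum(best) / len(best)


def rank_articles(description, knowledge_base):
    """Return (article name, score) pairs sorted from best to worst match."""
    scored = [(name, score_article(description, sentences))
              for name, sentences in knowledge_base.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-in for a web encyclopedia (hypothetical text, not real data).
knowledge_base = {
    "Northern cardinal": [
        "The male is bright red with a black face mask.",
        "It has a short thick orange beak.",
    ],
    "Blue jay": [
        "It has blue upper parts and a white chest.",
        "A black collar runs around the neck.",
    ],
}

# Sentences a visual description model might produce for one image (assumed).
description = ["a red bird with a black face", "it has a thick orange beak"]

ranking = rank_articles(description, knowledge_base)
print(ranking[0][0])
```

Working at the sentence level rather than comparing whole documents lets a short, layperson-style description latch onto the one or two encyclopedia sentences that actually describe appearance, while the rest of the article is ignored.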

Main Results:

  • The proposed method demonstrates effective fine-grained image recognition.
  • Performance is evaluated on CUB-200 and Oxford-102 Flowers datasets.
  • The approach is competitive with strong baselines and with state-of-the-art cross-modal retrieval methods.

Conclusions:

  • Fine-grained image recognition is achievable without expert annotations by utilizing web-scale knowledge.
  • The developed method offers a promising direction for AI systems to learn and recognize objects with limited supervision.
  • This work facilitates broader application of AI in domains requiring detailed visual understanding.