You might also read
Articles linked to this work by shared authors, journal, and citation graph.
Updated: Oct 5, 2025

A Swin Transformer-Based Model for Thyroid Nodule Detection in Ultrasound Images
Published on: April 21, 2023
Pengfei Li¹, Min Zhang¹, Peijie Lin¹
¹Hangzhou Dianzi University, Baiyang Road #2, Hangzhou, China.
The Visual-Text Reference Pretraining Model (VTR-PTM) enhances image captioning by integrating visual and textual information, improving performance on benchmark datasets such as MS COCO and Visual Genome.