Updated: May 24, 2025

Instruction-Guided Scene Text Recognition.

Yongkun Du, Zhineng Chen, Yuchen Su

    IEEE Transactions on Pattern Analysis and Machine Intelligence | March 3, 2025
    Summary
    This summary is machine-generated.

    This study introduces an instruction-guided scene text recognition (IGTR) method. IGTR learns to understand text images by predicting character attributes, and it is reported to outperform existing STR models.

    Area of Science:

    • Computer Vision
    • Artificial Intelligence
    • Natural Language Processing

    Background:

    • Multi-modal models excel in visual recognition but cannot be applied directly to scene text recognition (STR) because natural images and text images differ in composition.
    • Existing STR models face challenges in understanding the nuances of text within natural scenes.

    Purpose of the Study:

    • To propose a novel instruction-guided scene text recognition (IGTR) paradigm.
    • To adapt instruction learning for STR by focusing on character attribute prediction.
    • To develop a flexible and efficient STR system.

    Main Methods:

    • Formulating STR as an instruction learning problem.
    • Utilizing ⟨condition, question, answer⟩ instruction triplets to describe character attributes (e.g., frequency, position).
    • Developing a lightweight instruction encoder, cross-modal feature fusion, and multi-task answer head for attribute learning.
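
To make the idea of instruction triplets more concrete, here is a minimal Python sketch of how frequency and position attributes could be turned into triplets for a single ground-truth label. It is not the authors' implementation: the `InstructionTriplet` class, the `build_triplets` function, the (condition, question, answer) field layout, and the wording of every condition and question are assumptions made for illustration only.

```python
from collections import Counter
from typing import NamedTuple


class InstructionTriplet(NamedTuple):
    """Hypothetical container; field names are illustrative, not taken from the paper."""
    condition: str
    question: str
    answer: str


def build_triplets(label: str) -> list[InstructionTriplet]:
    """Generate frequency and position triplets for one ground-truth text label."""
    # Assumed condition: only the label length is given as context.
    condition = f"text image with {len(label)} characters"
    triplets = []

    # Frequency attribute: how often each distinct character occurs.
    for char, count in Counter(label).items():
        question = f"How many times does '{char}' appear?"
        triplets.append(InstructionTriplet(condition, question, str(count)))

    # Position attribute: which character sits at each 1-based index.
    for index, char in enumerate(label, start=1):
        question = f"Which character is at position {index}?"
        triplets.append(InstructionTriplet(condition, question, char))

    return triplets


if __name__ == "__main__":
    for triplet in build_triplets("hello"):
        print(triplet)
```

In a setup like this, each triplet would act as one training example: the condition and question are encoded as the instruction, and the answer is the target for the corresponding attribute head. How IGTR actually samples and encodes its triplets is described in the paper, not in this summary.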

    Main Results:

    • IGTR significantly outperforms existing STR models on English and Chinese benchmarks.
    • The model maintains a small size and fast inference speed.
    • IGTR effectively addresses challenges in recognizing rare and morphologically similar characters.

    Conclusions:

    • The proposed IGTR paradigm offers a new character-understanding-based approach to STR.
    • IGTR demonstrates superior performance and efficiency compared to current methods.
    • The instruction-guided approach provides flexibility and improved handling of difficult character recognition cases.