Updated: Mar 18, 2026

Deep learning-driven image captioning: Progress through transformers and large language models.

Priyanka Panchal¹, Vishal Polara², Siddaraj U³

¹Department of Information Technology, Madhuben and Bhanubhai Patel Institute of Technology, The Charutar Vidya Mandal (CVM) University, New Vallabh Vidya Nagar, Gujarat, India.

March 16, 2026

Summary

This summary is machine-generated.

This study introduces a new deep learning model for image captioning that combines an advanced vision transformer with an LLM and outperforms existing methods. Its novel cross-attention mechanism strengthens visual-linguistic alignment, producing more human-like image descriptions.

Area of Science:

Artificial Intelligence
Computer Vision
Natural Language Processing

Background:

Traditional Convolutional Neural Network-Recurrent Neural Network (CNN-RNN) hybrids and existing transformer models have limitations in image captioning.
Achieving robust multimodal alignment and enhancing caption diversity remain key challenges in the field.

Purpose of the Study:

To propose a novel deep learning model for image captioning using an advanced vision transformer architecture and a powerful Large Language Model (LLM).
To improve the alignment between linguistic context and visual features through a unique cross-attention mechanism.

Main Methods:

Development of a novel deep learning architecture integrating a vision transformer with an LLM.
Implementation of a unique cross-attention mechanism for deep alignment between visual and linguistic features.
Extensive evaluation on benchmark datasets including MSCOCO, Flickr30K, and NoCaps.
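
The summary does not describe the internals of the cross-attention mechanism, so the following is only a minimal sketch of generic scaled dot-product cross-attention, in which text-token queries attend over image-patch features. It is not the authors' architecture. The function names, vector sizes, and toy inputs are all assumptions made for illustration, and learned projection matrices are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(text_queries, image_keys, image_values):
    """Generic scaled dot-product cross-attention (illustrative only).

    text_queries: one d-dimensional vector per text token
    image_keys:   one d-dimensional vector per image patch
    image_values: one vector per image patch
    Returns (outputs, weights): a patch-weighted value vector and the
    attention distribution over patches for each text token.
    """
    d = len(image_keys[0])
    scale = 1.0 / math.sqrt(d)
    value_dim = len(image_values[0])
    outputs, weights = [], []
    for q in text_queries:
        # Similarity between this text token and every image patch.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in image_keys]
        w = softmax(scores)
        # Weighted sum of patch values: the visual context for this token.
        out = [sum(wj * v[c] for wj, v in zip(w, image_values)) for c in range(value_dim)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Hypothetical toy inputs: 2 text tokens, 3 image patches.
text_queries = [[1.0, 0.0], [0.0, 1.0]]
image_keys = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
image_values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

outputs, weights = cross_attention(text_queries, image_keys, image_values)
```

In this toy setup, each text token places most of its attention on the patch whose key points in the same direction as its query, which is the sense in which cross-attention aligns linguistic and visual features.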

Main Results:

The proposed model demonstrates significant improvements over traditional and existing transformer-based approaches.
It achieves state-of-the-art performance, comparable to leading methods such as GIT, BLIP-2, and CoCa, on MSCOCO, Flickr30K, and NoCaps.
Reported metrics on MSCOCO include BLEU-4 (0.495), METEOR (0.390), and CIDEr (1.32).
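
For readers unfamiliar with BLEU-4, the sketch below shows a simplified sentence-level version: the geometric mean of clipped 1- to 4-gram precisions, multiplied by a brevity penalty. It is an assumption-laden illustration, not the evaluation code behind the reported scores: it uses a single reference, whitespace tokenization, and no smoothing, and the example captions are invented.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Simplified sentence-level BLEU-4 against one reference (no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0  # without smoothing, any zero precision gives a zero score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / 4)


# Invented example captions, for illustration only.
reference = "a dog runs across the grass"
exact = bleu4("a dog runs across the grass", reference)
partial = bleu4("a dog runs across the green grass", reference)
unrelated = bleu4("two people ride bicycles", reference)
```

An exact match scores 1.0, a caption sharing no words with the reference scores 0.0, and a near match falls in between. Benchmark results are normally computed at corpus level over multiple references per image, so values from this sketch are not directly comparable to the reported figures.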

Conclusions:

The novel architecture sets a new performance benchmark for image captioning systems.
The fusion strategy proves efficient, enabling more precise, contextually rich, and human-like image descriptions.
This work advances multimodal AI systems, supporting Sustainable Development Goals 9 and 4.