Deep learning-driven image captioning: Progress through transformers and large language models.

Priyanka Panchal (1), Vishal Polara (2), Siddaraj U (3)

  • (1) Department of Information Technology, Madhuben and Bhanubhai Patel Institute of Technology, The Charutar Vidya Mandal (CVM) University, New Vallabh Vidya Nagar, Gujarat, India.

PLOS One | March 16, 2026
Summary
This summary is machine-generated.

This study introduces a deep learning model for image captioning that combines an advanced vision transformer with a large language model (LLM) and outperforms existing methods. Its cross-attention mechanism strengthens visual-linguistic alignment, producing more human-like image descriptions.

Area of Science:

  • Artificial Intelligence
  • Computer Vision
  • Natural Language Processing

Background:

  • Traditional Convolutional Neural Network-Recurrent Neural Network (CNN-RNN) hybrids and existing transformer models have limitations in image captioning.
  • Achieving robust multimodal alignment and enhancing caption diversity remain key challenges in the field.

Purpose of the Study:

  • To propose a novel deep learning model for image captioning using an advanced vision transformer architecture and a powerful Large Language Model (LLM).
  • To improve the alignment between linguistic context and visual features through a unique cross-attention mechanism.

Main Methods:

  • Development of a novel deep learning architecture integrating a vision transformer with an LLM.
  • Implementation of a unique cross-attention mechanism for deep alignment between visual and linguistic features.
  • Extensive evaluation on benchmark datasets including MS COCO, Flickr30K, and NoCaps.
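
The summary does not describe the internals of the cross-attention mechanism, so the following is only a generic sketch of single-head scaled dot-product cross-attention, in which text-side queries attend over image-patch keys and values. The function names (`softmax`, `cross_attention`), the toy vectors, and the omission of learned projection matrices are all assumptions made for illustration; this is not the authors' architecture.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(text_queries, image_keys, image_values):
    """Generic single-head scaled dot-product cross-attention (illustrative only).

    text_queries: one query vector per text token
    image_keys, image_values: one key and one value vector per image patch
    Returns (attended, weights): for each text token, a weighted mix of the
    image value vectors and the attention weights that produced it.
    """
    scale = 1.0 / math.sqrt(len(image_keys[0]))
    value_dim = len(image_values[0])
    attended, weights = [], []
    for q in text_queries:
        # Similarity of this text token to every image patch.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in image_keys]
        w = softmax(scores)
        # Weighted sum of the image value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, image_values)) for c in range(value_dim)]
        weights.append(w)
        attended.append(out)
    return attended, weights


# Toy example: 2 text tokens attending over 3 image patches (hypothetical values).
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
attended, weights = cross_attention(queries, keys, values)
```

Each row of `weights` sums to 1, so every text token receives a convex combination of the image value vectors. A trained model would also apply learned linear projections to produce the queries, keys, and values, and would typically use several attention heads.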

Main Results:

  • The proposed model demonstrates significant improvements over traditional and existing transformer-based approaches.
  • The model achieved performance comparable to state-of-the-art methods such as GIT, BLIP-2, and CoCa on MS COCO, Flickr30K, and NoCaps.
  • Reported metrics on MS COCO include BLEU-4 (0.495), METEOR (0.390), and CIDEr (1.32).
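
To make the BLEU-4 figure above easier to interpret, here is a minimal sketch of the metric at sentence level: the geometric mean of clipped 1- to 4-gram precisions, multiplied by a brevity penalty. The function names (`ngrams`, `bleu4`) and the example sentences are hypothetical, no smoothing is applied, and benchmark results like those reported above are normally computed over a whole corpus with established evaluation toolkits, so this sketch would not reproduce them.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, references):
    """Sentence-level BLEU-4 with uniform weights and no smoothing (illustrative)."""
    log_precisions = []
    for n in range(1, 5):
        cand = ngrams(candidate, n)
        total = sum(cand.values())
        if total == 0:
            return 0.0  # candidate too short to contain any n-gram of this order
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
        if clipped == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty, using the reference length closest to the candidate length.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / 4)


reference = "a cat sits on the mat".split()
exact = bleu4("a cat sits on the mat".split(), [reference])
partial = bleu4("a cat sits on the rug".split(), [reference])
unrelated = bleu4("two dogs run in a park".split(), [reference])
```

In the `partial` case a single substituted word lowers all four n-gram precisions (5/6, 4/5, 3/4, and 2/3), which shows how BLEU-4 penalizes local errors across every n-gram order.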

Conclusions:

  • The novel architecture sets a new performance benchmark for image captioning systems.
  • The fusion strategy proves efficient, enabling more precise, contextually rich, and human-like image descriptions.
  • This work advances multimodal AI systems, supporting Sustainable Development Goals 9 and 4.