Image Captioning Based on Semantic Scenes.

Fengzhi Zhao1,2, Zhezhou Yu1,2,3, Tao Wang1,2

  • 1College of Computer Science and Technology, Jilin University, Changchun 130012, China.

Entropy (Basel, Switzerland) | October 25, 2024
Summary
This summary is machine-generated.

The Semantic Scenes Encoder (SSE) improves image captioning by integrating scene and semantic graphs, generating more accurate and comprehensive descriptions for complex visual data.

Keywords:
attention mechanism, graph, image captioning, semantic scenes encoder

Area of Science:

  • Computer Vision
  • Natural Language Processing
  • Artificial Intelligence

Background:

  • Image captioning generates textual descriptions for images, crucial for applications like image retrieval and autonomous driving.
  • Existing region-based methods often focus on local features, neglecting overall scene understanding and leading to inaccurate captions for complex scenes.
  • Current methods struggle to extract complete semantic information, resulting in biased or deficient captions.

Purpose of the Study:

  • To address the limitations of existing image captioning methods.
  • To propose a novel Semantic Scenes Encoder (SSE) for generating comprehensive and accurate image captions.
  • To enhance the understanding of both image content and semantic relationships for improved caption generation.

Main Methods:

  • The Semantic Scenes Encoder (SSE) extracts a scene graph from images and integrates it into image information encoding.
  • A semantic graph is extracted from captions, preserving information via a learnable attention mechanism termed the 'dictionary'.
  • The model combines encoded image information and learned semantic information for caption generation.
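The summary does not say how the 'dictionary' is implemented. As a rough illustration only, the sketch below assumes it behaves like scaled dot-product attention over a small table of learnable slots, and that the attended semantic vector is concatenated with the image encoding. The names `dictionary_attention`, `dict_keys`, `dict_values`, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dictionary_attention(query, keys, values):
    """Attend from one query vector over dictionary slots.

    Assumption: scaled dot-product attention; the paper's actual
    'dictionary' mechanism may differ.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    attended = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return attended, weights


# Hypothetical toy inputs: one encoded image feature and three dictionary slots.
image_feature = [1.0, 0.0]
dict_keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
dict_values = [[0.5, 0.5], [0.0, 1.0], [1.0, 0.0]]

semantic, weights = dictionary_attention(image_feature, dict_keys, dict_values)

# Assumed fusion step: concatenate image encoding and attended semantics
# before passing them to a caption decoder.
fused = image_feature + semantic
```

In a trained model the slot keys and values would be learned parameters rather than fixed lists; here they are constants so the attention weights can be inspected directly.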

Main Results:

  • The SSE model was evaluated on the MSCOCO dataset.
  • Experimental results demonstrated a significant improvement in the overall quality of generated captions.
  • The SSE achieved higher scores across multiple evaluation metrics, indicating superior performance in image captioning.
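The summary does not name the evaluation metrics used. Caption quality on MSCOCO is commonly scored by n-gram overlap with human reference captions, so the sketch below shows one simple metric of that kind: clipped unigram precision, the building block of BLEU-1 without the brevity penalty. The function name and the example captions are hypothetical, and this is not claimed to be a metric reported in the paper.

```python
from collections import Counter


def clipped_unigram_precision(candidate, references):
    """Fraction of candidate words that also appear in a reference caption.

    Each word's count is clipped to the maximum number of times it occurs
    in any single reference, so repeating a word cannot inflate the score.
    """
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return 0.0
    cand_counts = Counter(cand_tokens)

    # Largest per-reference count for every word seen in the references.
    max_ref_counts = Counter()
    for ref in references:
        for token, count in Counter(ref.lower().split()).items():
            max_ref_counts[token] = max(max_ref_counts[token], count)

    clipped = sum(min(count, max_ref_counts[token])
                  for token, count in cand_counts.items())
    return clipped / len(cand_tokens)


# Hypothetical reference captions for a single image.
references = [
    "a dog is running on the grass",
    "the dog runs across a field",
]

score_good = clipped_unigram_precision("a dog runs on the grass", references)
score_poor = clipped_unigram_precision("a cat sits on the grass", references)
score_repeat = clipped_unigram_precision("the the the", references)
```

Here the first candidate shares every word with the references, the second misses two of its six words, and the third is penalized by clipping because "the" occurs only once in each reference.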

Conclusions:

  • The proposed Semantic Scenes Encoder (SSE) effectively enhances image captioning by incorporating scene and semantic graph information.
  • The SSE overcomes limitations of previous methods by considering global scene context and complete semantic information.
  • The model shows significant advantages in generating accurate and coherent captions, particularly for complex visual scenes.