Event-centric multi-modal fusion method for dense video captioning.

Zhi Chang¹, Dexin Zhao², Huilin Chen¹

¹Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, Tianjin, 300384, China.

Neural Networks : the Official Journal of the International Neural Network Society

December 1, 2021

Summary

This summary is machine-generated.

This study introduces an event-centric multi-modal fusion approach for dense video captioning. The model effectively fuses visual and audio cues to improve event description consistency and accuracy in videos.

Keywords:

Dense video captioning; Event-centric; Multi-modal fusion

Area of Science:

Computer Science
Artificial Intelligence
Multimedia Processing

Background:

Dense video captioning models often overlook inter-event relationships, impacting description consistency.
Current methods primarily use visual features, neglecting multimodal information for event localization and description.

Purpose of the Study:

To develop an event-centric approach for dense video captioning that captures temporal and semantic relationships between events.
To enhance multimodal information fusion for more consistent and accurate video descriptions.

Main Methods:

Exploited visual-audio cues for generating event proposals and improving event representations.
Developed an attention-gating mechanism for dynamic fusion and regulation of multimodal information.
Proposed an event-centric multimodal fusion approach for dense video captioning (EMVC).
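
The attention-gating idea listed above can be illustrated with a generic gated fusion of two feature vectors. This is only a minimal sketch, not the EMVC implementation: the summary does not give the gate's formula, so the element-wise sigmoid gate, the function names (`sigmoid`, `gated_fusion`), and the parameters (`w_visual`, `w_audio`, `bias`) are all assumptions made for illustration. In a trained model the gate parameters would be learned; here they are passed in by hand.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fusion(
    visual: List[float],
    audio: List[float],
    w_visual: List[float],
    w_audio: List[float],
    bias: List[float],
) -> List[float]:
    """Fuse two same-length feature vectors with an element-wise gate.

    Hypothetical gate (an assumption, not taken from the paper):
        g_i     = sigmoid(w_visual_i * v_i + w_audio_i * a_i + bias_i)
        fused_i = g_i * v_i + (1 - g_i) * a_i
    A gate near 1 keeps the visual feature; a gate near 0 keeps the audio one.
    """
    n = len(visual)
    if not all(len(x) == n for x in (audio, w_visual, w_audio, bias)):
        raise ValueError("all inputs must have the same length")

    fused = []
    for v, a, wv, wa, b in zip(visual, audio, w_visual, w_audio, bias):
        g = sigmoid(wv * v + wa * a + b)
        fused.append(g * v + (1.0 - g) * a)
    return fused


# With zero weights and zero bias the gate is exactly 0.5,
# so the fused vector is the plain average of the two modalities.
visual_feat = [1.0, 2.0]
audio_feat = [3.0, 0.0]
zeros = [0.0, 0.0]
print(gated_fusion(visual_feat, audio_feat, zeros, zeros, zeros))  # [2.0, 1.0]
```

Because the gate is computed from the inputs themselves, the mix between modalities can differ per feature dimension and per event, which is one common way to read "dynamic fusion and regulation" of multimodal information.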

Main Results:

The EMVC model performed strongly on the ActivityNet Captions and YouCook2 benchmark datasets.
The approach effectively captured relationships between events and fused multimodal information.
Achieved superior results compared to existing state-of-the-art methods in dense video captioning.

Conclusions:

The proposed EMVC approach significantly improves dense video captioning by integrating inter-event relationships and multimodal fusion.
Event-centric processing and attention-gating mechanisms are crucial for enhancing video description quality.
The model offers a promising direction for more comprehensive and consistent automatic video description.