SibNet: Sibling Convolutional Encoder for Video Captioning.

Sheng Liu, Zhou Ren, Junsong Yuan

IEEE Transactions on Pattern Analysis and Machine Intelligence

March 10, 2020

Summary

This summary is machine-generated.

This study introduces SibNet, a novel dual-branch neural network for visual captioning. SibNet enhances video understanding by encoding both content and semantic information, outperforming existing methods in generating descriptive sentences.

Area of Science:

Artificial Intelligence
Computer Vision
Natural Language Processing

Background:

Visual captioning, the automatic description of visual content, is complex due to the richness of visual data and the nuances of natural language.
Existing methods often use single-stream approaches for encoding visual information, potentially missing complementary data representations.

Purpose of the Study:

To develop a novel neural network architecture for improved visual captioning.
To address the limitations of single-stream encoding by proposing a dual-branch approach that captures both visual content and semantic information.

Main Methods:

Introduced the Sibling Convolutional Encoder (SibNet), a dual-branch neural network architecture for visual captioning.
The first branch (content) uses an autoencoder for visual appearance, while the second branch (semantic) employs visual-semantic joint embedding.
Combined features using a soft-attention mechanism and a recurrent neural network (RNN) decoder for caption generation.
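The soft-attention step named above can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring, the per-frame concatenation of the two branches, and every function and variable name here (`soft_attention`, `fuse_branches`, the toy feature values) are assumptions made for illustration only. The summary does not specify how SibNet scores or fuses features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(content_feats, semantic_feats):
    """Concatenate per-frame content and semantic features.

    Assumption: simple concatenation; the source does not state the fusion rule.
    """
    return [c + s for c, s in zip(content_feats, semantic_feats)]


def soft_attention(query, features):
    """Dot-product soft attention (an assumed scoring function).

    query:    decoder hidden state, list of floats of length d
    features: list of per-frame feature vectors, each of length d
    Returns (weights, context), where context is the weighted sum of features.
    """
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * feat[i] for w, feat in zip(weights, features))
        for i in range(dim)
    ]
    return weights, context


# Toy data (hypothetical): 3 frames, 2-d features per branch, 4-d decoder state.
content = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
semantic = [[0.2, 0.1], [0.9, 0.3], [0.4, 0.4]]
fused = fuse_branches(content, semantic)
hidden = [0.1, 0.2, 0.3, 0.4]
weights, context = soft_attention(hidden, fused)
```

In a captioning decoder, a context vector like this would typically be recomputed at every word-generation step from the current RNN hidden state, so that different frames can dominate for different words.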

Main Results:

SibNet demonstrated superior performance in video captioning tasks compared to existing methods.
Experiments conducted on the YouTube2Text and MSR-VTT benchmarks validated the model's effectiveness.
The dual-branch approach effectively captured both content and semantic information, leading to better video representation.

Conclusions:

The proposed SibNet architecture significantly enhances visual captioning by integrating content and semantic information.
This dual-branch approach offers a more comprehensive understanding of video data, leading to improved caption generation accuracy.
SibNet represents a promising advancement in the field of automated visual description.