Graph Neural Network-Based GrUNet and Attention Transformer Adjacency Matrix for Video Denoising
View abstract on PubMed
Summary
This summary is machine-generated. This study introduces a novel UNet-based video denoising method that combines convolutional neural networks (CNNs) with graph neural networks (GNNs). The approach preserves long-term spatiotemporal relationships, enhancing video quality and outperforming existing methods.
Area Of Science
- Computer Vision
- Artificial Intelligence
- Signal Processing
Background
- Video quality is degraded by noise from compression, low light, and sensor imperfections.
- Traditional convolutional neural network (CNN) methods struggle to capture the long-term spatiotemporal dependencies crucial for effective video denoising.
- Preserving fine details, textures, and structures during noise removal remains a challenge.
Purpose Of The Study
- To develop a novel video denoising approach that effectively captures both local and global spatiotemporal dependencies.
- To improve the accuracy of noise modeling and detail preservation in videos.
- To enhance the overall visual quality of videos affected by various noise types.
Main Methods
- A UNet architecture integrating CNNs for local feature extraction and Graph Neural Networks (GNNs) for global dependency modeling.
- Transformer attention is employed to form a sparse graph in which spatiotemporal patches serve as nodes and their pairwise similarity defines the edges.
- The method was validated through ablation studies on different modules and patch sizes across four noise types.
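The graph-formation step above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: the function name, the use of scaled dot-product attention scores as edge weights, and the top-k sparsification rule are all assumptions made for the sake of the example.

```python
import numpy as np

def attention_adjacency(patches, top_k=4):
    """Build a sparse adjacency matrix over spatiotemporal patches.

    Each flattened patch embedding is a graph node; scaled dot-product
    attention scores between patches give candidate edge weights, and
    keeping only the top_k neighbours per node sparsifies the graph.
    (Hypothetical sketch of the approach described in the summary.)
    """
    n, d = patches.shape
    # Scaled dot-product similarity, as in transformer attention.
    scores = patches @ patches.T / np.sqrt(d)
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    # Sparsify: keep only the top_k strongest edges per node.
    adjacency = np.zeros((n, n))
    for i in range(n):
        neighbours = np.argsort(weights[i])[-top_k:]
        adjacency[i, neighbours] = weights[i, neighbours]
    return adjacency

# Example: 16 random patch embeddings of dimension 32.
rng = np.random.default_rng(0)
patches = rng.standard_normal((16, 32))
adj = attention_adjacency(patches, top_k=4)
```

Restricting each node to its top-k neighbours keeps the adjacency matrix sparse, which is what makes message passing over many spatiotemporal patches tractable.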
Main Results
- The proposed CNN-GNN hybrid model demonstrated superior performance in preserving video details and structures compared to traditional CNNs.
- Achieved state-of-the-art (SOTA) results in video denoising, outperforming most existing methods in Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM).
- The method remained effective across all four noise types, with ablation studies confirming the contribution of each module and patch-size choice.
Conclusions
- The novel integration of CNNs, transformer attention, and GNNs effectively models long-term spatiotemporal relationships for accurate video denoising.
- The proposed method offers a significant advancement in video denoising, balancing performance with moderate computational cost.
- This approach provides a robust solution for enhancing video quality in the presence of diverse noise sources.

