MFPI-Net: A Multi-Scale Feature Perception and Interaction Network for Semantic Segmentation of Urban Remote Sensing Images

  • School of Automation and Information Engineering, Sichuan University of Science and Engineering, Yibin 644005, China.

Summary

This summary is machine-generated.

This study introduces MFPI-Net, a semantic segmentation network designed for complex urban remote sensing images. MFPI-Net improves the identification of multi-scale objects and reaches mean Intersection over Union (mIoU) scores of 82.57% on ISPRS Vaihingen and 88.49% on ISPRS Potsdam.

Area Of Science

  • Computer Vision
  • Remote Sensing
  • Artificial Intelligence

Background

  • Complex urban remote sensing images present challenges such as multi-scale object distribution, class similarity, and small object omission.
  • Existing semantic segmentation networks struggle to effectively address these challenges, leading to suboptimal performance.

Purpose Of The Study

  • To propose MFPI-Net, an advanced encoder-decoder semantic segmentation network tailored for complex urban remote sensing imagery.
  • To improve semantic segmentation performance by handling multi-scale objects, similarity between classes, and small objects that are easily missed.

Main Methods

  • MFPI-Net integrates a Swin Transformer backbone encoder for global semantic feature extraction.
  • It incorporates a diverse dilation rates attention shuffle decoder (DDRASD) for multi-scale contextual awareness and resolution enhancement.
  • The network also features a multi-scale convolutional feature enhancement module (MCFEM) for local feature modeling and a cross-path residual fusion module (CPRFM) for improved feature interaction.
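
The multi-scale context idea behind a decoder with several dilation rates can be illustrated with a minimal sketch. This is not the DDRASD implementation: the function names, the 1-D setting, and the rates (1, 2, 4) are assumptions chosen for illustration, since the summary does not state the rates or layer layout used in the paper. The sketch only shows how the same 3-tap kernel covers a wider input span as the dilation rate grows.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input samples spanned by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """1-D dilated cross-correlation with no padding and stride 1."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart in the input.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    kernel = [1, 1, 1]
    # Illustrative rates only; not taken from the paper.
    for rate in (1, 2, 4):
        print(rate, effective_kernel_size(len(kernel), rate),
              dilated_conv1d(signal, kernel, rate))
```

Running branches like these in parallel and combining their outputs is one common way to expose a layer to context at several scales without adding kernel weights; how MFPI-Net combines them is not described in this summary.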

Main Results

  • MFPI-Net achieved superior performance on the ISPRS Vaihingen and Potsdam datasets compared to mainstream methods.
  • The proposed network attained mean Intersection over Union (mIoU) scores of 82.57% on Vaihingen and 88.49% on Potsdam.
  • Experimental results validate the effectiveness of MFPI-Net in improving semantic segmentation for urban remote sensing.
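
For readers unfamiliar with the metric, the following sketch shows how mIoU is commonly computed from a confusion matrix. It is an illustration under stated assumptions, not the paper's evaluation code: the function names are hypothetical, the labels are toy data, and the summary does not say which classes are averaged on Vaihingen and Potsdam (for example, whether a clutter class is excluded).

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts pixels with true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_iou(cm):
    """IoU = TP / (TP + FP + FN) per class; None if the class never occurs."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true other
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


def mean_iou(cm):
    """Average IoU over classes that appear in the labels or predictions."""
    valid = [v for v in per_class_iou(cm) if v is not None]
    return sum(valid) / len(valid)


if __name__ == "__main__":
    # Toy flattened label maps with three classes (made-up data).
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 0]
    cm = confusion_matrix(y_true, y_pred, num_classes=3)
    print(per_class_iou(cm), mean_iou(cm))
```

Because IoU penalizes both false positives and false negatives for each class, averaging it over classes gives small or rare classes the same weight as large ones, which is why it is a common headline metric for segmentation benchmarks.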

Conclusions

  • MFPI-Net demonstrates significant improvements in semantic segmentation accuracy for complex urban remote sensing images.
  • The network's architecture effectively addresses challenges related to multi-scale objects, class similarity, and small object recognition.
  • MFPI-Net represents a substantial advancement in the field of remote sensing image analysis.