A transfer learning-enhanced deep learning framework for efficient and interpretable soil heavy metal pollution prediction under data scarcity and spatial heterogeneity

Bin Yang ¹, Anqi He ², Zhong Ren ³, Kai Yu ⁴, Gang Zhao ⁵, Yanchun Fan ⁶, Qi Wang ⁷, Shenglian Luo ⁸

¹College of Electrical and Information Engineering and Key Laboratory of Visual Perception and Artificial Intelligence of Hunan Province, Hunan University, Changsha 410082, PR China; Key Laboratory of Jiangxi Province for Persistent Pollutants Prevention Control and Resource Reuse, Nanchang Hangkong University, Nanchang 330063, PR China. Electronic address: binyang@hnu.edu.cn.
²College of Electrical and Information Engineering and Key Laboratory of Visual Perception and Artificial Intelligence of Hunan Province, Hunan University, Changsha 410082, PR China. Electronic address: anqihe@hnu.edu.cn.
³Key Laboratory of Jiangxi Province for Persistent Pollutants Prevention Control and Resource Reuse, Nanchang Hangkong University, Nanchang 330063, PR China. Electronic address: renzhong424@163.com.
⁴Key Laboratory of Jiangxi Province for Persistent Pollutants Prevention Control and Resource Reuse, Nanchang Hangkong University, Nanchang 330063, PR China. Electronic address: recarudo@yeah.net.
⁵Jiangxi Academy of Eco-environmental Sciences and Planning, Nanchang 330000, PR China. Electronic address: zhaogang6766@126.com.
⁶Jiangxi Academy of Eco-environmental Sciences and Planning, Nanchang 330000, PR China. Electronic address: fanych@sthjt.jiangxi.gov.cn.
⁷National-Regional Joint Engineering Research Center for Soil Pollution Control and Remediation in South China, Guangdong Key Laboratory of Integrated Agro-environmental Pollution Control and Management, Institute of Eco-environmental and Soil Sciences, Guangdong Academy of Science, Guangzhou 510650, PR China. Electronic address: Wangqi@soil.gd.cn.
⁸Key Laboratory of Jiangxi Province for Persistent Pollutants Prevention Control and Resource Reuse, Nanchang Hangkong University, Nanchang 330063, PR China. Electronic address: sllou@hnu.edu.cn.

Abstract

Estimating soil heavy metal pollution risk at large scales remains challenging because of data scarcity and spatial heterogeneity. Although traditional machine learning (ML) methods offer notable predictive capabilities, they often struggle with high-dimensional, heterogeneous data, limited labeled samples, and insufficient interpretability. In this study, we propose a transfer learning (TL)-based deep learning (DL) framework built on convolutional neural networks (CNN), termed TL-CNN, that integrates remote sensing-based (RSs), web-based (WBs), and field-sampled datasets (including spatial regionalization features, SRs) to efficiently predict soil heavy metal pollution. By coupling hierarchical feature extraction with a GradSHAP interpretability module, the approach provides both predictive accuracy and explanatory insights. Results from Shaoguan City (2018–2022) demonstrate that the TL-CNN model substantially outperforms conventional ML methods, particularly under multi-metal pollution scenarios, with overall accuracy exceeding 84%. By leveraging TL, the model adaptively addresses data scarcity, reducing the need for costly field sampling and mitigating interpolation errors. The RSs- and WBs-derived features capture critical environmental variability and anthropogenic emissions, while the SRs refine local pollution patterns. GradSHAP analyses highlight the pivotal role of RSs features and spatial metrics in large-scale predictions. Overall, the proposed TL-CNN model underscores the potential of multi-source heterogeneous datasets and TL-based DL strategies to promote sustainable soil management.
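The abstract does not spell out how the GradSHAP module computes feature contributions. As a rough illustration only, the sketch below assumes that GradSHAP refers to a gradient-based SHAP approximation (expected gradients): gradients are averaged along straight paths from background samples to the input and scaled by the input-minus-baseline difference. The model is a toy logistic classifier, not the authors' TL-CNN, and the feature names, weights, and sample values are all hypothetical. A fixed grid of interpolation points is used in place of random sampling so the result is reproducible.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Pollution probability from a toy logistic model (stand-in for a CNN)."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

def gradient(weights, bias, x):
    """Analytic gradient of the logistic output with respect to the inputs."""
    p = predict(weights, bias, x)
    return [p * (1.0 - p) * w for w in weights]

def expected_gradients(weights, bias, x, baselines, steps=50):
    """Average (x - baseline) * gradient over baselines and path positions."""
    n = len(x)
    attributions = [0.0] * n
    for base in baselines:
        for k in range(steps):
            alpha = (k + 0.5) / steps  # midpoint grid on [0, 1]
            point = [b + alpha * (v - b) for v, b in zip(x, base)]
            grad = gradient(weights, bias, point)
            for i in range(n):
                attributions[i] += (x[i] - base[i]) * grad[i]
    scale = 1.0 / (len(baselines) * steps)
    return [a * scale for a in attributions]

# Hypothetical features: one remote sensing index, one web-derived
# distance metric, and one spatial regionalization indicator.
FEATURES = ["ndvi", "distance_to_smelter", "region_index"]
WEIGHTS = [-1.2, -2.0, 0.7]   # made-up model parameters
BIAS = 0.3
BASELINES = [[0.2, 0.9, 0.0], [0.5, 0.6, 1.0], [0.8, 0.4, 0.0]]  # background samples
SAMPLE = [0.3, 0.1, 1.0]      # the location being explained

attributions = expected_gradients(WEIGHTS, BIAS, SAMPLE, BASELINES)
mean_baseline = sum(predict(WEIGHTS, BIAS, b) for b in BASELINES) / len(BASELINES)
gap = predict(WEIGHTS, BIAS, SAMPLE) - mean_baseline

for name, value in zip(FEATURES, attributions):
    print(f"{name}: {value:+.4f}")
print(f"sum of attributions: {sum(attributions):+.4f}, prediction gap: {gap:+.4f}")
```

The attributions should sum, up to numerical integration error, to the difference between the prediction for the sample and the mean prediction over the background set, which is the property that lets per-feature contributions be compared across feature groups.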
