Learning vs. understanding: When does artificial intelligence outperform process-based modeling in soil organic carbon prediction?

  • FFoQSI, Technopark 1/Haus D, 3430 Tulln an der Donau, Austria.

Summary

This summary is machine-generated.

Machine learning (ML) models excel at predicting soil organic carbon (SOC) with large datasets. However, process-based models remain superior for small datasets common in long-term ecological research.

Area Of Science

  • Ecological modeling
  • Soil science
  • Machine learning applications

Background

  • Machine learning (ML) algorithms are increasingly used for ecological modeling.
  • ML has rarely been evaluated for predicting soil organic carbon (SOC) from the small datasets typical of long-term studies.
  • ML performance for SOC prediction has not been compared against traditional process-based models.

Purpose Of The Study

  • To compare the performance of ML algorithms, process-based models, and ensembles for SOC prediction.
  • To evaluate model performance using data from five long-term experimental sites in Austria.
  • To assess the impact of dataset size and cross-validation strategies on ML performance.

Main Methods

  • Compared ML algorithms (Random Forest, Support Vector Machines (SVM) with a polynomial kernel), calibrated and uncalibrated process-based models, and ensembles.
  • Used 256 independent data points from five long-term experimental sites in Austria.
  • Applied leave-one-site-out cross-validation and reduced training sample sizes to test model robustness.
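
The leave-one-site-out scheme named above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the study's code: the site labels, SOC values, and the mean-value baseline predictor are invented, and the helper names (`leave_one_site_out`, `rmse`, `predict_training_mean`) are hypothetical. Each fold holds out every sample from one site and trains on the remaining sites, so the score reflects how well a model transfers to a site it has never seen.

```python
import math
from collections import defaultdict


def leave_one_site_out(sites):
    """Yield (held_out_site, train_idx, test_idx), one fold per distinct site."""
    by_site = defaultdict(list)
    for i, site in enumerate(sites):
        by_site[site].append(i)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = [i for s, idx in by_site.items() if s != held_out for i in idx]
        yield held_out, train_idx, test_idx


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def predict_training_mean(y_train, n_test):
    """Placeholder model (assumption): predict the training mean for every test sample."""
    mean = sum(y_train) / len(y_train)
    return [mean] * n_test


# Invented toy data: three sites with two SOC observations each.
sites = ["A", "A", "B", "B", "C", "C"]
soc = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]

scores = {}
for site, train_idx, test_idx in leave_one_site_out(sites):
    y_train = [soc[i] for i in train_idx]
    y_test = [soc[i] for i in test_idx]
    scores[site] = rmse(y_test, predict_training_mean(y_train, len(y_test)))

for site, score in scores.items():
    print(f"held-out site {site}: RMSE = {score:.3f}")
```

In practice the placeholder predictor would be replaced by a fitted ML or process-based model; the splitting logic stays the same.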

Main Results

  • ML approaches (Random Forest, SVM) outperformed process-based models when using all available data.
  • ML algorithm performance decreased significantly with reduced training data or leave-one-site-out cross-validation.
  • Process-based model accuracy was highly dependent on proper calibration and model combinations.

Conclusions

  • ML models are superior for SOC prediction when large datasets are available.
  • Process-based models are more suitable for exploring underlying biophysical and biochemical mechanisms of SOC dynamics.
  • Ensembles combining ML algorithms and process-based models are recommended to leverage the strengths of both approaches.
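
One simple way to build such an ensemble is to average the per-sample predictions of the individual models with equal weights. The summary does not say how the study combined its models, so the sketch below is an assumption made for illustration only; the function name `ensemble_mean`, the model labels, and the prediction values are all hypothetical.

```python
def ensemble_mean(predictions):
    """Equal-weight average of per-sample predictions from several models.

    `predictions` maps a model name to a list of predicted SOC values;
    all lists must cover the same samples in the same order.
    """
    if not predictions:
        raise ValueError("at least one model is required")
    lengths = {len(p) for p in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all models must predict the same number of samples")
    n_samples = lengths.pop()
    n_models = len(predictions)
    return [
        sum(p[i] for p in predictions.values()) / n_models
        for i in range(n_samples)
    ]


# Invented predictions for two samples from one ML and one process-based model.
predictions = {
    "random_forest": [1.0, 2.0],
    "process_based": [2.0, 4.0],
}
combined = ensemble_mean(predictions)
print(combined)
```

Weighted schemes, for example weights based on each model's cross-validated error, are a natural extension but are not described in the source.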