Learning vs. understanding: When does artificial intelligence outperform process-based modeling in soil organic carbon prediction?
- 1 FFoQSI, Technopark 1/Haus D, 3430 Tulln an der Donau, Austria.
- 2 Institute of Agronomy, University of Natural Resources and Life Sciences (BOKU) Vienna, Konrad Lorenz-Straße 24, 3430 Tulln an der Donau, Austria; Institute of Soil Research, University of Natural Resources and Life Sciences (BOKU) Vienna, Peter Jordan-Straße 82, 1190 Vienna, Austria.
- 3 Institute of Agronomy, University of Natural Resources and Life Sciences (BOKU) Vienna, Konrad Lorenz-Straße 24, 3430 Tulln an der Donau, Austria.
- 4 Institute of Soil Research, University of Natural Resources and Life Sciences (BOKU) Vienna, Peter Jordan-Straße 82, 1190 Vienna, Austria.
- 5 Institute of Geomatics, University of Natural Resources and Life Sciences (BOKU) Vienna, Peter Jordan-Straße 82, 1190 Vienna, Austria.
- 6 Austrian Agency for Health and Food Safety (AGES), Institute for Soil Health and Plant Nutrition, Spargelfeldstraße 191, 1226 Vienna, Austria.
- 7 Human-Centered AI Lab, Institute of Forest Engineering, University of Natural Resources and Life Sciences (BOKU) Vienna, Peter Jordan-Straße 82, 1190 Vienna, Austria.
Summary
This summary is machine-generated. Machine learning (ML) models excel at predicting soil organic carbon (SOC) with large datasets. However, process-based models remain superior for small datasets common in long-term ecological research.
Area Of Science
- Ecological modeling
- Soil science
- Machine learning applications
Background
- Machine learning (ML) algorithms are increasingly used for ecological modeling.
- Limited evaluation exists for ML in predicting soil organic carbon (SOC) using small datasets typical of long-term studies.
- ML performance for SOC prediction has not been compared against traditional process-based models.
Purpose Of The Study
- To compare the performance of ML algorithms, process-based models, and ensembles for SOC prediction.
- To evaluate model performance using data from five long-term experimental sites in Austria.
- To assess the impact of dataset size and cross-validation strategies on ML performance.
Main Methods
- Comparison of ML algorithms (Random Forest, Support Vector Machines with polynomial kernel), calibrated and uncalibrated process-based models, and ensembles.
- Used 256 independent data points from five long-term experimental sites in Austria.
- Applied leave-one-site-out cross-validation and reduced training sample sizes to test model robustness.
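Leave-one-site-out cross-validation holds out every data point from one site, trains on the remaining sites, and repeats until each site has served once as the test set. The sketch below illustrates only that splitting and scoring logic. The toy SOC values, the site labels, and the mean-predicting stand-in learner are assumptions made for illustration and are not taken from the study; in practice the stand-in would be replaced by a Random Forest, an SVM, or a process-based model.

```python
from collections import defaultdict
from math import sqrt


def leave_one_site_out(sites):
    """Yield (held_out_site, train_idx, test_idx), holding out one site per fold."""
    by_site = defaultdict(list)
    for i, site in enumerate(sites):
        by_site[site].append(i)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = sorted(
            i for site, idx in by_site.items() if site != held_out for i in idx
        )
        yield held_out, train_idx, test_idx


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def fit_mean(y_train):
    """Stand-in learner (assumption): always predicts the training mean."""
    return sum(y_train) / len(y_train)


# Hypothetical toy data: SOC (%) at three sites, NOT values from the study.
sites = ["A", "A", "B", "B", "C", "C"]
soc = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]

scores = {}
for site, train_idx, test_idx in leave_one_site_out(sites):
    prediction = fit_mean([soc[i] for i in train_idx])
    observed = [soc[i] for i in test_idx]
    scores[site] = rmse(observed, [prediction] * len(observed))
```

Because no data point from the held-out site is seen during training, the per-site scores indicate how well a model transfers to an unseen location rather than how well it interpolates within known sites.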
Main Results
- ML approaches (Random Forest, SVM) outperformed process-based models when using all available data.
- ML algorithm performance decreased significantly with reduced training data or leave-one-site-out cross-validation.
- Process-based model accuracy was highly dependent on proper calibration and model combinations.
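The sensitivity to training-set size reported above can be probed by repeatedly shrinking the training set while keeping the test set fixed. The following is a minimal sketch of that mechanic only; the fractions, the seed, the toy values, and the mean-predicting stand-in learner are all assumptions for illustration, not the study's protocol or data.

```python
import random
from math import sqrt


def subsample(indices, fraction, seed=0):
    """Return a reproducible random subset holding `fraction` of `indices` (at least one)."""
    k = max(1, round(fraction * len(indices)))
    return sorted(random.Random(seed).sample(indices, k))


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


# Hypothetical toy data: NOT values from the study.
train_y = [1.0, 1.1, 1.3, 1.6, 1.8, 2.0, 2.1, 2.4, 2.6, 3.0]
test_y = [1.5, 2.2]
train_idx = list(range(len(train_y)))

errors = {}
for fraction in (1.0, 0.5, 0.2):
    subset = subsample(train_idx, fraction)
    # Stand-in learner (assumption): predict the mean of the retained training data.
    prediction = sum(train_y[i] for i in subset) / len(subset)
    errors[fraction] = rmse(test_y, [prediction] * len(test_y))
```

Repeating the loop over several seeds and averaging the errors would give a steadier picture than a single random draw.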
Conclusions
- ML models are superior for SOC prediction when large datasets are available.
- Process-based models are more suitable for exploring underlying biophysical and biochemical mechanisms of SOC dynamics.
- Ensembles combining ML algorithms and process-based models are recommended to leverage the strengths of both approaches.
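The summary does not state how the ensembles were constructed. As one simple possibility, the sketch below averages the predictions of an ML model and a process-based model with equal weights. The function name `ensemble_mean` and the example values are hypothetical.

```python
def ensemble_mean(*prediction_sets):
    """Average predictions element-wise across models (unweighted; an assumption)."""
    if not prediction_sets:
        raise ValueError("need at least one set of predictions")
    n = len(prediction_sets[0])
    if any(len(p) != n for p in prediction_sets):
        raise ValueError("all prediction sets must have the same length")
    return [sum(values) / len(values) for values in zip(*prediction_sets)]


# Hypothetical SOC predictions (%) for three plots: NOT values from the study.
ml_pred = [1.0, 2.0, 3.0]
process_pred = [2.0, 2.0, 5.0]
combined = ensemble_mean(ml_pred, process_pred)
```

Weighted averaging or stacking are alternative ways to combine models; the equal-weight mean is shown here only because it needs no extra fitting step.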

