Manuscript Abstract

INTERPRETABLE MACHINE LEARNING FOR SOYBEAN YIELD PREDICTION WITH SHAP-BASED INSIGHTS
Ibrahim Ahmad Cheema, Muhammad Kashif Hanif, Muhammad Umer Sarwar, Muhammad Irfan Khan

I. A. Cheema, M. K. Hanif*, M. U. Sarwar and M. I. Khan

Department of Computer Science, Government College University, Faisalabad, Pakistan

Corresponding Author: mkashifhanif@gcuf.edu.pk
Page Number(s): 754-765
Published Online First: February 14, 2026
Publication Date: May 05, 2026
ABSTRACT

Accurate crop yield prediction is important for reducing uncertainty and supporting informed decision-making and resource allocation. A variety of machine learning (ML) models are used in yield prediction; however, the available benchmarking literature offers limited insight into how different models balance predictive accuracy and interpretability. Therefore, this study evaluates popular machine learning models for U.S. soybean yield prediction using a multi-source spatiotemporal dataset comprising weather, soil, and management features. Model performance was evaluated using the root mean squared error (RMSE) metric, and feature impact was explained using Shapley Additive Explanations (SHAP) for interpretability. The findings indicate that Random Forest was the best-performing model, achieving the lowest RMSE of 5.07 and the highest correlation coefficient of 90.36% on the test set. SHAP results revealed that precipitation and solar radiation were the leading yield determinants, while soil properties, such as soil pH and bulk density, exerted moderate effects. The contribution of this work is fourfold: (i) a rigorous benchmarking of ML models using accuracy metrics for yield prediction, (ii) evidence affirming the model's superiority for complex agronomic datasets, (iii) a systematic assessment of global feature importance connecting yield-affecting climatic and edaphic factors, and (iv) application of SHAP as a means of interpretation and explainability. The results bring together predictive performance and explanation, providing insights that advance smart agriculture through informed decision-making for irrigation planning, efficient input application, and climate-resilient strategy formulation.
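As an illustration of the two accuracy measures named in the abstract, the following is a minimal sketch of how RMSE and a correlation coefficient can be computed from observed and predicted yields. It assumes the reported correlation is a Pearson coefficient (the abstract does not say which one was used), and the function names and yield values are hypothetical, not taken from the study's dataset.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (assumed here; the source does not specify)."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical yields, for illustration only (not data from the paper).
observed = [50.0, 42.0, 55.0, 47.0]
predicted = [48.0, 44.0, 53.0, 49.0]

print(f"RMSE: {rmse(observed, predicted):.2f}")
print(f"Pearson r: {pearson_r(observed, predicted):.4f}")
```

A lower RMSE means smaller typical prediction errors in the units of the target variable, while a correlation closer to 1 means predictions track the observed ordering and spread more closely. Under the Pearson assumption, a coefficient reported as 90.36% would correspond to r of about 0.9036.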

Keywords: Crop Yield Prediction, Informed Decision-Making, Machine Learning, Smart Agriculture, SHAP Interpretability, Explainable AI
Open Access: This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).


ISSN Details
Verified

Print ISSN: 1018-7081

Electronic ISSN: 2309-8694
