Mid-Infrared spectroscopy calibration models for base cation concentration prediction in soils of the North-West Province, South Africa
Abstract
Fertilizers are essential for plant nutrition to sustain global food demands. Fertilizer recommendations require soil analysis, but conventional laboratory analysis of soil chemistry is often slow and expensive. Mid-Infrared (MIR) spectroscopy is a promising way to overcome these limitations, but it requires soil-specific calibration algorithms. Too few MIR calibration algorithms exist for South African soils. The aim of this study was to create calibration algorithms for the prediction of exchangeable base cation (calcium (Ca2+), magnesium (Mg2+), potassium (K+), and sodium (Na+)) concentrations for soils from the North-West Province, South Africa. Soil analysis data was received from Noordwes Kooperasie (NWK) and Griekwaland Wes Korporatief (GWK), covering 4393 and 175 soil samples, respectively. Conditioned Latin Hypercube Sampling (cLHS) was used to select a total of 1000 samples (900 from NWK, 100 from GWK), of which 979 were deemed fit and represented the soil spectral database (SSD). The samples were crushed and sieved (53 micron) before being scanned over the 4000–600 cm-1 spectral range at 2 cm-1 resolution. The data was captured with OPUS Base software and exported with Spectrograph 1.2 software to R Studio. A spectral library was created in R Studio, using the R programming language, by combining the SSD with the spectra of its samples. The spectral library was divided into training and validation datasets at a 75:25 split. Calibration algorithms were created from the training dataset using Cubist, Partial Least Squares Regression (PLSR) and Random Forest (RF) calibration models. The calibration algorithms were then used to predict values for the validation dataset from the spectral library.
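As an illustration of the 75:25 division described above, the following minimal Python sketch splits 979 sample identifiers into training and validation subsets. It is not the study's R code. The function name split_train_validation, the random shuffle, the fixed seed and the rounding rule are all assumptions made for this example, since the abstract states only the split ratio and not how samples were assigned to each subset.

```python
import random


def split_train_validation(sample_ids, train_fraction=0.75, seed=42):
    """Randomly split sample identifiers into training and validation lists.

    Assumption: a simple random split. The abstract gives only the 75:25
    ratio, not the assignment method, so this is illustrative only.
    """
    ids = list(sample_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 979 is the number of samples reported as fit for the spectral library.
train_ids, validation_ids = split_train_validation(range(979))
print(len(train_ids), len(validation_ids))
```

With this rounding rule the 979 samples divide into 734 training and 245 validation samples; the actual subset sizes used in the study are not reported in the abstract and may differ.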
The accuracy of the models was assessed on the independent validation dataset using the coefficient of determination (R2), root mean square error (RMSE) and ratio of performance to deviation (RPD). Cubist showed the best overall performance, followed by PLSR and then RF. For Cubist, prediction accuracy for the base cations declined in the following order: Ca (R2 = 0.77; RMSE = 129; RPD = 2.09), Mg (R2 = 0.75; RMSE = 40; RPD = 1.89), K (R2 = 0.41; RMSE = 59; RPD = 1.28), and Na (R2 = 0.29; RMSE = 6.45; RPD = 1.14). Base cations are not active in the MIR band; the prediction algorithms instead rely on soil properties that are active in the MIR band and correlate with exchangeable base cations. To improve the accuracy of the models, it is recommended to increase the number of samples, to test additional calibration models, to use different scanning methods, and to include spectral processing before calibration.
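The three validation statistics named above can be sketched in Python as follows. This is a hedged illustration rather than the study's implementation: it assumes R2 is computed as one minus the ratio of residual to total sum of squares (some studies report the squared correlation instead), and that RPD is the sample standard deviation of the observed values divided by the RMSE. The function name regression_metrics and the example values are hypothetical and are not data from the study.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """Return R2, RMSE and RPD for paired observed and predicted values.

    Assumed definitions (not confirmed by the abstract):
      R2   = 1 - SS_residual / SS_total
      RMSE = sqrt(mean squared residual)
      RPD  = sample standard deviation of observed / RMSE
    """
    n = len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    mean_obs = statistics.fmean(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": rmse,
        "rpd": statistics.stdev(observed) / rmse,
    }


# Hypothetical values for illustration only, not measurements from the study.
observed = [100, 200, 300, 400, 500]
predicted = [110, 190, 310, 390, 510]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Because RPD scales the RMSE by the spread of the observed values, it allows comparison between cations whose concentrations differ by orders of magnitude, which RMSE alone does not.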