In a paper published in the journal Sensors, researchers investigated soil classification based on four parent materials using 59 samples analyzed through X-ray fluorescence (XRF), inductively coupled plasma–optical emission spectrometry (ICP-OES), and visible near-infrared (Vis-NIR) spectroscopy.
 Study area (Sanliurfa), profiles, and soil-sampling locations. Image Credit: https://www.mdpi.com/1424-8220/24/16/5126
Machine learning (ML) algorithms, including support vector machine (SVM), ensemble subspace k-nearest neighbor (ESKNN), and ensemble bagged trees (EBTs), were employed, with ESKNN achieving the highest accuracy. Key variables were identified using Relief algorithms, and reclassification with these variables showed reduced accuracy for Vis-NIR data, though ESKNN remained the most effective.
Past work has shown that soil classification is essential for applications ranging from agriculture to forensic science, with methods like XRF, ICP-OES, and Vis-NIR providing valuable insights into soil origins and properties. These techniques have each been assessed in combination with machine learning algorithms such as SVM, EBT, and ESKNN, but only a few studies have compared their effectiveness on the same samples.
Soil Analysis Overview
The study was conducted in Şanlıurfa, a province in Turkey's Southeastern Anatolia region, which plays a crucial role in the Southeastern Anatolia Project (GAP), one of the world's largest initiatives for irrigation and electricity production. A total of 59 soil samples were collected from 12 profiles across four parent materials: mudflows, Eocene–Miocene limestones, Neogene marls, and basalt materials. The soils were classified as Vertisols and Inceptisols, noted for their high clay content, low organic matter, and elevated calcium levels.
The team used two primary analytical techniques, ICP-OES and XRF, to determine the concentrations of major elements, including silicon, aluminum, calcium, and manganese. For ICP-OES analysis, soil samples were finely ground, dried, and subjected to microwave digestion to dissolve the compounds before analysis. This method allowed for precise quantification of the elemental components within the soil.
For XRF analysis, the soil samples were ground, dried, mixed with a binder, and pressed into pellets. This preparation enabled accurate measurement of elemental concentrations using a polarized energy-dispersive XRF spectrometer. Together, the XRF and ICP-OES measurements provided detailed information on the elemental composition of the soils, showing how the different parent materials affected soil properties.
Analysts performed Vis-NIR spectral reflectance analysis to measure the reflectance of the soil samples within the 350–2500 nm wavelength range. Combined, these analytical techniques characterized the soils of the research area and provided information relevant to the region's agricultural potential.
Elemental Analysis Summary
The distribution of total element concentrations, analyzed through ICP-OES and XRF, showed that silicon dioxide (SiO₂) accounted for the highest percentage, followed by calcium oxide (CaO), aluminum oxide (Al₂O₃), or iron oxide (Fe₂O₃), with rankings varying based on the parent material. In the marl parent material, CaO ratios were higher than SiO₂ ratios. Statistically significant differences among the parent materials underscored their discriminative power, with variations attributed to the effectiveness of the acids used during the sample pretreatment process. Further details on elemental variations are available in prior research.
Spectral reflectance analysis was conducted using the Vis-NIR technique. The reflectance levels varied according to soil content, showing that soils from different parent materials exhibited unique spectral characteristics. Basaltic soils had the lowest reflectance intensity due to their high iron oxide content. Peaks in the 400–1000 nm range were associated with iron oxides, whereas absorption bands between 1400 and 2500 nm were associated with hydroxyl groups, soil water, and clay minerals. The band around 2350 nm correlated with high calcium carbonate (CaCO₃) content in the soil.
Soil samples were classified by parent material using several machine learning methods, including ESKNN, SVM, and EBT models. Performance metrics such as F-score, accuracy, and geometric mean were used to assess model effectiveness. Classification accuracies varied from 0.7 to nearly 1, with Vis-NIR and XRF generally yielding better results than ICP-OES. Researchers addressed overfitting by reducing input parameters and using the Relief method to identify significant features. Results showed that models using XRF and Vis-NIR data often achieved high classification accuracy, though Vis-NIR sometimes performed slightly better.
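As a rough illustration of the feature-selection step, the sketch below implements a simplified Relief weighting in plain Python. It is not the authors' implementation: the function name `relief_weights`, the toy feature values, and the choice of a single nearest hit and a single nearest miss per sample are all assumptions made for illustration (multi-class variants such as ReliefF use several neighbors and class priors). A feature gains weight when it differs between a sample and its nearest neighbor from another class, and loses weight when it differs between a sample and its nearest neighbor from the same class.

```python
def relief_weights(X, y):
    """Simplified Relief: one nearest hit and one nearest miss per sample.

    X is a list of feature rows, y the matching class labels.
    Returns one weight per feature; larger means more discriminative.
    """
    n, d = len(X), len(X[0])

    # Range of each feature, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # sample has no same-class or no other-class neighbor
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return weights


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1],
     [0.8, 0.2], [0.9, 0.8], [1.0, 0.4]]
y = ["basalt", "basalt", "basalt", "marl", "marl", "marl"]

for index, weight in enumerate(relief_weights(X, y)):
    print(f"feature {index}: weight {weight:.3f}")
```

Features with the highest weights would be kept for reclassification; how many to keep is a separate choice that this sketch does not address.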
ESKNN generally performed better across different datasets, while SVM excelled with ICP-OES data. Techniques like ICP-OES and XRF provided higher accuracy in soil characterization than Vis-NIR, which, despite being cost-effective and requiring less sample preparation, may be affected by soil mineral content. The study highlights the need for region-specific spectral models to optimize Vis-NIR spectroscopy performance and suggests that each classification algorithm's effectiveness depends on dataset characteristics and study objectives.
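To make the ESKNN idea more concrete, here is a minimal sketch that assumes the method follows the usual random-subspace construction: each k-nearest-neighbor learner sees only a random subset of the features, and the ensemble takes a majority vote. The function names, the toy feature values, and the settings (`n_learners`, `subspace_size`, `k`) are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, query, features, k=1):
    """Classify `query` with k-NN using only the listed feature indices."""
    def squared_distance(row):
        return sum((row[j] - query[j]) ** 2 for j in features)

    order = sorted(range(len(train_X)),
                   key=lambda i: squared_distance(train_X[i]))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def subspace_knn_predict(train_X, train_y, query,
                         n_learners=30, subspace_size=2, k=1, seed=0):
    """Majority vote over k-NN learners, each on a random feature subset."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n_features = len(train_X[0])
    votes = Counter()
    for _ in range(n_learners):
        features = rng.sample(range(n_features), subspace_size)
        votes[knn_predict(train_X, train_y, query, features, k)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three scaled features per soil sample.
train_X = [[0.0, 0.1, 0.2], [0.2, 0.0, 0.1],
           [0.9, 1.0, 0.8], [1.0, 0.8, 0.9]]
train_y = ["basalt", "basalt", "marl", "marl"]

print(subspace_knn_predict(train_X, train_y, [0.1, 0.1, 0.1]))
print(subspace_knn_predict(train_X, train_y, [0.9, 0.9, 0.9]))
```

Training each learner on a different feature subset reduces the influence of any single noisy variable, which may help explain why such ensembles can do well on high-dimensional spectral data with few samples.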
Conclusion
To sum up, this study effectively identified soil sources from various parent materials using advanced analytical and classification techniques. ESKNN consistently outperformed the other algorithms, and using only the Relief-determined variables did not significantly change performance compared with using all variables. The models achieved up to 100% accuracy, demonstrating their potential for precise soil source determination. Although training on excavated soil profiles and testing with surrounding samples resulted in lower accuracy, the approach showed promising results.
Journal reference:
	- İnci, Y., et al. (2024). Machine Learning-Based Classification of Soil Parent Materials Using Elemental Concentration and Vis-NIR Data. Sensors, 24(16), 5126. DOI: 10.3390/s24165126, https://www.mdpi.com/1424-8220/24/16/5126