Classifying Soil Origins with High Accuracy Using ML

In a paper published in the journal Sensors, researchers investigated how well soils can be classified by parent material, using 59 samples from four parent materials analyzed with X-ray fluorescence (XRF), inductively coupled plasma–optical emission spectrometry (ICP-OES), and visible near-infrared (Vis-NIR) spectroscopy.

Study area (Sanliurfa), profiles, and soil-sampling locations. Image Credit: https://www.mdpi.com/1424-8220/24/16/5126

Machine learning (ML) algorithms, including support vector machine (SVM), ensemble subspace k-nearest neighbor (ESKNN), and ensemble bagged trees (EBTs), were employed, with ESKNN achieving the highest accuracy. Key variables were identified using Relief algorithms, and reclassification with these variables showed reduced accuracy for Vis-NIR data, though ESKNN remained the most effective.
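To illustrate the idea behind a subspace k-nearest neighbor ensemble, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the ensemble size, the subspace size, and the toy feature values are all assumptions made for illustration. Each learner is a k-NN classifier that only sees a random subset of the features, and the ensemble predicts by majority vote.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, features, k=3):
    """Majority vote of the k nearest training samples, with squared
    Euclidean distance computed on the given feature subset only."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def draw_subspaces(n_features, n_learners=15, subspace_size=2, seed=0):
    """k-NN is a lazy learner, so 'fitting' the ensemble only means
    drawing one random feature subset per learner."""
    rng = random.Random(seed)
    return [rng.sample(range(n_features), subspace_size)
            for _ in range(n_learners)]


def ensemble_predict(subspaces, train_X, train_y, x, k=3):
    """Combine the per-subspace k-NN predictions by majority vote."""
    votes = Counter(
        knn_predict(train_X, train_y, x, feats, k) for feats in subspaces
    )
    return votes.most_common(1)[0][0]


# Toy data: made-up, scaled feature values, NOT measurements from the study.
train_X = [
    [0.9, 0.8, 0.1, 0.2],
    [1.0, 0.9, 0.2, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.2, 0.9, 1.0],
    [0.2, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.8, 0.8],
]
train_y = ["basalt", "basalt", "basalt", "marl", "marl", "marl"]

subspaces = draw_subspaces(n_features=4)
print(ensemble_predict(subspaces, train_X, train_y, [0.95, 0.85, 0.15, 0.15]))
```

Because every learner works on a different feature subset, the ensemble is less sensitive to any single noisy variable than one k-NN model trained on all features.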

Past work has shown that soil classification is essential for applications ranging from agriculture to forensic science, with methods such as XRF, ICP-OES, and Vis-NIR providing valuable insights into soil origins and properties. These techniques have each been assessed in combination with machine learning algorithms such as SVM, EBT, and ESKNN, but few studies have compared their effectiveness on the same samples.

Soil Analysis Overview

The study was conducted in Şanlıurfa, a province in Turkey's Southeastern Anatolia region, which plays a crucial role in the Southeastern Anatolia Project (GAP), one of the largest global initiatives for irrigation and electricity production. Soil samples were collected from 12 profiles across four parent materials (mudflows, Eocene–Miocene limestones, Neogene marls, and basalt), yielding 59 samples. The soils were classified as Vertisols and Inceptisols, noted for their high clay content, low organic matter, and elevated calcium levels.

The team used two primary analytical techniques to determine the concentrations of major elements, including silicon, aluminum, calcium, and manganese: ICP-OES and XRF. For ICP-OES analysis, soil samples were finely ground, dried, and subjected to microwave digestion to dissolve the compounds before analysis. This method allowed for precise quantification of the elemental components within the soil.

In the XRF analysis, the soil samples were ground, dried, mixed with a binder, and pressed into pellets. This preparation enabled the accurate measurement of elemental concentrations using a polarized energy-dispersive XRF spectrometer. Together, the XRF and ICP-OES analyses provided detailed information on the soils' elemental composition, which helps explain how the different parent materials affect soil properties.

The researchers also performed Vis-NIR spectral reflectance analysis, measuring the reflectance of the soil samples across the 350–2500 nm wavelength range. Taken together, these analytical techniques characterized the soils of the study area and provided information relevant to the region's agricultural potential.

Elemental Analysis Summary

The distribution of total element concentrations, analyzed through ICP-OES and XRF, showed that silicon dioxide (SiO₂) made up the largest share, followed by calcium oxide (CaO), aluminum oxide (Al₂O₃), or iron oxide (Fe₂O₃), with the ranking varying by parent material. In the marl parent material, CaO ratios were higher than SiO₂ ratios. Differences among the parent materials were statistically significant, underscoring the discriminative power of these measurements, with variations attributed to the effectiveness of the acids used during sample pretreatment. Further details on elemental variations are available in prior research.

Spectral reflectance analysis was conducted using the Vis-NIR technique. The reflectance levels varied according to soil content, showing that soils from different parent materials exhibited unique spectral characteristics. Basaltic soils had the lowest reflectance intensity due to their high iron oxide content. Peaks in the 400–1000 nm range were associated with iron oxides, whereas absorption bands between 1400 and 2500 nm were associated with hydroxyl groups, soil water, and clay minerals. The band around 2350 nm correlated with high calcium carbonate (CaCO₃) content in the soil.

The soil samples were classified by parent material using several machine learning methods, including ESKNN, SVM, and EBT models. Performance metrics such as F score, accuracy, and geometric mean were used to assess model effectiveness. Classification accuracies varied from 0.7 to nearly 1, with Vis-NIR and XRF generally yielding better results than ICP-OES. The researchers addressed overfitting by reducing the number of input parameters, using the Relief method to identify significant features. Results showed that models using XRF and Vis-NIR data often achieved high classification accuracy, with Vis-NIR sometimes performing slightly better.
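The three metrics named above can be computed directly from true and predicted labels. The article does not state which multi-class definitions the authors used, so the sketch below assumes a macro-averaged F score and a geometric mean of per-class recall; the function names and the example labels are hypothetical.

```python
import math


def per_class_counts(y_true, y_pred):
    """Count true positives, false positives, and false negatives per class."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = {c: {"tp": 0, "fp": 0, "fn": 0} for c in labels}
    for t, p in zip(y_true, y_pred):
        if t == p:
            counts[t]["tp"] += 1
        else:
            counts[p]["fp"] += 1
            counts[t]["fn"] += 1
    return counts


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f_score(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed definition)."""
    scores = []
    for c in per_class_counts(y_true, y_pred).values():
        denom = 2 * c["tp"] + c["fp"] + c["fn"]
        scores.append(2 * c["tp"] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def geometric_mean_recall(y_true, y_pred):
    """Geometric mean of per-class recall (assumed definition)."""
    recalls = []
    for c in per_class_counts(y_true, y_pred).values():
        support = c["tp"] + c["fn"]
        if support:  # skip labels that never occur in y_true
            recalls.append(c["tp"] / support)
    return math.prod(recalls) ** (1 / len(recalls))


# Hypothetical labels for eight samples; one marl sample is misclassified.
y_true = ["basalt", "basalt", "marl", "marl",
          "limestone", "limestone", "mudflow", "mudflow"]
y_pred = ["basalt", "basalt", "marl", "limestone",
          "limestone", "limestone", "mudflow", "mudflow"]

print(accuracy(y_true, y_pred))
print(macro_f_score(y_true, y_pred))
print(geometric_mean_recall(y_true, y_pred))
```

The geometric mean drops to zero if any class is never recognized, so it penalizes a model that ignores a small class more heavily than accuracy does, which is useful when classes are unevenly sized.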

ESKNN generally performed better across the different datasets, while SVM excelled with ICP-OES data. Techniques such as ICP-OES and XRF provided higher accuracy in soil characterization than Vis-NIR, which, despite being cost-effective and requiring less sample preparation, may be affected by soil mineral content. The study highlights the need for region-specific spectral models to optimize Vis-NIR spectroscopy performance and suggests that each classification algorithm's effectiveness depends on dataset characteristics and study objectives.

Conclusion

To sum up, this study effectively identified soil sources from various parent materials using advanced analytical and classification techniques. ESKNN consistently outperformed the other algorithms, and using only the Relief-selected variables did not significantly change performance compared with using all variables. The models achieved up to 100% accuracy, demonstrating their potential for precise soil source determination. Although training on excavated soil profiles and testing on surrounding samples resulted in lower accuracy, the approach showed promising results.
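For readers unfamiliar with Relief-style variable selection, the following is a simplified sketch of the idea rather than the exact variant used in the paper. It assumes that, for each sample, the nearest neighbor of the same class (the "hit") and the nearest neighbor of any other class (the "miss") are compared feature by feature: a feature gains weight when it differs on the miss and loses weight when it differs on the hit. The function name and the toy data are hypothetical.

```python
def relief_weights(X, y):
    """Simplified Relief: score each feature by how well it separates
    nearest neighbors of different classes. Assumes at least two classes
    and at least two samples per class."""
    n, d = len(X), len(X[0])

    # Feature ranges normalize per-feature differences to [0, 1].
    spans = []
    for f in range(d):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, f):
        return abs(a[f] - b[f]) / spans[f]

    def distance(a, b):
        return sum(diff(a, b, f) for f in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: distance(X[i], X[j]))
        miss = min(misses, key=lambda j: distance(X[i], X[j]))
        for f in range(d):
            # Reward separation from the miss, penalize spread from the hit.
            weights[f] += (diff(X[i], X[miss], f) - diff(X[i], X[hit], f)) / n
    return weights


# Made-up data: feature 0 tracks the class, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.5],
     [0.8, 0.4], [0.9, 0.8], [1.0, 0.2]]
y = ["a", "a", "a", "b", "b", "b"]

weights = relief_weights(X, y)
ranked = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)
print(ranked)
```

Keeping only the top-ranked features and retraining the classifier is one way to reduce the number of inputs, which is the role the article attributes to Relief in this study.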

Journal reference:
  • Yüsra İnci, et al. (2024). Machine Learning-Based Classification of Soil Parent Materials Using Elemental Concentration and Vis-NIR Data. Sensors, 24(16), 5126. DOI: 10.3390/s24165126, https://www.mdpi.com/1424-8220/24/16/5126

Written by

Silpaja Chandrasekar

Dr. Silpaja Chandrasekar has a Ph.D. in Computer Science from Anna University, Chennai. Her research expertise lies in analyzing traffic parameters under challenging environmental conditions. Additionally, she has gained valuable exposure to diverse research areas, such as detection, tracking, classification, medical image analysis, cancer cell detection, chemistry, and Hamiltonian walks.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Chandrasekar, Silpaja. (2024, August 19). Classifying Soil Origins with High Accuracy Using ML. AZoAi. Retrieved on October 20, 2025 from https://www.azoai.com/news/20240819/Classifying-Soil-Origins-with-High-Accuracy-Using-ML.aspx.

  • MLA

    Chandrasekar, Silpaja. "Classifying Soil Origins with High Accuracy Using ML". AZoAi. 20 October 2025. <https://www.azoai.com/news/20240819/Classifying-Soil-Origins-with-High-Accuracy-Using-ML.aspx>.

  • Chicago

    Chandrasekar, Silpaja. "Classifying Soil Origins with High Accuracy Using ML". AZoAi. https://www.azoai.com/news/20240819/Classifying-Soil-Origins-with-High-Accuracy-Using-ML.aspx. (accessed October 20, 2025).

  • Harvard

    Chandrasekar, Silpaja. 2024. Classifying Soil Origins with High Accuracy Using ML. AZoAi, viewed 20 October 2025, https://www.azoai.com/news/20240819/Classifying-Soil-Origins-with-High-Accuracy-Using-ML.aspx.
