Clear Sky Science
Human and environmental controls on soil contamination in a dust-prone region revealed by random forest and Shapley additive explanations analysis
Why dust and soil in dry regions matter to you
In many dry parts of the world, winds can pick up tiny soil particles and carry them for hundreds of kilometers. If those particles contain toxic metals such as arsenic or lead, every dust storm becomes a potential health hazard. This study focuses on a dusty region of central Iran, asking a practical question with global relevance: where exactly are these toxic elements building up in the soil, and which human and natural factors are most responsible?

A dusty landscape under pressure
The researchers studied a 1,057-square-kilometer area in Yazd Province, a hot desert zone shaped by strong winds and frequent dust storms. They collected 107 surface soil samples from the top five centimeters of ground, the layer most easily lifted into the air and most likely to contact people, crops, and animals. In these samples they measured five potentially toxic elements—arsenic, cadmium, cobalt, chromium, and lead—along with a suite of soil properties such as grain size, salt content, and mineral indicators. They also assembled detailed maps describing the land’s topography, climate, vegetation, distance to industries and mines, and satellite-based indicators of surface conditions.
Bringing data and machine learning together
Rather than looking for simple one-to-one causes, the team used a machine learning method called random forest to tease out patterns from dozens of overlapping influences. They built eleven different “what-if” scenarios by combining groups of predictors: soil chemistry and texture, land surface features, signals of human activity like roads and factories, meteorological data, and information from satellite images. For each toxic element, they tested how well the model could reproduce measured concentrations at the sampling sites, then chose the scenario that gave the most accurate predictions and used it to map concentrations across the landscape.
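For readers who want to see the idea in code, the sketch below compares a few predictor-group scenarios with a random forest and keeps the one with the best cross-validated score. It is only an illustration under stated assumptions: the data are synthetic, the group and column names (such as dist_industry and ndvi) are hypothetical, only four scenarios are shown rather than the study's eleven, and the use of scikit-learn with five-fold cross-validation and R² is a choice made here, not a description of the authors' setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_samples = 107  # same count as the study's soil samples; the values are synthetic

# Hypothetical predictor groups; these column names are illustrative only.
groups = {
    "soil": ["clay", "salinity"],
    "human": ["dist_industry", "dist_road"],
    "satellite": ["ndvi"],
}
columns = [name for names in groups.values() for name in names]
col = {name: i for i, name in enumerate(columns)}
X = rng.normal(size=(n_samples, len(columns)))

# Synthetic "metal concentration": higher closer to industry, plus a soil effect.
y = (
    -2.0 * X[:, col["dist_industry"]]
    + 1.0 * X[:, col["salinity"]]
    + rng.normal(scale=0.3, size=n_samples)
)

# Each scenario is a combination of predictor groups.
scenarios = {
    "soil_only": ["soil"],
    "human_only": ["human"],
    "satellite_only": ["satellite"],
    "soil_plus_human": ["soil", "human"],
}

def score_scenario(group_names):
    """Mean 5-fold cross-validated R^2 of a random forest on the chosen groups."""
    idx = [col[c] for g in group_names for c in groups[g]]
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    return cross_val_score(model, X[:, idx], y, cv=5, scoring="r2").mean()

scores = {name: score_scenario(g) for name, g in scenarios.items()}
best = max(scores, key=scores.get)

for name, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:16s} R^2 = {value:.2f}")
print("best scenario:", best)
```

In this toy setup the scenarios that include the informative groups score well above the satellite-only scenario, which contains nothing related to the target. The same loop would be repeated once per element when each element gets its own best scenario.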
What the models revealed about hidden pollution
The analysis showed that cadmium, cobalt, arsenic, and chromium could be predicted reasonably well, while lead proved much harder to map accurately—likely because its concentrations were very uneven, with a few sharp hotspots among mostly low values. For arsenic, cobalt, and chromium, the best-performing models relied mainly on a combination of human activity information and soil properties. Cadmium required a broader mix, including land surface and satellite data. The resulting maps highlighted clear hotspots: arsenic and cadmium were highest near central and western industrial zones and a major highway, while cobalt and chromium peaked near an urban area in the north and an economic zone in the southwest. Even where average concentrations were moderate, these focused accumulations in a wind-eroded landscape raise concerns for both local residents and areas downwind.
Who or what is driving the contamination?
To move beyond “black box” predictions, the study used an interpretability tool known as SHAP, which assigns each environmental factor a share of responsibility for the model’s output. Human-related factors emerged as the dominant drivers for arsenic, cadmium, and cobalt, and a major contributor for chromium. In particular, the distance to industrial centers stood out: soils closer to factories tended to have higher metal levels. Among soil properties, calcium and magnesium in the soil solution, along with magnetic susceptibility (a magnetic signal linked to certain minerals and dust inputs), were especially important. Together, these findings point to diffuse, widespread contamination from industrial emissions and traffic rather than isolated point spills. Land surface features and satellite-derived indicators played a secondary but still meaningful role, especially for cadmium, capturing how terrain roughness and surface reflectance influence where metals settle and build up.
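The "share of responsibility" idea behind SHAP comes from Shapley values, and a small worked example may make it concrete. The sketch below computes exact Shapley values by brute force for a toy three-factor model. Everything in it is an assumption for illustration: the toy_model formula, the factor names, and the site and reference values are invented, and a single reference site stands in for the background data that SHAP tools normally average over. It is not the authors' model or the SHAP library itself.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input."""
    names = list(x)
    n = len(names)

    def value(subset):
        # Factors in `subset` take the site's value; the rest stay at baseline.
        mixed = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_factor = value(set(subset) | {name})
                without_factor = value(set(subset))
                total += weight * (with_factor - without_factor)
        phi[name] = total
    return phi

# Hypothetical model: metal level falls with distance to industry and rises
# with an interaction between calcium and magnetic susceptibility.
def toy_model(f):
    return (
        10.0
        - 0.5 * f["dist_industry_km"]
        + 0.2 * f["calcium"] * f["magnetic_susceptibility"]
    )

site = {"dist_industry_km": 2.0, "calcium": 5.0, "magnetic_susceptibility": 3.0}
reference = {"dist_industry_km": 10.0, "calcium": 2.0, "magnetic_susceptibility": 1.0}

phi = shapley_values(toy_model, site, reference)
for name, contribution in phi.items():
    print(f"{name:24s} {contribution:+.2f}")
print("sum of contributions:", round(sum(phi.values()), 6))
print("prediction gap      :", round(toy_model(site) - toy_model(reference), 6))
```

The contributions add up exactly to the gap between the prediction at the site and the prediction at the reference, which is the property that lets each factor be read as a share of the model's output. Brute force is only practical for a handful of factors; tools built for tree models use faster algorithms.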

What this means for people and land
In plain terms, the study concludes that in this dusty desert region, human activities—especially industrial operations—are the main reason toxic elements are accumulating in the topsoil, with certain soil characteristics helping to trap or release them. The authors show that by carefully combining ground measurements, maps, satellite data, and advanced machine learning, it is possible to pinpoint contamination hotspots even with a limited number of samples. This type of mapping can guide where to monitor air quality, protect agricultural fields, and prioritize cleanup, not only in central Iran but in arid regions worldwide where dust and pollution increasingly intersect.
Citation: Ebrahimi-Khusfi, Z., Ayoubi, S., Samadi-Todar, S.A. et al. Human and environmental controls on soil contamination in a dust-prone region revealed by random forest and Shapley additive explanations analysis. Sci Rep 16, 10073 (2026). https://doi.org/10.1038/s41598-026-40377-x
Keywords: soil pollution, heavy metals, dust storms, industrial contamination, machine learning