A Q-learning approach to waste rock reduction in open-pit mine design based on cleaner production principles
Why Smarter Mines Matter
Modern society runs on metals, from the copper in our phones to the wiring in power grids. Getting those metals, however, often means carving enormous open pits into the earth and moving staggering amounts of rock. Most of that rock is waste that must be hauled, dumped, and monitored for decades. This study explores a new way to design open-pit mines that uses artificial intelligence, specifically a method called Q-learning, to reduce waste rock and its environmental damage while still keeping mines profitable.
The Hidden Cost of Moving Mountains
In a typical open-pit copper mine, engineers first define the ultimate pit limit: the outer shell enclosing all the rock that is worth removing over the life of the mine. Inside that shell lie the ore that contains valuable metal and the waste rock that must be stripped away to reach it; outside it lies rock that is too costly to mine. Traditional design methods focus almost entirely on the money made from selling metal minus the direct costs of drilling, blasting, hauling, and processing. They largely ignore the long-term environmental bills for dealing with waste rock, such as land degradation, pollution, and the risk of acid mine drainage. As a result, a pit can look attractive on paper while quietly locking in huge future liabilities for cleanup and water treatment.
Teaching a Digital Agent to Dig
The researchers recast pit design as a learning problem rather than a one-time calculation. They divide the orebody into thousands of three-dimensional blocks, each with its own revenue, mining cost, processing cost, and a carefully estimated environmental cost per ton of ore and waste. A computer “agent” then practices mining these blocks step by step inside a simulated mine. When it chooses blocks that increase overall value while honoring safe wall angles, it earns a positive reward; when it violates slope rules or chases blocks that become unprofitable once environmental impacts are counted, it is penalized. Over many training cycles, the agent uses Q-learning to discover a mining pattern—a policy—that balances profit with lower waste and fewer environmental burdens.
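For readers who want to see the learning loop in concrete terms, here is a minimal sketch of tabular Q-learning on a toy two-bench cross-section. It is not the authors' implementation. The block values, the 45-degree wall rule, the learning rate, the exploration schedule, and every name in the code (`BLOCK_VALUES`, `feasible_actions`, `train`, `greedy_pit`) are assumptions made for illustration. One simplification to note: instead of penalizing slope violations as the study describes, the sketch only offers the agent blocks that already satisfy the wall rule.

```python
import random

# Hypothetical 2-D section: rows are benches (top first), columns are blocks.
# Each number is a net block value that already includes an assumed
# environmental cost; negative blocks are waste, the positive block is ore.
BLOCK_VALUES = [
    [-2.0, -2.0, -2.0],
    [-2.0, 10.0, -2.0],
]
ROWS, COLS = len(BLOCK_VALUES), len(BLOCK_VALUES[0])
STOP = None  # special action: stop mining and end the episode


def feasible_actions(mined):
    """STOP plus every unmined block whose overlying blocks are already mined."""
    actions = [STOP]
    for r in range(ROWS):
        for c in range(COLS):
            if (r, c) in mined:
                continue
            # Assumed 45-degree wall: the up-to-three blocks directly above.
            above = [(r - 1, cc) for cc in (c - 1, c, c + 1) if 0 <= cc < COLS]
            if r == 0 or all(b in mined for b in above):
                actions.append((r, c))
    return actions


def train(episodes=3000, alpha=0.5, gamma=1.0, seed=0):
    """Tabular Q-learning with a linearly decaying epsilon-greedy policy."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) -> estimated future value
    for ep in range(episodes):
        epsilon = max(0.1, 1.0 - ep / episodes)
        mined = frozenset()  # state: the set of blocks mined so far
        while True:
            actions = feasible_actions(mined)
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q.get((mined, a), 0.0))
            if action is STOP:
                reward, nxt, best_next = 0.0, mined, 0.0
            else:
                reward = BLOCK_VALUES[action[0]][action[1]]
                nxt = mined | {action}
                best_next = max(q.get((nxt, a), 0.0) for a in feasible_actions(nxt))
            # Standard Q-learning update toward reward + discounted best next value.
            old = q.get((mined, action), 0.0)
            q[(mined, action)] = old + alpha * (reward + gamma * best_next - old)
            if action is STOP:
                break
            mined = nxt
    return q


def greedy_pit(q):
    """Follow the learned policy without exploration and return the final pit."""
    mined = frozenset()
    while True:
        action = max(feasible_actions(mined), key=lambda a: q.get((mined, a), 0.0))
        if action is STOP:
            return mined
        mined = mined | {action}


q_table = train()
pit = greedy_pit(q_table)
pit_value = sum(BLOCK_VALUES[r][c] for r, c in pit)
print(sorted(pit), pit_value)
```

With these toy numbers, the learned pit should consist of the three top waste blocks plus the single ore block beneath them, for a net value of 4. The two waste blocks beside the ore stay in the ground, which is the behavior the study is after at a much larger scale.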

From Toy Models to a Giant Copper Pit
To test the idea, the team first applied the Q-learning framework to small two- and three-dimensional test deposits. In these experiments, the digital agent gradually improved its strategy: early pit shapes were jagged and inefficient, but after thousands of learning steps the pits became smooth, realistic, and economically sound. The key change was that once environmental costs were built into each block’s value, many marginal blocks that once looked attractive turned into net losses, so the agent learned to leave them in the ground. Importantly, the resulting pits mined nearly the same amount of ore but required less waste rock removal.
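The effect on marginal blocks can be illustrated with a small calculation. The sketch below is a simplified block valuation, not the paper's formula. The price, recovery, unit costs, and the names `Block` and `block_value` are hypothetical, chosen so that one low-grade block is slightly profitable under a purely financial valuation and becomes a net loss once an assumed environmental cost per ton is charged.

```python
from dataclasses import dataclass


@dataclass
class Block:
    tonnes: float   # block mass in tonnes
    grade: float    # metal fraction of the block (0.002 means 0.2 %)
    is_ore: bool    # True if the block is sent to the processing plant


# All prices and costs below are hypothetical, for illustration only.
PRICE_PER_TONNE_METAL = 8000.0
RECOVERY = 0.85          # fraction of contained metal recovered by the plant
MINING_COST = 2.0        # per tonne of rock moved (ore or waste)
PROCESSING_COST = 10.0   # per tonne of ore processed
ENV_COST_ORE = 2.0       # assumed environmental cost per tonne of ore
ENV_COST_WASTE = 1.5     # assumed environmental cost per tonne of waste


def block_value(block, include_environment):
    """Net value of mining one block, optionally charging environmental costs."""
    value = -MINING_COST * block.tonnes
    if block.is_ore:
        revenue = block.tonnes * block.grade * RECOVERY * PRICE_PER_TONNE_METAL
        value += revenue - PROCESSING_COST * block.tonnes
    if include_environment:
        unit_cost = ENV_COST_ORE if block.is_ore else ENV_COST_WASTE
        value -= unit_cost * block.tonnes
    return value


marginal = Block(tonnes=1000.0, grade=0.002, is_ore=True)
print(block_value(marginal, include_environment=False))  # positive: looks worth mining
print(block_value(marginal, include_environment=True))   # negative: better left in place
```

Under these assumed figures the block earns roughly 1,600 on a purely financial basis and loses roughly 400 once the environmental charge is applied, so an agent rewarded on the second number has no reason to mine it.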
Real-World Mining, Real-World Trade-Offs
The real proof came from applying the method to the Sarcheshmeh copper mine in Iran, one of the largest copper operations in the country. The new Q-learning-based design was compared with the industry-standard Lerchs–Grossmann algorithm, which optimizes purely for financial return. The traditional design produced slightly higher profit on paper but did so by overlooking environmental costs. The Q-learning design, in contrast, reduced waste rock by millions of tons while recovering almost exactly the same amount of ore. It also ran faster on the same computer, cutting optimization time by about 20 percent. The end result was a somewhat smaller, more compact pit that would disturb less land and expose less material capable of generating acidic runoff, without sacrificing meaningful revenue.

Rethinking What “Profit” Really Means
For non-specialists, the main message is that how we design mines can dramatically change their long-term footprint, even if short-term profits look similar. By teaching an algorithm to treat environmental damage as a real cost from the very first design step, the study shows that it is possible to mine almost as much metal while moving less rock, leaving a smaller scar, and likely paying less for cleanup later. In other words, the smartest mine is not the one that squeezes out every last dollar today, but the one that recognizes that nature’s bill will eventually come due—and plans accordingly.
Citation: Badakhshan, N., Bakhtavar, E., Shahriar, K. et al. A Q-learning approach to waste rock reduction in open-pit mine design based on cleaner production principles. Sci Rep 16, 6447 (2026). https://doi.org/10.1038/s41598-026-35892-w
Keywords: open-pit mining, waste rock, reinforcement learning, sustainable mining, mine design