Clear Sky Science
Hybrid deep learning model for air quality prediction and its impact on healthcare
Why cleaner air and smarter forecasts matter
Air pollution is more than a hazy skyline—it quietly worsens breathing problems, strains the heart, and shortens lives. City authorities now rely on the Air Quality Index (AQI) to warn people when it is unsafe to be outside, but these warnings are often based on yesterday’s data or simple forecasts that miss sudden spikes. This paper explores a new way to predict short-term air quality using a combination of advanced computer models and carefully crafted inputs, with the goal of giving people and health systems earlier and more reliable warnings.
From dirty air to a single health warning number
The study focuses on Gurugram, a fast-growing city in India where traffic, industry, and construction all contribute to poor air. Hourly readings of six key pollutants were gathered over four months using the OpenWeather air pollution service: tiny particles (PM2.5 and PM10), ground-level ozone, nitrogen dioxide, sulfur dioxide, and carbon monoxide. Each pollutant was scored against national safety limits, and the worst of those scores was taken as the city’s overall AQI. This AQI value is what people see in weather apps as categories like “Good,” “Moderate,” “Poor,” or “Severe,” each tied to different levels of health concern.
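The "worst pollutant wins" rule can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the breakpoint numbers are placeholders rather than the official national tables, the linear interpolation inside each band is an assumption, and the names `BREAKPOINTS`, `sub_index`, and `overall_aqi` are hypothetical.

```python
# Placeholder breakpoints: (conc_low, conc_high, index_low, index_high).
# Illustrative values only; consult the official national tables for real use.
BREAKPOINTS = {
    "pm2_5": [(0, 30, 0, 50), (30, 60, 50, 100), (60, 90, 100, 200),
              (90, 120, 200, 300), (120, 250, 300, 400), (250, 500, 400, 500)],
    "pm10": [(0, 50, 0, 50), (50, 100, 50, 100), (100, 250, 100, 200),
             (250, 350, 200, 300), (350, 430, 300, 400), (430, 600, 400, 500)],
}


def sub_index(pollutant, conc):
    """Score one pollutant by linear interpolation inside its band (assumed)."""
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    for c_lo, c_hi, i_lo, i_hi in BREAKPOINTS[pollutant]:
        if c_lo <= conc <= c_hi:
            return i_lo + (i_hi - i_lo) * (conc - c_lo) / (c_hi - c_lo)
    return 500.0  # beyond the last band: cap at the top of the scale


def overall_aqi(readings):
    """Return (AQI, dominant pollutant): the highest sub-index sets the score."""
    scores = {name: sub_index(name, conc) for name, conc in readings.items()}
    worst = max(scores, key=scores.get)
    return scores[worst], worst


# Hypothetical hourly reading, in micrograms per cubic metre.
aqi, dominant = overall_aqi({"pm2_5": 45.0, "pm10": 80.0})
print(aqi, dominant)
```

Because the overall value is a maximum rather than an average, a single pollutant crossing into a worse band is enough to move the whole city into a worse category.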

Teaching computers to read the rhythms of pollution
Instead of simply feeding raw pollutant readings into a model, the authors first engineered extra features that mirror how air actually behaves. They added lagged values to show what pollution looked like a few hours earlier, moving averages to smooth out brief spikes, and ratios such as PM2.5/PM10 to distinguish fine from coarse dust. They also encoded calendar patterns—like time of day, day of week, and month—using cyclic signals to capture routine human activity, such as weekday traffic or weekend slowdowns. These human-designed signals were meant to help the models see subtle trends and interactions that raw numbers alone can hide.
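As a rough sketch of what such engineered features might look like, the Python below builds lagged values, a moving average, the PM2.5/PM10 ratio, and cyclic calendar signals for a list of hourly readings. The lag lengths, the window size, the sine and cosine form of the cyclic encoding, and all field names are assumptions made for illustration; the paper's exact choices may differ.

```python
import math
from datetime import datetime, timedelta


def cyclic(value, period):
    """Map a repeating quantity (hour, weekday, month) onto a circle."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def build_features(rows, lags=(1, 3), window=3):
    """rows: hourly dicts with 'time', 'pm2_5', 'pm10', in chronological order."""
    first = max(max(lags), window - 1)  # skip hours that lack enough history
    features = []
    for t in range(first, len(rows)):
        row = rows[t]
        feat = {"pm2_5": row["pm2_5"], "pm10": row["pm10"]}
        for lag in lags:
            # What pollution looked like `lag` hours earlier.
            feat[f"pm2_5_lag{lag}"] = rows[t - lag]["pm2_5"]
        recent = [rows[j]["pm2_5"] for j in range(t - window + 1, t + 1)]
        feat[f"pm2_5_ma{window}"] = sum(recent) / window  # smooths brief spikes
        # Share of fine particles in the total particle load.
        feat["pm_ratio"] = row["pm2_5"] / row["pm10"] if row["pm10"] > 0 else 0.0
        feat["hour_sin"], feat["hour_cos"] = cyclic(row["time"].hour, 24)
        feat["dow_sin"], feat["dow_cos"] = cyclic(row["time"].weekday(), 7)
        feat["month_sin"], feat["month_cos"] = cyclic(row["time"].month - 1, 12)
        features.append(feat)
    return features


# Hypothetical sample: six hourly readings with steadily rising PM2.5.
start = datetime(2024, 1, 1, 0, 0)
rows = [{"time": start + timedelta(hours=h), "pm2_5": 10.0 * (h + 1), "pm10": 100.0}
        for h in range(6)]
features = build_features(rows)
print(features[0])
```

The sine and cosine pair keeps 23:00 and 00:00 close together, which a plain hour number from 0 to 23 would not.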
Blending two types of deep learning
The researchers compared three deep learning approaches. A one-dimensional convolutional neural network (CNN) excels at spotting local patterns—short bursts or shapes in the data. A long short-term memory (LSTM) network shines at remembering how values evolve over time. The hybrid CNN–LSTM model chains these strengths together: first, CNN layers compress and highlight important features from the pollutant sequences; then LSTM layers track how those features change hour by hour. All three models were trained on most of the data and tested on the remainder, using standard scores such as precision, recall, and F1-score to judge how well they assigned each hour to the correct AQI category.
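To make the "CNN first, then LSTM" chain concrete, here is a toy forward pass written in plain Python. It is a sketch under heavy assumptions, not the authors' network: the weights are random and untrained, the layer sizes, the 24-hour window, and the six output classes are invented for illustration, and a real model would be built and trained in a deep learning library.

```python
import math
import random


def conv1d_relu(seq, kernels, biases):
    """Slide each kernel over time (no padding), then apply ReLU."""
    k = len(kernels[0])
    out = []
    for t in range(len(seq) - k + 1):
        window = seq[t:t + k]
        row = []
        for kernel, bias in zip(kernels, biases):
            total = bias
            for weights, feats in zip(kernel, window):
                total += sum(w * x for w, x in zip(weights, feats))
            row.append(max(0.0, total))
        out.append(row)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_final_state(seq, gates, hidden):
    """Run one LSTM layer over the sequence and return the last hidden state."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in seq:
        xh = list(x) + h  # current input joined with the previous hidden state
        pre = {name: [b[j] + sum(w * v for w, v in zip(W[j], xh)) for j in range(hidden)]
               for name, (W, b) in gates.items()}
        i = [sigmoid(z) for z in pre["input"]]
        f = [sigmoid(z) for z in pre["forget"]]
        o = [sigmoid(z) for z in pre["output"]]
        g = [math.tanh(z) for z in pre["candidate"]]
        c = [f[j] * c[j] + i[j] * g[j] for j in range(hidden)]  # update memory cell
        h = [o[j] * math.tanh(c[j]) for j in range(hidden)]
    return h


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rng, n_rows, n_cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(n_cols)] for _ in range(n_rows)]


# Hypothetical sizes: 24 hours, 6 inputs, kernel width 3, 4 filters, 5 LSTM units.
rng = random.Random(0)
T, F, K, C, H, N_CLASSES = 24, 6, 3, 4, 5, 6
hours = [[rng.random() for _ in range(F)] for _ in range(T)]
kernels = [rand_matrix(rng, K, F) for _ in range(C)]
gates = {name: (rand_matrix(rng, H, C + H), [0.0] * H)
         for name in ("input", "forget", "output", "candidate")}
dense = rand_matrix(rng, N_CLASSES, H)

conv_out = conv1d_relu(hours, kernels, [0.0] * C)   # local patterns
h_final = lstm_final_state(conv_out, gates, H)      # how they evolve over time
probs = softmax([sum(w * v for w, v in zip(row, h_final)) for row in dense])
print(len(conv_out), [round(p, 3) for p in probs])
```

The convolution shortens the sequence and turns each short stretch of hours into a handful of pattern detectors; the LSTM then reads those detectors in order, and a final layer converts its last state into one probability per AQI category.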

Sharper forecasts and what they mean for health
Across repeated experiments, the hybrid model consistently delivered the best balance of accuracy and reliability. With engineered features included, it achieved an F1-score of about 91 percent, slightly ahead of the standalone LSTM and clearly better than the CNN. It also made especially strong distinctions at the dirtiest end of the scale, rarely confusing “Severe” air with safer categories. A simple add-on translated each predicted AQI level into a rough health risk score, indicating, for example, that “Very Poor” and “Severe” conditions correspond to sharply higher chances of breathing and heart problems. The authors stress that these risk scores are guides rather than medical diagnoses, but they show how air-quality forecasts can be turned into more intuitive health signals.
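For readers unfamiliar with the F1-score quoted above, the sketch below shows how precision, recall, and F1 can be computed per AQI category and then averaged. The labels and predictions are made-up examples, and averaging the per-class F1 values equally (macro-averaging) is an assumption; the paper may weight the classes differently.

```python
def per_class_scores(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} for a multi-class problem."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0  # how many alerts were right
        recall = tp / (tp + fn) if tp + fn else 0.0     # how many events were caught
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 values (assumed averaging scheme)."""
    scores = per_class_scores(y_true, y_pred, labels)
    return sum(f1 for _, _, f1 in scores.values()) / len(labels)


# Made-up hourly categories: what happened versus what a model predicted.
labels = ["Moderate", "Poor", "Severe"]
y_true = ["Poor", "Poor", "Severe", "Severe", "Moderate", "Moderate"]
y_pred = ["Poor", "Severe", "Severe", "Severe", "Moderate", "Poor"]

scores = per_class_scores(y_true, y_pred, labels)
print(scores["Severe"], round(macro_f1(y_true, y_pred, labels), 3))
```

In this toy example every truly "Severe" hour is caught (recall of 1.0), but one "Poor" hour is wrongly flagged as "Severe", which lowers precision for that category.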
What this means for cities and citizens
The study concludes that combining thoughtfully engineered inputs with a hybrid CNN–LSTM architecture can make short-term AQI forecasts both more accurate and more stable than either model on its own. Although the work is limited to one city and four months of data, it points toward practical tools that could inform school closures, outdoor work schedules, hospital preparedness, and personal choices such as when to exercise outside or wear a mask. With longer datasets and wider testing, similar systems could become a backbone of data-driven air quality monitoring, giving people earlier warnings about unhealthy air and helping decision-makers respond before pollution peaks.
Citation: Madan, T., Sagar, S., Singh, Y. et al. Hybrid deep learning model for air quality prediction and its impact on healthcare. Sci Rep 16, 6036 (2026). https://doi.org/10.1038/s41598-026-36564-5
Keywords: air quality index, deep learning, CNN-LSTM, health risk, pollution forecasting