Clear Sky Science
Synthesis of covalent organic frameworks for photocatalytic hydrogen peroxide production guided by large language models
Turning Sunlight, Water, and Air into a Useful Cleaner
Hydrogen peroxide is the fizzing liquid many people know from medicine cabinets and cleaning sprays. Industry makes it on a huge scale, but current methods are energy‑hungry and generate chemical waste. This study explores a greener route: using sunlight to turn just water and oxygen from the air into hydrogen peroxide, and it shows how an artificial‑intelligence system can help chemists design better light‑powered materials to do the job.

Why Cleaning Up Hydrogen Peroxide Matters
Hydrogen peroxide is prized because it breaks down into plain water and oxygen after use, yet it disinfects food, purifies water, and helps drive chemical manufacturing. Today it is made mainly by an older process based on anthraquinone, which demands high temperatures, high pressures, and careful handling of organic solvents. Attempts to copy nature and make hydrogen peroxide directly from water and oxygen under sunlight have been promising, but most lab‑made materials produce solutions that are far too dilute to be useful outside the lab. Reaching practical concentrations without wasting energy or adding extra chemicals has been a stubborn bottleneck.
Teaching Computers to Read the Chemistry Literature
The authors turned to large language models—the same kind of AI that powers advanced chatbots—to sift through recent research on a class of porous materials called covalent organic frameworks, or COFs. These frameworks are like crystalline sponges built from organic building blocks joined by specific linkages. Instead of manually reading hundreds of papers, the team fed 355 publications on COF‑based photocatalysts into an AI pipeline. The model automatically identified key fragments of text and converted more than 11,000 statements about building blocks, linkages, stability, and hydrogen peroxide output into a structured "knowledge graph." This map of chemical relationships could then be queried in plain language to find combinations that look both durable in water and active under light.
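To make the idea of a queryable knowledge graph concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the relation names (has_linkage, has_building_block, stable_in), the helper functions, and the handful of statements are hypothetical, chosen only to mirror the kinds of facts described in this summary. The real graph held more than 11,000 extracted statements and was queried in plain language rather than through a function call.

```python
# Hypothetical (subject, relation, object) statements of the kind an
# extraction pipeline might emit. Illustrative only, not the paper's data.
TRIPLES = [
    ("Thz-COF", "has_linkage", "thiazole"),
    ("Thz-COF", "has_building_block", "triazine"),
    ("Thz-COF", "has_building_block", "benzotrithiophene"),
    ("Imi-COF", "has_linkage", "imine"),
    ("Imi-COF", "has_building_block", "triazine"),
    ("Imi-COF", "has_building_block", "benzotrithiophene"),
    ("thiazole", "stable_in", "strong acid"),
    ("thiazole", "stable_in", "strong base"),
    ("thiazole", "stable_in", "concentrated hydrogen peroxide"),
]


def build_graph(triples):
    """Index statements as graph[subject][relation] -> set of objects."""
    graph = {}
    for subject, relation, obj in triples:
        graph.setdefault(subject, {}).setdefault(relation, set()).add(obj)
    return graph


def frameworks_stable_in(graph, condition):
    """Return frameworks whose linkage is recorded as stable in `condition`."""
    hits = []
    for node, relations in graph.items():
        for linkage in relations.get("has_linkage", ()):
            stable = graph.get(linkage, {}).get("stable_in", set())
            if condition in stable:
                hits.append(node)
    return sorted(hits)


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    print(frameworks_stable_in(graph, "strong acid"))
```

With these toy statements, asking for frameworks stable in strong acid returns only the thiazole-linked one, because the query follows two hops: framework to linkage, then linkage to stability. Chaining facts that may come from different papers is what a graph structure makes easy.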
Finding and Building a Better Light Sponge
Guided by this AI‑built knowledge base, the system highlighted two particular organic components, one based on a triazine ring and one on a sulfur‑rich benzotrithiophene ring, as especially promising when connected by a thiazole linkage. Chemists synthesized two COFs using the same building blocks but different linkages: one with the more common imine bond (Imi‑COF) and one with the thiazole bond (Thz‑COF). Detailed tests showed that both had well‑ordered, sponge‑like structures and similar pore sizes, but the thiazole‑linked version was markedly tougher. It withstood strong acid, strong base, and concentrated hydrogen peroxide, and remained stable at high temperatures, while the imine‑linked framework degraded under harsher conditions.
How the New Material Harvests Light and Moves Charges
Optical measurements and ultrafast spectroscopy revealed why Thz‑COF outperformed its cousin. The thiazole linkage extended the material’s light absorption deeper into the visible range and slightly narrowed its energy gap, allowing it to capture more of the solar spectrum. In Thz‑COF, electrons and holes created by light were better separated in space and lived longer before recombining, giving them more time to participate in chemical reactions at the material’s surface. Calculations showed that thiazole sites bind oxygen molecules just strongly enough to encourage a two‑electron reduction pathway that forms hydrogen peroxide, without binding the product too tightly. In contrast, the imine linkage held on to hydrogen peroxide more strongly, which encouraged its breakdown rather than its release.

From Laboratory Light to Real‑World Uses
When tested under visible light in pure water saturated with oxygen, Thz‑COF produced hydrogen peroxide at roughly twice the rate of the imine‑linked version and, crucially, kept accumulating product rather than plateauing. After 72 hours it reached about 0.28 percent by weight—more than five times higher than the comparison material and above the threshold needed for tasks like detoxifying certain food contaminants. In a two‑liquid setup designed to concentrate the product even further, the system achieved nearly 1.9 percent hydrogen peroxide, suitable for uses such as food sanitizing and tooth whitening. The generated solutions rapidly bleached dye pollutants and killed common bacteria almost completely, and the material retained its activity over multiple cycles with only modest structural changes.
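Photocatalysis papers often quote hydrogen peroxide output in millimoles per litre rather than percent by weight, so a quick conversion helps put the figures above in context. The sketch below is only a back-of-envelope aid: the function name is made up for illustration, and it assumes the solution has the density of water (1.0 g/mL), which is reasonable for dilute solutions but is not a value taken from the study. The molar mass of hydrogen peroxide, 34.01 g/mol, is the standard value.

```python
H2O2_MOLAR_MASS = 34.01  # g/mol, standard value for hydrogen peroxide


def wt_percent_to_mmol_per_l(wt_percent, density_g_per_ml=1.0):
    """Convert a weight-percent H2O2 solution to millimoles per litre.

    density_g_per_ml defaults to 1.0 (pure water); this is an assumption
    that holds approximately for dilute solutions.
    """
    grams_per_litre = wt_percent / 100.0 * density_g_per_ml * 1000.0
    return grams_per_litre / H2O2_MOLAR_MASS * 1000.0


if __name__ == "__main__":
    for label, wt in [("72-hour run", 0.28), ("two-liquid setup", 1.9)]:
        print(f"{label}: {wt} wt% is about {wt_percent_to_mmol_per_l(wt):.0f} mmol/L")
```

Under that density assumption, 0.28 percent by weight works out to roughly 82 mmol/L, and 1.9 percent to roughly 560 mmol/L.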
What This Means for Greener Chemistry
To a non‑specialist, the key message is that AI can now comb through vast amounts of chemical knowledge and point experimentalists toward smarter choices, rather than relying solely on trial and error or intuition. In this case, that guidance led to a robust, light‑harvesting framework that turns ordinary water and air into a versatile disinfectant at concentrations edging into practical territory, without added fuel molecules. The work suggests that pairing language models with clever data structures can accelerate the search for other sunlight‑driven materials, bringing cleaner routes to everyday chemicals closer to reality.
Citation: Shu, C., Wang, L., Yang, X. et al. Synthesis of covalent organic frameworks for photocatalytic hydrogen peroxide production guided by large language models. Nat Commun 17, 3046 (2026). https://doi.org/10.1038/s41467-026-69549-z
Keywords: hydrogen peroxide, photocatalysis, covalent organic frameworks, materials discovery, large language models