Clear Sky Science
An artificial intelligence-driven synthesis planning platform (PhotoCat) for photocatalysis
Shining Light on Smarter Chemistry
Chemists increasingly use light to drive chemical reactions, turning simple starting materials into medicines, materials, and fragrances with less waste and energy. Yet designing these light-powered, or photocatalytic, reactions often comes down to slow trial and error. This paper introduces PhotoCat, an artificial-intelligence platform that learns from tens of thousands of past light-driven reactions to help scientists predict what will happen, plan new syntheses, and choose practical lab conditions. For readers, it is a glimpse of how AI and green chemistry are coming together to speed up discovery while cutting environmental impact.

Building a Map of Light-Driven Reactions
The authors’ first step was to assemble a detailed map of known photocatalytic chemistry. They combed through the scientific literature and experimental records to create PhotoCatDB, a curated database of 26,700 light-driven reactions. Each entry captures not just which molecules went in and which came out, but also crucial experimental details: which photocatalyst was used, whether acids, bases or additives were present, the solvent, and the color (wavelength) of light. Many of these are multicomponent reactions, where several building blocks come together at once, reflecting the complexity chemists face in the lab. By checking similarity between products, the team ensured the database emphasizes diverse and novel reactions rather than many near-duplicates.
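As a rough illustration of what one such record and a near-duplicate check might look like, the sketch below uses hypothetical field names and a simple Tanimoto similarity over character trigrams of product SMILES strings. The actual PhotoCatDB schema, molecular fingerprint, and similarity threshold are not described here, so all three are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class PhotoReaction:
    """One light-driven reaction record (field names are hypothetical)."""
    reactants: List[str]                 # SMILES strings of the starting materials
    product: str                         # SMILES string of the main product
    photocatalyst: str = ""              # empty string stands for catalyst-free
    acid_base: str = ""
    additives: List[str] = field(default_factory=list)
    solvent: str = ""
    wavelength_nm: Optional[int] = None  # color of the light, if recorded


def trigrams(smiles: str) -> Set[str]:
    """Crude stand-in for a molecular fingerprint: character trigrams."""
    return {smiles[i:i + 3] for i in range(max(len(smiles) - 2, 1))}


def tanimoto(a: Set[str], b: Set[str]) -> float:
    """Tanimoto (Jaccard) similarity of two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def drop_near_duplicates(records, threshold=0.9):
    """Keep a record only if its product is not too similar to any kept one.

    The 0.9 threshold is an arbitrary illustrative value.
    """
    kept = []
    for rec in records:
        feats = trigrams(rec.product)
        if all(tanimoto(feats, trigrams(k.product)) < threshold for k in kept):
            kept.append(rec)
    return kept
```

A real pipeline would more likely compare proper molecular fingerprints from a cheminformatics toolkit; trigrams are used here only to keep the example free of external dependencies.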
Teaching an AI to Understand Photochemistry
On top of this database, the researchers built PhotoCat, a family of deep-learning models based on the Transformer architecture originally developed for language translation. One module, PhotoCat-RXN, learns to predict the products of a reaction from the starting materials and, when available, the reaction conditions. Another, PhotoCat-Retro, works in reverse: given a desired target molecule, it proposes plausible photocatalytic starting materials and steps. A third module, PhotoCat-Cond, recommends the actual lab setup—photocatalyst, solvent, additives, and light wavelength—likely to make a proposed reaction work. To give the models broad “chemical common sense,” the team first trained them on millions of general reactions from public patent data before fine-tuning on the specialized photocatalytic set.

Why Conditions Matter as Much as Ingredients
A key insight from this work is that explicitly telling the AI about reaction conditions dramatically improves its performance. When the model received only the starting molecules, its accuracy at predicting the main product was already respectable. But adding structured information about the photocatalyst, acid or base, additives, solvent, and light color pushed the accuracy of its top-ranked prediction above 82 percent and sped up training. The authors show a vivid example in which the presence or absence of a strong acid flips a reaction from making a ketone to forming an alkene. Attention maps from the model reveal that it “looks” most closely at the acid label precisely when predicting the part of the product structure controlled by that choice, mirroring how human chemists reason about conditions.
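One way to picture "structured information" about conditions is as labeled tokens placed in front of the reactant string before it reaches a sequence model. The tag names and layout below are illustrative assumptions, not PhotoCat's actual input format. The second helper shows how a top-k accuracy figure, like the one quoted above, is typically computed.

```python
def build_model_input(reactants, conditions=None):
    """Join reactant SMILES with '.' and prepend labeled condition tokens.

    The tag names (for example 'acid' or 'solvent') are hypothetical.
    """
    source = ".".join(reactants)
    if not conditions:
        return source
    # Sort keys so the same conditions always produce the same string.
    tags = " ".join(f"[{key}:{value}]" for key, value in sorted(conditions.items()))
    return f"{tags} {source}"


def top_k_accuracy(ranked_predictions, references, k=1):
    """Fraction of cases whose reference product appears in the top k guesses."""
    if not references:
        raise ValueError("no reference products given")
    hits = sum(ref in preds[:k] for preds, ref in zip(ranked_predictions, references))
    return hits / len(references)
```

With this layout, the same reactants paired with `{"acid": "none"}` or with a strong-acid label yield two different input strings, which is what would let a model associate the acid label with different products.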
From Screen to Bench: Discovering New Reactions
To test whether PhotoCat is more than a numerical exercise, the team used it to propose entirely new photocatalytic transformations and then carried them out in the lab. The workflow begins with PhotoCat-Retro suggesting a light-driven route to a target structure, followed by PhotoCat-Cond choosing conditions and PhotoCat-RXN checking that the predicted products are consistent. From 22 AI-suggested candidates, the chemists selected five that appeared novel and practical; four worked in the lab with good yields. These new reactions include a light-driven acylation resembling a cleaner version of the classic Friedel–Crafts process, a catalyst-free route to benzoxazoles, a metal-free method to install trifluoromethyl groups on unsaturated acids using air as the oxidant, and an efficient light-triggered oxo-amination of simple alkenes.
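The three-step workflow described above can be sketched as a small filter loop. The function below takes the three models as plain callables. Their names and signatures are assumptions made for illustration and do not reflect PhotoCat's real interface.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical signatures for the three modules:
Retro = Callable[[str], List[List[str]]]               # target -> candidate reactant sets
Cond = Callable[[List[str], str], Dict[str, str]]      # (reactants, target) -> conditions
Forward = Callable[[List[str], Dict[str, str]], str]   # (reactants, conditions) -> product


def plan_routes(
    target: str, retro: Retro, cond: Cond, forward: Forward
) -> List[Tuple[List[str], Dict[str, str]]]:
    """Return (reactants, conditions) pairs whose forward prediction matches the target."""
    consistent = []
    for reactants in retro(target):
        conditions = cond(reactants, target)
        # Round-trip check: does the forward model reproduce the target?
        if forward(reactants, conditions) == target:
            consistent.append((reactants, conditions))
    return consistent
```

Plain string equality is a simplification: the same molecule can be written as more than one SMILES string, so a real check would canonicalize both structures before comparing them. Candidates that survive this filter would still need a chemist's judgment on novelty and practicality, as in the selection of five reactions from 22 described above.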
What This Means for Future Green Chemistry
For non-specialists, the takeaway is that PhotoCat acts like an intelligent assistant that has read tens of thousands of photocatalysis papers and can suggest “what to try next” in the lab. By combining a purpose-built database with modern AI models, the platform reaches accuracies on par with the best general reaction-prediction tools while being tailored specifically to light-driven chemistry. More importantly, it turns abstract predictions into actionable recipes that chemists can test, shortening the path from idea to experiment. As the database grows and the models are linked with broader planning tools, systems like PhotoCat could help make photocatalysis a routine, greener choice in chemical manufacturing, quietly improving the sustainability of products we rely on every day.
Citation: Xu, J., Zhai, S., Huang, P. et al. An artificial intelligence-driven synthesis planning platform (PhotoCat) for photocatalysis. Commun Chem 9, 92 (2026). https://doi.org/10.1038/s42004-026-01894-y
Keywords: photocatalysis, artificial intelligence, reaction prediction, retrosynthesis, green chemistry