AI Framework Speeds Up Relay Catalysis Design With Transparent, Literature-Linked Pathways

A new AI-driven catalysis framework combines language models with a knowledge graph to streamline discovery, offering chemists transparent and traceable multi-step pathways that could accelerate sustainable chemical innovation.

Research: Synergizing a knowledge graph and large language model for relay catalysis pathway recommendation. Image Credit: ArtemisDiana / Shutterstock

Relay catalysis connects multiple catalytic reactions so that the product of one step becomes the reactant of the next. This approach can improve efficiency, increase selectivity, and reduce energy use. However, designing such multi-step pathways is often slow and challenging. Chemists must search through many scattered papers, compare reaction conditions, and check whether the individual steps will actually work together in practice.

A New AI-Assisted Framework

In a study published in the journal National Science Review, Jun Cheng and Ye Wang from Xiamen University, together with Jeff Z. Pan from the University of Edinburgh, developed a new AI-driven method to make this process faster and more reliable. Their method combines large language models (LLMs) with a custom-built catalysis knowledge graph called Cat-KG. The goal is to recommend useful relay catalysis pathways that are easily accessible, easy to understand, and clearly linked to the original research sources.

How It Works

The team built Cat-KG by using LLMs to extract key reaction data from over 15,000 published catalysis papers. The extracted data include reactants, products, catalysts, reaction conditions, and performance. All data are cleaned, organized, and stored in a graph database, where each reaction is linked to its source article. This lets chemists trace every recommended step back to the original literature.
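To make the idea of a literature-linked reaction graph concrete, here is a minimal sketch in Python. It is not the Cat-KG schema or the authors' code: the class names, field names, species labels ("A", "B", "C"), catalyst labels, temperatures, and DOIs are all hypothetical placeholders chosen for illustration. It only shows how each reaction record can carry a pointer back to its source article.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """One extracted reaction record. All field names are hypothetical."""
    reactant: str
    product: str
    catalyst: str
    temperature_c: float
    source_doi: str  # link back to the paper the record was extracted from


class ReactionGraph:
    """Species are nodes; each reaction is a directed edge reactant -> product."""

    def __init__(self):
        self._out_edges = defaultdict(list)

    def add(self, reaction):
        self._out_edges[reaction.reactant].append(reaction)

    def reactions_from(self, species):
        # .get avoids creating empty entries for unknown species
        return list(self._out_edges.get(species, []))


# Placeholder records, not real data from Cat-KG.
graph = ReactionGraph()
graph.add(Reaction("A", "B", "cat-1", 250.0, "10.0000/placeholder-a"))
graph.add(Reaction("B", "C", "cat-2", 280.0, "10.0000/placeholder-b"))

for r in graph.reactions_from("A"):
    print(f"{r.reactant} -> {r.product} over {r.catalyst} (source: {r.source_doi})")
```

Because the source identifier travels with every edge, any pathway assembled from these edges can list the papers behind each step.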

To find suitable pathways, the system uses graph-based searches combined with chemistry-informed filtering rules. These rules help identify reaction sequences that are not only possible but also practical, for example by ensuring that the temperature or gas used in one step does not conflict with the next. The filtered results are then summarized by the LLM in plain language and chemical equations, making them easy to understand and evaluate.
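The search-and-filter step described above can be illustrated with a small Python sketch. This is an assumption-laden toy, not the published method: the `Step` record, the `compatible` rule (same gas atmosphere and a temperature gap of at most 100 °C), the `max_steps` limit, and all the data are hypothetical. The study's actual filtering rules are not detailed in this article.

```python
from collections import namedtuple

# Hypothetical record: reactant, product, temperature (deg C), gas atmosphere.
Step = namedtuple("Step", "reactant product temperature_c atmosphere")

# Placeholder data for illustration only.
STEPS = [
    Step("A", "B", 250.0, "H2"),
    Step("B", "C", 280.0, "H2"),
    Step("B", "C", 600.0, "O2"),
    Step("A", "C", 300.0, "H2"),
]


def compatible(prev, nxt, max_temp_gap=100.0):
    """Toy rule: adjacent steps share an atmosphere and have close temperatures."""
    return (prev.atmosphere == nxt.atmosphere
            and abs(prev.temperature_c - nxt.temperature_c) <= max_temp_gap)


def find_pathways(steps, start, target, max_steps=3):
    """Depth-first enumeration of compatible step sequences from start to target."""
    results = []

    def extend(path, current, visited):
        if current == target and path:
            results.append(list(path))
            return
        if len(path) == max_steps:
            return
        for step in steps:
            # Skip edges that do not continue the path or that revisit a species.
            if step.reactant != current or step.product in visited:
                continue
            # Apply the compatibility rule between consecutive steps.
            if path and not compatible(path[-1], step):
                continue
            path.append(step)
            extend(path, step.product, visited | {step.product})
            path.pop()

    extend([], start, {start})
    return results


pathways = find_pathways(STEPS, "A", "C")
for p in pathways:
    print(" -> ".join([p[0].reactant] + [s.product for s in p]))
```

In this toy data set, the B-to-C step run under O2 at 600 °C is rejected after the H2 step at 250 °C, so only sequences with mutually consistent conditions survive the filter.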

Workflow diagram of the construction of the Cat-KG and application for relay catalysis. The initial phase involves the selection of a catalytic schema and the acquisition of data from catalysis-centric literature to establish the Cat-KG (blue), followed by the recommendation of a reaction pathway (purple). Shared steps are marked with the corresponding color.

Results and Impact

For several important target molecules—such as ethylene, ethanol, and 2,5-furandicarboxylic acid—the system successfully found relay pathways that match those already proven in the lab. It also suggested 20 new and unreported pathways for further experimental study. Most of these results can be generated within minutes.

Why It Matters

Unlike black-box AI models, this method is transparent, explainable, and traceable. Every recommended pathway comes with supporting data and literature links, helping chemists evaluate the suggestions before starting experiments. The system is flexible and can be updated with better LLMs or applied to new areas like photocatalysis and electrocatalysis. The team also plans to improve the model by learning from expert feedback in future versions.

The Cat-KG constructed in this study is publicly accessible and supports catalytic reaction queries at: https://ai4ec.ac.cn/apps/chembrain 

Looking Forward

Currently, this work mainly focuses on selecting each reaction step individually. Future research will place greater emphasis on more complex interactions between steps, such as coupling effects between catalysts, catalyst stability under practical reaction conditions, and economic and operational feasibility. The aim is to make the entire catalytic process run more smoothly under real-world conditions.

About the Research Team

Jun Cheng – State Key Laboratory of Physical Chemistry of Solid Surface, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China; Tan Kah Kee Innovation Laboratory (IKKEM), Laboratory of AI for Electrochemistry (AI4EC lab), Xiamen 361005, China; Institute of Artificial Intelligence, Xiamen University, Xiamen 361005, China.

For more information, please visit the team's website: https://www.cheng-group.net

Journal reference:
  • Fu, F., Li, Q., Wang, F., Hu, J., Wang, T., Liu, Y., Xu, W., Lin, Z., Gong, F., Fan, Q., Pan, J. Z., Wang, Y., & Cheng, J. (2025). Synergizing a knowledge graph and large language model for relay catalysis pathway recommendation. National Science Review, 12(8). DOI: 10.1093/nsr/nwaf271, https://academic.oup.com/nsr/article/12/8/nwaf271/8199925 
