
University of Science and Technology of China Uses Explainable Machine Learning to Solve Catalytic Structure Sensitivity Challenge

Sun, Mar 17 2024 11:13 AM EST

Professor Weixue Li of the University of Science and Technology of China recently addressed a longstanding challenge in heterogeneous catalysis by combining physics-inspired, explainable machine learning with first-principles calculations. The findings, titled "Structure Sensitivity of Metal Catalysts Revealed by Interpretable Machine Learning and First-principles Calculations," were published in the Journal of the American Chemical Society.

Active sites and their structure sensitivity are among the fundamental concepts of heterogeneous catalysis. Despite significant recent progress, identifying active sites and their structure sensitivity at the atomic scale remains a major obstacle to the rational design of catalytic materials, because the influencing factors span multiple spatial and temporal scales. For example, the Brønsted–Evans–Polanyi (BEP) relationship between activation energy and reaction heat, and the linear scaling relationships among the adsorption energies of different molecules, have long served as the basic frameworks for studying catalytic reaction mechanisms and optimizing catalysts. However, neither the BEP relationship nor the scaling relationships contain explicit information about a catalyst's geometric structure or chemical composition. In principle they therefore cannot describe structure sensitivity, which greatly limits their use in catalyst optimization.
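For illustration, the BEP relationship can be written as a straight-line fit of activation energy against reaction heat, E_a = alpha * dE + beta. The minimal Python sketch below fits alpha and beta by least squares. The function name `fit_bep` is hypothetical and the numbers are made-up placeholders, not data from the paper.

```python
import numpy as np

def fit_bep(reaction_heat, activation_energy):
    """Least-squares fit of E_a = alpha * dE + beta (the BEP form)."""
    dE = np.asarray(reaction_heat, dtype=float)
    Ea = np.asarray(activation_energy, dtype=float)
    # Design matrix [dE, 1], so the solution vector is (alpha, beta).
    A = np.column_stack([dE, np.ones_like(dE)])
    (alpha, beta), *_ = np.linalg.lstsq(A, Ea, rcond=None)
    return float(alpha), float(beta)

# Illustrative values only (eV), generated from alpha = 0.8 and beta = 1.5.
bep_dE = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
bep_Ea = 0.8 * bep_dE + 1.5
bep_alpha, bep_beta = fit_bep(bep_dE, bep_Ea)
```

Because the only input is the reaction heat, two surfaces with the same reaction heat receive the same predicted barrier whatever their facet or composition, which is the limitation described above.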

Machine learning is playing an increasingly important role in heterogeneous catalysis research and has been applied to the structure sensitivity of catalysts. However, most studies to date are end-to-end "black box" approaches whose results lack physical interpretability. There is still an urgent need for a physically interpretable analytical relationship that explicitly includes the catalyst's geometric structure and chemical composition and accurately predicts reaction barriers. In addition, because reaction barriers are computed mainly with accurate but expensive density functional theory, theoretical data for any given system are relatively scarce. It is therefore often necessary to draw on several data sources, and the resulting heterogeneity calls for machine learning algorithms that can handle it.

To address this problem, the authors used physics-inspired, interpretable multitask symbolic regression and a diverse set of first-principles datasets to establish a concise, physically transparent descriptor. The descriptor has two parts, a structural term for the catalyst and an energy term for the reaction, and accurately predicts the activation barriers of various molecules on metal catalysts of different compositions and structures. The newly established structural term consists of three variables: the topological coordination unsaturation, the number of valence electrons, and the lattice constant of the catalyst. It resolves the structure sensitivity problem for metal catalysts and highlights the importance of transparent, data-driven theoretical models ("white box" research) in building physical models of catalysis.

Figure 1: A comprehensive first-principles computational dataset encompassing various metals, crystal facets, phases, and reactions.

In building the machine learning model, the work adopted several key strategies. First, to ensure diversity, the authors constructed a large dataset covering dissociation energies for 21 different chemical bonds, 10 transition metal catalysts, 2 crystal phases, and 17 crystal facets (see Figure 1). Second, drawing on domain knowledge and chemical intuition, they inferred that surface energy should correlate strongly with activation energy, and that surface energy in turn correlates with surface dangling bonds or coordination unsaturation. By accounting for the contributions of the differently coordinated atoms exposed on the surface, the authors defined a new physical quantity, ΔCN (topological coordination unsaturation), which reflects purely the structural characteristics of the catalyst. Figure 2a shows that the activation energy correlates better with the topological coordination unsaturation than with either the surface energy or the reaction heat used in the traditional BEP relationship. The topological coordination unsaturation, the reaction heat, and other fundamental atomic parameters of the catalyst were then used as physical features for the machine learning study. Third, to ensure the interpretability, accuracy, and generality of the data-driven model, and given the possible inconsistencies in multi-source data and the difficulty of explicitly describing differences between molecules, the work adopted multitask symbolic regression for modeling.

The machine learning ultimately yields the following optimal model for the molecular activation energy:

[Equation image]

In this equation, Ne is the valence electron count of the metal catalyst, a is the corresponding lattice constant, ΔE is the reaction heat, and c1, c2, and c0 are the respective coefficients. In this two-dimensional model, the first term accounts for the structural contribution: it is proportional to the square of the valence electron count and lattice constant and inversely proportional to the topological coordination unsaturation. The second term corresponds to the reaction energy term in the classical Brønsted–Evans–Polanyi (BEP) relationship. The model accurately predicts bond dissociation barriers and generalizes well across datasets of chemical bonds with different symmetries, bond orders, and steric hindrance (see Figures 2b and 2c).

Figure 2: (a) Statistical correlation between the CO activation energy barrier and various physical parameters (reaction heat, surface energy, topological coordination unsaturation); (b) Training and testing accuracy of the machine learning models; (c) Transferability of the model to new systems.
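As a rough sketch of how a two-term model of this kind can be fitted across several molecules, the Python example below solves one small least-squares problem per task (molecule), each with its own c1, c2, and c0. It assumes the structural term has already been evaluated for every surface and is passed in as `structure_term`. The function name, the per-molecule coefficient scheme, and all numbers are illustrative assumptions rather than details from the paper. The code fits coefficients only and does not perform the symbolic regression search that discovers the descriptor.

```python
import numpy as np

def fit_two_term_model(structure_term, reaction_heat, activation_energy, task_ids):
    """Fit E_a = c1 * S + c2 * dE + c0 separately for each task.

    S is a precomputed structural term and dE the reaction heat.
    Returns a dict mapping task label -> (c1, c2, c0).
    """
    S = np.asarray(structure_term, dtype=float)
    dE = np.asarray(reaction_heat, dtype=float)
    Ea = np.asarray(activation_energy, dtype=float)
    tasks = np.asarray(task_ids)
    coeffs = {}
    for task in np.unique(tasks):
        mask = tasks == task
        # Columns: structural term, energy term, constant offset.
        A = np.column_stack([S[mask], dE[mask], np.ones(int(mask.sum()))])
        c, *_ = np.linalg.lstsq(A, Ea[mask], rcond=None)
        coeffs[str(task)] = tuple(float(v) for v in c)
    return coeffs

# Synthetic demo: invented coefficients for two tasks labelled "CO" and "OH".
demo_S = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0])
demo_dE = np.array([0.5, -0.2, 0.1, 0.9, -1.0, 0.3, 0.0, 0.7])
demo_tasks = np.array(["CO"] * 4 + ["OH"] * 4)
demo_true = {"CO": (0.6, 0.3, 1.0), "OH": (0.05, 0.9, 0.4)}
demo_Ea = np.array([
    demo_true[t][0] * s + demo_true[t][1] * e + demo_true[t][2]
    for s, e, t in zip(demo_S, demo_dE, demo_tasks)
])
demo_coeffs = fit_two_term_model(demo_S, demo_dE, demo_Ea, demo_tasks)
```

On this noise-free synthetic data the fit recovers the invented coefficients for each task. With real multi-source data the residuals would indicate how well a single shared descriptor serves all molecules.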

The equation above explicitly incorporates information about catalyst composition, structure, and reaction heat. The dependence of the activation energy on the structural term (Figure 3a) and the energy term (Figure 3b), as modulated by catalyst composition, can therefore be used to separate the geometric and electronic effects of the catalyst. In addition, the magnitude of the projection of the activation energy onto each of the two terms can be used to classify structure sensitivity. As shown in Figure 3c, a larger projection and coefficient on the structural term imply that the activation of the molecule (such as CO, NO, N2) is structure-sensitive, whereas a larger projection on the energy term (e.g., OH, NH) indicates that the reaction is structure-insensitive. This conclusion applies primarily to small molecules. For larger molecules, significant steric effects make the projections harder to interpret as a measure of structure sensitivity, although the formula still predicts their barriers well.

Figure 3: Statistical analysis of geometric, electronic, structural, and energetic effects on activation energy barriers.
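One simple way to picture this classification is to compare how much each term varies across surfaces for a given molecule. The Python sketch below uses the standard deviation of each term's contribution as a stand-in for the projection analysis. This heuristic, the function names, and the numbers are assumptions made for illustration and are not the procedure used in the paper.

```python
import numpy as np

def term_spread(c1, c2, structure_term, reaction_heat):
    """Standard deviation of each term's contribution to E_a across surfaces."""
    s_part = c1 * np.asarray(structure_term, dtype=float)
    e_part = c2 * np.asarray(reaction_heat, dtype=float)
    return float(np.std(s_part)), float(np.std(e_part))

def label_sensitivity(c1, c2, structure_term, reaction_heat):
    """Label a molecule by whichever term varies more (heuristic only)."""
    s_spread, e_spread = term_spread(c1, c2, structure_term, reaction_heat)
    return "structure-sensitive" if s_spread > e_spread else "structure-insensitive"

# Invented inputs: four surfaces, two hypothetical coefficient sets.
spread_S = [1.0, 2.0, 3.0, 4.0]
spread_dE = [0.5, -0.2, 0.1, 0.9]
label_a = label_sensitivity(0.6, 0.3, spread_S, spread_dE)   # large structural coefficient
label_b = label_sensitivity(0.05, 0.9, spread_S, spread_dE)  # large energy coefficient
```

With these invented inputs the first coefficient set is dominated by the structural term and the second by the energy term, mirroring the qualitative distinction drawn in Figure 3c.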

The paper was authored by doctoral candidate Wu Shu of the University of Science and Technology of China, with Professor Li Weixue as the corresponding author and Dr. Ouyang Runhai of Shanghai University as the co-corresponding author.

Supported by funding from the National Natural Science Foundation of China, including Innovative Research Groups and General Programs, as well as projects from the Chinese Academy of Sciences and the Ministry of Science and Technology, this study also benefitted from computational resources provided by the Supercomputing Center of the University of Science and Technology of China.
