Exploring the frontiers of condensed-phase chemistry with a general reactive machine learning potential
Richard Messerly, Shuhao Zhang, Małgorzata Makoś, Ryan Jadrich, Elfi Kraka, Kipton Barros, Benjamin Nebgen, Sergei Tretiak, Olexandr Isayev, Nicholas Lubbers, Justin Smith
Abstract
Atomistic simulation of reactive chemistry has a broad range of applications, from drug design to energy to materials discovery. Machine learning interatomic potentials (MLIPs) have become an efficient alternative to computationally expensive quantum chemistry simulations. In practice, however, developing a reactive MLIP requires prior knowledge of the reaction network to generate fitting data, and the potential must be refit to extensive datasets for each new application. For this reason, many fields of chemistry would greatly benefit from a general reactive MLIP, i.e., one that covers a broad range of reactive chemistry and can be applied to new systems without retraining. In this work, we develop a general reactive MLIP through unbiased active learning with an atomic configuration sampler inspired by nanoreactor molecular dynamics. The resulting potential (ANI-1xnr) is then applied to five distinct condensed-phase reactive systems: carbon solid-phase nucleation, graphene ring formation from acetylene, biofuel additives, combustion of methane, and the spontaneous formation of glycine from early-Earth small molecules. In all five studies, ANI-1xnr closely matches experiment and/or previous studies using traditional model chemistry methods. As such, ANI-1xnr proves to be a highly general reactive MLIP for the elements C, H, N, and O that does not need to be refit for each application, enabling high-throughput in silico reactive chemistry experimentation.
Cite This Paper
@article{Messerly2023,
author = {Messerly, Richard and Zhang, Shuhao and Makoś, Małgorzata and Jadrich, Ryan and Kraka, Elfi and Barros, Kipton and Nebgen, Benjamin and Tretiak, Sergei and Isayev, Olexandr and Lubbers, Nicholas and Smith, Justin},
title = {Exploring the frontiers of condensed-phase chemistry with a general reactive machine learning potential},
year = {2023},
doi = {10.21203/rs.3.rs-2383420/v1},
url = {http://dx.doi.org/10.21203/rs.3.rs-2383420/v1},
publisher = {Springer Science and Business Media LLC},
keywords = {machine learning, molecular dynamics, active learning, quantum chemistry},
researchAreas = {ml-potentials, experiment-automation, reactions-reactivity},
researchArea = {ml-potentials},
highlight = {Reactive chemistry atomistic simulation has a broad range of applications from drug design to energy to materials discovery.}
}
Related Publications
Accurate Ring Strain Energy Predictions with Machine Learning and Application in Strain-Promoted Reactions
(2024)
Ring strain energy (RSE) is crucial for understanding molecular reactivity.
Transferable Machine Learning Interatomic Potential for Pd-Catalyzed Cross-Coupling Reactions
(2025)
Finding efficient substrate-catalyst combinations for palladium-catalyzed cross-coupling reactions remains a critical challenge in synthetic chemistry, with broad implications for pharmaceutical and materials manufacturing.
The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules
Scientific Data, 7 (2020)
Maximum diversification of data is a central theme in building generalized and accurate machine learning (ML) models.
Exploring the frontiers of condensed-phase chemistry with a general reactive machine learning potential
Nature Chemistry, 16, 727–734 (2024)
Atomistic simulation has a broad range of applications from drug design to materials discovery.