Isayev Lab Publications

Research publications in machine learning, computational chemistry, and drug discovery from the Isayev Lab at Carnegie Mellon University.
https://olexandrisayev.com/ (en-us)
olexandr@cmu.edu (Olexandr Isayev)
Copyright 2025 Olexandr Isayev
Science, Chemistry, Machine Learning, Publications

AIMNet2-rxn: A Machine Learned Potential for Generalized Reaction Modeling on a Millions-of-Pathways Scale
https://doi.org/10.26434/chemrxiv-2025-hpdmg
Mechanistic modeling of chemical transformations offers a compelling basis for understanding reactivity and allows for prediction of reaction outcomes before attempting experiments. Despite progress in machine learned interatomic potentials (MLIPs), we demonstrate that available models lack the accuracy for diverse reaction modeling. With this motivation, we developed a general MLIP for mechanistic modeling of organics, AIMNet2-rxn, using a dataset of ~4.7 × 10⁶ range-separated DFT calculations. AIMNet2-rxn enables reaction modeling ~10⁶ times faster than the reference quantum mechanical (QM) methods while significantly outperforming graph-based ML, reaffirming the value of using 3D chemical information for training. On a test suite of well-known reaction mechanisms (such as amide formation, proton transfers, and pericyclics), AIMNet2-rxn yields 1-2 kcal mol⁻¹ accuracy across reaction coordinates without retraining or system-specific fine-tuning. To exploit GPU parallelism and AIMNet2-rxn efficiency, we introduce a batched nudged elastic band (BNEB) method that readily achieves minimum energy pathway search on a millions-of-reactions scale. To demonstrate complex reaction characterization, the thermodynamics of an 11-step pathway producing hydroxymethylfurfural, the experimentally observed major product of glucose pyrolysis, is evaluated.
Overall, the accuracy and efficiency afforded by AIMNet2-rxn create opportunities in high-throughput reaction discovery and deep reaction network analysis that would be infeasible with QM methods.
Wed, 01 Jan 2025 00:00:00 GMT
reaction mechanism, high-throughput
Anstine, Dylan M. et al.

Transferable Machine Learning Interatomic Potential for Pd-Catalyzed Cross-Coupling Reactions
https://doi.org/10.26434/chemrxiv-2025-n36r6
Finding efficient substrate-catalyst combinations for palladium-catalyzed cross-coupling reactions remains a critical challenge in synthetic chemistry, with broad implications for pharmaceutical and materials manufacturing. We report AIMNet2-Pd, a machine learned interatomic potential that enables rapid, accurate computational studies of palladium-catalyzed cross-coupling reactions. AIMNet2-Pd replaces computationally expensive electronic structure calculations with a neural network-based model that performs geometry optimization, transition state searches, and energy calculations in seconds while maintaining accuracy within 1-2 kcal mol⁻¹ and ~0.1 Å compared to the reference QM calculations. AIMNet2-Pd makes computational high-throughput catalyst screening and mechanistic studies of realistic systems feasible by providing on-demand thermodynamic and kinetic predictions for each step of a catalytic cycle. Importantly, the applicability of the model extends beyond the monophosphine ligands in Pd(0)/Pd(II) cycles on which it was trained to chemically diverse Pd complexes.
This demonstrates AIMNet2-Pd's utility as a general-purpose, high-throughput tool for studying catalytic reactions.
Wed, 01 Jan 2025 00:00:00 GMT
neural network, machine learning, transition state, high-throughput
Anstine, Dylan et al.

AIMNet2: a neural network potential to meet your neutral, charged, organic, and elemental-organic needs
https://doi.org/10.1039/d4sc08572h
Machine learned interatomic potentials (MLIPs) are reshaping computational chemistry practices because of their ability to drastically exceed the accuracy-length/time scale tradeoff.
Wed, 01 Jan 2025 00:00:00 GMT
neural network
Anstine, Dylan M. et al.

All That Glitters Is Not Gold: Importance of Rigorous Evaluation of Proteochemometric Models
https://doi.org/10.1021/acs.jcim.5c00395
Avdiunina, Polina and Jamal, Shamieraah and Gusev, Filipp and Isayev, Olexandr. Journal of Chemical Information and Modeling (2025)
Wed, 01 Jan 2025 00:00:00 GMT
Avdiunina, Polina et al.

Democratizing Reaction Kinetics through Machine Vision and Learning
https://doi.org/10.26434/chemrxiv-2025-4tk40
We present PRISM (Parallelized Reaction-rates via Indicator Spectrometry using Machine-vision), an innovative methodology for measuring amide coupling reaction rates by monitoring pH changes via indicator dyes, achieving precision comparable to traditional NMR techniques. The experimental design, enabled by a serial dilution, allowed for measuring twelve rate constants concurrently, spanning more than four orders of magnitude using 96-well plates, with 1,162 total rate constants collected. Moreover, the instrumentation is 3D-printed, with the remaining components comprising readily available and cost-effective hardware, promoting the democratized use of this technique to generate uniform data sets. Validation with ¹⁹F-NMR confirmed PRISM’s reliability.
Computational investigations reveal a concerted asynchronous SN2 mechanism, with base-catalyzed pathways exhibiting the lowest energy barriers. To complement the PRISM rate dataset, we developed a classification model that achieves high accuracy for out-of-distribution reactants in determining rate measurability, and a chemically rich graph neural network regression model for predicting quantitative reaction rates. This approach provides a framework that offers a resource-efficient strategy for studying reaction kinetics, which can be applied to other reaction classes.
Wed, 01 Jan 2025 00:00:00 GMT
neural network, graph neural
Baumer, Mitchell et al.

Anticipating the Selectivity of Intramolecular Cyclization Reaction Pathways with Neural Network Potentials
https://doi.org/10.1021/acs.jctc.5c01161
Casetti, Nicholas and Anstine, Dylan and Isayev, Olexandr and Coley, Connor W. Journal of Chemical Theory and Computation (2025)
Wed, 01 Jan 2025 00:00:00 GMT
neural network
Casetti, Nicholas et al.

AIQM3: Targeting Coupled-Cluster Accuracy with Semi-Empirical Speed Across Seven Main Group Elements
https://doi.org/10.26434/chemrxiv-2025-g2dbg
The AIQM series methods are successful neural network-based models that target coupled-cluster accuracy while maintaining high robustness and transferability across various tasks by leveraging Δ-learning. However, the previous AIQM1 and AIQM2 models are limited to molecular systems with four elements: H, C, N, and O, which falls short of meeting the common needs for atomistic simulations. Here, we introduce the extension, AIQM3, that covers three additional chemical elements (S, F, and Cl) and approaches the coupled-cluster level at the speed of a semi-empirical method.
AIQM3 maintains the accuracy of its predecessor AIQM2, surpasses the commonly used density functional theory (DFT) method in different types of molecular interactions, and its efficiency is competitive with that of machine learning interatomic potentials on commodity CPU hardware. AIQM3's superiority is showcased for reaction simulations and tasks related to drug design, where it delivers accurate torsion profiles for various real-world drug-like molecules. In addition, AIQM3 can be used for infrared (IR) spectra calculations at a low cost. We provide a web service for the AIQM3 calculations on the Aitomistic Hub at aitomistic.xyz, to democratize and facilitate its use with the assistance of AI agents.
Wed, 01 Jan 2025 00:00:00 GMT
neural network, machine learning, density functional
Chen, Yuxinxin et al.

Proto-Yield: An Uncertainty-Aware Prototype Network for Yield Prediction in Real-world Chemical Reactions
https://doi.org/10.1145/3746252.3761323
Guo, Kehan and Liu, Zhen and Guo, Zhichun and Nan, Bozhao and Isayev, Olexandr and Chawla, Nitesh and Wiest, Olaf and Zhang, Xiangliang. Proceedings of the 34th ACM International Conference on Information and Knowledge Management (2025)
Wed, 01 Jan 2025 00:00:00 GMT
Guo, Kehan et al.

Machine learning anomaly detection of automated HPLC experiments in the cloud laboratory
https://doi.org/10.1039/d5dd00253b
Autonomous experiments are vulnerable to unforeseen adverse events. We developed a transferable ML framework that flags affected HPLC runs in real time and provides expert-level quality control without human oversight.
Wed, 01 Jan 2025 00:00:00 GMT
machine learning
Gusev, Filipp et al.

Machine learning interatomic potentials at the centennial crossroads of quantum mechanics
https://doi.org/10.1038/s43588-025-00930-6
Kalita, Bhupalee and Gokcan, Hatice and Isayev, Olexandr.
Nature Computational Science (2025)
Wed, 01 Jan 2025 00:00:00 GMT
machine learning
Kalita, Bhupalee et al.

AIMNet2‐NSE: A Transferable Reactive Neural Network Potential for Open‐Shell Chemistry
https://doi.org/10.1002/anie.202516763
Open‐shell systems such as radical intermediates are central to radical polymerization (RP), combustion, catalysis, and many other chemical and industrial processes, yet their accurate modeling presents significant computational challenges. Most of the current machine learning interatomic potentials do not distinguish between different spin states, making them unsuitable for open‐shell reactive chemistry. Here we present AIMNet2‐NSE (neural spin‐charge equilibration), a neural network potential that incorporates spin‐charge equilibration for accurate treatment of molecules and reactions with arbitrary charge and spin multiplicities. Built upon the AIMNet2 framework, AIMNet2‐NSE is trained on an extensive dataset comprising 20 million closed‐shell neutral and charged molecules, 13 million open‐shell radical configurations, and 200K radical reaction profiles. With explicit handling of spin charges, AIMNet2‐NSE enables prediction of spin‐resolved properties with near‐DFT accuracy while maintaining a favorable linear scaling compared to the polynomial scaling of electronic structure methods. The predictive capabilities and generalizability of our model are confirmed by evaluations on large‐scale radical test sets, the industrially relevant BASChem19 benchmark, and RP reactions.
Overall, AIMNet2‐NSE represents a significant advancement in machine learning interatomic potentials, allowing efficient exploration of complex open‐shell systems, and significantly advancing our ability to model radical reaction pathways and reactive intermediates in chemical processes where traditional quantum mechanical methods are computationally prohibitive.
Wed, 01 Jan 2025 00:00:00 GMT
neural network, machine learning
Kalita, Bhupalee et al.

AIMNet2‐NSE: A Transferable Reactive Neural Network Potential for Open‐Shell Chemistry
https://doi.org/10.1002/ange.202516763
Open‐shell systems such as radical intermediates are central to radical polymerization (RP), combustion, catalysis, and many other chemical and industrial processes, yet their accurate modeling presents significant computational challenges. Most of the current machine learning interatomic potentials do not distinguish between different spin states, making them unsuitable for open‐shell reactive chemistry. Here we present AIMNet2‐NSE (neural spin‐charge equilibration), a neural network potential that incorporates spin‐charge equilibration for accurate treatment of molecules and reactions with arbitrary charge and spin multiplicities. Built upon the AIMNet2 framework, AIMNet2‐NSE is trained on an extensive dataset comprising 20 million closed‐shell neutral and charged molecules, 13 million open‐shell radical configurations, and 200K radical reaction profiles. With explicit handling of spin charges, AIMNet2‐NSE enables prediction of spin‐resolved properties with near‐DFT accuracy while maintaining a favorable linear scaling compared to the polynomial scaling of electronic structure methods. The predictive capabilities and generalizability of our model are confirmed by evaluations on large‐scale radical test sets, the industrially relevant BASChem19 benchmark, and RP reactions.
Overall, AIMNet2‐NSE represents a significant advancement in machine learning interatomic potentials, allowing efficient exploration of complex open‐shell systems, and significantly advancing our ability to model radical reaction pathways and reactive intermediates in chemical processes where traditional quantum mechanical methods are computationally prohibitive.
Wed, 01 Jan 2025 00:00:00 GMT
neural network, machine learning
Kalita, Bhupalee et al.

Fast and Accurate Ring Strain Energy Predictions with Machine Learning and Application in Strain-Promoted Reactions
https://doi.org/10.1021/jacsau.5c00667
Liu, Zhen and Vinskus, Jessica and Fu, Yue and Liu, Peng and Noonan, Kevin J. T. and Isayev, Olexandr. JACS Au (2025)
Wed, 01 Jan 2025 00:00:00 GMT
machine learning
Liu, Zhen et al.

Efficient Molecular Crystal Structure Prediction and Stability Assessment with AIMNet2 Neural Network Potentials
https://doi.org/10.1021/acs.cgd.5c01001
Nayal, Kamal Singh and O’Connor, Dana and Zubatyuk, Roman and Anstine, Dylan M. and Yang, Yi and Tom, Rithwik and Deng, Wenda and Tang, Kehan and Marom, Noa and Isayev, Olexandr. Crystal Growth & Design (2025)
Wed, 01 Jan 2025 00:00:00 GMT
neural network
Nayal, Kamal Singh et al.

Scalable Low-Energy Molecular Conformer Generation with Quantum Mechanical Accuracy
https://doi.org/10.26434/chemrxiv-2025-k4h7v
Molecular geometry is crucial for biological activity and chemical reactivity; however, computational methods for generating 3D structures are limited by the vast scale of conformational space and the complexities of stereochemistry. Here we present an approach that combines an expansive dataset of molecular conformers with generative diffusion models to address this problem.
We introduce ChEMBL3D, which contains over 250 million molecular geometries for 1.8 million drug-like compounds, optimized using AIMNet2 neural network potentials to a near-quantum mechanical accuracy with implicit solvent effects included. This dataset captures complex organic molecules in various protonation states and stereochemical configurations. We then developed LoQI, a stereochemistry-aware diffusion model that learns molecular geometry distributions directly from this data. Through graph augmentation, LoQI accurately generates molecular structures with targeted stereochemistry, representing a significant advance in modeling capabilities over previous generative methods. The model outperforms traditional approaches, achieving up to tenfold improvements in energy accuracy and effective recovery of optimal conformations. Benchmark tests on complex systems, including macrocycles and flexible molecules, as well as validation with crystal structures, show LoQI can perform low energy conformer search efficiently. The model code and dataset are available at https://github.com/isayevlab/LoQI.
Wed, 01 Jan 2025 00:00:00 GMT
neural network
Nikitin, Filipp et al.

GEOM-drugs revisited: toward more chemically accurate benchmarks for 3D molecule generation
https://doi.org/10.1039/d5dd00206k
Revisiting GEOM drugs: corrected metrics and a novel energy-based structural benchmark enable rigorous evaluation of 3D molecule generative models.
Wed, 01 Jan 2025 00:00:00 GMT
generative model
Nikitin, Filipp et al.

Design of Tough 3D Printable Elastomers with Human‐in‐the‐Loop Reinforcement Learning
https://doi.org/10.1002/ange.202513147
The development of high‐performance elastomers for additive manufacturing requires overcoming complex property trade‐offs that challenge conventional material discovery pipelines.
Here, a human‐in‐the‐loop reinforcement learning (RL) approach is used to discover polyurethane elastomers that overcome pervasive stress–strain property tradeoffs. Starting with a diverse training set of 92 formulations, a coupled multi‐component reward system was identified that guides RL agents toward materials with both high strength and extensibility. Through three rounds of iterative optimization combining RL predictions with human chemical intuition, we identified elastomers with more than double the average toughness compared to the initial training set. The final exploitation round, aided by solubility prescreening, predicted twelve materials exhibiting both high strength (>10 MPa) and high strain at break (>200%). Analysis of the high‐performing materials revealed structure‐property insights, including the benefits of high molar mass urethane oligomers, a high density of urethane functional groups, and incorporation of rigid low molecular weight diols and unsymmetric diisocyanates. These findings demonstrate that machine‐guided, human‐augmented design is a powerful strategy for accelerating polymer discovery in applications where data is scarce and expensive to acquire, with broad applicability to multi‐objective materials optimization.
Wed, 01 Jan 2025 00:00:00 GMT
Rapp, Johann L. et al.

Design of Tough 3D Printable Elastomers with Human‐in‐the‐Loop Reinforcement Learning
https://doi.org/10.1002/anie.202513147
The development of high‐performance elastomers for additive manufacturing requires overcoming complex property trade‐offs that challenge conventional material discovery pipelines. Here, a human‐in‐the‐loop reinforcement learning (RL) approach is used to discover polyurethane elastomers that overcome pervasive stress–strain property tradeoffs.
Starting with a diverse training set of 92 formulations, a coupled multi‐component reward system was identified that guides RL agents toward materials with both high strength and extensibility. Through three rounds of iterative optimization combining RL predictions with human chemical intuition, we identified elastomers with more than double the average toughness compared to the initial training set. The final exploitation round, aided by solubility prescreening, predicted twelve materials exhibiting both high strength (>10 MPa) and high strain at break (>200%). Analysis of the high‐performing materials revealed structure‐property insights, including the benefits of high molar mass urethane oligomers, a high density of urethane functional groups, and incorporation of rigid low molecular weight diols and unsymmetric diisocyanates. These findings demonstrate that machine‐guided, human‐augmented design is a powerful strategy for accelerating polymer discovery in applications where data is scarce and expensive to acquire, with broad applicability to multi‐objective materials optimization.
Wed, 01 Jan 2025 00:00:00 GMT
Rapp, Johann L. et al.

Machine Learning-Accelerated Screening of Hydroquinone Analogs for Proton-Coupled Electron Transfer
https://doi.org/10.26434/chemrxiv-2025-9k7p7
Proton-coupled electron transfer (PCET) mediated by hydroquinone and related molecules is key to natural and artificial energy conversion. The reactivity of these molecules depends on their bond dissociation free energy (BDFE), but studying the relationship between structure and thermochemistry across chemical space has been limited by computational expense. Here, we present the first use of the AIMNet2 neural network potential to calculate average BDFE (BDFEavg) values for the 2H⁺/2e⁻ dehydrogenation of about 200,000 hydroquinone-like compounds, including vicinal diamines, diols, and dithiols.
Benchmarking against DFT calculations for 168 substituted ortho-phenylenediamines (opda) shows good agreement (R² > 0.9). Our analysis finds that BDFEavg ranges from 50 to 80 kcal/mol and can be systematically tuned by modifying the backbone and N-substitution: electron-withdrawing groups raise BDFEavg by up to 15 kcal/mol, while lower aromaticity in furan and thiophene backbones decreases BDFEavg by approximately 10 kcal/mol compared to phenyl systems. We developed an additive "offset model" that allows separate investigation of backbone and sidechain effects. Validation through cyclic voltammetry and reactivity studies with quinone oxidants for selected compounds supports the computational results. This extensive thermochemical database and web-based prediction tool offer valuable resources for designing PCET reagents for catalysis, energy storage, and biomedical uses.
Wed, 01 Jan 2025 00:00:00 GMT
neural network, machine learning
Sarma, Rajdeep et al.

ANI-1xBB: An ANI-Based Reactive Potential for Small Organic Molecules
https://doi.org/10.1021/acs.jctc.5c00347
Zhang, Shuhao and Zubatyuk, Roman and Yang, Yinuo and Roitberg, Adrian and Isayev, Olexandr. Journal of Chemical Theory and Computation (2025)
Wed, 01 Jan 2025 00:00:00 GMT
Zhang, Shuhao et al.

Including Physics-Informed Atomization Constraints in Neural Networks for Reactive Chemistry
https://doi.org/10.1021/acs.jcim.5c00341
Zhang, Shuhao and Chigaev, Michael and Isayev, Olexandr and Messerly, Richard A. and Lubbers, Nicholas.
Journal of Chemical Information and Modeling (2025)
Wed, 01 Jan 2025 00:00:00 GMT
neural network
Zhang, Shuhao et al.

Discovery of Novel Celecoxib Polymorphs Using AIMNet2 Machine Learning Interatomic Potential
https://doi.org/10.26434/chemrxiv-2025-nhmr1
Polymorphism plays a pivotal role in defining the solid-state properties of pharmaceutical compounds, yet the discovery and accurate energy ranking of polymorphs remain a challenge. Here, we leverage a fine-tuned machine-learned interatomic potential AIMNet2 to explore the polymorphic landscape of celecoxib, a clinically important COX-2 inhibitor. Our approach combines GPU-accelerated crystal structure generation, active learning-guided model refinement, and quasi-harmonic free-energy corrections. The workflow successfully reproduces the experimental energy hierarchy of known polymorphs and identifies several novel low-energy structures with distinct packing motifs. In addition, we evaluate the elastic properties and thermal expansion effects across polymorphs, revealing structural features that underpin mechanical flexibility and thermodynamic preferences. This study demonstrates the power of AIMNet2-based crystal structure prediction for resolving complex pharmaceutical polymorphism and offers a powerful tool for future polymorph discovery and solid-state optimization.
Wed, 01 Jan 2025 00:00:00 GMT
machine learning, active learning
Zheng, Peikun et al.

High-throughput electronic property prediction of cyclic molecules with 3D-enhanced machine learning
https://doi.org/10.1039/d5sc04079e
Ring Vault contains 201 546 cyclic molecules across 11 elements.
AIMNet2 with 3D information outperformed 2D models in predicting the electronic properties of cyclic molecules.
Wed, 01 Jan 2025 00:00:00 GMT
machine learning, high-throughput
Zheng, Peikun et al.

Uncertainty-Aware Yield Prediction with Multimodal Molecular Features
https://doi.org/10.1609/aaai.v38i8.28668
Chen, Jiayuan and Guo, Kehan and Liu, Zhen and Isayev, Olexandr and Zhang, Xiangliang. AAAI Conference on Artificial Intelligence (2024)
Mon, 01 Jan 2024 00:00:00 GMT
uncertainty quantification, yield prediction, molecular fingerprints, SMILES sequences, molecular graphs
Chen, Jiayuan et al.

MLatom 3: A Platform for Machine Learning-Enhanced Computational Chemistry Simulations and Workflows
https://doi.org/10.1021/acs.jctc.3c01203
Dral, Pavlo O. and Ge, Fuchun and Hou, Yi-Fan and Zheng, Peikun and Chen, Yuxinxin and Barbatti, Mario and Isayev, Olexandr and Wang, Cheng and Xue, Bao-Xin and Pinheiro Jr, Max and Su, Yuming and Dai, Yiheng and Chen, Yangtao and Zhang, Lina and Zhang, Shuang and Ullah, Arif and Zhang, Quanhao and Ou, Yanchi. J. Chem. Theory Comput. (2024)
Mon, 01 Jan 2024 00:00:00 GMT
machine learning, computational chemistry
Dral, Pavlo O. et al.

In silico screening of LRRK2 WDR domain inhibitors using deep docking and free energy simulations
https://doi.org/10.1039/d3sc06880c
In this work, we combined Deep Docking and free energy MD simulations for the in silico screening and experimental validation of potential inhibitors of leucine rich repeat kinase 2 (LRRK2) targeting the WD40 repeat (WDR) domain.
Mon, 01 Jan 2024 00:00:00 GMT
Gutkin, Evgeny et al.

ANI/EFP: Modeling Long-Range Interactions in ANI Neural Network with Effective Fragment Potentials
https://doi.org/10.1021/acs.jctc.4c01052
Haghiri, Shahed and Viquez Rojas, Claudia and Bhat, Sriram and Isayev, Olexandr and Slipchenko, Lyudmila.
Journal of Chemical Theory and Computation (2024)
Mon, 01 Jan 2024 00:00:00 GMT
neural network
Haghiri, Shahed et al.

Discovery of Crystallizable Organic Semiconductors with Machine Learning
https://doi.org/10.1021/jacs.4c05245
Johnson, Holly M. and Gusev, Filipp and Dull, Jordan T. and Seo, Yejoon and Priestley, Rodney D. and Isayev, Olexandr and Rand, Barry P. J. Am. Chem. Soc. (2024)
Mon, 01 Jan 2024 00:00:00 GMT
machine learning, organic semiconductors, materials discovery
Johnson, Holly M. et al.

De novo molecule design towards biased properties via a deep generative framework and iterative transfer learning
https://doi.org/10.1039/d3dd00210a
The RRCGAN, validated through DFT, demonstrates success in generating chemically valid molecules targeting energy gap values, with 75% of the generated molecules having an RE of <20% of the targeted values.
Mon, 01 Jan 2024 00:00:00 GMT
generative models, molecular design, transfer learning
Sattari, Kianoosh et al.

Integrating QSAR modelling and deep learning in drug discovery: the emergence of deep QSAR
https://doi.org/10.1038/s41573-023-00832-0
Tropsha, Alexander and Isayev, Olexandr and Varnek, Alexandre and Schneider, Gisbert and Cherkasov, Artem. Nat. Rev. Drug Discov. (2024)
Mon, 01 Jan 2024 00:00:00 GMT
QSAR, deep learning, drug discovery
Tropsha, Alexander et al.

Exploring the frontiers of condensed-phase chemistry with a general reactive machine learning potential
https://doi.org/10.1038/s41557-023-01427-3
Atomistic simulation has a broad range of applications from drug design to materials discovery. Machine learning interatomic potentials (MLIPs) have become an efficient alternative to computationally expensive ab initio simulations.
For this reason, chemistry and materials science would greatly benefit from a general reactive MLIP, that is, an MLIP that is applicable to a broad range of reactive chemistry without the need for refitting. Here we develop a general reactive MLIP (ANI-1xnr) through automated sampling of condensed-phase reactions. ANI-1xnr is then applied to study five distinct systems: carbon solid-phase nucleation, graphene ring formation from acetylene, biofuel additives, combustion of methane and the spontaneous formation of glycine from early earth small molecules. In all studies, ANI-1xnr closely matches experiment (when available) and/or previous studies using traditional model chemistry methods. As such, ANI-1xnr proves to be a highly general reactive MLIP for C, H, N and O elements in the condensed phase, enabling high-throughput in silico reactive chemistry experimentation.
Mon, 01 Jan 2024 00:00:00 GMT
machine learning, ab initio, high-throughput
Zhang, Shuhao et al.

Generative Models as an Emerging Paradigm in the Chemical Sciences
https://doi.org/10.1021/jacs.2c13467
Anstine, Dylan M. and Isayev, Olexandr. J. Am. Chem. Soc. (2023)
Sun, 01 Jan 2023 00:00:00 GMT
generative models, chemical sciences, deep learning
Anstine, Dylan M. et al.

Machine Learning Interatomic Potentials and Long-Range Physics
https://doi.org/10.1021/acs.jpca.2c06778
Anstine, Dylan M. and Isayev, Olexandr. J. Phys. Chem. A (2023)
Sun, 01 Jan 2023 00:00:00 GMT
machine learning potentials, long-range interactions, physics
Anstine, Dylan M. et al.

Themed collection on Insightful Machine Learning for Physical Chemistry
https://doi.org/10.1039/d3cp90129g
This themed collection gathers articles on Insightful Machine Learning for Physical Chemistry.
Sun, 01 Jan 2023 00:00:00 GMT
machine learning
Clark, Aurora E.
et al.

Synergy of semiempirical models and machine learning in computational chemistry
https://doi.org/10.1063/5.0151833
Catalyzed by enormous success in the industrial sector, many research programs have been exploring data-driven, machine learning approaches. Performance can be poor when the model is extrapolated to new regions of chemical space, e.g., new bonding types, new many-body interactions. Another important limitation is the spatial locality assumption in model architecture, and this limitation cannot be overcome with larger or more diverse datasets. The outlined challenges are primarily associated with the lack of electronic structure information in surrogate models such as interatomic potentials. Given the fast development of machine learning and computational chemistry methods, we expect some limitations of surrogate models to be addressed in the near future; nevertheless, the spatial locality assumption will likely remain a limiting factor for their transferability. Here, we suggest focusing on an equally important effort: the design of physics-informed models that leverage the domain knowledge and employ machine learning only as a corrective tool. In the context of material science, we will focus on semi-empirical quantum mechanics, using machine learning to predict corrections to the reduced-order Hamiltonian model parameters. The resulting models are broadly applicable, retain the speed of semiempirical chemistry, and frequently achieve accuracy on par with much more expensive ab initio calculations.
These early results indicate that future work, in which machine learning and quantum chemistry methods are developed jointly, may provide the best of all worlds for chemistry applications that demand both high accuracy and high numerical efficiency.
Sun, 01 Jan 2023 00:00:00 GMT
semiempirical methods, machine learning
Fedik, Nikita et al.

Active Learning Guided Drug Design Lead Optimization Based on Relative Binding Free Energy Modeling
https://doi.org/10.1021/acs.jcim.2c01052
Gusev, Filipp and Gutkin, Evgeny and Kurnikova, Maria G. and Isayev, Olexandr. J. Chem. Inf. Model. (2023)
Sun, 01 Jan 2023 00:00:00 GMT
active learning, drug design, binding free energy
Gusev, Filipp et al.

Scalable hybrid deep neural networks/polarizable potentials biomolecular simulations including long-range effects
https://doi.org/10.1039/d2sc04815a
Deep-HP is a scalable extension of the Tinker-HP multi-GPU molecular dynamics (MD) package enabling the use of PyTorch/TensorFlow Deep Neural Network (DNN) models.
Sun, 01 Jan 2023 00:00:00 GMT
neural networks, polarizable potentials
Jaffrelot Inizan, Théo et al.

The challenge of balancing model sensitivity and robustness in predicting yields: a benchmarking study of amide coupling reactions
https://doi.org/10.1039/d3sc03902a
A sensitive model captures the reactivity cliffs but overfits to yield outliers. On the other hand, a robust model disregards the yield outliers but underfits the reactivity cliffs.
Sun, 01 Jan 2023 00:00:00 GMT
yield prediction, benchmarking
Liu, Zhen et al.

Structure Prediction of Epitaxial Organic Interfaces with Ogre, Demonstrated for Tetracyanoquinodimethane (TCNQ) on Tetrathiafulvalene (TTF)
https://doi.org/10.1021/acs.jpcc.3c02384
Moayedpour, Saeed and Bier, Imanuel and Wen, Wen and Dardzinski, Derek and Isayev, Olexandr and Marom, Noa. J. Phys. Chem.
C (2023)Sun, 01 Jan 2023 00:00:00 GMTstructure predictionorganic interfacesMoayedpour, Saeed et al.Comprehensive exploration of graphically defined reaction spaceshttps://doi.org/10.1038/s41597-023-02043-zhttps://doi.org/10.1038/s41597-023-02043-zZhao, Qiyuan and Vaddadi, Sai Mahit and Woulfe, Michael and Ogunfowora, Lawal A. and Garimella, Sanjay S. and Isayev, Olexandr and Savoie, Brett M.. Sci. Data (2023)Sun, 01 Jan 2023 00:00:00 GMTreaction spacesdata scienceactivation energyheat of reactionreactant and product geometriesfrequencies032 reactionsZhao, Qiyuan et al.$Δ^2$ machine learning for reaction property predictionhttps://doi.org/10.1039/d3sc02408chttps://doi.org/10.1039/d3sc02408cNewly developed Δ²-learning models enable state-of-the-art accuracy in predicting the properties of chemical reactions.Sun, 01 Jan 2023 00:00:00 GMTdelta learningreaction predictionZhao, Qiyuan et al.Extending machine learning beyond interatomic potentials for predicting molecular propertieshttps://doi.org/10.1038/s41570-022-00416-3https://doi.org/10.1038/s41570-022-00416-3Fedik, Nikita and Zubatyuk, Roman and Kulichenko, Maksim and Lubbers, Nicholas and Smith, Justin S. and Nebgen, Benjamin and Messerly, Richard and Li, Ying Wai and Boldyrev, Alexander I. and Barros, Kipton and Isayev, Olexandr and Tretiak, Sergei. Nat. Rev. Chem. (2022)Sat, 01 Jan 2022 00:00:00 GMTmachine learningmolecular propertiesFedik, Nikita et al.Learning molecular potentials with neural networkshttps://doi.org/10.1002/wcms.1564https://doi.org/10.1002/wcms.1564The potential energy of molecular species and their conformers can be computed with a wide range of computational chemistry methods, from molecular mechanics to ab initio quantum chemistry. However, the proper choice of the computational approach based on computational cost and reliability of calculated energies is a dilemma, especially for large molecules. 
This dilemma proves even more problematic for studies that require hundreds or thousands of calculations, such as drug discovery. On the other hand, driven by their pattern recognition capabilities, neural networks have started to gain popularity in the computational chemistry community. During the last decade, many neural network potentials have been developed to predict a variety of chemical information of different systems. Neural network potentials have been shown to predict chemical properties with accuracy comparable to quantum mechanical approaches but at a cost approaching that of molecular mechanics calculations. As a result, the development of more reliable, transferable, and extensible neural network potentials has become an attractive field of study for researchers. In this review, we give an overview of the status of current neural network potentials and strategies to improve their accuracy. We provide recent examples of studies that demonstrate the applicability of these potentials. We also discuss the capabilities and shortcomings of the current models and the challenges and future aspects of their development and applications. It is expected that this review will provide guidance for the development of neural network potentials and the exploitation of their applicability. This article is categorized under: Data Science > Artificial Intelligence/Machine Learning; Molecular and Statistical Mechanics > Molecular Interactions; Software > Molecular Modeling.Sat, 01 Jan 2022 00:00:00 GMTneural networksmolecular potentialsreviewGokcan, Hatice et al.Simulations of Pathogenic E1α Variants: Allostery and Impact on Pyruvate Dehydrogenase Complex-E1 Structure and Functionhttps://doi.org/10.1021/acs.jcim.2c00630https://doi.org/10.1021/acs.jcim.2c00630Gokcan, Hatice and Bedoyan, Jirair K. and Isayev, Olexandr. 
Journal of Chemical Information and Modeling (2022)Sat, 01 Jan 2022 00:00:00 GMTGokcan, Hatice et al.Prediction of Protein pKa with Representation Learninghttps://doi.org/10.26434/chemrxiv-2021-tcn0f-v2https://doi.org/10.26434/chemrxiv-2021-tcn0f-v2The behavior of proteins is closely related to the protonation states of the residues. Therefore, prediction and measurement of pKa are essential to understand the basic functions of proteins. In this work, we develop a new empirical scheme for protein pKa prediction that is based on deep representation learning. It combines machine learning with atomic environment vector (AEV) and learned quantum mechanical representation from ANI-2x neural network potential (J. Chem. Theory Comput. 2020, 16, 4192). The scheme requires only the coordinate information of a protein as the input and separately estimates the pKa for all five titratable amino acid types. The accuracy of the approach was analyzed with both cross-validation and an external test set of proteins. Obtained results were compared with the widely used empirical approach PROPKA. The new empirical model provides accuracy with MAEs below 0.5 for all amino acid types. It surpasses the accuracy of PROPKA and performs significantly better than the null model. Our model is also sensitive to the local conformational changes and molecular interactions.Sat, 01 Jan 2022 00:00:00 GMTneural networkmachine learningGokcan, Hatice et al.
Prediction of protein pKa with representation learninghttps://doi.org/10.1039/d1sc05610ghttps://doi.org/10.1039/d1sc05610gWe developed new empirical ML model for protein pKa prediction with MAEs below 0.5 for all amino acid types.Sat, 01 Jan 2022 00:00:00 GMTGokcan, Hatice et al.Generative and reinforcement learning approaches for the automated de novo design of bioactive compoundshttps://doi.org/10.1038/s42004-022-00733-0https://doi.org/10.1038/s42004-022-00733-0Deep generative neural networks have been used increasingly in computational chemistry for de novo design of molecules with desired properties. Many deep learning approaches employ reinforcement learning for optimizing the target properties of the generated molecules. However, the success of this approach is often hampered by the problem of sparse rewards as the majority of the generated molecules are expectedly predicted as inactives. We propose several technical innovations to address this problem and improve the balance between exploration and exploitation modes in reinforcement learning. 
In a proof-of-concept study, we demonstrate the application of the deep generative recurrent neural network architecture enhanced by several proposed technical tricks to design inhibitors of the epidermal growth factor receptor (EGFR) and further experimentally validate their potency. The proposed technical solutions are expected to substantially improve the success rate of finding novel bioactive compounds for specific biological targets using generative and reinforcement learning approaches.Sat, 01 Jan 2022 00:00:00 GMTneural networkdeep learningKorshunova, Maria et al.Roadmap on Machine learning in electronic structurehttps://doi.org/10.1088/2516-1075/ac572fhttps://doi.org/10.1088/2516-1075/ac572fKulik, H. J. and Hammerschmidt, T. and Schmidt, J. and Botti, S. and Marques, M. A. L. and Boley, M. and Scheffler, M. and Todorović, M. and Rinke, P. and Oses, C. and Smolyanyuk, A. and Curtarolo, S. and Tkatchenko, A. and Bartók, A. P. and Manzhos, S. and Ihara, M. and Carrington, T. and Behler, J. and Isayev, O. and Veit, M. and Grisafi, A. and Nigam, J. and Ceriotti, M. and Schütt, K. T. and Westermayr, J. and Gastegger, M. and Maurer, R. J. and Kalita, B. and Burke, K. and Nagai, R. and Akashi, R. and Sugino, O. and Hermann, J. and Noé, F. and Pilati, S. and Draxl, C. and Kuban, M. and Rigamonti, S. and Scheidgen, M. and Esters, M. and Hicks, D. and Toher, C. and Balachandran, P. V. and Tamblyn, I. and Whitelam, S. and Bellinger, C. and Ghiringhelli, L. M.. Electron. Struct. (2022)Sat, 01 Jan 2022 00:00:00 GMTmachine learningelectronic structuretraditional methodsextendedsimplerthis Roadmap articleKulik, H. J. et al.Auto3D: Automatic Generation of the Low-Energy 3D Structures with ANI Neural Network Potentialshttps://doi.org/10.1021/acs.jcim.2c00817https://doi.org/10.1021/acs.jcim.2c00817Liu, Zhen and Zubatiuk, Tetiana and Roitberg, Adrian and Isayev, Olexandr. J. Chem. Inf. Model. 
(2022)Sat, 01 Jan 2022 00:00:00 GMT3D structure generationneural networksconformersLiu, Zhen et al.