Article. Research areas: Machine Learning Potentials, Experiment Automation, Drug Discovery, Materials Informatics

Discovery of Novel Celecoxib Polymorphs Using AIMNet2 Machine Learning Interatomic Potential

Peikun Zheng, Yuriy Abramov, Changquan Calvin Sun, Olexandr Isayev

2025

Abstract

Polymorphism plays a pivotal role in defining the solid-state properties of pharmaceutical compounds, yet the discovery and accurate energy ranking of polymorphs remain a challenge. Here, we leverage a fine-tuned machine-learned interatomic potential, AIMNet2, to explore the polymorphic landscape of celecoxib, a clinically important COX-2 inhibitor. Our approach combines GPU-accelerated crystal structure generation, active-learning-guided model refinement, and quasi-harmonic free-energy corrections. The workflow successfully reproduces the experimental energy hierarchy of the known polymorphs and identifies several novel low-energy structures with distinct packing motifs. In addition, we evaluate elastic properties and thermal expansion effects across the polymorphs, revealing structural features that underpin mechanical flexibility and thermodynamic preferences. This study demonstrates the power of AIMNet2-based crystal structure prediction for resolving complex pharmaceutical polymorphism and offers a tool for future polymorph discovery and solid-state optimization.
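
The abstract mentions free-energy corrections and energy ranking but gives no formulas, so the following is only a rough sketch of the general idea and not the authors' implementation. It evaluates the standard harmonic vibrational free energy at a single fixed cell volume, which is the building block of a quasi-harmonic treatment (the quasi-harmonic step would repeat this at several volumes and take the minimum). The function names, the input layout, and the toy numbers are all hypothetical, and a Gamma-point-style list of mode wavenumbers stands in for a full phonon calculation.

```python
import math

# Physical constants in SI units.
PLANCK = 6.62607015e-34         # J*s
BOLTZMANN = 1.380649e-23        # J/K
AVOGADRO = 6.02214076e23        # 1/mol
LIGHT_SPEED_CM = 2.99792458e10  # cm/s


def harmonic_free_energy(wavenumbers_cm, temperature_k):
    """Harmonic vibrational Helmholtz free energy in kJ/mol.

    F_vib = sum_i [ h*nu_i/2 + kB*T*ln(1 - exp(-h*nu_i/(kB*T))) ]
    Assumes every mode has a strictly positive wavenumber (cm^-1).
    """
    total = 0.0
    for wn in wavenumbers_cm:
        if wn <= 0.0:
            raise ValueError("expected strictly positive wavenumbers")
        quantum = PLANCK * LIGHT_SPEED_CM * wn  # h*c*wavenumber, in joules
        total += 0.5 * quantum                  # zero-point contribution
        if temperature_k > 0.0:
            x = quantum / (BOLTZMANN * temperature_k)
            total += BOLTZMANN * temperature_k * math.log1p(-math.exp(-x))
    return total * AVOGADRO / 1000.0


def rank_by_free_energy(candidates, temperature_k):
    """Sort candidates by lattice energy plus F_vib, relative to the lowest.

    `candidates` maps a label to (lattice_energy_kj_mol, wavenumbers_cm).
    Returns a list of (label, relative_free_energy_kj_mol) pairs.
    """
    totals = {
        name: energy + harmonic_free_energy(modes, temperature_k)
        for name, (energy, modes) in candidates.items()
    }
    lowest = min(totals.values())
    return sorted(
        ((name, value - lowest) for name, value in totals.items()),
        key=lambda pair: pair[1],
    )


# Toy input only: made-up labels, energies, and modes, not data from the paper.
toy_candidates = {
    "form_x": (0.0, [60.0, 120.0, 900.0]),
    "form_y": (0.5, [40.0, 100.0, 900.0]),
}
for label, rel in rank_by_free_energy(toy_candidates, 300.0):
    print(f"{label}: {rel:.2f} kJ/mol above the lowest")
```

Because softer lattice modes lower the vibrational free energy more strongly as temperature rises, a ranking of this kind can differ from one based on lattice energies alone, which is why finite-temperature corrections matter for polymorph stability.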

Keywords

machine learning, active learning

Cite This Paper

@article{Zheng2025,
  author = {Zheng, Peikun and Abramov, Yuriy and Sun, Changquan Calvin and Isayev, Olexandr},
  title = {Discovery of Novel Celecoxib Polymorphs Using AIMNet2 Machine Learning Interatomic Potential},
  year = {2025},
  doi = {10.26434/chemrxiv-2025-nhmr1},
  url = {http://dx.doi.org/10.26434/chemrxiv-2025-nhmr1},
  publisher = {American Chemical Society (ACS)},
  keywords = {machine learning, active learning},
  researchAreas = {ml-potentials, experiment-automation, drug-discovery, materials-informatics},
  highlight = {Polymorphism plays a pivotal role in defining the solid-state properties of pharmaceutical compounds, yet the discovery and accurate energy ranking of polymorphs remain a challenge.}
}

Related Publications

2025
Cited by 3

Transferable Machine Learning Interatomic Potential for Pd-Catalyzed Cross-Coupling Reactions

Anstine D., Zubatyuk R., Gallegos L., Paton R., Wiest O., Nebgen B., Jones T., Gomes G., Tretiak S., Isayev O.

(2025)

ML Potentials
Reactions Reactivity
Experiment Automation
Materials Informatics

Finding efficient substrate-catalyst combinations for palladium-catalyzed cross-coupling reactions remains a critical challenge in synthetic chemistry, with broad implications for pharmaceutical and materials manufacturing.

2020
Cited by 212

The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules

Smith J. S., Zubatyuk R., Nebgen B., Lubbers N., Barros K., Roitberg A. E., Isayev O., Tretiak S.

Scientific Data, 7 (2020)

ML Potentials
Quantum Chemistry
Experiment Automation

Maximum diversification of data is a central theme in building generalized and accurate machine learning (ML) models.

2024
Cited by 85

Exploring the frontiers of condensed-phase chemistry with a general reactive machine learning potential

Zhang S., Makoś M. Z., Jadrich R. B., Kraka E., Barros K., Nebgen B. T., Tretiak S., Isayev O., Lubbers N., Messerly R. A., Smith J. S.

Nature Chemistry, 16, 727–734 (2024)

ML Potentials
Experiment Automation

Atomistic simulation has a broad range of applications from drug design to materials discovery.

2022
Cited by 4

Active learning guided drug design lead optimization based on relative binding free energy modeling

Gusev F., Gutkin E., Kurnikova M. G., Isayev O.

(2022)

Drug Discovery
Experiment Automation

In silico identification of potent protein inhibitors commonly requires prediction of a ligand binding free energy (BFE).

2020
Cited by 2

The ANI-1ccx and ANI-1x Data Sets, Coupled-Cluster and Density Functional Theory Properties for Molecules

Smith J. S., Zubatyuk R., Nebgen B. T., Lubbers N., Barros K., Roitberg A., Isayev O., Tretiak S.

(2020)

Quantum Chemistry
ML Potentials
Experiment Automation

Maximum diversification of data is a central theme in building generalized and accurate machine learning (ML) models.
