Machine Learning anomaly detection of automated HPLC experiments in the Cloud Laboratory
Filipp Gusev, Benjamin C Kline, Ryan Quinn, Anqin Xu, Ben Smith, Brian Frezza, Olexandr Isayev
Highlight
Automation of experiments in cloud laboratories promises to revolutionize scientific research by enabling remote experimentation and improving reproducibility.
Abstract
Automation of experiments in cloud laboratories promises to revolutionize scientific research by enabling remote experimentation and improving reproducibility. However, maintaining quality control without constant human oversight remains a critical challenge. Here, we present a novel machine learning framework for automated anomaly detection in High-Performance Liquid Chromatography (HPLC) experiments conducted in a cloud lab. Our system specifically targets air bubble contamination—a common yet challenging issue that typically requires expert analytical chemists to detect and resolve. By leveraging active learning combined with human-in-the-loop annotation, we trained a binary classifier on approximately 25,000 HPLC traces. Prospective validation demonstrated robust performance, with an accuracy of 0.96 and an F1 score of 0.92, suitable for real-world applications. Beyond anomaly detection, we show that the system can serve as a sensitive indicator of instrument health, outperforming traditional periodic qualification tests in identifying systematic issues. The framework is protocol-agnostic, instrument-agnostic, and vendor-neutral, making it adaptable to various laboratory settings. This work represents a significant step toward fully autonomous laboratories by enabling continuous quality control, reducing the expertise barrier for complex analytical techniques, and facilitating proactive maintenance of scientific instrumentation. The approach can be extended to detect other types of experimental anomalies, potentially transforming how quality control is implemented in self-driving laboratories (SDLs) across diverse scientific disciplines.
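As an illustration of the two ideas the abstract combines, the sketch below shows one common active-learning query rule (uncertainty sampling) and how accuracy and F1 are computed for a binary classifier. The abstract does not state which query strategy or model the authors used, so the function names, the 0.5 decision boundary, and the choice of uncertainty sampling are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def select_for_annotation(probs: Sequence[float], budget: int) -> List[int]:
    """Pick the traces a human annotator should label next.

    Assumption: `probs` holds a classifier's predicted probability that each
    unlabeled HPLC trace is anomalous. Traces closest to 0.5 are the ones the
    model is least sure about (uncertainty sampling).
    """
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:budget]


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Compute accuracy and F1 for binary labels (1 = anomalous, 0 = normal)."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    denom = 2 * tp + fp + fn
    # F1 is the harmonic mean of precision and recall; define it as 0 when
    # there are no positive labels and no positive predictions.
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    # Hypothetical predicted probabilities for four unlabeled traces.
    print(select_for_annotation([0.9, 0.55, 0.1, 0.48], budget=2))
    # Hypothetical ground truth and predictions.
    print(accuracy_and_f1([1, 0, 1, 0], [1, 0, 0, 0]))
```

In a human-in-the-loop setting, the indices returned by `select_for_annotation` would be sent to an expert for labeling, the classifier retrained on the enlarged labeled set, and the cycle repeated. Reporting F1 alongside accuracy matters when anomalous traces are much rarer than normal ones, because accuracy alone can look high for a model that rarely flags anything.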
Keywords
machine learning, active learning, automation
Cite This Paper
@article{Gusev2025,
author = {Gusev, Filipp and Kline, Benjamin C and Quinn, Ryan and Xu, Anqin and Smith, Ben and Frezza, Brian and Isayev, Olexandr},
title = {Machine Learning anomaly detection of automated HPLC experiments in the Cloud Laboratory},
year = {2025},
doi = {10.26434/chemrxiv-2025-7ggzl},
url = {http://dx.doi.org/10.26434/chemrxiv-2025-7ggzl},
publisher = {American Chemical Society (ACS)},
keywords = {machine learning, active learning, automation},
researchAreas = {experiment-automation},
highlight = {Automation of experiments in cloud laboratories promises to revolutionize scientific research by enabling remote experimentation and improving reproducibility.},
citations = {2}
}