Pharmacokinetic-Pharmacodynamic informed labelling for Machine Learning classification tasks: predicting severe neutropenia using Real World Data

Introduction: Real-world data (RWD) from electronic health records provide a rich source of information about patient trajectories and outcomes. Machine learning (ML) models can use these large datasets to identify significant features and patterns associated with drug effect or safety. However, using clinical RWD in ML modelling can be challenging because the data are often sparse, imbalanced and incomplete. Pharmacokinetic-pharmacodynamic (PKPD) modelling can provide a biological and mechanistic basis for a prediction task. One opportunity to combine the two approaches is to use PKPD-informed labels for ML classification. PKPD models use individual and population level information to make a prediction at important time points, even when there is no direct observation. This can help to label more patients and amplify the signal from relevant features across the population. The objective of this study was to develop ML models using training datasets based on PKPD predictions and to compare their performance with models based only on available observations. Severe neutropenia following docetaxel administration was selected as a case example.

Methods: Data were available from oncology electronic health records at the Helsinki University Hospital (HUS), structured in the OMOP common data model. Dose and prior information about docetaxel pharmacokinetics were used to predict drug exposure, which drove the drug effect on neutrophils [1]. The Friberg model was used to describe neutrophil dynamics following docetaxel administration [2]. The PKPD analysis was performed using NONMEM (ICON Development Solutions, Maryland, USA) version 7.5.1 and Perl Speaks NONMEM version 5.5.0. Model evaluation involved Visual Predictive Checks and non-parametric bootstrapping to assess parameter uncertainty.
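As an illustration of how a nadir prediction can be obtained from this type of model, the sketch below integrates the commonly published structure of the Friberg model (one proliferating compartment, three transit compartments, one circulating compartment, and a feedback term) in plain Python. It is not the NONMEM implementation used in this work. The assumptions are: the circulating elimination rate equals the transit rate, drug exposure follows a hypothetical mono-exponential concentration profile, the concentration units match the (unstated) units of the slope parameter, and a fixed-step Runge-Kutta integrator is adequate. The function names are illustrative only.

```python
import math


def simulate_friberg(circ0, mtt, gamma, slope, conc, t_end=21.0, dt=0.01):
    """Integrate the model with fixed-step RK4; return (times, circ) lists.

    circ0: baseline neutrophils, mtt: mean transit time (days),
    gamma: feedback exponent, slope: linear drug effect slope,
    conc: function of time (days) returning drug concentration.
    """
    ktr = 4.0 / mtt  # (n + 1) / MTT with n = 3 transit compartments

    def rhs(t, y):
        prol, tr1, tr2, tr3, circ = y
        e_drug = slope * conc(t)              # linear drug effect
        feedback = (circ0 / circ) ** gamma    # rebound driven by low counts
        return [
            ktr * prol * ((1.0 - e_drug) * feedback - 1.0),
            ktr * (prol - tr1),
            ktr * (tr1 - tr2),
            ktr * (tr2 - tr3),
            ktr * (tr3 - circ),               # assumption: k_circ = k_tr
        ]

    y = [circ0] * 5  # system starts at steady state
    t = 0.0
    times, circ = [t], [y[4]]
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(t, y)
        k2 = rhs(t + dt / 2, [a + dt / 2 * b for a, b in zip(y, k1)])
        k3 = rhs(t + dt / 2, [a + dt / 2 * b for a, b in zip(y, k2)])
        k4 = rhs(t + dt, [a + dt * b for a, b in zip(y, k3)])
        y = [a + dt / 6 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
        t += dt
        times.append(t)
        circ.append(y[4])
    return times, circ


def predicted_nadir(times, circ):
    """Return (time of nadir, nadir value) from a simulated profile."""
    i = min(range(len(circ)), key=circ.__getitem__)
    return times[i], circ[i]


# Hypothetical exposure profile; amplitude and rate are NOT from the study.
def example_conc(t):
    return 0.2 * math.exp(-2.0 * t)


# Population estimates reported in the Results are used as typical values.
times, circ = simulate_friberg(4.14, 2.4, 0.12, 13.8, example_conc)
t_nadir, nadir = predicted_nadir(times, circ)
```

In practice the individual prediction would come from the patient's own dose, covariates and any available neutrophil observations rather than from typical values alone.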

Logistic regression and XGBoost algorithms were used for the severe neutropenia prediction task. Only baseline information prior to the first dose was used (no on-treatment data). Two labelling approaches were assessed. The first was the ‘naive’ approach based on observations. Patients with a neutrophil observation < 0.1 x 10^9 cells/L were labelled as positive. The negative label required assumptions about when severe neutropenia would occur (e.g. a neutrophil observation > 0.5 x 10^9 cells/L between 6 and 8 days after the docetaxel dose). The second approach was informed by the PKPD prediction at the individual patient’s nadir and did not require an observation in a specific window. The ML models were trained using either the naive or the PKPD-informed training set and evaluated on a common validation set. Training set sizes between 200 and 4000 samples were evaluated for each algorithm, and each evaluation was repeated with randomly sampled hyperparameters (n=200). Both AUC-ROC and AUC-PR were used to assess predictive performance.
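The difference between the two labelling approaches can be sketched as follows. The naive rule uses the thresholds and the 6 to 8 day window given above as an example; requiring every in-window observation to exceed the negative threshold is an assumption. The threshold applied to the PKPD-predicted nadir is not stated here, so it is passed as an explicit argument and the value in the usage example is hypothetical. Function names are illustrative.

```python
def naive_label(observations, pos_threshold=0.1, neg_threshold=0.5,
                window=(6.0, 8.0)):
    """Label a patient from observed neutrophil counts only.

    observations: list of (days_since_dose, count) tuples, counts in
    x 10^9 cells/L. Returns 1 (positive), 0 (negative) or None (unlabelled).
    """
    # Positive: any observation below the positive threshold.
    if any(count < pos_threshold for _, count in observations):
        return 1
    # Negative: needs an observation in the assumed nadir window,
    # and all such observations must be above the negative threshold.
    in_window = [count for day, count in observations
                 if window[0] <= day <= window[1]]
    if in_window and all(count > neg_threshold for count in in_window):
        return 0
    return None  # no usable observation, patient is dropped from training


def pkpd_label(nadir_prediction, threshold):
    """Label a patient from the model-predicted nadir (threshold is assumed)."""
    return 1 if nadir_prediction < threshold else 0


# A patient sampled only on day 14 cannot be labelled by the naive rule,
# but receives a label from the predicted nadir.
unlabelled = naive_label([(14.0, 1.2)])
labelled = pkpd_label(0.3, threshold=0.5)  # hypothetical nadir and threshold
```

This is why the PKPD-informed training set can contain more patients: the label depends on a prediction that exists for every patient, not on the timing of a blood sample.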

Results: After application of exclusion criteria and data cleaning, data from 4477 patients from the HUS database were available for analysis. The population PD parameter estimates (relative standard error %) were: baseline neutrophils 4.14 x 10^9 cells/L (7.3 %), mean transit time 2.4 days (5.1 %), Hill exponent 0.12 (17 %) and drug effect slope 13.8 (8.3 %). A combined (additive and proportional) error model was used to describe the residual error. For the logistic regression models, the median AUC-ROC increased by 0.16 and the median AUC-PR increased by 0.24 (both p < 0.001, Wilcoxon Signed-Rank test) across all numbers of training samples tested when using the PKPD approach compared to the naive approach. For XGBoost, the median AUC-ROC increased by between 0.11 over the range of training samples tested and AUC-PR was similarly increased by between 0.20 (both p < 0.001, Wilcoxon Signed-Rank test). These systematic increases reflect the larger number of patients that could be labelled using the PKPD approach, as well as the quality of those labels.
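For reference, the two performance metrics reported above can be computed from validation labels and predicted scores as in the minimal sketch below. AUC-ROC is computed as the probability that a positive case is ranked above a negative case, and AUC-PR is approximated by average precision, which is one common estimator and an assumption here, since the estimator used in the study is not stated. Tied scores are not specially handled in the average precision function. The labels and scores are made-up toy values.

```python
def auc_roc(y_true, scores):
    """AUC-ROC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Average precision: mean of precision at the rank of each positive."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy validation set (not study data).
y_val = [0, 0, 1, 1]
p_val = [0.1, 0.4, 0.35, 0.8]
roc = auc_roc(y_val, p_val)            # 0.75
ap = average_precision(y_val, p_val)   # 0.8333...
```

Because severe neutropenia is a minority class, the precision-recall metric is sensitive to changes that AUC-ROC can mask, which is one reason to report both.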

Conclusion: Machine learning model performance (AUC-ROC and AUC-PR) increased when using the PKPD labelling approach compared to the naive approach, for both the logistic regression and XGBoost algorithms. Using a PKPD labelling approach for the training dataset enabled more patients to be labelled. This framework leveraged RWD, mechanism-based PKPD model predictions and ML algorithms, and could be generalised to other interventions and patient outcomes.