**Context:** Although population pharmacokinetic and/or pharmacodynamic model evaluation is highly recommended by regulatory authorities, there is no consensus today on the appropriate approach to assess a population model. We have also shown in a recent literature survey [1] that model evaluation was not appropriately performed in most published population pharmacokinetic-pharmacodynamic analyses. Among the different approaches proposed to evaluate a population model on an external validation dataset, standardised prediction errors (computed in NONMEM through WRES) are frequently used, but they are computed using a first-order approximation. We developed a metric called Normalized Prediction Distribution Errors (NPDE) based on the whole predictive distribution, and we propose different tests and graphs. We assess by simulation the statistical properties of NPDE for the evaluation of a population pharmacokinetic model in comparison with WRES. We also evaluate the ability of NPDE to detect misspecification in the covariate model.

**Methods:** The null hypothesis (H_{0}) is that data in the validation dataset can be described by a given model. For each observation, we define the prediction discrepancy as the percentile of this observation in the whole marginal predictive distribution under H_{0} [2]. The predictive distribution is obtained through Monte Carlo simulations, as for Visual Predictive Checks. Because prediction discrepancies are correlated within an individual, we use the mean and variance of the observations, estimated empirically from the simulations, to obtain uncorrelated metrics [3]. NPDE are then obtained by applying the inverse of the standard normal cumulative distribution function. By construction, if H_{0} is true, NPDE follow a N(0, 1) distribution without any approximation and are uncorrelated within an individual. This can be tested using the Kolmogorov-Smirnov test.
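The construction above can be sketched in Python for a single individual. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `npde_individual`, the choice of a Cholesky factor of the empirical covariance for the decorrelation step, and the clipping of percentiles at 1/(2K) are all assumptions. `y_sim` stands for K Monte Carlo replicates of the individual's observations simulated under H_{0}.

```python
import numpy as np
from scipy import stats


def npde_individual(y_obs, y_sim):
    """Sketch of NPDE for one individual.

    y_obs : (n,) observed values for the individual
    y_sim : (K, n) values simulated under H0 with the same design
    """
    y_obs = np.asarray(y_obs, dtype=float)
    y_sim = np.asarray(y_sim, dtype=float)
    n_rep = y_sim.shape[0]

    # Empirical mean and covariance of the simulated observations
    mean = y_sim.mean(axis=0)
    cov = np.atleast_2d(np.cov(y_sim, rowvar=False))

    # Decorrelate with the Cholesky factor (an assumed choice of square root)
    chol = np.linalg.cholesky(cov)
    obs_dec = np.linalg.solve(chol, y_obs - mean)
    sim_dec = np.linalg.solve(chol, (y_sim - mean).T).T

    # Prediction discrepancy: percentile of each decorrelated observation
    # within its decorrelated predictive distribution
    pd = (sim_dec < obs_dec).mean(axis=0)

    # Keep percentiles away from 0 and 1 so the normal quantile stays finite
    pd = np.clip(pd, 1.0 / (2 * n_rep), 1.0 - 1.0 / (2 * n_rep))

    # The inverse standard normal CDF turns percentiles into NPDE
    return stats.norm.ppf(pd)
```

Pooling the values returned for all individuals and passing them to `scipy.stats.kstest(npde, "norm")` gives a Kolmogorov-Smirnov test against N(0, 1).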

The model used for simulations is a one-compartment model with first-order absorption, built from two phase II studies and one phase III study. With this model (without covariates), we simulate 1000 external validation datasets according to the design of another phase III study and calculate NPDE and WRES for each simulated dataset. We evaluate the type I error of the Kolmogorov-Smirnov test for these two metrics.
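The type I error estimate described here amounts to the proportion of datasets simulated under H_{0} for which the test rejects. A hedged sketch follows; `ks_rejection_rate` and `metric_by_dataset` are hypothetical names, and each element of `metric_by_dataset` is assumed to hold the NPDE (or WRES) pooled over one simulated validation dataset.

```python
import numpy as np
from scipy import stats


def ks_rejection_rate(metric_by_dataset, alpha=0.05):
    """Fraction of datasets for which the KS test rejects N(0, 1)."""
    p_values = np.array(
        [stats.kstest(np.asarray(m), "norm").pvalue for m in metric_by_dataset]
    )
    # Under H0 this proportion estimates the type I error at level alpha
    return float(np.mean(p_values < alpha))
```

Applied to datasets simulated under an alternative instead of H_{0}, the same proportion would estimate power rather than type I error.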

In a second step, we consider covariate models and investigate the application of NPDE to these models. We use covariate models from a real phase III study and generate several validation datasets under H_{0} and under alternative assumptions: without covariate, with one continuous covariate (weight), or with one categorical covariate (sex). We propose two approaches to evaluate models with covariates using NPDE. The first approach uses the Spearman correlation test or the Wilcoxon test to test the relationship between NPDE and weight or sex, respectively. The second approach tests whether the NPDE follow a N(0, 1) distribution after splitting them into groups according to the values of the covariate.
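Both approaches can be illustrated with standard SciPy tests. The sketch below makes several assumptions that are not in the abstract: the function name `covariate_tests`, the use of the Wilcoxon rank-sum test (`scipy.stats.ranksums`) for the two-level sex covariate, and the split of weight into three quantile groups. It expects one NPDE value per observation, with the individual's covariate repeated for each of that individual's observations.

```python
import numpy as np
from scipy import stats


def covariate_tests(npde, weight, sex, n_groups=3):
    """Sketch: p-values relating NPDE to weight and a two-level sex covariate."""
    npde = np.asarray(npde, dtype=float)
    weight = np.asarray(weight, dtype=float)
    sex = np.asarray(sex)
    levels = np.unique(sex)

    # Approach 1: test the relationship between NPDE and each covariate
    _, p_spearman = stats.spearmanr(npde, weight)
    _, p_wilcoxon = stats.ranksums(npde[sex == levels[0]], npde[sex == levels[1]])

    # Approach 2: KS test against N(0, 1) within each covariate group
    # (quantile groups for weight are an assumed way of splitting)
    edges = np.quantile(weight, np.linspace(0, 1, n_groups + 1)[1:-1])
    weight_group = np.digitize(weight, edges)
    p_ks_weight = [
        stats.kstest(npde[weight_group == g], "norm").pvalue
        for g in range(n_groups)
    ]
    p_ks_sex = {
        str(lev): stats.kstest(npde[sex == lev], "norm").pvalue for lev in levels
    }

    return {
        "spearman_weight": p_spearman,
        "wilcoxon_sex": p_wilcoxon,
        "ks_by_weight_group": p_ks_weight,
        "ks_by_sex": p_ks_sex,
    }
```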

**Results:** Under H_{0}, the Kolmogorov-Smirnov test shows a high type I error when applied to WRES, whereas its type I error is close to 5% when applied to NPDE. Regarding the application of NPDE to covariate models, the Spearman and Wilcoxon tests were not significant when the model and the validation dataset were consistent. When they were not consistent, these tests showed a significant relationship between NPDE and the covariate. The Kolmogorov-Smirnov test applied after splitting NPDE by covariate gave the same results.

**Conclusion:** We recommend the use of NPDE over WRES for model evaluation. NPDE do not depend on an approximation of the model and have good statistical properties. They can be viewed as a metric that quantifies the visual predictive check. NPDE thus appear to be a good tool for evaluating models, with or without covariates.

**References**

[1] Brendel K., Dartois C., Comets E., Lemenuel-Diot A., Laveille C., Tranchand B., Girard P., Laffont C.M., Mentré F. *Clin Pharmacokinet* 2007; 46 (3): 1 (In press).

[2] Mentré F., Escolano S. Prediction discrepancies for the evaluation of nonlinear mixed-effects models. *J Pharmacokinet Pharmacodyn*. **33**(3):345-367 (2006).

[3] Brendel K., Comets E., Laffont C., Laveille C., Mentré F. Metrics for external model evaluation with an application to the population pharmacokinetics of gliclazide. *Pharm Res*. **23**(9):2036-2049 (2006).