babelmixr2, nlmixr2 and NONMEM

By Matt Fidler and the nlmixr2 Development Team in babelmixr2

November 11, 2022

I remember attending a virtual ACoP where Tim Waterhouse said “This person is so convincing that the could sell NONMEM to a nlmixr developer”. I was in the wrong meeting so I laughed and connected to the correct meeting.

While he is correct, I don’t really want to purchase a NONMEM license, and I would think that individual pharmacometricians are the same: they don’t want to buy a personal license for the software they use at work (although CROs might be different here).

That being said, I have used NONMEM long before I helped develop nlmixr2, and I’ve always appreciated all that NONMEM brings to the pharmacometrics community. I remember when I ran my first NONMEM model I was amazed and wondered how it could calculate both individual and population effects of a complicated system.

I still think NONMEM has an important role in our pharmacometrics ecosystem today.

Still, our vision stands:

To develop an R-based open-source nonlinear mixed-effects modeling software that can compete with commercial tools and is suitable for regulatory submissions.

which means it would be really convenient to have an interface to convert nlmixr2 models to NONMEM, and other tools, to make everyone’s lives easier.

With this in mind, I am proud to announce the first nlmixr2 to NONMEM translator in babelmixr2.

While this has been done before, the method whereby we are converting between the two is novel and has some surprising advantages.

How to use `NONMEM` with `nlmixr2`

To use NONMEM in nlmixr, you do not need to change your data or your nlmixr2 dataset. babelmixr2 will do the heavy lifting here.

Lets take the classic warfarin example to start the comparison with…

The model we use in the nlmixr2 vignettes is:

library(babelmixr2)
# The nonmem translation requires the package pmxTools as well.
# You do not need to load it, simply have it available for use.
pk.turnover.emax3 <- function() {
  ini({
    tktr <- log(1)
    tka <- log(1)
    tcl <- log(0.1)
    tv <- log(10)
    ##
    eta.ktr ~ 1
    eta.ka ~ 1
    eta.cl ~ 2
    eta.v ~ 1
    prop.err <- 0.1
    pkadd.err <- 0.1
    ##
    temax <- logit(0.8)
    tec50 <- log(0.5)
    tkout <- log(0.05)
    te0 <- log(100)
    ##
    eta.emax ~ .5
    eta.ec50  ~ .5
    eta.kout ~ .5
    eta.e0 ~ .5
    ##
    pdadd.err <- 10
  })
  model({
    ktr <- exp(tktr + eta.ktr)
    ka <- exp(tka + eta.ka)
    cl <- exp(tcl + eta.cl)
    v <- exp(tv + eta.v)
    emax = expit(temax+eta.emax)
    ec50 =  exp(tec50 + eta.ec50)
    kout = exp(tkout + eta.kout)
    e0 = exp(te0 + eta.e0)
    ##
    DCP = center/v
    PD=1-emax*DCP/(ec50+DCP)
    ##
    effect(0) = e0
    kin = e0*kout
    ##
    d/dt(depot) = -ktr * depot
    d/dt(gut) =  ktr * depot -ka * gut
    d/dt(center) =  ka * gut - cl / v * center
    d/dt(effect) = kin*PD -kout*effect
    ##
    cp = center / v
    cp ~ prop(prop.err) + add(pkadd.err)
    effect ~ add(pdadd.err) | pca
  })
}

Next you have to figure out the command to run NONMEM (it is often useful to use the full command path). You can set it in options("babelmixr2.nonmem"="nmfe743") or use nonmemControl(runCommand="nmfe743"). I prefer the options() method since you only need to set it once. This could also be a function if you prefer (but I will not cover using the function here).

Lets assume you have NONMEM setup appropriately. To run the nlmixr2 model using NONMEM you simply can run it directly:

testthat::expect_error(nlmixr(pk.turnover.emax3, nlmixr2data::warfarin, "nonmem",
                              nonmemControl(readRounding=FALSE, modelName="pk.turnover.emax3")))

## 
## 
##  WARNINGS AND ERRORS (IF ANY) FOR PROBLEM    1
## 
##  (WARNING  2) NM-TRAN INFERS THAT THE DATA ARE POPULATION.
## 
## 
## 0MINIMIZATION TERMINATED
##  DUE TO ROUNDING ERRORS (ERROR=134)
##  NO. OF FUNCTION EVALUATIONS USED:     1088
##  NO. OF SIG. DIGITS UNREPORTABLE
## 0PARAMETER ESTIMATE IS NEAR ITS BOUNDARY
## 
## nonmem model: 'pk.turnover.emax3-nonmem/pk.turnover.emax3.nmctl'

## → terminated with rounding errors, can force nlmixr2/rxode2 to read with nonmemControl(readRounding=TRUE)

## Error : nonmem minimization not successful

Note that a few options you may note in the nonmemControl() here is modelName which helps control the output directory of NONMEM (if not specified babelmixr2 tries to guess based on the model name based on the input).

Now if you wanted, you could do the standard approach of changing sigdig, sigl, tol etc, to get a successful NONMEM model convergence, of course that is supported.

One of the other approaches is to ignore the rounding errors that have occurred and read into nlmixr2 anyway:

# Can still load the model to get information (possibly pipe) and create a new model
f <- nlmixr(pk.turnover.emax3, nlmixr2data::warfarin, "nonmem",
            nonmemControl(readRounding=TRUE, modelName="pk.turnover.emax3"))

## → loading into symengine environment...

## → pruning branches (`if`/`else`) of full model...

## ✔ done

## → finding duplicate expressions in EBE model...

## [====|====|====|====|====|====|====|====|====|====] 0:00:00

## → optimizing duplicate expressions in EBE model...

## [====|====|====|====|====|====|====|====|====|====] 0:00:00

## → compiling EBE model...

## ✔ done

## → Calculating residuals/tables

## ✔ done

## → compress origData in nlmixr2 object, save 27560

## → compress parHist in nlmixr2 object, save 4760

You may see more work happening than you expected to need for an already completed model. When reading in a NONMEM model, babelmixr2 grabs:

NONMEM’s objective function value
NONMEM’s covariance (if available)
NONMEM’s optimization history
NONMEM’s final parameter estimates (including the ETAs)
NONMEM’s PRED and IPRED values (for validation purposes)

These are used to solve the ODEs as if they came from an nlmixr2 optimization procedure.

This means that you can compare the IPRED and PRED values of nlmixr2/rxode2 and know immediately if your model validates. This is similar to the procedure Kyle Baron advocates for validating a NONMEM model against a mrgsolve model (see https://mrgsolve.org/blog/posts/2022-05-validate-translation/).

The advantage of this method is that you need to simply write one model to get a validated roxde2/nlmixr2 model.

In this case you can see the validation when you print the fit object:

print(f)

## ── nlmixr² nonmem ver 7.4.3 ──
## 
##                 OBJF      AIC      BIC Log-likelihood Condition Number
## nonmem focei 1326.91 2252.605 2332.025      -1107.302               NA
## 
## ── Time (sec $time): ──
## 
##         setup optimize covariance table other
## elapsed 0.004    0.001      0.001  0.06 6.914
## 
## ── Population Parameters ($parFixed or $parFixedDf): ──
## 
##                Est. Back-transformed BSV(CV% or SD) Shrink(SD)%
## tktr       6.24e-07                1           86.5      59.8% 
## tka       -3.01e-06                1           86.5      59.8% 
## tcl              -2            0.135           28.6      1.34% 
## tv             2.05             7.78           22.8      6.44% 
## prop.err     0.0986           0.0986                           
## pkadd.err     0.512            0.512                           
## temax          6.42            0.998           3.00      100.% 
## tec50         0.141             1.15           45.0      6.06% 
## tkout         -2.95           0.0522           9.16      32.4% 
## te0            4.57             96.6           5.24      18.1% 
## pdadd.err      3.72             3.72                           
##  
##   No correlations in between subject variability (BSV) matrix
##   Full BSV covariance ($omega) or correlation ($omegaR; diagonals=SDs) 
##   Distribution stats (mean/skewness/kurtosis/p-value) available in $shrink 
##   Information about run found ($runInfo):
##    • NONMEM terminated due to rounding errors, but reading into nlmixr2/rxode2 anyway 
##   Censoring ($censInformation): No censoring
##   Minimization message ($message):  
##     
## 
##  WARNINGS AND ERRORS (IF ANY) FOR PROBLEM    1
## 
##  (WARNING  2) NM-TRAN INFERS THAT THE DATA ARE POPULATION.
## 
##     
## 0MINIMIZATION TERMINATED
##  DUE TO ROUNDING ERRORS (ERROR=134)
##  NO. OF FUNCTION EVALUATIONS USED:     1088
##  NO. OF SIG. DIGITS UNREPORTABLE
## 0PARAMETER ESTIMATE IS NEAR ITS BOUNDARY
## 
##     IPRED relative difference compared to Nonmem IPRED: 0%; 95% percentile: (0%,0%); rtol=7.3e-06
##     PRED relative difference compared to Nonmem PRED: 0%; 95% percentile: (0%,0%); rtol=6.57e-06
##     IPRED absolute difference compared to Nonmem IPRED: atol=7.97e-05; 95% percentile: (2.18e-06, 0.00064)
##     PRED absolute difference compared to Nonmem PRED: atol=6.57e-06; 95% percentile: (2.75e-07,0.00337)
##     there are solving errors during optimization (see '$prderr')
##     nonmem model: 'pk.turnover.emax3-nonmem/pk.turnover.emax3.nmctl' 
## 
## ── Fit Data (object is a modified tibble): ──
## # A tibble: 483 × 35
##   ID     TIME CMT      DV  PRED   RES IPRED   IRES  IWRES eta.ktr eta.ka eta.cl
##   <fct> <dbl> <fct> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>   <dbl>  <dbl>  <dbl>
## 1 1       0.5 cp      0    1.16 -1.16 0.444 -0.444 -0.864  -0.506 -0.506  0.699
## 2 1       1   cp      1.9  3.37 -1.47 1.45   0.446  0.840  -0.506 -0.506  0.699
## 3 1       2   cp      3.3  7.51 -4.21 3.96  -0.660 -1.03   -0.506 -0.506  0.699
## # … with 480 more rows, and 23 more variables: eta.v <dbl>, eta.emax <dbl>,
## #   eta.ec50 <dbl>, eta.kout <dbl>, eta.e0 <dbl>, cp <dbl>, depot <dbl>,
## #   gut <dbl>, center <dbl>, effect <dbl>, ktr <dbl>, ka <dbl>, cl <dbl>,
## #   v <dbl>, emax <dbl>, ec50 <dbl>, kout <dbl>, e0 <dbl>, DCP <dbl>, PD <dbl>,
## #   kin <dbl>, tad <dbl>, dosenum <dbl>

That is in this case:

    IPRED relative difference compared to Nonmem IPRED: 0%; 95% percentile: (0%,0%); rtol=7.3e-06
    PRED relative difference compared to Nonmem PRED: 0%; 95% percentile: (0%,0%); rtol=6.57e-06
    IPRED absolute difference compared to Nonmem IPRED: atol=7.97e-05; 95% percentile: (2.18e-06, 0.00064)
    PRED absolute difference compared to Nonmem PRED: atol=6.57e-06; 95% percentile: (2.75e-07,0.00337)

Which means there are very few differences between the predictions of rxode2 and NONMEM, or this model is “validated”.

Since it is a nlmixr2 fit, you can do interesting things with this fit that you couldn’t do in NONMEM or even in another translator. For example, if you wanted to add a covariance step you can with getVarCov():

getVarCov(f)

## → loading into symengine environment...

## → pruning branches (`if`/`else`) of full model...

## ✔ done

## → calculate jacobian

## → calculate sensitivities

## → calculate ∂(f)/∂(η)

## → finding duplicate expressions in inner model...

## → optimizing duplicate expressions in inner model...

## → finding duplicate expressions in EBE model...

## → optimizing duplicate expressions in EBE model...

## → compiling inner model...

## ✔ done

## → finding duplicate expressions in FD model...

## → optimizing duplicate expressions in FD model...

## → compiling EBE model...

## ✔ done

## → compiling events FD model...

## ✔ done

## calculating covariance matrix

## Warning in foceiFitCpp_(.ret): using R matrix to calculate covariance, can check
## sandwich or S matrix with $covRS and $covS

## Warning in foceiFitCpp_(.ret): gradient problems with covariance; see $scaleInfo

## → compress origData in nlmixr2 object, save 27560

## Updated original fit object f

##                tktr           tka           tcl           tv         temax
## tktr   1.892598e-02 -1.582985e-02 -1.981657e-05 3.266277e-04  0.0015335469
## tka   -1.582985e-02  1.888286e-02 -2.652577e-05 3.175901e-04  0.0011916368
## tcl   -1.981657e-05 -2.652577e-05  2.505341e-04 1.152329e-05 -0.0008937098
## tv     3.266277e-04  3.175901e-04  1.152329e-05 3.202883e-04  0.0011777851
## temax  1.533547e-03  1.191637e-03 -8.937098e-04 1.177785e-03  7.6242618702
## tec50  1.333488e-04  1.435212e-04 -3.647821e-04 1.262144e-04  0.0490792404
## tkout  1.033562e-04  1.030440e-04 -9.918052e-05 1.201488e-04 -0.0189849996
## te0    1.506058e-05  1.176585e-05 -9.650248e-06 1.229662e-05 -0.0004769028
##               tec50         tkout           te0
## tktr   0.0001333488  1.033562e-04  1.506058e-05
## tka    0.0001435212  1.030440e-04  1.176585e-05
## tcl   -0.0003647821 -9.918052e-05 -9.650248e-06
## tv     0.0001262144  1.201488e-04  1.229662e-05
## temax  0.0490792404 -1.898500e-02 -4.769028e-04
## tec50  0.0018652677  1.582355e-04 -1.380255e-04
## tkout  0.0001582355  6.353965e-04  5.249358e-05
## te0   -0.0001380255  5.249358e-05  8.894088e-05

nlmixr2 is more generous in what constitutes a covariance step. The r,s covariance matrix is the “most” successful covariance step for focei, but the system will fall back to other methods if necessary.

While this covariance matrix is not r,s, and should be regarded with caution, it can still give us some clues on why this things are not working in NONMEM.

When examining the fit, you can see the shrinkage is high for temax, tktr and tka, so they could be dropped, makiing things more likely to converge in NONMEM.

If we use model piping to remove the parameters, the new run will start at the last model’s best estimates (saving a bunch of model development time).

In this case, I specify the output directory pk.turnover.emax4 with the control and get the following:

f2 <- f %>% model(ktr <- exp(tktr)) %>%
  model(ka <- exp(tka)) %>%
  model(emax = expit(temax)) %>%
  nlmixr(data=nlmixr2data::warfarin, est="nonmem",
         control=nonmemControl(readRounding=FALSE,
                               modelName="pk.turnover.emax4"))

## ! remove between subject variability `eta.ktr`

## ! remove between subject variability `eta.ka`

## ! remove between subject variability `eta.emax`

## → loading into symengine environment...

## → pruning branches (`if`/`else`) of full model...

## ✔ done

## → finding duplicate expressions in EBE model...

## → optimizing duplicate expressions in EBE model...

## → compiling EBE model...

## ✔ done

## → Calculating residuals/tables

## ✔ done

## → compress origData in nlmixr2 object, save 27560

## → compress parHist in nlmixr2 object, save 7448

You can see the NONMEM run is now successful and validates against the rxode2 model below:

f2

## ── nlmixr² nonmem ver 7.4.3 ──
## 
##                  OBJF      AIC      BIC Log-likelihood Condition Number
## nonmem focei 1418.923 2338.618 2405.498      -1153.309     1.852796e+16
## 
## ── Time (sec f2$time): ──
## 
##         setup table compress other
## elapsed 0.003  0.05     0.02 6.347
## 
## ── Population Parameters (f2$parFixed or f2$parFixedDf): ──
## 
##                Est.       SE     %RSE     Back-transformed(95%CI) BSV(CV%)
## tktr       6.24e-07 9.05e-05 1.45e+04                    1 (1, 1)         
## tka       -3.57e-06 0.000153 4.29e+03                    1 (1, 1)         
## tcl           -1.99   0.0639      3.2         0.136 (0.12, 0.154)     27.6
## tv             2.05     2.66      130      7.76 (0.042, 1.44e+03)     23.6
## prop.err      0.161                                         0.161         
## pkadd.err     0.571                                         0.571         
## temax          9.98     4.96     49.7                1 (0.565, 1)         
## tec50         0.131     1.61 1.23e+03         1.14 (0.0489, 26.6)     43.6
## tkout         -2.96     28.3      954 0.0517 (4.63e-26, 5.77e+22)     8.63
## te0            4.57    0.411        9            96.7 (43.2, 217)     5.19
## pdadd.err      3.59                                          3.59         
##           Shrink(SD)%
## tktr                 
## tka                  
## tcl            3.19% 
## tv             10.7% 
## prop.err             
## pkadd.err            
## temax                
## tec50          7.12% 
## tkout          33.8% 
## te0            17.2% 
## pdadd.err            
##  
##   Covariance Type (f2$covMethod): nonmem.r,s
##   No correlations in between subject variability (BSV) matrix
##   Full BSV covariance (f2$omega) or correlation (f2$omegaR; diagonals=SDs) 
##   Distribution stats (mean/skewness/kurtosis/p-value) available in f2$shrink 
##   Censoring (f2$censInformation): No censoring
##   Minimization message (f2$message):  
##     
## 
##  WARNINGS AND ERRORS (IF ANY) FOR PROBLEM    1
## 
##  (WARNING  2) NM-TRAN INFERS THAT THE DATA ARE POPULATION.
## 
##     
## 0MINIMIZATION SUCCESSFUL
##  HOWEVER, PROBLEMS OCCURRED WITH THE MINIMIZATION.
##  REGARD THE RESULTS OF THE ESTIMATION STEP CAREFULLY, AND ACCEPT THEM ONLY
##  AFTER CHECKING THAT THE COVARIANCE STEP PRODUCES REASONABLE OUTPUT.
##  NO. OF FUNCTION EVALUATIONS USED:     2391
##  NO. OF SIG. DIGITS IN FINAL EST.:  4.1
## 
##     IPRED relative difference compared to Nonmem IPRED: 0%; 95% percentile: (0%,0%); rtol=7.82e-06
##     PRED relative difference compared to Nonmem PRED: 0%; 95% percentile: (0%,0%); rtol=7.17e-06
##     IPRED absolute difference compared to Nonmem IPRED: atol=7.42e-05; 95% percentile: (2.13e-06, 0.000645)
##     PRED absolute difference compared to Nonmem PRED: atol=7.17e-06; 95% percentile: (3.11e-07,0.00342)
##     nonmem model: 'pk.turnover.emax4-nonmem/pk.turnover.emax4.nmctl' 
## 
## ── Fit Data (object f2 is a modified tibble): ──
## # A tibble: 483 × 32
##   ID     TIME CMT      DV  PRED   RES IPRED   IRES IWRES eta.cl eta.v eta.ec50
##   <fct> <dbl> <fct> <dbl> <dbl> <dbl> <dbl>  <dbl> <dbl>  <dbl> <dbl>    <dbl>
## 1 1       0.5 cp      0    1.16 -1.16 0.920 -0.920 -1.56  0.689 0.228    0.160
## 2 1       1   cp      1.9  3.38 -1.48 2.68  -0.780 -1.09  0.689 0.228    0.160
## 3 1       2   cp      3.3  7.53 -4.23 5.94  -2.64  -2.36  0.689 0.228    0.160
## # … with 480 more rows, and 20 more variables: eta.kout <dbl>, eta.e0 <dbl>,
## #   cp <dbl>, depot <dbl>, gut <dbl>, center <dbl>, effect <dbl>, ktr <dbl>,
## #   ka <dbl>, cl <dbl>, v <dbl>, emax <dbl>, ec50 <dbl>, kout <dbl>, e0 <dbl>,
## #   DCP <dbl>, PD <dbl>, kin <dbl>, tad <dbl>, dosenum <dbl>

One thing to emphasize: unlike other translators, you will know immediately if the translation is off because the model will not validate. Hence you can start this process with confidence - you will know immediately if something is wrong.

Conclusion

The first release of babelmixr2 includes a NONMEM translation function. The advantages of this are:

For nlmixr2 development, we can easily compare to NONMEM to see how we’re doing with respect to the current gold standard.
For people who are using rxode2 and NONMEM, writing a model with nlmixr2 syntax and using it to run NONMEM will let you only write one model, and save you time debugging and coding it yourself.
For pharmacometricians using NONMEM, you can take an unsuccessful NONMEM fit, get information (covariance shrinkage, etc) about the model and you will be able to make informed decisions on how to proceed.

Many of these advantages come from the fact that babelmixr2 leans into supporting nlmixr2 development for those fluent in NONMEM and having nlmixr2 available can help pharmacometricans in daily tasks, even when they need to use another tool.

The astute reader will also notice that the full model runs in nlmixr2’s focei without adjustment. I would like to caution that this doesn’t mean that nlmixr2’s focei is better: rather, it is different (as mentioned in a previous blog post). I have seen cases in the past where something runs better in NONMEM than nlmixr2 so comparisons based on a single model should be regarded with caution (I no longer have these examples available, though, soyou’ll have to take my word for it).

Thanks for reading!

Posted on:: November 11, 2022

Length:: 14 minute read, 2783 words

Categories:: babelmixr2

Tags:: new-version NONMEM

See Also:: nlmixr2 2.1.2/ rxode2 2.1.3; nlmixr2 2.1.0/ rxode2 2.1.1; nonmem2rx and babelmixr2

How to use NONMEM with nlmixr2

Conclusion

How to use `NONMEM` with `nlmixr2`