nlmixr2 2.0.8 log-likelihood

By Matt Fidler and the nlmixr2 Development Team in nlmixr2

October 24, 2022

I am pretty excited abut the new nlmixr2 release (2.0.8). When I joined the the nlmixr2 team, I wanted to do a fancy heavy tailed, skewed model in an open source tool so I could figure out how to do even more with it.

With this release, it is possible to do a heavy tailed (t-distribution dt()) skewed (coxBox(lambda)) distribution: my old wish is now possible with focei!

A few other things that people may be interested in are:

  • Likelihood based each observation (and how to get it)

  • Standard Errors for etas (and how to get it)

  • More robust dose-based between subject variability (lag time, bioavailability, modeled rate/duration)

These will all be discussed in other posts (since this one is long already).

Testing Generalized Likelihood

With any new method I want to the results makes sense.

With that in mind, I thought I could use the models Rik used to compare the SAEM and FOCEi algorithms to Monolix and NONMEM, respectively (1) to check the generalized likelihood.

In these tests, we can modify the error model to use generalized likelihood by adding a + dnorm() to the end of the residual specification. This allows us to run a generalized likelihood and compare it to the compare the current focei .

As discussed in Rik’s paper, every model has estimated a central volume. So simply looking at Vc and how it behaves is a good surrogate of how well this method is doing.

First, we will compare the population estimated Vc values between the methods and against the true value of 70:

In my estimation, the Vc values are similar between the two methods and both vary (similarly) around the true value of 70. Incidentally both Monolix and NONMEM give similar findings here (1).

The next parameter to check is the between subject variability on Vc where the true value is 30%CV:

The estimates here also look fairly consistent (and reasonable).

Now lets examine the standard errors of the estimates:

## Warning: Removed 6 rows containing missing values (geom_point).

Here you can see:

  • The results are similar, but often higher for the log-likelihood estimation

  • The log-likelihood estimation is less likely to have a successful covariance (as measured by a r,s matrix)

If you allow all the covariance types returned, you can see a similar pattern:

The next question is how much (longer) does it take to run nlmixr2 with log-likelihood than with a standard normal model

This shows it takes anywhere from 0.8 to 7.2 fold longer to use the log-likelihood method than the standard focei method with these datasets.

This is because each individual has to estimate the eta Hessian numerically for each subject to calculate the overall objective function. The time difference is likely a function of:

  1. number of between subject varaibilities estimated

  2. number of subjects in the data-set

NOTE: Because the log-likelihood focei calculation is different than using the NONMEM approximation of the individual Hessian (as described by Almquist 2015 (2)), you should not do statistical comparisons between a log-likelihood model and a standard focei model; The calculations are different, and it is not clear that the differences are due to likelihood alone.

That being said, the generalized log-likelihood method does approximate the same likelihood as the NOMMEM-style focei. Hence, plotting the objective functions between the two methods give very similar values for each problem:

Overall Conclusions about nlmixr2 log-likelihood focei

  • The log-likelihood estimation procedure performs well for estimating population effects and between subject varaibilites

  • The log-likelihood estimation procedure does provide standard error estimates that are in the right general area, but tend to be a bit larger than the true values.

  • The procedure takes more time (0.8 to 7 fold more)

  • You should not compare objective functions between models that were estimated by general focei and log-likelihood focei.

Generalized likelihood methods in nlmixr2

More distributions

These (mostly) match all match the R distributions in the function. The left handed side of the equation (err) represent the compartment name for the DV in the dataset (if needed). It still needs the compartment defined even if it is a single endpoint model.

The only function that does not exactly the R documentation is the dnbinomMu probability function. It takes the size and mu as documented below and in the R documentation.

Distribution How to Add
Poision err ~ dpois(lamba)
Binomial err ~ dbinom(n, p)
Beta err ~ dbeta(alpha, beta)
Chisq err ~ chisq(nu)
Exponential err ~ dexp(r)
Uniform err ~ dunif(a, b)
Weibull err ~ dweibull(a, b)
Gamma err ~ dgamma(a, b)
Geom err ~ dgeom(a)
Negative Binomial err ~ dnbinom(n, p)
Negative Binomial (mu version) err ~ dnbinomMu(size, mu)


Finally, ordinal likelihoods/simulations can be specified in 2 ways, the first is:

err ~ c(p0, p1, p2)

Here err represents the compartment and p0 is the probability of being in category:

Category Probability
1 p0
2 p1
3 p2
4 1-p0-p1-p2

It is up to the model to ensure that the sum of the p values are less than 1. Additionally you can write an arbitrary number of categories in the ordinal model described above.

It seems a little off that p0 is the probability for category 1 and sometimes scores are in non-whole numbers. This can be modeled as follows:

err ~ c(p0=0, p1=1, p2=2, 3)

Here the numeric categories are specified explicitly, and the probabilities remain the same:

Category Probability
0 p0
1 p1
2 p2
3 1-p0-p1-p2

Generalized likelihood

The generalized log-likelihood can be defined as follows:

ll(err) ~ log likelihood expression

Here err represents the compartment or dvid that specifies the log-likelihood. You still need to specify it even if it is a single endpoint model (like the distributions above).


Schoemaker R, Fidler M, Laveille C, Wilkins JJ, Hooijmaijers R, Post TM, et al. Performance of the SAEM and FOCEI Algorithms in the Open-Source, Nonlinear Mixed Effect Modeling Tool nlmixr. CPT: Pharmacometrics and Systems Pharmacology. 2019;8(12):923–30.
Almquist J, Leander J, Jirstrand M. Using sensitivity equations for computing gradients of the FOCE and FOCEI approximations to the population likelihood. Journal of Pharmacokinetics and Pharmacodynamics. 2015 Jun;42(3):191–209.
Posted on:
October 24, 2022
6 minute read, 1102 words
See Also:
nlmixr2 2.1.2/ rxode2 2.1.3
nlmixr2 2.1.0/ rxode2 2.1.1
nonmem2rx and babelmixr2