numerical optimisation methods are needed. These methods are iterative procedures that will ideally approach the optimal parameter values in a stepwise manner. At each step, the algorithms determine the new parameter values based on the data, the model, and the current parameter values. By far the most common algorithm for estimation in nonlinear regression is the Gauss-Newton method, which relies on linear approximations to the nonlinear mean function at each step (a minimal sketch in R follows the list below). For more details and explanations, see, for example, Bates and Watts (1988, pp. 32–66), Seber and Wild (1989, pp. 21–89), or Weisberg (2005, pp. 234–237). The numerical optimisation methods are not perfect. Two common complications when using them are:
• how to start the procedure, that is, how to choose the initial/starting parameter values
• how to ensure that the procedure reaches the global minimum rather than a local minimum
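To make this concrete, the sketch announced above shows a single Gauss-Newton update in R. The exponential model f(x; a, b) = a * exp(-b * x), the simulated data, and the starting values are purely illustrative and not taken from the text; nls() carries out such steps internally, together with safeguards such as step halving.

## One Gauss-Newton step for the illustrative model f(x; a, b) = a * exp(-b * x)
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- 2 * exp(-0.5 * x) + rnorm(50, sd = 0.1)   # simulated data

beta <- c(a = 1.5, b = 0.7)                    # current parameter values
f <- beta["a"] * exp(-beta["b"] * x)           # mean function at beta
r <- y - f                                     # current residuals
## Jacobian of the mean function with respect to (a, b), evaluated at beta
J <- cbind(a = exp(-beta["b"] * x),
           b = -beta["a"] * x * exp(-beta["b"] * x))
## The update is the least-squares solution for the linearised model
beta <- beta + drop(solve(crossprod(J), crossprod(J, r)))
beta  # typically moved towards (2, 0.5), the values used in the simulation

Repeating this update until the parameter values stabilise is, in essence, what declaring convergence amounts to.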
These two issues are interrelated. If the initial parameter values are sufficiently close to the optimal parameter values, then the procedure will usually get closer and closer to the optimal parameter values (the algorithm is said to converge) within a few steps. Therefore, it is very important to provide sensible starting parameter values. Poorly chosen starting values, on the other hand, will often lead the procedure astray, so that no useful model fit is obtained. If lack of convergence persists regardless of the choice of starting values, then it typically indicates that the model in its present form is not appropriate for the data at hand (if possible, you could try fitting a related but simpler model).
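As a small illustration of the role of starting values (the data and the exponential-decay model are simulated for this sketch and not taken from the text; nls() itself is introduced in Section 2.2 below), starting values are passed to nls() through its start argument:

set.seed(2)
x <- seq(0, 10, length.out = 50)
y <- 2 * exp(-0.5 * x) + rnorm(50, sd = 0.1)  # simulated data

## Sensible starting values: the algorithm converges within a few iterations
fit <- nls(y ~ a * exp(-b * x), start = list(a = 1, b = 1))
coef(fit)

## Poor starting values: the procedure may be led astray and will often
## stop with an error instead of returning a useful fit
bad <- try(nls(y ~ a * exp(-b * x), start = list(a = 100, b = -1)))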
As the solutions to nonlinear regression problems are numerical, they may differ as a consequence of different algorithms, different implementations of the same algorithm (for example, different criteria for declaring convergence, or whether first derivatives are computed numerically or supplied as explicit expressions), different parameterisations, or different starting values. However, the resulting parameter estimates will often not differ much. If there are large discrepancies, then it may indicate that a simpler model should be preferred.
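To illustrate (continuing the simulated example above; the model and data are ours, not the text's), the same model can be fitted with the default Gauss-Newton algorithm and with the alternative "port" algorithm, and the resulting estimates compared:

## The same model fitted with two different algorithms
fit_gn <- nls(y ~ a * exp(-b * x), start = list(a = 1, b = 1))
fit_port <- nls(y ~ a * exp(-b * x), start = list(a = 1, b = 1),
                algorithm = "port")
## The estimates typically agree to several decimal places
cbind(gauss_newton = coef(fit_gn), port = coef(fit_port))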
Once the parameter estimates $\hat{\beta}$ are found, the estimate of the residual variance $\sigma^2$ is obtained as the minimum value of RSS (attained when the parameter estimates are inserted) divided by the degrees of freedom $n - p$, giving the estimate

$$s^2 = \frac{\mathrm{RSS}(\hat{\beta})}{n - p}$$

(Fox, 2002). The residual standard error is then $s$.
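These quantities are easy to recover from a fitted model object. For the illustrative fit fit_gn above, deviance() returns the minimum RSS and df.residual() returns $n - p$:

## Residual variance and residual standard error computed by hand
rss <- deviance(fit_gn)     # RSS evaluated at the parameter estimates
dfr <- df.residual(fit_gn)  # degrees of freedom, n - p
s2 <- rss / dfr             # estimate of the residual variance sigma^2
sqrt(s2)                    # the residual standard error s

The same value is reported by summary(fit_gn)$sigma.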
2.2 Getting started with nls()
In this section, we will introduce the key player in this book, the model fitting function nls() (Bates and Chambers, 1992), which comes with the standard installation of R (in the package stats). We go through an example showing how to fit a model, how to obtain parameter estimates, predictions, and