in the time series analysis literature; some pointers to this literature are given
in Appendix B.
The book is primarily intended for graduate students and researchers in
machine learning at departments of Computer Science, Statistics and Applied
Mathematics. As prerequisites we require a good basic grounding in calculus,
linear algebra and probability theory as would be obtained by graduates in numerate
disciplines such as electrical engineering, physics and computer science.
For preparation in calculus and linear algebra any good university-level textbook
on mathematics for physics or engineering such as Arfken [1985] would
be fine. For probability theory some familiarity with multivariate distributions
(especially the Gaussian) and conditional probability is required. Some background
mathematical material is also provided in Appendix A.
The main focus of the book is to present clearly and concisely an overview
of the main ideas of Gaussian processes in a machine learning context. We have
also covered a wide range of connections to existing models in the literature,
and cover approximate inference for faster practical algorithms. We have presented
detailed algorithms for many methods to aid the practitioner. Software
implementations are available from the website for the book, see Appendix C.
We have also included a small set of exercises in each chapter; we hope these
will help in gaining a deeper understanding of the material.
In order to limit the size of the volume, we have had to omit some topics, such
as, for example, Markov chain Monte Carlo methods for inference. One of the
most difficult things to decide when writing a book is what sections not to write.
Within sections, we have often chosen to describe one algorithm in particular
in depth, and mention related work only in passing. Although this causes the
omission of some material, we feel it is the best approach for a monograph, and
hope that the reader will gain a general understanding so as to be able to push
further into the growing literature of GP models.
The book has a natural split into two parts, with the chapters up to and
including chapter 5 covering core material, and the remaining sections covering
the connections to other methods, fast approximations, and more specialized
properties. Some sections are marked by an asterisk. These sections may be
omitted on a first reading, and are not pre-requisites for later (un-starred)
material.
We wish to express our considerable gratitude to the many people with
whom we have interacted during the writing of this book. In particular, Moray
Allan, David Barber, Peter Bartlett, Miguel Carreira-Perpiñán, Marcus Gallagher,
Manfred Opper, Anton Schwaighofer, Matthias Seeger, Hanna Wallach,
Joe Whittaker, and Andrew Zisserman all read parts of the book and provided
valuable feedback. Dilan Görür, Malte Kuss, Iain Murray, Joaquin Quiñonero-Candela,
Leif Rasmussen and Sam Roweis were especially heroic and provided
comments on the whole manuscript. We thank Chris Bishop, Miguel Carreira-Perpiñán,
Nando de Freitas, Zoubin Ghahramani, Peter Grünwald, Mike Jordan,
John Kent, Radford Neal, Joaquin Quiñonero-Candela, Ryan Rifkin, Stefan
Schaal, Anton Schwaighofer, Matthias Seeger, Peter Sollich, Ingo Steinwart,