interpretations of them, showing how the answers fit together in various
ways. Along the way we speculate on the meaning of the second law of
thermodynamics. Does entropy always increase? The answer is yes and
no. This is the sort of result that should please experts in the area but
might be overlooked as standard by the novice.
In fact, that brings up a point that often occurs in teaching. It is fun
to find new proofs or slightly new results that no one else knows. When
one presents these ideas along with the established material in class, the
response is “sure, sure, sure.” But the excitement of teaching the material
is greatly enhanced. Thus we have derived great pleasure from investigating
a number of new ideas in this textbook.
Examples of some of the new material in this text include the chapter
on the relationship of information theory to gambling, the work on the
universality of the second law of thermodynamics in the context of Markov
chains, the joint typicality proofs of the channel capacity theorem, the
competitive optimality of Huffman codes, and the proof of Burg’s theorem
on maximum entropy spectral density estimation. Also, the chapter on
Kolmogorov complexity has no counterpart in other information theory
texts. We have also taken delight in relating Fisher information, mutual
information, the central limit theorem, and the Brunn–Minkowski and
entropy power inequalities. To our surprise, many of the classical results
on determinant inequalities are most easily proved using
information-theoretic inequalities.
Even though the field of information theory has grown considerably
since Shannon’s original paper, we have strived to emphasize its coherence.
While it is clear that Shannon was motivated by problems in communication
theory when he developed information theory, we treat information
theory as a field of its own with applications to communication theory
and statistics. We were drawn to the field of information theory from
backgrounds in communication theory, probability theory, and statistics,
because of the apparent impossibility of capturing the intangible concept
of information.
Since most of the results in the book are given as theorems and proofs,
we expect the elegance of the results to speak for itself. In many
cases we actually describe the properties of the solutions before the problems.
Again, the properties are interesting in themselves and provide a
natural rhythm for the proofs that follow.
One innovation in the presentation is our use of long chains of inequalities
with no intervening text, followed immediately by the explanations.
By the time the reader comes to many of these proofs, we expect that he
or she will be able to follow most of these steps without any explanation
and will be able to pick out the needed explanations. These chains of