Pub. Date:
Elements of Information Theory / Edition 2

Elements of Information Theory / Edition 2

by Thomas M. Cover, Joy A. Thomas
Current price is , Original price is $124.0. You

Temporarily Out of Stock Online

Please check back later for updated availability.

Product Details

ISBN-13: 9780471241959
Publisher: Wiley
Publication date: 07/18/2006
Series: Wiley Series in Telecommunications and Signal Processing , #40
Edition description: Revised Edition
Pages: 776
Sales rank: 1,048,524
Product dimensions: 6.40(w) x 9.50(h) x 1.62(d)

About the Author

THOMAS M. COVER, PHD, is Professor in the departments of electrical engineering and statistics, Stanford University. A recipient of the 1991 IEEE Claude E. Shannon Award, Dr. Cover is a past president of the IEEE Information Theory Society, a Fellow of the IEEE and the Institute of Mathematical Statistics, and a member of the National Academy of Engineering and the American Academy of Arts and Science. He has authored more than 100 technical papers and is coeditor of Open Problems in Communication and Computation.

JOY A. THOMAS, PHD, is the Chief Scientist at Stratify, Inc., a Silicon Valley start-up specializing in organizing unstructured information. After receiving his PhD at Stanford, Dr. Thomas spent more than nine years at the IBM T. J. Watson Research Center in Yorktown Heights, New York. Dr. Thomas is a recipient of the IEEE Charles LeGeyt Fortescue Fellowship.

Read an Excerpt

Chapter 1: Introduction and Preview

This "first and last lecture" chapter goes backwards and forwards through information theory and its naturally related ideas. The full definitions and study of the subject begin in Chapter 2.

Information theory answers two fundamental questions in communication theory: what is the ultimate data compression (answer: the entropy H), and what is the ultimate transmission rate of communication (answer: the channel capacity C). For this reason some consider information theory to be a subset of communication theory. We will argue that it is much more. Indeed, it has fundamental contributions to make in statistical physics (thermodynamics), computer science (Kolmogorov complexity or algorithmic complexity), statistical inference (Occam's Razor: "The simplest explanation is best") and to probability and statistics (error rates for optimal hypothesis testing and estimation).

Figure 1.1 illustrates the relationship of information theory to other fields. As the figure suggests, information theory intersects physics (statistical mechanics), mathematics (probability theory), electrical engineering (communication theory) and computer science (algorithmic complexity). We now describe the areas of intersection in greater detail:

Electrical Engineering (Communication Theory). In the early 1940s, it was thought that increasing the transmission rate of information over a communication channel increased the probability of error. Shannon surprised the communication theory community by proving that this was not true as long as the communication rate was below channel capacity. The capacity can be simply computed from the noise characteristics of the channel. Shannon further argued that random processes such as music and speech have an irreducible complexity below which the signal cannot be compressed. This he named the entropy, in deference to the parallel use of this word in thermodynamics, and argued that if the entropy of the source is less than the capacity of the channel, then asymptotically error free communication can be achieved.

Information theory today represents the extreme points of the set of all possible communication schemes, as shown in the fanciful Figure 1.2. The data compression minimum I(X; X) lies at one extreme of the set of communication ideas. All data compression schemes require description rates at least equal to this minimum. At the other extreme is the data transmission maximum I(X; Y), known as the channel capacity. Thus all modulation schemes and data compression schemes lie between these limits.

Information theory also suggests means of achieving these ultimate limits of communication. However, these theoretically optimal communication schemes, beautiful as they are, may turn out to be computationally impractical. It is only because of the computational feasibility of simple modulation and demodulation schemes that we use them rather than the random coding and nearest neighbor decoding rule suggested by Shannon's proof of the channel capacity theorem. Progress in integrated circuits and code design has enabled us to reap some of the gains suggested by Shannon's theory. A good example of an application of the ideas of information theory is the use of error correcting codes on compact discs.

Modern work on the communication aspects of information theory has concentrated on network information theory: the theory of the simultaneous rates of communication from many senders to many receivers in a communication network. Some of the trade-offs of rates between senders and receivers are unexpected, and all have a certain mathematical simplicity. A unifying theory, however, remains to be found.

Computer Science (Kolmogorov Complexity). Kolmogorov, Chaitin and Solomonoff put forth the idea that the complexity of a string of data can be defined by the length of the shortest binary program for computing the string. Thus the complexity is the minimal description length. This definition of complexity turns out to be universal, that is, computer independent, and is of fundamental importance. Thus Kolmogorov complexity lays the foundation for the theory of descriptive complexity. Gratifyingly, the Kolmogorov complexity K is approximately equal to the Shannon entropy H if the sequence is drawn at random from a distribution that has entropy H. So the tie-in between information theory and Kolmogorov complexity is perfect. Indeed, we consider Kolmogorov complexity to be more fundamental than Shannon entropy. It is the ultimate data compression and leads to a logically consistent procedure for inference.

There is a pleasing complementary relationship between algorithmic complexity and computational complexity. One can think about computational complexity (time complexity) and Kolmogorov complexity (program length or descriptive complexity) as two axes corresponding to program running time and program length. Kolmogorov complexity focuses on minimizing along the second axis, and computational complexity focuses on minimizing along the first axis. Little work has been done on the simultaneous minimization of the two...

Table of Contents

Preface to the Second Edition.

Preface to the First Edition.

Acknowledgments for the Second Edition.

Acknowledgments for the First Edition.

1. Introduction and Preview.

2. Entropy, Relative Entropy, and Mutual Information.

3. Asymptotic Equipartition Property.

4. Entropy Rates of a Stochastic Process.

5. Data Compression.

6. Gambling and Data Compression.

7. Channel Capacity.

8. Differential Entropy.

9. Gaussian Channel.

10. Rate Distortion Theory.

11. Information Theory and Statistics.

12. Maximum Entropy.

13. Universal Source Coding.


14. Kolmogorov Complexity.

15. Network Information Theory.

16. Information Theory and Portfolio Theory.

17. Inequalities in Information Theory.


List of Symbols.


What People are Saying About This

From the Publisher

"As expected, the quality of exposition continues to be a high point of the book. Clear explanations, nice graphical illustrations, and illuminating mathematical derivations make the book particularly useful as a textbook on information theory." (Journal of the American Statistical Association, March 2008)

"This book is recommended reading, both as a textbook and as a reference." (Computing, December 28, 2006)

Customer Reviews

Most Helpful Customer Reviews

See All Customer Reviews

Elements of Information Theory 3.7 out of 5 based on 0 ratings. 3 reviews.
billmcn on LibraryThing More than 1 year ago
I do research in statistical natural language processing and use this as my primary reference about information theory. Well-written, thorough, and accessible to anyone with a foundation in probability. The first three chapters alone are worth the price of the book.
Anonymous More than 1 year ago
This is a great book. It is a standard textbook for teaching information theory in many top-tier universities. The development of theories is clear together with some examples. This is a theoretical book definitely not a book for applied science.
Guest More than 1 year ago
This book is horrible. Theories are given, but no examples are provided. Please do not make students' life worst than it is now.