Computers are everywhere. Some of them are highly visible, in laptops, tablets, cell phones, and smart watches. But most are invisible, like those in appliances, cars, medical equipment, transportation systems, power grids, and weapons. We never see the myriad computers that quietly collect, share, and sometimes leak vast amounts of personal data about us. Through computers, governments and companies increasingly monitor what we do. Social networks and advertisers know far more about us than we should be comfortable with, using information we freely give them. Criminals have all-too-easy access to our data. Do we truly understand the power of computers in our world?
Understanding the Digital World explains how computer hardware, software, networks, and systems work. Topics include how computers are built and how they compute; what programming is and why it is difficult; how the Internet and the web operate; and how all of these affect our security, privacy, property, and other important social, political, and economic issues. This book also touches on fundamental ideas from computer science and some of the inherent limitations of computers. It includes numerous color illustrations, notes on sources for further exploration, and a glossary to explain technical terms and buzzwords.
Understanding the Digital World is a must-read for all who want to know more about computers and communications. It explains, precisely and carefully, not only how they operate but also how they influence our daily lives, in terms anyone can understand, no matter what their experience and knowledge of technology.
Publisher: Princeton University Press
Product dimensions: 7.10(w) x 10.10(h) x 0.90(d)
Read an Excerpt
Understanding the Digital World
What You Need to Know about Computers, the Internet, Privacy, and Security
By Brian W. Kernighan
PRINCETON UNIVERSITY PRESS
Copyright © 2017 Princeton University Press
All rights reserved.
What's in a Computer?
"Inasmuch as the completed device will be a general-purpose computing machine it should contain certain main organs relating to arithmetic, memory-storage, control and connection with the human operator."
Arthur W. Burks, Herman H. Goldstine, John von Neumann, "Preliminary discussion of the logical design of an electronic computing instrument," 1946.
Let's begin our discussion of hardware with an overview of what's inside a computer. We can look at a computer from at least two viewpoints: the logical or functional organization — what the pieces are, what they do and how they are connected — and the physical structure — what the pieces look like and how they are built. The goal of this chapter is to see what's inside, learn roughly what each part does, and get some sense of what the myriad acronyms and numbers mean.
Think about your own computing devices. Many readers will have some kind of "PC," that is, a laptop or desktop computer descended from the Personal Computer that IBM first sold in 1981, running some version of the Windows operating system from Microsoft. Others will have an Apple Macintosh that runs a version of the Mac OS X operating system. Still others might have a Chromebook or similar laptop that relies on the Internet for storage and computation. More specialized devices like smartphones, tablets and ebook readers are also powerful computers. These all look different and when you use them they feel different as well, but underneath the skin, they are fundamentally the same. We'll talk about why.
There's a loose analogy to cars. Functionally, cars have been the same for over a hundred years. A car has an engine that uses some kind of fuel to make the engine run and the car move. It has a steering wheel that the driver uses to control the car. There are places to store the fuel and places to store the passengers and their goods. Physically, however, cars have changed greatly over a century: they are made of different materials, and they are faster, safer, and much more reliable and comfortable. There's a world of difference between my first car, a well-used 1959 Volkswagen Beetle, and a Ferrari, but either one will carry me and my groceries home from the store or across the country, and in that sense they are functionally the same. (For the record, I have never even sat in a Ferrari, let alone owned one, so I'm speculating about whether there's room for the groceries.)
The same is true of computers. Logically, today's computers are very similar to those of the 1950s, but the physical differences go far beyond the kinds of changes that have occurred with the automobile. Today's computers are much smaller, cheaper, faster and more reliable than those of 50 years ago, literally a million times better in some properties. Such improvements are the fundamental reason why computers are so pervasive.
The distinction between the functional behavior of something and its physical properties — the difference between what it does and how it's built or works inside — is an important idea. For computers, the "how it's built" part changes at an amazing rate, as does how fast it runs, but the "how it does what it does" part is quite stable. This distinction between an abstract description and a concrete implementation will come up repeatedly in what follows.
I sometimes do a survey in my class in the first lecture. How many have a PC? How many have a Mac? The ratio was fairly constant at 10 to 1 in favor of PCs in the first half of the 2000s, but changed rapidly over a few years, to the point where Macs now account for well over three quarters of the computers. This is not typical of the world at large, however, where PCs dominate by a wide margin.
Is the ratio unbalanced because one is superior to the other? If so, what changed so dramatically in such a short time? I ask my students which kind is better, and for objective criteria on which to base that opinion. What led you to your choice when you bought your computer?
Naturally, price is one answer. PCs tend to be cheaper, the result of fierce competition in a marketplace with many suppliers. A wider range of hardware add-ons, more software, and more expertise are all readily available. This is an example of what economists call a network effect: the more other people use something, the more useful it will be for you, roughly in proportion to how many others there are.
On the Mac side are perceived reliability, quality, esthetics, and a sense that "things just work," for which many consumers are willing to pay a premium.
The debate goes on, with neither side convincing the other, but it raises some good questions and helps to get people thinking about what is different between different kinds of computing devices and what is really the same.
There's an analogous debate about phones. Almost everyone has a "smart phone" that can run programs ("apps") downloaded from Apple's App Store or Google's Play Store. The phone serves as a browser, a mail system, a watch, a camera, a music and video player, a comparison shopping tool, and even occasionally a device for conversation. Typically about three quarters of the students have an iPhone from Apple; almost all the rest have an Android phone from one of many suppliers. A tiny fraction might have a Windows phone, and (rarely) someone admits to having only a "feature phone," which is defined as a phone that has no features beyond the ability to make phone calls. My sample is for the US and a comparatively affluent environment; in other parts of the world, Android phones would be much more common.
Again, people have good reasons — functional, economic, esthetic — for choosing one kind of phone over others but underneath, just as for PCs versus Macs, the hardware that does the computing is very similar. Let's look at why.
1.1 Logical Construction
If we were to draw an abstract picture of what's in a simple generic computer — its logical or functional architecture — it would look like the diagram in Figure 1.1 for both Mac and PC: a processor (the CPU), some primary memory (RAM), some secondary storage (a disk) and a variety of other components, all connected by a set of wires called a bus that transmits information between them.
If instead we drew this picture for a phone or tablet, it would be similar, though mouse, keyboard and display are combined into one component, the screen. There's certainly no CD or DVD, but there are hidden components like a compass, an accelerometer, and a GPS receiver for determining your physical location.
The basic organization — with a processor, storage for instructions and data, and input and output devices — has been standard since the 1940s. It's often called the von Neumann architecture, after John von Neumann, who described it in the 1946 paper quoted above. Though there is still debate over whether von Neumann gets too much credit for work done by others, the paper is so clear and insightful that it is well worth reading even today. For example, the quotation at the beginning of this chapter is the first sentence of the paper. Translated into today's terminology, the CPU provides arithmetic and control, the RAM and disk are memory storage, and the keyboard, mouse and display interact with the human operator.
The processor or central processing unit (CPU) is the brain, if a computer could be said to have such a thing. The CPU does arithmetic, moves data around, and controls the operation of the other components. The CPU has a limited repertoire of basic operations that it can perform but it performs them blazingly fast, billions per second. It can decide what operations to perform next based on the results of previous computations, so it is to a considerable degree independent of its human users. We will spend more time on this component in Chapter 3 because it's so important.
If you go to a store or shop online to buy a computer, you'll find most of these components mentioned, usually accompanied by mysterious acronyms and equally mysterious numbers. For example, you might see a CPU described as a "2.2 GHz dual-core Intel Core i7 processor," as it is for one of my computers. What's that? Intel makes the CPU and "Core i7" is just a marketing term. This particular processor actually has two processing units in a single package; in this context, lower-case "core" has become a synonym for "processor." For most purposes, it's sufficient to think of the combination as "the CPU," no matter how many cores it has.
"2.2 GHz" is the more interesting part. CPU speed is measured, at least approximately, in terms of the number of operations or instructions or parts thereof that it can do in a second. The CPU uses an internal clock, rather like a heartbeat or the ticking of a clock, to step through its basic operations. One measure of speed is the number of such ticks per second. One beat or tick per second is called one hertz (abbreviated Hz), after the German engineer Heinrich Hertz, whose discovery of how to produce electromagnetic radiation in 1888 led directly to radio and other wireless systems. Radio stations give their broadcast frequencies in megahertz (millions of hertz), like 102.3 MHz. Computers today typically run in the billions of hertz, or gigahertz, or GHz; my quite ordinary 2.2 GHz processor is zipping along at 2,200,000,000 ticks per second. The human heartbeat is about 1 Hz or almost 100,000 beats per day, which is around 30 million per year, so my CPU does in 1 second the number of beats my heart would do in 70 years.
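The heartbeat comparison above is easy to verify with a little arithmetic. This quick Python sketch (not from the book; the 70-year figure is approximate) checks the numbers:

```python
# Ticks per second for a 2.2 GHz processor.
cpu_ticks_per_second = 2_200_000_000

# A human heart beating at about 1 Hz.
beats_per_day = 1 * 24 * 60 * 60          # 86,400 -- "almost 100,000 beats per day"
beats_per_year = beats_per_day * 365      # about 31.5 million -- "around 30 million per year"

# How many years of heartbeats fit into one second of CPU ticks?
years = cpu_ticks_per_second / beats_per_year
print(round(years))                       # roughly 70
```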
This is our first encounter with some of the numerical prefixes like mega and giga that are so common in computing. "Mega" is one million, or 10⁶; "giga" is one billion, or 10⁹, and usually pronounced with a hard "g" as in "gig." We'll see more units soon enough, and there is a complete table in the glossary.
The primary memory or random access memory (RAM) stores information that is in active use by the processor and other parts of the computer; its contents can be changed by the CPU. The RAM stores not only the data that the CPU is currently working on, but also the instructions that tell the CPU what to do with the data. This is a crucially important point: by loading different instructions into memory, we can make the CPU do a different computation. This makes the stored-program computer a general-purpose device; the same computer can run a word processor and a spreadsheet, surf the web, send and receive email, keep up with friends on Facebook, do my taxes, and play music, all by placing suitable instructions in the RAM. The importance of the stored-program idea cannot be overstated.
RAM provides a place to store information while the computer is running. It stores the instructions of programs that are currently active, like Word, Photoshop or a browser. It stores their data — the pictures on the screen, the documents being edited, the music that's currently playing. It also stores the instructions of the operating system — Windows, Mac OS X or something else — that operates behind the scenes to let you run multiple applications at the same time. We'll talk about operating systems in Chapter 6.
RAM is called random access because the CPU can access the information stored at any place within it as quickly as in any other; to oversimplify a little, there's no speed penalty for accessing memory locations in a random order. Compare this to an old VCR tape, where to look at the final scenes of a movie, you have to fast forward (slowly!) over everything from the beginning; that's called sequential access.
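The contrast between random and sequential access can be sketched in a few lines of code. This toy illustration (my own, not the book's) models RAM as a list you can index directly, and a VCR tape as something you must step through cell by cell:

```python
# A toy contrast between random and sequential access.
memory = list(range(1_000_000))

# RAM-style: jump straight to any location by its number -- one step.
def random_access(addr):
    return memory[addr]

# Tape-style: to reach a location you must pass over everything before it.
def sequential_access(addr):
    steps = 0
    for i in range(addr + 1):    # "fast forward" one cell at a time
        steps += 1
    return memory[addr], steps

value, steps = sequential_access(999_999)
# random_access(999_999) takes one step; the sequential version took a million.
```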
Most RAM is volatile, that is, its contents disappear if the power is turned off, and all this currently active information is lost. That's why it's prudent to save your work often, especially on a desktop machine, where tripping over the power cord could be a real disaster.
Your computer has a fixed amount of RAM. Capacity is measured in bytes, where a byte is an amount of memory that's big enough to hold a single character like W or @, or a small number like 42, or a part of a larger value. Chapter 2 will show how information is represented in memory and other parts of a computer, since it's one of the fundamental issues in computing. But for now, you can think of the RAM as a large collection of identical little boxes, numbered from 1 up to a few billion, each of which can hold a small amount of information.
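The "numbered little boxes" picture maps neatly onto a short Python sketch (mine, not the book's), where a `bytearray` plays the role of a tiny RAM:

```python
# RAM as a row of numbered boxes, each holding one byte (a value 0..255).
ram = bytearray(16)              # sixteen empty boxes, numbered 0..15

ram[0] = ord('W')                # a character is stored as its numeric code
ram[1] = ord('@')
ram[2] = 42                      # a small number fits in a single byte

print(ram[0], chr(ram[0]))       # the code for 'W', and the character back again
print(ram[2])
```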
What is the capacity? The laptop I'm using right now has 4 billion bytes or 4 gigabytes or 4 GB of RAM, which many people would deem too small. The reason is that more RAM usually translates into faster computing, since there's never enough for all the programs that want to use it at the same time, and it takes time to move parts of an unused program out to make room for something new. If you want your computer to run faster, buying extra RAM is likely to be the best strategy.
1.1.3 Disks and other secondary storage
The RAM has a large but limited capacity to store information and its contents disappear when the power is turned off. Secondary storage holds information even when the power is turned off. There are two main kinds of secondary storage, the magnetic disk, usually called the hard disk or hard drive, and flash memory, often called solid state disk. Both kinds of disk store much more information than RAM and it's not volatile: information on disk stays there indefinitely, power or not. Data, instructions, and everything else is stored on the disk for the long term and brought into RAM only transiently.
Magnetic disks store information by setting the direction of magnetization of tiny regions of magnetic material on rotating metallic surfaces. Data is stored in concentric tracks that are read and written by a sensor that moves from track to track. The whirring and clicking that you hear when a computer is doing something is the disk in action, moving the sensor to the right places on the surface. You can see the surface and sensor in the picture of a standard laptop disk in Figure 1.2; the platter is 2½ inches (6¼ cm) in diameter.
Disk space is about 100 times cheaper per byte than RAM, but accessing information is slower. It takes about ten milliseconds for the disk drive to access any particular track on the surface; data is then transferred at roughly 100 MB per second.
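Those two figures have a practical consequence: for small reads, the seek time swamps the transfer time. A rough sketch of the arithmetic, using the approximate numbers above:

```python
# Rough disk-timing arithmetic from the figures above (approximations).
seek_time = 0.010                 # ~10 milliseconds to reach a track
transfer_rate = 100_000_000       # ~100 MB per second once there

def read_time(nbytes):
    """Approximate time to seek once and then read nbytes, in seconds."""
    return seek_time + nbytes / transfer_rate

print(read_time(1_000_000))       # a 1 MB file: about 0.02 s
print(read_time(100))             # 100 bytes: still about 0.01 s -- the seek dominates
```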
Increasingly, laptops have solid state disks, which use flash memory instead of rotating machinery. Flash memory is non-volatile; information is stored as electric charges in circuitry that maintains the charge in individual circuit elements without using any power. Stored charges can be read to see what their values are, and they can be erased and overwritten with new values. Flash memory is faster, lighter, more reliable, won't break if dropped, and requires less power than conventional disk storage, so it's used in cell phones, cameras, and the like. Right now it's more expensive per byte but prices are coming down fast and it seems likely to take over from mechanical disks in laptops.
A typical laptop disk today holds perhaps 500 gigabytes, and external drives that can be plugged in to a USB socket have capacities in the multi-terabyte (TB) range. "Tera" is one trillion, or 10¹², another unit that you'll see more and more often.
How big is a terabyte, or even a gigabyte for that matter? One byte holds one alphabetic character in the most common representation of English text. Pride and Prejudice, about 250 pages on paper, has about 550,000 characters, so 1 GB could hold nearly 2,000 copies of it. More likely, I would store one copy and then include some music. Music in MP3 or AAC format is about 1 MB per minute, so an MP3 version of one of my favorite audio CDs, The Jane Austen Songbook, is about 60 MB, and there would still be room for another 15 hours of music in 1 GB. The two-disk DVD of the 1995 BBC production of Pride and Prejudice with Jennifer Ehle and Colin Firth is less than 10 GB, so I could store it and a hundred similar movies on a 1 TB disk.
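The back-of-the-envelope arithmetic above checks out; here is a quick sketch of it (mine, not the book's):

```python
# Back-of-the-envelope storage arithmetic, checking the figures in the text.
GB = 1_000_000_000                       # one gigabyte, in bytes
book = 550_000                           # Pride and Prejudice: ~550,000 characters

copies_per_gb = GB // book
print(copies_per_gb)                     # 1818 -- "nearly 2,000 copies"

mp3_per_minute = 1_000_000               # music: ~1 MB per minute
album = 60 * mp3_per_minute              # a 60-minute album, ~60 MB

# Room left in 1 GB after one copy of the book and one album:
hours_left = (GB - book - album) / mp3_per_minute / 60
print(int(hours_left))                   # 15 -- "another 15 hours of music"
```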
A disk is a good example of the difference between logical structure and physical implementation. When we run a program like Explorer in Windows or Finder in Mac OS X, we see disk contents organized as a hierarchy of folders and files. But the data could be stored on rotating machinery, integrated circuits with no moving parts, or something else entirely. The particular kind of "disk" in a computer doesn't matter. Hardware in the disk itself and software in the operating system, called the file system, create the organizational structure. We will return to this in Chapter 6.
The logical organization is so well matched to people (or, more likely, by now we're so completely used to it) that other devices provide the same organization even though they use completely different physical means to achieve it. For example, the software that gives you access to information from a CD-ROM or DVD makes it look like this information is stored in a file hierarchy, regardless of how it is actually stored. So do USB devices, cameras and other gadgets that use removable memory cards. Even the venerable floppy disk, now totally obsolete, looked the same at the logical level. This is a good example of abstraction, a pervasive idea in computing: physical implementation details are hidden. In the file system case, no matter how the different technologies work, they are presented to users as a hierarchy of organized information.
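The abstraction shows up in everyday programming too: code that walks a file hierarchy neither knows nor cares what the underlying device is. A small sketch using Python's standard `pathlib` (my illustration, assuming any ordinary mounted file system):

```python
# The same folders-and-files view, whatever the physical device underneath.
from pathlib import Path

def show_tree(root, indent=0):
    """Print folders and files as a hierarchy, as Explorer or Finder would."""
    root = Path(root)
    print(" " * indent + root.name)
    if root.is_dir():
        for child in sorted(root.iterdir()):
            show_tree(child, indent + 2)

# show_tree("/")  # works identically for a hard disk, an SSD, or a USB stick
```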
Excerpted from Understanding the Digital World by Brian W. Kernighan. Copyright © 2017 Princeton University Press. Excerpted by permission of PRINCETON UNIVERSITY PRESS.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Table of Contents
Part I: Hardware 7
1. What’s in a Computer? 11
1.1 Logical Construction 13
1.1.1 CPU 13
1.1.2 RAM 14
1.1.3 Disks and other secondary storage 15
1.1.4 Et cetera 17
1.2 Physical Construction 17
1.3 Moore’s Law 21
1.4 Summary 22
2. Bits, Bytes, and Representation of Information 23
2.1 Analog versus Digital 23
2.2 Analog-Digital Conversion 25
2.3 Bits, Bytes, and Binary 30
2.3.1 Bits 30
2.3.2 Powers of two and powers of ten 31
2.3.3 Binary numbers 32
2.3.4 Bytes 34
2.4 Summary 36
3. Inside the CPU 37
3.1 The Toy Computer 38
3.1.1 The first Toy program 38
3.1.2 The second Toy program 40
3.1.3 Branch instructions 41
3.1.4 Representation in RAM 43
3.2 Real CPUs 43
3.3 Caching 46
3.4 Other Kinds of Computers 47
3.5 Summary 49
Wrapup on Hardware 51
Part II: Software 53
4. Algorithms 55
4.1 Linear Algorithms 56
4.2 Binary Search 58
4.3 Sorting 59
4.4 Hard Problems and Complexity 63
4.5 Summary 65
5. Programming and Programming Languages 67
5.1 Assembly Language 68
5.2 High-Level Languages 69
5.3 Software Development 75
5.3.1 Libraries, interfaces, and development kits 76
5.3.2 Bugs 77
5.4 Intellectual Property 79
5.4.1 Trade secret 80
5.4.2 Copyright 80
5.4.3 Patents 81
5.4.4 Licenses 82
5.5 Standards 84
5.6 Open Source 84
5.7 Summary 86
6. Software Systems 87
6.1 Operating Systems 88
6.2 How an Operating System Works 92
6.2.1 System calls 93
6.2.2 Device drivers 93
6.3 Other Operating Systems 94
6.4 File Systems 95
6.4.1 Disk file systems 96
6.4.2 Removing files 98
6.4.3 Other file systems 99
6.5 Applications 100
6.6 Layers of Software 102
6.7 Summary 104
7. Learning to Program 105
7.1 Programming Language Concepts 106
7.4 Loops 110
7.5 Conditionals 111
7.6 Libraries and Interfaces 112
7.8 Summary 114
Wrapup on Software 117
Part III: Communications 119
8. Networks 125
8.1 Telephones and Modems 126
8.2 Cable and DSL 126
8.3 Local Area Networks and Ethernet 128
8.4 Wireless 130
8.5 Cell Phones 131
8.6 Bandwidth 135
8.7 Compression 135
8.8 Error Detection and Correction 137
8.9 Summary 139
9. The Internet 141
9.1 An Internet Overview 142
9.2 Domain Names and Addresses 145
9.2.1 Domain Name System 145
9.2.2 IP addresses 146
9.2.3 Root servers 147
9.2.4 Registering your own domain 148
9.3 Routing 148
9.4 TCP/IP Protocols 150
9.4.1 IP, the Internet Protocol 151
9.4.2 TCP, the Transmission Control Protocol 152
9.5 Higher-Level Protocols 153
9.5.1 Telnet and SSH: remote login 154
9.5.2 SMTP: Simple Mail Transfer Protocol 154
9.5.3 File sharing and peer-to-peer protocols 156
9.6 Copyright on the Internet 157
9.7 The Internet of Things 159
9.8 Summary 159
10. The World Wide Web 163
10.1 How the Web Works 164
10.2 HTML 165
10.3 Cookies 167
10.4 Active Content in Web Pages 168
10.5 Active Content Elsewhere 170
10.6 Viruses, Worms and Trojan Horses 171
10.7 Web Security 173
10.7.1 Attacks on clients 174
10.7.2 Attacks on servers 177
10.7.3 Attacks on information in transit 179
10.8 Defending Yourself 179
10.9 Summary 181
11. Data and Information 183
11.1 Search 184
11.2 Tracking 188
11.3 Social Networks 193
11.4 Data Mining and Aggregation 195
11.5 Cloud Computing 197
11.6 Summary 202
12. Privacy and Security 203
12.1 Cryptography 204
12.1.1 Secret-key cryptography 205
12.1.2 Public-key cryptography 206
12.2 Anonymity 210
12.2.1 Tor and the Tor Browser 211
12.2.2 Bitcoin 213
12.3 Summary 215
13. Wrapping Up 217
What People are Saying About This
"Kernighan tells us exactly what we need to know about computers and computer science, focusing on ideas that are useful and interesting for everyday computer users. He covers a fascinating range of topics, including fundamentals such as computer hardware, programming, algorithms, and networks, as well as politically charged issues related to government surveillance, privacy, and Internet neutrality." (John MacCormick, Dickinson College)