For physical machines, we can compare the power of different machines by measuring the amount of mechanical work they can perform within a given amount of time. This power can be captured with units like horsepower and watt. Physical power is not a very useful measure of computing power, though, since the amount of computing achieved for the same amount of energy varies greatly. Energy is consumed when a computer operates, but consuming energy is not the purpose of using a computer.
Two properties that measure the power of a computing machine are:
1. How much information can it process?
2. How fast can it process it?
We defer considering the second property until Part II, but consider the first question here.
Informally, we use information to mean knowledge. But to understand information quantitatively, as something we can measure, we need a more precise way to think about information.
Measuring Computing Power

The way computer scientists measure information is based on how what is known changes as a result of obtaining the information. The primary unit of information is a bit. One bit of information halves the amount of uncertainty. It is equivalent to answering a “yes” or “no” question, where either answer is equally likely beforehand. Before learning the answer, there were two possibilities; after learning the answer, there is one. We call a question with two possible answers a binary question. Since a bit can have two possible values, we often represent the values as 0 and 1. For example, suppose we perform a fair coin toss but do not reveal the result. Half of the time, the coin will land “heads”, and the other half of the time the coin will land “tails”.
Without knowing any more information, our chances of guessing the correct answer are 1/2. One bit of information would be enough to convey either “heads” or “tails”; we can use 0 to represent “heads” and 1 to represent “tails”. So, the amount of information in a coin toss is one bit.
Similarly, one bit can distinguish between the values 0 and 1:
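As a small sketch (not from the text; the function names here are illustrative), we can record a coin toss in a single bit and recover the outcome from it:

```python
import random

def toss_coin():
    """Simulate a fair coin toss, recording the outcome in a single bit."""
    return random.randint(0, 1)  # 0 represents "heads", 1 represents "tails"

def describe(bit):
    """Recover the outcome from its one-bit encoding."""
    return "heads" if bit == 0 else "tails"

print(describe(toss_coin()))  # prints either "heads" or "tails"
```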
How many bits of information are there in the outcome of tossing a six-sided
die?
There are six equally likely possible outcomes, so without any more information
we have a one in six chance of guessing the correct value. One bit is not enough
to identify the actual number, since one bit can only distinguish between two
values. We could use five binary questions like this:
This is quite inefficient, though, since we need up to five questions to identify
the value (and on average, expect to need 3 1/3 questions). Can we identify the
value with fewer than five questions?
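The 3 1/3 average can be checked directly. The sketch below (an illustration, not from the text) counts the questions used by the sequential strategy “Is the value 1?”, “Is the value 2?”, and so on; after four “no” answers, the fifth question settles the value either way, so outcomes 5 and 6 both cost five questions:

```python
from fractions import Fraction

# Questions used by the sequential strategy when the die shows 1..6.
# Outcomes 5 and 6 both take five questions: after "no" to the first
# four questions, the fifth answer identifies the value either way.
questions_needed = [1, 2, 3, 4, 5, 5]

# Each outcome is equally likely (probability 1/6).
expected = sum(Fraction(q, 6) for q in questions_needed)
print(expected)  # 10/3, i.e., 3 1/3 questions on average
```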
Our goal is to identify questions where the “yes” and “no” answers are equally likely, so that each answer provides the most information possible. This is not the case if we start with, “Is the value 6?”, since that answer is expected to be “yes” only one time in six. Instead, we should start with a question like, “Is the value at least 4?”. Here, we expect the answer to be “yes” one half of the time, and the “yes” and “no” answers are equally likely. If the answer is “yes”, we know the result is 4, 5, or 6. With two more bits, we can distinguish between these three values (note that two bits is actually enough to distinguish among four different values, so some information is wasted here). Similarly, if the answer to the first question is “no”, we know the result is 1, 2, or 3. We need two more bits to distinguish which of the three values it is. Thus, with three bits, we can distinguish all six possible outcomes.
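The halving strategy above can be sketched in code (a hypothetical helper, not from the text). It asks at most three binary questions, starting with “Is the value at least 4?”:

```python
def identify_die_roll(value):
    """Identify a die roll (1..6) with at most three binary questions.

    Each question is chosen so "yes" and "no" are (roughly) equally
    likely. Returns the list of answers and the identified value.
    """
    answers = []
    # Question 1: "Is the value at least 4?" -- "yes" half the time.
    high = value >= 4
    answers.append(high)
    candidates = [4, 5, 6] if high else [1, 2, 3]
    # Up to two more questions distinguish the three remaining candidates.
    answers.append(value >= candidates[1])      # e.g., "Is it at least 5?"
    if answers[-1]:
        answers.append(value == candidates[2])  # e.g., "Is it 6?"
        guess = candidates[2] if answers[-1] else candidates[1]
    else:
        guess = candidates[0]
    return answers, guess

for v in range(1, 7):
    answers, guess = identify_die_roll(v)
    assert guess == v and len(answers) <= 3
```

Note that when the second answer is “no”, the value is already determined after two questions; the text’s version always spends two more bits, which is where some information is wasted.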
Three bits can convey more information than just six possible outcomes, however. In the binary question tree, there are some questions where the answer is not equally likely to be “yes” and “no” (for example, we expect the answer to “Is the value 3?” to be “yes” only one out of three times). Hence, we are not obtaining a full bit of information with each question. Each bit doubles the number of possibilities we can distinguish, so with three bits we can distinguish between 2 × 2 × 2 = 8 possibilities. In general, with n bits, we can distinguish between 2^n possibilities. Conversely, distinguishing among k possible values requires log2 k bits. The logarithm is defined such that if a = b^c then logb a = c. Since each bit has two possibilities, we use the logarithm base 2 to determine the number of bits needed to distinguish among a set of distinct possibilities. For our six-sided die, log2 6 ≈ 2.58, so we need approximately 2.58 binary questions. But questions are discrete: we can’t ask 0.58 of a question, so we need to use three binary questions.

Trees. Figure 1.1 depicts a structure of binary questions for distinguishing among eight values. We call this structure a binary tree. We will see many useful applications of tree-like structures in this book. Computer scientists draw trees upside down. The root is the top of the tree, and the leaves are the numbers at the bottom (0, 1, 2, . . ., 7). There is a unique path from the root of the tree to each leaf. Thus, we can describe each of the eight
possible values using the answers to the questions down the tree. For example,
if the answers are “No”, “No”, and “No”, we reach the leaf 0; if the answers are
“Yes”, “No”, “Yes”, we reach the leaf 5. Since there are no more than two possible
answers for each node, we call this a binary tree.
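The rule that distinguishing among k values takes log2 k questions, rounded up to a whole number, can be sketched as follows (the function name is illustrative):

```python
import math

def bits_needed(k):
    """Number of binary questions needed to distinguish among k values."""
    return math.ceil(math.log2(k))

print(bits_needed(2))  # 1 (a coin toss)
print(bits_needed(6))  # 3 (a six-sided die: log2 6 is about 2.58, rounded up)
print(bits_needed(8))  # 3 (the eight leaves of the tree in Figure 1.1)
```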
We can describe any non-negative integer using bits in this way, by just adding
additional levels to the tree. For example, if we wanted to distinguish between
16 possible numbers, we would add a new question, “Is it >= 8?” to the top
of the tree. If the answer is “No”, we use the tree in Figure 1.1 to distinguish
numbers between 0 and 7. If the answer is “Yes”, we use a tree similar to the one
in Figure 1.1, but add 8 to each of the numbers in the questions and the leaves.
The depth of a tree is the length of the longest path from the root to any leaf. The
example tree has depth three. A binary tree of depth d can distinguish up to 2^d
different values.
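Following the answers down such a tree is the same as reading off a binary numeral. A small sketch (an assumption about the tree in Figure 1.1, with “yes” branches taken as 1 and “no” branches as 0, most significant bit first):

```python
def leaf_for_answers(answers):
    """Follow yes/no answers down a binary tree to the leaf they select.

    Treating "yes" as 1 and "no" as 0, the path from the root spells out
    the leaf's value in binary, most significant bit first.
    """
    value = 0
    for answer in answers:
        value = 2 * value + (1 if answer == "yes" else 0)
    return value

print(leaf_for_answers(["no", "no", "no"]))    # 0
print(leaf_for_answers(["yes", "no", "yes"]))  # 5
print(leaf_for_answers(["yes"] * 4))           # 15: depth 4 reaches 2^4 values
```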
Figure 1.1. Using three bits to distinguish eight possible values.
Units of Information. One byte is defined as eight bits. Hence, one byte of
information corresponds to eight binary questions, and can distinguish among
2^8 (256) different values. For larger amounts of information, we use metric prefixes,
but instead of scaling by factors of 1000 they scale by factors of 2^10 (1024).
Hence, one kilobyte is 1024 bytes; one megabyte is 2^20 (approximately one million)
bytes; one gigabyte is 2^30 (approximately one billion) bytes; and one terabyte
is 2^40 (approximately one trillion) bytes.
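These units can be written out as powers of two (a sketch using the traditional binary convention the text describes, where prefixes scale by 1024 rather than the SI factor of 1000):

```python
BYTE = 8            # bits
KILOBYTE = 2**10    # 1,024 bytes
MEGABYTE = 2**20    # 1,048,576 bytes (approximately one million)
GIGABYTE = 2**30    # approximately one billion bytes
TERABYTE = 2**40    # approximately one trillion bytes

print(2**BYTE)               # 256 values distinguishable by one byte
print(MEGABYTE // KILOBYTE)  # 1024 kilobytes per megabyte
```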