numbers as bits (intro)
A bit (short for “binary digit”) is the smallest possible unit of information. Computers represent all information in bits because they are the simplest information to store and manipulate electronically.
Each bit is in one of two states, usually represented as 0 and 1. What these two states represent is up for grabs: maybe they mean on or off, yes or no, true or false, up or down, left or right, etc. Like with all messages, the meaning depends upon pre-established agreement between the sender and recipient. For example, “one if by land, two if by sea” is arbitrary. The only way Paul Revere knew what one or two lights meant depended upon pre-established agreement with the message sender. We can use multiple bits to represent more interesting kinds of information, such as numbers, text, images, and audio.
How do we represent numbers as bits? The counting system we use in English is called “decimal”. It uses ten symbols to represent numbers: 0 1 2 3 4 5 6 7 8 9. This choice of ten symbols, though, is arbitrary. We could just as well use seven, fifteen, or thirty-four symbols. The fewest number of symbols we could use, though, is two. The system of counting that uses just two symbols, 0 and 1, is called binary. Because it uses just two symbols, we can naturally represent binary numbers with bits. So to represent numbers in bits, we use binary rather than decimal. For example, the quantity we call “twenty-two” and write “22” is represented in binary as “10110” (read this as “one zero one one zero”).
When given n bits, the number of different possible values (combinations of 1’s and 0’s) is always 2n, e.g. given four bits, we can represent sixteen different unique values:
0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
This suggest then that given n bits, we can represent the range of positive integers from 0 to 2n – 1, e.g. given five bits we can represent any integer from 0 up to 31 (which is 32 – 1).
If we wish to represent negative integers, we can’t just insert a negative sign: all we have to represent information are bits and so everything must somehow be expressed in the bits themselves. The most common scheme for representing negatives is called two’s complement, wherein we make a positive number negative by flipping all of its bits and adding one, e.g. four in binary is 0100, so negative four is 1100 (which is 1011 + 1). (Notice that the first bit effectively acts as a flag indicating the sign: when 0, the number is positive; when 1, the number is negative.) Two’s complement is not the only way to represent negatives, but it is the most commonly used for efficiency reasons.
To represent numbers with a fractional component (3/5, 6.32, etc.), we most commonly use a scheme called floating-point, which is basically the computing equivalent of engineering notation. For example, in engineering notation:
3.04e6 = 3.04 * 106
Above, the significand 3.04 is multiplied by the quantity 10 raised to exponent 6. Floating-point is the same idea, except the exponent becomes a power of 2 (because we’re in binary):
significand * 2exponent
When dealing with floating-point, the programmer must be careful about issues of precision and accuracy. Firstly, floating-point numbers usually use a maximum number of bits to represent the significand and exponent, so some values may not fit in the limited range. Secondly, not all values we can express in decimal can be precisely represented in binary. We’ll discuss these issues in detail.