Bits, bytes, and bigger
“I signed up for a 2mb Internet connection. Why can’t I ever download faster than 250 kb/s?”
Did you know 2 Mb is not the same amount of data as 2 MB? Did you know that a gigabyte can be a different size depending on who you’re talking to? Computer data units can be confusing, and there are lots of misconceptions out there. In this article, I explain the basics of computer data units and hopefully clear up the confusion.
Most people have heard that computer data, in its most basic form, is a series of 1s and 0s. In other words, it’s a bunch of values that can each be in only one of two states: on or off, full or null, true or false, 1 or 0, etc. This is called a binary system. Each of these binary values is called a bit and is represented by the lowercase “b”. If you have three 1s and two 0s, that’s a total of five bits of data, or 5 b.
But you usually don’t find five bits of data by themselves. Bits are almost always arranged and processed in groups of eight. A group of eight bits is called a byte, and is represented by an uppercase “B”. A collection of 24 bits amounts to three bytes, or 3 B.
A byte doesn’t intrinsically “mean” anything. It’s just a sequence of eight blips and bloops. Its meaning completely depends on the context. For example, if we’re looking at a black-and-white image, maybe each bit tells whether the corresponding pixel is white (1) or black (0). More often, the eight bits will be collectively interpreted as a single value, be it a number, an item from a list, or even part of a multi-byte value.
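To make that concrete, here is a small Python sketch that reads the same byte three different ways. The byte value is an arbitrary example, the character reading assumes ASCII encoding, and the pixel reading assumes the leftmost bit is the first pixel; none of those choices are built into the byte itself.

```python
# One byte, three readings. The value below is an arbitrary example.
byte = 0b01000001

# Reading 1: a single whole number.
as_number = byte

# Reading 2: a character, assuming the byte is ASCII-encoded text.
as_char = bytes([byte]).decode("ascii")

# Reading 3: eight black-and-white pixels (1 = white, 0 = black),
# assuming the leftmost bit describes the first pixel.
as_pixels = [
    "white" if (byte >> shift) & 1 else "black"
    for shift in range(7, -1, -1)
]

print(as_number)  # 65
print(as_char)    # A
print(as_pixels)
```

The bits never change; only the interpretation does.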
The most common way a byte is interpreted is as a single whole number, usually from 0 to 255. To understand how this works, think of our own counting system. The lowest possible value in a digit is 0. After we count up to 9, we reset the digit to 0 and increment the next digit by 1, so after 9 we get 10. After 29 we get 30. After 199 we get 200, and so on.
In binary, the same concept applies, but you simply have fewer possible values per digit. So you start at 0, then 1, and then you reset the digit and increment the next one, which gives you 10. Then follow 11, 100, 101, 110, 111, 1000, and so on. The following table compares the binary values to the equivalent decimal (standard counting) numbers:
| Binary | Decimal |
|---|---|
| 00000000 | 0 |
| 00000001 | 1 |
| 00000010 | 2 |
| 00000011 | 3 |
| 00000100 | 4 |
| 00000101 | 5 |
| … | … |
| 00001000 | 8 |
| … | … |
| 00010000 | 16 |
| … | … |
| 00100000 | 32 |
| … | … |
| 01000000 | 64 |
| … | … |
| 10000000 | 128 |
| … | … |
| 11111100 | 252 |
| 11111101 | 253 |
| 11111110 | 254 |
| 11111111 | 255 |
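The rows of the table above can be reproduced with a few lines of Python. This is only a sketch: `to_binary` is a name I made up for illustration, and it builds the digits by repeatedly dividing by 2, which is one of several ways to do the conversion.

```python
def to_binary(n: int) -> str:
    """Return n as an 8-digit binary string (hypothetical helper)."""
    if not 0 <= n <= 255:
        raise ValueError("a single byte holds values from 0 to 255")
    digits = ""
    for _ in range(8):
        # The remainder of dividing by 2 is the next digit, right to left.
        digits = str(n % 2) + digits
        n //= 2
    return digits


for value in (0, 1, 2, 3, 4, 5, 8, 16, 32, 64, 128, 252, 253, 254, 255):
    print(to_binary(value), value)
```

Python’s built-in `format(value, "08b")` gives the same strings, if you would rather not write the loop yourself.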
When dealing with very large numbers of bytes, it’s helpful to use larger units like kilobytes, megabytes, and gigabytes. Instead of saying that you have 24252415 bytes, you can say you have about 24 megabytes, or 24 MB. A kilobyte (kB) is roughly 1000 bytes, a megabyte (MB) is roughly 1000 kilobytes, and a gigabyte (GB) is roughly 1000 megabytes. Larger units include terabyte, petabyte, and exabyte, each of which is also roughly 1000 times the size of the previous.
I say “roughly” for a reason. For historical reasons, there are actually two different definitions of each of those units. The original definition of kilobyte was actually 1024 bytes, not 1000, and both definitions are in widespread use today.
Obviously, the names of these units were inspired by the metric system, which has meters, kilometers, megameters, and so on, each of which is exactly 1000 times the size of the previous. But in computers, it’s much faster and more straightforward to organize sets of data in powers of 2. This is because each digit in binary amounts to some power of 2 (looking at the table above, notice that 1, 2, 4, 8, 16, 32, 64, and 128 are powers of 2).
From a computer’s point of view, defining a kilobyte as 1024 bytes makes some math easier. Let’s say a file is 47284 bytes in size and you want to express that number in kilobytes. First, let’s take the binary representation of that number (the way the computer already sees it): 1011100010110100. Chop off the 10 digits on the far right and you get 101110, which translates to 46. That’s the number of kilobytes we were looking for, rounded down to the nearest whole number. Pretty straightforward.
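In code, “chopping off the 10 digits on the far right” is a right shift by 10 bits. The Python sketch below uses the same file size as the example above and checks that the shift agrees with whole-number division by 1024.

```python
size_in_bytes = 47284

# The binary form the computer already holds.
print(format(size_in_bytes, "b"))    # 1011100010110100

# Shifting right by 10 bits drops the 10 rightmost binary digits.
whole_kilobytes = size_in_bytes >> 10
print(format(whole_kilobytes, "b"))  # 101110
print(whole_kilobytes)               # 46

# Same result as dividing by 1024 and rounding down.
assert whole_kilobytes == size_in_bytes // 1024
```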
But of course, average people don’t think it’s straightforward to have a kilobyte equal to 1024 bytes, so some programmers decided to make programs do the math the long way and divide the number by 1000. Unfortunately, they kept on calling it a kilobyte even though it was a different value. Some programs now try to differentiate the two by calling the base-1000 kilobyte “kB” and calling the base-1024 kilobyte “KiB”, but kB is still the most common form for either definition.
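To see how far apart the two definitions drift, here is a rough Python sketch that expresses one byte count in both systems. The function names are hypothetical, and I am assuming the “MB” and “MiB” labels follow the same convention as kB and KiB.

```python
def in_decimal_megabytes(size_in_bytes: int) -> float:
    """Base-1000 reading: 1 MB = 1000 * 1000 bytes (hypothetical helper)."""
    return size_in_bytes / 1000**2


def in_binary_megabytes(size_in_bytes: int) -> float:
    """Base-1024 reading: 1 MiB = 1024 * 1024 bytes (hypothetical helper)."""
    return size_in_bytes / 1024**2


size = 24252415  # an example byte count
print(f"{in_decimal_megabytes(size):.2f} MB (base-1000)")
print(f"{in_binary_megabytes(size):.2f} MiB (base-1024)")
```

The same file comes out at about 24.25 in one system and about 23.13 in the other, which is why it matters which definition a program is using.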
The following table shows unit conversions for various values, in both the classic base-1024 system and base-1000 system.
| Bits | Bytes | Kilobits | Kilobytes | Megabits | Megabytes |
|---|---|---|---|---|---|
| Base-1024 | | | | | |
| 1 b | | | | | |
| 8 b | 1 B | | | | |
| 1024 b | 128 B | 1 kb | | | |
| 8192 b | 1024 B | 8 kb | 1 kB | | |
| 1048576 b | 131072 B | 1024 kb | 128 kB | 1 Mb | |
| 8388608 b | 1048576 B | 8192 kb | 1024 kB | 8 Mb | 1 MB |
| Base-1000 | | | | | |
| 1 b | | | | | |
| 8 b | 1 B | | | | |
| 1000 b | 125 B | 1 kb | | | |
| 8000 b | 1000 B | 8 kb | 1 kB | | |
| 1000000 b | 125000 B | 1000 kb | 125 kB | 1 Mb | |
| 8000000 b | 1000000 B | 8000 kb | 1000 kB | 8 Mb | 1 MB |
So to answer the question at the beginning: Internet connection speeds are typically advertised in megabits per second, while web browsers and download applications typically show speeds in megabytes or kilobytes per second. The two figures therefore differ by a factor of 8, and a 2 Mb/s connection tops out at about 250 kB/s.
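The conversion is easy to sketch in Python. The function name is made up for illustration, and I am assuming base-1000 units on both sides; real downloads also lose a little to network overhead, which this ignores.

```python
def max_download_rate_kB(link_megabits_per_second: float) -> float:
    """Convert an advertised link speed in Mb/s to kB/s (hypothetical helper).

    Assumes base-1000 units and ignores protocol overhead.
    """
    bits_per_second = link_megabits_per_second * 1_000_000
    bytes_per_second = bits_per_second / 8  # 8 bits per byte
    return bytes_per_second / 1000


print(max_download_rate_kB(2))  # 250.0
```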
September 14th, 2008 by Michael
I think the x8 factor also helps the marketing side of things too.
At least they never try to explain it.
I waited the whole time while reading the article for you to say kibibyte, but you never did.