C data types, variables and constants
Computer programs manipulate data in order to be useful. Some of that data comes from outside the program: from the user, from a file or from a device (e.g. the network). This is called "input" data, since it goes "in" to the program.
Most programs also produce data that is useful to the user or to a device. The data coming out of a program is known as "output" data.
C programs manipulate data through the use of variables and constants. If you recall, a variable or constant is a name, either a letter or a word, used in the program to refer to a memory location. Through this name, you can store data in that memory location or retrieve data from it.
C requires you to specify the type of data you want to store in a variable or constant. Luckily, there are only a few types that you should worry about. These are:
- char
- int
- float
- double
Please note that the four terms above are part of the set of "keywords" in C; therefore they can't be used as names for variables or constants.
In order to specify the type of a variable or constant in C, you'd put one of the terms on the list above before the variable name, like the following examples:
char letter;
int age;
float percentage;
double speed;
The "char" data type is for storing characters like A, F, x or $. A char occupies one byte. If you recall, computers can only manipulate data in binary at the lowest level, so all information is ultimately represented in binary notation. A piece of information represented using 8 binary digits ("bits") is called a "byte". (Strictly speaking, C only guarantees that a char has at least 8 bits, but 8 is what you'll find on practically every computer today.)
So char is for variables or constants that store a single byte. For this reason, a char variable/constant can only hold one of 2^8 = 256 possible values.
In order to store data into a char variable/constant, you have two options:
- If the data is a printable character, you can just type the character on the keyboard and surround it with single quotes.
- Alternatively, you can use a number, typically from -128 to 127.
The reason the range starts from a negative number is that the left-most bit, also known as the MSB or Most Significant Bit, normally acts as the "sign" bit. If this bit is clear (zero), the number is zero or positive and its value is given by the other 7 bits. If this bit is set (one), the number is negative.
In other words, only 7 of the available 8 bits are left to represent the size of the number. With the sign bit clear, the 7 bits give 128 values, 0 to 127. With the sign bit set, they give another 128 values, -128 to -1, because nearly all computers store negative numbers in an encoding called "two's complement". That makes 128 + 128 = 256 possible values. (Whether a plain char is signed at all is up to the compiler; on some systems it holds 0 to 255 instead.)
Please note that the ASCII character set specifies which number corresponds to which character. For example, the number 65 means the capital letter 'A'. Please see the ASCII table here: http://www.asciitable.com/ for more details.
In the example below, the two assignment statements ("letter = ") have exactly the same effect. Note the use of single quotes to surround the letter A.
char letter;
letter = 65;
letter = 'A';
The "int" data type is for data categorized as "integers". These are the whole numbers, including negative numbers and zero, such as 2011, 5, -40, 21 or 0.
The number of bits used to store an integer depends on the compiler. In other words, a C compiler is free to choose the number of bits as appropriate for the underlying computer hardware, as long as it uses at least 16. Normally this is 32 bits on modern desktop computers and mobile phones. On some special-purpose computers, such as the small microcontrollers found in embedded systems, an integer may be represented in 16 bits.
Therefore the range of values that the int data type can store may be:
Number of bits used: range of values
- 16 bits: -32,768 to 32,767
- 32 bits: -2,147,483,648 to 2,147,483,647
See the example below for an integer variable declaration and assignment.
int year;
year = 2011;
float and double
The float and double data types are for data with fractional components, that is, numbers with a decimal point in them. For example 3.14159, 40.25, 6.0004 or 1.0.
The way a float and a double are represented is standardized as IEEE-754 (see this link: http://en.wikipedia.org/wiki/IEEE_754-2008 for more details). The C language itself doesn't require this standard, but practically every modern compiler follows it. As you'll see, a "float" is C's implementation of the so-called "single-precision" numbers, and a "double" is C's implementation of the so-called "double-precision" numbers.
A float is represented by 32 bits in virtually all implementations of C. In other words, just as a byte is 8 bits, you can count on a float being 32 bits.
Recall the scientific notation for representing numbers, which looks like this:
112345 in scientific notation is 1.12345 x 10^5
From the scientific notation example above, we'll find that the significand (also known as mantissa or coefficient) is the fractional number 1.12345 and the exponent is the positive number 5.
Therefore, to represent the number above in a float, C allocates 23 bits for the significand, 8 bits for the exponent (which can be positive or negative) and 1 bit for the sign of the number. Since the number is stored in binary, the significand and exponent are those of its base-2 form, not the 1.12345 and 5 of the decimal form. There is a clever encoding behind this, so if you really want to know the details, please look up single-precision numbers, exponent bias and binary offset encoding.
To make a long story short, please accept that the range a float can hold is approximately: (Note that numbers with negative exponents are included in the range.)
-3.4 x 10^38 to 3.4 x 10^38.
Please see an example below on how to declare and assign values to a float variable.
float pi;
pi = 3.14159;
pi = 314159e-5;
The double data type is similar to float, only it has a bigger range. It is represented by the computer using 64 bits, thus giving it more possible values than a float. The range of C's double data type is approximately: (Again, numbers with negative exponents are included in the range.)
-1.8 x 10^308 to 1.8 x 10^308
Please see below for some examples:
double x;
x = -1.7e308;
x = 6.82e-9;
Variables and constants
Variables are names for memory locations whose data can change while the program runs. Constants are names for memory locations whose data can never change while the program runs.
You've seen how variables are declared above. To declare a constant identifier, you'd use the keyword "const", like this:
const float pi = 3.14159;
Note that the constant is declared as a float: an int can't hold the fractional part, so a "const int" would silently store just 3. Please also note that a constant identifier can only be given a value in its declaration. In other words, unlike variables, you can't separate the previous code like this:
const float pi;
pi = 3.14159;
Trying to assign a value to a const outside its declaration will result in a compiler error, as shown below:
$ gcc test.c -o test
test.c: In function `main`:
test.c:7: error: assignment of read-only variable `pi`
What about "signed int" and "unsigned int" or "short int" and "long int"? If you have seen some C programs using the keywords "signed", "unsigned", "short" and "long", you might be wondering why I didn't cover them here. Well, they're called "modifiers". Do you know what they mean? Please stay tuned for more next time.