Size Matters
Matters of Size
Nowhere does size matter more than in the proper selection of data types. Accurate quantization of virtual or physical properties depends on decisions made at design time. Loosely typed languages such as PHP or Perl, along with strongly typed languages such as Java (not JavaScript), each present unique opportunities to embed nefarious logic errors in mission-critical software unless matters of size are carefully considered.
A simple example illustrating why size matters
Consider a simplified version of a discrete wavelet transform function:
$\mathrm{DWT}(f(x)) = \sum \Psi_{j,k}(x)$
Assuming a Haar scaling function and also a Haar wavelet, we must consider the upper and lower bounds that will result from the calculation. Data type size becomes crucial. For the purposes of this discussion we will assume f(x) is a simple step function:
We can see that size matters
From the (admittedly arbitrary) definition of our discrete function, we see that an improper selection of container size for the summation expression could result in disaster.
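As a rough illustration of how the container chosen for a summation can go wrong, the sketch below accumulates samples of a hypothetical step function in a 16-bit and in a 64-bit container. The amplitude of 1000, the sample count, and the names `make_step_samples`, `sum_narrow`, and `sum_wide` are assumptions made for this example only; they are not the definition used above, and the plain sum stands in for the wavelet summation rather than implementing it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical step function (an assumption for this sketch):
// every sample has the constant amplitude 1000.
std::vector<std::uint16_t> make_step_samples(std::size_t count) {
    return std::vector<std::uint16_t>(count, 1000);
}

// Accumulate into a 16-bit unsigned container. The conversion back to
// uint16_t reduces the sum modulo 2^16, so the total is silently wrong
// once it passes 65535.
std::uint16_t sum_narrow(const std::vector<std::uint16_t>& samples) {
    std::uint16_t total = 0;
    for (std::uint16_t s : samples) {
        total = static_cast<std::uint16_t>(total + s);
    }
    return total;
}

// Accumulate into a 64-bit container, which has ample headroom here.
std::uint64_t sum_wide(const std::vector<std::uint16_t>& samples) {
    std::uint64_t total = 0;
    for (std::uint16_t s : samples) {
        total += s;
    }
    return total;
}
```

With 100 samples the true sum is 100000. `sum_wide` returns that value, while `sum_narrow` returns 34464 (100000 modulo 65536) without any warning at run time.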
Another example of why size does matter
A grossly simplified instance of the size issue is the seemingly ubiquitous problem of overflow and underflow when dealing with integer types. Consider the following code snippet in C++:
A simple example of why size matters
#include <iostream>
#include <climits>
int main()
{
    int alpha;
    long int delta;
    delta = INT_MAX;   // the largest int fits easily in a long
    alpha = ++delta;   // pre-increment, then implicit narrowing to int
}
This code snippet illustrates an all-too-common problem in software development. The size of the integer data item, alpha, is not sufficient to handle the result of the calculation. The expression ++delta generates a value that exceeds the maximum possible number that can be stored in a signed integer data item. The assignment statement (line 8) evaluates properly, but the result of the assignment is usually unexpected and almost always improper: the value stored in alpha is truncated. This code is deterministic, but often not properly understood. In other words, it works, but not the way that most programmers think it works. It's not broken, just misunderstood.
Let us examine this code in greater detail. In a typical modern development environment, 4 byte integers are the norm. A signed integer can range from -2^31 through +(2^31 - 1). In base 10, these values are -2147483648 through +2147483647. (For better or worse, these base 10 values are not at all intuitive: digital computers store integers in base 2 format, while we humans live in a base 10 world.) Where a signed long integer occupies 8 bytes, as it does on most 64-bit Unix-like platforms (on 64-bit Windows a long is still 4 bytes), it can range from -2^63 to +(2^63 - 1), or -9223372036854775808 to +9223372036854775807 in base 10.
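The ranges quoted above can be checked at compile time. The following is a minimal sketch that uses the fixed-width aliases from `<cstdint>` in place of plain int and long, so that the sizes do not depend on the platform; the constant names (`kInt32Min` and so on) are made up for this example.

```cpp
#include <cstdint>
#include <limits>

// Limits of the 4 byte and 8 byte signed integer types.
constexpr std::int32_t kInt32Min = std::numeric_limits<std::int32_t>::min();
constexpr std::int32_t kInt32Max = std::numeric_limits<std::int32_t>::max();
constexpr std::int64_t kInt64Min = std::numeric_limits<std::int64_t>::min();
constexpr std::int64_t kInt64Max = std::numeric_limits<std::int64_t>::max();

// The minimum values are written as (-max - 1) because the positive
// literal 2147483648 itself does not fit in a 32-bit int.
static_assert(kInt32Max == 2147483647, "2^31 - 1");
static_assert(kInt32Min == -2147483647 - 1, "-2^31");
static_assert(kInt64Max == 9223372036854775807LL, "2^63 - 1");
static_assert(kInt64Min == -9223372036854775807LL - 1, "-2^63");

// True only where long has 63 value bits, that is, where it is 8 bytes wide.
constexpr bool kLongIs64Bit = std::numeric_limits<long>::digits == 63;
```

If any of the static assertions failed, the program would not compile. `kLongIs64Bit` is deliberately not asserted, since its value legitimately differs between platforms.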
In line 7 of our code we store the maximum value of a signed 4 byte integer in a signed long integer. So far, so good. However, in line 8 we foment a logic error. A thorough understanding of the methodology employed by C++ to evaluate an assignment expression is necessary here. Keep in mind that line 8 contains two operators: the increment operator and the assignment operator. We know from the C++ order of operations rules that the increment is executed first because it is prepended to the delta symbol (commonly referred to as a "pre-increment" operator). After executing the increment operator, the assignment operator is executed. Given that C++ is not exactly a strongly typed language, the compiler happily adds an implicit data conversion in order to successfully complete the assignment operation; delta is coerced to a signed int and that result is stored in alpha. For better or worse, the result of the coercion is a truncated version of what delta contains. With an 8 byte long and a 4 byte int on the two's complement platforms in common use, the high-order bits are discarded and alpha ends up holding -2147483648, the most negative int.
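The increment and the narrowing conversion described above can be reproduced in isolation. This sketch mirrors lines 7 and 8 of the listing but swaps in the fixed-width types `std::int64_t` and `std::int32_t` so that the sizes are explicit; the function names are hypothetical. Before C++20 the result of the signed narrowing is implementation-defined, so the value noted in the comment is what mainstream two's complement compilers produce, not something every older standard promises.

```cpp
#include <cstdint>

// Same two steps as the listing: store the largest 32-bit value in a
// 64-bit variable, pre-increment it, then narrow the result to 32 bits.
std::int32_t increment_then_narrow() {
    std::int64_t delta = INT32_MAX;
    // ++delta yields 2^31, which does not fit in 32 signed bits.
    // On two's complement platforms the narrowing gives INT32_MIN.
    std::int32_t alpha = static_cast<std::int32_t>(++delta);
    return alpha;
}

// The bit pattern that survives the narrowing. Conversion to an unsigned
// type is always well defined: the value is reduced modulo 2^32.
std::uint32_t surviving_bits() {
    std::int64_t delta = INT32_MAX;
    ++delta;
    return static_cast<std::uint32_t>(delta);
}
```

`surviving_bits()` returns 0x80000000: only the sign bit of a 32-bit integer is set, which is why the narrowed signed value reads as -2147483648.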
From this simple illustration we can see that size matters. Had alpha been declared as a long integer (as delta already was) then the result of the code would have been completely different. Allow us to reemphasize the fact that the code as written is not necessarily 'wrong'. It should be considered to be a completely predictable behavior of the C++ compiler. The code can only be considered 'correct' or 'incorrect' when taken in a greater context. We do not claim to provide that context here; should there be sufficient future interest, we will supply additional code and discuss a larger framework in which this matter of size may be quantitatively judged as right or wrong.
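For comparison, here is a minimal sketch of the alternative mentioned above, in which alpha is as wide as delta, together with a range check that a caller might perform before narrowing. Both function names are hypothetical, and the check is only one possible way to detect the problem, not a prescribed remedy.

```cpp
#include <cstdint>
#include <limits>

// With alpha declared as wide as delta, no narrowing takes place and
// the incremented value is preserved.
std::int64_t increment_without_narrowing() {
    std::int64_t delta = INT32_MAX;
    std::int64_t alpha = ++delta;
    return alpha;
}

// Reports whether a 64-bit value can be stored in 32 signed bits
// without loss.
bool fits_in_int32(std::int64_t value) {
    return value >= std::numeric_limits<std::int32_t>::min() &&
           value <= std::numeric_limits<std::int32_t>::max();
}
```

Here `increment_without_narrowing()` returns 2147483648, and `fits_in_int32` reports false for that value, so the caller can decide what to do before any bits are lost.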
Conclusion
We hear it all the time. It's become a mantra in the computer science field: size matters. We have demonstrated two situations in which this is indeed true.