What is Software Assurance (SwA)?


SwA stands for "Software Assurance", a subject I have been interested in for many years. This short overview is intended to give some idea of why software assurance is important. It's not an exhaustive treatment, as that would take a whole book.

As a developer you quickly realise that it is very difficult to keep bugs out of a program. Most often, a programmer or testing department makes the program fail while using it for its normal functions. For example, if a program asks for a start date and an end date and then calculates the time difference, it might get the calculation wrong. This is a mainstream functional bug. This kind of bug will be detected and corrected quickly.

The date-difference algorithm might work well except when the two dates span a leap-year. This is what is called a "corner-case" or "edge-case". It is the type of bug that is more likely to slip into a publicly distributed program when compared to a mainstream functional bug.

Corner-cases are more likely to be caught when a structured and audited development cycle is enforced.

"Boundary conditions" are inputs that are unusual by virtue of lying at an extreme of the expected input range. A simple example is an algorithm that manipulates an array of integers. Boundary conditions in this case would be:

  1. When the array is initialised but has no elements.
  2. When the array is not initialised.
  3. When the first or last element of the array is accessed.
  4. When access to an element before or beyond the first or last element is attempted.

Take the array of numbers: { 0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 }

Let's assume that there are two operations that can be applied:

  1. find the number to the left of the present element.
  2. find the number to the right of the present element.

In this context, elements 1 to 6 should behave identically. This means that you can test element 3, get 2 and 4 before and after it, and that test should almost certainly imply that the tests for elements 1, 2, 4, 5 and 6 will also pass.

When the number of elements is small, then it is reasonable to test all of them. But if there are potentially 30 trillion numbers, then this may not be practical.

However, whether there are 30 trillion elements in the array or 7, the first and last are special. The first number cannot return a number equivalent to what is to the left of it. Similarly, the last number cannot return a number equivalent to what is to the right of it.

If your testing does not include boundary conditions, then the program is likely to contain bugs.
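To make this concrete, here is a minimal test sketch in Python. The functions element_left and element_right are hypothetical stand-ins for the two operations described above, and the behaviour chosen at the boundaries (raising an error) is only one of several reasonable options, a point taken up next.

    # Hypothetical sketch: an array with 'left of' and 'right of' operations.
    data = [0, 1, 2, 3, 4, 5, 6, 7]

    def element_left(arr, i):
        # Assumed behaviour: refuse to step off the left-hand end.
        if i <= 0:
            raise IndexError("no element to the left of the first element")
        return arr[i - 1]

    def element_right(arr, i):
        # Assumed behaviour: refuse to step off the right-hand end.
        if i >= len(arr) - 1:
            raise IndexError("no element to the right of the last element")
        return arr[i + 1]

    # Mainstream case: one interior element stands in for all of them.
    assert element_left(data, 3) == 2 and element_right(data, 3) == 4

    # Boundary cases: the first and last elements are special and must be tested.
    for bad_call in (lambda: element_left(data, 0), lambda: element_right(data, 7)):
        try:
            bad_call()
            raise AssertionError("boundary access should have failed")
        except IndexError:
            pass  # expected: the boundary was handled deliberately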

In this simple example, it is by no means clear what the algorithm should do with the boundary conditions.

One option when accessing the mythical element 'before' the first element or 'after' the last would be to use a different set of coordinates. The natural way to think of this array is as if it were a stiff strip of card, divided into sections, with a position for every element. But this 'topology' or 'coordinate system' for the array contains boundary conditions, one at each end.

We can make those boundary conditions disappear by joining one end to the other. Now there is a loop. Topologically, there is no beginning, and no end. This is a circular data structure. Testing complexity is inherently reduced by eliminating boundary conditions. It has to be pointed out that this particular data structure might not be appropriate for certain applications, but for the purposes of discussion, let's assume that it is good.
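Here is a minimal sketch of that circular idea in Python, using modular arithmetic; the function names are illustrative rather than taken from any particular library.

    # A circular (ring) view of the same array: indexing wraps around,
    # so there is no 'first' or 'last' boundary left to test.
    data = [0, 1, 2, 3, 4, 5, 6, 7]

    def element_left(arr, i):
        return arr[(i - 1) % len(arr)]   # stepping left of index 0 wraps to the end

    def element_right(arr, i):
        return arr[(i + 1) % len(arr)]   # stepping right of the last index wraps to 0

    assert element_left(data, 0) == 7    # the old boundary case now behaves like any other
    assert element_right(data, 7) == 0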

Now that you have that picture in mind, consider the circular data structure where there are some elements that are given values, and some that have never been accessed. Your testing procedure needs to consider what the calling code needs to do in the case where it tries to access data that has never been initialised.

Again, there are two ways to look at this. Either we can test this corner case for every one of potentially thousands of calling programs, or we can eliminate the possibility. To eliminate these corner cases, we can enforce a rule in the design that causes all elements to be initialised when the data structure is first created. It will not be the responsibility of the calling procedure to do this; it will simply be a feature of the data structure. In this way, testing is confined to the context of the data structure, and we can now guarantee that all values returned from the array are sensible.
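A tiny sketch of such a rule in Python, with illustrative names: the creation routine, not the caller, fills every slot, and the choice of fill value is deliberately left open for the discussion that follows.

    # Hypothetical sketch: the data structure initialises itself on creation,
    # so the caller can never read a slot that was never set.
    def create_structure(size, fill_value=0):
        # Every slot gets a known value; which fill value to use is discussed below.
        return [fill_value] * size

    ring = create_structure(8)
    assert all(x == 0 for x in ring)   # nothing in the structure is unpredictable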

Previously (before forced initialisation) we had three cases to consider:

  1. The data returned was put there deliberately.
  2. The data returned indicates that it is default or untrusted or triggers a new action.
  3. The data returned is unpredictable.

If we force initialisation, then case 3 above cannot occur. It may be possible for calling programs to avoid that case themselves by keeping proper track of which elements are valid, but that's a big responsibility. Even if it works, the method is vulnerable to malicious tampering.

After guaranteed initialisation, there are only two cases to consider:

  1. The data returned was put there deliberately.
  2. The data returned indicates that it is default or untrusted or triggers a new action.

Again, there is more to consider. Case 2 above might return a valid but otherwise special element. It's common to use the value zero for unused but initialised elements. But this removes the ability to place a zero into the array and have it interpreted as case 1.

To understand the rather deep consequence of this, you need to consider what the structure is designed to hold. In most computer languages, this is known as 'type'. The array could be of 'type' integer or 'type' fraction or of 'type' "pointer to a particular animal in the local zoo".

Let's take the last example and assume that the local zoo has the following cages, marked with integers (pointers).

[0] = frog
[1] = elephant
[2] = mouse
[3] = empty cage (nothing in it)
[4] = monkey

Clearly, we cannot initialise our data structure with the number zero. If we did, it would incorrectly tell us that every cage in the zoo (even the empty one) contains a frog. So we need to reserve a value that cannot logically map to a cage. We could use 5. But that means that the calling code either needs to know that a return value of 5 means uninitialised data, or must inefficiently 'test' the zoo by looking for cage 5.

If it looks for cage 5 and cannot find it, then it could assume that 5 is the indicator for an unassigned, default-initialised element.

This is an appalling solution. When the council grows the zoo by one cage, the program breaks. It's inefficient and fragile.

A slightly better choice would be to initialise the array so that unused elements are 0, and re-name all the cages such that 'zero' is never used as a cage's name. But this imposes unreasonable restrictions on the zoo-keeper. He might be quite happy to name all unused cages '0'.

Another solution would be to use a very large integer as the default for unused elements. It seems reasonable that we could never build a zoo with 65535 cages. Now the calling program needs to know that the value 65535 is special.

But what if you saw the beauty of your code one day and decided it could be re-used to store the names of discovered species? We are in trouble again, because there are more than 65535 discovered species. A poor hack might be to simply skip that one designation in the new mapping, but oh what a mess!

We could make a significant improvement by initialising to -1. A cage or classification can never be numbered negatively, so -1 can never collide with real data.

What we have done here is home in on a domain/co-domain set-theory solution.

In set theory, a particular set, which we will label D, is designated the 'domain'. This could be (say) the infinite set of integers { 0, 1, 2, ... }, the finite set of integers { 1, 2 }, or the null set {}.

The null set is a set (of type integer in this case) which has no elements.

If we then apply a mapping to each of those elements, taking the example finite set D = { 1, 2 } so that each element maps one-to-one onto the codomain C = { frog, elephant }, then

1 -> frog

2 -> elephant

We read this as: "1 maps to a frog", "2 maps to an elephant".

All elements of D are positive integers. All elements of C are names of animals.

An unused domain would be the empty set {}.

But how do we initialise the elements of this set? In the strictest sense we can't, at least not in a universal way, because we have stated that all elements must be positive integers, yet an unused element has nothing valid to map to.

The only clean mathematical method to extend this is to make the domain D a super-set of its original definition. We could allow the overall type of D to be any integer, even negative ones, and then initialise the elements of the array to any negative number.

Alternatively, each element can itself be a pair of values. We could use what is known as a 'tuple' and define a new domain D'.

Elements of D' may now be:

{1:0} or {1:1}

{2:0} or {2:1}

{3:0} or {3:1}

and so on. In each element of D' there is a value (the first item in the tuple) and an attribute which takes a binary type with possible values 0 or 1. We can say that an attribute of 0 means the value is uninitialised and not to be trusted, while 1 means that it is valid data.

This is now a clean solution. The calling code can use this data structure for many different things, be it a mapping to zoo animals or the square-root of the data in D'.
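A minimal sketch of D' in Python, where each element is a (value, valid_flag) pair as described above; the concrete values here are purely illustrative.

    # Each element of D' is a tuple: (value, valid_flag), where a flag of 0 means
    # 'uninitialised, do not trust' and 1 means 'deliberately stored'.
    d_prime = [(0, 0)] * 5       # freshly created: every slot exists, but nothing is valid yet
    d_prime[1] = (1, 1)          # store the pointer to the elephant cage, marked valid

    value, valid = d_prime[3]
    assert valid == 0            # element 3 was never set, and it says so itself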

The calling program does not need to check the return value against some arbitrary value; all it needs to do is 'query' the data itself about its validity. To do this, the designer of the data structure will define access methods:

create_structure
get_data
put_data
is_data_valid
destroy_structure

We could extend this to:

get_element_before
get_element_after
set_element_before
set_element_after

We could also have:

is_array_full

By creating these access methods, the designer of the data structure is free to publish a standard way of interfacing with the data, while keeping the ability to change the design of the data storage at will. As long as the interface methods continue to behave as published, the calling program will not need to be changed.
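As a sketch of such an interface in Python (the method names follow the list above, the tuple-based internal storage is just one possible choice, and destroy_structure is omitted because the language reclaims memory automatically):

    class AssuredArray:
        """Published interface; the internal representation is free to change."""

        def __init__(self, size):                 # create_structure
            self._slots = [(0, 0)] * size         # (value, valid_flag) tuples internally

        def put_data(self, i, value):
            self._slots[i] = (value, 1)

        def get_data(self, i):
            return self._slots[i][0]

        def is_data_valid(self, i):
            return self._slots[i][1] == 1

        def get_element_after(self, i):           # circular: no boundary cases
            return self.get_data((i + 1) % len(self._slots))

        def get_element_before(self, i):          # set_element_before/after would follow the same pattern
            return self.get_data((i - 1) % len(self._slots))

        def is_array_full(self):
            return all(flag == 1 for _, flag in self._slots)

A caller written against put_data, get_data and is_data_valid never sees the tuples, so the internals can change without breaking it.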


If D is made of positive integers for use by the calling programs, then D' might be constructed of signed integers. It might take a little thought, but it should be clear that this is logically the same as storing tuples in D'. This is because a signed integer is simply a positive integer with an attribute: the 'negativeness' of the stored value conveys its validity.

However, if you adopt this idea, be aware that you will have to accommodate the idea of zero carrying a negative attribute. In number theory, zero is neither positive nor negative (and in integer arithmetic -0 is the same value as 0), so this could be a source of error. It's best to avoid it.

So let's consider two methods of design. The first is to use tuples internally, and the second is to use signed integers. Which method is best?

Even though the data-structure designer could use an access method to determine validity based on the negativeness of the value, a third-party programmer using the data structure could easily notice this feature and test the sign of the returned value rather than calling the method. While this would work at first, it might not work in the future, and we would still need to worry about what -0 means if we ever try to negate zero. So we need more rules.

The designer of the data structure could (and should) return only positive integers from its 'get' method but could store the data internally in a signed array. Now you should be able to see that good design permits change: whether tuples or signed integers are used as D' makes no difference to the calling programs. To push the example to the extreme, the data structure could be held by one-legged storks holding hand-written pieces of paper, while data validity could be sprayed onto each stork's beak in green or yellow paint. As long as the access methods remain standard, so that the stork-keeper returns integers and binary declarations of validity through the published interface, the calling program will still work as expected. It might be a little slow... and need feeding, but it should be logically sound.
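Returning to the signed-integer idea, here is a sketch of the same published interface with signed integers behind it. The offset-by-one trick used to dodge the zero problem is an assumption of this sketch, not something prescribed above.

    class AssuredArraySigned:
        """Same published interface as before, different internal storage."""

        def __init__(self, size):
            self._slots = [-1] * size             # negative means 'not valid'

        def put_data(self, i, value):
            assert value >= 0                     # callers only ever supply non-negative data
            self._slots[i] = value + 1            # offset by one so a stored 0 stays positive

        def get_data(self, i):
            # Return only non-negative values; a caller should still check validity.
            return self._slots[i] - 1 if self._slots[i] > 0 else 0

        def is_data_valid(self, i):
            return self._slots[i] > 0

Callers cannot tell the difference between this and the tuple version, which is exactly the point.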

No matter how the internal data is organised, the caller of the data would code something like this:

if is_data_valid(d) then
   c = get_data(d)
   do_stuff(c)

...and because of our robust design we can also code like this:

c = get_data(d)
if is_data_valid(d) then
  do_stuff(c)

These differ in efficiency but not in function. In the first case, validity is tested before the data is retrieved, which one might assume is cheap; in the second case, the data is retrieved regardless. Whether one is more efficient than the other depends on how the internal structure works. The data-structure designer could optimise by pre-fetching the data internally during the validation check, or validation might be a trivial operation once the data has been located.

It may even be reasonable for the data structure to keep track of whether the caller checked validity after each piece of data was handed out. The data structure could then be asked to supply assurance about how it was used.

For example after:

c = get_data(d)
x = get_data(y)
if is_data_valid(d) then
  do_stuff(c)
  do_stuff(x)
 

Assurance code might invoke a method provided by the data-structure designer:

print_validation_on_access

result: data y was accessed but validity was not checked.
 

The code would be corrected as follows:

c = get_data(d)
x = get_data(y)
if ( is_data_valid(d) and is_data_valid(y) ) then
  do_stuff(c)
  do_stuff(x)

Because the conditional is only true when both is_data_valid calls succeed, validity is assured for every piece of data used inside it.
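A sketch of that bookkeeping in Python; the class, the index-based calls and print_validation_on_access are illustrative names echoing the pseudocode above.

    class AuditedStore:
        """Remembers which elements were read without a later validity check."""

        def __init__(self, size):
            self._slots = [(0, 0)] * size          # (value, valid_flag)
            self._unchecked = set()                # indices read but never validated

        def put_data(self, i, value):
            self._slots[i] = (value, 1)

        def get_data(self, i):
            self._unchecked.add(i)                 # assume unchecked until proven otherwise
            return self._slots[i][0]

        def is_data_valid(self, i):
            self._unchecked.discard(i)             # the caller did check this one
            return self._slots[i][1] == 1

        def print_validation_on_access(self):
            for i in sorted(self._unchecked):
                print(f"data at index {i} was accessed but validity was not checked")

    store = AuditedStore(4)
    c = store.get_data(0)
    x = store.get_data(1)
    if store.is_data_valid(0):
        pass                                       # only index 0 was checked
    store.print_validation_on_access()             # reports index 1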

Asserts

If you design a function that computes the square root of a number, then primary-school mathematics will tell you that you cannot take the square root of a negative number. This is true within the domain of integers, real numbers (decimals) and rationals (fractions). It is not true for complex numbers, since these live in a domain that defines the square root of -1; therefore all complex numbers (even negative ones) have a square root. But while we are working with integers, it is not permissible to attempt to take the square root of a negative number.

This is where input validation is useful. If the function in question works with integers, and takes the square root, then the data passed in must be zero or positive. Otherwise an error should be thrown.

One way to do this is to use a function called 'assert'. It works like this:

squareRoot( x )
  Assert ( x >=0 )
  Do stuff

The assert does nothing if x is valid. If x is invalid, it causes program termination, an error-log entry, an electric current through the programmer's chair... whatever is appropriate. In a good environment it will throw an 'exception', which means that the calling program is redirected to an error-handling routine. In many cases the calling program can then execute an alternative piece of code.
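Here is the same idea as runnable Python; the exception type and the caller's recovery path are just one way to arrange it, and in production code a ValueError might be preferred because asserts can be disabled.

    import math

    def square_root(x: int) -> float:
        # Input validation: within the integers, negative inputs are not permissible.
        assert x >= 0, "square_root requires a non-negative argument"
        return math.sqrt(x)

    try:
        square_root(-4)
    except AssertionError as err:
        # The calling program is redirected here and can run an alternative path.
        print(f"rejected bad input: {err}")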

A function might also have to produce an answer that must be a certain value or lie within a range of values. Here is a silly(ish) example where a number is taken away from itself. The answer MUST be zero. If it's not, then 'something very bad' has happened to the computer.

takeaway( x )
  result = x - x
  assert (result == 0)
  return result


This might seem stupidly simple, but in fact x might be a complex data type. The code that does 'minus' might be dealing with a whole array of elements stored in a structure called x. Something external or otherwise unexpected might prevent x - x from returning exactly zero.

This could be the result of hacker interference, or a hardware bug. Perhaps the value of x is unstable in rare cases. There are many things that could go wrong under weird circumstances. Therefore, the 'assert' will at least return control to the overall program. However, we have to have faith in the ability of assert to identify zero properly.

Trustworthiness, Predictability, Conformance

Trustworthiness

If you use solid design techniques, the software is more trustworthy. By reducing the number of special conditions through design rather than through error-checking, the code is kept simpler, and simple code is less likely to contain bugs. When a bug is 'exploitable', it means that the program's behaviour when the bug is triggered permits a loss of control. When that loss of control makes it possible for a malicious third party to take control, real people using the software are at risk of losing real-world assets to criminals.

The trustworthiness of a software product depends on how well it has been designed and tested. Software that is designed well, and designed for possible change, is more likely to remain trustworthy than software that is simply hacked together and thoroughly tested. This is because changes to poorly written software introduce bugs.

Predictability

Predictability is the assurance that a program does what it is designed to do each time it is used in a particular way. An example where this is not the case would be a system that is 'ill-conditioned', meaning that massively different and unpredictable results can occur when the input is changed only slightly from a particular value or across a range of values. The random slow-downs you sometimes see on your computer are an example of unpredictable behaviour.

Conformance

Conformance means not only having predictability, but also complying with regulations and methodologies, which may include legislation. A good example is credit-card processing. It is possible to have a predictable, reliable, even secure credit-card transaction processing system that does not conform to the Payment Card Industry requirements. In this case, the perfectly good design is not legal to operate.

Similarly, it is possible to have a conforming program which is not predictable or trustworthy.

This is why there are three criteria: TPC.

A directory traversal exploit

Exploit example

Watch the video to the right. It has no sound or explanation, so I will explain what is happening. The black screen is a command window into which the hacker enters commands to connect to someone's file transfer system. This FTP system is a typical service found at many commercial enterprises. Companies expose part of their file system to the public to allow people to retrieve things like brochures, or to deposit files for troubleshooting problems, and so on. That's fine as long as the outside user may only deposit files into a certain directory and below. As long as the outsider cannot retrieve private information from the server outside what is allowed, the secret data is safe. However, this FTP server program was designed and written without the assurance and testing methodologies outlined by example above. The designer never anticipated that someone would deliberately try hundreds of ways to break out of the designated directory.

Here is an explanation of some of the commands used:

cd ../

This means go up one directory. It is blocked.

cd ..\

Use the DOS version of slash and try the same thing. It is blocked.

cd ..\/

That's not a letter V, it is \ followed by /

In UNIX/Linux, the backslash (\) is an 'escape' character that causes the character following it to be taken literally. So this is logically the same as ../

All those exploit attempts fail. If you watch the whole video, you will see that several trials reveal a vulnerability.

The video goes on to single out this command:

cd .../

That's three dots then a slash. In the file system, a single dot means 'the current directory' and two dots mean the directory above. Three dots mean nothing at all; it is not a valid command. However, the designer of the program probably took a shortcut and interpreted only the first or last two dots when implementing the command to go up a directory. The testers of the code probably only checked that ../ did nothing when already at the top (root) level. The hacker used an automatic script to try hundreds of combinations to break out of the directory structure.

Once this vulnerability was discovered, the hacker could get a secret file and inspect it.

This is a good example of a vulnerability that exists because testing and security were not properly designed into the application.
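The defensive counterpart is to canonicalise the requested path and then check it against the permitted root, rather than pattern-matching on '..' sequences. A minimal sketch in Python follows; the directory name is hypothetical.

    from pathlib import Path

    ALLOWED_ROOT = Path("/srv/ftp/public").resolve()   # hypothetical public directory

    def resolve_request(requested: str) -> Path:
        # Canonicalise first, so '.', '..' and symlinks are resolved before any check is made.
        candidate = (ALLOWED_ROOT / requested).resolve()
        # Then check containment against the canonical root, not against the raw string.
        if ALLOWED_ROOT != candidate and ALLOWED_ROOT not in candidate.parents:
            raise PermissionError(f"path escapes the public area: {requested}")
        return candidate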

Software assurance testing and methods aim to produce robust applications. Theoretically, if all applications were secured to the same standards as perimeter-hardened network equipment, the perimeter equipment would become one effective layer of defense rather than just the outer wall.

Bad people

Some unintentional software bugs are not exploitable. They might never get executed. If they are executed, then they fail without compromising the program.

Then there are bugs that can be exploited. Their existence is unintentional, but some tenacious hacker has found one and found a way to exploit it. We hear a lot about this in the popular news.

And there is the intentionally designed malware. At any time during the software life cycle, someone could put some naughty code in. It might be a back-door or a time-bomb or some other little gem that later allows access or control when the program is released.

SwA employs tools, procedures and methodologies to prevent this.

Especially in our modern world, data can be thought of as currency. The integrity of the data is what gives it worth. Data can be stolen, modified or deleted for profit. Preventing this at the lowest level is very important. Programmers and system designers can do more to install integrity checks on the data that the program collects.

Dynamic and static analysis

Static analysis is performed on the source code. The assurance tester uses manual review and assurance-testing tools to try to determine whether there are exploitable vulnerabilities, back doors, Trojans, spyware, or even malicious code disguised as a feature. It must be noted: this is a very hard problem to solve.

Dynamic analysis is performed on the running program, looking for assurance that data-driven attacks are ineffective. The popular press calls this 'penetration testing', or simply 'pen testing'. It may extend out of the realm of code into operational procedure; when it does, we are in the territory of 'social engineering', 'phishing' and 'scams'. Social engineering and the like prey on people's vulnerability to being tricked. In some cases, a better-designed application can prevent a human from doing something silly. A very well known example is the launch of a weapon with two-key initiation. If the design of the system demands that two individuals independently agree that a launch is necessary, it greatly reduces the risk that a single social-engineering attack on one individual will be effective.

Another example is that of two-factor authentication. In this example, a person uses a secret (typically a password) and a hardware token as two separate 'keys' for unlocking an account. While a social engineering attack might entice someone to reveal the secret password, they would also need to steal the token. 

Complexity

Today there must be trillions of lines of code in use. There are probably billions of unique lines of code exposed to the internet, and much of that code is not robust enough to face malicious attackers directly. Every month hundreds of new exploits are let loose, along with thousands of variants of existing ones. Each year the involvement of organised crime groups increases. Ordinary people are losing control of their email, their bank accounts and their unique identity, and criminals are doing this for profit.

Some perimeter security devices, such as firewalls and intrusion prevention systems, test every connection against up to a million signatures. I really have to ask how long this can continue. Will it be practical to test for 10 million signatures? 100 million? At what point does it become unmanageable?

SwA is scalable. Software assurance might seem to add to the cost of software development, but the return on the investment comes after the product reaches the market. There are fewer bugs; the program is more reliable, more predictable and more saleable. It is easier to enhance and improve. It is more secure, and so it will gain a greater following, which increases market share and profits.

Programmers should be acutely aware of these issues, but it is very tempting to take short cuts and declare that a later 'clean-up' will address potential issues. However, these often get left as they are and never corrected.

So it's a good idea to find some automated tools that can scan for potential problems, and perform code audits and walk-throughs.
