Confidentiality, Integrity, and Availability. How they affect your network data security.
If ever we need an example of why this is important, please read http://hubpages.com/hub/Stuxnet-A-major-military-strike-using-a-virus-through-cyberspace
Security basics in a digital age.
Original work. Sept 2009. Revision 1.0
Although what I am about to explain has become very important in the digital age, the principles are valid throughout history. Therefore, I will introduce the topic from a historic perspective.
To run a business (or a war), you must protect your information. There are three separate but related topics that embody most of what you need to consider. As the title suggests, these are:
- Confidentiality
- Integrity
- Availability
I will describe what each of these means, why and when you need it, and give some examples in lay terms of how it might be imposed.
I will also explain simply, but without compromising accuracy, just how your data can be stolen despite the use of strong security measures. Digital security is hard. Don't let anyone tell you otherwise. The entire e-commerce system is complex, and complex systems are bound to contain flaws.
Shhh! Don't tell anyone. It's a secret.
In ancient Greece, two interesting methods of sending secret messages were used. One was to engrave a wooden tablet with a message, then cover it in wax. To the guards, this looked like a blank writing tablet. As long as the guards were not alerted to the trick, a plainly legible but hidden message could slip by without raising an alarm. Another trick was to shave a messenger's head, write a message on his scalp and let the hair grow back. Once the messenger arrived, the recipient shaved his head again to read the message. Obviously this is a slow method, and should the guards learn of the trick, the messages could easily be intercepted.
However, if intercepted, in both cases the message medium would arrive at its intended place altered. The most likely outcome is that the messenger was killed. If he was set free, then the messenger could relay what happened, but he could also be blackmailed into covering up the interception. With the tablet, the man in the middle would need to replace the wax in such a way that it did not look like it had been scraped off. This could be difficult. In the second case, it would take a while for the messenger's hair to grow back, and this would cause a delay in delivery.
In either case, a new messenger might be substituted. An impostor who looks like the real messenger could arrive, and the message would have lost its integrity.
If he was killed, then the system loses availability.
Hiding the message is a crude method of confidentiality because the people who would like to discover a hidden message need only look for it. Once found, it is readable.
Hiding in plain sight.
The methods of hiding a readable message under someone's hair, or behind the wax on a blank tablet, are just two kinds of steganography. Sometimes this method is called "hiding a message in plain sight". It might seem elementary and crude, but steganography can be very useful even today. Risk is only significant if there is a vulnerability, a threat, a cost, and a period of exposure. Clearly, an ancient Greek messenger is vulnerable by virtue of being identifiable as a messenger - perhaps by carrying a scroll. This display of messenger-like behaviour attracts the attention of a guard, and therefore the guard is a threat. The messenger's risk level is high because he is likely to be killed or blackmailed if caught, and he is exposed every time he looks like a messenger.
Steganography reduces his risk level because he can carry a message without looking like a messenger. This reduces his exposure to threat.
The biggest downside to this method is that the message is not protected in any way apart from being hidden. To be useful, you need to hide the message extremely well.
By hiding his message, the Greek reduces his risk
One surprisingly effective method of secretly sending a message is via the ordinary postal service. There are billions of letters and parcels, handled by changing staff; each parcel is carried via a potentially different route; and it can be addressed to a reasonably anonymous and possibly varying postal box or physical address. It is therefore quite easy to send a message from one top secret government agency to another without that transaction being noticed or intercepted. In contrast, when a top secret government agency sends a message over a private network, the scope of the network is limited, its route is deterministic, and the time of carriage is easily detected. This is like painting a target on it. I am not saying that private secure channels are not used or are useless - it's just that their underlying mechanisms must be very strong because they almost advertise their importance.
But a secure channel can be stuffed with false data. If a data link has "something" encrypted going through it at all times, then who knows if and when a secret message is transmitted? Furthermore, a fully used encrypted channel presents a large cryptanalysis data set, most of which results in wasted analysis time.
So we have a complex public network where a simple, single unobtrusive package is sent, and a simple private network, where a complex multiple-decoy traffic stream is used. In either case, a form of steganography (hiding in plain sight) is in play.
More about encryption.
Encryption is different to steganography, and as we have seen with the stuffed channel, the two may be used at the same time. Each contributes mostly to confidentiality.
The purpose of encryption is to take a plainly readable message (plaintext) and contort it into unintelligible gibberish which, statistically, looks as close to random as possible. That's fairly easy to do, because all you need to do is mix it with a stream of random bits of information, and the randomness is preserved (if you do it right). To read the message, you need to use the same random stream and do the mixing again. This is called the one-time-pad and it is one of the only provably uncrackable encryption methods. It has a huge downside though. The random stream must be distributed to the recipients, and doing so is just as difficult as sending the plaintext, with the slight advantage that you can pre-distribute the pad of random bits. However, if any of these streams is used more than once, then the encryption may be broken. This is why it is called the one-time-pad, and not the many-time-pad.
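As a rough illustration of the mixing described above, here is a minimal Python sketch of a one-time-pad. It assumes the "mixing" is a bytewise XOR, which is the usual textbook choice; the function name and the sample messages are made up for the example.

```python
import secrets

def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR each message byte with the matching pad byte."""
    if len(pad) < len(data):
        raise ValueError("the pad must be at least as long as the message")
    return bytes(d ^ p for d, p in zip(data, pad))

plaintext = b"Attack at dawn"
pad = secrets.token_bytes(len(plaintext))   # random pad, meant to be used once

ciphertext = xor_bytes(plaintext, pad)      # encrypt
recovered = xor_bytes(ciphertext, pad)      # decrypt: same operation, same pad

# Why the pad must never be reused: XORing two ciphertexts made with the
# same pad cancels the pad and leaves the XOR of the two plaintexts.
other = b"Retreat at six"
leak = xor_bytes(xor_bytes(other, pad), ciphertext)
```

Here `leak` contains no trace of the pad at all, which is the reason a pad that has been used twice no longer protects either message.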
The focus today is not on the one-time-pad even though it is unbreakable if used correctly. Cryptologists are focusing on how to use a small key instead and apply it to the whole stream without introducing a practical vulnerability. Ideally, if no theoretic vulnerability can be found by the best cryptanalysts in the world, then it is likely to be a better algorithm.
Encryption techniques positively benefit from being publicly scrutinized. Good modern encryption MUST be publicly scrutinized. In all cases, encryption involves two parts.
- An algorithm. (How to do the encryption)
- Something kept secret. (A unique key which allows decryption)
The algorithm should be publicly known - like a lock mechanism. People who dedicate their careers to the task are then able to examine the algorithm and look for ways that it might be broken. The mathematics behind encryption is NOT based on a known proof. This is why all encryption techniques are vulnerable to cryptanalysis. (Except the one time pad and another that I will discuss later).
Many encryption techniques are based on what the pure mathematicians call a hard problem. This may be one of the most understated descriptions in all of science, since, a hard problem to a pure-mathematician seems downright impossible to mere mortals. Nevertheless, a proof is an unshakable, rigorous logical certainty, while a hard-problem leaves open the possibility of a tiny crack or hope (or fear!) of being made simple.
Much of modern cryptography hinges on the seemingly ridiculously difficult task of factoring large numbers. By 'large' I mean stupidly large. For example $10,000,000 would normally be called a large number of dollars, while 10,000,000,000,000 (and 90 more zeros) is a suitable example of a stupidly large number.
It is a fact that every composite integer may be decomposed into smaller numbers which, when multiplied together, give you that number. For example, the following pairs of numbers both multiply to 12:
2 × 6, 3 × 4
and the following triple also multiplies to 12:
2 × 2 × 3
Some of those factors are prime numbers. It is a fact that all non-prime numbers may be decomposed into prime factors; for example, the prime factors of 12 are 2, 2, 3. The difficulty of identifying the factors of a composite number depends on a couple of things. The first is how many factors there are, and the second is how big the numbers are in the first place. If there are more factors, then the job is easier than if there are only two.
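To make the idea of prime factors concrete, here is a small Python sketch using trial division, the simplest factoring method there is. It is only practical for small numbers; the function name is my own.

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n (n >= 2) in ascending order, by trial division."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:      # divide out each prime as often as it fits
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:                        # whatever is left over is itself prime
        factors.append(n)
    return factors

print(prime_factors(12))      # [2, 2, 3]
print(prime_factors(10403))   # [101, 103]
```

The loop has to try divisors all the way up to the square root of the number when there are only two large prime factors, which hints at why the work explodes once the numbers become stupidly large.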
We can imagine two very large prime numbers, which are multiplied together to create an enormous number which is hard to factorize. There are only two factors to be found in this difficult search because they are the only two factors; and, being prime, cannot be factored further.
There is some difficulty in finding large prime numbers, although luckily this is not as hard a problem as factorisation. But it has its challenges.
There are an infinite number of primes (Euclid). Mathematicians have long tried in vain to predict the next prime number from the sequence of those found before, but there seems to be no simple way to do this.
A composite number is a whole number greater than 1 that is not prime; it can be written as a product of smaller primes.
For relatively small prime numbers, you can look them up from previous calculations or use what is called a sieve to generate them. But a sieve becomes too inefficient as the primes desired become large. All is not lost, because other algorithms may be used to test whether a number is prime. One kind is deterministic, which means that it works without error, and the other is probabilistic, which assigns a probability that a number under test is prime. Given a large composite number to test, a modern probabilistic algorithm has a small chance of declaring it prime, but given a number that is actually prime, it will always correctly declare it prime. This drawback is workable for cryptography, as it is sufficient to find large numbers that are almost certainly prime, as opposed to the much more expensive task of proving that one is certainly prime. For example, if the Miller–Rabin primality test is run 30 times on a given composite number, it will wrongly declare it prime at most .000000000000000086% of the time.
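The following Python sketch shows one way the Miller–Rabin test can be written, with the 30 rounds mentioned above as the default. The function names are my own, and the standard `random` module is used only to keep the example short; a real key generator would draw its numbers from a cryptographically secure source.

```python
import random

def is_probable_prime(n: int, rounds: int = 30) -> bool:
    """Miller-Rabin test: False means certainly composite, True means probably prime."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):       # quick check against small primes
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)   # random base for this round
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # this base proves n is composite
    return True

def random_probable_prime(bits: int) -> int:
    """Try random odd numbers of the given size until one passes the test."""
    while True:
        candidate = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate
```

Each round can be fooled by a composite number at most one time in four, which is where the tiny error figure for 30 rounds comes from.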
In the old days, finding prime numbers was a tedious job that had to be calculated by hand. All large modern prime numbers are found using computers.
Someone could intercept the ancient Greek messenger, shave his head, read the message, "Attack at dawn", kill him, substitute a look-alike with a new message, "Retreat", wait for his hair to grow back, and send him on his way. The message has lost integrity. The recipient needs to know whether the message is in its original form. In this case, it takes a fair while to grow hair, and so the recipient could get suspicious that the message took so long to arrive. But the interceptor could counter this by preparing many messengers with various bogus messages to suit various situations. The sender could counter this by marking the messenger with some kind of signature that is difficult to reproduce, or fades with time. The interceptor could develop techniques to improve the signature forgery, then the sender could send the message in multiple parts, via alternate routes.
As seen above, this process is an arms race. It evolves in the same way as predator-prey relationships in the natural world. Today, we have ever-growing sophistication in both the legitimate and the malicious techniques.
The modern method of ensuring integrity of a digital stream of data is again a mathematical construct. Encryption seeks to hide a message and provide a way to recover it (by using the secret key). Ensuring integrity has no need to obfuscate the message. The technique requires a method of producing a small tag which is unique to a particular message. This is called a hash.
A hash is a fixed length number which is calculated from the contents of the message that is to be protected. It sounds simple enough to do but there are some strict criteria.
There are two classes of hashing functions. One is capable of flagging changes to data if the changes are caused by a non-intelligent event. Typically this would be electromagnetic interference or cross-talk in adjacent wires. The second must perform the same function, but also protect against an intelligent adversary. It is relatively easy to create a good hash function that is strong against a random event, but difficult to create one that resists the power of human invention and logic to make it fail. The latter is known as a cryptographic hash, while the former is often called a checksum. Similar to the checksum is the hash used in computer programming (for hash tables, for example), which is designed with different criteria again. Unfortunately the computer programmer's hash shares its name with the cryptographic hash, so you need to read the term in context.
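The difference between the two classes can be seen in a few lines of Python. The checksum below is a deliberately simple toy of my own (the sum of all bytes), not any particular real-world checksum, and SHA-256 stands in for the cryptographic hash.

```python
import hashlib

def additive_checksum(data: bytes) -> int:
    """A toy checksum: the sum of all bytes, modulo 256."""
    return sum(data) % 256

original = b"Pay Bob 19 dollars"
tampered = b"Pay Bob 91 dollars"   # two digits swapped on purpose

same_checksum = additive_checksum(original) == additive_checksum(tampered)
same_sha256 = hashlib.sha256(original).digest() == hashlib.sha256(tampered).digest()
print(same_checksum, same_sha256)  # True False
```

A random burst of line noise is unlikely to swap two bytes so neatly, but an intelligent adversary can do it on purpose, and the toy checksum never notices.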
For a cryptographic hash:
- It must be computationally infeasible to find a different message that produces the same hash. Such a match is called a collision.
- There should be no particular restriction on the size of the message.
- Given only the hash, it must be computationally infeasible to find a message that can be used to generate that hash.
- For practical reasons, the hash must be easy to compute.
Under some circumstances, a hash function might be known to produce a collision. If these are well defined, and limited in scope, then the hash function might still be useful if those particular collisions are actively avoided. However, it is very important that the implemented function be highly resistant to collisions under the intended implementation and use.
When a hash function meets the above criteria, then any minor or major change in the message produces an unpredictably wild jump in hash value. For example, here are two hash values produced from this paragraph, one with a full stop at the end, and one without.
With full stop: 752df9d66886520829865739a0e9fafb
Without a full stop: d265ddad8eace2f3715a4c9166c7f2c6
I used a utility called md5sum to create those hashes.
So if I wished to send that paragraph to you, I would also deliver the hash result. You would re-compute the hash from the message and compare the results. If they match, then you can be very sure that the message is intact. If they don't match, then either the hash or the message has been altered.
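The send-and-compare procedure can be sketched in Python with the standard `hashlib` module. MD5 is used here only because it matches the md5sum utility mentioned above; `hashlib.sha256` can be dropped in the same way. The helper names and the sample sentence are my own.

```python
import hashlib
import hmac

def md5_hex(message: bytes) -> str:
    """Hex digest of the message, in the same format md5sum prints."""
    return hashlib.md5(message).hexdigest()

def verify(message: bytes, expected_hex: str) -> bool:
    """Re-compute the hash of the received message and compare it to the one delivered."""
    return hmac.compare_digest(md5_hex(message), expected_hex)

with_stop = b"The quick brown fox jumps over the lazy dog."
without_stop = b"The quick brown fox jumps over the lazy dog"

sent_hash = md5_hex(with_stop)           # delivered alongside the message
print(sent_hash)
print(md5_hex(without_stop))             # one character removed, wildly different hash
print(verify(with_stop, sent_hash))      # True: message intact
print(verify(without_stop, sent_hash))   # False: message (or hash) was altered
```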
One more provably secure method of encryption.
Quantum encryption is a relatively new technique. Note there is a little contention about the use of the words quantum encryption since the practicality of the current systems seems mostly fixed upon securely sending a key which is then used in a classical way, and many prefer to say "Quantum Key Distribution".
There are a few hardware appliances purchasable today which can do this.
Two of these were reported in 2005 by Technology Review.
At present, normal encryption using a symmetric key is still mathematically very strong but there are signs that the methods used to secretly distribute these keys are in danger of being thoroughly broken. If so, then we will need a better way to distribute keys. This is where quantum physics can help.
All present, common methods of key distribution involve classical physics, and there is one outstanding property of a classical solution that is a problem: The keys may be copied without leaving a trace of the theft.
QKD deserves a full article and here, I will just give a few hints. The details are more involved.
In quantum key distribution, when a key is read, the success of reading it has a certain probability of being in error, so it is sent several times until the effect of the read errors is negligible. Nominally 50% of the bits are unusable, which reflects the quantum nature of the particles, and there is a further error due to inaccuracy of measurement and decoherence in the channel. The recipient and sender compare the prepared states of the key when it was sent to the measured states when received. They can do this over an unencrypted link as they are only discussing success/failure of individual measurements. Once a quantum key is read, the quantum states are disturbed and no longer describe the key. An interception predictably increases the error, and it is this which acts as a red flag to the intended recipient. If that is the case, then the key is discarded, since any error rate higher than that expected in the channel is regarded as an interception.
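The effect of an interception on the error rate can be imitated with an ordinary random number generator. The Python sketch below is a highly idealised model of the well-known BB84 scheme, which I am assuming here since the text does not name a protocol: there is no channel noise, and the eavesdropper measures and re-sends every particle. All names are my own.

```python
import random

def bb84_error_rate(n_bits: int, eavesdrop: bool, rng: random.Random) -> float:
    """Simulate an idealised BB84 exchange; return the error rate in the kept bits."""
    kept = errors = 0
    for _ in range(n_bits):
        bit = rng.randint(0, 1)            # sender's key bit
        send_basis = rng.randint(0, 1)     # sender's choice of preparation basis
        state_bit, state_basis = bit, send_basis

        if eavesdrop:
            eve_basis = rng.randint(0, 1)
            if eve_basis != state_basis:
                state_bit = rng.randint(0, 1)   # wrong basis: random result
            state_basis = eve_basis             # particle is re-sent in Eve's basis

        recv_basis = rng.randint(0, 1)     # recipient's choice of measurement basis
        if recv_basis == state_basis:
            measured = state_bit
        else:
            measured = rng.randint(0, 1)   # wrong basis: random result

        if recv_basis == send_basis:       # only matching-basis bits are kept
            kept += 1
            errors += measured != bit
    return errors / kept if kept else 0.0

clean = bb84_error_rate(4000, eavesdrop=False, rng=random.Random(1))
tapped = bb84_error_rate(4000, eavesdrop=True, rng=random.Random(2))
print(clean, tapped)
```

In this model about half the bits are thrown away because the bases did not match, the untapped link shows no errors in the kept bits, and the tapped link shows errors in roughly a quarter of them.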
The physics involved with this leads to a theoretically unbreakable cryptographic tool, but in practice, either due to cost, implementation or efficiency considerations, there are potential ways to compromise the method once put in practice. Presently, the appliances are fairly expensive, but hopefully this exciting technology will become more common.
Encryption and hash functions do not help very much with data availability. This is the domain of backup strategies, storage area networks (SANs), tape drives, high availability clusters and so on. But there is little point in encrypting and protecting your data if it is inaccessible. The 'A' part of the CIA triad is very important. You can have a backup strategy, implement it, and yet still be vulnerable if you never actually test the result. Just because data is sent to a tape drive does not necessarily mean that it can be restored. There might be something wrong with the tape drive, but not wrong enough to stop it from apparently working. If you leave your backup media in the same room as the computers, then even if a restore has been tested, it will do no good if it burns in a fire along with the data center.
Backup media is also vulnerable to theft or compromise especially while in transit. You might go to great lengths to encrypt your data over the internet, but neglect to protect it adequately while it is transported to a vault. Backup media can contain passwords in plain text, especially those communicated in email. They can reveal hashes of passwords that can be used as the basis of a dictionary or brute force attack. Your backup strategy requires physical and logical security measures from the moment the media is written, to the point where it is stored.
A disaster can strike, like fire, flood, wild weather, falling trees, loss of power, widespread utilities strike, and no doubt other bad things like political unrest. This kind of data loss can only be managed by building an alternate site. This site should be geographically displaced far enough that it is unlikely to suffer from the same disaster at the same time. The disaster site can be cold, warm or hot. A cold site houses all the required equipment and data but is not switched on. It takes space, costs money to maintain, but is not using as much power. A cold site takes longer to bring on line, and can sometimes be tested in isolation. A hot site has a continuous updated feed of day to day data. It is permanently active, and ensures a reliable and fast switch since all that is required is staff relocation. A warm site is a compromise between the two. Like the simple tape-backup, a DR plan must be tested for all logistic, procedural and practical efficiency. This topic is actually quite expansive and there are companies that specialize in helping to set up DR sites.
Another high risk to data availability is the so-called denial of service (DoS) attack. This is sometimes motivated by politics or carried out by highly opinionated individuals. A DoS attack may also be performed for bragging rights or as part of a technique to silence some kind of security device. The classic low-level DoS attack is called a SYN flood. This is used to disable a machine's ability to communicate via TCP/IP connections. Unfortunately, the DoS attack is not easy to defend against or recover from because it is not necessarily a small-scale intrusion that might have a recognizable signature. DoS attack complexity ranges from sending a single packet to a vulnerable server to engaging 100,000 computers to simultaneously request services from an internet-facing machine. In the former case, a single-packet DoS attack is easily recognized and blocked, but only if there is a security device to filter it. This is not often the case inside a corporate network. So an already-compromised machine could be activated to disable a security device, or to disable a reliable information source for the purpose of injecting false data (this breaks integrity). A distributed denial of service (DDoS) attack can simultaneously activate thousands of already compromised computers to target a single site. In this case, each individual computer is asking for a response that the site is actually designed and bound to give. As the site must try to respond, there is no easy defense. Large and important sites usually have huge capacity, and some reasonably effective defenses. One of those defenses is resistance to the SYN flood. A SYN flood is easy to understand. This is how it works:
Under normal TCP/IP communications, a session is set up much like a telephone conversation:
caller : "ring ring"
answer : "hello"
caller : "can you talk right now?"
answer : "Sure - what do you want?"
In TCP/IP, the session is started like this:
caller : SYN
answer : SYN-ACK
caller : ACK
At this point, the conversation proceeds.
A human could call a company many times and not continue a conversation, and not hang up. This would quickly tie-up all the lines until the receptionist decides to give up on some of the half-established calls.
A computer has the same problem. If an attacker sends many SYNs and never sends ACK, then it can tie up all the resources of the recipient's machine. Ordinary machines reach this limit quite quickly. A firewall or other security device is configured to have a much longer half-open connection queue, and to recognize the characteristics of a SYN flood and start to clear resources using an intelligent method. This protects the more vulnerable machines behind the firewall. But high traffic flows could present a similar profile, and so in some cases, the parameters in the firewall's SYN flood defense must be tuned to suit the company's traffic.
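The half-open queue can be modelled in a few lines of Python. This is a toy model of my own, not real networking code: the class name, the queue size of 5 and the 30-tick timeout are all made-up values chosen to keep the example small.

```python
class Listener:
    """Toy model of a server's queue of half-open connections (SYN seen, no ACK yet)."""

    def __init__(self, backlog: int, timeout: int):
        self.backlog = backlog
        self.timeout = timeout
        self.half_open = {}          # client -> time its SYN arrived
        self.established = set()

    def expire(self, now: int) -> None:
        """Give up on half-open entries that have waited too long."""
        for client, started in list(self.half_open.items()):
            if now - started >= self.timeout:
                del self.half_open[client]

    def on_syn(self, client: str, now: int) -> bool:
        """Return True if a SYN-ACK would be sent, False if the SYN is dropped."""
        self.expire(now)
        if len(self.half_open) >= self.backlog:
            return False             # queue full: new callers are turned away
        self.half_open[client] = now
        return True

    def on_ack(self, client: str) -> bool:
        """Complete the handshake for a client that is waiting in the queue."""
        if client in self.half_open:
            del self.half_open[client]
            self.established.add(client)
            return True
        return False

server = Listener(backlog=5, timeout=30)
for i in range(5):                   # attacker sends SYNs and never sends ACK
    server.on_syn(f"spoofed-{i}", now=0)

legit_during_flood = server.on_syn("real-client", now=1)    # False: queue is full
legit_after_timeout = server.on_syn("real-client", now=31)  # True: stale entries expired
```

Tuning a firewall's SYN flood defense amounts to choosing values like `backlog` and `timeout` so that an attack is cleared quickly without turning away a legitimately busy day.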
This is just one example of a DoS attack. It is not limited to electronic means, as we saw too many times in the 1970s when striking employees picketed factory gates.
Why do a DoS attack?
Sometimes, as in the case where SCO was taken offline by a DoS attack, it is hacktivism. The word is a contraction of hacking and activism. Many people around the world wanted to protest about the lawsuits that SCO was threatening against the general community. This happened in December 2003. SCO suffered nearly 5000 connections a second and the attack chewed through 20 Mbits/second (both ways).
You can read more about it here.
But there are now other reasons to perform a DoS attack. A simple one is performed by many viruses and other malware. I've seen, for example a Trojan that disabled all known anti-virus update sites. This worked by populating a file called "hosts" with records similar to this:
184.108.40.206 f-secure.com
220.127.116.11 www.f-secure.com
18.104.22.168 ftp.f-secure.com
22.214.171.124 ftp.sophos.com
126.96.36.199 liveupdate.symantec.com
188.8.131.52 customer.symantec.com
184.108.40.206 dispatch.mcafee.com
220.127.116.11 download.mcafee.com
18.104.22.168 rads.mcafee.com
22.214.171.124 mast.mcafee.com
126.96.36.199 my-etrust.com
All those entries of an IP address of 188.8.131.52 are bogus and this is easily confirmed in two ways:
First, when we try to find out what company published a reverse record for 184.108.40.206 then there is no response. This is not proof that it is a bogus IP address, but it's a clue (so is the fact that it is used so many times in the host file and it just looks fake).
nslookup 220.127.116.11
Host 18.104.22.168.in-addr.arpa. not found: 3(NXDOMAIN)
Second, if we look up one of the legitimate domain names in the host file:
You can see that the real IP address for f-secure.com is actually 22.214.171.124. When your computer tries to look up f-secure.com, before it uses a slower external DNS lookup, it first checks if there is an easy answer in the host file. If it finds an answer, then it tries to use it. But in this case the host file has been hijacked, and there is nothing at 126.96.36.199.
This is a form of DoS attack because your machine is denied the ability to look up the IP addresses of anti-virus companies.
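A hijacked hosts file of this kind is easy to spot with a short script. The Python sketch below flags any single IP address that has been assigned to several anti-virus vendor names. The function names, the threshold of three and the sample text are my own assumptions, and 192.0.2.1 is a placeholder address reserved for documentation, not the address the Trojan used.

```python
from collections import defaultdict

def parse_hosts(text: str) -> dict[str, str]:
    """Map each hostname found in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()    # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping[name.lower()] = ip
    return mapping

def suspicious_entries(mapping, watched_domains, threshold=3):
    """Return {ip: [hostnames]} where one IP is given to many watched vendor names."""
    by_ip = defaultdict(list)
    for name, ip in mapping.items():
        if any(name == d or name.endswith("." + d) for d in watched_domains):
            by_ip[ip].append(name)
    return {ip: names for ip, names in by_ip.items() if len(names) >= threshold}

sample = """\
127.0.0.1 localhost
192.0.2.1 f-secure.com
192.0.2.1 www.f-secure.com
192.0.2.1 ftp.sophos.com
"""
flagged = suspicious_entries(parse_hosts(sample), {"f-secure.com", "sophos.com"})
print(flagged)
```

On a real machine the text would be read from the hosts file itself, whose location depends on the operating system.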
With the anti-virus updates out of the way, the virus or Trojan down-loader has more chance to do damage.
Today, one of the primary intents of a virus, Trojan, backdoor, or down-loader is to quietly turn your innocent little PC into a willing participant in a DDoS attack.
Even if you are aware and use the principles of the CIA triad, there are still methods that criminals can use to steal from you.
Unfortunately for the goal of security, a very high percentage of the world uses a common operating system. This is something that Dan Geer called the monoculture. For several reasons these common operating systems and certain programs that run on them suffer common vulnerabilities. These are now systematically and ruthlessly targeted by organized, mafia-style crime syndicates. There is a particularly effective source in Russia. The main goal of these groups is ill-gotten financial gain, and ordinary people are the target. To take advantage of the operating system and application exploits, the crime groups use various methods to lure people to visit infected websites. If you visit one with a vulnerable web browser, then that very act silently installs some malware. What happens next varies, but one method is to install a program that hides in your computer waiting for the opportunity to download a more effective piece of software. In some cases this could be a backdoor which will allow someone or some internet-based program full control of your computer, or it could install a key logger.
A software based key logger records your actual keypresses and later uploads them to a criminal.
This compromises the Confidentiality of your system. It does not affect Availability, and if it did, you may become suspicious. If you visit a banking website, or make a credit-card payment on line when a keylogger is installed, then any encryption is irrelevant because the key presses can be replayed at will from the data collected. From this, you are unknowingly releasing passwords, credit card details, personal details, love letters, web searches and more.
Some of this information - for example a credit card number - might sell on the black market for as little as $15. But the people who buy this and other information do so in the hope that some of it can be used to purloin many thousands.
To counter this, some banks insist that customers use two-factor authentication when logging on. This effectively disables the keylogging replay attack because you need to enter a user ID, a password, and a pass-number that is only used once.
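One published way of generating such use-once pass-numbers is HOTP (RFC 4226), where the token and the bank share a secret and a counter. Whether any particular bank uses this scheme is an assumption on my part; the Python sketch below only shows why a replayed number is rejected. The `Verifier` class is a made-up illustration, and the secret is the test value from the RFC.

```python
import hashlib
import hmac
import struct

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """One-time password as defined in RFC 4226 (HMAC-SHA-1, dynamic truncation)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                       # last nibble picks 4 bytes
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

class Verifier:
    """Server side: accepts each pass-number once, then moves the counter on."""

    def __init__(self, secret: bytes):
        self.secret = secret
        self.counter = 0

    def check(self, code: str) -> bool:
        if hmac.compare_digest(code, hotp(self.secret, self.counter)):
            self.counter += 1                     # this number can never be used again
            return True
        return False

secret = b"12345678901234567890"                  # test secret from RFC 4226
bank = Verifier(secret)
typed = hotp(secret, 0)                           # what the customer reads off the token

first_try = bank.check(typed)                     # True: genuine login
replayed = bank.check(typed)                      # False: a recorded number is useless later
```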
Unfortunately, this is not the end of the arms race. The latest keyloggers look for specific banking activity, and then send the keypresses in real time over the internet as you enter the token's one-time password. The criminal uses it as you are logging in, to gain access to your account at the same time. Depending on the bank's setup, this might lock you out and allow the cracker access because the bank may not be able to tell which source is legitimate. However this works, once the criminal has control, any direct credit transfer utility you have is used to move as much money as possible out of your bank account.
If all else fails...
If a criminal can't be bothered to use a sophisticated hack to get your valuable information, or if the current defense of the day is just a little too hard, then they have one more trick. Just ASK for it!
Yes - It's that simple.
More than 90% of all email is unwanted. This spam is mostly low-quality advertising that is an annoying time waster for most people, but some of it is classed as phishing.
Phishing is a form of social engineering. Here is a typical phishing email:
eBay Account Verification
Fri, 22 Jul 2008 09:39:40 -0700
These emails are made to look legitimate, and some of them are quite convincing. They include links to warnings about fraud and links to policy, they fake the email headers, and so on. It's the link that you visit which causes the damage. These sites are cloned from the target institution to make them look real, and as you enter your details, they swallow your login credentials, and sometimes even redirect you to the real site.
As you can probably gather, there is a lot more to say about CIA in the digital age but hopefully, this has explained a few of the concepts. There is a lot that I have necessarily missed out, but what is included should be accurate.