Passwords don't work. Choose better authentication and identification.
S T O P P R E S S
7 June 2012:
Millions of LinkedIn password hashes stolen by Russian hackers.
Passwords don't work for many reasons. Here is a good one:
Article by Jeremy Lee (C) 2011.
Security bungle exposes 450 NZ Labor supporters
Darren Pauli | Jun 15, 2011
The so-called "hack" consisted of someone simply doing a Google search and reading the Google cache. The passwords had been 'spidered' and were in plain text. I could rest my case... but please read on.
... and another
"The Israeli newspaper Haaretz reports that the Syrian President, aides and staffers had their email hacked by Anonymous, who leaked hundreds of emails online. Reportedly, many of the accounts used the password '12345'."
RNG stands for Random Number Generator, and 'random' means an evenly distributed choice from a field of possible values. We are asking people to act like random number generators, and they just can't do it. If I ask you to think of a colour, there are millions of named colours to choose from, but you will pick one of red, green, blue, yellow and a few others most of the time. It's the same for passwords. People can't pick good passwords.
But a breach might be your fault too (apparently)
This clause is from the user agreement for Go Mastercard on-line service centre.
3.3 When selecting your password you must not select:
(a) a numeric code which represents your birth date; or
(b) an alphabetical code which is a recognisable part of your name.
Note: You may become liable for unauthorised transactions if you or your additional cardholder contravenes this requirement - see 9.2.
This is just one reason why you need to read those agreements.
In the press
Here is a summary.
- Sony got hacked.
- Passwords got published
- You can do some statistics on them
- Password is in a common password dictionary : 36%
- Password is less than 8 characters long : 50%
- The same password was used on Sony and Gawker : 67%
- Password is all lowercase alphanumeric less than 10 characters : 82%
- Password contains one or more symbols that are not letters or numbers : 1%
Ever since I started studying password selection, nothing much has changed. Passwords don't work simply because people don't or can't follow the rules. The problem is always blamed on the individual, but I conjecture that it's not humanly possible to make passwords work for a group of more than 10 smart, dedicated, security-conscious professionals. My reasons should become clear as you read on.
...and this Sony breach of security is just one of scores of reports that I've read over the last ten years since I promised to write this article. Nothing has changed my mind. PASSWORDS DO NOT WORK.
In this article, I will help you understand what a password is, how it is poorly implemented, how it could be better implemented, and why they don't work in either case. The reasons are partly technical, mostly psychological, and statistical.
Have you heard the advice: "Don't show your password to anyone"?
What, exactly, is a password?
Well, here is the first mistake. We stupidly call it a password. "Please choose a password". OK - how about "Sesame", or if you are feeling like a smartass, how about "iftaH ya simsim". No score. It's the same weak choice, only in a different language.
In the story, Ali Baba and the forty thieves, the treasure is held in a cave, the mouth of which is sealed by magic. We have no magic. Passwords rely on a mathematical assumption, nothing more.
Perhaps if we had originally not used the term 'password', and instead used, 'secret symbols' then people would instinctively choose more secure symbols. But we still have a problem. Traditionally, a password must be something that only one person knows. (There is another issue involved with this that I will soon explain.)
Since a person must keep the password -- or secret symbols -- in human memory, then it needs to be memorable. This is a fatal flaw in the password system. Naturally, an individual is tempted strongly to use a familiar word. Many people use 123456 or 'password' or 'letmein' (Let me in).
Today, we are forced through irritating and persistent hunter-gathering behavior of the various vendors out there to create a username and password for this site, and username and password for that site... Oh - you want to read the news? Sign up. You want to buy an air ticket? Sign up. You want to book a hotel? Sign up. Use an auction site? Sign up... Pay the electric bill? Sign up. Use internet banking? Sign up. Enjoy the hobby forum? Sign up. Make a comment on a blog? Sign up.
The typical internet user has to make the choice, "Sign up" or not participate. This can occur in hundreds of individual circumstances.
All the security advisors say, "Use a unique password that is not in a dictionary, add symbols and numbers and use mixed case." This is great advice, but it's absurd. It's unworkable, and everyone knows this. I am amazed that 'passwords' are still used under the flawed assumption that people will follow these rules and create a password like "7&hhPsq!@;.3".
Oh - and don't forget to change this every week or month. Oh - and never use the same password for two accounts. Oh - and don't recycle them, or use a system, or write them down, or tell your friends, or give them to anyone at all. Oh - and follow these rules for fifty to one hundred accounts.
IT CAN'T BE DONE.
People just cannot follow these rules. It's impossible, and that's why passwords don't work. They never have, and they never will.
Even if you choose a reasonably good password, and do so for all your accounts, then you are still likely to be part of a group of people who use the same resources. This is most likely to be a place of work, or a forum, or gaming site and so on. Your security also relies on the diligence of others in that group. Even if you follow all the unreasonable rules, in any group of perhaps ten or more, someone will slip up, no matter how careful they are. Security has often been likened to a chain where the strength of the chain is as good as the weakest link. In a large group of people, the system's security is as strong as the weakest password on the system.
This is because of a hacker technique called "privilege elevation". It works like this:
Even though Sally's account is restricted, if Sally has a weak password, a hacker can get in and lay a trap for Sally or for Sally's administrator. This trap tricks either the administrator or the system into temporarily granting the hacker raised access. When this happens, the hacker can install a small hidden program that will permit further privileged access. So you can see that the security of the entire system, and of everyone who uses it, depends on Sally's diligence.
Security advisors tell people to change a password often to try and limit the risk involved between a breach and further damage. But this is only a weak mitigation, and as we have seen, people don't follow the rules. People can't follow the rules because they are unworkable.
I started this section asking, "What, exactly is a password?". Here is the answer:
A password is a secret that only one person knows, and one particular application is able to verify that the person knows it. Supposedly, it is easy to remember; it is easy to invent; it is incredibly difficult for anyone else to discover.
We don't have Ali Baba style magic to rely on.
An application needs some way to verify that a person knows a password. Simply storing the password on the system potentially permits anyone to just read it, and once that happens, it is no longer useful as a password. So we have a mathematical system to get around this. It's called a 'hash' (also known as a 'digest').
A hash is used to store an unfathomable representation derived from your secret symbols. In mathematical terms, we say that a hash is a one-way-function that takes a message M of arbitrary size and reduces it to a fixed length set of symbols.
Before explaining this more, I'll introduce you to the pigeon hole principle. Imagine you have 100 pigeons, and a wall of 99 roosts. You could put pigeon A into roost 1, and pigeon Y into roost 3 and so on. But in the end, two pigeons must share the same roost because there is one fewer roost than there are pigeons.
Oddly, this is a very simple demonstration of a hash. A hash takes one of many (potentially infinite - but certainly unspecified) messages and assigns each to a number between an upper and lower limit (the roosts). But since there are many more possible messages compared to available integers in a fixed range, there will, inevitably be some sharing. In cryptography, we call this a 'collision'.
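Here is a minimal sketch of the pigeon hole idea in C. The names roost_for and shared_roosts, and the multiplier used to scatter the pigeons, are made up for this illustration; it is a toy and nothing like a cryptographic hash.

```c
#define ROOSTS 99

/* Toy "hash": map any pigeon number to one of ROOSTS roosts. */
static unsigned roost_for(unsigned pigeon)
{
    return (pigeon * 2654435761u) % ROOSTS;
}

/* Return how many roosts hold two or more of the first n pigeons. */
unsigned shared_roosts(unsigned n)
{
    unsigned count[ROOSTS] = {0};
    unsigned shared = 0;

    for (unsigned p = 0; p < n; p++)
        count[roost_for(p)]++;
    for (unsigned r = 0; r < ROOSTS; r++)
        if (count[r] > 1)
            shared++;
    return shared;
}
```

Call shared_roosts(100) from your own main and it must report at least one shared roost, whatever formula you put inside roost_for. That sharing is the collision.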
When we apply a hash function to your password, we don't necessarily tightly restrict what password you choose. After all, if we restricted it, then someone would theoretically have a better chance of guessing it. Then the hash function manipulates your password and assigns a particular hash value to it. We sort of hope that there is no collision. At least, we hope that the chance of a collision is really small. But from the pigeon hole principle, we know it is impossible to completely avoid. Therefore, cryptographers choose hash functions that spread out nicely into the available hashed space. If the hash function favored a few roosts, then it would not be very useful. We also use a very large hash space. It's not unusual to use a hash space of trillions of available values. With a very large hash-space, we can arrange the hashing algorithm to make it extremely difficult to find two or more messages that collide.
We try to avoid collisions because they greatly aid the cryptanalyst when trying to 'break' (or hack) the hash. Think of it this way: if 100,000 passwords hashed to the same value, and that value was found in a system, then a hacker could try, say, 1,000,000 passwords, and his or her chances of being let into the system would be considerably higher than they would be for a hash that was more evenly spread.
I know this is difficult, but let's persist. There is something else about a hash that you need to know. A good hash uses a function that is easy to compute from password to hash, but unreasonably hard (read: impossible for you and me, and most likely impossible even for the NSA), to compute backwards from hash to particular password.
When your computer system asks you for your password, you supply it, in secret; it is not stored in permanent storage; hopefully, it is not available somewhere between your fingers and the system (which is yet another story). The system has a previously computed stored copy of the HASH value. It re-computes a hash value from the password you just supplied, and compares the result to its previously stored result. If they match, you are granted access.
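The check described above can be sketched in a few lines of C. I am assuming a stand-in hash here: toy_hash is the well known 64-bit FNV-1a function, which is fast and simple but NOT suitable for real passwords. The names toy_hash and password_matches are invented for this sketch.

```c
#include <stdint.h>

/* Toy stand-in for a real password hash: 64-bit FNV-1a. NOT safe for real use. */
uint64_t toy_hash(const char *s)
{
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */

    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 1099511628211ULL;              /* FNV prime */
    }
    return h;
}

/* The system keeps only stored_hash. The typed attempt is hashed and compared. */
int password_matches(uint64_t stored_hash, const char *attempt)
{
    return toy_hash(attempt) == stored_hash;
}
```

Notice that the password itself never needs to be stored. Only the value returned by toy_hash at sign-up time is kept, and every login attempt is pushed through the same function.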
Clearly, if a hash algorithm had a habit of producing collisions, then your password, and potentially millions of others would hash to the same value, and their passwords would also work. A collision is like finding two different keys that open the same lock. For this reason a good hash algorithm is very unlikely to produce a clash, and another requirement is to make it extremely difficult or ideally impossible to deliberately find a collision.
Now we need to talk about salt.
If you know why this is called salt - please use the comment section below.
Since the passwords are independently chosen, there is nothing stopping two independent people from choosing the same password. Even if they don't realize it, and keep their passwords a secret, this is a major problem. Let's assume that 100,000,000 people are asked to choose passwords, and the hashes are stored on a system. Let's also assume that a hacker anywhere in the world burns up a few months of computer time to generate the hash values of 1,000,000 words. Let's assume that some of those words (like qwerty or password or 123456) are used by real people as their passwords. This list can (and will) get published.
In that table of 1,000,000 hashes of common passwords, will lie some hashes that can be found on the system of 100,000,000 password hashes. It's a fast and simple task to compare each hash in the hacker's table to those on the protected system. Since the hacker has a mapping between the passwords he 'guessed' and the associated hash values, then a match immediately reveals the correct password of a protected account.
This might not work if every person chose passwords so weird (and unmemorable) that none have a reasonable chance of being in the hacker's table. But we already know that is psychologically impossible.
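To make the table attack concrete, here is a small sketch in C. It assumes the system stores unsalted hashes and, purely for illustration, that the hash is the toy FNV-1a function (a real system would use something stronger, but the attack has the same shape). The five guesses and the names COMMON and lookup_unsalted are my own inventions.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the system's hash: 64-bit FNV-1a. NOT a cryptographic hash. */
static uint64_t toy_hash(const char *s)
{
    uint64_t h = 14695981039346656037ULL;

    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 1099511628211ULL;
    }
    return h;
}

/* The hacker's guesses. A real list would hold millions of entries. */
static const char *const COMMON[] = {
    "123456", "password", "letmein", "qwerty", "12345"
};
#define N_COMMON (sizeof COMMON / sizeof COMMON[0])

/* Return the guess whose hash equals stolen_hash, or NULL if it is not in the table. */
const char *lookup_unsalted(uint64_t stolen_hash)
{
    static uint64_t table[N_COMMON];
    static int built = 0;

    if (!built) {                           /* pay the hashing cost once ... */
        for (size_t i = 0; i < N_COMMON; i++)
            table[i] = toy_hash(COMMON[i]);
        built = 1;
    }
    for (size_t i = 0; i < N_COMMON; i++)   /* ... then each lookup is a plain comparison */
        if (table[i] == stolen_hash)
            return COMMON[i];
    return NULL;
}
```

The expensive part, building the table, happens once. After that the same table works against every account on every system that uses the same unsalted hash.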
We can mitigate the risk by using 'salt'.
Salt is a value that is probably world-readable but is unique and (ideally) randomly generated, and it is mixed in with the user's password before it is hashed. This means that Sally, who chooses a password of abc123, produces a different hash to Bob, who happens to use the same password, and of course the same password in the hacker's table produces yet another value. Now the simple direct comparison is not possible. It's no longer a simple pattern match. It's important that the salt is randomly generated to force an attacker to try on average half the possible salt values before finding a match.
The hacker must now compute, or store every possible hash result for EACH given password-guess.
Let's say you use a salt of one of two values. If that salt is chosen at random, then the chances of similar hashes for two identical passwords is halved, and the size of the hacker's table is doubled. If we use four possible values, then the chances of identical hashes is reduced to 1 in four, and the hacker's table or processing time has quadrupled. Since we are forcing the computer system to choose and combine the salt with the user's password, then we can use a nice big salt. Let's imagine 64 switches that can be placed in any combination at random for a salt. The pool of salt values is now 18,446,744,073,709,551,616 and that's how many times the hacker's table needs to be increased in order to permit a pattern match. That's a huge number. It's not possible to store that many hash values, and even if you did, it's not really going to be possible to do the pattern match in reasonable time.
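Here is a rough sketch of mixing a 64-switch salt into a hash, written in C. It assumes the salt has already been drawn from a good random source, and it uses the toy FNV-1a function in place of a real password hash. The name salted_hash is made up for this example.

```c
#include <stdint.h>

/* Toy 64-bit FNV-1a over the eight salt bytes followed by the password. */
uint64_t salted_hash(uint64_t salt, const char *password)
{
    uint64_t h = 14695981039346656037ULL;

    for (int i = 0; i < 8; i++) {           /* mix in the 64 salt "switches" */
        h ^= (salt >> (8 * i)) & 0xff;
        h *= 1099511628211ULL;
    }
    while (*password) {                     /* then the password itself */
        h ^= (unsigned char)*password++;
        h *= 1099511628211ULL;
    }
    return h;
}
```

If Sally's salt is 1 and Bob's is 2, then salted_hash(1, "abc123") and salted_hash(2, "abc123") come out different even though the password is identical, so one table entry per password is no longer enough.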
When you log into your Windows system, using a password authentication scheme known as NTLM, then are you now shocked and dismayed to learn that this system DOES NOT USE ANY SALT?
Even worse, the predecessor to NTLM -- called LM -- is terribly flawed. LM works on chunks of 7 characters. When you use a password of 8 characters or more, it's chopped into two halves and two independent hashes are computed. In the extreme, if you use a 14 character password, there are still only two 7 character hashes to 'solve', and the work is only about double that of a single 7 character password.
That, quite frankly, is stupid, and it is a classic example where programmers have invented a security system without consulting the experts. If a 7 character password was increased to 8, then that alone should multiply the search space by the number of possible characters, and a 9 character password should multiply it again and so on, but by chopping it into two, it's a gift to the hacker.
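The arithmetic is worth seeing. Each extra character multiplies the number of candidate passwords by the size of the alphabet, but two separately hashed 7 character halves only add their costs together. This small C sketch does the sums; the function names are invented, and I use small alphabets so the results fit in 64 bits.

```c
#include <stdint.h>

/* alphabet^length: the number of candidate passwords of exactly that length.
 * The result must fit in 64 bits (26^14, for example, does not). */
uint64_t search_space(uint64_t alphabet, unsigned length)
{
    uint64_t n = 1;

    while (length--)
        n *= alphabet;
    return n;
}

/* Worst-case guesses when 14 characters are hashed as two separate 7 character halves. */
uint64_t split_cost(uint64_t alphabet)
{
    return 2 * search_space(alphabet, 7);
}
```

With digits only, search_space(10, 14) is 100,000,000,000,000 while split_cost(10) is just 20,000,000. The split throws away a factor of five million, and the gap grows with a bigger alphabet.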
If you are forced to use LM, then you should use a password longer than 14 characters for sure. Not that you will, or can. It's too hard. Furthermore, only a select few people will actually know whether they are using a particular kind of password system. It's unreasonable to expect ordinary users to modify their password-choosing behavior depending on the underlying technology.
More salty stupidity
Recall above that I said the salt is possibly world readable? Here is an example. A code example in a programming tutorial claims:
"The login program often uses the first two characters of the user name as the "salt" string."
From a programmer's point of view, this makes perfect sense. The user name is forced to be unique on any given system, and the set of people with characters common in the first two places is expected to be small which reduces the chance that two of those independently choose the same password. Of those that do choose the same password, they are quite likely to have different salt.
As I mentioned in the first part of this article, the problem of passwords is linked with statistics. In this scheme, as the pool of user names increases, the chance of identical passwords, and identical salt is substantially increased.
Furthermore, to a cryptographer, using the first two characters of the user name as salt is stupid. The salt is readily available to a hacker because the hacker can find the user name, therefore he knows the salt, but this in itself is not a cryptographic flaw. If it was possible to store the salt more securely, then you would also store the digest more securely. Therefore you may as well keep the salt and the digest at the same level of security. The purpose of salt is to force every password entry into its own cryptographic challenge INDEPENDENTLY of the rest of the available digests. By using the first two characters of a user name, the 'randomness' of the salt takes on the same level of 'randomness' as the user name. User names are rarely randomly assigned. They are normally people's names. Therefore a salt based on the user name is not random.
Two characters consist of only 8+8 bits or switches. But if they are part of a user name, then they are restricted to the letters a-z and their capital equivalents. Not many people's user names start with q or z either. You can see that the choice of salt is NOT drawn at random. The letter-frequency distribution for the first two characters of a user name is likely to be mostly restricted to relatively few common combinations. The letter pairs will also often restrict the salt space, since I don't recall anyone with a name of qxiotra. Therefore, instead of a search space of 2^16 = 65536 combinations for salt, which is a naive deduction based on the theoretical maximum, the real salt search space is immediately limited to 26 lower case and 26 upper case letters per character. You only need 5 bits to store 26 combinations, and 6 bits for all 52 lower and upper case letters. Since many two-letter combinations are effectively ruled out by language structure, the first character of a two character salt only offers somewhere between 5 and 6 bits, and the second considerably fewer because of English language rules. For example, if someone's user name starts with q, then it's a good start to assume that the next letter is u. These deductions mean that in the lucky case, the salt space is sometimes only a little more than 5 bits. This means that a hacker could guess the salt, in the absence of knowing the user name, by computing a hash anywhere from 32 times up to (26+26)^2 = 2704 times.
When a two character salt is based on the first two letters of a user name, and assuming that the hacker does not know this or only has the hashes to work from, she faces only 32 to about 2,700 times more computing work to crack the scheme compared to the previously mentioned NT or NTLM, where salt is not used.
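A short sketch in C shows what goes wrong with a user-name salt. It assumes the scheme from the quoted tutorial (the first two characters of the user name) and again uses the toy FNV-1a function in place of a real hash. The names salt_from_username and salted_hash are hypothetical.

```c
#include <stdint.h>

/* Salt taken from the first two characters of the user name. */
uint16_t salt_from_username(const char *user)
{
    uint16_t hi = (unsigned char)user[0];
    uint16_t lo = user[0] ? (unsigned char)user[1] : 0;

    return (uint16_t)(hi << 8 | lo);
}

/* Toy 64-bit FNV-1a over the two salt bytes followed by the password. */
uint64_t salted_hash(uint16_t salt, const char *password)
{
    uint64_t h = 14695981039346656037ULL;

    h ^= salt >> 8;                         /* first character of the user name */
    h *= 1099511628211ULL;
    h ^= salt & 0xff;                       /* second character of the user name */
    h *= 1099511628211ULL;
    while (*password) {
        h ^= (unsigned char)*password++;
        h *= 1099511628211ULL;
    }
    return h;
}
```

Users "sally" and "sam" both get the salt "sa", so if both pick abc123 their digests are identical and the salt has done nothing for them. A randomly generated salt does not have this problem.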
Today, computers are very powerful, and they are getting more powerful rapidly. Salt is therefore essential, but there are poor implementations in use today that still permit a brute-force method of dealing with salted passwords. As in the user-name salt example above, if your salt is not randomly generated, then the rest of the pool of digests makes password cracking easier.
SALT must be generated from a random source of data, and it must provide a large search space.
An md5 implementation
Salt is used by the code linked here.
Analysis of the code indicates that for the hash called md5, it creates 11 characters of space for the salt, uses the first three for the literal string "$1$" as an identifier for md5, then gets 8 characters of random data from /dev/random and shifts the result into printable characters of 7 bits each. The result is a 56-bit search space.
2^56 = 72,057,594,037,927,936
The crypt(3) function mixes this with the user's password.
On average if the password was truly randomly chosen, a brute force attack would make 36,028,797,018,963,968 trials to find a password that creates the same hash value.
In reality, by knowing that people are crappy random-generators, the search space is prioritized or limited to common passwords.
Modern computers can brute-force md5, and certainly get results from a dictionary attack.
If your system uses md5, then it is not secure. Actually, if your system uses people then it's not secure. :-)
An md5 password salt example. Put this in a file called cr.c
#include <stdio.h>
#include <crypt.h>
int main(void) {
    static char salt[] = "$1$abcdefgh";   /* "$1$" selects md5; 8 salt characters follow */
    puts(crypt("mypassword", salt));
    salt[3] = '0';                        /* alter the salt a little */
    puts(crypt("mypassword", salt));
}
If you have a Linux computer, open a command window and create a file called cr.c containing the code shown above.
Then issue this compiler and linker command. It will create a test executable called cr.
cc cr.c -lcrypt -o cr
Now run the new cr test program and inspect the results.
Notice how the same password gives two very different digests when the salt changes. The salt increases the cost of hacking significantly when the password is unknown. But for the legitimate user who can supply the password, there is no significant additional cost.
The entire digest, including the first three characters $1$ is assumed to be available to the hacker. ANY scheme which does not assume this is likely to be flawed. If a scheme relies on hiding either the algorithm or salt then it is probably flawed. In security circles, hiding algorithms is known as 'security by obscurity'. It's much better to make your algorithm public so it can be tested by anyone. An untested algorithm is likely to be easily broken.
Before concluding, there is one more trick that security people can play on a hacker. Obviously large salt values help secrecy because the cost of processing and storage is increased to the hacker. But there is an asymmetry that we can exploit. When someone knows the password, we could apply a few thousand iterations of the algorithm or deliberately choose a secure but slow algorithm. This might slow down legitimate access from say, one millionth of a second to two seconds. For the legitimate user, checking one hand-typed password, a two second delay is not inconvenient. But to a hacker trying to create a million password hashes, the total operation becomes two million times longer.
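The iteration trick can be sketched like this in C. This is only an illustration under my own assumptions: the hash is the toy FNV-1a function, and the names stretched_hash and login_ok are invented. Real systems use purpose-built slow schemes such as PBKDF2, bcrypt or scrypt rather than anything home-made.

```c
#include <stdint.h>

#define FNV_BASIS 14695981039346656037ULL
#define FNV_PRIME 1099511628211ULL

/* Toy 64-bit FNV-1a over the eight bytes of v. NOT a cryptographic hash. */
static uint64_t mix_u64(uint64_t h, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        h ^= (v >> (8 * i)) & 0xff;
        h *= FNV_PRIME;
    }
    return h;
}

/* Hash salt and password once, then re-hash the digest `rounds` more times. */
uint64_t stretched_hash(uint64_t salt, const char *password, unsigned rounds)
{
    uint64_t h = mix_u64(FNV_BASIS, salt);

    while (*password) {
        h ^= (unsigned char)*password++;
        h *= FNV_PRIME;
    }
    for (unsigned i = 0; i < rounds; i++)
        h = mix_u64(FNV_BASIS, h);          /* every round is extra work per guess */
    return h;
}

/* Logging in repeats exactly the same work, once, for the typed attempt. */
int login_ok(uint64_t stored, uint64_t salt, const char *attempt, unsigned rounds)
{
    return stretched_hash(salt, attempt, rounds) == stored;
}
```

The legitimate user pays for the rounds once per login. The hacker pays for them on every single guess, which is the asymmetry we are exploiting.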
The security of a password scheme hinges on making a RANDOM choice for both the salt and the password. The salt is at the same security level as the digest (hash) and is assumed public.
Even when salt is used, a group of ten or more people is likely to contain one person who slips up and uses either a password that is too weak, or the same one for too long, or the same one that is used for multiple other accounts. In the latter case, the difficulty of cracking system A is reduced to the effectiveness of the security in place on system B or C or D.
In practice, this means that if you use the same password or a variation of one for multiple social networks, reading the news, participating in blogs, posting pictures or scores of other activities, then when a hacker invades one of those systems, via any one of those people's weak passwords, then all your systems are at risk.
Passwords don't work because the security passwords offer is not independent of the other people in your system, and not even independent of the users on the internet at large. The security of your particular system is linked to the level of security controlled by other companies, and implementing passwords properly is difficult. There are always flawed schemes in use.
Is there a solution?
No. Not really. But there are some things you can do.
- Since the entire scheme is dependent on all users' diligence and understanding, then education will help. It's not a panacea but it won't hurt.
- Use an authentication scheme that is not currently known to be flawed.
- Accept the fact that people are terrible random generators and provide systems to help. The most effective non-technical aid is to teach people to use what I call symbol-phrases. Here is an example. The phrase: "Today I shat on a 3 coloured $250 dollar elephant". The phrase is silly, which makes it easy to recall. Except for the fact that I just published it, it would not normally appear in print anywhere. The first letters and symbols can be used as the password: TIsona3c$250e
- Get people who can read and understand this document to further educate staff on the flaws of passwords.
- Engage in a detailed security audit by a respectable company. Perform regular pen-tests on your users' passwords.
- Assume that authentication breaches are part of normal business and put in place some technical alarm systems. This involves effective log collection, management and correlation. In this way you will be able to detect, react and mitigate.
- Use two factor authentication. This helps greatly to remove the link between the individual and the group. It also provides a warning system. If the token or biometric backup to a password is lost, stolen or damaged or otherwise compromised, this serves as a built in warning system. You can take rapid measures to repair security. Two-factor limits the scope for hackers, and slows them down. But don't fall into the trap of thinking that it eliminates risk.
- Have a proper corporate security policy. You and your staff need to know exactly what to do when a user reports a lost token, or an Intrusion Detection System sends an alert.