44

I've read a few times that when storing passwords, it's good practice to 'double hash' the strings (eg. with md5 then sha1, both with salts, obviously).

I guess the first question is, "is this actually correct?" If not, then please, dismiss the rest of this question :)

The reason I ask is that on the face of it, I would say that this makes sense. However, when I think about it, every time a hash is rehashed (possibly with something added to it) all I can see is that there is a reduction in the upper bound on the final 'uniqueness'... that bound being related to the initial input.

Let me put it another way: we have x number of strings that, when hashed, are reduced to y possible strings. That is to say, there are collisions in the first set. Now coming from the second set to the third, is it not possible for the same thing to occur (ie. collisions in the set of all possible 'y' strings that result in the same hash in the third set)?

In my head, all I see is a 'funnel' for each hash function call, 'funneling' an infinite set of possibilities into a finite set and so on, but obviously each call is working on the finite set before it, giving us a set no larger than the input.

Maybe an example will explain my ramblings? Take 'hash_function_a' that will give 'a' and 'b' the hash '1', and will give 'c' and 'd' the hash '2'. Using this function to store passwords, even if the password is 'a', I could use the password 'b'.

Take 'hash_function_b' that will give '1' and '2' the hash '3'. If I were to use it as a 'secondary hash' after 'hash_function_a' then even if the password is 'a' I could use 'b', 'c' or 'd'.
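The toy example above can be written out as a small sketch. The two lookup tables are the made-up functions from the question, not real hash algorithms, and the helper names `double_hash` and `accepted_passwords` are only illustrative:

```python
# Toy lookup-table "hashes" taken from the example above (not real hash functions).
hash_function_a = {"a": "1", "b": "1", "c": "2", "d": "2"}
hash_function_b = {"1": "3", "2": "3"}


def double_hash(password):
    """Apply hash_function_a, then hash_function_b to its result."""
    return hash_function_b[hash_function_a[password]]


def accepted_passwords(stored, hash_fn, candidates):
    """Return every candidate whose hash equals the stored value."""
    return sorted(p for p in candidates if hash_fn(p) == stored)


candidates = ["a", "b", "c", "d"]

# The real password is 'a' in both cases.
single = accepted_passwords(hash_function_a["a"], hash_function_a.get, candidates)
double = accepted_passwords(double_hash("a"), double_hash, candidates)

print(single)  # ['a', 'b']
print(double)  # ['a', 'b', 'c', 'd']
```

With these deliberately tiny tables, the second function can only keep or merge the groups produced by the first; it can never split them apart again.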

On top of all of that, I get that salts should be used, but they don't really change the fact that each time we are mapping 'x' inputs to 'less than x' outputs. I don't think.

Can someone please explain to me what it is that I am missing here?

Thanks!

EDIT: for what it's worth, I don't do this myself, I use bcrypt. And I'm not really concerned about whether or not it's useful for 'using up cycles' for a 'hacker'. I genuinely am just wondering whether or not the process reduces 'security' from a hash collision standpoint.

Narcissus
  • 667

6 Answers

59

Using different hashing algorithms is a bad idea - it will reduce entropy rather than increase it.

However, assuming you have a cryptographically strong hashing algorithm and a good salt, applying the same hash function several times makes the hashing process more computationally expensive. The benefit of this is that when other means of cracking the password hash fail (guessing, dictionary attacks, rainbow tables, etc.), and the attacker is forced into brute-force techniques, it takes them longer to try each password, simply because they have to apply the same hash function more often. So if one round of hashing would require one month of brute-forcing, applying it twelve times would increase the estimated time to a year.

Recent hashing algorithms like bcrypt build on this idea; they contain a parameter to control the computational complexity of the hash, so that you can scale it as hardware speeds progress: when hardware becomes faster by a factor of two, you increase the complexity to compensate, so the time required to brute-force your hashes remains roughly constant.
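As a rough sketch of a tunable cost parameter: bcrypt is not in the Python standard library, so this uses `hashlib.pbkdf2_hmac` as a stand-in, where the iteration count plays the role of the work factor. The function name `hash_password`, the example password and the iteration counts are assumptions chosen for illustration, not recommended values:

```python
import hashlib
import os


def hash_password(password, salt, iterations):
    """Derive a 32-byte key; `iterations` acts as the adjustable work factor."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


salt = os.urandom(16)  # random per-password salt, stored next to the hash

# Doubling the iteration count roughly doubles the work for every guess,
# for the legitimate server and for the attacker alike.
cheap = hash_password("correct horse", salt, 1_000)
costly = hash_password("correct horse", salt, 2_000)
```

The iteration count has to be stored with the hash, so that old entries can still be verified after the setting is raised for new ones.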

tdammers
  • 52,936
30

This is better suited to security.stackexchange, but...

The problem with

hash1(hash2(hash3(...hashn(pass+salt)+salt)+salt)...)+salt)

is that this is only as strong as the weakest hash function in the chain. For example if hashn (the innermost hash) gives a collision, the entire hash chain will give a collision (irrespective of what other hashes are in the chain).

A stronger chain would be

hash1(hash2(hash3(...hashn(pass + salt) + pass + salt) + pass + salt)...) + pass + salt)

Here we avoid the early collision problem and we essentially generate a salt that depends on the password for the final hash.

And if one step in the chain collides it doesn't matter because in the next step the password is used again and should give a different result for different passwords.
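Here is a small sketch of both chains with two links. To make a collision easy to find, the inner hash is a deliberately crippled stand-in (SHA-256 truncated to one byte); that function, the salt value and all the names are assumptions for illustration only:

```python
import hashlib


def weak_hash(data):
    """Deliberately weak stand-in for hashn: only 1 byte of SHA-256 output."""
    return hashlib.sha256(data).digest()[:1]


def plain_chain(password, salt):
    """hash1(hashn(pass + salt) + salt)"""
    return hashlib.sha256(weak_hash(password + salt) + salt).digest()


def refed_chain(password, salt):
    """hash1(hashn(pass + salt) + pass + salt)"""
    return hashlib.sha256(weak_hash(password + salt) + password + salt).digest()


def find_inner_collision(salt):
    """Find two different passwords that collide under weak_hash."""
    seen = {}
    for i in range(1000):
        candidate = str(i).encode()
        key = weak_hash(candidate + salt)
        if key in seen:
            return seen[key], candidate
        seen[key] = candidate
    raise RuntimeError("unreachable: 1000 inputs cannot fit in 256 outputs")


salt = b"example-salt"
first, second = find_inner_collision(salt)

print(plain_chain(first, salt) == plain_chain(second, salt))  # True
print(refed_chain(first, salt) == refed_chain(second, salt))  # False
```

In the first chain the inner collision survives every later step. In the second, the outer hash still sees two different inputs, so the result differs unless the outer hash itself collides.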

ratchet freak
  • 25,986
3

Do not try to write your own password hashing scheme unless you are willing to take a course in cryptography and/or security engineering.

You should use a well established implementation of password hashing which in turn should use a key derivation function (KDF) such as PBKDF2, bcrypt, scrypt or the newer Argon2.

Good KDFs include a work factor, usually a number of iterations, in order to increase the cost of offline attacks. One could say that these KDFs hash the password multiple times, using the same algorithm each time. There is no point in using multiple message digest algorithms, as pointed out by others.

2

In general, you don't need to use more than one hashing algorithm.

What you need to do is:

Use salt: salt isn't used just to make your password more secure, it's used to avoid rainbow table attacks. That way, someone will have a harder time trying to precompute the hashes for the passwords you store in your system.

Use multiple iterations: instead of doing just SHA(password + salt), do SHA(SHA(SHA(SHA(SHA(...SHA( password + salt )))))) . Or, to put it another way:

hash = sha(password + salt);
for (i = 1; i <= 5000; i++) {
    hash = sha(hash + salt);
}

And, finally, choose a good hashing function. SHA, MD5, etc. are not good because they are too fast. Since you want to use the hash for protection, you'd better use slower hashes. Take a look at Bcrypt, PBKDF2 or Scrypt, for example.
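The salted, iterated loop above could look like this in runnable form. It is a minimal sketch only: SHA-256 is an assumed stand-in for the generic `sha`, the example passwords are invented, and a purpose-built function such as bcrypt, PBKDF2 or scrypt is still the better choice for real systems:

```python
import hashlib
import hmac
import os


def iterated_hash(password, salt, rounds=5000):
    """sha(password + salt), then `rounds` more passes of sha(hash + salt)."""
    digest = hashlib.sha256(password.encode("utf-8") + salt).digest()
    for _ in range(rounds):
        digest = hashlib.sha256(digest + salt).digest()
    return digest


def verify(password, salt, stored):
    """Recompute the hash and compare it in constant time."""
    return hmac.compare_digest(iterated_hash(password, salt), stored)


salt = os.urandom(16)  # fresh random salt, stored alongside the hash
stored = iterated_hash("hunter2", salt)

print(verify("hunter2", salt, stored))  # True
print(verify("hunter3", salt, stored))  # False
```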

edit: after observations, let's try to see some points (sorry, long explanation to get to the end, because it might help others searching for similar answers):

If your system were perfectly secure, so that no one could ever get access to the stored passwords, you wouldn't need hashing. The passwords would be secret and no one would get them.

But no one can guarantee that the password database won't be stolen. Whoever steals the database gets all the passwords, and your system and your company will suffer the consequences. So we should try to limit the damage of such a leak.

NOTICE that we are not worried about online attacks at this point. Against an online attack, the best defense is to slow down after bad passwords, lock the account after some tries, etc. For that, it doesn't matter how you encrypt, hash, or store your password. Defending against online attacks is a matter of slowing down password attempts.

So, back to the don't let them take my plain passwords problem. The answer is simple: don't store them as plain text. Ok, got it.

How to avoid that?

Encrypt the password (?). But, as you know, if you encrypt it, it can be decrypted by anyone who has the proper key. And you end up with the problem of where to hide the key. Hmm, no good: if they got your database, they can get your key. OK, let's not use it.

So, another approach: let's transform the password into something else that can't be reversed, and store that. To verify that a supplied password is correct, we run the same process again and check whether the two transformed values match. If they match, the correct password was supplied.

OK, so far so good. Let's apply an MD5 hash to the password. But... if someone has our stored hash, he can use a lot of computing power to calculate the MD5 hash of every possible password (brute force) until he finds the original password. Or, even worse, he can precompute and store the MD5 hashes of all character combinations and then look the password up easily. So do a lot of iterations, the HASH(HASH(HASH())) thing, to make it harder, because each guess takes more time.

But even that can be circumvented: rainbow tables were created exactly to speed up attacks against this kind of protection.

So, let's add some salt. This way, at each iteration, the salt is used again. Anyone trying to attack your passwords has to generate the rainbow table with the salt added each time. And since that table was generated for one salt, he has to calculate it all over again for another salt, so he has to spend time on each password (= each salt). Salt doesn't add "more complexity" to the password; it just makes the attacker lose time generating rainbow tables. If you use a different salt for each password, the table built for one salt is useless against another password.
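A small sketch of why a per-password salt defeats precomputation. A plain dictionary stands in for a real rainbow table (which is a more compact structure with the same purpose), and the password list, salts and function names are invented for the example:

```python
import hashlib


def unsalted(password):
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted(password, salt):
    return hashlib.sha256(password.encode("utf-8") + salt).hexdigest()


# Attacker's precomputed table (hash -> password): built once, reusable anywhere.
common_passwords = ["123456", "password", "letmein"]
table = {unsalted(p): p for p in common_passwords}

# Without salt, a single lookup recovers the password.
print(table.get(unsalted("letmein")))  # letmein

# With per-user salts, the same password yields different hashes,
# and neither of them appears in the precomputed table.
alice = salted("letmein", b"salt-for-alice")
bob = salted("letmein", b"salt-for-bob")
print(alice == bob)                    # False
print(alice in table or bob in table)  # False
```

The attacker can still build a new table for each salt, but that costs as much as brute-forcing each password individually.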

And would using more than one hash have helped here? No. Someone building a rainbow table for a specific target can generate it with one hash or several, anyway.

And using more than one hash can lead to a problem: it's only as secure as the weakest hash you use. If someone finds collisions in one hash algorithm, that is the hash that will be exploited, at any point in the iteration process, to break the password. So you don't gain anything by using more hash algorithms; it's better to choose just one good algorithm and use it. And if you ever hear that it has been broken, think about how you'll change it in your application.

And why use bcrypt or something like it (you say you use it)? Because the attacker will have to spend more time generating the tables. That's also why using MD5 + wait (3 seconds) doesn't help: the attack will be offline anyway, so the attacker can generate the tables without the 3-second delay.

-1

My understanding is that using multiple hashing algorithms is to defeat rainbow tables. Using a good salt works too, but I guess it's a second level of protection.

jiggy
  • 1,590
-1

This isn't more secure. However, there is an authentication protocol based on hashing multiple times with the same function.

It works like this. Computer A stores the value hash^n(pass). A asks B to authenticate and gives B the integer n. B calculates hash^(n-1)(pass) and sends it back to A.

A checks that hash(hash^(n-1)(pass)) == hash^n(pass). If so, the authentication succeeds. A then stores hash^(n-1)(pass), and at the next authentication gives B n-1 instead of n.

This ensures that the password is never exchanged in the clear, that A never knows what the password is, and that authentication is protected against replay. However, it has the drawback of requiring passwords with a finite lifetime: when n reaches the value 2, a new password must be chosen after authentication.
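The exchange can be simulated in a few lines. This is a sketch under assumptions: SHA-256 stands in for the unspecified `hash`, and the `Server` class, the password and the starting value n = 5 are invented for the example:

```python
import hashlib


def hash_n(password, n):
    """Apply SHA-256 n times to the password: hash^n(pass)."""
    value = password.encode("utf-8")
    for _ in range(n):
        value = hashlib.sha256(value).digest()
    return value


class Server:
    """Plays the role of A: stores hash^n(pass) and the counter n."""

    def __init__(self, initial_value, n):
        self.stored = initial_value
        self.n = n

    def challenge(self):
        return self.n

    def verify(self, response):
        if hashlib.sha256(response).digest() != self.stored:
            return False
        self.stored = response  # now holds hash^(n-1)(pass)
        self.n -= 1
        return True


password = "s3cret"
server = Server(hash_n(password, 5), 5)

n = server.challenge()
response = hash_n(password, n - 1)   # B computes hash^(n-1)(pass)
first_try = server.verify(response)  # True: accepted
replay = server.verify(response)     # False: the same response no longer works
```

An eavesdropper who captures `response` cannot reuse it, because the next login requires a value one step earlier in the chain, which would mean inverting the hash.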

Another use of multiple hashing is HMAC, which ensures the authenticity and integrity of a request. For more about HMAC, see http://en.wikipedia.org/wiki/HMAC .
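For illustration, here is a minimal HMAC sketch using Python's standard `hmac` module. The key, the request body and the helper name `is_authentic` are made up for the example:

```python
import hashlib
import hmac

key = b"shared-secret-key"        # hypothetical key known to both parties
message = b"amount=100&to=alice"  # hypothetical request body

# The sender attaches this tag to the request.
tag = hmac.new(key, message, hashlib.sha256).digest()


def is_authentic(key, message, tag):
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


print(is_authentic(key, message, tag))                 # True
print(is_authentic(key, b"amount=900&to=alice", tag))  # False
```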

Most uses of multiple hashing in the real world are overkill, and yours seems to be one of them. Note that if you use several hash functions, they will not all have the same entropy, which reduces the strength of the hash. For example, md5 has less entropy than sha1, so using sha1 on an md5 will not improve the strength of the hash. The strength will generally be equal to that of the weakest hash function.

deadalnix
  • 6,023