“compare by hash considered harmful” considered harmful
by Vinay Gupta • April 14, 2008 • Science • 1 Comment
Once a hash collision has been found and a demonstrably buggy test program created using the colliding inputs, how will you fix the bug? Usually, the response to a test program that demonstrates a bug in the system is to fix the bug. In this case, the underlying algorithm is the bug.
What’s the error?
SHA1(X) != SHA1(Y) proves X and Y are different
SHA1(X) == SHA1(Y) does not prove that X and Y are the same, although it very strongly suggests it
So, yeah, the piece is talking about hash abuse by programmers making a simple logical error in how they use hash functions. Options?
1> Compare everything that hashes, bit by bit, and in the event of a collision
A> Call Guinness
B> Store both copies with a version tag
2> Accept the 1 in 2^128 error risks which… well… etc. etc. you know the story.
The theoretical risk of collisions is higher, like 1/2^80, for SHA-1 (see the Birthday Paradox). In practice, SHA-1 is much weaker still; see:
http://en.wikipedia.org/wiki/SHA_hash_functions#Cryptanalysis_of_SHA-1
However, the theoretical possibility of finding random collisions on a supercomputer does not directly translate into making all use cases of SHA-1 obsolete. The algorithm has many years in it still for a good variety of use cases.