tested some different algorithms, measuring speed and number of collisions.
I used three different key sets:
A list of 216,553 English words (in lowercase)
The numbers "1" to "216553" (think ZIP codes, and how a poor hash took down msn.com)
216,553 "random" (i.e. type 4 uuid) GUIDs
For each corpus, the number of collisions and the average time spent hashing was recorded.
DJB2a (variant using xor rather than +)
GUIDs are designed to be unique, not random
A timely post by Raymond Chen reiterates the fact that "random" GUIDs are not meant to be used for their randomness. They, or a subset of them, are unsuitable as a hash key:
Even the Version 4 GUID algorithm is not guaranteed to be unpredictable, because the algorithm does not specify the quality of the random number generator. The Wikipedia article for GUID contains primary research which suggests that future and previous GUIDs can be predicted based on knowledge of the random number generator state, since the generator is not cryptographically strong.
Hash Lowercase Random UUID Numbers ============= ============= =========== ============== Murmur 145 ns 259 ns 92 ns 6 collis 5 collis 0 collis FNV-1a 152 ns 504 ns 86 ns 4 collis 4 collis 0 collis FNV-1 184 ns 730 ns 92 ns 1 collis 5 collis 0 collis▪ DBJ2a 158 ns 443 ns 91 ns 5 collis 6 collis 0 collis▪▪▪ DJB2 156 ns 437 ns 93 ns 7 collis 6 collis 0 collis▪▪▪ SDBM 148 ns 484 ns 90 ns 4 collis 6 collis 0 collis** SuperFastHash 164 ns 344 ns 118 ns 85 collis 4 collis 18742 collis CRC32 250 ns 946 ns 130 ns 2 collis 0 collis 0 collis LoseLose 338 ns - - 215178 collis