Universal Hash Functions and Application to Cryptography       

Security Accomplishment | 1977 - 1979

IBM researchers: J. Lawrence Carter, Mark Wegman

Where the work was done: IBM T.J. Watson Research Center

What we accomplished: Carter and Wegman developed a set of efficient hash functions that allow linear time retrieval of data -- useful in compilers, databases, cryptography and many other technologies and research areas. Hash functions compute a hash value for something big, and typically produce a value a lot shorter than the item being hashed. So, it's a many to one function. At the same time, though, we want any two things that are different to map to different hash values. Thus, there can be no worse case guarantees about the number of collisions a given hash function has for an arbitrary set of items to be hashed. Universal hash functions solve this conundrum by introducing classes of hash functions with the property that for any given set, most of them will have few collisions.

Related links: Universal Classes of Hash Functions, Journal of Computer and System Sciences 18, p.143-154, 1979:

  • "This paper gives an input independent average linear time algorithm for storage and retrieval on keys.  The algorithm makes a random choice of hash function from a suitable class of hash functions.  Given any sequence of inputs the expected time (averaging over all functions in the class) to store and retrieve elements is linear in the length of the sequence.  The number of references to the data base required by the algorithm for any input is extremely close to the theoretical minimum for any possible hash function with randomly distributed inputs."
  • 1977 extended abstract in ACM STOC [Symposium on the Theory of Computation]