Hash时取模为什么要模质数

2019-04-13 15:45发布


原地址:https://www.zhihu.com/question/20806796/answer/21359160
首先来说假如关键字是随机分布的,那么无所谓一定要模质数。但在实际中往往关键字有某种规律,例如大量的等差数列,那么公差和模数不互质的时候发生碰撞的概率会变大,而用质数就可以很大程度上回避这个问题。

质数并不是唯一的准则,具体可以参考以下网站。
good hash table primes

作者:李国华
链接:https://www.zhihu.com/question/20806796/answer/21359160
来源:知乎
著作权归作者所有。商业转载请联系作者获得授权,非商业转载请注明出处。


good hash table primes

In the course of designing a good hashing configuration, it is helpful to have a list ofprime numbers for the hash table size. The following is such a list. It has the properties that:
  1. 1.  each number in the list is prime
  2. 2.  each number is slightly less than twice the size of the previous
  3. 3.  each number is as far as possible from the nearest two powers of two
Using primes for hash tables is a good idea because it minimizes clustering in the hashed table. Item (2) is nice because it is convenient for growing a hash table in theface of expanding data. Item (3) has, allegedly, been shown to yield especially good results in practice. And here is the list: lwr upr % err prime 25superscript252^{{5}} 26superscript262^{{6}} 10.416667 53 26superscript262^{{6}} 27superscript272^{{7}} 1.041667 97 27superscript272^{{7}} 28superscript282^{{8}} 0.520833 193 28superscript282^{{8}} 29superscript292^{{9}} 1.302083 389 29superscript292^{{9}} 210superscript2102^{{10}} 0.130208 769 210superscript2102^{{10}} 211superscript2112^{{11}} 0.455729 1543 211superscript2112^{{11}} 212superscript2122^{{12}} 0.227865 3079 212superscript2122^{{12}} 213superscript2132^{{13}} 0.113932 6151 213superscript2132^{{13}} 214superscript2142^{{14}} 0.008138 12289 214superscript2142^{{14}} 215superscript2152^{{15}} 0.069173 24593 215superscript2152^{{15}}