
If you have repetitive data in, and repetitive data out, no modulus is going to help you.

Um, no. A prime modulus will often help in this case. That's the whole point.



Do you have any statistical evidence?


In your test code, change your loop to:

    for r in [3, 5, 17, 257]:
        for i in xrange(n):
            num = r * i
            table_p[num%prime] += 1
            table_c[num%composite] += 1

On here and on your blog, you keep falling back to the observation that modulo prime makes no difference when you have a uniform distribution of inputs, like you get after you apply a good hash function.

Nobody who is telling you that you're wrong would dispute that. The fundamental point that you keep ignoring is that it doesn't matter, because as a library writer providing a hashtable, you don't control the hash function. Having a simple way to mitigate the ill effects of users choosing bad hash functions is a good thing. Why do you keep ignoring this point?
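For anyone who wants to see the effect without the original test harness, here is a minimal self-contained sketch in Python 3. The table sizes (251 and 255), the strides, and the helper name `used_buckets` are my own assumptions for illustration; they are not the values from the test code being discussed. It feeds strided keys, the kind a lazy identity-style hash would produce, into a prime-sized and a composite-sized table and counts how many buckets ever get used.

```python
from math import gcd

def used_buckets(stride, modulus, n):
    """Count the non-empty buckets after inserting stride*i for i in range(n)."""
    counts = [0] * modulus
    for i in range(n):
        counts[(stride * i) % modulus] += 1
    return sum(1 for c in counts if c > 0)

# Hypothetical sizes chosen for illustration: 251 is prime, 255 = 3 * 5 * 17.
PRIME = 251
COMPOSITE = 255
N = 1000

for stride in (3, 5, 17):
    p = used_buckets(stride, PRIME, N)
    c = used_buckets(stride, COMPOSITE, N)
    # A stride that shares a factor with the table size can only reach
    # modulus // gcd(stride, modulus) distinct buckets.
    print(f"stride {stride:2d}: prime table uses {p}/{PRIME}, "
          f"composite table uses {c}/{COMPOSITE} "
          f"(gcd with composite = {gcd(stride, COMPOSITE)})")
```

With these particular values the composite table only ever touches 85, 51, and 15 of its 255 buckets, while the prime table touches all 251. A prime size is not magic: a stride that is a multiple of the prime itself still collapses into one bucket. It is just much less likely to share a factor with whatever pattern the user's keys happen to have.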



