* XDL_FAST_HASH can be very slow @ 2014-12-22 4:19 Jeff King 2014-12-22 9:08 ` Patrick Reynolds 0 siblings, 1 reply; 4+ messages in thread From: Jeff King @ 2014-12-22 4:19 UTC (permalink / raw) To: git; +Cc: Thomas Rast I ran across an interesting case that diffs very slowly with modern git. And it's even public. You can clone: git://github.com/outpunk/evil-icons and try: git show fc4efe426d5b4e6aa8d5a4dc14babeada7c5f899 (which is also the tip of master as of this writing). The interesting file there is a 10MB Illustrator file, "assets/ei.ai". Git treats it as text, as the early part doesn't have any NULs, but it is mostly non-human-readable. It has a large number of lines, and some of the lines themselves are quite large. On my machine, "git show" takes ~77 seconds using v2.2.1. But if I build the same version with "make XDL_FAST_HASH=", it completes in about 0.4s. Both produce the same output. I'm not really sure what's going on. A few points of interest: - You can replicate this with the very first commit that added XDL_FAST_HASH, 6942efc (xdiff: load full words in the inner loop of xdl_hash_record, 2012-04-06). So it was always bad on this case, and it's not part of any more recent changes. - We actually _don't_ spend most of our time in xdl_hash_record, the function modified by 6942efc. Instead, it all goes to xdl_classify_record, which is looping over the set of hash records. It's not clear to me if more or different hash records is part of the design of XDL_FAST_HASH, or if this is actually a bug. I haven't dug much further than that. -Peff ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: XDL_FAST_HASH can be very slow 2014-12-22 4:19 XDL_FAST_HASH can be very slow Jeff King @ 2014-12-22 9:08 ` Patrick Reynolds 2014-12-22 10:48 ` Thomas Rast 0 siblings, 1 reply; 4+ messages in thread From: Patrick Reynolds @ 2014-12-22 9:08 UTC (permalink / raw) To: git@vger.kernel.org; +Cc: Thomas Rast, Jeff King I have been working with Peff on this and have more results to share. For background, xdl_hash_record is a hashing function, producing an unsigned long from an input string terminated by either a newline or the end of the mmap'd file. The original xdl_hash_record is essentially DJB hash, which does a multiplication, load, and xor for each byte of the input. Commit 6942efc introduces an "XDL_FAST_HASH" version of the same function that is clearly inspired by the DJB hash, but it does only one multiplication, load, and xor for each 8 bytes of input -- i.e., fewer loads, but also a lot less bit mixing. Less mixing means far more collisions, leading to the performance problems with evil-icons. It's not clear to me if the XDL_FAST_HASH version intended to match the output of the DJB hash function, but it doesn't at all. Peff has been experimenting with using two modern hash functions, FNV and Murmur3. In theory, these should produce fewer collisions than DJB, but in his measurements, they didn't run diff any faster than plain DJB. They do fix the evil-icons problem. I have implemented two simpler possibilities, both of which fix the problems diffing the evil-icons repository: 1. An XDL_FAST_HASH implementation that matches the output of the DJB hash exactly. Its performance is basically the same as DJB, because the only thing is does differently is load 8 bytes at a time instead of 1. It does all the same ALU operations as DJB. 2. Using (hash % prime_number) instead of (hash & ((1<<hbits)-1)) to map hash values to buckets in the hash table. This helps because there's entropy in the high bits of the hash values that's lost completely if we just mask off the low hbits bits. I've chosen prime numbers that are close to the power-of-two sizes of the table -- e.g., 32749 instead of 32768 -- so very little space is wasted. Applying this change to the XDL_FAST_HASH hash function makes it perform as well as DJB and Murmur3. That is, it eliminates the performance problems with the evil-icons repo. I evaluated several of the hash functions according to how deep the chains are in each hash bucket, when diffing the evil-icons repo. DJB, Murmur3, and XDL_FAST_HASH%prime all produce near-optimal scattering, with the longest chain between 29 and 34 elements long. XDL_FAST_HASH as implemented in the current git tree -- with bit-masking instead of modulo-prime -- produces 100 buckets with chain lengths over 4000. Most of the other buckets are empty. Each of these long chains takes quadratic time to build and linear time to traverse, which presumably is where the terrible performance for evil-icons comes from. The bottom line is, I think XDL_FAST_HASH needs to go, because it has poorly understood collision behavior with pretty bad worst cases. I don't have strong feelings about what should replace it -- original DJB, a fixed XDL_FAST_HASH, Murmur3, or something else. All of the replacements have good collision behavior and good behavior on the repos I've tested, but appear to be a few percent slower in the common case. --Patrick ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: XDL_FAST_HASH can be very slow 2014-12-22 9:08 ` Patrick Reynolds @ 2014-12-22 10:48 ` Thomas Rast 2014-12-23 2:51 ` demerphq 0 siblings, 1 reply; 4+ messages in thread From: Thomas Rast @ 2014-12-22 10:48 UTC (permalink / raw) To: Patrick Reynolds; +Cc: git@vger.kernel.org, Jeff King Patrick Reynolds <piki@github.com> writes: > The original xdl_hash_record is essentially DJB hash, which does a > multiplication, load, and xor for each byte of the input. Commit > 6942efc introduces an "XDL_FAST_HASH" version of the same function > that is clearly inspired by the DJB hash, but it does only one > multiplication, load, and xor for each 8 bytes of input -- i.e., fewer > loads, but also a lot less bit mixing. Less mixing means far more > collisions, leading to the performance problems with evil-icons. It's > not clear to me if the XDL_FAST_HASH version intended to match the > output of the DJB hash function, but it doesn't at all. Note that XDL_FAST_HASH is just a ripoff of the hashing scheme that Linus socially engineered on G+ around that time. I didn't do any of the hash genealogy that you did here, and it now shows. The orginal patches are linked from 6942efc (xdiff: load full words in the inner loop of xdl_hash_record, 2012-04-06): https://lkml.org/lkml/2012/3/2/452 https://lkml.org/lkml/2012/3/5/6 The code still exists: https://github.com/torvalds/linux/blob/master/fs/namei.c#L1678 > I have implemented two simpler possibilities, both of which fix the > problems diffing the evil-icons repository: I think it would be best to separate three goals here: 1. hash function throughput 2. quality of the hash values 3. avoiding collision attacks XDL_FAST_HASH was strictly an attempt to improve throughput, and fairly successful at that (6942efc (xdiff: load full words in the inner loop of xdl_hash_record, 2012-04-06) quotes an 8% improvement on 'git log -p'). You are now addressing quality. I have no idea how you ran into this, but if you are reworking things already, I think it would be good to also randomize whatever hash you put in so as to give some measure of protection against collision attacks. > 1. An XDL_FAST_HASH implementation that matches the output of the DJB > hash exactly. Its performance is basically the same as DJB, because > the only thing is does differently is load 8 bytes at a time instead > of 1. It does all the same ALU operations as DJB. I don't think there's a point in having such a function, since it would mean a lot of code for no throughput gain. Let's just remove XDL_FAST_HASH and the original hashing scheme in favor of a better hash function. -- Thomas Rast tr@thomasrast.ch ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: XDL_FAST_HASH can be very slow 2014-12-22 10:48 ` Thomas Rast @ 2014-12-23 2:51 ` demerphq 0 siblings, 0 replies; 4+ messages in thread From: demerphq @ 2014-12-23 2:51 UTC (permalink / raw) To: Thomas Rast; +Cc: Patrick Reynolds, git@vger.kernel.org, Jeff King (sorry for the repost, I use gmail and it send html mails by default). On 22 December 2014 at 11:48, Thomas Rast <tr@thomasrast.ch> wrote: > > 1. hash function throughput > 2. quality of the hash values > 3. avoiding collision attacks > > XDL_FAST_HASH was strictly an attempt to improve throughput, and fairly > successful at that (6942efc (xdiff: load full words in the inner loop of > xdl_hash_record, 2012-04-06) quotes an 8% improvement on 'git log -p'). > > You are now addressing quality. > > I have no idea how you ran into this, but if you are reworking things > already, I think it would be good to also randomize whatever hash you > put in so as to give some measure of protection against collision > attacks. I assume you mean DJB2 when you say DJB, and if so I will just note that it is a pretty terrible hash function for arbitrary data. (I understand it does better with raw text.) It does not pass either strict-avalanche-criteria[1], nor does it pass the bit-independence-criteria[2]. I have images which show how badly DJB2 fails these tests if anyone is interested. Murmur3 is better, in that it does pass SAC and BIC, but before you decide to use Murmur3 you should review https://131002.net/siphash/and related resources which demonstrate multi-collision attacks on Murmur3 which are independent of the seed chosen. The paper also introduces a new hash function with good performance properties, and claims that it has cyptographic strength. (I say claims because I am not qualified to judge if it is or not.) Eg: https://131002.net/siphash/murmur3collisions-20120827.tar.gz I think if you want performance and robustness against collision attacks Siphash is a good candidate, as is perhaps the AES derived hash used by the Go folks, but the performance of that algorithm is strongly dependent on the CPU supporting AES primitives. Anyway, the point is that simply adding a random seed to a hash function like DJB2 or Murmur3 is not sufficient to prevent collision attacks. Yves [1] A change in a single bit of the seed or the key should result in 50% of the output bits of the hash changing. [2] output bits j and k should change independently when any single input bit i is inverted, for all i, j and k. ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2014-12-23 2:51 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-12-22 4:19 XDL_FAST_HASH can be very slow Jeff King 2014-12-22 9:08 ` Patrick Reynolds 2014-12-22 10:48 ` Thomas Rast 2014-12-23 2:51 ` demerphq
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).