XDL_FAST_HASH can be very slow

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* XDL_FAST_HASH can be very slow
@ 2014-12-22  4:19 Jeff King
  2014-12-22  9:08 ` Patrick Reynolds
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2014-12-22  4:19 UTC (permalink / raw)
  To: git; +Cc: Thomas Rast

I ran across an interesting case that diffs very slowly with modern git.
And it's even public. You can clone:

  git://github.com/outpunk/evil-icons

and try:

  git show fc4efe426d5b4e6aa8d5a4dc14babeada7c5f899

(which is also the tip of master as of this writing).

The interesting file there is a 10MB Illustrator file, "assets/ei.ai".
Git treats it as text, as the early part doesn't have any NULs, but it
is mostly non-human-readable. It has a large number of lines, and some
of the lines themselves are quite large.

On my machine, "git show" takes ~77 seconds using v2.2.1. But if I build
the same version with "make XDL_FAST_HASH=", it completes in about 0.4s.
Both produce the same output.

I'm not really sure what's going on.  A few points of interest:

 - You can replicate this with the very first commit that added
   XDL_FAST_HASH, 6942efc (xdiff: load full words in the inner loop of
   xdl_hash_record, 2012-04-06). So it was always bad on this case, and
   it's not part of any more recent changes.

 - We actually _don't_ spend most of our time in xdl_hash_record, the
   function modified by 6942efc. Instead, it all goes to
   xdl_classify_record, which is looping over the set of hash records.
   It's not clear to me if more or different hash records is part of the
   design of XDL_FAST_HASH, or if this is actually a bug.

I haven't dug much further than that.

-Peff

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: XDL_FAST_HASH can be very slow
  2014-12-22  4:19 XDL_FAST_HASH can be very slow Jeff King
@ 2014-12-22  9:08 ` Patrick Reynolds
  2014-12-22 10:48   ` Thomas Rast
  0 siblings, 1 reply; 4+ messages in thread
From: Patrick Reynolds @ 2014-12-22  9:08 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Thomas Rast, Jeff King

I have been working with Peff on this and have more results to share.

For background, xdl_hash_record is a hashing function, producing an
unsigned long from an input string terminated by either a newline or
the end of the mmap'd file.

The original xdl_hash_record is essentially DJB hash, which does a
multiplication, load, and xor for each byte of the input.  Commit
6942efc introduces an "XDL_FAST_HASH" version of the same function
that is clearly inspired by the DJB hash, but it does only one
multiplication, load, and xor for each 8 bytes of input -- i.e., fewer
loads, but also a lot less bit mixing.  Less mixing means far more
collisions, leading to the performance problems with evil-icons.  It's
not clear to me if the XDL_FAST_HASH version intended to match the
output of the DJB hash function, but it doesn't at all.

Peff has been experimenting with using two modern hash functions, FNV
and Murmur3.  In theory, these should produce fewer collisions than
DJB, but in his measurements, they didn't run diff any faster than
plain DJB.  They do fix the evil-icons problem.

I have implemented two simpler possibilities, both of which fix the
problems diffing the evil-icons repository:

1. An XDL_FAST_HASH implementation that matches the output of the DJB
hash exactly.  Its performance is basically the same as DJB, because
the only thing is does differently is load 8 bytes at a time instead
of 1.  It does all the same ALU operations as DJB.

2. Using (hash % prime_number) instead of (hash & ((1<<hbits)-1)) to
map hash values to buckets in the hash table.  This helps because
there's entropy in the high bits of the hash values that's lost
completely if we just mask off the low hbits bits.  I've chosen prime
numbers that are close to the power-of-two sizes of the table -- e.g.,
32749 instead of 32768 -- so very little space is wasted.  Applying
this change to the XDL_FAST_HASH hash function makes it perform as
well as DJB and Murmur3.  That is, it eliminates the performance
problems with the evil-icons repo.

I evaluated several of the hash functions according to how deep the
chains are in each hash bucket, when diffing the evil-icons repo.
DJB, Murmur3, and XDL_FAST_HASH%prime all produce near-optimal
scattering, with the longest chain between 29 and 34 elements long.
XDL_FAST_HASH as implemented in the current git tree -- with
bit-masking instead of modulo-prime -- produces 100 buckets with chain
lengths over 4000.  Most of the other buckets are empty.  Each of
these long chains takes quadratic time to build and linear time to
traverse, which presumably is where the terrible performance for
evil-icons comes from.

The bottom line is, I think XDL_FAST_HASH needs to go, because it has
poorly understood collision behavior with pretty bad worst cases.  I
don't have strong feelings about what should replace it -- original
DJB, a fixed XDL_FAST_HASH, Murmur3, or something else.  All of the
replacements have good collision behavior and good behavior on the
repos I've tested, but appear to be a few percent slower in the common
case.

--Patrick

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: XDL_FAST_HASH can be very slow
  2014-12-22  9:08 ` Patrick Reynolds
@ 2014-12-22 10:48   ` Thomas Rast
  2014-12-23  2:51     ` demerphq
  0 siblings, 1 reply; 4+ messages in thread
From: Thomas Rast @ 2014-12-22 10:48 UTC (permalink / raw)
  To: Patrick Reynolds; +Cc: git@vger.kernel.org, Jeff King

Patrick Reynolds <piki@github.com> writes:

> The original xdl_hash_record is essentially DJB hash, which does a
> multiplication, load, and xor for each byte of the input.  Commit
> 6942efc introduces an "XDL_FAST_HASH" version of the same function
> that is clearly inspired by the DJB hash, but it does only one
> multiplication, load, and xor for each 8 bytes of input -- i.e., fewer
> loads, but also a lot less bit mixing.  Less mixing means far more
> collisions, leading to the performance problems with evil-icons.  It's
> not clear to me if the XDL_FAST_HASH version intended to match the
> output of the DJB hash function, but it doesn't at all.

Note that XDL_FAST_HASH is just a ripoff of the hashing scheme that
Linus socially engineered on G+ around that time.  I didn't do any of
the hash genealogy that you did here, and it now shows.  The orginal
patches are linked from 6942efc (xdiff: load full words in the inner loop of
xdl_hash_record, 2012-04-06):

  https://lkml.org/lkml/2012/3/2/452
  https://lkml.org/lkml/2012/3/5/6

The code still exists:

  https://github.com/torvalds/linux/blob/master/fs/namei.c#L1678

> I have implemented two simpler possibilities, both of which fix the
> problems diffing the evil-icons repository:

I think it would be best to separate three goals here:

1. hash function throughput
2. quality of the hash values
3. avoiding collision attacks

XDL_FAST_HASH was strictly an attempt to improve throughput, and fairly
successful at that (6942efc (xdiff: load full words in the inner loop of
xdl_hash_record, 2012-04-06) quotes an 8% improvement on 'git log -p').

You are now addressing quality.

I have no idea how you ran into this, but if you are reworking things
already, I think it would be good to also randomize whatever hash you
put in so as to give some measure of protection against collision
attacks.

> 1. An XDL_FAST_HASH implementation that matches the output of the DJB
> hash exactly.  Its performance is basically the same as DJB, because
> the only thing is does differently is load 8 bytes at a time instead
> of 1.  It does all the same ALU operations as DJB.

I don't think there's a point in having such a function, since it would
mean a lot of code for no throughput gain.  Let's just remove
XDL_FAST_HASH and the original hashing scheme in favor of a better hash
function.

-- 
Thomas Rast
tr@thomasrast.ch

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: XDL_FAST_HASH can be very slow
  2014-12-22 10:48   ` Thomas Rast
@ 2014-12-23  2:51     ` demerphq
  0 siblings, 0 replies; 4+ messages in thread
From: demerphq @ 2014-12-23  2:51 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Patrick Reynolds, git@vger.kernel.org, Jeff King

(sorry for the repost, I use gmail and it send html mails by default).
On 22 December 2014 at 11:48, Thomas Rast <tr@thomasrast.ch> wrote:
>
> 1. hash function throughput
> 2. quality of the hash values
> 3. avoiding collision attacks
>
> XDL_FAST_HASH was strictly an attempt to improve throughput, and fairly
> successful at that (6942efc (xdiff: load full words in the inner loop of
> xdl_hash_record, 2012-04-06) quotes an 8% improvement on 'git log -p').
>
> You are now addressing quality.
>
> I have no idea how you ran into this, but if you are reworking things
> already, I think it would be good to also randomize whatever hash you
> put in so as to give some measure of protection against collision
> attacks.

I assume you mean DJB2 when you say DJB, and if so I will just note
that it is a pretty terrible hash function for arbitrary data. (I
understand it does better with raw text.) It does not pass either
strict-avalanche-criteria[1], nor does it pass the
bit-independence-criteria[2]. I have images which show how badly DJB2
fails these tests if anyone is interested.

Murmur3 is better, in that it does pass SAC and BIC, but before you
decide to use Murmur3 you should review https://131002.net/siphash/and
related resources which demonstrate multi-collision attacks on Murmur3
which are independent of the seed chosen. The paper also introduces a
new hash function with good performance properties, and claims that it
has cyptographic strength. (I say claims because I am not qualified to
judge if it is or not.) Eg:
https://131002.net/siphash/murmur3collisions-20120827.tar.gz

I think if you want performance and robustness against collision
attacks Siphash is a good candidate, as is perhaps the AES derived
hash used by the Go folks, but the performance of that algorithm is
strongly dependent on the CPU supporting AES primitives.

Anyway, the point is that simply adding a random seed to a hash
function like DJB2 or Murmur3 is not sufficient to prevent collision
attacks.

Yves
[1] A change in a single bit of the seed or the key should result in
50% of the output bits of the hash changing.
[2] output bits j and k should change independently when any single
input bit i is inverted, for all i, j and k.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-12-23  2:51 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-22  4:19 XDL_FAST_HASH can be very slow Jeff King
2014-12-22  9:08 ` Patrick Reynolds
2014-12-22 10:48   ` Thomas Rast
2014-12-23  2:51     ` demerphq

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).