Re: I'm a total push-over.. - Andreas Ericsson

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Andreas Ericsson <ae@op5.se>
To: Marko Kreen <markokr@gmail.com>
Cc: Dmitry Potapov <dpotapov@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: I'm a total push-over..
Date: Thu, 24 Jan 2008 17:00:51 +0100	[thread overview]
Message-ID: <4798B633.8040606@op5.se> (raw)
In-Reply-To: <e51f66da0801240519u4c8e6ddfrb7af8df34552252a@mail.gmail.com>

Marko Kreen wrote:
> On 1/23/08, Andreas Ericsson <ae@op5.se> wrote:
>> Marko Kreen wrote:
>>> (not lookup3) because lookup3 beat easily all the "simple" hashes
>> By how much? FNV beat Linus' hash by 0.01 microseconds / insertion,
>> and 0.1 microsecons / lookup. We're talking about a case here where
>> there will never be more lookups than insertions (unless I'm much
>> mistaken).
> 
> FNV is around 40% slower than lookup3 on my Intel Core CPU, on 4byte aligned
> input. See below for more detailed info.
> 

But the tests surely need to check for unaligned cases, as that's what
we're likely to hash, no?

>>> If you don't mind few percent speed penalty compared to Jenkings
>>> own optimized version, you can use my simplified version:
>>>
>>>   http://repo.or.cz/w/pgbouncer.git?a=blob;f=src/hash.c;h=5c9a73639ad098c296c0be562c34573189f3e083;hb=HEAD
>> I don't, but I don't care that deeply either. On the one hand,
>> it would be nifty to have an excellent hash-function in git.
>> On the other hand, it would look stupid with something that's
>> quite clearly over-kill.
> 
> Jenkins hash is fast because it does not look at individual bytes.
> If you _do_ want to look at them for unrelated reasons, (case-insensitive,
> unicode-juggling), then it obiously loses the point.  That is, if you
> want to process the string in one go.
> 

I believe the ability to add unicode-juggling was a major point
with the patch, so perhaps Jenkins' isn't such a good option.

I'm not familiar with data-mangling the way Linus (or Theo Tso
is), so I hadn't even considered that aspect of unrolled hashes.

>> It would have to be a significant improvement in wall-clock time
>> on a test-case of hashing 30k strings to warrant going from 6 to 80
>> lines of code, imo. I still believe the original dumb hash Linus
>> wrote is "good enough".
> 
> Well, ad-hoc dumb hashes may have horrible worst-cases that you cant
> see with light testing.  Therefore I'd still suggest some better
> researched dumb hash (eg. FNV or OAT).
> 

True. FNV is used in both MySQL and PostgreSQL. I'd say it's safe to
assume it's fairly well tested.

>> On a side-note, it was very interesting reading, and I shall have
>> to add jenkins3_mkreen() to my test-suite (although the "keep
>> copyright note" license thing bugs me a bit).
> 
> Sorry.  I just used template boilerplate.  Considering all the
> hard work was done by other people, it not proper to put under
> my own license.  I tagged the file as 'public domain' and pushed out.
> 

Thanks. I'll see if I can add it, although it'll probably have to
wait until I have reason to dig into it at work again. I've added
the hash to the code-base, but not yet incorporated it into the
test-case.

> Btw, the reason I started cleaning lookup3 was that at first I was
> scared of the complexity of Jenkins code and decided to go with
> Hsieh hash.  Then I found out that Hsieh code is under totally
> werdo license (http://www.azillionmonkeys.com/qed/weblicense.html)
> so I could not use it.
> 

True. I was in contact with him a while back since I wanted to use it
in an opensource project, but the licensing issues made me go with
another one instead. The patch got turned down anyways, so it was a
non-issue in the end, but... Ah well.

> 
> Here is my raw-speed test of different hashes.  Input is 4-byte
> aligned which should be common case for malloc()-ed strings.

Unless arena allocated, like we do in git.

I'm not surprised that this test favours Jenkin's and Hsie's.
That's to be expected as those benefit far more than simpler
hashing algorithms for long strings. The overhead when trying
shorter strings (say, between 3 and 15 chars, and not necessarily
4-byte aligned) sometimes make them quite a lot slower though.

> This also is best case for original lookup3(), on unaligned
> input the memcpy variants beat it easily.  Input string
> length varies randomly in range 0..64.
> 

Well, memcpy() isn't very interesting to compare against
hashes, as they test vastly different parts of the hardware's
parts' performance. memcpy() should also perform exactly the
same no matter what the test-data, which isn't always true for
hashes.

What *would* be interesting would be something along the lines
of "duff_cpy()": ie, an unrolled loop that aligns itself and
copies each byte to the same address each time.

The bytewise equivalence would ofcourse be

magic_cpy(unsigned char *k, int len)
{
	unsigned char magic;
	do {
		magic = *k++;
	} while (--len);
}

> own_memcpy - last 12-byte memcpy() calls out to libc
> memcpy_hack - last memcpy is inlined bytewise copy loop:
> 
>   while (len--) *dst++ = *src++;
> 
> Note that is is raw-speed test, if you benchmark larger code the
> hash difference probably matters less.
> 
> --------------------------------------------------------------------
> 
> Testing: seed=34 align=4 minlen=0 maxlen=64 trycnt=2 duration=10
> 
> lookup3 : try=0: ... 247.4880 MB/s
> lookup3 : try=1: ... 247.6154 MB/s
> own_memcpy: try=0: ... 223.5508 MB/s
> own_memcpy: try=1: ... 223.5830 MB/s
> memcpy_hack: try=0: ... 241.2241 MB/s
> memcpy_hack: try=1: ... 241.2492 MB/s
> lookup2 : try=0: ... 190.2697 MB/s
> lookup2 : try=1: ... 190.3283 MB/s
> fnv     : try=0: ... 153.0318 MB/s
> fnv     : try=1: ... 153.0178 MB/s
> hsieh   : try=0: ... 234.0468 MB/s
> hsieh   : try=1: ... 234.0426 MB/s
> oat     : try=0: ... 154.7804 MB/s
> oat     : try=1: ... 154.8226 MB/s
> elf     : try=0: ... 125.5892 MB/s
> elf     : try=1: ... 125.5734 MB/s
> 
> Results compared to reference:
> 
> lookup3         : 100.000 %
> own_memcpy      :  90.311 %
> memcpy_hack     :  97.449 %
> lookup2         :  76.872 %
> fnv             :  61.815 %
> hsieh           :  94.544 %
> oat             :  62.533 %
> elf             :  50.729 %
> 
> 

Interesting output, but not very surprising. Do you have the code
available somewhere?

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

next prev parent reply	other threads:[~2008-01-24 16:01 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-01-22 23:37 I'm a total push-over Linus Torvalds
2008-01-23  1:35 ` Kevin Ballard
2008-01-23  2:23 ` Junio C Hamano
2008-01-23  2:36   ` Junio C Hamano
2008-01-23 12:24     ` Johannes Schindelin
2008-01-23 12:28       ` David Kastrup
2008-01-23 12:56         ` Theodore Tso
2008-01-23  2:58   ` Linus Torvalds
2008-01-23  3:19     ` Linus Torvalds
2008-01-25  6:50       ` Junio C Hamano
2008-01-25 16:24         ` Linus Torvalds
2008-01-23  7:23     ` Junio C Hamano
2008-01-23 12:25       ` Johannes Schindelin
2008-01-23 16:25         ` Linus Torvalds
2008-01-23 16:34           ` Johannes Schindelin
2008-01-23 17:09             ` Linus Torvalds
2008-01-23 17:29               ` Linus Torvalds
2008-01-25  5:21                 ` Jeremy Maitin-Shepard
2008-01-25 12:51                   ` Johannes Schindelin
2008-01-25 18:19                     ` Jeremy Maitin-Shepard
2008-01-25 18:24                       ` Johannes Schindelin
2008-01-25 19:07                       ` Junio C Hamano
2008-01-23  8:32 ` Andreas Ericsson
2008-01-23  9:15   ` Dmitry Potapov
2008-01-23  9:31     ` Andreas Ericsson
2008-01-23 14:01       ` Marko Kreen
2008-01-23 14:39         ` Andreas Ericsson
2008-01-24  6:51           ` Luke Lu
2008-01-24 10:24             ` Andreas Ericsson
2008-01-24 13:19           ` Marko Kreen
2008-01-24 16:00             ` Andreas Ericsson [this message]
2008-01-24 16:13               ` Marko Kreen
2008-01-24 16:28               ` Dmitry Potapov
2008-01-24 17:15                 ` Linus Torvalds
2008-01-24 18:45                   ` Dmitry Potapov
2008-01-24 19:08                     ` Linus Torvalds
2008-01-25 20:52                   ` Marko Kreen
2008-01-25 22:16                     ` Linus Torvalds
2008-01-25 22:35                       ` Linus Torvalds
2008-01-26 12:16                       ` Marko Kreen
2008-01-27  6:51                         ` Linus Torvalds
2008-01-27  8:21                           ` Dmitry Potapov
2008-01-27 14:07                             ` Johannes Schindelin
2008-01-27 14:48                               ` Dmitry Potapov
2008-01-27  9:45                           ` Marko Kreen
2008-01-27 15:06                             ` Dmitry Potapov
2008-01-26 12:37                       ` Marko Kreen
2008-01-25 20:08               ` Marko Kreen
2008-01-23 17:10       ` Dmitry Potapov
2008-01-24 10:39         ` Andreas Ericsson
2008-01-23 16:06   ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4798B633.8040606@op5.se \
    --to=ae@op5.se \
    --cc=dpotapov@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=markokr@gmail.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).