From: Ingo Molnar <mingo@elte.hu>
To: Dmitry Potapov <dpotapov@gmail.com>
Cc: git@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
	"Arnaldo Carvalho de Melo" <acme@redhat.com>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Pekka Enberg" <penberg@cs.helsinki.fi>
Subject: Re: [PATCH] git gc: Speed it up by 18% via faster hash comparisons
Date: Thu, 28 Apr 2011 11:44:59 +0200
Message-ID: <20110428094459.GA15952@elte.hu>
In-Reply-To: <BANLkTikWd8=1RbY78tPFMVhuV05eKVzjkg@mail.gmail.com>


* Dmitry Potapov <dpotapov@gmail.com> wrote:

> 2011/4/28 Ingo Molnar <mingo@elte.hu>:
> >
> > * Dmitry Potapov <dpotapov@gmail.com> wrote:
> >
> >> 2011/4/28 Ingo Molnar <mingo@elte.hu>:
> >> > +static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
> >> >  {
> >> > -       return !memcmp(sha1, null_sha1, 20);
> >> > +       int i;
> >> > +
> >> > +       for (i = 0; i < 20; i++, sha1++, sha2++) {
> >> > +               if (*sha1 != *sha2) {
> >>
> >> At the very least, you may want to put 'likely' in this 'if'
> >> condition, otherwise the compiler may optimize this loop in
> >> the same way as with memcmp. So, it may work well now, but
> >> it may work much slower with future versions or a different
> >> level of optimization. (AFAIK, -O3 is far more aggressive in
> >> optimizing loops).
> >
> > The main difference is between the string assembly instructions and the loop.
> > Modern CPUs will hardly notice this loop being emitted with slight variations
> > by the compiler. So I do not share this concern.
> 
> Here you make an assumption about what kind of optimization the
> compiler can do. [...]

I make no assumption there, because rule #1 is that the compiler can pretty much 
do what it wants and we have little control over that.

> [...] As Jonathan noticed above, theoretically a smart compiler can turn this 
> loop into memcmp (or code very similar to memcmp).

Yes, and in practice it does not, and in practice we can speed up git gc 
measurably.

> The reason why memcmp does not work well is that it is optimized
> for the worst-case scenario (where the beginnings of the two strings
> are the same), while _we_ know that with a hash this is very unlikely,
> and we want to convey this knowledge to the compiler in some
> way. Just rewriting memcmp as an explicit loop does not convey
> this knowledge.
> 
> Therefore, I believe it makes sense to add 'likely'. I have not
> tested this code, but in the past I had very similar code
> which was compiled with -O3, and just adding 'likely' gave a
> 40% speed-up for that comparison function.

You guys can certainly add the 'likely()' if you want to (it likely won't hurt) 
- but note that the compiler can *still* turn it into a memcmp() - see rule #1 
above.

Note that Git does not have a likely() facility at the moment and 
__builtin_expect() is a GNU extension. Should be a separate patch.
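
For illustration only, here is a minimal sketch of what such a separate patch
might look like. It is not taken from the Git tree: the likely() wrapper and the
name hashcmp_sketch() are hypothetical, and the fallback for non-GNU compilers
is an assumption about how portability could be handled. As noted above, the
hint is only a hint, and the compiler remains free to emit whatever code it
likes.

```c
/*
 * Hypothetical portability wrapper (an assumption, not existing Git code):
 * use __builtin_expect() where the GNU extension is available and fall
 * back to the plain expression everywhere else.
 */
#if defined(__GNUC__)
# define likely(x)	__builtin_expect(!!(x), 1)
#else
# define likely(x)	(x)
#endif

/*
 * Byte-wise comparison of two 20-byte SHA-1 hashes. Returns a negative
 * value, zero, or a positive value, following the memcmp() convention.
 */
static inline int hashcmp_sketch(const unsigned char *sha1,
				 const unsigned char *sha2)
{
	int i;

	for (i = 0; i < 20; i++, sha1++, sha2++) {
		/* Two distinct hashes almost always differ in the first byte. */
		if (likely(*sha1 != *sha2))
			return (int)*sha1 - (int)*sha2;
	}
	return 0;
}
```

Whether the hint changes the generated code at all depends on the compiler
version and the optimization level, so it would need to be measured rather than
assumed.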

Thanks,

	Ingo
