From: Ingo Molnar <mingo@elte.hu>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: "Erik Faye-Lund" <kusmabite@gmail.com>,
	"Pekka Enberg" <penberg@cs.helsinki.fi>,
	"Junio C Hamano" <gitster@pobox.com>,
	git@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
	"Arnaldo Carvalho de Melo" <acme@redhat.com>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>
Subject: Re: [PATCH] git gc: Speed it up by 18% via faster hash comparisons
Date: Thu, 28 Apr 2011 17:14:09 +0200
Message-ID: <20110428151409.GA32025@elte.hu>
In-Reply-To: <20110428133708.GA31383@elte.hu>


* Ingo Molnar <mingo@elte.hu> wrote:

> * Jonathan Nieder <jrnieder@gmail.com> wrote:
> 
> > Hi,
> > 
> > A side note for amusement.
> > 
> > Erik Faye-Lund wrote:
> > 
> > > --- a/cache.h
> > > +++ b/cache.h
> > > @@ -681,13 +681,17 @@ extern char *sha1_pack_name(const unsigned char *sha1);
> > >  extern char *sha1_pack_index_name(const unsigned char *sha1);
> > >  extern const char *find_unique_abbrev(const unsigned char *sha1, int);
> > >  extern const unsigned char null_sha1[20];
> > > -static inline int is_null_sha1(const unsigned char *sha1)
> > > +static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
> > >  {
> > > -	return !memcmp(sha1, null_sha1, 20);
> > > +	/* early out for fast mis-match */
> > > +	if (*sha1 != *sha2)
> > > +		return *sha1 - *sha2;
> > > +
> > > +	return memcmp(sha1 + 1, sha2 + 1, 19);
> > >  }
> > 
> > On the off-chance that sha1 and sha2 are nicely aligned, a more
> > redundant
> > 
> > 	if (*sha1 != *sha2)
> > 		return *sha1 - *sha2;
> > 
> > 	return memcmp(sha1, sha2, 20);
> > 
> > would take advantage of that (yes, this is just superstition, but it
> > somehow seems comforting anyway).
> 
> Your variant also makes the code slightly more compact as the sha1+1 and sha2+1 
> addresses do not have to be computed. I'll re-test and resend this variant.

This variant seems to perform measurably worse, at roughly 15% more cycles and
task-clock time despite executing fewer instructions:

 #
 # Open-coded loop:
 #
 Performance counter stats for './git gc' (10 runs):

       2358.560100 task-clock               #    0.763 CPUs utilized            ( +-  0.06% )
             1,870 context-switches         #    0.001 M/sec                    ( +-  3.09% )
               170 CPU-migrations           #    0.000 M/sec                    ( +-  3.54% )
            38,230 page-faults              #    0.016 M/sec                    ( +-  0.03% )
     7,513,529,543 cycles                   #    3.186 GHz                      ( +-  0.06% )
     1,634,103,128 stalled-cycles           #   21.75% of all cycles are idle   ( +-  0.28% )
    11,068,971,207 instructions             #    1.47  insns per cycle        
                                            #    0.15  stalled cycles per insn  ( +-  0.04% )
     2,487,656,519 branches                 # 1054.735 M/sec                    ( +-  0.03% )
        59,233,604 branch-misses            #    2.38% of all branches          ( +-  0.09% )

        3.092183093  seconds time elapsed  ( +-  3.49% )

 #
 # Front test + memcmp:
 #
 Performance counter stats for './git gc' (10 runs):

       2723.468639 task-clock               #    0.833 CPUs utilized            ( +-  0.22% )
             1,751 context-switches         #    0.001 M/sec                    ( +-  2.02% )
               167 CPU-migrations           #    0.000 M/sec                    ( +-  1.23% )
            38,230 page-faults              #    0.014 M/sec                    ( +-  0.03% )
     8,684,682,538 cycles                   #    3.189 GHz                      ( +-  0.21% )
     2,062,906,208 stalled-cycles           #   23.75% of all cycles are idle   ( +-  0.60% )
     9,019,624,641 instructions             #    1.04  insns per cycle        
                                            #    0.23  stalled cycles per insn  ( +-  0.04% )
     1,771,179,402 branches                 #  650.340 M/sec                    ( +-  0.04% )
        75,026,810 branch-misses            #    4.24% of all branches          ( +-  0.04% )

        3.271415104  seconds time elapsed  ( +-  1.97% )

So I think the open-coded loop variant I posted is faster.

The key observation is that there are two cases that matter for performance:

 - the hashes are different: in this case the front test catches 99% of the cases
 - the hashes are *equal*: in this case the open-coded loop performs better than the memcmp

My patch addresses both cases.
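
For illustration, here is a minimal sketch of a comparison that covers both
cases in one open-coded loop. It is not the patch posted in this thread: the
name hashcmp_sketch and the HASH_LEN constant are assumptions made for the
example. The first loop iteration plays the role of the front test, and the
equal-hash case is handled inline without calling memcmp().

```c
#define HASH_LEN 20 /* SHA-1 digest size in bytes */

/*
 * Hypothetical helper for illustration only, not the posted patch.
 * Returns <0, 0 or >0, with the same sign convention as memcmp().
 */
static inline int hashcmp_sketch(const unsigned char *sha1,
				 const unsigned char *sha2)
{
	int i;

	/*
	 * The i == 0 iteration acts as the front test: two different
	 * hashes almost always differ in their first byte already.
	 */
	for (i = 0; i < HASH_LEN; i++) {
		if (sha1[i] != sha2[i])
			return sha1[i] - sha2[i];
	}

	/* All 20 bytes matched: the equal case never leaves the loop. */
	return 0;
}
```

Whether a loop like this beats memcmp() depends on the compiler and the C
library, so it would need to be measured on the target system, as with the
perf stat runs above.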

Thanks,

	Ingo

