From: Jonathan Nieder <jrnieder@gmail.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: git@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
"Arnaldo Carvalho de Melo" <acme@redhat.com>,
"Frédéric Weisbecker" <fweisbec@gmail.com>,
"Pekka Enberg" <penberg@cs.helsinki.fi>
Subject: Re: [PATCH] git gc: Speed it up by 18% via faster hash comparisons
Date: Thu, 28 Apr 2011 04:31:21 -0500
Message-ID: <20110428093120.GA377@elie>
In-Reply-To: <20110428063625.GB952@elte.hu>
Ingo Molnar wrote:
> * Jonathan Nieder <jrnieder@gmail.com> wrote:
>> E.g., how would something like
>>
>>	const unsigned int *start1 = (const unsigned int *) sha1;
>>	const unsigned int *start2 = (const unsigned int *) sha2;
>>
>>	if (likely(*start1 != *start2)) {
>>		if (*start1 < *start2)
>>			return -1;
>>		return +1;
>>	}
>>	return memcmp(sha1 + 4, sha2 + 4, 16);
>>
>> perform?
>
> Note that this function won't work on like 99% of the systems out there due to
> endianness assumptions in Git.
Yes, I was greedy and broke the semantics, and my suggestion was
nonsensical for other reasons (e.g., alignment), too. I should have
written something like:

	if (likely(*sha1 != *sha2)) {
		if (*sha1 < *sha2)
			return -1;
		return +1;
	}
	return memcmp(sha1, sha2, 20);

since speeding it up 255/256 times seems good enough already.
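
For concreteness, here is a minimal self-contained sketch of that idea. The function name hashcmp_fast and the likely() fallback are my own assumptions for illustration, not Git's actual definitions; the only contract assumed is that the result has the same sign as memcmp() over the 20 hash bytes.

```c
#include <string.h>

/*
 * Branch hint. Assumption: GCC-style __builtin_expect where available,
 * otherwise a no-op. This is not copied from Git's headers.
 */
#if defined(__GNUC__)
#define likely(x) __builtin_expect(!!(x), 1)
#else
#define likely(x) (x)
#endif

/*
 * Hypothetical name. Compares two 20-byte SHA-1 values and returns a
 * negative, zero, or positive value with the same sign as
 * memcmp(sha1, sha2, 20).
 */
static int hashcmp_fast(const unsigned char *sha1, const unsigned char *sha2)
{
	/* For unrelated hashes the first bytes differ 255 times out of 256. */
	if (likely(*sha1 != *sha2))
		return *sha1 < *sha2 ? -1 : +1;
	/* Rare case: first bytes match, so fall back to the full compare. */
	return memcmp(sha1, sha2, 20);
}
```

Because only single bytes are read on the fast path, this version has neither the alignment nor the endianness problem of the unsigned int variant quoted above.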
> Also, your hypothetical smart compiler would recognize the above as equivalent
> to memcmp(sha1, sha2, 20) and could rewrite it again - so we'd be back to
> square 1.
True. The real point is a "likely" to explain to human readers what
is happening.
> Having said that, it would be nice if someone could test these two patches on a
> modern AMD box, using the perf stat from here:
>
> http://people.redhat.com/mingo/tip.git/README
>
> cd tools/perf/
> make -j install
>
> and do something like this to test git gc's performance:
>
> $ perf stat --sync --repeat 10 ./git gc
>
> ... to see whether these speedups are generic, or somehow Intel CPU specific.
Sounds like fun. Will try to find time to play around with this in
the next few days.
> Well i messed up endianness in an early version of this patch and 'git gc' was
> eminently unhappy about it! I have not figured out which part of Git relies on
> the comparison result though - most places seem to use the result as a boolean.
I think hashcmp is used to run binary searches within a packfile
index. Thanks for explaining.
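
To illustrate why the sign of the result would matter for such a search, here is a rough sketch of a binary search over a sorted table of 20-byte hashes. The function name and the flat table layout are assumptions made for the example and are not taken from Git's packfile index code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: look up sha1 in a table of nr hashes, each 20
 * bytes long, stored back to back and sorted in memcmp() order.
 * Returns the index of the match, or -1 if the hash is not present.
 */
static int sha1_table_find(const unsigned char *table, size_t nr,
			   const unsigned char *sha1)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mi = lo + (hi - lo) / 2;
		int cmp = memcmp(table + mi * 20, sha1, 20);

		if (!cmp)
			return (int)mi;
		/* The sign decides which half to keep searching. */
		if (cmp < 0)
			lo = mi + 1;
		else
			hi = mi;
	}
	return -1;
}
```

A comparison that orders hashes differently from the order the table was sorted in (for example, by comparing the first four bytes as a little-endian integer) would send the search into the wrong half, which would be consistent with the breakage described above.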
Regards,
Jonathan