From: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
To: Jeff King <peff@peff.net>
Cc: Thomas Rast <trast@student.ethz.ch>,
git@vger.kernel.org, Eric Herman <eric@freesa.org>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 4/2] grep: turn off threading for non-worktree
Date: Wed, 07 Dec 2011 18:11:47 +0100
Message-ID: <4EDF9E53.7090702@lsrfire.ath.cx>
In-Reply-To: <20111207044242.GB10765@sigill.intra.peff.net>

On 07.12.2011 at 05:42, Jeff King wrote:
> On Wed, Dec 07, 2011 at 12:01:37AM +0100, René Scharfe wrote:
>
>> Reading of git objects needs to be protected by an exclusive lock
>> and cannot be parallelized. Searching the read buffers can be done
>> in parallel, but for simple expressions threading is a net loss due
>> to its overhead, as measured by Thomas. Turn it off unless we're
>> searching in the worktree.
>
> Based on my earlier numbers, I was going to complain that we should
> also be checking the "simple expressions" assumption here, as time spent
> in the actual regex might be important.
>
> However, after trying to repeat my experiment, I think the numbers I
> posted earlier were misleading. For example, using my "more complex"
> regex of 'a.*b':
>
> $ time git grep --threads=8 'a.*b' HEAD >/dev/null
> real 0m8.655s
> user 0m23.817s
> sys 0m0.480s
>
> Look at that sweet, sweet parallelism. It's a quad-core with
> hyperthreading, so we're not getting the 8x speedup we might hope for
> (presumably due to lock contention on extracting blobs), but hey, 3x
> isn't bad. Except, wait:
>
> $ time git grep --threads=0 'a.*b' HEAD >/dev/null
> real 0m7.651s
> user 0m7.600s
> sys 0m0.048s
>
> We can get 1x on a single core, but the total time is lower! This
> processor is an i7 with "turbo boost", which means it clocks higher in
> single-core mode than when multiple cores are active. So the numbers I
> posted earlier were misleading. Yes, we got parallelism, but at the cost
> of knocking the clock speed down for a net loss.

Ugh, right, Turbo Boost complicates matters.

I don't understand the multiplied user time in the threaded case,
though. Is it caused by busy-waiting? Thomas reported similar numbers
earlier.

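To make the question concrete, here is a small sketch of the arithmetic on
the two runs quoted above. The struct and function names are made up for
illustration. User and sys time are summed over all threads, so dividing
by real time gives the average number of cores kept busy (roughly 2.8 for
the 8-thread run), while the ratio of CPU time between the two runs shows
the threaded run burning about 3.2 times as much CPU for the same search.

```c
/* Timings as reported by time(1), in seconds. */
struct timing {
	double real;
	double user;
	double sys;
};

/* Average number of cores kept busy: CPU time divided by wall-clock time. */
double cores_busy(struct timing t)
{
	return (t.user + t.sys) / t.real;
}

/* How many times more CPU time run a consumed than run b. */
double cpu_cost_factor(struct timing a, struct timing b)
{
	return (a.user + a.sys) / (b.user + b.sys);
}

/* The two runs quoted above: --threads=8 and --threads=0. */
const struct timing threads8 = { 8.655, 23.817, 0.480 };
const struct timing threads0 = { 7.651, 7.600, 0.048 };
```

The arithmetic only quantifies the extra CPU time. It cannot tell
busy-waiting apart from a lower clock speed or hyperthreads sharing a core.
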
> The sweet spot for me seems to be:
>
> $ time git grep --threads=2 'a.*b' HEAD >/dev/null
> real 0m6.303s
> user 0m11.129s
> sys 0m0.220s
>
> I'd be curious to see results from somebody with a quad-core (or more)
> without turbo boost; I suspect that threading may have more benefit
> there, even though we have some lock contention for blobs.

It would be nice if we could come up with simple rules to calculate
defaults for the number of threads on a given run. Users shouldn't have
to specify this option normally. And it would be good if these rules
didn't require a list of all CPUs known to git. :)

>> --- a/builtin/grep.c
>> +++ b/builtin/grep.c
>> @@ -1048,7 +1048,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>>  	nr_threads = 0;
>>  #else
>>  	if (nr_threads == -1)
>> -		nr_threads = (online_cpus() > 1) ? THREADS : 0;
>> +		nr_threads = (online_cpus() > 1 && !list.nr) ? THREADS : 0;
>>
>>  	if (nr_threads > 0) {
>>  		opt.use_threads = 1;
>
> This doesn't kick in for "--cached", which has the same performance
> characteristics as grepping a tree. I think you want to add "&& !cached" to
> the conditional.

Oh, yes.
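
For illustration, the conditional with that addition might look like the
sketch below. It is a standalone function, not the actual patch:
pick_nr_threads() and its parameters are hypothetical stand-ins for
online_cpus(), list.nr and the cached flag in cmd_grep(), and the value of
THREADS is assumed.

```c
#define THREADS 8 /* assumed value, for illustration only */

/*
 * Hypothetical standalone version of the default selection in cmd_grep().
 *
 * requested:  -1 means no explicit thread count was given
 * ncpus:      stand-in for online_cpus()
 * nr_objects: stand-in for list.nr (trees or revisions to search)
 * cached:     non-zero when --cached was given
 */
int pick_nr_threads(int requested, int ncpus, int nr_objects, int cached)
{
	if (requested != -1)
		return requested; /* an explicit request is left alone */

	/* Thread only when searching the worktree on a multi-core machine. */
	return (ncpus > 1 && !nr_objects && !cached) ? THREADS : 0;
}
```

With this, both "git grep pattern HEAD" and "git grep --cached pattern"
would default to no threads, while a plain worktree search on a multi-core
machine would keep the threaded default.
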

René