From: Eric Herman <eric@freesa.org>
To: Pete Wyckoff <pw@padd.com>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
"Sverre Rabbelier" <srabbelier@gmail.com>,
"Fernando Vezzosi" <buccia@repnz.net>
Subject: Re: [PATCH] grep: detect number of CPUs for thread spawning
Date: Sun, 06 Nov 2011 19:00:00 +0100
Message-ID: <4EB6CB20.5060309@freesa.org>
In-Reply-To: <20111106145050.GA4219@arf.padd.com>
Hello Pete,
Thank you for the feedback.
On 11/06/2011 03:50 PM, Pete Wyckoff wrote:
>> From: Eric Herman <eric@freesa.org>
>>
>> Change the number of threads that we spawn from a hardcoded value of
>> "8" to what online_cpus() returns.
> I agree with the need to exploit >8 CPUs, but I lose a lot of
> performance when limiting the threads to the number of physical
> CPUs.
Ah, yes. Being focused on big machines, I did not actually test on
machines with few CPUs, and certainly not with NFS mounts.
>
> Tests without your patch on master, just changing "#define
> THREADS" from 8 to 2. On a 2-core Intel Core2 Duo.
>
> Producing lots of output:
>
> 8 threads:
>
> $ time ~/u/src/git/bin-wrappers/git grep f > /dev/null
> 0m14.02s user 0m3.64s sys 0m11.93s elapsed 148.07 %CPU
> $ time ~/u/src/git/bin-wrappers/git grep f > /dev/null
> 0m13.86s user 0m3.70s sys 0m11.82s elapsed 148.57 %CPU
>
> 2 threads:
>
> $ time ~/u/src/git/bin-wrappers/git grep f > /dev/null
> 0m15.14s user 0m3.52s sys 0m24.22s elapsed 77.05 %CPU
> $ time ~/u/src/git/bin-wrappers/git grep f > /dev/null
> 0m14.85s user 0m3.79s sys 0m24.20s elapsed 77.05 %CPU
>
> Producing no output:
>
> 8 threads:
>
> $ time ~/u/src/git/bin-wrappers/git grep unfindable-string
> 0m1.14s user 0m3.68s sys 0m5.17s elapsed 93.22 %CPU
> $ time ~/u/src/git/bin-wrappers/git grep unfindable-string
> 0m1.28s user 0m3.56s sys 0m5.15s elapsed 94.22 %CPU
>
> 2 threads:
>
> $ time ~/u/src/git/bin-wrappers/git grep unfindable-string
> 0m1.36s user 0m3.64s sys 0m16.82s elapsed 29.75 %CPU
> $ time ~/u/src/git/bin-wrappers/git grep unfindable-string
> 0m1.38s user 0m3.66s sys 0m16.81s elapsed 30.04 %CPU
>
> My workdir is on NFS, where even though the repository is fully
> cached, the open()s must go to the server. Using more threads
> than CPUs makes it more likely that some thread isn't blocked.
This is good data.
It gives me ideas for how I can do some more testing.
>
> You could add a #threads knob,
Sure, adding a knob is not a bad idea.
> but then we'd have to get
> everybody on NFS to set that properly.
Indeed, I think you agree that it would be better if most people had no
need to fiddle with yet another knob.
> Or take a look at
> preload_index() to see how it guesses at how many threads it
> needs.
Good tip.
A quick peek at preload_index suggests that it was a bit of guesswork:
	/*
	 * Mostly randomly chosen maximum thread counts: we
	 * cap the parallelism to 20 threads, and we want
	 * to have at least 500 lstat's per thread for it to
	 * be worth starting a thread.
	 */
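Read literally, that comment amounts to something like the sketch below.
This is only my paraphrase of the comment, not the code in
preload_index(); the function name and the two macro names are made up
for illustration, and returning 0 to mean "do not bother with threads"
is my own assumption.

```c
/* Numbers taken from the comment; the macro names are hypothetical. */
#define SKETCH_MAX_THREADS 20       /* cap on parallelism */
#define SKETCH_WORK_PER_THREAD 500  /* minimum lstat() calls per thread */

/*
 * Hypothetical helper: guess a thread count from the number of index
 * entries. Returns 0 when there is too little work for even two
 * threads to get SKETCH_WORK_PER_THREAD entries each.
 */
static int guess_thread_count(int nr_entries)
{
	int threads = nr_entries / SKETCH_WORK_PER_THREAD;

	if (threads < 2)
		return 0;
	if (threads > SKETCH_MAX_THREADS)
		threads = SKETCH_MAX_THREADS;
	return threads;
}
```

Note that this scales with the amount of work rather than with the
number of CPUs, which is a different approach from the one in the patch.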
However, your comments make me wonder if a rule-of-thumb like "3 +
online_cpus()" would yield better results across both large and small
numbers of cores with either blazing fast or very slow storage.
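To make that rule of thumb concrete, here is a minimal sketch. It is
untested against any real workload. The names are hypothetical, and I
use sysconf(_SC_NPROCESSORS_ONLN) as a stand-in for online_cpus() so
the snippet stands alone; that assumes a platform which provides it.

```c
#include <unistd.h>

/* The "3" from the rule of thumb; a guess, not a measured value. */
#define SKETCH_EXTRA_THREADS 3

/*
 * Stand-in for git's online_cpus(). Assumes the platform supports
 * sysconf(_SC_NPROCESSORS_ONLN); falls back to 1 if the call fails.
 */
static int online_cpus_sketch(void)
{
	long n = sysconf(_SC_NPROCESSORS_ONLN);

	return n > 0 ? (int)n : 1;
}

/* Hypothetical helper: thread count for a given number of CPUs. */
static int grep_thread_count(int ncpus)
{
	return SKETCH_EXTRA_THREADS + ncpus;
}
```

On a 2-core machine like yours this gives 5 threads, which falls between
the 2 and 8 you measured, so your numbers do not tell us yet whether it
would be enough on NFS.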
I will create a setup similar to the one you describe and do some
exploration.
Cheers,
-Eric
--
http://www.freesa.org/ -- mobile: +31 620719662
aim: ericigps -- skype: eric_herman -- jabber: eric.herman@gmail.com