git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Jeff King <peff@peff.net>
Cc: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>,
	"Thomas Rast" <trast@student.ethz.ch>,
	git@vger.kernel.org, "Eric Herman" <eric@freesa.org>,
	"Junio C Hamano" <gitster@pobox.com>
Subject: Re: [PATCH 4/2] grep: turn off threading for non-worktree
Date: Wed, 7 Dec 2011 15:11:05 -0500	[thread overview]
Message-ID: <20111207201105.GA22995@fieldses.org> (raw)
In-Reply-To: <20111207044242.GB10765@sigill.intra.peff.net>

On Tue, Dec 06, 2011 at 11:42:42PM -0500, Jeff King wrote:
> On Wed, Dec 07, 2011 at 12:01:37AM +0100, René Scharfe wrote:
> 
> > Reading of git objects needs to be protected by an exclusive lock
> > and cannot be parallelized.  Searching the read buffers can be done
> > in parallel, but for simple expressions threading is a net loss due
> > to its overhead, as measured by Thomas.  Turn it off unless we're
> > searching in the worktree.
> 
> Based on my earlier numbers, I was going to complain that we should
> also be checking the "simple expressions" assumption here, as time spent
> in the actual regex might be important.
> 
> However, after trying to repeat my experiment, I think the numbers I
> posted earlier were misleading. For example, using my "more complex"
> regex of 'a.*b':
> 
>   $ time git grep --threads=8 'a.*b' HEAD >/dev/null
>   real    0m8.655s
>   user    0m23.817s
>   sys     0m0.480s

Dumb question (I missed the beginning of the conversation): what kind of
storage are you using, and is the data already cached?

I seem to recall part of the motivation for the multithreading being
NFS, where the goal isn't so much to keep CPU's busy as it is to keep
the network busy.

Probably a bigger problem for something like "git status" which I think
ends up doing a series of stat's (which can each require a round trip to
the server in the NFS case), as it is a problem for something like
git-grep that's also doing reads.

Just a plea for considering the IO cost as well when making these kinds
of decisions....

(Which maybe you already do, apologies again for just naively dropping
into the middle of a thread.)

--b.

> 
> Look at that sweet, sweet parallelism. It's a quad-core with
> hyperthreading, so we're not getting the 8x speedup we might hope for
> (presumably due to lock contention on extracting blobs), but hey, 3x
> isn't bad. Except, wait:
> 
>   $ time git grep --threads=0 'a.*b' HEAD >/dev/null
>   real    0m7.651s
>   user    0m7.600s
>   sys     0m0.048s
> 
> We can get 1x on a single core, but the total time is lower! This
> processor is an i7 with "turbo boost", which means it clocks higher in
> single-core mode than when multiple cores are active. So the numbers I
> posted earlier were misleading. Yes, we got parallelism, but at the cost
> of knocking the clock speed down for a net loss.
> 
> The sweet spot for me seems to be:
> 
>   $ time git grep --threads=2 'a.*b' HEAD >/dev/null
>   real    0m6.303s
>   user    0m11.129s
>   sys     0m0.220s
> 
> I'd be curious to see results from somebody with a quad-core (or more)
> without turbo boost; I suspect that threading may have more benefit
> there, even though we have some lock contention for blobs.
> 
> > --- a/builtin/grep.c
> > +++ b/builtin/grep.c
> > @@ -1048,7 +1048,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
> >  	nr_threads = 0;
> >  #else
> >  	if (nr_threads == -1)
> > -		nr_threads = (online_cpus() > 1) ? THREADS : 0;
> > +		nr_threads = (online_cpus() > 1 && !list.nr) ? THREADS : 0;
> >  
> >  	if (nr_threads > 0) {
> >  		opt.use_threads = 1;
> 
> This doesn't kick in for "--cached", which has the same performance
> characteristics as grepping a tree. I think you want to add "&& !cached" to
> the conditional.
> 
> -Peff
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2011-12-07 20:11 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-25 14:46 [PATCH] grep: load funcname patterns for -W Thomas Rast
2011-11-25 16:32 ` René Scharfe
2011-11-26 12:15   ` [PATCH] grep: enable multi-threading for -p and -W René Scharfe
2011-11-29  9:54     ` Thomas Rast
2011-11-29 13:49       ` René Scharfe
2011-11-29 14:07         ` Thomas Rast
2011-12-02 13:07           ` [PATCH v2 0/3] grep multithreading and scaling Thomas Rast
2011-12-02 13:07             ` [PATCH v2 1/3] grep: load funcname patterns for -W Thomas Rast
2011-12-02 13:07             ` [PATCH v2 2/3] grep: enable threading with -p and -W using lazy attribute lookup Thomas Rast
2011-12-02 13:07             ` [PATCH v2 3/3] grep: disable threading in all but worktree case Thomas Rast
2011-12-02 16:15               ` René Scharfe
2011-12-05  9:02                 ` Thomas Rast
2011-12-06 22:48                 ` René Scharfe
2011-12-06 23:01                   ` [PATCH 4/2] grep: turn off threading for non-worktree René Scharfe
2011-12-07  4:42                     ` Jeff King
2011-12-07 17:11                       ` René Scharfe
2011-12-07 18:28                         ` Jeff King
2011-12-07 20:11                       ` J. Bruce Fields [this message]
2011-12-07 20:45                         ` Jeff King
2011-12-07  8:12                     ` Thomas Rast
2011-12-07 17:00                       ` René Scharfe
2011-12-10 13:13                         ` Pete Wyckoff
2011-12-12 22:37                           ` René Scharfe
2011-12-07  4:24                   ` [PATCH v2 3/3] grep: disable threading in all but worktree case Jeff King
2011-12-07 16:52                     ` René Scharfe
2011-12-07 18:10                       ` Jeff King
2011-12-07  8:11                   ` Thomas Rast
2011-12-07 16:54                     ` René Scharfe
2011-12-12 21:16                 ` [PATCH v3 0/3] grep attributes and multithreading Thomas Rast
2011-12-12 21:16                   ` [PATCH v3 1/3] grep: load funcname patterns for -W Thomas Rast
2011-12-12 21:16                   ` [PATCH v3 2/3] grep: enable threading with -p and -W using lazy attribute lookup Thomas Rast
2011-12-16  8:22                     ` Johannes Sixt
2011-12-16 17:34                       ` Junio C Hamano
2011-12-12 21:16                   ` [PATCH v3 3/3] grep: disable threading in non-worktree case Thomas Rast
2011-12-12 22:37                   ` [PATCH v3 0/3] grep attributes and multithreading René Scharfe
2011-12-12 23:44                   ` Junio C Hamano
2011-12-13  8:44                     ` Thomas Rast
2011-12-23 22:37               ` [PATCH v2 3/3] grep: disable threading in all but worktree case Ævar Arnfjörð Bjarmason
2011-12-23 22:49                 ` Thomas Rast
2011-12-24  1:39                   ` Ævar Arnfjörð Bjarmason
2011-12-24  7:07                     ` Jeff King
2011-12-24 10:49                       ` Nguyen Thai Ngoc Duy
2011-12-24 10:55                       ` Nguyen Thai Ngoc Duy
2011-12-24 13:38                         ` Jeff King
2011-12-25  3:32                       ` Nguyen Thai Ngoc Duy
2011-12-02 17:34             ` [PATCH v2 0/3] grep multithreading and scaling Jeff King
2011-12-05  9:38               ` Thomas Rast
2011-12-05 20:16                 ` Thomas Rast
2011-12-06  0:40                 ` Jeff King
2011-12-02 20:02             ` Eric Herman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111207201105.GA22995@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=eric@freesa.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=rene.scharfe@lsrfire.ath.cx \
    --cc=trast@student.ethz.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).