From: Linus Torvalds <torvalds@linux-foundation.org>
To: Fredrik Kuivinen <frekui@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v3] Threaded grep
Date: Mon, 18 Jan 2010 10:05:19 -0800 (PST)
Message-ID: <alpine.LFD.2.00.1001180947140.13231@localhost.localdomain>
In-Reply-To: <20100118103334.GA17361@fredrik-laptop>



On Mon, 18 Jan 2010, Fredrik Kuivinen wrote:
>
> Make git grep use threads when it is available.

Ok, this is much better.

On my machine (4 cores with HT, so 8 threads total), I get the 
following:

 [ Note: the --no-ext-grep is because I'm comparing with a git that has 
  the original grep optimization, but hasn't removed the external grep 
  functionality yet. I just used the same command line when comparing 
  against next+your patch, so don't mind it.

  Also, the three examples are chosen to be: "trivial single match", 
  "regex single match", and "trivial lots of matches" ]

Before (all best-of-five), with the non-threaded internal grep:

 - git grep --no-ext-grep qwerty

	real	0m0.311s
	user	0m0.164s
	sys	0m0.144s

 - git grep --no-ext-grep qwerty.*as

	real	0m0.555s
	user	0m0.416s
	sys	0m0.132s

 - git grep --no-ext-grep function

	real	0m0.461s
	user	0m0.276s
	sys	0m0.180s

After, with threading:

 - git grep --no-ext-grep qwerty

	real	0m0.152s
	user	0m0.788s
	sys	0m0.212s

 - git grep --no-ext-grep qwerty.*as

	real	0m0.148s
	user	0m0.724s
	sys	0m0.284s

 - git grep --no-ext-grep function

	real	0m0.241s
	user	0m1.436s
	sys	0m0.216s

so now it's a clear win in all cases. And as you'd expect, the "complex 
case with single output" is the one that wins most, since it's the one 
that should have the most CPU usage, with the least synchronization.
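
For reference, the wall-clock speedups implied by the "real" times above 
can be worked out with a short sketch like the following. The numbers are 
copied from the timings in this message; the variable names are only 
illustrative and not part of the patch.

```python
# Wall-clock ("real") seconds copied from the timings above.
before = {"qwerty": 0.311, "qwerty.*as": 0.555, "function": 0.461}
after = {"qwerty": 0.152, "qwerty.*as": 0.148, "function": 0.241}

# Speedup = non-threaded wall-clock time / threaded wall-clock time.
speedup = {pattern: before[pattern] / after[pattern] for pattern in before}

for pattern, factor in speedup.items():
    print(f"{pattern:12s} {factor:.2f}x")
```

This gives roughly 2.0x, 3.8x and 1.9x for the three patterns, so the 
regex case with a single match gains the most.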

That said, it still does waste quite a bit of time doing this, and while 
it almost doubles performance in the last case too, it does so at the cost 
of quite a bit of overhead.

(Some overhead is natural, especially since I have HT. Running two threads 
on the same core does _not_ give double the performance, so the CPU time 
almost doubling - because the threads themselves run slower - is not 
unexpected. It's when the CPU time more than quadruples that I suspect 
that it could be improved a bit).
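
The CPU-time growth can be checked the same way. This is only a sketch 
using the user and sys figures quoted above; it assumes "CPU time" is 
read either as user time alone or as user+sys, and reports both. The 
helper name ratios() is made up for illustration.

```python
# (user, sys) seconds copied from the timings above.
before = {
    "qwerty": (0.164, 0.144),
    "qwerty.*as": (0.416, 0.132),
    "function": (0.276, 0.180),
}
after = {
    "qwerty": (0.788, 0.212),
    "qwerty.*as": (0.724, 0.284),
    "function": (1.436, 0.216),
}

def ratios(pattern):
    """Return (user ratio, user+sys ratio) of threaded vs. non-threaded."""
    user_before, sys_before = before[pattern]
    user_after, sys_after = after[pattern]
    user_ratio = user_after / user_before
    total_ratio = (user_after + sys_after) / (user_before + sys_before)
    return user_ratio, total_ratio

for pattern in before:
    user_ratio, total_ratio = ratios(pattern)
    print(f"{pattern:12s} user {user_ratio:.2f}x  user+sys {total_ratio:.2f}x")
```

With these numbers, user time grows by more than 4x for "qwerty" and 
"function" but by less than 2x for "qwerty.*as", which matches the 
observation that the regex case has the least overhead.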

		Linus

