git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Fredrik Kuivinen <frekui@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Johannes Sixt <j.sixt@viscovery.net>
Subject: Re: [PATCH v4] Threaded grep
Date: Mon, 25 Jan 2010 15:59:56 -0800 (PST)	[thread overview]
Message-ID: <alpine.LFD.2.00.1001251542100.3574@localhost.localdomain> (raw)
In-Reply-To: <20100125225139.GA3048@fredrik-laptop>



On Mon, 25 Jan 2010, Fredrik Kuivinen wrote:
> 
> The results below are best of five runs in the Linux repository (on a
> box with two cores).
> 
> git grep qwerty

Before:

	real	0m0.531s
	user	0m0.412s
	sys	0m0.112s

After:

	real	0m0.151s
	user	0m0.720s
	sys	0m0.272s


> $ /usr/bin/time git grep void

Before:

	real	0m1.144s
	user	0m0.988s
	sys	0m0.148s

After:
	real	0m0.290s
	user	0m1.732s
	sys	0m0.232s

So it's helping a lot (~3.5x and ~3.9x) on this 4-core HT setup. 

I don't seem to ever get more than a 4x speedup, so my guess is that HT 
simply isn't able to do much of anything with this load. 

The profile for the threaded case says:

    51.73%      git  libc-2.11.1.so                 [.] re_search_internal
    11.47%      git  [kernel]                       [k] copy_user_generic_string
     2.90%      git  libc-2.11.1.so                 [.] __strlen_sse2
     2.66%      git  [kernel]                       [k] link_path_walk
     2.55%      git  [kernel]                       [k] intel_pmu_enable_all
     2.40%      git  [kernel]                       [k] __d_lookup
     1.71%      git  libc-2.11.1.so                 [.] __GI___libc_malloc
     1.55%      git  [kernel]                       [k] _raw_spin_lock
     1.43%      git  [kernel]                       [k] sys_futex
     1.30%      git  libc-2.11.1.so                 [.] __cfree
     1.28%      git  [kernel]                       [k] intel_pmu_disable_all
     1.25%      git  libc-2.11.1.so                 [.] __GI_memchr
     1.14%      git  libc-2.11.1.so                 [.] _int_malloc
     1.02%      git  [kernel]                       [k] effective_load

and the only thing that makes me go "eh?" there is the strlen(). Why is 
that so hot?  But locking doesn't seem to be the biggest issue, and in 
general I think this is all pretty good. The 'effective_load' thing is the 
scheduler, so there's certainly some context switching going on, probably 
still due to excessive synchronization, but it's equally clear that that 
is certainly not a dominant factor.

One potentially interesting data point is that if I make NR_THREADS be 16, 
performance goes down, and I get more locking overhead. So NR_THREADS of 8 
works well on this machine.

So ack from me. The patch looks reasonably clean too, at least for 
something as complex as a multi-threaded grep.

One worry is, of course, whether all regex() implementations are 
thread-safe. Maybe there are broken libraries that have hidden global 
state in them?

			Linus

  reply	other threads:[~2010-01-26  0:00 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-25 22:51 [PATCH v4] Threaded grep Fredrik Kuivinen
2010-01-25 23:59 ` Linus Torvalds [this message]
2010-01-26 12:10   ` Fredrik Kuivinen
2010-01-26 15:28     ` Linus Torvalds
2010-01-26 16:30       ` Benjamin Kramer
2010-01-26 16:44         ` Linus Torvalds
2010-01-26 16:56           ` Linus Torvalds
2010-01-26 17:19             ` Mike Hommey
2010-01-26 17:48             ` [PATCH] grep: use REG_STARTEND (if available) to speed up regexec Benjamin Kramer
2010-01-26  1:20 ` [PATCH v4] Threaded grep Junio C Hamano
2010-01-26 11:43   ` Fredrik Kuivinen
2010-01-26 17:21     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.2.00.1001251542100.3574@localhost.localdomain \
    --to=torvalds@linux-foundation.org \
    --cc=frekui@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j.sixt@viscovery.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).