From: Linus Torvalds <torvalds@linux-foundation.org>
To: Fredrik Kuivinen <frekui@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
Junio C Hamano <gitster@pobox.com>,
Johannes Sixt <j.sixt@viscovery.net>
Subject: Re: [PATCH v4] Threaded grep
Date: Mon, 25 Jan 2010 15:59:56 -0800 (PST) [thread overview]
Message-ID: <alpine.LFD.2.00.1001251542100.3574@localhost.localdomain> (raw)
In-Reply-To: <20100125225139.GA3048@fredrik-laptop>
On Mon, 25 Jan 2010, Fredrik Kuivinen wrote:
>
> The results below are best of five runs in the Linux repository (on a
> box with two cores).
>
> git grep qwerty
Before:
real 0m0.531s
user 0m0.412s
sys 0m0.112s
After:
real 0m0.151s
user 0m0.720s
sys 0m0.272s
> $ /usr/bin/time git grep void
Before:
real 0m1.144s
user 0m0.988s
sys 0m0.148s
After:
real 0m0.290s
user 0m1.732s
sys 0m0.232s
So it's helping a lot (~3.5x and ~3.9x) on this 4-core HT setup.
I don't seem to ever get more than a 4x speedup, so my guess is that HT
simply isn't able to do much of anything with this load.
The profile for the threaded case says:
51.73% git libc-2.11.1.so [.] re_search_internal
11.47% git [kernel] [k] copy_user_generic_string
2.90% git libc-2.11.1.so [.] __strlen_sse2
2.66% git [kernel] [k] link_path_walk
2.55% git [kernel] [k] intel_pmu_enable_all
2.40% git [kernel] [k] __d_lookup
1.71% git libc-2.11.1.so [.] __GI___libc_malloc
1.55% git [kernel] [k] _raw_spin_lock
1.43% git [kernel] [k] sys_futex
1.30% git libc-2.11.1.so [.] __cfree
1.28% git [kernel] [k] intel_pmu_disable_all
1.25% git libc-2.11.1.so [.] __GI_memchr
1.14% git libc-2.11.1.so [.] _int_malloc
1.02% git [kernel] [k] effective_load
and the only thing that makes me go "eh?" there is the strlen(). Why is
that so hot? But locking doesn't seem to be the biggest issue, and in
general I think this is all pretty good. The 'effective_load' thing is the
scheduler, so there's certainly some context switching going on, probably
still due to excessive synchronization, but it's equally clear that that
is certainly not a dominant factor.
One potentially interesting data point is that if I make NR_THREADS be 16,
performance goes down, and I get more locking overhead. So NR_THREADS of 8
works well on this machine.
So ack from me. The patch looks reasonably clean too, at least for
something as complex as a multi-threaded grep.
One worry is, of course, whether all regex() implementations are
thread-safe. Maybe there are broken libraries that have hidden global
state in them?
Linus
next prev parent reply other threads:[~2010-01-26 0:00 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-01-25 22:51 [PATCH v4] Threaded grep Fredrik Kuivinen
2010-01-25 23:59 ` Linus Torvalds [this message]
2010-01-26 12:10 ` Fredrik Kuivinen
2010-01-26 15:28 ` Linus Torvalds
2010-01-26 16:30 ` Benjamin Kramer
2010-01-26 16:44 ` Linus Torvalds
2010-01-26 16:56 ` Linus Torvalds
2010-01-26 17:19 ` Mike Hommey
2010-01-26 17:48 ` [PATCH] grep: use REG_STARTEND (if available) to speed up regexec Benjamin Kramer
2010-01-26 1:20 ` [PATCH v4] Threaded grep Junio C Hamano
2010-01-26 11:43 ` Fredrik Kuivinen
2010-01-26 17:21 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.2.00.1001251542100.3574@localhost.localdomain \
--to=torvalds@linux-foundation.org \
--cc=frekui@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=j.sixt@viscovery.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).