From: Fredrik Kuivinen <frekui@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v3] Threaded grep
Date: Tue, 19 Jan 2010 08:34:54 +0100 [thread overview]
Message-ID: <20100119073454.GA15575@fredrik-laptop> (raw)
In-Reply-To: <alpine.LFD.2.00.1001180947140.13231@localhost.localdomain>
On Mon, Jan 18, 2010 at 10:05:19AM -0800, Linus Torvalds wrote:
> On my machine (4 cores with HT, so 8 threads total), I get the
> following:
>
> [ Note: the --no-ext-grep is because I'm comparing with a git that has
> the original grep optimization, but hasn't removed the external grep
> functionality yet. I just used the same command line when then comparing
> to next+your patch, so don't mind it.
>
> Also, the three examples are chosen to be: "trivial single match",
> "regex single match", and "trivial lots of matches" ]
You may be testing slightly different things: next has 885d211 (grep:
rip out pessimization to use fixmatch()) and bbc09c2 (grep: rip out
support for external grep) was added before. So next+patch uses
regexec and the version with support for external greps uses
strstr. IIRC strstr was a bit faster than regexec on your
box. However, this only explains a small part of the results you are
seeing.
>
> Before (all best-of-five), with the non-threaded internal grep:
>
> - git grep --no-ext-grep qwerty
>
> real 0m0.311s
> user 0m0.164s
> sys 0m0.144s
>
> - git grep --no-ext-grep qwerty.*as
>
> real 0m0.555s
> user 0m0.416s
> sys 0m0.132s
>
> - git grep --no-ext-grep function
>
> real 0m0.461s
> user 0m0.276s
> sys 0m0.180s
>
> After, with threading:
>
> - git grep --no-ext-grep qwerty
>
> real 0m0.152s
> user 0m0.788s
> sys 0m0.212s
>
> - git grep --no-ext-grep qwerty.*as
>
> real 0m0.148s
> user 0m0.724s
> sys 0m0.284s
>
> - git grep --no-ext-grep function
>
> real 0m0.241s
> user 0m1.436s
> sys 0m0.216s
>
> so now it's a clear win in all cases. And as you'd expect, the "complex
> case with single output" is the one that wins most, since it's the one
> that should have the most CPU usage, with the least synchronization.
>
> That said, it still does waste quite a bit of time doing this, and while
> it almost doubles performance in the last case too, it does so at the cost
> of quite a bit of overhead.
>
> (Some overhead is natural, especially since I have HT. Running two threads
> on the same core does _not_ give double the performance, so the CPU time
> almost doubling - because the threads themselves run slower - is not
> unexpected. It's when the CPU time more than quadruples that I suspect
> that it could be improved a bit).
I haven't seen this overhead during my tests. But I'm _guessing_ that
the problem is that the mutex grep_lock is held while the result is
written to stdout. So when we are writing, no significant amount of
work can be done by any thread. Here is a patch to fix this (applies
on to of v3).
diff --git a/builtin-grep.c b/builtin-grep.c
index 8fca7a6..422b0ec 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -139,21 +139,48 @@ static int grep_file_async(struct grep_opt *opt, char *name,
return 0;
}
+/* Mark the work_item as done. The function guarantees that w->done is
+ * set to 1 and that if w is included in a prefix of the range
+ * [todo_done, todo_start) where each work_item has .done == 1, then
+ * w->out will eventually be written to stdout. The writing is done
+ * either by this thread or by the thread which has set 'writing' to
+ * 1.
+ */
static void work_done(struct work_item *w)
{
- int old_done;
+ /* 1 if there is a thread which is writing data to stdout and
+ 0 otherwise. */
+ static int writing = 0;
+ int old_done, write_idx, write_to;
pthread_mutex_lock(&grep_lock);
w->done = 1;
old_done = todo_done;
- for(; todo[todo_done].done && todo_done != todo_start;
- todo_done = (todo_done+1) % ARRAY_SIZE(todo)) {
- w = &todo[todo_done];
- write_or_die(1, w->out.buf, w->out.len);
- if (w->type == WORK_BUF)
- free(w->buf);
-
- free(w->name);
+
+ /* We need a loop here if todo_start is changed while we write
+ * some data. */
+ while (!writing && todo[todo_done].done && todo_done != todo_start) {
+ write_idx = todo_done;
+ write_to = todo_start;
+ writing = 1;
+
+ /* Unlock the mutex while we write the data. This is
+ * safe as writing == 1 and hence there is at most one
+ * thread which writes data at any time. */
+ pthread_mutex_unlock(&grep_lock);
+ for(; todo[write_idx].done && write_idx != write_to;
+ write_idx = (write_idx+1) % ARRAY_SIZE(todo)) {
+ w = &todo[write_idx];
+ write_or_die(1, w->out.buf, w->out.len);
+ if (w->type == WORK_BUF)
+ free(w->buf);
+
+ free(w->name);
+ }
+
+ pthread_mutex_lock(&grep_lock);
+ writing = 0;
+ todo_done = write_idx;
}
if (old_done != todo_done)
next prev parent reply other threads:[~2010-01-19 7:35 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-01-18 10:33 [PATCH v3] Threaded grep Fredrik Kuivinen
2010-01-18 11:11 ` Johannes Sixt
2010-01-18 13:28 ` Fredrik Kuivinen
2010-01-18 13:45 ` Johannes Sixt
2010-01-18 18:05 ` Linus Torvalds
2010-01-19 7:34 ` Fredrik Kuivinen [this message]
2010-01-19 15:41 ` Linus Torvalds
[not found] ` <7vmy0bhxn1.fsf@alter.siamese.dyndns.org>
2010-01-19 0:12 ` Fredrik Kuivinen
2010-01-19 7:03 ` Johannes Sixt
2010-01-25 22:47 ` Fredrik Kuivinen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100119073454.GA15575@fredrik-laptop \
--to=frekui@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).