From: Fredrik Kuivinen <frekui@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Junio C Hamano <gitster@pobox.com>, Miles Bader <miles@gnu.org>,
Jeff King <peff@peff.net>,
Nguyen Thai Ngoc Duy <pclouds@gmail.com>,
git@vger.kernel.org
Subject: Re: [PATCH] grep: do not do external grep on skip-worktree entries
Date: Mon, 11 Jan 2010 22:07:00 +0100
Message-ID: <4c8ef71001111307q6679039ajbef22f2e1748df56@mail.gmail.com>
In-Reply-To: <alpine.LFD.2.00.1001111159270.17145@localhost.localdomain>
[-- Attachment #1: Type: text/plain, Size: 1685 bytes --]
On Mon, Jan 11, 2010 at 21:07, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
> On Mon, 11 Jan 2010, Fredrik Kuivinen wrote:
>> >
>> > Try a complex pattern ("qwerty.*as" finds the same line), and see if that
>> > too is slower than before. If that is faster than it used to be (with
>> > --no-ext-grep, of course), then it's strstr() that is badly implemented.
>>
>> Ah, yes, that's it. With the pattern "qwerty.*as" I get 2.5s with the
>> patch and 6s without.
>
> Ok, so on your machine, regcomp() is basically twice as fast as strstr().
Yes.
> Which is not entirely unexpected: I was actually surprised by strstr()
> being apparently so good on my machine. I do not generally expect things
> like that to be at all optimized for bigger working sets. Most common uses
> of strstr() are in short strings - not "strings" that are many kilobytes
> in size (the whole file).
>
> In fact, I suspect it works so well for me because in my version of glibc
> it's not just SSE-optimized: judging by the naming it's SSE4.2 optimized -
> so the case I see on my machine will _only_ happen on Nehalem-based cores
> (ie the new "Core i[357]" cpu's).
>
> It is entirely possible that strstr in general is a disaster.
Another option is to use memmem() instead. Since we already know the
length of the buffer, it should be a slight improvement over strstr()
for everyone. memmem() may cause some portability problems, though, as
it is a GNU extension.
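To illustrate the idea outside of grep.c, here is a minimal sketch of a length-bounded fixed-string search. The names find_fixed() and naive_memmem() are hypothetical and are not part of the attached patch; naive_memmem() is only an assumption about what a simple fallback could look like on a platform that lacks memmem().

```c
#define _GNU_SOURCE   /* memmem() is a GNU extension declared in <string.h> */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not in grep.c): search for the NUL-terminated
 * fixed string `pattern` in the half-open range [line, eol).
 * Returns a pointer to the first match, or NULL if there is none.
 */
static const char *find_fixed(const char *line, const char *eol,
			      const char *pattern)
{
	assert(eol >= line);
	/* memmem() takes explicit lengths, so it never looks past eol. */
	return memmem(line, (size_t)(eol - line), pattern, strlen(pattern));
}

/*
 * Hypothetical fallback for systems without memmem(): a plain
 * quadratic scan with the same contract as memmem().
 */
static const void *naive_memmem(const void *hay, size_t hay_len,
				const void *needle, size_t needle_len)
{
	const char *h = hay;

	if (needle_len == 0)
		return hay;
	if (needle_len > hay_len)
		return NULL;
	for (size_t i = 0; i + needle_len <= hay_len; i++)
		if (!memcmp(h + i, needle, needle_len))
			return h + i;
	return NULL;
}
```

Unlike strstr(), neither function depends on a NUL terminator in the haystack, so the search stops at eol instead of running on to the end of the buffer.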
I get these results (git-grep --no-ext-grep qwerty, best of five):

  Junio's patch:                                      0:04.84
  memmem (attached patch on top of Junio's):          0:02.91
  regcomp/regexec (I changed is_fixed to always
  return 0, also on top of Junio's):                  0:02.02
- Fredrik
[-- Attachment #2: patch --]
[-- Type: application/octet-stream, Size: 1236 bytes --]
diff --git a/grep.c b/grep.c
index 940e200..d34247f 100644
--- a/grep.c
+++ b/grep.c
@@ -264,13 +264,14 @@ static void show_name(struct grep_opt *opt, const char *name)
 }
-static int fixmatch(const char *pattern, char *line, int ignore_case, regmatch_t *match)
+static int fixmatch(const char *pattern, char *line, char *eol,
+		    int ignore_case, regmatch_t *match)
 {
 	char *hit;
 	if (ignore_case)
 		hit = strcasestr(line, pattern);
 	else
-		hit = strstr(line, pattern);
+		hit = memmem(line, eol - line, pattern, strlen(pattern));
 	if (!hit) {
 		match->rm_so = match->rm_eo = -1;
@@ -333,7 +334,7 @@ static int match_one_pattern(struct grep_pat *p, char *bol, char *eol,
  again:
 	if (p->fixed)
-		hit = !fixmatch(p->pattern, bol, p->ignore_case, pmatch);
+		hit = !fixmatch(p->pattern, bol, eol, p->ignore_case, pmatch);
 	else
 		hit = !regexec(&p->regexp, bol, 1, pmatch, eflags);
@@ -646,7 +647,7 @@ static int look_ahead(struct grep_opt *opt,
 		regmatch_t m;
 		if (p->fixed)
-			hit = !fixmatch(p->pattern, bol, p->ignore_case, &m);
+			hit = !fixmatch(p->pattern, bol, bol + *left_p, p->ignore_case, &m);
 		else
 			hit = !regexec(&p->regexp, bol, 1, &m, 0);
 		if (!hit || m.rm_so < 0 || m.rm_eo < 0)