From: Linus Torvalds <torvalds@linux-foundation.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: Miles Bader <miles@gnu.org>, Jeff King <peff@peff.net>,
Nguyen Thai Ngoc Duy <pclouds@gmail.com>,
git@vger.kernel.org
Subject: Re: [PATCH] grep: do not do external grep on skip-worktree entries
Date: Mon, 11 Jan 2010 07:59:18 -0800 (PST) [thread overview]
Message-ID: <alpine.LFD.2.00.1001110748560.13040@localhost.localdomain> (raw)
In-Reply-To: <alpine.LFD.2.00.1001110739280.13040@localhost.localdomain>
On Mon, 11 Jan 2010, Linus Torvalds wrote:
>
> Without testing it, I can already ACK it. It looks like the
> ObviouslyRightThing(tm) to do. But I'll run some numbers too.
Ok, some good news, some meh news, and some bad news.
The good news: the trivial numbers look good. It's noticeably faster than
external grep for me when it does the 'fixmatch()' case, quite probably
because fixmatch() on at least Linux/x86-64 (which is the only case I
really care about) uses SSE to do the string ops.
So on my Nehalem:
[torvalds@nehalem linux]$ time git grep qwerty > /dev/null
real 0m0.418s
user 0m0.204s
sys 0m0.136s
[torvalds@nehalem linux]$ time git grep --no-ext-grep qwerty > /dev/null
real 0m0.309s
user 0m0.168s
sys 0m0.136s
and since that simple fixmatch case is the common one for me, I'm happy.
The meh news: this shows how grep is faster than regexec() due to being a
smarter algorithm. For the non-fixed case (I used "qwerty.*as"), the
numbers are
- built-in:
real 0m0.548s
user 0m0.384s
sys 0m0.152s
- external:
real 0m0.415s
user 0m0.176s
sys 0m0.160s
so it really is just 'strstr()' that is faster. But This is a 'meh',
because I don't really care, and the new code is still way faster than the
old one. And I'd be personally willing to just drop the external grep if
this is the worst problem.
[ I worry a bit that some libc implementations of 'strstr' may suck, but I
wouldn't lose sleep over it. ]
The bad news is that you broke multi-line greps:
git grep --no-ext-grep -2 qwerty.*as
results in:
drivers/char/keyboard.c-unsigned char kbd_sysrq_xlate[KEY_MAX + 1] =
drivers/char/keyboard.c- "\000\0331234567890-=\177\t" /* 0x00 - 0x0f */
drivers/char/keyboard.c: "qwertyuiop[]\r\000as" /* 0x10 - 0x1f */
when the _correct_ result is
drivers/char/keyboard.c-unsigned char kbd_sysrq_xlate[KEY_MAX + 1] =
drivers/char/keyboard.c- "\000\0331234567890-=\177\t" /* 0x00 - 0x0f */
drivers/char/keyboard.c: "qwertyuiop[]\r\000as" /* 0x10 - 0x1f */
drivers/char/keyboard.c- "dfghjkl;'`\000\\zxcv" /* 0x20 - 0x2f */
drivers/char/keyboard.c- "bnm,./\000*\000 \000\201\202\203\204\205" /* 0x30 - 0x3f */
ie it didn't do the "two lines after" thing.
That said, the external grep also gets this wrong (a different way),
because it gets all the extra noise due to unnecessary separation lines,
so for the external grep I actually get
--
--
--
--
--
--
--
--
--
--
--
--
--
--
--
drivers/char/keyboard.c-unsigned char kbd_sysrq_xlate[KEY_MAX + 1] =
drivers/char/keyboard.c- "\000\0331234567890-=\177\t" /* 0x00 - 0x0f */
drivers/char/keyboard.c: "qwertyuiop[]\r\000as" /* 0x10 - 0x1f */
drivers/char/keyboard.c- "dfghjkl;'`\000\\zxcv" /* 0x20 - 0x2f */
drivers/char/keyboard.c- "bnm,./\000*\000 \000\201\202\203\204\205" /* 0x30 - 0x3f */
--
--
--
--
--
--
--
--
--
--
--
--
--
--
--
--
but that's a long-standing problem, and is more "ugly" than "wrong grep
results".
Linus
next prev parent reply other threads:[~2010-01-11 15:59 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-12-30 14:11 [PATCH] grep: do not do external grep on skip-worktree entries Nguyễn Thái Ngọc Duy
2009-12-31 7:01 ` Junio C Hamano
2009-12-31 7:09 ` Junio C Hamano
2010-01-02 11:50 ` Nguyen Thai Ngoc Duy
2010-01-02 18:44 ` Junio C Hamano
2010-01-02 19:15 ` Nguyen Thai Ngoc Duy
2010-01-02 19:45 ` Junio C Hamano
2010-01-03 2:35 ` Miles Bader
2010-01-03 2:47 ` Miles Bader
2010-01-03 3:08 ` Miles Bader
2010-01-03 19:32 ` Linus Torvalds
2010-01-03 20:49 ` Junio C Hamano
2010-01-04 5:31 ` Jeff King
2010-01-04 5:52 ` Junio C Hamano
2010-01-04 6:44 ` Jeff King
2010-01-04 7:08 ` Junio C Hamano
2010-01-04 7:14 ` Junio C Hamano
2010-01-04 7:29 ` Jeff King
2010-01-04 7:26 ` Jeff King
2010-01-04 8:09 ` Jeff King
2010-01-04 16:01 ` Linus Torvalds
2010-01-04 15:54 ` Linus Torvalds
2010-01-04 15:57 ` Miles Bader
2010-01-04 16:03 ` Linus Torvalds
2010-01-11 6:39 ` Junio C Hamano
2010-01-11 15:43 ` Linus Torvalds
2010-01-11 15:59 ` Linus Torvalds [this message]
2010-01-11 16:22 ` Junio C Hamano
2010-01-11 16:24 ` Junio C Hamano
2010-01-11 16:33 ` Linus Torvalds
2010-01-12 8:29 ` Junio C Hamano
2010-01-12 8:31 ` [PATCH] grep: lookahead optimization can be used with -L option Junio C Hamano
2010-01-12 8:32 ` [PATCH] grep: -L should show empty files Junio C Hamano
2010-01-12 21:27 ` Sverre Rabbelier
2010-01-13 6:56 ` Junio C Hamano
2010-01-13 16:04 ` Sverre Rabbelier
2010-01-13 19:48 ` Junio C Hamano
2010-01-13 6:48 ` [PATCH 1/2] grep: rip out support for external grep Junio C Hamano
2010-01-13 8:29 ` Jay Soffian
2010-01-13 8:59 ` Junio C Hamano
2010-01-13 15:20 ` Linus Torvalds
2010-01-13 6:51 ` [PATCH 2/2] grep: rip out pessimization to use fixmatch() Junio C Hamano
2010-01-12 16:21 ` [PATCH] grep: do not do external grep on skip-worktree entries Jeff King
2010-01-11 19:26 ` Fredrik Kuivinen
[not found] ` <4c8ef71001111119p253170f8q37bcd3708d894a62@mail.gmail.com>
2010-01-11 19:29 ` Linus Torvalds
2010-01-11 19:40 ` Fredrik Kuivinen
2010-01-11 20:07 ` Linus Torvalds
2010-01-11 21:07 ` Fredrik Kuivinen
2010-01-11 21:24 ` Linus Torvalds
2010-01-04 16:24 ` Linus Torvalds
2010-01-04 10:14 ` Nguyen Thai Ngoc Duy
2010-01-04 6:06 ` Mike Hommey
2010-01-04 7:04 ` Jeff King
2010-01-04 12:34 ` [PATCH 1/2] t7002: set test prerequisite "external-grep" if supported Nguyễn Thái Ngọc Duy
2010-01-07 2:37 ` Junio C Hamano
2010-01-07 4:29 ` Junio C Hamano
2010-01-07 13:27 ` Nguyen Thai Ngoc Duy
2010-01-07 14:04 ` Johannes Sixt
2010-01-07 14:26 ` Nguyen Thai Ngoc Duy
2010-01-04 12:34 ` [PATCH 2/2] t7002: add tests for skip-worktree fixes in commit a67e281 Nguyễn Thái Ngọc Duy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.2.00.1001110748560.13040@localhost.localdomain \
--to=torvalds@linux-foundation.org \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=miles@gnu.org \
--cc=pclouds@gmail.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).