From: Joe Perches <joe@perches.com>
To: "Stefan Beller" <sbeller@google.com>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git <git@vger.kernel.org>
Subject: Re: grep vs git grep performance?
Date: Thu, 26 Oct 2017 10:41:36 -0700
Message-ID: <1509039696.11245.9.camel@perches.com>
In-Reply-To: <CAGZ79ka41NdzNxGAvtVW802088KydKkp3yHx=Z5q3Mc9GGa_+g@mail.gmail.com>
On Thu, 2017-10-26 at 09:58 -0700, Stefan Beller wrote:
> + Avar who knows a thing about pcre (I assume the regex compilation
> has impact on grep speed)
>
> On Thu, Oct 26, 2017 at 8:02 AM, Joe Perches <joe@perches.com> wrote:
> > Comparing a cache warm git grep vs command line grep
> > shows significant differences in cpu & wall clock.
> >
> > Any ideas how to improve this?
> >
> > $ time git grep "\bseq_.*%p\W" | wc -l
> > 112
> >
> > real 0m4.271s
> > user 0m15.520s
> > sys 0m0.395s
> >
> > $ time grep -r --include=*.[ch] "\bseq_.*%p\W" * | wc -l
> > 112
> >
> > real 0m1.164s
> > user 0m0.847s
> > sys 0m0.314s
> >
>
> I wonder how much is algorithmic advantage vs coding/micro
> optimization that we can do.

As do I. I presume this is libpcre related.

For instance, git grep performance is better than grep for:

$ time git grep -w "seq_printf" -- "*.[ch]" | wc -l
8609

real 0m0.301s
user 0m0.548s
sys 0m0.372s

$ time grep -w -r --include=*.[ch] "seq_printf" * | wc -l
8609

real 0m0.706s
user 0m0.396s
sys 0m0.309s
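
One way to narrow down whether the regex engine is the cause would be to time the same pattern under each engine git grep can select. The following is only a rough sketch: match_count and compare_engines are hypothetical helper names, and -P assumes a git built with PCRE support, which may not hold for every install.

```shell
#!/bin/bash
# Hypothetical helper (not part of git or grep): run a search command
# and print only the number of output lines, so that different runs
# can be checked for the same result before their timings are compared.
match_count() {
    "$@" | wc -l | tr -d ' '
}

# Hypothetical helper: time the same pattern under each regex engine.
# -G, -E and -P select basic, extended and Perl-compatible regexes.
# Assumption: git was built with PCRE support; if not, the -P run
# fails with an error and reports a count of 0.
compare_engines() {
    pattern=$1
    for engine in -G -E -P; do
        echo "engine: $engine"
        # bash's `time` keyword also covers shell functions
        time match_count git grep "$engine" "$pattern" -- '*.[ch]'
    done
}
```

Run from the top of a kernel tree, for example as compare_engines '\bseq_.*%p\W'. If the counts agree but one engine is much slower, that would point at the regex library rather than at git's file handling.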