From: Michal Kiedrowicz <michal.kiedrowicz@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Git List <git@vger.kernel.org>
Subject: Re: [RFC PATCH 3/3] git-grep: Learn PCRE
Date: Wed, 4 May 2011 08:16:28 +0200 [thread overview]
Message-ID: <20110504081628.6a80f4a0@mkiedrowicz> (raw)
In-Reply-To: <7vliyngz1x.fsf@alter.siamese.dyndns.org>
On 03.05.2011 16:11:06 -0700 Junio C Hamano <gitster@pobox.com> wrote:
> Michał Kiedrowicz <michal.kiedrowicz@gmail.com> writes:
>
> > This patch lacks tests.
> >
> > AFAIK grep.c is a part of libgit.a, so libpcre is linked with the
> > git itself, which bloats it. I don't like it, especially because it
> > makes 'make test' take:
> >
> > real 6m27.558s
> > user 1m34.392s
> > sys 2m11.029s
> >
> > instead of
> >
> > real 3m41.322s
> > user 1m44.005s
> > sys 2m32.403s
> >
> > on my PC.
>
> I am not against pcre, but the performance numbers look rather yucky.
>
> > diff --git a/grep.c b/grep.c
> > index d67baf9..4f5fcbb 100644
> > --- a/grep.c
> > +++ b/grep.c
> > @@ -70,6 +70,30 @@ static void compile_regexp(struct grep_pat *p,
> > struct grep_opt *opt) if (p->fixed)
> > return;
> >
> > + if (opt->pcre) {
> > +#ifdef NO_LIBPCRE
> > + die("cannot use Perl-compatible regexes when
> > libpcre is not compiled in"); +#else
> > + const char *error;
> > + ...
> > + return;
> > +#endif
> > + }
> > +
>
> Please avoid these #ifdefs in the middle of otherwise generic
> function.
Yeah, I don't like them too.
>
> Make the above into something like:
>
> if (opt->pcre) {
> compile_pcre(p, opt);
> return;
> }
>
> and then have an extra section entirely devoted to pcre integration
> that has bunch of ...
>
> #ifdef NO_LIBPCRE
> static void compile_pcre_part(struct grep_pat *p, struct
> grep_opt *opt) {
> die("...");
> }
> #else
> static void compile_pcre_part(struct grep_pat *p, struct
> grep_opt *opt) {
> ...
> }
> #endif
>
> ... that is totally separate from the generic part of the codebase.
> They could even be in a separate file, if you need numerous helpers.
>
> > @@ -320,7 +344,16 @@ void free_grep_patterns(struct grep_opt *opt)
> > case GREP_PATTERN: /* atom */
> > case GREP_PATTERN_HEAD:
> > case GREP_PATTERN_BODY:
> > +#ifndef NO_LIBPCRE
> > + if (p->re) {
> > + pcre_free(p->re);
> > + pcre_free(p->extra);
> > + } else {
> > + regfree(&p->regexp);
> > + }
> > +#else
> > regfree(&p->regexp);
> > +#endif
> > break;
> > default:
> > break;
>
> > diff --git a/grep.h b/grep.h
> > index 06621fe..cd202a9 100644
> > --- a/grep.h
> > +++ b/grep.h
> > @@ -1,6 +1,9 @@
> > #ifndef GREP_H
> > #define GREP_H
> > #include "color.h"
> > +#ifndef NO_LIBPCRE
> > +#include <pcre.h>
> > +#endif /* NO_LIBPCRE */
>
> This part might want to do something like
>
> #ifndef NO_LIBPCRE
> #include <pcre.h>
> #else
> typedef int pcre; /* dummy */
> typedef int pcre_extra; /* dummy */
> #endif
>
> if it makes it easier to keep the generic part of the code and
> structure definition clean.
>
> For example, the "free" part of your patch above then would become
>
> if (p->pcre_regexp)
> free_pcre_part(p);
> else
> regfree(&p->regexp);
>
> with the conditional enclosed entirely within the implementation of
> the free_pcre_part() helper function.
>
> Also by doing so, the struct definition below can lose #ifndef, and
> can become:
>
> struct grep_pat {
> ...
> + pcre *pcre_regexp;
> + pcre_extra *pcre_extra_info;
> ...
> };
That makes sense. Will re-roll (and add Signed-off-by).
>
> > @@ -33,6 +36,10 @@ struct grep_pat {
> > size_t patternlen;
> > enum grep_header_field field;
> > regex_t regexp;
> > +#ifndef NO_LIBPCRE
> > + pcre *re;
> > + pcre_extra *extra;
> > +#endif /* NO_LIBPCRE */
> > unsigned fixed:1;
> > unsigned ignore_case:1;
> > unsigned word_regexp:1;
> > @@ -83,6 +90,7 @@ struct grep_opt {
> > #define GREP_BINARY_TEXT 2
> > int binary;
> > int extended;
> > + int pcre;
> > int relative;
> > int pathname;
> > int null_following_name;
next prev parent reply other threads:[~2011-05-04 6:16 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-05-03 21:35 [PATCH 1/3] grep: Put calls to fixmatch() and regmatch() into patmatch() Michał Kiedrowicz
2011-05-03 21:35 ` [PATCH 2/3] Documentation: Add --line-number to git-grep synopsis Michał Kiedrowicz
2011-05-03 21:35 ` [RFC PATCH 3/3] git-grep: Learn PCRE Michał Kiedrowicz
2011-05-03 23:11 ` Junio C Hamano
2011-05-04 6:16 ` Michal Kiedrowicz [this message]
2011-05-03 23:15 ` Martin Langhoff
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110504081628.6a80f4a0@mkiedrowicz \
--to=michal.kiedrowicz@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).