git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Kiedrowicz <michal.kiedrowicz@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Git List <git@vger.kernel.org>
Subject: Re: [RFC PATCH 3/3] git-grep: Learn PCRE
Date: Wed, 4 May 2011 08:16:28 +0200	[thread overview]
Message-ID: <20110504081628.6a80f4a0@mkiedrowicz> (raw)
In-Reply-To: <7vliyngz1x.fsf@alter.siamese.dyndns.org>

On 03.05.2011 16:11:06 -0700 Junio C Hamano <gitster@pobox.com> wrote:

> Michał Kiedrowicz <michal.kiedrowicz@gmail.com> writes:
> 
> > This patch lacks tests.
> >
> > AFAIK grep.c is a part of libgit.a, so libpcre is linked with the
> > git itself, which bloats it. I don't like it, especially because it
> > makes 'make test' take:
> >
> > 	real    6m27.558s
> > 	user    1m34.392s
> > 	sys     2m11.029s
> >
> > instead of
> >
> > 	real    3m41.322s
> > 	user    1m44.005s
> > 	sys     2m32.403s
> >
> > on my PC.
> 
> I am not against pcre, but the performance numbers look rather yucky.
> 
> > diff --git a/grep.c b/grep.c
> > index d67baf9..4f5fcbb 100644
> > --- a/grep.c
> > +++ b/grep.c
> > @@ -70,6 +70,30 @@ static void compile_regexp(struct grep_pat *p,
> > struct grep_opt *opt) if (p->fixed)
> >  		return;
> >  
> > +	if (opt->pcre) {
> > +#ifdef NO_LIBPCRE
> > +		die("cannot use Perl-compatible regexes when
> > libpcre is not compiled in"); +#else
> > +		const char *error;
> > +	...
> > +		return;
> > +#endif
> > +	}
> > +
> 
> Please avoid these #ifdefs in the middle of otherwise generic
> function.

Yeah, I don't like them too.

> 
> Make the above into something like:
> 
> 	if (opt->pcre) {
>         	compile_pcre(p, opt);
>                 return;
> 	}
> 
> and then have an extra section entirely devoted to pcre integration
> that has bunch of ...
> 
>         #ifdef NO_LIBPCRE
>         static void compile_pcre_part(struct grep_pat *p, struct
> grep_opt *opt) {
>                 die("...");
> 	}
>         #else
>         static void compile_pcre_part(struct grep_pat *p, struct
> grep_opt *opt) {
>                 ...
> 	}
>         #endif
> 
> ... that is totally separate from the generic part of the codebase.
> They could even be in a separate file, if you need numerous helpers.
> 
> > @@ -320,7 +344,16 @@ void free_grep_patterns(struct grep_opt *opt)
> >  		case GREP_PATTERN: /* atom */
> >  		case GREP_PATTERN_HEAD:
> >  		case GREP_PATTERN_BODY:
> > +#ifndef NO_LIBPCRE
> > +			if (p->re) {
> > +				pcre_free(p->re);
> > +				pcre_free(p->extra);
> > +			} else {
> > +				regfree(&p->regexp);
> > +			}
> > +#else
> >  			regfree(&p->regexp);
> > +#endif
> >  			break;
> >  		default:
> >  			break;
> 
> > diff --git a/grep.h b/grep.h
> > index 06621fe..cd202a9 100644
> > --- a/grep.h
> > +++ b/grep.h
> > @@ -1,6 +1,9 @@
> >  #ifndef GREP_H
> >  #define GREP_H
> >  #include "color.h"
> > +#ifndef NO_LIBPCRE
> > +#include <pcre.h>
> > +#endif /* NO_LIBPCRE */
> 
> This part might want to do something like
> 
> 	#ifndef NO_LIBPCRE
>         #include <pcre.h>
>         #else
>         typedef int pcre; /* dummy */
>         typedef int pcre_extra; /* dummy */
>         #endif
> 
> if it makes it easier to keep the generic part of the code and
> structure definition clean.
> 
> For example, the "free" part of your patch above then would become
> 
> 	if (p->pcre_regexp)
>         	free_pcre_part(p);
> 	else
>         	regfree(&p->regexp);
> 
> with the conditional enclosed entirely within the implementation of
> the free_pcre_part() helper function.
> 
> Also by doing so, the struct definition below can lose #ifndef, and
> can become:
> 
> 	 struct grep_pat {
>          ...
>         +	pcre *pcre_regexp;
>         +	pcre_extra *pcre_extra_info;
> 	 ...
> 	 };

That makes sense. Will re-roll (and add Signed-off-by).

> 
> > @@ -33,6 +36,10 @@ struct grep_pat {
> >  	size_t patternlen;
> >  	enum grep_header_field field;
> >  	regex_t regexp;
> > +#ifndef NO_LIBPCRE
> > +	pcre *re;
> > +	pcre_extra *extra;
> > +#endif /* NO_LIBPCRE */
> >  	unsigned fixed:1;
> >  	unsigned ignore_case:1;
> >  	unsigned word_regexp:1;
> > @@ -83,6 +90,7 @@ struct grep_opt {
> >  #define GREP_BINARY_TEXT	2
> >  	int binary;
> >  	int extended;
> > +	int pcre;
> >  	int relative;
> >  	int pathname;
> >  	int null_following_name;

  reply	other threads:[~2011-05-04  6:16 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-03 21:35 [PATCH 1/3] grep: Put calls to fixmatch() and regmatch() into patmatch() Michał Kiedrowicz
2011-05-03 21:35 ` [PATCH 2/3] Documentation: Add --line-number to git-grep synopsis Michał Kiedrowicz
2011-05-03 21:35 ` [RFC PATCH 3/3] git-grep: Learn PCRE Michał Kiedrowicz
2011-05-03 23:11   ` Junio C Hamano
2011-05-04  6:16     ` Michal Kiedrowicz [this message]
2011-05-03 23:15   ` Martin Langhoff

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110504081628.6a80f4a0@mkiedrowicz \
    --to=michal.kiedrowicz@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).