From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Stefan Beller <sbeller@google.com>
Cc: "git\@vger.kernel.org" <git@vger.kernel.org>,
Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
J Smith <dark.panda@gmail.com>, Joe Ratterman <jratt0@gmail.com>,
Fredrik Kuivinen <frekui@gmail.com>
Subject: Re: [PATCH 5/5] grep: remove regflags from the public grep_opt API
Date: Thu, 29 Jun 2017 20:16:19 +0200 [thread overview]
Message-ID: <87zicqirrg.fsf@gmail.com> (raw)
In-Reply-To: <CAGZ79kbOpMpi0Dv6=ViW45gq5E9KHgpz4GE4o8XA-KMaiR78Vw@mail.gmail.com>
On Thu, Jun 29 2017, Stefan Beller jotted:
> On Wed, Jun 28, 2017 at 2:58 PM, Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> Refactor calls to the grep machinery to always pass opt.ignore_case &
>> opt.extended_regexp_option instead of setting the equivalent regflags
>> bits.
>>
>> The bug fixed when making -i work with -P in commit 9e3cbc59d5 ("log:
>> make --regexp-ignore-case work with --perl-regexp", 2017-05-20) was
>> really just plastering over the code smell which this change fixes.
>>
>> See my "Re: [PATCH v3 05/30] log: make --regexp-ignore-case work with
>> --perl-regexp"[1] for the discussion leading up to this.
>>
>> The reason for adding the extensive commentary here is that I
>> discovered some subtle complexity in implementing this that really
>> should be called out explicitly to future readers.
>>
>> Before this change we'd rely on the difference between
>> `extended_regexp_option` and `regflags` to serve as a membrane between
>> our preliminary parsing of grep.extendedRegexp and grep.patternType,
>> and what we decided to do internally.
>>
>> Now that those two are the same thing, it's necessary to unset
>> `extended_regexp_option` just before we commit in cases where both of
>> those config variables are set. See 84befcd0a4 ("grep: add a
>> grep.patternType configuration setting", 2012-08-03) for the code and
>> documentation related to that.
>>
>> The explanation of why the if/else branches in
>> grep_commit_pattern_type() are ordered the way they are exists in that
>> commit message, but I think it's worth calling this subtlety out
>> explicitly with a comment for future readers.
>
> Up to here the commit message is inspiring confidence.
Thanks.
>>
>> Unrelated to that: I could have factored out the default REG_NEWLINE
>> flag into some custom GIT_GREP_H_DEFAULT_REGFLAGS or something, but
>> since it's just used in two places I didn't think it was worth the
>> effort.
>>
>> As an aside we're really lacking test coverage regflags being
>> initiated as 0 instead of as REG_NEWLINE. Tests will fail if it's
>> removed from compile_regexp(), but not if it's removed from
>> compile_fixed_regexp(). I have not dug to see if it's actually needed
>> in the latter case or if the test coverage is lacking.
>
> This sounds as if extra careful review is needed.
Note though (since I didn't say this explicitly) nothing about this
commit changes the semanics of what we pass to regcomp, I'm just noting
this caveat with REG_NEWLINE as an aside since I'm moving it around.
>>
>> 1. <CACBZZX6Hp4Q4TOj_X1fbdCA4twoXF5JemZ5ZbEn7wmkA=1KO2g@mail.gmail.com>
>> (https://public-inbox.org/git/CACBZZX6Hp4Q4TOj_X1fbdCA4twoXF5JemZ5ZbEn7wmkA=1KO2g@mail.gmail.com/)
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>> builtin/grep.c | 2 --
>> grep.c | 43 ++++++++++++++++++++++++++++++++++---------
>> grep.h | 1 -
>> revision.c | 2 --
>> 4 files changed, 34 insertions(+), 14 deletions(-)
>>
>> diff --git a/builtin/grep.c b/builtin/grep.c
>> index f61a9d938b..b682966439 100644
>> --- a/builtin/grep.c
>> +++ b/builtin/grep.c
>> @@ -1169,8 +1169,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>>
>> if (!opt.pattern_list)
>> die(_("no pattern given."));
>> - if (!opt.fixed && opt.ignore_case)
>> - opt.regflags |= REG_ICASE;
>>
>> /*
>> * We have to find "--" in a separate pass, because its presence
>> diff --git a/grep.c b/grep.c
>> index 736e1e00d6..51aaad9f03 100644
>> --- a/grep.c
>> +++ b/grep.c
>> @@ -35,7 +35,6 @@ void init_grep_defaults(void)
>> memset(opt, 0, sizeof(*opt));
>> opt->relative = 1;
>> opt->pathname = 1;
>> - opt->regflags = REG_NEWLINE;
>> opt->max_depth = -1;
>> opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED;
>> color_set(opt->color_context, "");
>> @@ -154,7 +153,6 @@ void grep_init(struct grep_opt *opt, const char *prefix)
>> opt->linenum = def->linenum;
>> opt->max_depth = def->max_depth;
>> opt->pathname = def->pathname;
>> - opt->regflags = def->regflags;
>> opt->relative = def->relative;
>> opt->output = def->output;
>>
>> @@ -170,6 +168,24 @@ void grep_init(struct grep_opt *opt, const char *prefix)
>>
>> static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
>> {
>> + /*
>> + * When committing to the pattern type by setting the relevant
>> + * fields in grep_opt it's generally not necessary to zero out
>> + * the fields we're not choosing, since they won't have been
>> + * set by anything. The extended_regexp_option field is the
>> + * only exception to this.
>> + *
>> + * This is because in the process of parsing grep.patternType
>> + * & grep.extendedRegexp we set opt->pattern_type_option and
>> + * opt->extended_regexp_option, respectively. We then
>> + * internally use opt->extended_regexp_option to see if we're
>> + * compiling an ERE. It must be unset if that's not actually
>> + * the case.
>> + */
>> + if (pattern_type != GREP_PATTERN_TYPE_ERE &&
>> + opt->extended_regexp_option)
>> + opt->extended_regexp_option = 0;
>> +
>> switch (pattern_type) {
>> case GREP_PATTERN_TYPE_UNSPECIFIED:
>> /* fall through */
>> @@ -178,7 +194,7 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
>> break;
>>
>> case GREP_PATTERN_TYPE_ERE:
>> - opt->regflags |= REG_EXTENDED;
>> + opt->extended_regexp_option = 1;
>> break;
>>
>> case GREP_PATTERN_TYPE_FIXED:
>> @@ -208,6 +224,11 @@ void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_o
>> else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
>> grep_set_pattern_type_option(opt->pattern_type_option, opt);
>> else if (opt->extended_regexp_option)
>> + /*
>> + * This branch *must* happen after setting from the
>> + * opt->pattern_type_option above,
>
> I do not quite understand this. Are you saying
>
> opt->pattern_type_option takes precedence over
> opt->extended_regexp_option if the former is not _UNSPECIFIED ?
I mean this "else if" code *must* be in that order, i.e.:
else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
grep_set_pattern_type_option(opt->pattern_type_option, opt);
else if (opt->extended_regexp_option)
grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
Not:
else if (opt->extended_regexp_option)
grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
grep_set_pattern_type_option(opt->pattern_type_option, opt);
Since we only want to pay attention to grep.extendedRegexp it
grep.patternType is not set. If grep.patternType is set then the
pattern_type_option will not be GREP_PATTERN_TYPE_UNSPECIFIED (but
e.g. GREP_PATTERN_TYPE_BRE).
> As grep_set_pattern_type_option is only called from here,
> I wondered if we can put the long comment (and the code)
> here in this function grep_commit_pattern_type to have it less
> subtle? I have no proposal how though.
Ah you mean the whole "When committing to the pattern type by" comment +
code. Yeah I think that makes sense. I'll try that for v2 and see if
it's better.
> I think I grokked this patch and it makes sense, though the commit
> message strongly hints at asking for tests. ;)
*Points up at "moving it around" comment above*
next prev parent reply other threads:[~2017-06-29 18:16 UTC|newest]
Thread overview: 77+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-05-20 21:42 [PATCH v3 00/30] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 01/30] Makefile & configure: reword inaccurate comment about PCRE Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 02/30] grep & rev-list doc: stop promising libpcre for --perl-regexp Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 03/30] test-lib: rename the LIBPCRE prerequisite to PCRE Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 04/30] log: add exhaustive tests for pattern style options & config Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 05/30] log: make --regexp-ignore-case work with --perl-regexp Ævar Arnfjörð Bjarmason
2017-05-20 23:50 ` Junio C Hamano
2017-05-21 6:58 ` Ævar Arnfjörð Bjarmason
2017-05-22 0:17 ` Junio C Hamano
2017-06-28 21:58 ` [PATCH 0/5] grep: remove redundant code & reflags from API Ævar Arnfjörð Bjarmason
2017-06-29 22:22 ` [PATCH v2 0/6] " Ævar Arnfjörð Bjarmason
2017-06-29 22:22 ` [PATCH v2 1/6] grep: remove redundant double assignment to 0 Ævar Arnfjörð Bjarmason
2017-06-29 22:22 ` [PATCH v2 2/6] grep: adjust a redundant grep pattern type assignment Ævar Arnfjörð Bjarmason
2017-06-29 22:22 ` [PATCH v2 3/6] grep: remove redundant "fixed" field re-assignment to 0 Ævar Arnfjörð Bjarmason
2017-06-29 22:22 ` [PATCH v2 4/6] grep: remove redundant and verbose re-assignments " Ævar Arnfjörð Bjarmason
2017-06-29 22:22 ` [PATCH v2 5/6] grep: remove regflags from the public grep_opt API Ævar Arnfjörð Bjarmason
2017-06-30 17:20 ` Junio C Hamano
2017-06-29 22:22 ` [PATCH v2 6/6] grep: remove redundant REG_NEWLINE when compiling fixed regex Ævar Arnfjörð Bjarmason
2017-06-28 21:58 ` [PATCH 1/5] grep: remove redundant double assignment to 0 Ævar Arnfjörð Bjarmason
2017-06-28 21:58 ` [PATCH 2/5] grep: remove redundant grep pattern type assignment Ævar Arnfjörð Bjarmason
2017-06-29 17:03 ` Stefan Beller
2017-06-29 17:50 ` Ævar Arnfjörð Bjarmason
2017-06-28 21:58 ` [PATCH 3/5] grep: remove redundant "fixed" field re-assignment to 0 Ævar Arnfjörð Bjarmason
2017-06-29 17:10 ` Stefan Beller
2017-06-28 21:58 ` [PATCH 4/5] grep: remove redundant and verbose re-assignments " Ævar Arnfjörð Bjarmason
2017-06-28 21:58 ` [PATCH 5/5] grep: remove regflags from the public grep_opt API Ævar Arnfjörð Bjarmason
2017-06-29 17:43 ` Stefan Beller
2017-06-29 18:16 ` Ævar Arnfjörð Bjarmason [this message]
2017-05-20 21:42 ` [PATCH v3 06/30] grep: add a test asserting that --perl-regexp dies when !PCRE Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 07/30] grep: add a test for backreferences in PCRE patterns Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 08/30] grep: change non-ASCII -i test to stop using --debug Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 09/30] grep: add tests for --threads=N and grep.threads Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 10/30] grep: amend submodule recursion test for regex engine testing Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 11/30] grep: add tests for grep pattern types being passed to submodules Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 12/30] grep: add a test helper function for less verbose -f \0 tests Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 13/30] grep: prepare for testing binary regexes containing rx metacharacters Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 14/30] grep: add tests to fix blind spots with \0 patterns Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 15/30] perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do Ævar Arnfjörð Bjarmason
2017-05-20 23:50 ` Junio C Hamano
2017-05-21 6:23 ` Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 16/30] perf: emit progress output when unpacking & building Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 17/30] perf: add a comparison test of grep regex engines Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 18/30] perf: add a comparison test of grep regex engines with -F Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 19/30] perf: add a comparison test of log --grep regex engines Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 20/30] grep: catch a missing enum in switch statement Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 21/30] grep: remove redundant regflags assignments Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 22/30] grep: factor test for \0 in grep patterns into a function Ævar Arnfjörð Bjarmason
2017-05-23 21:17 ` Brandon Williams
2017-05-20 21:42 ` [PATCH v3 23/30] grep: change the internal PCRE macro names to be PCRE1 Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 24/30] grep: change internal *pcre* variable & function names to be *pcre1* Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 25/30] grep: move is_fixed() earlier to avoid forward declaration Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 26/30] test-lib: add a PTHREADS prerequisite Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 27/30] pack-objects & index-pack: add test for --threads warning Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 28/30] pack-objects: fix buggy warning about threads Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 29/30] grep: given --threads with NO_PTHREADS=YesPlease, warn Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 30/30] grep: assert that threading is enabled when calling grep_{lock,unlock} Ævar Arnfjörð Bjarmason
2017-05-20 23:50 ` [PATCH v3 00/30] Easy to review grep & pre-PCRE changes Junio C Hamano
2017-05-23 19:24 ` [PATCH v2 0/7] PCRE v2, PCRE v1 JIT, log -P & fixes Ævar Arnfjörð Bjarmason
2017-05-23 19:24 ` [PATCH v2 1/7] grep: don't redundantly compile throwaway patterns under threading Ævar Arnfjörð Bjarmason
2017-05-24 4:42 ` Junio C Hamano
2017-05-25 10:33 ` Ævar Arnfjörð Bjarmason
2017-05-26 0:58 ` Junio C Hamano
2017-05-26 8:06 ` Ævar Arnfjörð Bjarmason
2017-05-26 9:49 ` Junio C Hamano
2017-05-23 19:24 ` [PATCH v2 2/7] grep: skip pthreads overhead when using one thread Ævar Arnfjörð Bjarmason
2017-05-24 4:45 ` Junio C Hamano
2017-05-23 19:24 ` [PATCH v2 3/7] log: add -P as a synonym for --perl-regexp Ævar Arnfjörð Bjarmason
2017-05-23 19:24 ` [PATCH v2 4/7] grep: add support for the PCRE v1 JIT API Ævar Arnfjörð Bjarmason
2017-05-24 5:17 ` Junio C Hamano
2017-05-24 7:37 ` Ævar Arnfjörð Bjarmason
2017-05-23 19:24 ` [PATCH v2 5/7] grep: un-break building with PCRE < 8.32 Ævar Arnfjörð Bjarmason
2017-05-24 6:00 ` Junio C Hamano
2017-05-24 6:38 ` Junio C Hamano
2017-05-23 19:24 ` [PATCH v2 6/7] grep: un-break building with PCRE < 8.20 Ævar Arnfjörð Bjarmason
2017-05-23 19:24 ` [PATCH v2 7/7] grep: add support for PCRE v2 Ævar Arnfjörð Bjarmason
2017-05-24 6:23 ` Junio C Hamano
2017-05-25 9:49 ` Ævar Arnfjörð Bjarmason
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87zicqirrg.fsf@gmail.com \
--to=avarab@gmail.com \
--cc=dark.panda@gmail.com \
--cc=frekui@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jratt0@gmail.com \
--cc=peff@peff.net \
--cc=sbeller@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.