From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: "Дилян Палаузов" <dilyan.palauzov@aegee.org>, git@vger.kernel.org
Subject: Re: git grep -P fatal: pcre_exec failed with error code -8
Date: Mon, 06 Nov 2017 12:50:45 +0100 [thread overview]
Message-ID: <878tfj62y2.fsf@evledraar.booking.com> (raw)
In-Reply-To: <20171106103125.fwtrxv6zycrbihcv@sigill.intra.peff.net>
On Mon, Nov 06 2017, Jeff King jotted:
> On Sun, Nov 05, 2017 at 10:41:17AM +0100, Дилян Палаузов wrote:
>
>> I understand that the PCRE's stack can get exhausted for some files, but in
>> such cases, git grep shall proceed with the other files, and print at the
>> end/stderr for which files the pattern was not applied. Such behaviour
>> would be more usefull than the current one.
>
> Yes, I had a similar thought. It does feel a little funny for us to
> basically treat an error as "no match" for non-interactive use, but then
> the current behavior works out to be more or less the same (we return an
> error code which most shell scripts would interpret as failure).
>
> IOW, I think something like this is probably the right direction:
>
> diff --git a/grep.c b/grep.c
> index ce6a48e634..2c152e5908 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -427,7 +427,7 @@ static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
> }
>
> if (ret < 0 && ret != PCRE_ERROR_NOMATCH)
> - die("pcre_exec failed with error code %d", ret);
> + warning("pcre_exec failed with error code %d", ret);
> if (ret > 0) {
> ret = 0;
> match->rm_so = ovector[0];
>
> but possibly:
>
> 1. It would be nice to report the filename that we couldn't match on.
> But we don't know it at this level of the code (and it might not be
> a file at all that we are matching). So probably we'd want to pass
> the error much further up the call stack. This is tricky as there
> are multiple regex libraries we can use, and the return value gets
> normalized to 1/0 for hit/not-hit long before we get as far as
> something that knows the filename.
>
> We might need to do something invasive like adding an extra
> parameter to hold the error message, and passing it through the
> whole stack.
>
> 2. We should still try to exit with an exit code other than "1" to
> indicate we hit an error besides "no lines were found".
>
> 3. Other regex libraries might need similar treatment. Probably
> pcre2match() needs it. It doesn't look like regexec() can ever
> return an error besides REG_NOMATCH.
>
> -Peff
Some replies to the thread in general, didn't want to spread this out
into different replies.
* Yes this sucks.
* Just emitting a warning without an appropriate exit code would suck
more, would break batch jobs & whatnot that expcept certain results
from grep.
* As you point out it would be nice to print out the file name we
didn't match on, we'd need to pass the grep_source struct down
further, it goes as far as grep_source_1 but stops there and isn't
passed to e.g. look_ahead(), which calls patmatch() which calls the
engine-specific matcher and would need to report the error. We could
just do this, would slow down things a bit (probably trivally) but we
could emit better error messages in genreal.
* You can adjust these limts in PCRE in Git, although it doesn't help
in this case, you just add (*LIMIT_MATCH=NUM) or
(*LIMIT_RECURSION=NUM) (or both) to the start of the pattern. See
pcresyntax(3) or pcre2syntax(3) man pages depending on what version
you have installed.
* While regexec() won't return an error its version of dealing with
this is (at least under glibc) to balloon CPU/memory use until the
OOMkiller kills git (although not on this particular pattern).
next prev parent reply other threads:[~2017-11-06 11:50 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-05 0:06 git grep -P fatal: pcre_exec failed with error code -8 Дилян Палаузов
2017-11-05 2:16 ` Jeff King
2017-11-05 9:41 ` Дилян Палаузов
2017-11-06 10:31 ` Jeff King
2017-11-06 11:50 ` Ævar Arnfjörð Bjarmason [this message]
2017-11-06 12:24 ` Jeff King
2017-11-06 14:04 ` Ævar Arnfjörð Bjarmason
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=878tfj62y2.fsf@evledraar.booking.com \
--to=avarab@gmail.com \
--cc=dilyan.palauzov@aegee.org \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.