All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: "Дилян Палаузов" <dilyan.palauzov@aegee.org>, git@vger.kernel.org
Subject: Re: git grep -P fatal: pcre_exec failed with error code -8
Date: Mon, 06 Nov 2017 15:04:02 +0100	[thread overview]
Message-ID: <877ev35wrx.fsf@evledraar.booking.com> (raw)
In-Reply-To: <20171106122411.dhi2ltyegzquebhk@sigill.intra.peff.net>


On Mon, Nov 06 2017, Jeff King jotted:

> On Mon, Nov 06, 2017 at 12:50:45PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> Some replies to the thread in general, didn't want to spread this out
>> into different replies.
>>
>>  * Yes this sucks.
>>
>>  * Just emitting a warning without an appropriate exit code would suck
>>    more, would break batch jobs & whatnot that expcept certain results
>>    from grep.
>
> That was my first thought, too, but something that does:
>
>   git grep foo && echo found
>
> would behave basically the same. Do you mean here scripts that actually
> do:
>
>   git grep foo
>   case "$?" in
>   0) echo found ;;
>   1) echo not found ;;
>   *) echo wtf? ;;
>   esac
>
> I agree it would be nice to at least have _some_ way to distinguish
> between those final two cases.

Maybe we should emit a different exit code, but I just had in mind that
we could continue but the same non-zero as when the grep fails now, but
with an additional warning.

> Though something like "git log --grep" is even more complicated. We
> perhaps _would_ want to distinguish between:
>
>   - match (in which case we print the commit)
>
>   - no match (in which case we do not)
>
>   - error (in which case we do not print, but also mark the exit code as
>     failing)
>
>>  * As you point out it would be nice to print out the file name we
>>    didn't match on, we'd need to pass the grep_source struct down
>>    further, it goes as far as grep_source_1 but stops there and isn't
>>    passed to e.g. look_ahead(), which calls patmatch() which calls the
>>    engine-specific matcher and would need to report the error. We could
>>    just do this, would slow down things a bit (probably trivally) but we
>>    could emit better error messages in genreal.
>
> I'm not sure if the grep_source has enough information for all cases.
> E.g., if you hit an error while grepping in commit headers, you'd
> probably want to mention the oid of the commit. There's an "identifier"
> field in the grep_source, but it's opaque.
>
> The caller may also want to do more things than just print an error
> (like the exit code adjustment I mentioned above). Which implies to me
> we should be passing the error information up, not trying to bring the
> context down.

Indeed, I was just thinking of the case where we're grepping a file in
the tree, not when the engine is used for --grep et al.

>>  * You can adjust these limts in PCRE in Git, although it doesn't help
>>    in this case, you just add (*LIMIT_MATCH=NUM) or
>>    (*LIMIT_RECURSION=NUM) (or both) to the start of the pattern. See
>>    pcresyntax(3) or pcre2syntax(3) man pages depending on what version
>>    you have installed.
>
> I saw that in the pcre manual, but I had the impression you can't
> actually raise the limits above the defaults with that, only lower them.

You can, you just can't set them to anything less conservative than
limits already set via the C API, but we don't set any of those.

I.e. PCRE allows you to say expose a regex field in a web form (as we do
with gitweb) with really conservative settings that can't be overriden
via a (*LIMIT) set in the pattern.

But since we don't use that C API PCRE runs in a mode where the user
gets set whatever limit they want in the pattern (or other
pattern-altering switch), which makes sense for interactive git-grep
use.

>>  * While regexec() won't return an error its version of dealing with
>>    this is (at least under glibc) to balloon CPU/memory use until the
>>    OOMkiller kills git (although not on this particular pattern).
>
> So in a sense our current behavior with pcre is the same. We just have
> to provoke the death ourselves. ;)

Indeed :)

      reply	other threads:[~2017-11-06 14:04 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-05  0:06 git grep -P fatal: pcre_exec failed with error code -8 Дилян Палаузов
2017-11-05  2:16 ` Jeff King
2017-11-05  9:41   ` Дилян Палаузов
2017-11-06 10:31     ` Jeff King
2017-11-06 11:50       ` Ævar Arnfjörð Bjarmason
2017-11-06 12:24         ` Jeff King
2017-11-06 14:04           ` Ævar Arnfjörð Bjarmason [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877ev35wrx.fsf@evledraar.booking.com \
    --to=avarab@gmail.com \
    --cc=dilyan.palauzov@aegee.org \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.