From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Carlo Arenas <carenas@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
Mathias Krause <minipli@grsecurity.net>,
git@vger.kernel.org, pcre-dev@exim.org
Subject: Re: [PATCH] grep: fall back to interpreter mode if JIT fails
Date: Mon, 19 Dec 2022 10:00:11 +0100 [thread overview]
Message-ID: <221219.86bknz21qj.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <CAPUEspjqgSOS4KKw2nTaRYdiCFo4Ok6OfmKrqb+Mjq+oXn5nsg@mail.gmail.com>
On Fri, Dec 16 2022, Carlo Arenas wrote:
[CC-ing pcre-dev@ for this "future error API" discussion]
> On Fri, Dec 16, 2022 at 3:09 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Mathias Krause <minipli@grsecurity.net> writes:
>>
>> > ... However, from a user's point of view a fall back to
>> > interpreter mode might still be desired in this case, as a failing
>> > 'git grep' is simply not acceptable, IMHO.
>>
>> "git grep" that silently produces a wrong result (by falling back
>> after a problem is detected) would not be acceptable, either.
>
> except that an error at this point only invalidates the use of JIT,
> so calling pcre2_jit_match() is invalid but calling pcre2_match() is not.
>
> the later is setup to be used later by the code that is added,
I think we could stumble ahead, but if this were to happen our
assumptions about how the API works have been invalidated.
The pcre2_jit_compile() doesn't promise to return a finite set of error
codes, but:
[...]0 for success, or a negative error code otherwise[...]
But if new codes were added it's anyone's guess what state we'd be in,
so I think the safe thing is to BUG() out if we get as far as
pcre2_jit_compile() and don't get either PCRE2_ERROR_JIT_BADOPTION or
PCRE2_ERROR_NOMEMORY.
>> Receiving BADOPTION could be a sign that there is something wrong in
>> the input, not from the end-user but from the code, in which case
>> stopping with BUG() may be a more appropriate?
>
> The way PCRE handles this kind of errors internally is to instruct pcre2_match()
> to use the interpreter.
>
> While a BUG() might be a way to ensure the code is using the right set
> of options
> I would expect that the failure will be reported by pcre2_compile
> instead, with the
> only cases left, only being interna to PCRE (ex: JIT can't yet support
> a feature the
> interpreter has)
I agree that it's possible in general that an external library might
start returning a "benign" error code that we could recover from, so
BUG(...) would be overdoing it.
I just don't see that libpcre would do that in this case. In general the
JIT supports all patterns that the normal engine does, so if we're past
the "is it available?" hump it should work.
If it starts not doing that I'd think they'd do a major version upgrade,
or opt-in with a new flag etc.
Note that it already has a way of checking for "we tried to do the jit
thing, but it wasn't on in the end". See the code I added in
a25b9085043[1] (grep: fix segfault under -P + PCRE2 <=10.30 + (*NO_JIT),
2017-11-23), which comes right after the pcre2_jit_compile().
So not only would a BUG() biting us here require them to create a new
code for the state of "we have the JIT, but can't use it here" (for some
reason I can't imagine, as "PCRE2_ERROR_NOMEMORY" is already
"overloaded" to mean that).
It would also require them to invent a new "soft" failure mode for the
JIT, i.e. not the facility added in a25b9085043, where we can use the
JIT, but it's not on after all due to a "(*NO_JIT)" in the pattern
itself.
But I may be wrong, so I've CC'd pcre-dev@ to see if they have any
commentary on this proposed API paranoia. For them: The greater context
of this thread on the git ML is at [2].
1. https://github.com/git/git/commit/a25b9085043b8029169b4d5b172b78ca3f38fb37
2. https://lore.kernel.org/git/20221216121557.30714-1-minipli@grsecurity.net/
next prev parent reply other threads:[~2022-12-19 9:15 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-16 12:15 [PATCH] grep: fall back to interpreter mode if JIT fails Mathias Krause
2022-12-16 16:12 ` Ævar Arnfjörð Bjarmason
2022-12-16 19:26 ` Mathias Krause
2022-12-16 23:09 ` Junio C Hamano
2022-12-17 2:50 ` Carlo Arenas
2022-12-19 9:00 ` Ævar Arnfjörð Bjarmason [this message]
2022-12-20 19:29 ` Mathias Krause
2022-12-20 21:11 ` Ævar Arnfjörð Bjarmason
2023-01-18 14:22 ` Mathias Krause
2023-01-18 15:44 ` Ævar Arnfjörð Bjarmason
2023-01-19 9:19 ` Mathias Krause
2022-12-16 22:52 ` Junio C Hamano
2022-12-20 20:40 ` Mathias Krause
2023-01-27 15:49 ` [PATCH v2] grep: fall back to interpreter if JIT memory allocation fails Mathias Krause
2023-01-27 16:34 ` Junio C Hamano
2023-01-27 17:39 ` Junio C Hamano
2023-01-27 18:46 ` Junio C Hamano
2023-01-29 13:37 ` Mathias Krause
2023-01-29 13:36 ` Mathias Krause
2023-01-29 17:15 ` Junio C Hamano
2023-01-30 10:56 ` Ævar Arnfjörð Bjarmason
2023-01-30 18:49 ` Junio C Hamano
2023-01-31 8:34 ` Ævar Arnfjörð Bjarmason
2023-01-30 11:08 ` Mathias Krause
2023-01-30 18:54 ` Junio C Hamano
2023-01-30 20:08 ` Junio C Hamano
2023-01-30 21:21 ` Junio C Hamano
2023-01-30 22:30 ` Ramsay Jones
2023-01-30 23:27 ` Junio C Hamano
2023-01-31 7:48 ` Mathias Krause
2023-01-31 16:41 ` Junio C Hamano
2023-01-31 18:34 ` Mathias Krause
2023-01-31 7:30 ` Mathias Krause
2023-01-29 12:28 ` Mathias Krause
2023-01-31 18:56 ` [PATCH v3] " Mathias Krause
2023-01-31 21:05 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=221219.86bknz21qj.gmgdl@evledraar.gmail.com \
--to=avarab@gmail.com \
--cc=carenas@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=minipli@grsecurity.net \
--cc=pcre-dev@exim.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).