From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Mathias Krause" <minipli@grsecurity.net>,
git@vger.kernel.org,
"Carlo Marcelo Arenas Belón" <carenas@gmail.com>
Subject: Re: [PATCH v2] grep: fall back to interpreter if JIT memory allocation fails
Date: Mon, 30 Jan 2023 11:56:42 +0100 [thread overview]
Message-ID: <230130.867cx4s2j4.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <xmqqk0156z55.fsf@gitster.g>
On Sun, Jan 29 2023, Junio C Hamano wrote:
> Mathias Krause <minipli@grsecurity.net> writes:
>
>> ... While we might be able to compile the pattern and run it in
>> interpreter mode, it'll likely have a *much* higher runtime.
>> ...
>> So this grep run eat up ~9.5 *hours* of CPU time. Do we really want to
>> fall back to something like this for the pathological cases? ...Yeah, I
>> don't think so either.
>
> You may not, but I do not agree with you at all. The code should
> not outsmart the user in such a case.
It's the falling back in the nominal case that would be outsmarting the
user.
If I compile libpcre2 with JIT support I'm expecting Git to use that,
and not fall back in those cases where the JIT engine would give up.
> Even if the pattern the user came up with is impossible to grok for
> a working JIT compiler, and it might be hard to grok for the
> interpreter, what is the next step you recommend the user if you
> refuse to fall back on the interprete? "Rewrite it to please the
> JIT compiler"?
I'd argue that it's pretty much impossible to unintentionally write such
pathological patterns, the edge cases where e.g. the JIT would run out
of resources v.s. the normal engine are a non-issue for any "normal"
use.
Pathological regexes are pretty much only interesting to anyone in the
context of DoS attacks where they're being used to cause intentional
slowdowns.
Here we're discussing an orthagonal case where the "JIT fails", but
rather than some pathological pattern it's because SELinux has made it
not work at runtime, and we're trying to tease the two cases apart.
> If that is the best pattern the user can produce to solve the
> problem at hand, being able to give the user an answer in 9 hours is
> much better than not being able to give anything at all.
Speed is a feature in itself, and in a lot of cases (e.g. user-supplied
patterns vulnerable to a DoS attack) continuing on the slow path is much
worse.
Even just using my terminal for ad-hoc "git grep", I'd *much* rather get
an early error about the pattern exceeding JIT resources than continuing
on the fallback path.
If I had somehow written one by accident (and this is stretching
credulity) you can usually apply some minor tweaks to the pattern, and
then execute it in seconds instead of minutes/hours.
> Maybe after waiting for 5 minutes, the user gets bored and ^C, or
> without killing it, open another terminal and try a different
> patern, and in 9 hours, perhaps comes up with an equivalent (or
> different but close enough) pattern that happens to run much faster,
> at which time the user may kill the original one. In any of these
> cases, by refusing to run, the code is not doing any service to the
> user.
I don't think this is plausible at all per the above, and that we
shouldn't harm realistic use-cases to satisfy hypothetical ones.
next prev parent reply other threads:[~2023-01-30 11:08 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-16 12:15 [PATCH] grep: fall back to interpreter mode if JIT fails Mathias Krause
2022-12-16 16:12 ` Ævar Arnfjörð Bjarmason
2022-12-16 19:26 ` Mathias Krause
2022-12-16 23:09 ` Junio C Hamano
2022-12-17 2:50 ` Carlo Arenas
2022-12-19 9:00 ` Ævar Arnfjörð Bjarmason
2022-12-20 19:29 ` Mathias Krause
2022-12-20 21:11 ` Ævar Arnfjörð Bjarmason
2023-01-18 14:22 ` Mathias Krause
2023-01-18 15:44 ` Ævar Arnfjörð Bjarmason
2023-01-19 9:19 ` Mathias Krause
2022-12-16 22:52 ` Junio C Hamano
2022-12-20 20:40 ` Mathias Krause
2023-01-27 15:49 ` [PATCH v2] grep: fall back to interpreter if JIT memory allocation fails Mathias Krause
2023-01-27 16:34 ` Junio C Hamano
2023-01-27 17:39 ` Junio C Hamano
2023-01-27 18:46 ` Junio C Hamano
2023-01-29 13:37 ` Mathias Krause
2023-01-29 13:36 ` Mathias Krause
2023-01-29 17:15 ` Junio C Hamano
2023-01-30 10:56 ` Ævar Arnfjörð Bjarmason [this message]
2023-01-30 18:49 ` Junio C Hamano
2023-01-31 8:34 ` Ævar Arnfjörð Bjarmason
2023-01-30 11:08 ` Mathias Krause
2023-01-30 18:54 ` Junio C Hamano
2023-01-30 20:08 ` Junio C Hamano
2023-01-30 21:21 ` Junio C Hamano
2023-01-30 22:30 ` Ramsay Jones
2023-01-30 23:27 ` Junio C Hamano
2023-01-31 7:48 ` Mathias Krause
2023-01-31 16:41 ` Junio C Hamano
2023-01-31 18:34 ` Mathias Krause
2023-01-31 7:30 ` Mathias Krause
2023-01-29 12:28 ` Mathias Krause
2023-01-31 18:56 ` [PATCH v3] " Mathias Krause
2023-01-31 21:05 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=230130.867cx4s2j4.gmgdl@evledraar.gmail.com \
--to=avarab@gmail.com \
--cc=carenas@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=minipli@grsecurity.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).