All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Carlo Arenas <carenas@gmail.com>
Cc: sandals@crustytoothpaste.net, git@vger.kernel.org, pcre-dev@exim.org
Subject: Re: [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1
Date: Tue, 11 Dec 2018 21:51:12 +0100	[thread overview]
Message-ID: <87zhtbn5xb.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <CAPUEspjHs7+G+FHXjxb4rCcNLqaybbhZESik=14_9Q5h2HMzMA@mail.gmail.com>


On Tue, Dec 11 2018, Carlo Arenas wrote:

> On Mon, Dec 10, 2018 at 12:24 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> On Mon, Dec 10 2018, brian m. carlson wrote:
>> > Considering that some Linux users use PaX kernels with standard
>> > distributions and that most BSD kernels can be custom-compiled with a
>> > variety of options enabled or disabled, I think this is something we
>> > should detect dynamically.
>>
>> Right. I'm asking whether we're mixing up cases where it can always be
>> detected at compile-time on some systems v.s. cases where it'll
>> potentially change at runtime.
>
> the closer we come to a system specific issues is with macOS where the
> compiler (in some newer versions) is allocating the memory using the
> MAP_JIT flag, which seems was originally meant to be only used in iOS
> and has the strange characteristic of failing the mmap for versions
> older than 10.14 if it was called more than once.
>
> IMHO as brian pointed out, this is better done at runtime.

Sure. Just something I was wondering since it wasn't clear from the
patch. Makes sense, if it's always runtime (or not worth the effort to
divide the two) let's do that.

>> >> I'm inclined to suggest that we should have another ifdef here for "if
>> >> JIT fails I'd like it to die", so that e.g. packages I build (for
>> >> internal use) don't silently slow down in the future, only for me to
>> >> find some months later that someone enabled an overzealous SELinux
>> >> policy and we swept this under the rug.
>> >
>> > My view is that JIT is a nice performance optimization, but it's
>> > optional. I honestly don't think it should even be exposed through the
>> > API: if it works, then things are faster, and if it doesn't, then
>> > they're not. I don't see the value in an option for causing things to be
>> > broken if someone improves the security of the system.
>>
>> For many users that's definitely the case, but for others that's like
>> saying a RDBMS is still going to be functional if the "ORDER BY"
>> function degrades to bubblesort. The JIT improves performance my
>> multi-hundred percents sometimes, so some users (e.g. me) rely on that
>> not being silently degraded.
>
> the opposite is also true, specially considering that some old
> versions of pcre result in a segfault instead of an error message and
> therefore since there is no way to disable JIT, the only option left
> is not to use `git grep -P` (or the equivalent git log call)

Right, of course it segfaulting is a bug...

>> So I'm wondering if we can have something like:
>>
>>     if (!jit)
>>         if (must_have_jit)
>>             BUG(...); // Like currently
>>         else
>>             fallback(); // new behavior
>
> I am wondering if something like a `git doctor` command might be an
> interesting alternative to this.
>
> This way we could (for ex: in NetBSD) give the user a hint of what to
> do to make their git grep -P faster when we detect we are running the
> fallback, and might be useful as well to provide hints for
> optimizations that could be used in other cases (probably even
> depending on the size of the git repository)

Such a command has been discussed before on-list. I think it's a good
idea for the more fuzzy things like optimization suggests, but for the
case of expecting something at compile-time where not having that at
runtime is a boolean state it's nicer to just die with a BUG(...).

> For your use case, you just need to add a crontab that will trigger an
> alarm if this command ever mentions PCRE

...The reason I'd like it to die is because it neatly and naturally
integrates with all existing test infrastructure I have. I.e. build
package, run stress tests on all sorts of machines, see that it passes,
and SELinux isn't ruining it or whatever.

I know this works now (I always get PCRE v2 JIT) because it doesn't die
or segfault. I'd like not to have it regress to having worse
performance.

Having a cronjob to test for "does PCRE v2 JIT still work?" is not as
easy & isn't a drop-in solution.

  reply	other threads:[~2018-12-11 20:51 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-09 23:00 [RFC PATCH 0/2] fallback to interpreter if JIT fails with pcre Carlo Marcelo Arenas Belón
2018-12-09 23:00 ` [RFC PATCH 1/2] grep: fallback to interpreter if JIT fails with pcre1 Carlo Marcelo Arenas Belón
2018-12-09 23:51   ` Ævar Arnfjörð Bjarmason
2018-12-10  0:42     ` brian m. carlson
2018-12-10  1:25       ` Carlo Arenas
2018-12-10  6:42       ` Junio C Hamano
2018-12-10  8:24       ` Ævar Arnfjörð Bjarmason
2018-12-11 20:11         ` Carlo Arenas
2018-12-11 20:51           ` Ævar Arnfjörð Bjarmason [this message]
2018-12-10  6:54     ` Junio C Hamano
2018-12-10  6:48   ` Junio C Hamano
2018-12-09 23:00 ` [RFC PATCH 2/2] grep: fallback to interpreter if JIT fails with pcre2 Carlo Marcelo Arenas Belón
2018-12-10  7:00   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zhtbn5xb.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=carenas@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=pcre-dev@exim.org \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.