All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Stefan Beller <sbeller@google.com>
Cc: "git\@vger.kernel.org" <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Brandon Casey <drafnel@gmail.com>, Jeff King <peff@peff.net>
Subject: Re: [PATCH] die routine: change recursion limit from 1 to 1024
Date: Tue, 20 Jun 2017 00:32:56 +0200	[thread overview]
Message-ID: <87podz8v6v.fsf@gmail.com> (raw)
In-Reply-To: <CAGZ79kaHXX4ne_hfjVs7PaeLzHc+CbVSZ1hCNz0ebG5F3uJEzg@mail.gmail.com>


On Mon, Jun 19 2017, Stefan Beller jotted:

>> Now, git-grep could make use of the pluggable error facility added in
>> commit c19a490e37 ("usage: allow pluggable die-recursion checks",
>> 2013-04-16).
>
> I think we should do that instead (though I have not looked at the downsides
> of this), because...

It makes sense to do that in addition to what I'm doing here (or if the
approach I'm taking doesn't make sense, some other patch to fix the
general issue in the default handler).

I'm going to try to get around to fixing the grep behavior in a
follow-up patch, this is a fix for the overzealous recursion detection
in the default handler needlessly causing other issues.

>>
>> So let's just set the recursion limit to a number higher than the
>> number of threads we're ever likely to spawn. Now we won't lose
>> errors, and if we have a recursing die handler we'll still die within
>> microseconds.
>
> how are we handling access to that global variable?
> Do we need to hold a mutex to be correct? or rather hope that
> it works across threads, not counting on it, because each thread
> individually would count up to 1024?

It's not guarded by a mutex and the ++ here and the reads from it are
both racy.

However, for its stated purpose that's fine, even if we're racily
incrementing it and losing some updates some will get through, which is
good enough for an infinite recursion detection. We don't really care if
we die at exactly 1 or exactly 1024.

> I would prefer if we kept the number as low as "at most
> one screen of lines".

In practice this is the case in git, because the programs that would
encounter this are going to be spawning less than screenfull of threads,
assuming (as is the case) that each thread might print out one error.

The semantics of that aren't changed with this patch, the difference is
that you're going to get e.g. N repeats of a meaningful error instead of
N repeats of either the meaningful error OR "recursion detected in die
handler", depending on your luck.

I.e. in current git (after a few runs to get an unlucky one):

    $ git grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$'
    fatal: recursion detected in die handler
    fatal: recursion detected in die handler
    fatal: recursion detected in die handler

Or if you're lucky at least one of these will be the actual error:

    $ git grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$'
    fatal: recursion detected in die handler
    fatal: pcre_exec failed with error code -8
    fatal: recursion detected in die handler
    fatal: recursion detected in die handler
    fatal: recursion detected in die handler

But with this change:

    $ ~/g/git/git-grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$'
    fatal: pcre2_jit_match failed with error code -47: match limit exceeded
    fatal: pcre2_jit_match failed with error code -47: match limit exceeded
    fatal: pcre2_jit_match failed with error code -47: match limit exceeded

(The error message is different because I compiled with PCRE v2 locally,
instead of the system PCRE v1, but that doesn't matter for this example)

Over 1000 runs thish is how that breaks down on my machine, without this
patch. I've replaced the recursion error with just "R" and the PCRE
error with "P", and shown them in descending order by occurrence, lines
without a "P" only printed out the recursion error:

    $ (for i in {1..1000}; do git grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' 2>&1|perl -pe 's/^fatal: r.*/R/; s/^fatal: p.*/P/' | tr '\n' ' '; echo; done)|sort|uniq -c|sort -nr|head -n 10
        247 R R
        136 P R R
        122 P R
        112 R R R
        108 R
         59 R P R
         54 R P
         54 P
         31 P R R R
         21 R R P

There's a long tail I've omitted there of alterations to that. As this
shows in >10% of cases we don't print out any meaningful error at
all. But with this change:

    $ (for i in {1..1000}; do ~/g/git/git-grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' 2>&1|perl -pe 's/^fatal: r.*/R/; s/^fatal: p.*/P/' | tr '\n' ' '; echo; done)|sort|uniq -c|sort -nr|head -n 10
        377 P P
        358 P P P
        192 P
         63 P P P P
          8 P P P P P
          2 P P P P P P

We will always show a meaningful error, but may of course do so multiple
times, which is a subject for a fix in git-grep in particular, but the
point is again, to fix the general case for the default handler.

Something something sorry about the long mail didn't have time to write
a shorter one :)

>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  usage.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/usage.c b/usage.c
>> index 2f87ca69a8..1c198d4882 100644
>> --- a/usage.c
>> +++ b/usage.c
>> @@ -44,7 +44,9 @@ static void warn_builtin(const char *warn, va_list params)
>>  static int die_is_recursing_builtin(void)
>>  {
>>         static int dying;
>> -       return dying++;
>> +       static int recursion_limit = 1024;
>> +
>> +       return dying++ > recursion_limit;
>>  }
>>
>>  /* If we are in a dlopen()ed .so write to a global variable would segfault
>> --
>> 2.13.1.518.g3df882009
>>

  reply	other threads:[~2017-06-19 22:33 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-19 22:00 [PATCH] die routine: change recursion limit from 1 to 1024 Ævar Arnfjörð Bjarmason
2017-06-19 22:08 ` Stefan Beller
2017-06-19 22:32   ` Ævar Arnfjörð Bjarmason [this message]
2017-06-19 22:38     ` Stefan Beller
2017-06-21 20:47     ` [PATCH v2] die(): stop hiding errors due to overzealous recursion guard Ævar Arnfjörð Bjarmason
2017-06-21 21:12       ` Stefan Beller
2017-06-21 21:21       ` Morten Welinder
2017-06-21 21:40         ` Ævar Arnfjörð Bjarmason
2017-06-21 21:32       ` Junio C Hamano
2017-06-24 12:36         ` Jeff King
2017-06-24 18:32           ` Junio C Hamano
2017-06-20 15:54 ` [PATCH] die routine: change recursion limit from 1 to 1024 Jeff King
2017-06-20 16:15   ` Jeff King
2017-06-20 18:49   ` Ævar Arnfjörð Bjarmason
2017-06-20 19:05     ` Jeff King
2017-06-21  8:12     ` Simon Ruderich
2017-06-21 10:10       ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87podz8v6v.fsf@gmail.com \
    --to=avarab@gmail.com \
    --cc=drafnel@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.