From: ebiederm@xmission.com (Eric W. Biederman)
To: Oleg Nesterov <oleg@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
LKML <linux-kernel@vger.kernel.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
jolsa@redhat.com, Namhyung Kim <namhyung@kernel.org>,
luca abeni <luca.abeni@santannapisa.it>,
syzkaller <syzkaller@googlegroups.com>,
Ivan Delalande <colona@arista.com>
Subject: Re: [PATCH 1/2] signal: Always notice exiting tasks
Date: Tue, 12 Feb 2019 21:58:51 -0600
Message-ID: <87mun05og4.fsf@xmission.com>
In-Reply-To: <20190212165022.GA29263@redhat.com> (Oleg Nesterov's message of "Tue, 12 Feb 2019 17:50:23 +0100")
Oleg Nesterov <oleg@redhat.com> writes:
> On 02/12, Eric W. Biederman wrote:
>>
>> > Here I was trying for the simple minimal change and I hit this landmine.
>> > Which leaves me with the question of what should be semantics of signal
>> > handling after exit.
>
> Yes, currently it is undefined. Even signal_pending() is random.
>
>> > I think from dim memory of previous conversations the desired semantics
>> > look like:
>> > a) Ignore all signal state except for SIGKILL.
>> > b) Letting SIGKILL wake up the process should be sufficient.
>
> signal_wake_up(true) to make fatal_signal_pending() == T, I think.
>
>> Oleg any ideas on how to make PTRACE_EVENT_EXIT reliably killable?
>
> My answer is very simple: PTRACE_EVENT_EXIT must not stop if the tracee was
> killed by the "real" SIGKILL (not by group_exit/etc), that is all. But this
> is another user-visible change, it can equally confuse, say, strace (albeit
> not too much iiuc).
>
> But this needs another discussion.

Yes. Quite.

I will just point out that as described that logic will rebreak Ivan's
program.

>> diff --git a/kernel/signal.c b/kernel/signal.c
>> index 99fa8ff06fd9..a1f154dca73c 100644
>> --- a/kernel/signal.c
>> +++ b/kernel/signal.c
>> @@ -2544,6 +2544,9 @@ bool get_signal(struct ksignal *ksig)
>> }
>>
>> fatal:
>> + /* No more signals can be pending past this point */
>> + sigdelset(&current->pending.signal, SIGKILL);
>
> Well, this is very confusing. In fact, this is not really correct. Say, we should
> not remove the pending SIGKILL if we are going to call do_coredump(). This is
> possible if ptrace_signal() was called, or after is_current_pgrp_orphaned() returns
> false.
I don't see bugs in it. But it is certainly subtle and that is not what
is needed right now.

The subtlety is that we will never have a per thread SIGKILL pending
unless signal_group_exit is true. So removing it when it is not there
is harmless.
>> + clear_tsk_thread_flag(current, TIF_SIGPENDING);
>
> I don't understand this change, it looks irrelevant. Possibly makes sense, but
> this connects to "semantics of signal handling after exit".
As with the other change, the location is too subtle for the regression fix.

The primary motivation is that dequeue_signal calls recalc_sigpending.
And in the common case that will result in clearing TIF_SIGPENDING,
which will result in signal_pending being false.

I have not found a location that cares enough to cause a misbehavior
if we don't clear TIF_SIGPENDING but it is a practical change and there
might be one. So if the word of the day is to be very conservative and
avoid landmines I expect we need the clearing of TIF_SIGPENDING.

Hmm. Probably using recalc_sigpending() now that I think about it.
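
A rough userspace model of why recalculating is safer than clearing the
flag unconditionally. Everything here (struct model_task, the model_*
functions, the bitmask layout) is invented for illustration. The real
recalc_sigpending() also looks at job control and freezer state, which
this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SIGHUP	1
#define MODEL_SIGKILL	9
#define SIGBIT(sig)	(UINT64_C(1) << ((sig) - 1))

/* Hypothetical stand-in for the few task fields that matter here. */
struct model_task {
	uint64_t pending;		/* models current->pending.signal */
	uint64_t shared_pending;	/* models signal->shared_pending.signal */
	uint64_t blocked;		/* models current->blocked */
	bool tif_sigpending;		/* models TIF_SIGPENDING */
};

static inline bool model_signal_pending(const struct model_task *t)
{
	return t->tif_sigpending;
}

/* Set the flag only if some pending signal is not blocked. */
static inline void model_recalc_sigpending(struct model_task *t)
{
	t->tif_sigpending =
		((t->pending | t->shared_pending) & ~t->blocked) != 0;
}

/* Queue a per-thread signal and refresh the flag. */
static inline void model_queue(struct model_task *t, int sig)
{
	assert(sig >= 1 && sig <= 64);
	t->pending |= SIGBIT(sig);
	model_recalc_sigpending(t);
}

/* The "few little things": consume SIGKILL, then recalculate. */
static inline void model_consume_sigkill(struct model_task *t)
{
	t->pending &= ~SIGBIT(MODEL_SIGKILL);
	model_recalc_sigpending(t);
}
```

In this model, if only SIGKILL was pending the flag ends up clear. If an
unblocked SIGHUP is queued as well the flag stays set, which an
unconditional clear would have hidden.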
> OK, we need a minimal incremental fix for now. I'd suggest to replace
>
> ksig->info.si_signo = signr = SIGKILL;
> if (signal_group_exit(signal))
> goto fatal;
>
> added by this patch with
>
> if (__fatal_signal_pending(current)) {
> ksig->info.si_signo = signr = SIGKILL;
> sigdelset(&current->pending.signal, SIGKILL);
> goto fatal;
> }
>
> __fatal_signal_pending() is cheaper and looks more understandable.
I definitely agree that it is much less likely to cause a problem
if we move all of the work before jumping to fatal.

The cost of both __fatal_signal_pending and signal_group_exit is just
a cache line read. So not a big deal either way.

On the other hand __fatal_signal_pending as currently implemented is
insanely subtle and arguably a bit confusing. It tests for a SIGKILL in
the current pending sigset, to discover the signal group property of
whether a process has started exiting.

In the long run we need our data structures not to be subtle and
tricky to use. To do that we need a test of something in signal_struct
because it is a per signal group property. Further we need to remove
the abuse of the per thread SIGKILL.

Since signal_group_exit always implies __fatal_signal_pending in this
case, and the reverse, I see no reason to use a function that requires
we maintain a huge amount of confusing and unnecessary machinery to keep
working.
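
A simplified model of the two tests, to make the distinction concrete.
The grp_* names, the structs, and the flag value are all hypothetical.
The real signal_group_exit() checks more than one field, and the real
group exit path does more than set a bit per thread. The sketch only
assumes what is described above: one test reads a per thread pending bit,
the other reads per group state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRP_SIGKILL_BIT		(UINT64_C(1) << (9 - 1)) /* SIGKILL is 9 on Linux */
#define GRP_FLAG_EXITING	0x1u	/* models SIGNAL_GROUP_EXIT */

struct grp_signal {		/* models signal_struct (per group) */
	unsigned int flags;
};

struct grp_task {		/* models task_struct (per thread) */
	uint64_t pending;
	struct grp_signal *signal;
};

/* Per thread test: is a SIGKILL bit sitting in this thread's set? */
static inline bool grp_fatal_signal_pending(const struct grp_task *t)
{
	return (t->pending & GRP_SIGKILL_BIT) != 0;
}

/* Per group test: has the group been marked as exiting? */
static inline bool grp_signal_group_exit(const struct grp_signal *sig)
{
	return (sig->flags & GRP_FLAG_EXITING) != 0;
}

/*
 * The machinery needed to keep the two tests in agreement: mark the
 * group once, then plant a SIGKILL bit in every thread of the group.
 */
static inline void grp_start_exit(struct grp_task *threads, size_t n)
{
	assert(threads != NULL && n > 0);

	threads[0].signal->flags |= GRP_FLAG_EXITING;
	for (size_t i = 0; i < n; i++)
		threads[i].pending |= GRP_SIGKILL_BIT;
}
```

Once one thread in this model consumes its SIGKILL bit, the per thread
test goes false for that thread while the per group test still says the
group is exiting. That is the sense in which the per group test is the
less fragile one to build on.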

All of that plus the signal_group_exit test has been tested and shown to
fix an ignored SIGKILL. The only practical problem is that it doesn't do
one or two little things that dequeue_signal used to do, and skipping them
made it impossible to stop in PTRACE_EVENT_EXIT.

So for the regression fix let's just do the few little things that
dequeue_signal used to do. That gives us a strong guarantee that
nothing else was missed.

Eric