From: Oleg Nesterov <oleg@redhat.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Tejun Heo <tj@kernel.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/17] signal: Make SIGKILL during coredumps an explicit special case
Date: Fri, 21 Jun 2024 12:40:32 +0200 [thread overview]
Message-ID: <20240621104031.GA12521@redhat.com> (raw)
In-Reply-To: <874j9mdf14.fsf@email.froward.int.ebiederm.org>
Another case when I can hardly understand your reply...
This patch adds a minor user visible change, that was my point.
If you say that the new behaviour is better / more consistent -
I won't really argue, "I expect no one cares" below is probably
true. In my opinion group_exit_code = SIGKILL makes more sense
in this special case, but again, I won't insist.
But then this change should be mentioned and explained in the
changelog, agree?
As for "zap_threads that tests if SIGNAL_GROUP_EXIT is already set",
this is another thing but probably I misundertood you. It is not that
zap_threads/zap_process do not set ->group_exit_code in this case,
in this case do_coredump() will be aborted.
And to remind, zap_threads() used to set SIGNAL_GROUP_COREDUMP, not
SIGNAL_GROUP_EXIT. Because to me the coredumping process is not exiting
yet, it tries to handle the coredumping signal. That is why I prefer
group_exit_code = SIGKILL if it is killed during the dump. But this is
slightly offtopic today.
Oleg.
On 06/21, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> > On 06/19, Eric W. Biederman wrote:
> >>
> >> Oleg Nesterov <oleg@redhat.com> writes:
> >>
> >> > Hi Eric,
> >> >
> >> > I'll _try_ to read this (nontrivial) changes this week. To be honest,
> >> > right now I don't really understand your goals after the quick glance...
> >> >
> >> > So far I have only looked at this simple 1/17 and it doesn't look right
> >> > to me.
> >>
> >> It might be worth applying them all on a branch and just looking at the
> >> end result.
> >
> > Perhaps. Say, the next 2/17 patch. I'd say it is very difficult to understand
> > the purpose unless you read the next patches. OK, at least the change log
> > mentions "in preparation".
> >
> >> > - complete_signal() won't be called, so signal->group_exit_code
> >> > won't be updated.
> >> >
> >> > coredump_finish() won't change it too so the process will exit
> >> > with group_exit_code == signr /* coredumping signal */.
> >> >
> >> > Yes, the fix is obvious and trivial...
> >>
> >> The signal handling from the coredump is arguably correct. The process
> >> has already exited, and gotten an exit code.
> >
> > And zap_process() sets roup_exit_code = signr. But,
> >
> >> But I really don't care about the exit_code either way. I just want to
> >> make ``killing'' a dead process while it core dumps independent of
> >> complete_signal.
> >>
> >> That ``killing'' of a dead process is a completely special case.
> >
> > Sorry I fail to understand...
> >
> > If the coredumping process is killed by SIGKILL, it should exit with
> > group_exit_code = SIGKILL, right? At least this is what we have now.
>
> In general when a fatal signal is sent:
> - It is short circuit delivered.
> - If SIGNAL_GROUP_EXIT is not set
> SIGNAL_GROUP_EXIT is set
> signal->group_exit_code is set
>
> Under those rules group_exit_code should not be updated. Frankly no
> signals should even be processed (except to be queued) after a fatal
> signal comes in.
>
> There is an issue that short circuit delivery does not work on coredump
> signals (because of the way the coredump code works). So it winds up
> being zap_threads that tests if SIGNAL_GROUP_EXIT is already set and
> zap_process that sets SIGNAL_GROUP_EXIT. Essentially the logic remains
> the same, and importantly no later signal is able to set
> group_exit_code. Or really have any effect because the signal sent was
> fatal.
>
> Except except except when the kernel in the context of the userspace
> process is writing a coredump for that userspace process. Then we allow
> the kernel to be sent SIGKILL to stop it's coredumping activities
> because sometimes it can block indefinitely otherwise.
>
> Which is why I call handling that SIGKILL after a coredump has
> begun and SIGNAL_GROUP_EXIT is already set a completely special case.
>
> We might have to change group_exit_code to SIGKILL in that special case,
> if someone in userspace cares. However I expect no one cares.
>
> Further if adding support for SIGKILL during a coredump were being added
> from scratch. The semantics I would choose would be for that SIGKILL
> and indeed all of the coredumping activity would be invisible to
> userspace except for the delay to make it happen. Otherwise a coredump
> which every occasionally gets it's return code changed could introduce
> heisenbugs.
>
> But none of this is documented in the change description and at a bare
> minimum this change of behavior should be before such code is merged.
>
> Eric
>
next prev parent reply other threads:[~2024-06-21 10:42 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-09 14:23 [PATCH 0/1] exit: kill signal_struct->quick_threads Oleg Nesterov
2024-06-09 14:24 ` [PATCH 1/1] " Oleg Nesterov
2024-06-09 18:28 ` [PATCH 0/1] " Oleg Nesterov
2024-06-10 10:50 ` Q: css_task_iter_advance() && dying_tasks Oleg Nesterov
2024-06-10 11:08 ` Oleg Nesterov
2024-06-10 20:02 ` Tejun Heo
2024-06-10 20:00 ` Tejun Heo
2024-06-10 12:15 ` [PATCH 0/1] exit: kill signal_struct->quick_threads Eric W. Biederman
2024-06-10 15:29 ` Oleg Nesterov
2024-06-10 15:42 ` Oleg Nesterov
2024-06-10 16:18 ` Oleg Nesterov
2024-06-13 15:45 ` Oleg Nesterov
2024-06-15 14:53 ` Eric W. Biederman
2024-06-17 18:37 ` Oleg Nesterov
2024-06-19 3:48 ` Eric W. Biederman
2024-06-19 4:04 ` [PATCH 0/17] exit: complete synchronize_group_exit Eric W. Biederman
2024-06-19 4:05 ` [PATCH 01/17] signal: Make SIGKILL during coredumps an explicit special case Eric W. Biederman
2024-06-19 15:50 ` Oleg Nesterov
2024-06-19 18:09 ` Eric W. Biederman
2024-06-19 19:11 ` Oleg Nesterov
2024-06-21 5:46 ` Eric W. Biederman
2024-06-21 10:40 ` Oleg Nesterov [this message]
2024-06-21 16:30 ` Eric W. Biederman
2024-06-19 4:05 ` [PATCH 02/17] signal: Compute the process exit_code in get_signal Eric W. Biederman
2024-06-25 12:34 ` Oleg Nesterov
2024-06-19 4:06 ` [PATCH 03/17] coredump: Consolidate the work to allow SIGKILL during coredumps Eric W. Biederman
2024-06-25 12:34 ` Oleg Nesterov
2024-06-19 4:06 ` [PATCH 04/17] signal: In get_signal call do_exit when it is unnecessary to shoot down threads Eric W. Biederman
2024-06-25 12:35 ` Oleg Nesterov
2024-06-19 4:07 ` [PATCH 05/17] signal: Bring down all threads when handling a non-coredump fatal signal Eric W. Biederman
2024-06-25 12:56 ` Oleg Nesterov
2024-06-19 4:07 ` [PATCH 06/17] signal: Add JOBCTL_WILL_EXIT to mark exiting tasks Eric W. Biederman
2024-06-19 4:08 ` [PATCH 07/17] signal: Always set JOBCTL_WILL_EXIT for " Eric W. Biederman
2024-06-30 14:00 ` kernel test robot
2024-06-19 4:08 ` [PATCH 08/17] signal: Don't target tasks that are exiting Eric W. Biederman
2024-06-19 4:09 ` [PATCH 09/17] signal: Test for process exit or de_thread using task_exit_pending Eric W. Biederman
2024-06-19 4:09 ` [PATCH 10/17] signal: Only set JOBCTL_WILL_EXIT if it is not already set Eric W. Biederman
2024-06-19 4:10 ` [PATCH 11/17] signal: Make individual tasks exiting a first class concept Eric W. Biederman
2024-06-19 4:10 ` [PATCH 12/17] signal: Remove zap_other_threads Eric W. Biederman
2024-06-19 4:11 ` [PATCH 13/17] signal: Stop skipping current in do_group_exit & get_signal Eric W. Biederman
2024-06-28 5:43 ` kernel test robot
2024-06-19 4:11 ` [PATCH 14/17] signal: Factor out schedule_group_exit_locked Eric W. Biederman
2024-06-19 4:12 ` [PATCH 15/17] ptrace: Separate task->ptrace_code out from task->exit_code Eric W. Biederman
2024-06-19 4:12 ` [PATCH 16/17] signal: Record the exit_code when an exit is scheduled Eric W. Biederman
2024-06-19 4:13 ` [PATCH 17/17] signal: Set SIGNAL_GROUP_EXIT when all tasks have decided to exit Eric W. Biederman
2024-06-19 20:18 ` [PATCH 0/1] exit: kill signal_struct->quick_threads Oleg Nesterov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240621104031.GA12521@redhat.com \
--to=oleg@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=ebiederm@xmission.com \
--cc=linux-kernel@vger.kernel.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox