From: Oleg Nesterov <oleg@redhat.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Tejun Heo <tj@kernel.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/17] signal: Make SIGKILL during coredumps an explicit special case
Date: Fri, 21 Jun 2024 12:40:32 +0200 [thread overview]
Message-ID: <20240621104031.GA12521@redhat.com> (raw)
In-Reply-To: <874j9mdf14.fsf@email.froward.int.ebiederm.org>
Another case when I can hardly understand your reply...
This patch adds a minor user visible change, that was my point.
If you say that the new behaviour is better / more consistent -
I won't really argue, "I expect no one cares" below is probably
true. In my opinion group_exit_code = SIGKILL makes more sense
in this special case, but again, I won't insist.
But then this change should be mentioned and explained in the
changelog, agree?
As for "zap_threads that tests if SIGNAL_GROUP_EXIT is already set",
this is another thing but probably I misundertood you. It is not that
zap_threads/zap_process do not set ->group_exit_code in this case,
in this case do_coredump() will be aborted.
And to remind, zap_threads() used to set SIGNAL_GROUP_COREDUMP, not
SIGNAL_GROUP_EXIT. Because to me the coredumping process is not exiting
yet, it tries to handle the coredumping signal. That is why I prefer
group_exit_code = SIGKILL if it is killed during the dump. But this is
slightly offtopic today.
Oleg.
On 06/21, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> > On 06/19, Eric W. Biederman wrote:
> >>
> >> Oleg Nesterov <oleg@redhat.com> writes:
> >>
> >> > Hi Eric,
> >> >
> >> > I'll _try_ to read this (nontrivial) changes this week. To be honest,
> >> > right now I don't really understand your goals after the quick glance...
> >> >
> >> > So far I have only looked at this simple 1/17 and it doesn't look right
> >> > to me.
> >>
> >> It might be worth applying them all on a branch and just looking at the
> >> end result.
> >
> > Perhaps. Say, the next 2/17 patch. I'd say it is very difficult to understand
> > the purpose unless you read the next patches. OK, at least the change log
> > mentions "in preparation".
> >
> >> > - complete_signal() won't be called, so signal->group_exit_code
> >> > won't be updated.
> >> >
> >> > coredump_finish() won't change it too so the process will exit
> >> > with group_exit_code == signr /* coredumping signal */.
> >> >
> >> > Yes, the fix is obvious and trivial...
> >>
> >> The signal handling from the coredump is arguably correct. The process
> >> has already exited, and gotten an exit code.
> >
> > And zap_process() sets roup_exit_code = signr. But,
> >
> >> But I really don't care about the exit_code either way. I just want to
> >> make ``killing'' a dead process while it core dumps independent of
> >> complete_signal.
> >>
> >> That ``killing'' of a dead process is a completely special case.
> >
> > Sorry I fail to understand...
> >
> > If the coredumping process is killed by SIGKILL, it should exit with
> > group_exit_code = SIGKILL, right? At least this is what we have now.
>
> In general when a fatal signal is sent:
> - It is short circuit delivered.
> - If SIGNAL_GROUP_EXIT is not set
> SIGNAL_GROUP_EXIT is set
> signal->group_exit_code is set
>
> Under those rules group_exit_code should not be updated. Frankly no
> signals should even be processed (except to be queued) after a fatal
> signal comes in.
>
> There is an issue that short circuit delivery does not work on coredump
> signals (because of the way the coredump code works). So it winds up
> being zap_threads that tests if SIGNAL_GROUP_EXIT is already set and
> zap_process that sets SIGNAL_GROUP_EXIT. Essentially the logic remains
> the same, and importantly no later signal is able to set
> group_exit_code. Or really have any effect because the signal sent was
> fatal.
>
> Except except except when the kernel in the context of the userspace
> process is writing a coredump for that userspace process. Then we allow
> the kernel to be sent SIGKILL to stop it's coredumping activities
> because sometimes it can block indefinitely otherwise.
>
> Which is why I call handling that SIGKILL after a coredump has
> begun and SIGNAL_GROUP_EXIT is already set a completely special case.
>
> We might have to change group_exit_code to SIGKILL in that special case,
> if someone in userspace cares. However I expect no one cares.
>
> Further if adding support for SIGKILL during a coredump were being added
> from scratch. The semantics I would choose would be for that SIGKILL
> and indeed all of the coredumping activity would be invisible to
> userspace except for the delay to make it happen. Otherwise a coredump
> which every occasionally gets it's return code changed could introduce
> heisenbugs.
>
> But none of this is documented in the change description and at a bare
> minimum this change of behavior should be before such code is merged.
>
> Eric
>
next prev parent reply other threads:[~2024-06-21 10:42 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-09 14:23 [PATCH 0/1] exit: kill signal_struct->quick_threads Oleg Nesterov
2024-06-09 14:24 ` [PATCH 1/1] " Oleg Nesterov
2024-06-09 18:28 ` [PATCH 0/1] " Oleg Nesterov
2024-06-10 10:50 ` Q: css_task_iter_advance() && dying_tasks Oleg Nesterov
2024-06-10 11:08 ` Oleg Nesterov
2024-06-10 20:02 ` Tejun Heo
2024-06-10 20:00 ` Tejun Heo
2024-06-10 12:15 ` [PATCH 0/1] exit: kill signal_struct->quick_threads Eric W. Biederman
2024-06-10 15:29 ` Oleg Nesterov
2024-06-10 15:42 ` Oleg Nesterov
2024-06-10 16:18 ` Oleg Nesterov
2024-06-13 15:45 ` Oleg Nesterov
2024-06-15 14:53 ` Eric W. Biederman
2024-06-17 18:37 ` Oleg Nesterov
2024-06-19 3:48 ` Eric W. Biederman
2024-06-19 4:04 ` [PATCH 0/17] exit: complete synchronize_group_exit Eric W. Biederman
2024-06-19 4:05 ` [PATCH 01/17] signal: Make SIGKILL during coredumps an explicit special case Eric W. Biederman
2024-06-19 15:50 ` Oleg Nesterov
2024-06-19 18:09 ` Eric W. Biederman
2024-06-19 19:11 ` Oleg Nesterov
2024-06-21 5:46 ` Eric W. Biederman
2024-06-21 10:40 ` Oleg Nesterov [this message]
2024-06-21 16:30 ` Eric W. Biederman
2024-06-19 4:05 ` [PATCH 02/17] signal: Compute the process exit_code in get_signal Eric W. Biederman
2024-06-25 12:34 ` Oleg Nesterov
2024-06-19 4:06 ` [PATCH 03/17] coredump: Consolidate the work to allow SIGKILL during coredumps Eric W. Biederman
2024-06-25 12:34 ` Oleg Nesterov
2024-06-19 4:06 ` [PATCH 04/17] signal: In get_signal call do_exit when it is unnecessary to shoot down threads Eric W. Biederman
2024-06-25 12:35 ` Oleg Nesterov
2024-06-19 4:07 ` [PATCH 05/17] signal: Bring down all threads when handling a non-coredump fatal signal Eric W. Biederman
2024-06-25 12:56 ` Oleg Nesterov
2024-06-19 4:07 ` [PATCH 06/17] signal: Add JOBCTL_WILL_EXIT to mark exiting tasks Eric W. Biederman
2024-06-19 4:08 ` [PATCH 07/17] signal: Always set JOBCTL_WILL_EXIT for " Eric W. Biederman
2024-06-30 14:00 ` kernel test robot
2024-06-19 4:08 ` [PATCH 08/17] signal: Don't target tasks that are exiting Eric W. Biederman
2024-06-19 4:09 ` [PATCH 09/17] signal: Test for process exit or de_thread using task_exit_pending Eric W. Biederman
2024-06-19 4:09 ` [PATCH 10/17] signal: Only set JOBCTL_WILL_EXIT if it is not already set Eric W. Biederman
2024-06-19 4:10 ` [PATCH 11/17] signal: Make individual tasks exiting a first class concept Eric W. Biederman
2024-06-19 4:10 ` [PATCH 12/17] signal: Remove zap_other_threads Eric W. Biederman
2024-06-19 4:11 ` [PATCH 13/17] signal: Stop skipping current in do_group_exit & get_signal Eric W. Biederman
2024-06-28 5:43 ` [LTP] " kernel test robot
2024-06-28 5:43 ` kernel test robot
2024-06-19 4:11 ` [PATCH 14/17] signal: Factor out schedule_group_exit_locked Eric W. Biederman
2024-06-19 4:12 ` [PATCH 15/17] ptrace: Separate task->ptrace_code out from task->exit_code Eric W. Biederman
2024-06-19 4:12 ` [PATCH 16/17] signal: Record the exit_code when an exit is scheduled Eric W. Biederman
2024-06-19 4:13 ` [PATCH 17/17] signal: Set SIGNAL_GROUP_EXIT when all tasks have decided to exit Eric W. Biederman
2024-06-19 20:18 ` [PATCH 0/1] exit: kill signal_struct->quick_threads Oleg Nesterov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240621104031.GA12521@redhat.com \
--to=oleg@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=ebiederm@xmission.com \
--cc=linux-kernel@vger.kernel.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.