Re: [PATCH 01/17] signal: Make SIGKILL during coredumps an explicit special case

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Oleg Nesterov <oleg@redhat.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Tejun Heo <tj@kernel.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/17] signal: Make SIGKILL during coredumps an explicit special case
Date: Fri, 21 Jun 2024 12:40:32 +0200	[thread overview]
Message-ID: <20240621104031.GA12521@redhat.com> (raw)
In-Reply-To: <874j9mdf14.fsf@email.froward.int.ebiederm.org>

Another case when I can hardly understand your reply...

This patch adds a minor user visible change, that was my point.

If you say that the new behaviour is better / more consistent -
I won't really argue, "I expect no one cares" below is probably
true. In my opinion group_exit_code = SIGKILL makes more sense
in this special case, but again, I won't insist.

But then this change should be mentioned and explained in the
changelog, agree?

As for "zap_threads that tests if SIGNAL_GROUP_EXIT is already set",
this is another thing but probably I misundertood you. It is not that
zap_threads/zap_process do not set ->group_exit_code in this case,
in this case do_coredump() will be aborted.

And to remind, zap_threads() used to set SIGNAL_GROUP_COREDUMP, not
SIGNAL_GROUP_EXIT. Because to me the coredumping process is not exiting
yet, it tries to handle the coredumping signal. That is why I prefer
group_exit_code = SIGKILL if it is killed during the dump. But this is
slightly offtopic today.

Oleg.

On 06/21, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> > On 06/19, Eric W. Biederman wrote:
> >>
> >> Oleg Nesterov <oleg@redhat.com> writes:
> >>
> >> > Hi Eric,
> >> >
> >> > I'll _try_ to read this (nontrivial) changes this week. To be honest,
> >> > right now I don't really understand your goals after the quick glance...
> >> >
> >> > So far I have only looked at this simple 1/17 and it doesn't look right
> >> > to me.
> >>
> >> It might be worth applying them all on a branch and just looking at the
> >> end result.
> >
> > Perhaps. Say, the next 2/17 patch. I'd say it is very difficult to understand
> > the purpose unless you read the next patches. OK, at least the change log
> > mentions "in preparation".
> >
> >> > 	- complete_signal() won't be called, so signal->group_exit_code
> >> > 	  won't be updated.
> >> >
> >> > 	  coredump_finish() won't change it too so the process will exit
> >> > 	  with group_exit_code == signr /* coredumping signal */.
> >> >
> >> > 	  Yes, the fix is obvious and trivial...
> >>
> >> The signal handling from the coredump is arguably correct.  The process
> >> has already exited, and gotten an exit code.
> >
> > And zap_process() sets roup_exit_code = signr. But,
> >
> >> But I really don't care about the exit_code either way.  I just want to
> >> make ``killing'' a dead process while it core dumps independent of
> >> complete_signal.
> >>
> >> That ``killing'' of a dead process is a completely special case.
> >
> > Sorry I fail to understand...
> >
> > If the coredumping process is killed by SIGKILL, it should exit with
> > group_exit_code = SIGKILL, right? At least this is what we have now.
>
> In general when a fatal signal is sent:
> - It is short circuit delivered.
> - If SIGNAL_GROUP_EXIT is not set
>    SIGNAL_GROUP_EXIT is set
>    signal->group_exit_code is set
>
> Under those rules group_exit_code should not be updated.  Frankly no
> signals should even be processed (except to be queued) after a fatal
> signal comes in.
>
> There is an issue that short circuit delivery does not work on coredump
> signals (because of the way the coredump code works).  So it winds up
> being zap_threads that tests if SIGNAL_GROUP_EXIT is already set and
> zap_process that sets SIGNAL_GROUP_EXIT.  Essentially the logic remains
> the same, and importantly no later signal is able to set
> group_exit_code.  Or really have any effect because the signal sent was
> fatal.
>
> Except except except when the kernel in the context of the userspace
> process is writing a coredump for that userspace process.  Then we allow
> the kernel to be sent SIGKILL to stop it's coredumping activities
> because sometimes it can block indefinitely otherwise.
>
> Which is why I call handling that SIGKILL after a coredump has
> begun and SIGNAL_GROUP_EXIT is already set a completely special case.
>
> We might have to change group_exit_code to SIGKILL in that special case,
> if someone in userspace cares.  However I expect no one cares.
>
> Further if adding support for SIGKILL during a coredump were being added
> from scratch.  The semantics I would choose would be for that SIGKILL
> and indeed all of the coredumping activity would be invisible to
> userspace except for the delay to make it happen.  Otherwise a coredump
> which every occasionally gets it's return code changed could introduce
> heisenbugs.
>
> But none of this is documented in the change description and at a bare
> minimum this change of behavior should be before such code is merged.
>
> Eric
>

next prev parent reply	other threads:[~2024-06-21 10:42 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-09 14:23 [PATCH 0/1] exit: kill signal_struct->quick_threads Oleg Nesterov
2024-06-09 14:24 ` [PATCH 1/1] " Oleg Nesterov
2024-06-09 18:28 ` [PATCH 0/1] " Oleg Nesterov
2024-06-10 10:50 ` Q: css_task_iter_advance() && dying_tasks Oleg Nesterov
2024-06-10 11:08   ` Oleg Nesterov
2024-06-10 20:02     ` Tejun Heo
2024-06-10 20:00   ` Tejun Heo
2024-06-10 12:15 ` [PATCH 0/1] exit: kill signal_struct->quick_threads Eric W. Biederman
2024-06-10 15:29   ` Oleg Nesterov
2024-06-10 15:42     ` Oleg Nesterov
2024-06-10 16:18     ` Oleg Nesterov
2024-06-13 15:45     ` Oleg Nesterov
2024-06-15 14:53       ` Eric W. Biederman
2024-06-17 18:37         ` Oleg Nesterov
2024-06-19  3:48           ` Eric W. Biederman
2024-06-19  4:04             ` [PATCH 0/17] exit: complete synchronize_group_exit Eric W. Biederman
2024-06-19  4:05               ` [PATCH 01/17] signal: Make SIGKILL during coredumps an explicit special case Eric W. Biederman
2024-06-19 15:50                 ` Oleg Nesterov
2024-06-19 18:09                   ` Eric W. Biederman
2024-06-19 19:11                     ` Oleg Nesterov
2024-06-21  5:46                       ` Eric W. Biederman
2024-06-21 10:40                         ` Oleg Nesterov [this message]
2024-06-21 16:30                           ` Eric W. Biederman
2024-06-19  4:05               ` [PATCH 02/17] signal: Compute the process exit_code in get_signal Eric W. Biederman
2024-06-25 12:34                 ` Oleg Nesterov
2024-06-19  4:06               ` [PATCH 03/17] coredump: Consolidate the work to allow SIGKILL during coredumps Eric W. Biederman
2024-06-25 12:34                 ` Oleg Nesterov
2024-06-19  4:06               ` [PATCH 04/17] signal: In get_signal call do_exit when it is unnecessary to shoot down threads Eric W. Biederman
2024-06-25 12:35                 ` Oleg Nesterov
2024-06-19  4:07               ` [PATCH 05/17] signal: Bring down all threads when handling a non-coredump fatal signal Eric W. Biederman
2024-06-25 12:56                 ` Oleg Nesterov
2024-06-19  4:07               ` [PATCH 06/17] signal: Add JOBCTL_WILL_EXIT to mark exiting tasks Eric W. Biederman
2024-06-19  4:08               ` [PATCH 07/17] signal: Always set JOBCTL_WILL_EXIT for " Eric W. Biederman
2024-06-30 14:00                 ` kernel test robot
2024-06-19  4:08               ` [PATCH 08/17] signal: Don't target tasks that are exiting Eric W. Biederman
2024-06-19  4:09               ` [PATCH 09/17] signal: Test for process exit or de_thread using task_exit_pending Eric W. Biederman
2024-06-19  4:09               ` [PATCH 10/17] signal: Only set JOBCTL_WILL_EXIT if it is not already set Eric W. Biederman
2024-06-19  4:10               ` [PATCH 11/17] signal: Make individual tasks exiting a first class concept Eric W. Biederman
2024-06-19  4:10               ` [PATCH 12/17] signal: Remove zap_other_threads Eric W. Biederman
2024-06-19  4:11               ` [PATCH 13/17] signal: Stop skipping current in do_group_exit & get_signal Eric W. Biederman
2024-06-28  5:43                 ` kernel test robot
2024-06-19  4:11               ` [PATCH 14/17] signal: Factor out schedule_group_exit_locked Eric W. Biederman
2024-06-19  4:12               ` [PATCH 15/17] ptrace: Separate task->ptrace_code out from task->exit_code Eric W. Biederman
2024-06-19  4:12               ` [PATCH 16/17] signal: Record the exit_code when an exit is scheduled Eric W. Biederman
2024-06-19  4:13               ` [PATCH 17/17] signal: Set SIGNAL_GROUP_EXIT when all tasks have decided to exit Eric W. Biederman
2024-06-19 20:18             ` [PATCH 0/1] exit: kill signal_struct->quick_threads Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240621104031.GA12521@redhat.com \
    --to=oleg@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=ebiederm@xmission.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox