Re: [PATCH] signal: Set PF_SIGNALED flag for io workers during a group exit

All of lore.kernel.org
 help / color / mirror / Atom feed

From: ebiederm@xmission.com (Eric W. Biederman)
To: Olivier Langlois <olivier@trillion01.com>
Cc: Oleg Nesterov <oleg@redhat.com>,
	"Peter Zijlstra \(Intel\)" <peterz@infradead.org>,
	Marco Elver <elver@google.com>,
	Peter Collingbourne <pcc@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Jens Axboe <axboe@kernel.dk>,
	linux-kernel@vger.kernel.org, io-uring@vger.kernel.org
Subject: Re: [PATCH] signal: Set PF_SIGNALED flag for io workers during a group exit
Date: Mon, 07 Jun 2021 22:49:55 -0500	[thread overview]
Message-ID: <8735ttggm4.fsf@disp2133> (raw)
In-Reply-To: <E1lqLo6-00ENqW-TB@mx03.mta.xmission.com> (Olivier Langlois's message of "Sun, 30 May 2021 18:49:38 -0400")

Olivier Langlois <olivier@trillion01.com> writes:

> io worker threads are in most regards userspace threads except
> that they never resume userspace. Therefore, they need to explicitly
> handle signals.
>
> On delivering a fatal signal generating a core dump to a thread of
> a group having 1 or more io workers, it is possible for the io_workers
> to exit with pending signals.
>
> One example of this is the io_wqe_worker() function thread in fs/io-wq.c
> This thread can exit the function with pending signals when its
> IO_WQ_BIT_EXIT bit is set.
>
> The consequence of exiting with pending signals is that PF_SIGNALED
> will not be set. This flag is used in exit_mm() to engage into
> the synchronization between do_coredump() and exit_mm().
>
> The purpose of this synchronization is not well documented and all
> that I have found is that it is used to avoid corruption in the core file
> in the section "Deleting a Process Address Space", chapter 9 of the
> Bovet & Cesati book.

We added the check just a little while ago.  I am surprised it shows up
in any book.  What is the Bovett & Cesati book?

The flag PF_SIGNALED today is set in exactly one place, and that
is in get_signal.  The meaning of PF_SIGNALED is that do_group_exit
was called from get_signal.  AKA your task was killed by a signal.

The check in exit_mm() that tests PF_SIGNALED is empirically testing
to see if all of the necessary state is saved on the kernel stack.
That state is the state accessed by fs/binfmt_elf.c:fill_note_info.

The very good description from the original change can be found in
the commit 123cbec460db ("signal: Remove the helper signal_group_exit").

For alpha it is has the assembly function do_switch_stack been called
before your code path was called in the kernel.

Since io_uring does not have a userspace  I don't know if testing
for PF_SIGNALED is at all meaningful to detect values saved on the
stack.

I suspect io_uring is simply broken on architectures that need
extra state saved on the stack, but I haven't looked yet.

> So I am not sure if the synchronizatin MUST be applied to io_workers
> or not but the proposed patch is making sure that it is applied in
> all cases if it is needed.

That patch is definitely wrong.  If anything the check in exit_mm
should be updated.

Can you share which code paths in io_uring exit with a
fatal_signal_pending and don't bother to call get_signal?

I am currently looking to see if the wait for a coredump to read
a threads data can be moved from exit_mm into get_signal.  Even
with that io_uring might need a some additional fixes.

Eric

> Signed-off-by: Olivier Langlois <olivier@trillion01.com>
> ---
>  kernel/signal.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/kernel/signal.c b/kernel/signal.c
> index f7c6ffcbd044..477bfe55fd3c 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2925,6 +2925,15 @@ void exit_signals(struct task_struct *tsk)
>  
>  	if (thread_group_empty(tsk) || signal_group_exit(tsk->signal)) {
>  		tsk->flags |= PF_EXITING;
> +		/*
> +		 * It is possible for an io worker thread to reach this
> +		 * function with a pending SIGKILL.
> +		 * Set PF_SIGNALED for proper core dump generation
> +		 * (See exit_mm())
> +		 */
> +		if (tsk->flags & PF_IO_WORKER &&
> +		    signal_group_exit(tsk->signal))
> +			tsk->flags |= PF_SIGNALED;
>  		cgroup_threadgroup_change_end(tsk);
>  		return;
>  	}

next      parent reply	other threads:[~2021-06-08  3:50 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <E1lqLo6-00ENqW-TB@mx03.mta.xmission.com>
2021-06-08  3:49 ` Eric W. Biederman [this message]
2021-06-08 17:43   ` [PATCH] signal: Set PF_SIGNALED flag for io workers during a group exit Olivier Langlois
2021-06-08 18:44     ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8735ttggm4.fsf@disp2133 \
    --to=ebiederm@xmission.com \
    --cc=axboe@kernel.dk \
    --cc=elver@google.com \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=olivier@trillion01.com \
    --cc=pcc@google.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.