Re: [PATCH 1/2] arm64: mm: abort uaccess retries upon fatal signal

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Rabin Vincent <rabin@rab.in>
To: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	catalin.marinas@arm.com, james.morse@arm.com, labbott@redhat.com,
	linux@armlinux.org.uk, stable@vger.kernel.org,
	steve.capper@arm.com, viro@zeniv.linux.org.uk,
	peterz@infradead.org, luto@amacapital.net
Subject: Re: [PATCH 1/2] arm64: mm: abort uaccess retries upon fatal signal
Date: Tue, 14 Nov 2017 07:46:25 +0100	[thread overview]
Message-ID: <20171114064556.GA18291@lnxartpec.se.axis.com> (raw)
In-Reply-To: <20170822094523.GA5439@arm.com>

On Tue, Aug 22, 2017 at 10:45:24AM +0100, Will Deacon wrote:
> On Mon, Aug 21, 2017 at 02:42:03PM +0100, Mark Rutland wrote:
> > On Tue, Jul 11, 2017 at 03:58:49PM +0100, Will Deacon wrote:
> > > On Tue, Jul 11, 2017 at 03:19:22PM +0100, Mark Rutland wrote:
> > > > When there's a fatal signal pending, arm64's do_page_fault()
> > > > implementation returns 0. The intent is that we'll return to the
> > > > faulting userspace instruction, delivering the signal on the way.
> > > > 
> > > > However, if we take a fatal signal during fixing up a uaccess, this
> > > > results in a return to the faulting kernel instruction, which will be
> > > > instantly retried, resulting in the same fault being taken forever. As
> > > > the task never reaches userspace, the signal is not delivered, and the
> > > > task is left unkillable. While the task is stuck in this state, it can
> > > > inhibit the forward progress of the system.
> > > > 
> > > > To avoid this, we must ensure that when a fatal signal is pending, we
> > > > apply any necessary fixup for a faulting kernel instruction. Thus we
> > > > will return to an error path, and it is up to that code to make forward
> > > > progress towards delivering the fatal signal.
> > > > 
> > > > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > > > Reviewed-by: Steve Capper <steve.capper@arm.com>
> > > > Tested-by: Steve Capper <steve.capper@arm.com>
> > > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > > Cc: James Morse <james.morse@arm.com>
> > > > Cc: Laura Abbott <labbott@redhat.com>
> > > > Cc: Will Deacon <will.deacon@arm.com>
> > > > Cc: stable@vger.kernel.org
> > > > ---
> > > >  arch/arm64/mm/fault.c | 5 ++++-
> > > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > > > index 37b95df..3952d5e 100644
> > > > --- a/arch/arm64/mm/fault.c
> > > > +++ b/arch/arm64/mm/fault.c
> > > > @@ -397,8 +397,11 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
> > > >  	 * signal first. We do not need to release the mmap_sem because it
> > > >  	 * would already be released in __lock_page_or_retry in mm/filemap.c.
> > > >  	 */
> > > > -	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
> > > > +	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
> > > > +		if (!user_mode(regs))
> > > > +			goto no_context;
> > > >  		return 0;
> > > > +	}
> > > 
> > > This will need rebasing at -rc1 (take a look at current HEAD).
> > > 
> > > Also, I think it introduces a weird corner case where we take a page fault
> > > when writing the signal frame to the user stack to deliver a SIGSEGV. If
> > > we end up with VM_FAULT_RETRY and somebody has sent a SIGKILL to the task,
> > > then we'll fail setup_sigframe and force an un-handleable SIGSEGV instead
> > > of SIGKILL.
> > > 
> > > The end result (task is killed) is the same, but the fatal signal is wrong.
> > 
> > That doesn't seem to be the case, testing on v4.13-rc5.
> > 
> > I used sigaltstack() to use the userfaultfd region as signal stack,
> > registerd a SIGSEGV handler, and dereferenced NULL. The task locks up,
> > but when killed with a SIGINT or SIGKILL, the exit status reflects that
> > signal, rather than the SIGSEGV.
> > 
> > If I move the SIGINT handler onto the userfaultfd-monitored stack, then
> > delivering SIGINT hangs, but can be killed with SIGKILL, and the exit
> > status reflects that SIGKILL.
> > 
> > As you say, it does look like we'd try to set up a deferred SIGSEGV for
> > the failed signal delivery.
> > 
> > I haven't yet figured out exactly how that works; I'll keep digging.
> 
> The SEGV makes it all the way into do_group_exit, but then signal_group_exit
> is set and the exit_code is overridden with SIGKILL at the last minute (see
> complete_signal).

Unfortunately, this last minute is too late for print-fatal-signals.
With print-fatal-signals enabled, this patch leads to misleading
"potentially unexpected fatal signal 11" warnings if a process is
SIGKILL'd at the right time.

I've seen it without userfaultfd, but it's easiest reproduced by
patching Mark's original test code [1] with the following patch and
simply running "pkill -WINCH foo; pkill -KILL foo".  This results in:

 foo: potentially unexpected fatal signal 11.
 CPU: 1 PID: 1793 Comm: foo Not tainted 4.9.58-devel #3
 task: b3534780 task.stack: b4b7c000
 PC is at 0x76effb60
 LR is at 0x4227f4
 pc : [<76effb60>]    lr : [<004227f4>]    psr: 600b0010
 sp : 7eaf7bb4  ip : 00000000  fp : 00000000
 r10: 00000001  r9 : 00000003  r8 : 76fcd000
 r7 : 0000001d  r6 : 76fd0cf0  r5 : 7eaf7c08  r4 : 00000000
 r3 : 00000000  r2 : 00000000  r1 : 7eaf7a88  r0 : fffffffc
 Flags: nZCv  IRQs on  FIQs on  Mode USER_32  ISA ARM  Segment user
 Control: 10c5387d  Table: 3357404a  DAC: 00000055
 CPU: 1 PID: 1793 Comm: foo Not tainted 4.9.58-devel #3
 [<801113f0>] (unwind_backtrace) from [<8010cfb0>] (show_stack+0x18/0x1c)
 [<8010cfb0>] (show_stack) from [<8039725c>] (dump_stack+0x84/0x98)
 [<8039725c>] (dump_stack) from [<8012f448>] (get_signal+0x384/0x684)
 [<8012f448>] (get_signal) from [<8010c2ec>] (do_signal+0xcc/0x470)
 [<8010c2ec>] (do_signal) from [<8010c868>] (do_work_pending+0xb8/0xc8)
 [<8010c868>] (do_work_pending) from [<801086d4>] (slow_work_pending+0xc/0x20)

This is ARM and I haven't tested ARM64, but the same problem even exists
on x86.

--- foo.c.orig	2017-11-13 23:45:47.802167284 +0100
+++ foo.c	2017-11-14 07:16:13.906363466 +0100
@@ -6,6 +6,11 @@
 #include <sys/syscall.h>
 #include <sys/vfs.h>
 #include <unistd.h>
+#include <signal.h>
+
+static void handler(int sig)
+{
+}
 
 int main(int argc, char *argv[])
 {
@@ -47,13 +52,17 @@
         if (ret < 0)
                 return errno;
 
+        sigaltstack(&(stack_t){.ss_sp = mem, .ss_size = pagesz}, NULL);
+        sigaction(SIGWINCH, &(struct sigaction){ .sa_handler = handler, .sa_flags = SA_ONSTACK, }, NULL);
+
         /*
          * Force an arbitrary uaccess to memory monitored by the userfaultfd.
          * This will block, but when a SIGKILL is sent, will consume all
          * available CPU time without being killed, and may inhibit forward
          * progress of the system.
          */
-        ret = fstatfs(0, (struct statfs *)mem);
+        // ret = fstatfs(0, (struct statfs *)mem);
+        pause();
 
         return 0;
 }

[1] https://lkml.kernel.org/r/1499782590-31366-1-git-send-email-mark.rutland@arm.com

next prev parent reply	other threads:[~2017-11-14  6:46 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-11 14:19 [PATCH 0/2] Fatal signal handing within uaccess faults Mark Rutland
2017-07-11 14:19 ` [PATCH 1/2] arm64: mm: abort uaccess retries upon fatal signal Mark Rutland
2017-07-11 14:58   ` Will Deacon
2017-08-21 13:42     ` Mark Rutland
2017-08-22  9:45       ` Will Deacon
2017-11-14  6:46         ` Rabin Vincent [this message]
2017-07-12 17:18   ` James Morse
2017-07-11 14:19 ` [PATCH 2/2] arm: " Mark Rutland
2017-08-22 10:40   ` Mark Rutland
  -- strict thread matches above, loose matches on Subject: below --
2017-07-11 14:16 [PATCH 0/2] Fatal signal handing within uaccess faults Mark Rutland
2017-07-11 14:16 ` [PATCH 1/2] arm64: mm: abort uaccess retries upon fatal signal Mark Rutland

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171114064556.GA18291@lnxartpec.se.axis.com \
    --to=rabin@rab.in \
    --cc=catalin.marinas@arm.com \
    --cc=james.morse@arm.com \
    --cc=labbott@redhat.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=luto@amacapital.net \
    --cc=mark.rutland@arm.com \
    --cc=peterz@infradead.org \
    --cc=stable@vger.kernel.org \
    --cc=steve.capper@arm.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox