linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Pavel Emelyanov <xemul@parallels.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	syzkaller <syzkaller@googlegroups.com>,
	Kostya Serebryany <kcc@google.com>,
	Alexander Potapenko <glider@google.com>,
	Sasha Levin <sasha.levin@oracle.com>
Subject: Re: fs: uninterruptible hang in handle_userfault
Date: Wed, 2 Mar 2016 15:55:21 +0100	[thread overview]
Message-ID: <20160302145521.GO1180@redhat.com> (raw)
In-Reply-To: <20160302004845.GF17997@ZenIV.linux.org.uk>

Hello,

On Wed, Mar 02, 2016 at 12:48:46AM +0000, Al Viro wrote:
> On Tue, Mar 01, 2016 at 12:06:49PM -0800, Linus Torvalds wrote:
> 
> > So the only access we really care about is the child tid-pointer
> > clearing one, and that always happens after PF_EXITING has been set
> > afaik.
> > 
> > No other case really matters. If somebody accesses a userfault region
> > just as another thread is exiting, we don't care. I don't think it
> > would necessarily be wrong to ignore the fault, but I don't think it's
> > relevant either, since at that stage the normal "you can signal the
> > thread" still works. It's only the child tid access that comes *after*
> > we have stopped acceping signals, and that's marked by that
> > PF_EXITING.
> > 
> > Or maybe I misunderstood your worry entirely or missed something, and
> > my answer above is entirely beside your point. Did you have something
> > else in mind?
> 
> No, I've misread de_thread()/zap_other_threads().  No objections to the
> patch.

I reviewed the fix (a) too and it's sure fine with me too.

So I evaluated if we could fix the deadlock by simply preventing to
run userland page faults while SIGKILL cannot be delivered anymore. I
think skipping those userland accesses wouldn't be strictly required
if they were run by a legit app with a userfaultfd manager thread
alive and well.

Running page faults that late in the exit path with signal disabled
was frankly unexpected. So I did a quick attempt to test such an
exit-code reordering change, but it's overall more risky and so far I
didn't succeed to have a SIGKILL reach handle_userfault() despite I
cleaned up that futex code into a proper futex_exit run just before
exit_signals instead of inside mm_release. Apparently it's not just
PF_EXITING that prevents SIGKILL to reach handle_userfault(). The
below change still didn't allow to kill the task:

+	exit_futex(tsk); /* run before setting PF_EXITING */
	exit_signals(tsk);  /* sets PF_EXITING */

So again I think for now fix (a) is preferable and I verified it too
with the test program. As far as the primary production user of
userfaulfd is concerned, no futex will run in the userfaultfd tracked
region so no matter what the futex exit code is there for, there's no
risk of memory corruption.

Thanks,
Andrea

  reply	other threads:[~2016-03-02 14:55 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-01 11:29 fs: uninterruptible hang in handle_userfault Dmitry Vyukov
2016-03-01 19:56 ` Linus Torvalds
2016-03-01 19:59   ` Al Viro
2016-03-01 20:06     ` Linus Torvalds
2016-03-02  0:48       ` Al Viro
2016-03-02 14:55         ` Andrea Arcangeli [this message]
2016-03-02 17:03           ` Linus Torvalds
2016-03-02 17:34             ` Andrea Arcangeli
2016-03-03  7:14             ` Sedat Dilek
     [not found]               ` <CA+55aFzAUx01f31X6KhZMDddzbnN=rsyzV3nLF=C7FS22m9X=Q@mail.gmail.com>
2016-03-03  7:46                 ` Sedat Dilek
2016-03-03 12:48                   ` Andrea Arcangeli
2016-03-01 20:28   ` Linus Torvalds
2016-03-02  9:16     ` Dmitry Vyukov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160302145521.GO1180@redhat.com \
    --to=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=kcc@google.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sasha.levin@oracle.com \
    --cc=syzkaller@googlegroups.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@ZenIV.linux.org.uk \
    --cc=xemul@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).