Linux MIPS Architecture development
From: Al Viro <viro@ZenIV.linux.org.uk>
To: linux-mips@linux-mips.org
Cc: James Hogan <james.hogan@imgtec.com>, Ralf Baechle <ralf@linux-mips.org>
Subject: potential deadlock in r4k_flush_cache_sigtramp()
Date: Sat, 23 Sep 2017 14:40:21 +0100
Message-ID: <20170923134021.GN32076@ZenIV.linux.org.uk>

	Calling get_user_pages_fast() while holding ->mmap_sem is
asking for trouble:

CPU1: r4k_flush_cache_sigtramp()
	down_read(&current->mm->mmap_sem);

CPU2: (running thread with the same ->mm): sys_pkey_alloc()
	down_write(&current->mm->mmap_sem);

CPU1:
	pages = get_user_pages_fast(addr, 1, 0, &args.page);
which hits an absent page and does goto slow.  It then goes on to
        ret = get_user_pages_unlocked(start, (end - start) >> PAGE_SHIFT,
                                      pages, write ? FOLL_WRITE : 0);
which does
        return __get_user_pages_unlocked(current, current->mm, start, nr_pages,
                                         pages, gup_flags | FOLL_TOUCH);
which does
        down_read(&mm->mmap_sem);
        ret = __get_user_pages_locked(tsk, mm, start, nr_pages, pages, NULL, 
                                      &locked, false, gup_flags);

and we have a classical deadlock on recursive down_read() (thread 1: down_read()
gets the rwsem; thread 2: down_write() blocks waiting for thread 1 to release
it; thread 1: down_read() blocks waiting for thread 2 to get through down_write()
and eventual up_write(), which completes the deadlock).
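
The interleaving above can be replayed in userspace with a toy model.  This
is only a sketch: toy_rwsem and its helpers are hypothetical names, not the
kernel's rwsem code, and the model assumes a fair semaphore in which new
readers queue behind a waiting writer.  Each helper reports whether the
caller would be granted the lock or would sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a fair reader/writer semaphore: just enough state to
 * show who would block.  NOT the kernel's rwsem implementation. */
struct toy_rwsem {
	int  readers;          /* read holds currently granted */
	bool writer_active;    /* a writer holds the lock */
	int  writers_waiting;  /* writers queued behind current readers */
};

/* Returns true if a read hold is granted, false if the caller would sleep.
 * Assumed fairness rule: new readers queue behind any waiting writer. */
static bool toy_down_read(struct toy_rwsem *s)
{
	if (s->writer_active || s->writers_waiting > 0)
		return false;
	s->readers++;
	return true;
}

/* Returns true if granted; otherwise records the writer as queued. */
static bool toy_down_write(struct toy_rwsem *s)
{
	if (s->writer_active || s->readers > 0) {
		s->writers_waiting++;
		return false;
	}
	s->writer_active = true;
	return true;
}

static void toy_up_read(struct toy_rwsem *s)
{
	assert(s->readers > 0);
	s->readers--;
}

/* Replays the interleaving from the report.  nested_read selects whether
 * thread 1 takes the lock a second time while still holding it.
 * Returns true if both threads end up blocked on each other. */
static bool replay_deadlocks(bool nested_read)
{
	struct toy_rwsem sem = { 0, false, 0 };
	bool t1_outer = toy_down_read(&sem);   /* thread 1: outer down_read() */
	bool t2_write = toy_down_write(&sem);  /* thread 2: down_write() */
	bool t1_inner = nested_read ? toy_down_read(&sem) : true;

	assert(t1_outer);
	/* The writer waits for thread 1's read hold; thread 1 waits behind
	 * the writer and so never releases that hold. */
	if (!t2_write && !t1_inner)
		return true;
	toy_up_read(&sem);  /* thread 1 finishes; the writer can proceed */
	return false;
}
```

Calling replay_deadlocks(true) models the nested down_read() and reports a
deadlock; replay_deadlocks(false) models a caller that does not re-take the
lock, and the queued writer is only delayed, not stuck.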

Replacing pkey_alloc(2) with e.g. mmap(2) turns that from hard deadlock into
something killable, but while "with bad timing you might get process stuck
hard" is worse than "with bad timing you might get process stuck until you
kill -9 it", neither is a good thing.

I'm not familiar enough with arch/mips guts to suggest a specific fix;
replacing get_user_pages_fast() with get_user_pages_locked() would solve the
deadlock, but that loses the fast path; not taking ->mmap_sem there would have
local_r4k_flush_cache_sigtramp() run without fcs_args->mm being locked, which
might or might not be a problem.  Suggestions?

Thread overview: 3+ messages
2017-09-23 13:40 Al Viro [this message]
2017-09-26 13:59 ` potential deadlock in r4k_flush_cache_sigtramp() James Hogan
2017-09-26 13:59   ` James Hogan
