From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Oleg Nesterov <oleg@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
Ingo Molnar <mingo@kernel.org>,
Clark Williams <williams@redhat.com>
Subject: Re: safety of *mutex_unlock() (Was: [BUG] signal: sighand unprotected when accessed by /proc)
Date: Tue, 10 Jun 2014 05:56:55 -0700 [thread overview]
Message-ID: <20140610125655.GJ4581@linux.vnet.ibm.com> (raw)
In-Reply-To: <CA+55aFyO8bb7fS3AaQzfMq7geERQ1JpSdcRZ9Fbs5-tZP0oRXg@mail.gmail.com>
On Mon, Jun 09, 2014 at 11:51:09AM -0700, Linus Torvalds wrote:
> On Mon, Jun 9, 2014 at 11:29 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> >>
> >> And once again, note that the normal mutex is already unsafe (unless I missed
> >> something).
> >
> > Is it unsafe?
> >
> > This thread was started because of a bug we triggered in -rt, which
> > ended up being a change specific to -rt that modified the way slub
> > handled SLAB_DESTROY_BY_RCU. What else was wrong with it?
>
> There's a different issue with freeing of mutexes, which is not a bug,
> but "by design". Namely that mutexes aren't truly "atomic". They are
> complex data structures, and they have issues that a spinlock does not
> have.
>
> When unlocking a mutex, the thread doing the unlocking will still
> touch the mutex itself _after_ another thread could already have
> successfully acquired the mutex. This is not a problem in any normal
> use. since all of this is perfectly coherent in general, but it means
> that code sequences like:
>
> mutex_lock(mem->mutex);
> kill_it = !--mem->refcount;
> mutex_unlock(mem->mutex);
> if (kill_it)
> free(mem);
>
> are fundamentally buggy.
>
> Note that if you think of mutexes as truly indivisible atomic
> operations, the above is "obviously correct": the last person who got
> the mutex marked it for killing. But the fact is, the *next-to-last*
> mutex acquirer may still actively be in the *non-indivisible*
> mutex_unlock() when the last person frees it, resulting in a
> use-after-free. And yes, we've had this bug, and as far as I know it's
> possible that the RT code *introduces* this bug when it changes
> spinlocks into mutexes. Because we do exactly the above code sequence
> with spinlocks. So just replacing spinlocks with mutexes is a very
> subtly buggy thing to do in general.
>
> Another example of this kind of situation is using a mutex as a
> completion event: that's simply buggy. Again, it's because mutexes are
> complex data structures, and you have a very similar use-after-free
> issue. It's why the fundamental data structure for a "struct
> completion" is a spinlock, not a mutex.
>
> Again, in *theory*, a completion could be just a mutex that starts out
> locked, and then the completer completes it by just unlocking it. The
> person who waits for a completion does so by just asking for a lock.
> Obvious, simple, and trivial, right? Not so, because of the *exact*
> same issue above: the completer (who does an "unlock") may still be
> accessing the completion data structure when the completion waiter has
> successfully gotten the lock. So the standard thing of the completer
> freeing the underlying completion memory (which is often on the stack,
> so "freeing" is just he act of going out of scope of the liveness of
> the completion data structure) would not work if the completion was a
> mutex.
>
> This is subtle, and it is basically unavoidable. If a mutex (or
> counting semaphore) has a fast-path - and a mutex/semaphore without a
> fast-path is shit - then this issue will exist. Exactly because the
> fast-path will depend on just one part of the whole big mutex
> structure, and the slow-path will have other pieces to it.
>
> There might be reasonable ways to avoid this issue (having the
> fastpath locking field share memory with the slow-path locking, for
> example), but it's not how our semaphores and mutexes work, and I
> suspect it cannot be the case in general (because it limits you too
> badly in how to implement the mutex). As a result, this is all "by
> design" as opposed to being a bug.
So to safely free a structure containing a mutex, is there some better
approach than the following?
mutex_lock(mem->mutex);
kill_it = !--mem->refcount;
rcu_read_lock();
mutex_unlock(mem->mutex);
rcu_read_unlock();
if (kill_it)
kfree_rcu(mem, rh); /* rh is the rcu_head field in mem. */
For example, is there some other way to know that all the prior lock
releases have finished their post-release accesses?
Thanx, Paul
next prev parent reply other threads:[~2014-06-10 12:57 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-06-03 17:02 [BUG] signal: sighand unprotected when accessed by /proc Steven Rostedt
2014-06-03 17:26 ` Oleg Nesterov
2014-06-03 18:03 ` Linus Torvalds
2014-06-03 20:01 ` Oleg Nesterov
2014-06-03 20:03 ` Oleg Nesterov
2014-06-06 20:33 ` Paul E. McKenney
2014-06-08 13:07 ` safety of *mutex_unlock() (Was: [BUG] signal: sighand unprotected when accessed by /proc) Oleg Nesterov
2014-06-09 16:26 ` Paul E. McKenney
2014-06-09 18:15 ` Oleg Nesterov
2014-06-09 18:29 ` Steven Rostedt
2014-06-09 18:51 ` Linus Torvalds
2014-06-09 19:41 ` Steven Rostedt
2014-06-10 8:53 ` Thomas Gleixner
2014-06-10 16:57 ` Oleg Nesterov
2014-06-10 18:08 ` Thomas Gleixner
2014-06-10 18:13 ` Steven Rostedt
2014-06-10 20:05 ` Thomas Gleixner
2014-06-10 20:13 ` Thomas Gleixner
2014-06-11 15:52 ` Paul E. McKenney
2014-06-11 17:07 ` Oleg Nesterov
2014-06-11 17:17 ` Oleg Nesterov
2014-06-11 17:29 ` Paul E. McKenney
2014-06-11 17:59 ` Oleg Nesterov
2014-06-11 19:56 ` Paul E. McKenney
2014-06-12 17:28 ` Oleg Nesterov
2014-06-12 20:35 ` Paul E. McKenney
2014-06-12 21:40 ` Thomas Gleixner
2014-06-12 22:27 ` Paul E. McKenney
2014-06-12 23:19 ` Paul E. McKenney
2014-06-13 15:08 ` Oleg Nesterov
2014-06-15 5:40 ` Paul E. McKenney
2014-06-17 18:57 ` Paul E. McKenney
2014-06-18 16:43 ` Oleg Nesterov
2014-06-18 16:53 ` Steven Rostedt
2014-06-21 19:54 ` Thomas Gleixner
2014-06-18 17:00 ` Paul E. McKenney
2014-06-13 14:55 ` Oleg Nesterov
2014-06-13 16:10 ` Thomas Gleixner
2014-06-13 16:19 ` Oleg Nesterov
2014-06-13 14:52 ` Oleg Nesterov
2014-06-11 17:27 ` Paul E. McKenney
2014-06-10 17:07 ` Oleg Nesterov
2014-06-10 17:51 ` Thomas Gleixner
2014-06-10 12:56 ` Paul E. McKenney [this message]
2014-06-10 14:48 ` Peter Zijlstra
2014-06-10 15:18 ` Paul E. McKenney
2014-06-10 15:35 ` Linus Torvalds
2014-06-10 16:15 ` Paul E. McKenney
2014-06-09 19:04 ` Oleg Nesterov
2014-06-10 8:37 ` Peter Zijlstra
2014-06-10 12:52 ` Paul E. McKenney
2014-06-10 13:01 ` Peter Zijlstra
2014-06-10 14:36 ` Paul E. McKenney
2014-06-10 15:20 ` Paul E. McKenney
2014-06-03 20:05 ` [BUG] signal: sighand unprotected when accessed by /proc Steven Rostedt
2014-06-03 20:09 ` Oleg Nesterov
2014-06-03 20:15 ` Steven Rostedt
2014-06-03 20:25 ` Steven Rostedt
2014-06-03 21:12 ` Thomas Gleixner
2014-06-03 18:05 ` Steven Rostedt
2014-06-03 19:25 ` Oleg Nesterov
2014-06-04 1:16 ` Steven Rostedt
2014-06-04 16:31 ` Oleg Nesterov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140610125655.GJ4581@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=williams@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox