From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Oleg Nesterov <oleg@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
Ingo Molnar <mingo@kernel.org>,
Clark Williams <williams@redhat.com>
Subject: Re: safety of *mutex_unlock() (Was: [BUG] signal: sighand unprotected when accessed by /proc)
Date: Thu, 12 Jun 2014 15:27:48 -0700 [thread overview]
Message-ID: <20140612222748.GF4581@linux.vnet.ibm.com> (raw)
In-Reply-To: <alpine.DEB.2.10.1406122337060.5170@nanos>
On Thu, Jun 12, 2014 at 11:40:07PM +0200, Thomas Gleixner wrote:
> On Thu, 12 Jun 2014, Paul E. McKenney wrote:
> > On Thu, Jun 12, 2014 at 07:28:44PM +0200, Oleg Nesterov wrote:
> > > On 06/11, Paul E. McKenney wrote:
> > > >
> > > > On Wed, Jun 11, 2014 at 07:59:34PM +0200, Oleg Nesterov wrote:
> > > > > On 06/11, Paul E. McKenney wrote:
> > > > > >
> > > > > > I was thinking of ->boost_completion as the way to solve it easily, but
> > > > > > what did you have in mind?
> > > > >
> > > > > I meant, rcu_boost() could probably just do "mtx->owner = t", we know that
> > > > > it was unlocked by us and nobody else can use it until we set
> > > > > t->rcu_boost_mutex.
> > > >
> > > > My concern with this is that rcu_read_unlock_special() could hypothetically
> > > > get preempted (either by kernel or hypervisor), so that it might be a long
> > > > time until it makes its reference. But maybe that reference would be
> > > > harmless in this case.
> > >
> > > Confused... Not sure I understand what did you mean, and certainly I do not
> > > understand how this connects to the proxy-locking method.
> > >
> > > Could you explain?
> >
> > Here is the hypothetical sequence of events, which cannot happen unless
> > the CPU releasing the lock accesses the structure after releasing
> > the lock:
> >
> > CPU 0 CPU 1 (booster)
> >
> > releases boost_mutex
> >
> > acquires boost_mutex
> > releases boost_mutex
> > post-release boost_mutex access?
> > Loops to next task to boost
> > proxy-locks boost_mutex
> >
> > post-release boost_mutex access:
> > confused due to proxy-lock
> > operation?
> >
> > Now maybe this ends up being safe, but it sure feels like an accident
> > waiting to happen. Some bright developer comes up with a super-fast
> > handoff, and blam, RCU priority boosting takes it in the shorts. ;-)
> > In contrast, using the completion prevents this.
> >
> > > > > And if we move it into rcu_node, then we can probably kill ->rcu_boost_mutex,
> > > > > rcu_read_unlock_special() could check rnp->boost_mutex->owner == current.
> > > >
> > > > If this was anywhere near a hot code path, I would be sorely tempted.
> > >
> > > Ah, but I didn't mean perfomance. I think it is always good to try to remove
> > > something from task_struct, it is huge. I do not mean sizeof() in the first
> > > place, the very fact that I can hardly understand the purpose of a half of its
> > > members makes me sad ;)
> >
> > Now -that- just might make a huge amount of sense! Let's see...
> >
> > o We hold the rcu_node structure's ->lock when checking the owner
> > (looks like rt_mutex_owner() is the right API).
> >
> > o We hold the rcu_node structure's ->lock when doing
> > rt_mutex_init_proxy_locked().
> >
> > o We -don't- hold ->lock when releasing the rt_mutex, but that
> > should be OK: The owner is releasing it, and it is going to
> > not-owned, so no other task can possibly see ownership moving
> > to/from them.
> >
> > o The rcu_node structure grows a bit, but not enough to worry
> > about, and on most systems, the decrease in task_struct size
> > will more than outweigh the increase in rcu_node size.
> >
> > Looks quite promising, how about the following? (Hey, it builds, so it
> > must be correct, right?)
>
> True. Why should we have users if we would test the crap we produce?
Well, it seems to be passing initial tests as well. Must be my tests
need more work.
> Just FYI, I have a patch pending which makes the release safe.
>
> http://marc.info/?l=linux-kernel&m=140251240630730&w=2
Very good, belt -and- suspenders! Might even work that way. ;-)
I could argue for a cpu_relax() in the "while (!rt_mutex_has_waiters(lock))"
loop for the case in which the CPU enqueuing itself is preempted, possibly
by a hypervisor, but either way:
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Thanx, Paul
next prev parent reply other threads:[~2014-06-12 22:27 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-06-03 17:02 [BUG] signal: sighand unprotected when accessed by /proc Steven Rostedt
2014-06-03 17:26 ` Oleg Nesterov
2014-06-03 18:03 ` Linus Torvalds
2014-06-03 20:01 ` Oleg Nesterov
2014-06-03 20:03 ` Oleg Nesterov
2014-06-06 20:33 ` Paul E. McKenney
2014-06-08 13:07 ` safety of *mutex_unlock() (Was: [BUG] signal: sighand unprotected when accessed by /proc) Oleg Nesterov
2014-06-09 16:26 ` Paul E. McKenney
2014-06-09 18:15 ` Oleg Nesterov
2014-06-09 18:29 ` Steven Rostedt
2014-06-09 18:51 ` Linus Torvalds
2014-06-09 19:41 ` Steven Rostedt
2014-06-10 8:53 ` Thomas Gleixner
2014-06-10 16:57 ` Oleg Nesterov
2014-06-10 18:08 ` Thomas Gleixner
2014-06-10 18:13 ` Steven Rostedt
2014-06-10 20:05 ` Thomas Gleixner
2014-06-10 20:13 ` Thomas Gleixner
2014-06-11 15:52 ` Paul E. McKenney
2014-06-11 17:07 ` Oleg Nesterov
2014-06-11 17:17 ` Oleg Nesterov
2014-06-11 17:29 ` Paul E. McKenney
2014-06-11 17:59 ` Oleg Nesterov
2014-06-11 19:56 ` Paul E. McKenney
2014-06-12 17:28 ` Oleg Nesterov
2014-06-12 20:35 ` Paul E. McKenney
2014-06-12 21:40 ` Thomas Gleixner
2014-06-12 22:27 ` Paul E. McKenney [this message]
2014-06-12 23:19 ` Paul E. McKenney
2014-06-13 15:08 ` Oleg Nesterov
2014-06-15 5:40 ` Paul E. McKenney
2014-06-17 18:57 ` Paul E. McKenney
2014-06-18 16:43 ` Oleg Nesterov
2014-06-18 16:53 ` Steven Rostedt
2014-06-21 19:54 ` Thomas Gleixner
2014-06-18 17:00 ` Paul E. McKenney
2014-06-13 14:55 ` Oleg Nesterov
2014-06-13 16:10 ` Thomas Gleixner
2014-06-13 16:19 ` Oleg Nesterov
2014-06-13 14:52 ` Oleg Nesterov
2014-06-11 17:27 ` Paul E. McKenney
2014-06-10 17:07 ` Oleg Nesterov
2014-06-10 17:51 ` Thomas Gleixner
2014-06-10 12:56 ` Paul E. McKenney
2014-06-10 14:48 ` Peter Zijlstra
2014-06-10 15:18 ` Paul E. McKenney
2014-06-10 15:35 ` Linus Torvalds
2014-06-10 16:15 ` Paul E. McKenney
2014-06-09 19:04 ` Oleg Nesterov
2014-06-10 8:37 ` Peter Zijlstra
2014-06-10 12:52 ` Paul E. McKenney
2014-06-10 13:01 ` Peter Zijlstra
2014-06-10 14:36 ` Paul E. McKenney
2014-06-10 15:20 ` Paul E. McKenney
2014-06-03 20:05 ` [BUG] signal: sighand unprotected when accessed by /proc Steven Rostedt
2014-06-03 20:09 ` Oleg Nesterov
2014-06-03 20:15 ` Steven Rostedt
2014-06-03 20:25 ` Steven Rostedt
2014-06-03 21:12 ` Thomas Gleixner
2014-06-03 18:05 ` Steven Rostedt
2014-06-03 19:25 ` Oleg Nesterov
2014-06-04 1:16 ` Steven Rostedt
2014-06-04 16:31 ` Oleg Nesterov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140612222748.GF4581@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=williams@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.