Re: lock_task_sighand() && rcu_boost()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: lock_task_sighand() && rcu_boost()
Date: Sun, 4 May 2014 15:38:04 -0700	[thread overview]
Message-ID: <20140504223804.GF8754@linux.vnet.ibm.com> (raw)
In-Reply-To: <20140504191757.GA11319@redhat.com>

On Sun, May 04, 2014 at 09:17:57PM +0200, Oleg Nesterov wrote:
> On 05/04, Paul E. McKenney wrote:
> >
> > On Sat, May 03, 2014 at 06:11:33PM +0200, Oleg Nesterov wrote:
> > >
> > > OK, if we can't rcu_read_unlock() with irqs disabled, then we can at least
> > > cleanup it (and document the problem).
> >
> > Just to clarify (probably unnecessarily), it is OK to invoke rcu_read_unlock()
> > with irqs disabled, but only if preemption has been disabled throughout
> > the entire RCU read-side critical section.
> 
> Yes, yes, I understand, thanks.
> 
> > > and add rcu_read_unlock() into unlock_task_sighand().
> >
> > That should also work.
> 
> OK.
> 
> > > But. I simply can't understand why lockdep should complain? Why it is bad
> > > to lock/unlock ->wait_lock with irqs disabled?
> >
> > Well, lockdep doesn't -always- complain, and some cases are OK.
> >
> > The problem is that if the RCU read-side critical section has been
> > preempted, and if this task gets RCU priority-boosted in the meantime,
> > then the task will need to acquire scheduler rq and pi locks at
> > rcu_read_unlock() time.
> 
> Yes,
> 
> > If the reason that interrupts are disabled at
> > rcu_read_unlock() time is that either rq or pi locks are held (or some
> > other locks are held that are normally acquired while holding rq or
> > pi locks), then we can deadlock.  And lockdep will of course complain.
> 
> Yes. but not in this case?
> 
> > If I recall corectly, at one point, the ->siglock lock was acquired
> > while holding the rq locks, which would have resulted in lockdep
> > complaints.
> 
> No, this must not be possible. signal_wake_up_state() was always called
> under ->siglock and it does wake_up_state() which takes rq/pi locks.
> 
> And if lock_task_sighand() is preempted after rcu_read_lock(), then the
> caller doesn't hold any lock.
> 
> So perhaps we can revert a841796f11c90d53 ?

Or just update it, your choice.

> Otherwise please see below.
> 
> > Hmmm...  A better description of the bad case might be as follows:
> >
> > 	Deadlock can occur if you have an RCU read-side critical
> > 	section that is anywhere preemptible, and where the outermost
> > 	rcu_read_unlock() is invoked while holding and lock acquired
> > 	by either wakeup_next_waiter() or rt_mutex_adjust_prio(),
> > 	or while holding any lock that is ever acquired while holding
> > 	one of those locks.
> >
> > Does that help?
> >
> > Avoiding this bad case could be a bit ugly, as it is a dynamic set
> > of locks that is acquired while holding any lock acquired by either
> > wakeup_next_waiter() or rt_mutex_adjust_prio().  So I simplified the
> > rule by prohibiting invoking rcu_read_unlock() with irqs disabled
> > if the RCU read-side critical section had ever been preemptible.
> 
> OK, if you prefer to enforce this rule even if (say) lock_task_sighand()
> is fine, then it needs the comment. And a cleanup ;)

Please see below for a proposed comment.  Thinking more about it, I list
both rules and leave the choice to the caller.  Please see the end of
this email for a patch adding a comment to rcu_read_unlock().

> We can move rcu_read_unlock() into unlock_task_sighand() as I suggested
> before, or we can simply add preempt_disable/enable into lock_(),
> 
> 	struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
> 						   unsigned long *flags)
> 	{
> 		struct sighand_struct *sighand;
> 		/*
> 		 * COMMENT TO EXPLAIN WHY
> 		 */
> 		preempt_disable();
> 		rcu_read_lock();
> 		for (;;) {
> 			sighand = rcu_dereference(tsk->sighand);
> 			if (unlikely(sighand == NULL))
> 				break;
> 
> 			spin_lock_irqsave(&sighand->siglock, *flags);
> 			if (likely(sighand == tsk->sighand))
> 				break;
> 			spin_unlock_irqrestore(&sighand->siglock, *flags);
> 		}
> 		rcu_read_unlock();
> 		preempt_enable();
> 
> 		return sighand;
> 	}
> 
> The only problem is the "COMMENT" above. Perhaps the "prohibit invoking
> rcu_read_unlock() with irqs disabled if ..." rule should documented
> near/above rcu_read_unlock() ? In this case that COMMENT could simply
> say "see the comment above rcu_read_unlock()".
> 
> What do you think?

Looks good to me!

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index ca6fe55913b7..17ac3c63415f 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -884,6 +884,27 @@ static inline void rcu_read_lock(void)
 /**
  * rcu_read_unlock() - marks the end of an RCU read-side critical section.
  *
+ * In most situations, rcu_read_unlock() is immune from deadlock.
+ * However, in kernels built with CONFIG_RCU_BOOST, rcu_read_unlock()
+ * is responsible for deboosting, which it does via rt_mutex_unlock().
+ * However, this function acquires the scheduler's runqueue and
+ * priority-inheritance spinlocks.  Thus, deadlock could result if the
+ * caller of rcu_read_unlock() already held one of these locks or any lock
+ * acquired while holding them.
+ *
+ * That said, RCU readers are never priority boosted unless they were
+ * preempted.  Therefore, one way to avoid deadlock is to make sure
+ * that preemption never happens within any RCU read-side critical
+ * section whose outermost rcu_read_unlock() is called with one of
+ * rt_mutex_unlock()'s locks held.
+ *
+ * Given that the set of locks acquired by rt_mutex_unlock() might change
+ * at any time, a somewhat more future-proofed approach is to make sure that
+ * that preemption never happens within any RCU read-side critical
+ * section whose outermost rcu_read_unlock() is called with one of
+ * irqs disabled.  This approach relies on the fact that rt_mutex_unlock()
+ * currently only acquires irq-disabled locks.
+ *
  * See rcu_read_lock() for more information.
  */
 static inline void rcu_read_unlock(void)

next prev parent reply	other threads:[~2014-05-04 22:38 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-03 16:11 lock_task_sighand() && rcu_boost() Oleg Nesterov
2014-05-04 18:01 ` Paul E. McKenney
2014-05-04 19:17   ` Oleg Nesterov
2014-05-04 22:38     ` Paul E. McKenney [this message]
2014-05-05 13:26       ` Oleg Nesterov
2014-05-05 15:26         ` Paul E. McKenney
2014-05-05 16:47           ` Oleg Nesterov
2014-05-05 18:53             ` [PATCH] signal: Simplify __lock_task_sighand() Oleg Nesterov
2014-05-05 19:55               ` Oleg Nesterov
2014-05-05 20:56               ` Paul E. McKenney

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:ca6fe55913b dfblob:17ac3c63415 )
 OR (
bs:"Re: lock_task_sighand() && rcu_boost()" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140504223804.GF8754@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.