public inbox for linux-kernel@vger.kernel.org
From: ebiederm@xmission.com (Eric W. Biederman)
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	linux-kernel@vger.kernel.org, tglx@linutronix.de,
	Anna-Maria Gleixner <anna-maria@linutronix.de>
Subject: Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
Date: Fri, 04 May 2018 23:38:37 -0500
Message-ID: <87k1siaf8y.fsf@xmission.com>
In-Reply-To: <20180504203713.GY26088@linux.vnet.ibm.com> (Paul E. McKenney's message of "Fri, 4 May 2018 13:37:13 -0700")

"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:

> On Fri, May 04, 2018 at 03:08:40PM -0500, Eric W. Biederman wrote:
>> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
>> 
>> > On Fri, May 04, 2018 at 02:03:04PM -0500, Eric W. Biederman wrote:
>> >> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
>> >> 
>> >> > On Fri, May 04, 2018 at 12:17:20PM -0500, Eric W. Biederman wrote:
>> >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
>> >> >> 
>> >> >> > On 2018-05-04 11:59:08 [-0500], Eric W. Biederman wrote:
>> >> >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
>> >> >> >> > From: Anna-Maria Gleixner <anna-maria@linutronix.de>
>> >> >> > …
>> >> >> >> > This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make
>> >> >> >> > wait_lock irq safe") for a different reason.
>> >> >> >> 
>> >> >> >> Which tree has this change been made in?  I am not finding the commit
>> >> >> >> you mention above in Linus's tree.
>> >> >> >
>> >> >> > I'm sorry, it should have been commit b4abf91047cf ("rtmutex: Make
>> >> >> > wait_lock irq safe").
>> >> >> 
>> >> >> Can you fix that in your patch description and can you also update the
>> >> >> description of rcu_read_unlock?
>> >> >> 
>> >> >> If we don't need to jump through hoops it looks very reasonable to
>> >> >> remove this unnecessary logic.  But we should fix the description
>> >> >> in rcu_read_unlock that still says we need these hoops.
>> >> >
>> >> > The hoops are still required for rcu_read_lock(), otherwise you
>> >> > get deadlocks between the scheduler and RCU in PREEMPT=y kernels.
>> >> > What happens with this patch (if I understand it correctly) is that the
>> >> > signal code now uses a different way of jumping through the hoops.
>> >> > But the hoops are still jumped through.
>> >> 
>> >> The patch changes:
>> >> 
>> >> local_irq_disable();
>> >> rcu_read_lock();
>> >> spin_lock();
>> >> rcu_read_unlock();
>> >> 
>> >> to:
>> >> 
>> >> rcu_read_lock();
>> >> spin_lock_irq();
>> >> rcu_read_unlock();
>> >> 
>> >> Now that I have had a chance to reflect on it, the fact that the pattern
>> >> that is being restored does not work is scary, as the failure has
>> >> nothing to do with lock ordering and people won't realize what is going
>> >> on, especially since the common RCU modes won't care.
>> >> 
>> >> So is it true that taking spin_lock_irq before calling rcu_read_unlock
>> >> is a problem because of rt_mutex_unlock()?  Or has b4abf91047cf ("rtmutex: Make
>> >> wait_lock irq safe") actually fixed that and we can correct the
>> >> documentation of rcu_read_unlock() ?  And fix __lock_task_sighand?
>> >
>> > The problem is that the thing taking the lock might be the scheduler,
>> > or one of the locks taken while the scheduler's pi and rq locks are
>> > held.  This occurs only with RCU-preempt.
>> >
>> > Here is what can happen:
>> >
>> > o	A task does rcu_read_lock().
>> >
>> > o	That task is preempted.
>> >
>> > o	That task stays preempted for a long time, and is therefore
>> > 	priority boosted.  This boosting involves a high-priority RCU
>> > 	kthread creating an rt_mutex, pretending that the preempted task
>> > 	already holds it, and then acquiring it.
>> >
>> > o	The task awakens, acquires the scheduler's rq lock, and
>> > 	then does rcu_read_unlock().
>> >
>> > o	Because the task has been priority boosted, __rcu_read_unlock()
>> > 	invokes the rcu_read_unlock_special() slowpath, which does
>> > 	(as you say) rt_mutex_unlock() to deboost.  The deboosting
>> > 	can cause the scheduler to acquire the rq and pi locks, which
>> > 	results in deadlock.
>> >
>> > In contrast, holding these scheduler locks across the entirety of the
>> > RCU-preempt read-side critical section is harmless because then the
>> > critical section cannot be preempted, which means that priority boosting
>> > cannot happen, which means that there will be no need to deboost at
>> > rcu_read_unlock() time.
>> >
>> > This restriction has not changed, and as far as I can see is inherent
>> > in the fact that RCU uses the scheduler and the scheduler uses RCU.
>> > There is going to be an odd corner case in there somewhere!
>> 
>> However if I read things correctly b4abf91047cf ("rtmutex: Make
>> wait_lock irq safe") did change this.
>> 
>> In particular it changed things so that it is only the scheduler locks
>> that matter, not any old lock that disabled interrupts.  This was done
>> by disabling interrupts when taking the wait_lock.
>> 
>> The rcu_read_unlock documentation states:
>> 
>>  * In most situations, rcu_read_unlock() is immune from deadlock.
>>  * However, in kernels built with CONFIG_RCU_BOOST, rcu_read_unlock()
>>  * is responsible for deboosting, which it does via rt_mutex_unlock().
>>  * Unfortunately, this function acquires the scheduler's runqueue and
>>  * priority-inheritance spinlocks.  This means that deadlock could result
>>  * if the caller of rcu_read_unlock() already holds one of these locks or
>>  * any lock that is ever acquired while holding them; or any lock which
>>  * can be taken from interrupt context because rcu_boost()->rt_mutex_lock()
>>  * does not disable irqs while taking ->wait_lock.
>> 
>> So we can now remove the clause:
>>  * ; or any lock which
>>  * can be taken from interrupt context because rcu_boost()->rt_mutex_lock()
>>  * does not disable irqs while taking ->wait_lock.
>> 
>> Without the "any lock that disabled interrupts" restriction, it is now safe
>> to not worry about the issues with the scheduler locks and the rt_mutex,
>> which does make it safe to not worry about these crazy complexities in
>> lock_task_sighand.
>> 
>> Paul, do you agree, or is the patch unsafe?
>
> Ah, I thought you were trying to get rid of all but the first line of
> that paragraph, not just the final clause.  Apologies for my confusion!
>
> It looks plausible, but the patch should be stress-tested on a preemptible
> kernel with priority boosting enabled.  Has that been done?
>
> (Me, I would run rcutorture scenario TREE03 for an extended time period
> on b4abf91047cf with your patch applied.  But what testing have you
> done already?)

Not my patch.  I was just the reviewer asking for some obvious cleanups
so people won't get confused.  Sebastian?  Can you tell Paul what
testing you have done?

Eric
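
For readers following the thread, the two call orders quoted above can be
compared with a small userspace sketch.  This is a toy model under stated
assumptions, not kernel code: struct cpu_model, the model_*() helpers and
the run_*_pattern() functions are all hypothetical names invented for
illustration, and the "deboost" flag only records when the
rcu_read_unlock() slowpath described by Paul could run with interrupts
disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model of one CPU; these are NOT the kernel primitives. */
struct cpu_model {
	bool irqs_off;              /* local interrupts disabled */
	bool lock_held;             /* the single spinlock in the example */
	int  rcu_nesting;           /* RCU read-side nesting depth */
	bool preemptible_reader;    /* outermost reader began with irqs on */
	bool deboost_with_irqs_off; /* unlock slowpath could run with irqs off */
};

static void model_local_irq_disable(struct cpu_model *c)
{
	c->irqs_off = true;
}

static void model_rcu_read_lock(struct cpu_model *c)
{
	/* A reader that starts with irqs on can be preempted, hence boosted. */
	if (c->rcu_nesting++ == 0)
		c->preemptible_reader = !c->irqs_off;
}

static void model_spin_lock(struct cpu_model *c)
{
	assert(!c->lock_held);	/* the modeled lock is not recursive */
	c->lock_held = true;
}

static void model_spin_lock_irq(struct cpu_model *c)
{
	model_local_irq_disable(c);
	model_spin_lock(c);
}

static void model_rcu_read_unlock(struct cpu_model *c)
{
	assert(c->rcu_nesting > 0);
	/* Leaving the outermost section is where a boosted reader deboosts. */
	if (--c->rcu_nesting == 0 && c->preemptible_reader && c->irqs_off)
		c->deboost_with_irqs_off = true;
}

/* Old order: irqs are off before the read-side section starts. */
struct cpu_model run_old_pattern(void)
{
	struct cpu_model c = {0};

	model_local_irq_disable(&c);
	model_rcu_read_lock(&c);
	model_spin_lock(&c);
	model_rcu_read_unlock(&c);
	return c;
}

/* New order: the read-side section starts with irqs still enabled. */
struct cpu_model run_new_pattern(void)
{
	struct cpu_model c = {0};

	model_rcu_read_lock(&c);
	model_spin_lock_irq(&c);
	model_rcu_read_unlock(&c);
	return c;
}
```

Both orders end with the lock held, interrupts off and the RCU section
closed.  Only the new order sets deboost_with_irqs_off in this model, which
is the case the thread argues is acceptable once ->wait_lock is taken with
interrupts disabled, provided the lock involved is not one of the
scheduler's rq or pi locks.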

Thread overview: 17+ messages
2018-05-04 14:40 [PATCH] kernel/signal: Remove no longer required irqsave/restore Sebastian Andrzej Siewior
2018-05-04 16:59 ` Eric W. Biederman
2018-05-04 17:04   ` Sebastian Andrzej Siewior
2018-05-04 17:17     ` Eric W. Biederman
2018-05-04 17:52       ` Paul E. McKenney
2018-05-04 19:03         ` Eric W. Biederman
2018-05-04 19:45           ` Paul E. McKenney
2018-05-04 20:08             ` Eric W. Biederman
2018-05-04 20:37               ` Paul E. McKenney
2018-05-05  4:38                 ` Eric W. Biederman [this message]
2018-05-05  5:25                   ` Paul E. McKenney
2018-05-05  5:56                     ` Thomas Gleixner
2018-05-08 13:42                       ` Anna-Maria Gleixner
2018-05-08 14:53                         ` Paul E. McKenney
2018-05-08 15:49                           ` Anna-Maria Gleixner
2018-05-08 16:53                             ` Paul E. McKenney
2018-06-07 20:21 ` [tip:core/urgent] signal: " tip-bot for Anna-Maria Gleixner
