All of lore.kernel.org
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	rcu <rcu@vger.kernel.org>, LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>
Subject: Re: [RFC] Deadlock via recursive wakeup via RCU with threadirqs
Date: Thu, 27 Jun 2019 14:16:38 -0400	[thread overview]
Message-ID: <20190627181638.GA209455@google.com> (raw)
In-Reply-To: <20190627173831.GW26519@linux.ibm.com>

On Thu, Jun 27, 2019 at 10:38:31AM -0700, Paul E. McKenney wrote:
> On Thu, Jun 27, 2019 at 12:47:24PM -0400, Joel Fernandes wrote:
> > On Thu, Jun 27, 2019 at 11:55 AM Paul E. McKenney <paulmck@linux.ibm.com> wrote:
> > >
> > > On Thu, Jun 27, 2019 at 11:30:31AM -0400, Joel Fernandes wrote:
> > > > On Thu, Jun 27, 2019 at 10:34:55AM -0400, Steven Rostedt wrote:
> > > > > On Thu, 27 Jun 2019 10:24:36 -0400
> > > > > Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > >
> > > > > > > What am I missing here?
> > > > > >
> > > > > > This issue I think is
> > > > > >
> > > > > > (in normal process context)
> > > > > > spin_lock_irqsave(rq_lock); // which disables both preemption and interrupt
> > > > > >                      // but this was done in normal process context,
> > > > > >                      // not from IRQ handler
> > > > > > rcu_read_lock();
> > > > > >           <---------- IPI comes in and sets exp_hint
> > > > >
> > > > > How would an IPI come in here with interrupts disabled?
> > > > >
> > > > > -- Steve
> > > >
> > > > This is true, could it be rcu_read_unlock_special() got called for some
> > > > *other* reason other than the IPI then?
> > > >
> > > > Per Sebastian's stack trace of the recursive lock scenario, it is happening
> > > > during cpu_acct_charge() which is called with the rq_lock held.
> > > >
> > > > The only other reasons I know off to call rcu_read_unlock_special() are if
> > > > 1. the tick indicated that the CPU has to report a QS
> > > > 2. an IPI in the middle of the reader section for expedited GPs
> > > > 3. preemption in the middle of a preemptible RCU reader section
> > >
> > > 4. Some previous reader section was IPIed or preempted, but either
> > >    interrupts, softirqs, or preemption was disabled across the
> > >    rcu_read_unlock() of that previous reader section.
> > 
> > Hi Paul, I did not fully understand 4. The previous RCU reader section
> > could not have been IPI'ed or been preempted if interrupts were
> > disabled across. Also, if softirq/preempt is disabled across the
> > previous reader section, the previous reader could not be preempted in
> > these case.
> 
> Like this, courtesy of the consolidation of RCU flavors:
> 
> 	previous_reader()
> 	{
> 		rcu_read_lock();
> 		do_something(); /* Preemption happened here. */
> 		local_irq_disable(); /* Cannot be the scheduler! */
> 		do_something_else();
> 		rcu_read_unlock();  /* Must defer QS, task still queued. */
> 		do_some_other_thing();
> 		local_irq_enable();
> 	}
> 
> 	current_reader() /* QS from previous_reader() is still deferred. */
> 	{
> 		local_irq_disable();  /* Might be the scheduler. */
> 		do_whatever();
> 		rcu_read_lock();
> 		do_whatever_else();
> 		rcu_read_unlock();  /* Must still defer reporting QS. */
> 		do_whatever_comes_to_mind();
> 		local_irq_enable();
> 	}
> 
> Both instances of rcu_read_unlock() need to cause some later thing
> to report the quiescent state, and in some cases it will do a wakeup.
> Now, previous_reader()'s IRQ disabling cannot be due to scheduler rq/pi
> locks due to the rule about holding them across the entire RCU reader
> if they are held across the rcu_read_unlock().  But current_reader()'s
> IRQ disabling might well be due to the scheduler rq/pi locks, so
> current_reader() must be careful about doing wakeups.

Makes sense now, thanks.

> > That leaves us with the only scenario where the previous reader was
> > IPI'ed while softirq/preempt was disabled across it. Is that what you
> > meant?
> 
> No, but that can also happen.
> 
> >        But in this scenario, the previous reader should have set
> > exp_hint to false in the previous reader's rcu_read_unlock_special()
> > invocation itself. So I would think t->rcu_read_unlock_special should
> > be 0 during the new reader's invocation thus I did not understand how
> > rcu_read_unlock_special can be called because of a previous reader.
> 
> Yes, exp_hint would unconditionally be set to false in the first
> reader's rcu_read_unlock().  But .blocked won't be.

Makes sense.

> > I'll borrow some of that confused color paint if you don't mind ;-)
> > And we should document this somewhere for future sanity preservation
> > :-D
> 
> Or adjust the code and requirements to make it more sane, if feasible.
> 
> My current (probably wildly unreliable) guess that the conditions in
> rcu_read_unlock_special() need adjusting.  I was assuming that in_irq()
> implies a hardirq context, in other words that in_irq() would return
> false from a threaded interrupt handler.  If in_irq() instead returns
> true from within a threaded interrupt handler, then this code in
> rcu_read_unlock_special() needs fixing:
> 
> 		if ((exp || in_irq()) && irqs_were_disabled && use_softirq &&
> 		    (in_irq() || !t->rcu_read_unlock_special.b.deferred_qs)) {
> 			// Using softirq, safe to awaken, and we get
> 			// no help from enabling irqs, unlike bh/preempt.
> 			raise_softirq_irqoff(RCU_SOFTIRQ);
> 
> The fix would be replacing the calls to in_irq() with something that
> returns true only if called from within a hardirq context.
> Thoughts?

I am not sure if this will fix all cases though?

I think the crux of the problem is doing a recursive wake up. The threaded
IRQ probably just happens to be causing it here, it seems to me this problem
can also occur on a non-threaded irq system (say current_reader() in your
example executed in a scheduler path in process-context and not from an
interrupt). Is that not possible?

I think the fix should be to prevent the wake-up not based on whether we are
in hard/soft-interrupt mode but that we are doing the rcu_read_unlock() from
a scheduler path (if we can detect that)

I lost track of this code:
 		if ((exp || in_irq()) && irqs_were_disabled && use_softirq &&
 		    (in_irq() || !t->rcu_read_unlock_special.b.deferred_qs)) {

Was this patch posted to the list? I will blame it to try to get some
context. It sounds like you added more conditions on when to kick the
softirq.

> Ugh.  Same question about IRQ work.  Will the current use of it by
> rcu_read_unlock_special() cause breakage in the presence of threaded
> interrupt handlers?

/me needs to understand why the irq work stuff was added here as well. Have
my work cut out for the day! ;-)

thanks,

 - Joel


> 
> 							Thanx, Paul
> 
> > thanks,
> >  - Joel
> > 
> > 
> > 
> > >
> > > I -think- that this is what Sebastian is seeing.
> > >
> > >                                                         Thanx, Paul
> > >
> > > > 1. and 2. are not possible because interrupts are disabled, that's why the
> > > > wakeup_softirq even happened.
> > > > 3. is not possible because we are holding rq_lock in the RCU reader section.
> > > >
> > > > So I am at a bit of a loss how this can happen :-(
> > > >
> > > > Spurious call to rcu_read_unlock_special() may be when it should not have
> > > > been called?
> > > >
> > > > thanks,
> > > >
> > > > - Joel
> 

  reply	other threads:[~2019-06-27 18:16 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-26 13:54 [RFC] Deadlock via recursive wakeup via RCU with threadirqs Sebastian Andrzej Siewior
2019-06-26 16:25 ` Paul E. McKenney
2019-06-27  7:47   ` Sebastian Andrzej Siewior
2019-06-27 15:52     ` Paul E. McKenney
2019-06-27 14:24   ` Joel Fernandes
2019-06-27 14:34     ` Steven Rostedt
2019-06-27 15:30       ` Joel Fernandes
2019-06-27 15:37         ` Joel Fernandes
2019-06-27 15:40           ` Sebastian Andrzej Siewior
2019-06-27 15:42             ` Joel Fernandes
2019-06-27 17:43             ` Joel Fernandes
2019-06-27 17:46               ` Joel Fernandes
2019-06-27 18:11                 ` Paul E. McKenney
2019-06-27 18:27                   ` Joel Fernandes
2019-06-27 18:51                     ` Joel Fernandes
2019-06-27 19:14                       ` Paul E. McKenney
2019-06-27 19:15                     ` Paul E. McKenney
2019-06-27 18:30                   ` Paul E. McKenney
2019-06-27 20:45                     ` Paul E. McKenney
2019-06-27 15:55         ` Paul E. McKenney
2019-06-27 16:47           ` Joel Fernandes
2019-06-27 17:38             ` Paul E. McKenney
2019-06-27 18:16               ` Joel Fernandes [this message]
2019-06-27 18:41                 ` Paul E. McKenney
2019-06-27 20:17                   ` Scott Wood
2019-06-27 20:36                     ` Paul E. McKenney
2019-06-28  7:31                       ` Byungchul Park
2019-06-28  7:43                         ` Byungchul Park
2019-06-28  8:14                           ` Byungchul Park
2019-06-28  8:24                             ` Byungchul Park
2019-06-28 12:24                               ` Paul E. McKenney
2019-06-28  9:10                           ` Byungchul Park
2019-06-28  9:28                             ` Byungchul Park
2019-06-28 12:21                           ` Paul E. McKenney
2019-06-28 10:40                         ` Byungchul Park
2019-06-28 12:27                           ` Paul E. McKenney
2019-06-28 15:44                           ` Steven Rostedt
2019-06-29 15:12                             ` Andrea Parri
2019-06-29 16:55                               ` Paul E. McKenney
2019-06-29 18:09                                 ` Andrea Parri
2019-06-29 18:21                                   ` Andrea Parri
2019-06-29 19:15                                   ` Paul E. McKenney
2019-06-29 19:35                                     ` Andrea Parri
2019-06-30 23:55                             ` Byungchul Park
2019-06-28 14:15                       ` Peter Zijlstra
2019-06-28 15:54                         ` Paul E. McKenney
2019-06-28 16:04                           ` Peter Zijlstra
2019-06-28 17:20                             ` Paul E. McKenney
2019-07-01  9:42                               ` Peter Zijlstra
2019-07-01 10:24                                 ` Sebastian Andrzej Siewior
2019-07-01 12:23                                   ` Paul E. McKenney
2019-07-01 14:00                                     ` Peter Zijlstra
2019-07-01 16:01                                       ` Paul E. McKenney
2019-06-28 20:01                         ` Scott Wood
2019-07-01  9:45                           ` Peter Zijlstra
2019-06-28 13:54                   ` Peter Zijlstra
2019-06-28 15:30                     ` Paul E. McKenney
2019-06-28 18:40                       ` Sebastian Andrzej Siewior
2019-06-28 18:52                         ` Paul E. McKenney
2019-06-28 19:24                           ` Joel Fernandes
2019-06-28 20:04                             ` Paul E. McKenney
2019-06-28 21:40                               ` Joel Fernandes
2019-06-28 22:25                                 ` Paul E. McKenney
2019-06-28 23:12                                   ` Joel Fernandes
2019-06-29  0:06                                     ` Paul E. McKenney
2019-06-28 16:40                   ` Joel Fernandes
2019-06-28 16:45                     ` Joel Fernandes
2019-06-28 17:30                       ` Paul E. McKenney
2019-06-28 17:41                         ` Paul E. McKenney
2019-06-28 17:45                         ` Sebastian Andrzej Siewior
2019-06-28 18:07                           ` Joel Fernandes
2019-06-28 18:20                             ` Sebastian Andrzej Siewior
2019-07-01  2:08                               ` Joel Fernandes
2019-06-28 18:22                           ` Paul E. McKenney
2019-06-28 19:29                             ` Joel Fernandes
2019-06-28 20:06                               ` Paul E. McKenney
2019-06-28 18:05                         ` Joel Fernandes
2019-06-28 18:23                           ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190627181638.GA209455@google.com \
    --to=joel@joelfernandes.org \
    --cc=bigeasy@linutronix.de \
    --cc=jiangshanlai@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@redhat.com \
    --cc=paulmck@linux.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.