linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RT 0/3] rt/rcu/swait: Fix boot up when RCU NOCB_ALL is enabled
@ 2013-08-19 15:35 Steven Rostedt
  2013-08-19 15:35 ` [PATCH RT 1/3] rcu/swait: Fix RCU conversion of wake_up_all() to swait_wake() Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Steven Rostedt @ 2013-08-19 15:35 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Clark Williams, Paul E. McKenney


Thomas and Sebastian,

These three patches fixes the boot problem that Carsten was having
with 3.10-rt and RCU_NOCB_ALL

Steven Rostedt (3):
      rcu/swait: Fix RCU conversion of wake_up_all() to swait_wake()
      swait: Add memory barrier before checking list empty
      swait: Add smp_mb() after setting h->list

----
 kernel/rcutree_plugin.h |    2 +-
 kernel/wait-simple.c    |    4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH RT 1/3] rcu/swait: Fix RCU conversion of wake_up_all() to swait_wake()
  2013-08-19 15:35 [PATCH RT 0/3] rt/rcu/swait: Fix boot up when RCU NOCB_ALL is enabled Steven Rostedt
@ 2013-08-19 15:35 ` Steven Rostedt
  2013-08-19 15:35 ` [PATCH RT 2/3] swait: Add memory barrier before checking list empty Steven Rostedt
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2013-08-19 15:35 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Clark Williams, Paul E. McKenney

[-- Attachment #1: 0001-rcu-swait-Fix-RCU-conversion-of-wake_up_all-to-swait.patch --]
[-- Type: text/plain, Size: 1055 bytes --]

From: Steven Rostedt <rostedt@goodmis.org>

Reverting the rcu swait patches fixed the boot problem. Then when I was
looking at the revert itself, this stood out like a sore thumb.

 static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp) {
-       swait_wake(&rnp->nocb_gp_wq[rnp->completed & 0x1]);
+       wake_up_all(&rnp->nocb_gp_wq[rnp->completed & 0x1]);
 }

See the problem there?

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/rcutree_plugin.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 684b762..dc0c4b2 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -2036,7 +2036,7 @@ static int rcu_nocb_needs_gp(struct rcu_state *rsp)
  */
 static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
 {
-	swait_wake(&rnp->nocb_gp_wq[rnp->completed & 0x1]);
+	swait_wake_all(&rnp->nocb_gp_wq[rnp->completed & 0x1]);
 }
 
 /*
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH RT 2/3] swait: Add memory barrier before checking list empty
  2013-08-19 15:35 [PATCH RT 0/3] rt/rcu/swait: Fix boot up when RCU NOCB_ALL is enabled Steven Rostedt
  2013-08-19 15:35 ` [PATCH RT 1/3] rcu/swait: Fix RCU conversion of wake_up_all() to swait_wake() Steven Rostedt
@ 2013-08-19 15:35 ` Steven Rostedt
  2013-08-19 15:51   ` Steven Rostedt
  2013-08-19 15:35 ` [PATCH RT 3/3] swait: Add smp_mb() after setting h->list Steven Rostedt
  2013-08-21 13:41 ` [PATCH RT 0/3] rt/rcu/swait: Fix boot up when RCU NOCB_ALL is enabled Sebastian Andrzej Siewior
  3 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2013-08-19 15:35 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Clark Williams, Paul E. McKenney

[-- Attachment #1: 0002-swait-Add-memory-barrier-before-checking-list-empty.patch --]
[-- Type: text/plain, Size: 1587 bytes --]

From: Steven Rostedt <rostedt@goodmis.org>

There's a race condition with swait wakeups and adding to the list. The
__swait_wake() does a check for swait_head_has_waiters(), and if it is
empty it will exit without doing any wake ups. The problem is that the
check does not include any memory barriers before it makes a decision
to wake up or not.

	CPU0				CPU1
	----				----

  condition = 1

  load h->list (is empty)
				    raw_spin_lock(hlist->lock)
				    hlist_add();
				    __set_current_state();
				    raw_spin_unlock(hlist->lock)
  swait_wake()
   swait_head_has_waiters()
   (sees h->list as empty and returns)

				    check_condition (sees condition = 0)

   store condition = 1

				    schedule()

Now the task on CPU1 has just missed its wakeup. By adding a memory
barrier before the list empty check, we fix the problem of miss seeing
the list not empty as well as pushing out the condition for the other
task to see.

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/wait-simple.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/wait-simple.c b/kernel/wait-simple.c
index 4b9a0b5..9725a11 100644
--- a/kernel/wait-simple.c
+++ b/kernel/wait-simple.c
@@ -27,6 +27,8 @@ static inline void __swait_dequeue(struct swaiter *w)
 /* Check whether a head has waiters enqueued */
 static inline bool swait_head_has_waiters(struct swait_head *h)
 {
+	/* Make sure the condition is visible before checking list_empty() */
+	smp_mb();
 	return !list_empty(&h->list);
 }
 
-- 
1.7.10.4



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH RT 3/3] swait: Add smp_mb() after setting h->list
  2013-08-19 15:35 [PATCH RT 0/3] rt/rcu/swait: Fix boot up when RCU NOCB_ALL is enabled Steven Rostedt
  2013-08-19 15:35 ` [PATCH RT 1/3] rcu/swait: Fix RCU conversion of wake_up_all() to swait_wake() Steven Rostedt
  2013-08-19 15:35 ` [PATCH RT 2/3] swait: Add memory barrier before checking list empty Steven Rostedt
@ 2013-08-19 15:35 ` Steven Rostedt
  2013-08-21 13:41 ` [PATCH RT 0/3] rt/rcu/swait: Fix boot up when RCU NOCB_ALL is enabled Sebastian Andrzej Siewior
  3 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2013-08-19 15:35 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Clark Williams, Paul E. McKenney

[-- Attachment #1: 0003-swait-Add-smp_mb-after-setting-h-list.patch --]
[-- Type: text/plain, Size: 1387 bytes --]

From: Steven Rostedt <rostedt@goodmis.org>

The raw_spin_unlock() is not a full memory barrier. It only keeps
things from leaking past it, but does not prevent leaks from entering
the critical section. That is:

 	p = 1;

 	raw_spin_lock();
 	[...]
 	raw_spin_unlock();

 	y = x

 Can turn into:

  	p = 1;

 	raw_spin_lock();

 	load x

 	store p = 1

 	raw_spin_unlock();

 	y = x

This means that the condition check in __swait_event() (and friends)
can be seen before the h->list is set.

 	raw_spin_lock();

 	load condition;

 	store h->list;

 	raw_spin_unlock();

And the other CPU can see h->list as empty, and this CPU see condition
as not set, and possibly miss the wake up.

To prevent this from happening, add an mb() after setting the h->list.

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/wait-simple.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/wait-simple.c b/kernel/wait-simple.c
index 9725a11..2c85626 100644
--- a/kernel/wait-simple.c
+++ b/kernel/wait-simple.c
@@ -16,6 +16,8 @@
 static inline void __swait_enqueue(struct swait_head *head, struct swaiter *w)
 {
 	list_add(&w->node, &head->list);
+	/* We can't let the condition leak before the setting of head */
+	smp_mb();
 }
 
 /* Removes w from head->list. Must be called with head->lock locked. */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH RT 2/3] swait: Add memory barrier before checking list empty
  2013-08-19 15:35 ` [PATCH RT 2/3] swait: Add memory barrier before checking list empty Steven Rostedt
@ 2013-08-19 15:51   ` Steven Rostedt
  2013-08-19 16:49     ` Steven Rostedt
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2013-08-19 15:51 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	Sebastian Andrzej Siewior, John Kacur, Clark Williams,
	Paul E. McKenney

On Mon, 19 Aug 2013 11:35:32 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: Steven Rostedt <rostedt@goodmis.org>
> 
> There's a race condition with swait wakeups and adding to the list. The
> __swait_wake() does a check for swait_head_has_waiters(), and if it is
> empty it will exit without doing any wake ups. The problem is that the
> check does not include any memory barriers before it makes a decision
> to wake up or not.
> 
> 	CPU0				CPU1
> 	----				----
> 
>   condition = 1
> 
>   load h->list (is empty)
> 				    raw_spin_lock(hlist->lock)
> 				    hlist_add();
> 				    __set_current_state();
> 				    raw_spin_unlock(hlist->lock)
>   swait_wake()
>    swait_head_has_waiters()
>    (sees h->list as empty and returns)
> 

BTW, the race still exists if you move the raw_spin_unlock(hlist->lock)
above to here. That is:

   swait_wake()
    swait_head_has_waiters()
     (sees h->list as empty and returns)

				    raw_spin_unlock(hlist->lock)


Maybe this will help to understand it more.

-- Steve

> 				    check_condition (sees condition = 0)
> 
>    store condition = 1
> 
> 				    schedule()
> 
> Now the task on CPU1 has just missed its wakeup. By adding a memory
> barrier before the list empty check, we fix the problem of miss seeing
> the list not empty as well as pushing out the condition for the other
> task to see.
> 
> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RT 2/3] swait: Add memory barrier before checking list empty
  2013-08-19 15:51   ` Steven Rostedt
@ 2013-08-19 16:49     ` Steven Rostedt
  0 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2013-08-19 16:49 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	Sebastian Andrzej Siewior, John Kacur, Clark Williams,
	Paul E. McKenney

On Mon, 19 Aug 2013 11:51:55 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Mon, 19 Aug 2013 11:35:32 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > From: Steven Rostedt <rostedt@goodmis.org>
> > 
> > There's a race condition with swait wakeups and adding to the list. The
> > __swait_wake() does a check for swait_head_has_waiters(), and if it is
> > empty it will exit without doing any wake ups. The problem is that the
> > check does not include any memory barriers before it makes a decision
> > to wake up or not.
> > 
> > 	CPU0				CPU1
> > 	----				----
> > 
> >   condition = 1
> > 
> >   load h->list (is empty)
> > 				    raw_spin_lock(hlist->lock)
> > 				    hlist_add();
> > 				    __set_current_state();
> > 				    raw_spin_unlock(hlist->lock)
> >   swait_wake()
> >    swait_head_has_waiters()
> >    (sees h->list as empty and returns)
> > 
> 
> BTW, the race still exists if you move the raw_spin_unlock(hlist->lock)
> above to here. That is:
> 
>    swait_wake()
>     swait_head_has_waiters()
>      (sees h->list as empty and returns)
> 
> 				    raw_spin_unlock(hlist->lock)
> 
> 
> Maybe this will help to understand it more.

As with all memory barrier issues, it can be confusing. To try to
describe this even better, the bug happens because the load of h->list
can happen before the store of condition by the waker.

The wake up code requires that the condition is written out first, and
then the check to see if waiters exist happens (which is the check if
h->list is empty or not).

To prevent missing a wake up, the waiter will add itself to the wait
queue (h->list), then check the condition. If the condition is already
set, it puts back its state, and continues without calling schedule. If
the condition is not set, its safe to call schedule because if the task
that is to wake it up will do so after setting the condition. But
that's assuming that it knows to wake it up (because its on the list).

If the condition is stored after the check of whether or not there are
waiters, then we can not guarantee we will wake up the task waiting for
the condition to arise. What happens instead, the waker checks the list,
sees nothing on it, and returns (the condition is still in the process
of being stored by the CPU, but hasn't yet made it to memory). The
waiter then sets itself on the list to be woken, and checks the
condition. But since the condition hasn't been written out yet, it goes
to sleep. But this time, nobody is there to wake it up.

By adding the memory barrier before checking the list to see if anyone
is waiting for the condition, we force the condition out to memory and
so it will be seen by the waiter before it goes to sleep, or after it
added itself to the waiter list. In any case, the memory barrier will
either have the waiter see the condition is set, or the waker will see
the waiter is on the list. Either case, the waiter will be woken up.

Now, patch 3/3 handles the case where the waiter may read the condition
before it sets itself on the list. This too is bad, because it can read
the condition before the waker sets it, but then the waker can read the
list before the waiter adds itself to the list. This is also a missed
wakeup.

But luckily, this case on x86 isn't a issue because the
raw_spin_unlock() will prevent that from happening, because on x86,
raw_spin_unlock() is a full memory barrier. But it may not be on other
architectures. If other architectures only flush out what has been
written, it does not guarantee that something could have been read
early, that would let the condition leak before the h->list is stored.
That's why I kept patch 3/3 different than this patch. This patch is a
bug on all SMP architectures, where as 3/3 is only a bug on specific
architectures.

-- Steve


> 
> > 				    check_condition (sees condition = 0)
> > 
> >    store condition = 1
> > 
> > 				    schedule()
> > 
> > Now the task on CPU1 has just missed its wakeup. By adding a memory
> > barrier before the list empty check, we fix the problem of miss seeing
> > the list not empty as well as pushing out the condition for the other
> > task to see.
> > 
> > Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RT 0/3] rt/rcu/swait: Fix boot up when RCU NOCB_ALL is enabled
  2013-08-19 15:35 [PATCH RT 0/3] rt/rcu/swait: Fix boot up when RCU NOCB_ALL is enabled Steven Rostedt
                   ` (2 preceding siblings ...)
  2013-08-19 15:35 ` [PATCH RT 3/3] swait: Add smp_mb() after setting h->list Steven Rostedt
@ 2013-08-21 13:41 ` Sebastian Andrzej Siewior
  3 siblings, 0 replies; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-08-21 13:41 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	John Kacur, Clark Williams, Paul E. McKenney

* Steven Rostedt | 2013-08-19 11:35:30 [-0400]:

>Thomas and Sebastian,
Hi Steven,

thanks for the series.

Sebastian

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-08-21 13:41 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-08-19 15:35 [PATCH RT 0/3] rt/rcu/swait: Fix boot up when RCU NOCB_ALL is enabled Steven Rostedt
2013-08-19 15:35 ` [PATCH RT 1/3] rcu/swait: Fix RCU conversion of wake_up_all() to swait_wake() Steven Rostedt
2013-08-19 15:35 ` [PATCH RT 2/3] swait: Add memory barrier before checking list empty Steven Rostedt
2013-08-19 15:51   ` Steven Rostedt
2013-08-19 16:49     ` Steven Rostedt
2013-08-19 15:35 ` [PATCH RT 3/3] swait: Add smp_mb() after setting h->list Steven Rostedt
2013-08-21 13:41 ` [PATCH RT 0/3] rt/rcu/swait: Fix boot up when RCU NOCB_ALL is enabled Sebastian Andrzej Siewior

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).