Re: [PATCH tip/sched/core] Add comments to aid in safer usage of swake_up.

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Krister Johansen <kjlx@templeofstupid.com>,
	Ingo Molnar <mingo@kernel.org>,
	linux-kernel@vger.kernel.org,
	Paul Gortmaker <paul.gortmaker@windriver.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH tip/sched/core] Add comments to aid in safer usage of swake_up.
Date: Tue, 13 Jun 2017 20:58:43 -0700
Message-ID: <20170614035843.GI3721@linux.vnet.ibm.com>
In-Reply-To: <20170613211547.49814d25@gandalf.local.home>

On Tue, Jun 13, 2017 at 09:15:47PM -0400, Steven Rostedt wrote:
> On Tue, 13 Jun 2017 16:42:05 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Tue, Jun 13, 2017 at 07:23:08PM -0400, Steven Rostedt wrote:
> > > On Fri, 9 Jun 2017 05:45:54 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Fri, Jun 09, 2017 at 09:19:57AM +0200, Peter Zijlstra wrote:  
> > > > > On Thu, Jun 08, 2017 at 08:25:46PM -0700, Krister Johansen wrote:    
> > > > > > The behavior of swake_up() differs from that of wake_up(), and from the
> > > > > > swake_up() that came from RT linux. A memory barrier, or some other
> > > > > > synchronization, is needed prior to a swake_up so that the waiter sees
> > > > > > the condition set by the waker, and so that the waker does not see an
> > > > > > empty wait list.    
> > > > > 
> > > > > Urgh.. let me stare at that. But it sounds like the wrong solution since
> > > > > we wanted to keep the wait and swait APIs as close as possible.    
> > > > 
> > > > But don't they both need some sort of ordering, be it memory barriers or
> > > > locking, to handle the case where the wait/swait doesn't actually sleep?
> > > >   
> > > 
> > > Looking at an RCU example, and assuming that ordering can move around
> > > within a spin lock, and that changes can leak into a spin lock region
> > > from both before and after. Could we have:
> > > 
> > > (looking at __call_rcu_core() and rcu_gp_kthread())
> > > 
> > > 	CPU0				CPU1
> > > 	----				----
> > > 				__call_rcu_core() {
> > > 
> > > 				 spin_lock(rnp_root)
> > > 				 need_wake = __rcu_start_gp() {
> > > 				  rcu_start_gp_advanced() {
> > > 				   gp_flags = FLAG_INIT
> > > 				  }
> > > 				 }
> > > 
> > >  rcu_gp_kthread() {
> > >    swait_event_interruptible(wq,
> > > 	gp_flags & FLAG_INIT) {
> > >    spin_lock(q->lock)
> > > 
> > > 				*fetch wq->task_list here! *
> > > 
> > >    list_add(wq->task_list, q->task_list)
> > >    spin_unlock(q->lock);
> > > 
> > >    *fetch old value of gp_flags here *  
> > 
> > Both reads of ->gp_flags are READ_ONCE(), so having seen the new value
> > in swait_event_interruptible(), this task/CPU cannot see the old value
> > from some later access.  You have to have accesses to two different
> > variables to require a memory barrier (at least assuming consistent use
> > of READ_ONCE(), WRITE_ONCE(), or equivalent).
> 
> If I'm not mistaken, READ_ONCE() and WRITE_ONCE() just add volatile
> accesses. The compiler may not leak or move the fetches, but what
> about the hardware?

The hardware cannot move the references if both references are in the
same thread and to the same variable, which is the case with ->gp_flags.

> A spin_lock() only needs to make sure what is after it does not leak
> before it.
> 
> A spin_unlock() only needs to make sure what is before it must not leak
> after it.

Both true, with the exception of a spin_is_locked() to that same
lock variable, which cannot be reordered with either spin_lock() or
spin_unlock() in either direction.

> From my understanding of memory-barriers.txt, there's no
> guarantee that the hardware doesn't let reads or writes that happen
> before a spin_lock() happen after it. Nor does it guarantee that reads
> or writes that happen after a spin_unlock() don't happen before it.
> 
> The spin_locks only need to protect the inside of the critical section,
> not the outside of it leaking in.

Again, quite true.
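
For contrast with the single-variable ->gp_flags case, the pattern that does need a full barrier involves two variables: the waiter publishes itself and then checks the condition, while the waker sets the condition and then checks for waiters. The following is a minimal userspace sketch of that pattern, not kernel code. It uses C11 atomics and POSIX threads, and the names cond_set, on_list, and count_lost_wakeups() are hypothetical stand-ins for the wakeup condition, the wait list, and a test harness.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int cond_set;	/* stand-in for the wakeup condition */
static atomic_int on_list;	/* stand-in for "wait list is non-empty" */
static int waiter_saw_cond;	/* read by the harness after pthread_join() */
static int waker_saw_waiter;

static void *waiter(void *unused)
{
	(void)unused;
	/* Publish ourselves, then check the condition. */
	atomic_store_explicit(&on_list, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* full barrier */
	waiter_saw_cond = atomic_load_explicit(&cond_set, memory_order_relaxed);
	return NULL;
}

static void *waker(void *unused)
{
	(void)unused;
	/* Set the condition, then check for waiters. */
	atomic_store_explicit(&cond_set, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* full barrier */
	waker_saw_waiter = atomic_load_explicit(&on_list, memory_order_relaxed);
	return NULL;
}

/*
 * Run the two sides concurrently 'rounds' times and count the rounds in
 * which the waiter missed the condition AND the waker missed the waiter.
 * Returns -1 if a thread could not be created.
 */
int count_lost_wakeups(int rounds)
{
	int lost = 0;

	assert(rounds > 0);
	for (int i = 0; i < rounds; i++) {
		pthread_t a, b;

		atomic_store(&cond_set, 0);
		atomic_store(&on_list, 0);
		if (pthread_create(&a, NULL, waiter, NULL))
			return -1;
		if (pthread_create(&b, NULL, waker, NULL)) {
			pthread_join(a, NULL);
			return -1;
		}
		pthread_join(a, NULL);
		pthread_join(b, NULL);
		if (!waiter_saw_cond && !waker_saw_waiter)
			lost++;
	}
	return lost;
}
```

With both fences in place, the C11 memory model forbids the outcome in which each side misses the other's store, so count_lost_wakeups() should return 0. Dropping either fence makes that outcome permissible, although a given machine may or may not exhibit it in practice.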

> I'm looking at this in particular:
> 
> ====
>  (1) ACQUIRE operation implication:
> 
>      Memory operations issued after the ACQUIRE will be completed after the
>      ACQUIRE operation has completed.
> 
>      Memory operations issued before the ACQUIRE may be completed after
>      the ACQUIRE operation has completed.  An smp_mb__before_spinlock(),
>      combined with a following ACQUIRE, orders prior stores against
>      subsequent loads and stores.  Note that this is weaker than smp_mb()!
>      The smp_mb__before_spinlock() primitive is free on many architectures.
> 
>  (2) RELEASE operation implication:
> 
>      Memory operations issued before the RELEASE will be completed before the
>      RELEASE operation has completed.
> 
>      Memory operations issued after the RELEASE may be completed before the
>      RELEASE operation has completed.
> ====

And here is the part you also need to look at:

====

 (*) Overlapping loads and stores within a particular CPU will appear to be
     ordered within that CPU.  This means that for:

	a = READ_ONCE(*X); WRITE_ONCE(*X, b);

     the CPU will only issue the following sequence of memory operations:

	a = LOAD *X, STORE *X = b

     And for:

	WRITE_ONCE(*X, c); d = READ_ONCE(*X);

     the CPU will only issue:

	STORE *X = c, d = LOAD *X

     (Loads and stores overlap if they are targeted at overlapping pieces of
     memory).

====

This section needs some help -- the actual guarantee is stronger, that
all CPUs will agree on the order of volatile same-sized aligned accesses
to a given single location.  So if a previous READ_ONCE() sees the new
value, any subsequent READ_ONCE() from that same variable is guaranteed
to also see the new value (or some later value).
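
The same-variable guarantee can be illustrated in userspace with a small sketch. This is an analogy, not kernel code: relaxed C11 atomic accesses stand in for READ_ONCE() and WRITE_ONCE(), the READ_ONCE_U()/WRITE_ONCE_U() macros and count_backwards_reads() are hypothetical names, and the gp_flags variable here is just an increasing counter rather than the real RCU field.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins for the kernel macros. */
#define READ_ONCE_U(x)		atomic_load_explicit(&(x), memory_order_relaxed)
#define WRITE_ONCE_U(x, v)	atomic_store_explicit(&(x), (v), memory_order_relaxed)

static atomic_int gp_flags;	/* single shared location */
static atomic_int writer_done;

static void *writer(void *arg)
{
	int n = *(int *)arg;

	/* Stored values only ever increase. */
	for (int i = 1; i <= n; i++)
		WRITE_ONCE_U(gp_flags, i);
	WRITE_ONCE_U(writer_done, 1);
	return NULL;
}

/*
 * While a writer thread stores 1..n, read the variable twice in a row
 * and count how often the later read returns an older value than the
 * earlier read.  Returns -1 if the thread could not be created.
 */
int count_backwards_reads(int n)
{
	pthread_t t;
	int violations = 0;

	assert(n > 0);
	atomic_store(&gp_flags, 0);
	atomic_store(&writer_done, 0);
	if (pthread_create(&t, NULL, writer, &n))
		return -1;
	while (!READ_ONCE_U(writer_done)) {
		int first = READ_ONCE_U(gp_flags);
		int second = READ_ONCE_U(gp_flags);

		if (second < first)
			violations++;
	}
	pthread_join(t, NULL);
	return violations;
}
```

No barriers appear between the two loads, yet the count stays at 0: per-location coherence means that once a thread has observed a value, a later load of the same location by that thread cannot return an earlier one.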

Does that help, or am I missing something here?

							Thanx, Paul

> -- Steve
> 
> 
> > 
> > > 				 spin_unlock(rnp_root)
> > > 
> > > 				 rcu_gp_kthread_wake() {
> > > 				  swake_up(wq) {
> > > 				   swait_active(wq) {
> > > 				    list_empty(wq->task_list)
> > > 
> > > 				   } * return false *
> > > 
> > >   if (condition) * false *
> > >     schedule();
> > > 
> > > Looks like a memory barrier is missing. Perhaps we should slap one into
> > > swait_active()? I don't think it is wise to let users add their own, as
> > > I think we have bugs now.
> > 
> > I -know- I have bugs now.  ;-)
> > 
> > But I don't believe this is one of them.  Or am I getting confused?
> > 
> > 							Thanx, Paul
>


Thread overview: 16+ messages
2017-06-09  3:25 [PATCH tip/sched/core] Add comments to aid in safer usage of swake_up Krister Johansen
2017-06-09  7:19 ` Peter Zijlstra
2017-06-09 12:45   ` Paul E. McKenney
2017-06-13 23:23     ` Steven Rostedt
2017-06-13 23:42       ` Paul E. McKenney
2017-06-14  1:15         ` Steven Rostedt
2017-06-14  3:58           ` Paul E. McKenney [this message]
2017-06-14 13:10             ` Steven Rostedt
2017-06-14 15:02               ` Steven Rostedt
2017-06-14 16:25                 ` Krister Johansen
2017-06-15  4:18                   ` Boqun Feng
2017-06-15 17:56                     ` Paul E. McKenney
2017-06-16  1:07                       ` Boqun Feng
2017-06-16  3:09                         ` Paul E. McKenney
2017-08-10 12:10                     ` [tip:locking/core] sched/wait: Remove the lockless swait_active() check in swake_up*() tip-bot for Boqun Feng
2017-06-14 15:55               ` [PATCH tip/sched/core] Add comments to aid in safer usage of swake_up Paul E. McKenney
