public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Krister Johansen <kjlx@templeofstupid.com>,
	Ingo Molnar <mingo@kernel.org>,
	linux-kernel@vger.kernel.org,
	Paul Gortmaker <paul.gortmaker@windriver.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH tip/sched/core] Add comments to aid in safer usage of swake_up.
Date: Tue, 13 Jun 2017 20:58:43 -0700	[thread overview]
Message-ID: <20170614035843.GI3721@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170613211547.49814d25@gandalf.local.home>

On Tue, Jun 13, 2017 at 09:15:47PM -0400, Steven Rostedt wrote:
> On Tue, 13 Jun 2017 16:42:05 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Tue, Jun 13, 2017 at 07:23:08PM -0400, Steven Rostedt wrote:
> > > On Fri, 9 Jun 2017 05:45:54 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > On Fri, Jun 09, 2017 at 09:19:57AM +0200, Peter Zijlstra wrote:  
> > > > > On Thu, Jun 08, 2017 at 08:25:46PM -0700, Krister Johansen wrote:    
> > > > > > The behavior of swake_up() differs from that of wake_up(), and from the
> > > > > > swake_up() that came from RT linux. A memory barrier, or some other
> > > > > > synchronization, is needed prior to a swake_up so that the waiter sees
> > > > > > the condition set by the waker, and so that the waker does not see an
> > > > > > empty wait list.    
> > > > > 
> > > > > Urgh.. let me stare at that. But it sounds like the wrong solution since
> > > > > we wanted to keep the wait and swait APIs as close as possible.    
> > > > 
> > > > But don't they both need some sort of ordering, be it memory barriers or
> > > > locking, to handle the case where the wait/swait doesn't actually sleep?
> > > >   
> > > 
> > > Looking at an RCU example, and assuming that ordering can move around
> > > within a spin lock, and that changes can leak into a spin lock region
> > > from both before and after. Could we have:
> > > 
> > > (looking at __call_rcu_core() and rcu_gp_kthread()
> > > 
> > > 	CPU0				CPU1
> > > 	----				----
> > > 				__call_rcu_core() {
> > > 
> > > 				 spin_lock(rnp_root)
> > > 				 need_wake = __rcu_start_gp() {
> > > 				  rcu_start_gp_advanced() {
> > > 				   gp_flags = FLAG_INIT
> > > 				  }
> > > 				 }
> > > 
> > >  rcu_gp_kthread() {
> > >    swait_event_interruptible(wq,
> > > 	gp_flags & FLAG_INIT) {
> > >    spin_lock(q->lock)
> > > 
> > > 				*fetch wq->task_list here! *
> > > 
> > >    list_add(wq->task_list, q->task_list)
> > >    spin_unlock(q->lock);
> > > 
> > >    *fetch old value of gp_flags here *  
> > 
> > Both reads of ->gp_flags are READ_ONCE(), so having seen the new value
> > in swait_event_interruptible(), this task/CPU cannot see the old value
> > from some later access.  You have to have accesses to two different
> > variables to require a memory barrier (at least assuming consistent use
> > of READ_ONCE(), WRITE_ONCE(), or equivalent).
> 
> If I'm not mistaken, READ_ONCE() and WRITE_ONCE() is just volatiles
> added. The compiler may not leak or move the the fetches, but what
> about the hardware?

The hardware cannot move the references if both references are in the
same thread and to the same variable, which is the case with ->gp_flags.

> A spin_lock() only needs to make sure what is after it does not leak
> before it.
> 
> A spin_unlock() only needs to make sure what is before it must not leak
> after it.

Both true, with the exception of a spin_is_locked() to that same
lock variable, which cannot be reordered with either spin_lock() or
spin_unlock() in either direction.

> From my understandings of reading memory-barrier.txt, there's no
> guarantees that the hardware doesn't let reads or writes that happen
> before a spin_lock() happen after it. Nor does it guarantee that reads
> or writes that happen after a spin_unlock() doesn't happen before it.
> 
> The spin_locks only need to protect the inside of the critical section,
> not the outside of it leaking in.

Again, quite true.

> I'm looking at this in particular:
> 
> ====
>   (1) ACQUIRE operation implication:
> 
>      Memory operations issued after the ACQUIRE will be completed after the
>      ACQUIRE operation has completed.
> 
>      Memory operations issued before the ACQUIRE may be completed after
>      the ACQUIRE operation has completed.  An smp_mb__before_spinlock(),
>      combined with a following ACQUIRE, orders prior stores against
>      subsequent loads and stores.  Note that this is weaker than smp_mb()!
>      The smp_mb__before_spinlock() primitive is free on many architectures.
> 
>  (2) RELEASE operation implication:
> 
>      Memory operations issued before the RELEASE will be completed before the
>      RELEASE operation has completed.
> 
>      Memory operations issued after the RELEASE may be completed before the
>      RELEASE operation has completed.
> ====

And here is the part you also need to look at:

====

 (*) Overlapping loads and stores within a particular CPU will appear to be
     ordered within that CPU.  This means that for:

	a = READ_ONCE(*X); WRITE_ONCE(*X, b);

     the CPU will only issue the following sequence of memory operations:

	a = LOAD *X, STORE *X = b

     And for:

	WRITE_ONCE(*X, c); d = READ_ONCE(*X);

     the CPU will only issue:

	STORE *X = c, d = LOAD *X

     (Loads and stores overlap if they are targeted at overlapping pieces of
     memory).

====

This section needs some help -- the actual guarantee is stronger, that
all CPUs will agree on the order of volatile same-sized aligned accesses
to a given single location.  So if a previous READ_ONCE() sees the new
value, any subsequent READ_ONCE() from that same variable is guaranteed
to also see the new value (or some later value).

Does that help, or am I missing something here?

							Thanx, Paul

> -- Steve
> 
> 
> > 
> > > 				 spin_unlock(rnp_root)
> > > 
> > > 				 rcu_gp_kthread_wake() {
> > > 				  swake_up(wq) {
> > > 				   swait_active(wq) {
> > > 				    list_empty(wq->task_list)
> > > 
> > > 				   } * return false *
> > > 
> > >   if (condition) * false *
> > >     schedule();
> > > 
> > > Looks like a memory barrier is missing. Perhaps we should slap on into
> > > swait_active()? I don't think it is wise to let users add there own, as
> > > I think we currently have bugs now.  
> > 
> > I -know- I have bugs now.  ;-)
> > 
> > But I don't believe this is one of them.  Or am I getting confused?
> > 
> > 							Thanx, Paul
> 

  reply	other threads:[~2017-06-14  3:58 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-09  3:25 [PATCH tip/sched/core] Add comments to aid in safer usage of swake_up Krister Johansen
2017-06-09  7:19 ` Peter Zijlstra
2017-06-09 12:45   ` Paul E. McKenney
2017-06-13 23:23     ` Steven Rostedt
2017-06-13 23:42       ` Paul E. McKenney
2017-06-14  1:15         ` Steven Rostedt
2017-06-14  3:58           ` Paul E. McKenney [this message]
2017-06-14 13:10             ` Steven Rostedt
2017-06-14 15:02               ` Steven Rostedt
2017-06-14 16:25                 ` Krister Johansen
2017-06-15  4:18                   ` Boqun Feng
2017-06-15 17:56                     ` Paul E. McKenney
2017-06-16  1:07                       ` Boqun Feng
2017-06-16  3:09                         ` Paul E. McKenney
2017-08-10 12:10                     ` [tip:locking/core] sched/wait: Remove the lockless swait_active() check in swake_up*() tip-bot for Boqun Feng
2017-06-14 15:55               ` [PATCH tip/sched/core] Add comments to aid in safer usage of swake_up Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170614035843.GI3721@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=kjlx@templeofstupid.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=paul.gortmaker@windriver.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox