From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org,
jiangshanlai@gmail.com, dipankar@in.ibm.com,
akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
josh@joshtriplett.org, tglx@linutronix.de, rostedt@goodmis.org,
dhowells@redhat.com, edumazet@google.com, fweisbec@gmail.com,
oleg@redhat.com
Subject: Re: [PATCH tip/core/rcu 1/9] rcu: Provide GP ordering in face of migrations and delays
Date: Thu, 5 Oct 2017 09:19:09 -0700 [thread overview]
Message-ID: <20171005161909.GS3521@linux.vnet.ibm.com> (raw)
In-Reply-To: <20171005153913.5wanmnfmi6i3byv7@hirez.programming.kicks-ass.net>
On Thu, Oct 05, 2017 at 05:39:13PM +0200, Peter Zijlstra wrote:
> On Thu, Oct 05, 2017 at 07:55:13AM -0700, Paul E. McKenney wrote:
> > On Thu, Oct 05, 2017 at 11:41:14AM +0200, Peter Zijlstra wrote:
> > > On Wed, Oct 04, 2017 at 02:29:27PM -0700, Paul E. McKenney wrote:
> > > > Consider the following admittedly improbable sequence of events:
> > > >
> > > > o RCU is initially idle.
> > > >
> > > > o Task A on CPU 0 executes rcu_read_lock().
> > > >
> > > > o Task B on CPU 1 executes synchronize_rcu(), which must
> > > > wait on Task A:
> > > >
> > > > o Task B registers the callback, which starts a new
> > > > grace period, awakening the grace-period kthread
> > > > on CPU 3, which immediately starts a new grace period.
> > > >
> > > > o Task B migrates to CPU 2, which provides a quiescent
> > > > state for both CPUs 1 and 2.
> > > >
> > > > o Both CPUs 1 and 2 take scheduling-clock interrupts,
> > > > and both invoke RCU_SOFTIRQ, both thus learning of the
> > > > new grace period.
> > > >
> > > > o Task B is delayed, perhaps by vCPU preemption on CPU 2.
> > > >
> > > > o CPUs 2 and 3 pass through quiescent states, which are reported
> > > > to core RCU.
> > > >
> > > > o Task B is resumed just long enough to be migrated to CPU 3,
> > > > and then is once again delayed.
> > > >
> > > > o Task A executes rcu_read_unlock(), exiting its RCU read-side
> > > > critical section.
> > > >
> > > > o CPU 0 passes through a quiescent sate, which is reported to
> > > > core RCU. Only CPU 1 continues to block the grace period.
> > > >
> > > > o CPU 1 passes through a quiescent state, which is reported to
> > > > core RCU. This ends the grace period, and CPU 1 therefore
> > > > invokes its callbacks, one of which awakens Task B via
> > > > complete().
> > > >
> > > > o Task B resumes (still on CPU 3) and starts executing
> > > > wait_for_completion(), which sees that the completion has
> > > > already completed, and thus does not block. It returns from
> > > > the synchronize_rcu() without any ordering against the
> > > > end of Task A's RCU read-side critical section.
> > > >
> > > > It can therefore mess up Task A's RCU read-side critical section,
> > > > in theory, anyway.
> > >
> > > I'm not sure I follow, at the very least the wait_for_completion() does
> > > an ACQUIRE such that it observes the state prior to the RELEASE as done
> > > by complete(), no?
> >
> > Your point being that both wait_for_completion() and complete() acquire
> > and release the same lock? (Yes, I suspect that I was confusing this
> > with wait_event() and wake_up(), just so you know.)
>
> Well, fundamentally complete()/wait_for_completion() is a message-pass
> and they include a RELEASE/ACQUIRE pair for causal reasons.
>
> Per the implementation they use a spinlock, but any implementation needs
> to provide at least that RELEASE/ACQUIRE pair.
>
> > > And is not CPU0's QS reporting ordered against that complete()?
> >
> > Mumble mumble mumble powerpc mumble mumble mumble...
> >
> > OK, I will make this new memory barrier only execute for powerpc.
> >
> > Or am I missing something else here?
>
> So I'm not entirely clear on the required semantics here; why do we need
> a full mb? I'm thinking CPU0's QS propagating through the tree and
> arriving at the root node is a multi-copy-atomic / transitive thing and
> all CPUs will agree the system QS has ended, right?
>
> Whichever CPU establishes the system QS does complete() and the
> wait_for_completion() then has the weak-transitive causal relation to
> that, ensuring that -- in the above example -- CPU3 must be _after_
> CPU0's rcu_read_unlock().
Yes, the ordering does need to be visible to uninvolved CPUs, so
release-acquire is not necessarily strong enough.
My current thought is like this:
if (IS_ENABLED(CONFIG_ARCH_WEAK_RELEASE_ACQUIRE))
smp_mb();
Thoughts?
Thanx, Paul
next prev parent reply other threads:[~2017-10-05 16:19 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-04 21:29 [PATCH tip/core/rcu 0/9] Miscellaneous fixes for v4.15 Paul E. McKenney
2017-10-04 21:29 ` [PATCH tip/core/rcu 1/9] rcu: Provide GP ordering in face of migrations and delays Paul E. McKenney
2017-10-05 9:41 ` Peter Zijlstra
2017-10-05 14:55 ` Paul E. McKenney
2017-10-05 15:39 ` Peter Zijlstra
2017-10-05 16:19 ` Paul E. McKenney [this message]
2017-10-05 16:25 ` Peter Zijlstra
2017-10-05 18:22 ` Paul E. McKenney
2017-10-06 9:07 ` Peter Zijlstra
2017-10-06 19:18 ` Paul E. McKenney
2017-10-06 20:15 ` Peter Zijlstra
2017-10-07 3:31 ` Paul E. McKenney
2017-10-07 9:29 ` Peter Zijlstra
2017-10-07 18:28 ` Paul E. McKenney
2017-10-09 8:16 ` Peter Zijlstra
2017-10-09 14:37 ` Andrea Parri
2017-10-09 23:15 ` Paul E. McKenney
2017-10-05 13:17 ` Steven Rostedt
2017-10-05 13:40 ` Peter Zijlstra
2017-10-05 14:13 ` Steven Rostedt
2017-10-04 21:29 ` [PATCH tip/core/rcu 2/9] rcu: Fix up pending cbs check in rcu_prepare_for_idle Paul E. McKenney
2017-10-04 21:29 ` [PATCH tip/core/rcu 3/9] rcu: Create call_rcu_tasks() kthread at boot time Paul E. McKenney
2017-10-04 21:29 ` [PATCH tip/core/rcu 4/9] irq_work: Map irq_work_on_queue() to irq_work_on() in !SMP Paul E. McKenney
2017-10-04 21:29 ` [PATCH tip/core/rcu 5/9] srcu: Add parameters to SRCU docbook comments Paul E. McKenney
2017-10-04 21:29 ` [PATCH tip/core/rcu 6/9] sched: Make resched_cpu() unconditional Paul E. McKenney
2017-10-04 21:29 ` [PATCH tip/core/rcu 7/9] rcu: Pretend ->boost_mtx acquired legitimately Paul E. McKenney
2017-10-05 9:50 ` Peter Zijlstra
2017-10-05 15:06 ` Paul E. McKenney
2017-10-04 21:29 ` [PATCH tip/core/rcu 8/9] rcu: Add extended-quiescent-state testing advice Paul E. McKenney
2017-10-04 21:29 ` [PATCH tip/core/rcu 9/9] rcu/segcblist: Include rcupdate.h Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171005161909.GS3521@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=dhowells@redhat.com \
--cc=dipankar@in.ibm.com \
--cc=edumazet@google.com \
--cc=fweisbec@gmail.com \
--cc=jiangshanlai@gmail.com \
--cc=josh@joshtriplett.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mingo@kernel.org \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.