* [RFC] Control dependencies
@ 2013-11-21 16:17 Peter Zijlstra
2013-11-21 18:02 ` Paul E. McKenney
0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2013-11-21 16:17 UTC (permalink / raw)
To: Paul McKenney
Cc: linux-kernel, Will Deacon, Tim Chen, Ingo Molnar, Andrew Morton,
Thomas Gleixner, linux-arch, Linus Torvalds, Waiman Long,
Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Rik van Riel,
Peter Hurley, Raghavendra K T, George Spelvin, H. Peter Anvin,
Arnd Bergmann, Aswin Chandramouleeswaran, Scott J Norton,
Figo.zhang
Hey Paul,
So on IRC you said you were going to post a patch to clarify/allow
control dependencies -- seeing you didn't get around to it, I thought
I'd take a stab at it.
The below is a patch to the perf code that uses one to get rid of a
pesky full memory barrier. Along with a patch to _the_ Document to
hopefully clarify the issue some. Although I feel there is far more to
say on this subject than I have attempted.
Since it now looks like smp_store_release() needs a full memory barrier
that approach isn't actually looking all that attractive for me anymore
(I'll not give up on those patches just yet), but I did want to put this
approach forward.
---
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -526,21 +526,27 @@ See also the subsection on "Cache Cohere
CONTROL DEPENDENCIES
--------------------
-A control dependency requires a full read memory barrier, not simply a data
-dependency barrier to make it work correctly. Consider the following bit of
-code:
+Because CPUs do not allow store speculation -- this would result in values out
+of thin air -- store visibility depends on a linear branch history. Therefore
+we can rely on LOAD -> STORE control dependencies to order things.
- q = &a;
- if (p) {
- <data dependency barrier>
- q = &b;
+ if (x) {
+ y = 1;
+ }
+
+The store to y must happen after the read to x. However C11/C++11 does not
+(yet) prohibit STORE speculation, and therefore we should insert a compiler
+barrier to force our compiler to do as it is told, and the above example
+should read:
+
+ if (x) {
+ barrier();
+ y = 1;
}
- x = *q;
-This will not have the desired effect because there is no actual data
-dependency, but rather a control dependency that the CPU may short-circuit by
-attempting to predict the outcome in advance. In such a case what's actually
-required is:
+On the other hand, CPUs (and compilers) are allowed to aggressively speculate
+on loads, therefore we cannot rely on LOAD -> LOAD control dependencies such
+as:
q = &a;
if (p) {
@@ -549,6 +555,8 @@ attempting to predict the outcome in adv
}
x = *q;
+And the read barrier as per the above example is indeed required to ensure
+order.
SMP BARRIER PAIRING
-------------------
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -62,18 +62,18 @@ static void perf_output_put_handle(struc
* kernel user
*
* READ ->data_tail READ ->data_head
- * smp_mb() (A) smp_rmb() (C)
+ * barrier() (A) smp_rmb() (C)
* WRITE $data READ $data
* smp_wmb() (B) smp_mb() (D)
* STORE ->data_head WRITE ->data_tail
*
* Where A pairs with D, and B pairs with C.
*
- * I don't think A needs to be a full barrier because we won't in fact
- * write data until we see the store from userspace. So we simply don't
- * issue the data WRITE until we observe it. Be conservative for now.
+ * In our case (A) is a control barrier that separates the loading of
+ * the ->data_tail and the writing of $data. In case ->data_tail
+ * indicates there is no room in the buffer to store $data we bail.
*
- * OTOH, D needs to be a full barrier since it separates the data READ
+ * D needs to be a full barrier since it separates the data READ
* from the tail WRITE.
*
* For B a WMB is sufficient since it separates two WRITEs, and for C
@@ -148,13 +148,15 @@ int perf_output_begin(struct perf_output
} while (local_cmpxchg(&rb->head, offset, head) != offset);
/*
- * Separate the userpage->tail read from the data stores below.
+ * Control barrier separating the @tail read above from the
+ * data writes below.
+ *
* Matches the MB userspace SHOULD issue after reading the data
* and before storing the new tail position.
*
* See perf_output_put_handle().
*/
- smp_mb();
+ barrier();
if (unlikely(head - local_read(&rb->wakeup) > rb->watermark))
local_add(rb->watermark, &rb->wakeup);
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC] Control dependencies
2013-11-21 16:17 [RFC] Control dependencies Peter Zijlstra
@ 2013-11-21 18:02 ` Paul E. McKenney
2013-11-21 19:28 ` Paul E. McKenney
2013-11-22 13:46 ` Peter Zijlstra
0 siblings, 2 replies; 7+ messages in thread
From: Paul E. McKenney @ 2013-11-21 18:02 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-kernel, Will Deacon, Tim Chen, Ingo Molnar, Andrew Morton,
Thomas Gleixner, linux-arch, Linus Torvalds, Waiman Long,
Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Rik van Riel,
Peter Hurley, Raghavendra K T, George Spelvin, H. Peter Anvin,
Arnd Bergmann, Aswin Chandramouleeswaran, Scott J Norton,
Figo.zhang
On Thu, Nov 21, 2013 at 05:17:33PM +0100, Peter Zijlstra wrote:
> Hey Paul,
>
> So on IRC you said you were going to post a patch to clarify/allow
> control dependencies -- seeing you didn't get around to it, I thought
> I'd take a stab at it.
I do have a prototype queued, but let's see what you have. Sending mine
in parallel anyway, just in case we have a shortage of confusion. ;-)
I will combine them and keep your Signed-off-by if you are OK with that.
> The below is a patch to the perf code that uses one to get rid of a
> pesky full memory barrier. Along with a patch to _the_ Document to
> hopefully clarify the issue some. Although I feel there is far more to
> say on this subject than I have attempted.
>
> Since it now looks like smp_store_release() needs a full memory barrier
> that approach isn't actually looking all that attractive for me anymore
> (I'll not give up on those patches just yet), but I did want to put this
> approach forward.
I am not so sure that smp_store_release() needs a full memory barrier, but
please take a look at Message-ID <20131121172558.GA27927@linux.vnet.ibm.com>,
sent quite recently.
Please see comments interspersed, thoughts?
Thanx, Paul
> ---
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -526,21 +526,27 @@ See also the subsection on "Cache Cohere
> CONTROL DEPENDENCIES
> --------------------
>
> -A control dependency requires a full read memory barrier, not simply a data
> -dependency barrier to make it work correctly. Consider the following bit of
> -code:
> +Because CPUs do not allow store speculation -- this would result in values out
> +of thin air -- store visibility depends on a linear branch history. Therefore
> +we can rely on LOAD -> STORE control dependencies to order things.
>
> - q = &a;
> - if (p) {
> - <data dependency barrier>
> - q = &b;
> + if (x) {
> + y = 1;
> + }
> +
> +The store to y must happen after the read to x. However C11/C++11 does not
> +(yet) prohibit STORE speculation, and therefore we should insert a compiler
> +barrier to force our compiler to do as it is told, and the above example
> +should read:
> +
> + if (x) {
> + barrier();
> + y = 1;
> }
> - x = *q;
In my draft, I rely on ACCESS_ONCE() rather than barrier(), but clearly
we need to call out both. Both yours and mine need to be more emphatic
about the "if" being needed.
There is an odd example in mine where both branches of an "if" statement
do identical writes. This seems vulnerable to compiler "optimizations"
that hoist the stores out of the "if" statement, defeating ordering.
I had not yet figured out what to say about that, hence my tardiness
in sending. (Plus inertia, of course!)
> -This will not have the desired effect because there is no actual data
> -dependency, but rather a control dependency that the CPU may short-circuit by
> -attempting to predict the outcome in advance. In such a case what's actually
> -required is:
> +On the other hand, CPUs (and compilers) are allowed to aggressively speculate
> +on loads, therefore we cannot rely on LOAD -> LOAD control dependencies such
> +as:
Your example is better than mine here, because mine fails to make it
painfully clear that control dependencies don't apply to subsequent loads.
I will pull this point in.
> q = &a;
> if (p) {
> @@ -549,6 +555,8 @@ attempting to predict the outcome in adv
> }
> x = *q;
>
> +And the read barrier as per the above example is indeed required to ensure
> +order.
>
> SMP BARRIER PAIRING
> -------------------
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
My patch does not cover this file. Wouldn't hurt for them to be
separate.
> @@ -62,18 +62,18 @@ static void perf_output_put_handle(struc
> * kernel user
> *
> * READ ->data_tail READ ->data_head
> - * smp_mb() (A) smp_rmb() (C)
> + * barrier() (A) smp_rmb() (C)
We need a conditional for this to work. I know that the required
conditional is there in the code, but we need it explicitly in this
example as well.
> * WRITE $data READ $data
> * smp_wmb() (B) smp_mb() (D)
> * STORE ->data_head WRITE ->data_tail
> *
> * Where A pairs with D, and B pairs with C.
> *
> - * I don't think A needs to be a full barrier because we won't in fact
> - * write data until we see the store from userspace. So we simply don't
> - * issue the data WRITE until we observe it. Be conservative for now.
> + * In our case (A) is a control barrier that separates the loading of
> + * the ->data_tail and the writing of $data.
Actually, we need both a conditional and something to prevent the
compiler from hoisting code to precede that conditional. The barrier()
prevents the hoisting, but we need to explicitly call out the conditional.
Otherwise, someone will read this and assume that any load followed by
barrier() will always be ordered against subsequent stores.
We should call this a "control dependency" rather than a "control
barrier" to emphasize that the conditional is part and parcel of the
ordering guarantee.
> + * In case ->data_tail
> + * indicates there is no room in the buffer to store $data we bail.
> *
> - * OTOH, D needs to be a full barrier since it separates the data READ
> + * D needs to be a full barrier since it separates the data READ
> * from the tail WRITE.
> *
> * For B a WMB is sufficient since it separates two WRITEs, and for C
> @@ -148,13 +148,15 @@ int perf_output_begin(struct perf_output
> } while (local_cmpxchg(&rb->head, offset, head) != offset);
>
> /*
> - * Separate the userpage->tail read from the data stores below.
> + * Control barrier separating the @tail read above from the
> + * data writes below.
Or maybe "control dependency barrier".
> + *
> * Matches the MB userspace SHOULD issue after reading the data
> * and before storing the new tail position.
> *
> * See perf_output_put_handle().
> */
> - smp_mb();
> + barrier();
>
> if (unlikely(head - local_read(&rb->wakeup) > rb->watermark))
> local_add(rb->watermark, &rb->wakeup);
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC] Control dependencies
2013-11-21 18:02 ` Paul E. McKenney
@ 2013-11-21 19:28 ` Paul E. McKenney
2013-11-22 13:46 ` Peter Zijlstra
1 sibling, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2013-11-21 19:28 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-kernel, Will Deacon, Tim Chen, Ingo Molnar, Andrew Morton,
Thomas Gleixner, linux-arch, Linus Torvalds, Waiman Long,
Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Rik van Riel,
Peter Hurley, Raghavendra K T, George Spelvin, H. Peter Anvin,
Arnd Bergmann, Aswin Chandramouleeswaran, Scott J Norton,
Figo.zhang
On Thu, Nov 21, 2013 at 10:02:37AM -0800, Paul E. McKenney wrote:
> On Thu, Nov 21, 2013 at 05:17:33PM +0100, Peter Zijlstra wrote:
> > Hey Paul,
> >
> > So on IRC you said you were going to post a patch to clarify/allow
> > control dependencies -- seeing you didn't get around to it, I thought
> > I'd take a stab at it.
>
> I do have a prototype queued, but let's see what you have. Sending mine
> in parallel anyway, just in case we have a shortage of confusion. ;-)
>
> I will combine them and keep your Signed-off-by if you are OK with that.
Oh, right, and there was this other complication... Suppose that we
have something like the following:
q = ACCESS_ONCE(a);
if (c)
ACCESS_ONCE(b) = q;
Suppose that the compiler figures out that c!=0 always. Then the
compiler is free to optimize as follows:
q = ACCESS_ONCE(a);
ACCESS_ONCE(b) = q;
Goodbye to the control dependency and associated ordering guarantees!
One way to deal with this is to restrict the condition to something that
the compiler is not permitted to predict, for example:
q = ACCESS_ONCE(a);
if (ACCESS_ONCE(c))
ACCESS_ONCE(b) = q;
Of course, you should not even -think- about having a conditional that
is a constant! ;-)
Other thoughts?
Thanx, Paul
> > The below is a patch to the perf code that uses one to get rid of a
> > pesky full memory barrier. Along with a patch to _the_ Document to
> > hopefully clarify the issue some. Although I feel there is far more to
> > say on this subject than I have attempted.
> >
> > Since it now looks like smp_store_release() needs a full memory barrier
> > that approach isn't actually looking all that attractive for me anymore
> > (I'll not give up on those patches just yet), but I did want to put this
> > approach forward.
>
> I am not so sure that smp_store_release() needs a full memory barrier, but
> please take a look at Message-ID <20131121172558.GA27927@linux.vnet.ibm.com>,
> sent quite recently.
>
> Please see comments interspersed, thoughts?
>
> Thanx, Paul
>
> > ---
> > --- a/Documentation/memory-barriers.txt
> > +++ b/Documentation/memory-barriers.txt
> > @@ -526,21 +526,27 @@ See also the subsection on "Cache Cohere
> > CONTROL DEPENDENCIES
> > --------------------
> >
> > -A control dependency requires a full read memory barrier, not simply a data
> > -dependency barrier to make it work correctly. Consider the following bit of
> > -code:
> > +Because CPUs do not allow store speculation -- this would result in values out
> > +of thin air -- store visibility depends on a linear branch history. Therefore
> > +we can rely on LOAD -> STORE control dependencies to order things.
> >
> > - q = &a;
> > - if (p) {
> > - <data dependency barrier>
> > - q = &b;
> > + if (x) {
> > + y = 1;
> > + }
> > +
> > +The store to y must happen after the read to x. However C11/C++11 does not
> > +(yet) prohibit STORE speculation, and therefore we should insert a compiler
> > +barrier to force our compiler to do as it is told, and the above example
> > +should read:
> > +
> > + if (x) {
> > + barrier();
> > + y = 1;
> > }
> > - x = *q;
>
> In my draft, I rely on ACCESS_ONCE() rather than barrier(), but clearly
> we need to call out both. Both yours and mine need to be more emphatic
> about the "if" being needed.
>
> There is an odd example in mine where both branches of an "if" statement
> do identical writes. This seems vulnerable to compiler "optimizations"
> that hoist the stores out of the "if" statement, defeating ordering.
> I had not yet figured out what to say about that, hence my tardiness
> in sending. (Plus inertia, of course!)
>
> > -This will not have the desired effect because there is no actual data
> > -dependency, but rather a control dependency that the CPU may short-circuit by
> > -attempting to predict the outcome in advance. In such a case what's actually
> > -required is:
> > +On the other hand, CPUs (and compilers) are allowed to aggressively speculate
> > +on loads, therefore we cannot rely on LOAD -> LOAD control dependencies such
> > +as:
>
> Your example is better than mine here, because mine fails to make it
> painfully clear that control dependencies don't apply to subsequent loads.
> I will pull this point in.
>
> > q = &a;
> > if (p) {
> > @@ -549,6 +555,8 @@ attempting to predict the outcome in adv
> > }
> > x = *q;
> >
> > +And the read barrier as per the above example is indeed required to ensure
> > +order.
> >
> > SMP BARRIER PAIRING
> > -------------------
> > --- a/kernel/events/ring_buffer.c
> > +++ b/kernel/events/ring_buffer.c
>
> My patch does not cover this file. Wouldn't hurt for them to be
> separate.
>
> > @@ -62,18 +62,18 @@ static void perf_output_put_handle(struc
> > * kernel user
> > *
> > * READ ->data_tail READ ->data_head
> > - * smp_mb() (A) smp_rmb() (C)
> > + * barrier() (A) smp_rmb() (C)
>
> We need a conditional for this to work. I know that the required
> conditional is there in the code, but we need it explicitly in this
> example as well.
>
> > * WRITE $data READ $data
> > * smp_wmb() (B) smp_mb() (D)
> > * STORE ->data_head WRITE ->data_tail
> > *
> > * Where A pairs with D, and B pairs with C.
> > *
> > - * I don't think A needs to be a full barrier because we won't in fact
> > - * write data until we see the store from userspace. So we simply don't
> > - * issue the data WRITE until we observe it. Be conservative for now.
> > + * In our case (A) is a control barrier that separates the loading of
> > + * the ->data_tail and the writing of $data.
>
> Actually, we need both a conditional and something to prevent the
> compiler from hoisting code to precede that conditional. The barrier()
> prevents the hoisting, but we need to explicitly call out the conditional.
> Otherwise, someone will read this and assume that any load followed by
> barrier() will always be ordered against subsequent stores.
>
> We should call this a "control dependency" rather than a "control
> barrier" to emphasize that the conditional is part and parcel of the
> ordering guarantee.
>
> > + * In case ->data_tail
> > + * indicates there is no room in the buffer to store $data we bail.
> > *
> > - * OTOH, D needs to be a full barrier since it separates the data READ
> > + * D needs to be a full barrier since it separates the data READ
> > * from the tail WRITE.
> > *
> > * For B a WMB is sufficient since it separates two WRITEs, and for C
> > @@ -148,13 +148,15 @@ int perf_output_begin(struct perf_output
> > } while (local_cmpxchg(&rb->head, offset, head) != offset);
> >
> > /*
> > - * Separate the userpage->tail read from the data stores below.
> > + * Control barrier separating the @tail read above from the
> > + * data writes below.
>
> Or maybe "control dependency barrier".
>
> > + *
> > * Matches the MB userspace SHOULD issue after reading the data
> > * and before storing the new tail position.
> > *
> > * See perf_output_put_handle().
> > */
> > - smp_mb();
> > + barrier();
> >
> > if (unlikely(head - local_read(&rb->wakeup) > rb->watermark))
> > local_add(rb->watermark, &rb->wakeup);
> >
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC] Control dependencies
2013-11-21 18:02 ` Paul E. McKenney
2013-11-21 19:28 ` Paul E. McKenney
@ 2013-11-22 13:46 ` Peter Zijlstra
2013-11-22 17:59 ` Peter Hurley
2013-11-22 18:16 ` Paul E. McKenney
1 sibling, 2 replies; 7+ messages in thread
From: Peter Zijlstra @ 2013-11-22 13:46 UTC (permalink / raw)
To: Paul E. McKenney
Cc: linux-kernel, Will Deacon, Tim Chen, Ingo Molnar, Andrew Morton,
Thomas Gleixner, linux-arch, Linus Torvalds, Waiman Long,
Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Rik van Riel,
Peter Hurley, Raghavendra K T, George Spelvin, H. Peter Anvin,
Arnd Bergmann, Aswin Chandramouleeswaran, Scott J Norton,
Figo.zhang
On Thu, Nov 21, 2013 at 10:02:37AM -0800, Paul E. McKenney wrote:
> > --- a/kernel/events/ring_buffer.c
> > +++ b/kernel/events/ring_buffer.c
>
> My patch does not cover this file. Wouldn't hurt for them to be
> separate.
Oh sure, but I wanted to present the RFC with at least one working
example to illustrate why I even bother and to aid in discussion.
> > @@ -62,18 +62,18 @@ static void perf_output_put_handle(struc
> > * kernel user
> > *
> > * READ ->data_tail READ ->data_head
> > - * smp_mb() (A) smp_rmb() (C)
> > + * barrier() (A) smp_rmb() (C)
>
> We need a conditional for this to work. I know that the required
> conditional is there in the code, but we need it explicitly in this
> example as well.
Agreed, I skimped on that because I didn't quite know how to write that
best.
How about the below version?
---
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -61,19 +61,20 @@ static void perf_output_put_handle(struc
*
* kernel user
*
- * READ ->data_tail READ ->data_head
- * smp_mb() (A) smp_rmb() (C)
- * WRITE $data READ $data
- * smp_wmb() (B) smp_mb() (D)
- * STORE ->data_head WRITE ->data_tail
+ * if (LOAD ->data_tail) { LOAD ->data_head
+ * (A) smp_rmb() (C)
+ * STORE $data LOAD $data
+ * smp_wmb() (B) smp_mb() (D)
+ * STORE ->data_head STORE ->data_tail
+ * }
*
* Where A pairs with D, and B pairs with C.
*
- * I don't think A needs to be a full barrier because we won't in fact
- * write data until we see the store from userspace. So we simply don't
- * issue the data WRITE until we observe it. Be conservative for now.
+ * In our case (A) is a control dependency that separates the load of
+ * the ->data_tail and the stores of $data. In case ->data_tail
+ * indicates there is no room in the buffer to store $data we do not.
*
- * OTOH, D needs to be a full barrier since it separates the data READ
+ * D needs to be a full barrier since it separates the data READ
* from the tail WRITE.
*
* For B a WMB is sufficient since it separates two WRITEs, and for C
@@ -81,7 +82,7 @@ static void perf_output_put_handle(struc
*
* See perf_output_begin().
*/
- smp_wmb();
+ smp_wmb(); /* B, matches C */
rb->user_page->data_head = head;
/*
@@ -144,17 +145,26 @@ int perf_output_begin(struct perf_output
if (!rb->overwrite &&
unlikely(CIRC_SPACE(head, tail, perf_data_size(rb)) < size))
goto fail;
+
+ /*
+ * The above forms a control dependency barrier separating the
+ * @tail load above from the data stores below. Since the @tail
+ * load is required to compute the branch to fail below.
+ *
+ * A, matches D; the full memory barrier userspace SHOULD issue
+ * after reading the data and before storing the new tail
+ * position.
+ *
+ * See perf_output_put_handle().
+ */
+
head += size;
} while (local_cmpxchg(&rb->head, offset, head) != offset);
/*
- * Separate the userpage->tail read from the data stores below.
- * Matches the MB userspace SHOULD issue after reading the data
- * and before storing the new tail position.
- *
- * See perf_output_put_handle().
+ * We rely on the implied barrier() by local_cmpxchg() to ensure
+ * none of the data stores below can be lifted up by the compiler.
*/
- smp_mb();
if (unlikely(head - local_read(&rb->wakeup) > rb->watermark))
local_add(rb->watermark, &rb->wakeup);
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC] Control dependencies
2013-11-22 13:46 ` Peter Zijlstra
@ 2013-11-22 17:59 ` Peter Hurley
2013-11-25 9:42 ` Peter Zijlstra
2013-11-22 18:16 ` Paul E. McKenney
1 sibling, 1 reply; 7+ messages in thread
From: Peter Hurley @ 2013-11-22 17:59 UTC (permalink / raw)
To: Peter Zijlstra, Paul E. McKenney
Cc: linux-kernel, Will Deacon, Tim Chen, Ingo Molnar, Andrew Morton,
Thomas Gleixner, linux-arch, Linus Torvalds, Waiman Long,
Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Rik van Riel,
Raghavendra K T, George Spelvin, H. Peter Anvin, Arnd Bergmann,
Aswin Chandramouleeswaran, Scott J Norton, Figo.zhang
On 11/22/2013 08:46 AM, Peter Zijlstra wrote:
> How about the below version?
>
> ---
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -61,19 +61,20 @@ static void perf_output_put_handle(struc
> *
> * kernel user
> *
> - * READ ->data_tail READ ->data_head
> - * smp_mb() (A) smp_rmb() (C)
> - * WRITE $data READ $data
> - * smp_wmb() (B) smp_mb() (D)
> - * STORE ->data_head WRITE ->data_tail
> + * if (LOAD ->data_tail) { LOAD ->data_head
> + * (A) smp_rmb() (C)
> + * STORE $data LOAD $data
> + * smp_wmb() (B) smp_mb() (D)
> + * STORE ->data_head STORE ->data_tail
I wasn't subscribed to linux-arch so missed the smp_store_release()
outcome, if there was one.
Are (B) and (D) still slated for changing to STORE.rel semantics,
aka smp_store_release()?
I realize that, for the perf ring buffer, (D) is in userspace but
I'm also interested in non-perf situations where (D) would be in the
kernel.
Regards,
Peter Hurley
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC] Control dependencies
2013-11-22 13:46 ` Peter Zijlstra
2013-11-22 17:59 ` Peter Hurley
@ 2013-11-22 18:16 ` Paul E. McKenney
1 sibling, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2013-11-22 18:16 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-kernel, Will Deacon, Tim Chen, Ingo Molnar, Andrew Morton,
Thomas Gleixner, linux-arch, Linus Torvalds, Waiman Long,
Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Rik van Riel,
Peter Hurley, Raghavendra K T, George Spelvin, H. Peter Anvin,
Arnd Bergmann, Aswin Chandramouleeswaran, Scott J Norton,
Figo.zhang
On Fri, Nov 22, 2013 at 02:46:30PM +0100, Peter Zijlstra wrote:
> On Thu, Nov 21, 2013 at 10:02:37AM -0800, Paul E. McKenney wrote:
> > > --- a/kernel/events/ring_buffer.c
> > > +++ b/kernel/events/ring_buffer.c
> >
> > My patch does not cover this file. Wouldn't hurt for them to be
> > separate.
>
> Oh sure, but I wanted to present the RFC with at least one working
> example to illustrate why I even bother and to aid in discussion.
>
> > > @@ -62,18 +62,18 @@ static void perf_output_put_handle(struc
> > > * kernel user
> > > *
> > > * READ ->data_tail READ ->data_head
> > > - * smp_mb() (A) smp_rmb() (C)
> > > + * barrier() (A) smp_rmb() (C)
> >
> > We need a conditional for this to work. I know that the required
> > conditional is there in the code, but we need it explicitly in this
> > example as well.
>
> Agreed, I skimped on that because I didn't quite know how to write that
> best.
Indeed, we still seem to be converging on that.
> How about the below version?
Much better! Might even be correct. ;-)
Thanx, Paul
> ---
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -61,19 +61,20 @@ static void perf_output_put_handle(struc
> *
> * kernel user
> *
> - * READ ->data_tail READ ->data_head
> - * smp_mb() (A) smp_rmb() (C)
> - * WRITE $data READ $data
> - * smp_wmb() (B) smp_mb() (D)
> - * STORE ->data_head WRITE ->data_tail
> + * if (LOAD ->data_tail) { LOAD ->data_head
> + * (A) smp_rmb() (C)
> + * STORE $data LOAD $data
> + * smp_wmb() (B) smp_mb() (D)
> + * STORE ->data_head STORE ->data_tail
> + * }
> *
> * Where A pairs with D, and B pairs with C.
> *
> - * I don't think A needs to be a full barrier because we won't in fact
> - * write data until we see the store from userspace. So we simply don't
> - * issue the data WRITE until we observe it. Be conservative for now.
> + * In our case (A) is a control dependency that separates the load of
> + * the ->data_tail and the stores of $data. In case ->data_tail
> + * indicates there is no room in the buffer to store $data we do not.
> *
> - * OTOH, D needs to be a full barrier since it separates the data READ
> + * D needs to be a full barrier since it separates the data READ
> * from the tail WRITE.
> *
> * For B a WMB is sufficient since it separates two WRITEs, and for C
> @@ -81,7 +82,7 @@ static void perf_output_put_handle(struc
> *
> * See perf_output_begin().
> */
> - smp_wmb();
> + smp_wmb(); /* B, matches C */
> rb->user_page->data_head = head;
>
> /*
> @@ -144,17 +145,26 @@ int perf_output_begin(struct perf_output
> if (!rb->overwrite &&
> unlikely(CIRC_SPACE(head, tail, perf_data_size(rb)) < size))
> goto fail;
> +
> + /*
> + * The above forms a control dependency barrier separating the
> + * @tail load above from the data stores below. Since the @tail
> + * load is required to compute the branch to fail below.
> + *
> + * A, matches D; the full memory barrier userspace SHOULD issue
> + * after reading the data and before storing the new tail
> + * position.
> + *
> + * See perf_output_put_handle().
> + */
> +
> head += size;
> } while (local_cmpxchg(&rb->head, offset, head) != offset);
>
> /*
> - * Separate the userpage->tail read from the data stores below.
> - * Matches the MB userspace SHOULD issue after reading the data
> - * and before storing the new tail position.
> - *
> - * See perf_output_put_handle().
> + * We rely on the implied barrier() by local_cmpxchg() to ensure
> + * none of the data stores below can be lifted up by the compiler.
> */
> - smp_mb();
>
> if (unlikely(head - local_read(&rb->wakeup) > rb->watermark))
> local_add(rb->watermark, &rb->wakeup);
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC] Control dependencies
2013-11-22 17:59 ` Peter Hurley
@ 2013-11-25 9:42 ` Peter Zijlstra
0 siblings, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2013-11-25 9:42 UTC (permalink / raw)
To: Peter Hurley
Cc: Paul E. McKenney, linux-kernel, Will Deacon, Tim Chen,
Ingo Molnar, Andrew Morton, Thomas Gleixner, linux-arch,
Linus Torvalds, Waiman Long, Andrea Arcangeli, Alex Shi,
Andi Kleen, Michel Lespinasse, Davidlohr Bueso, Matthew R Wilcox,
Dave Hansen, Rik van Riel, Raghavendra K T, George Spelvin,
H. Peter Anvin, Arnd Bergmann, Aswin Chandramouleeswaran,
Scott J Norton, Figo.zhang
On Fri, Nov 22, 2013 at 12:59:41PM -0500, Peter Hurley wrote:
> On 11/22/2013 08:46 AM, Peter Zijlstra wrote:
> >How about the below version?
> >
> >---
> >--- a/kernel/events/ring_buffer.c
> >+++ b/kernel/events/ring_buffer.c
> >@@ -61,19 +61,20 @@ static void perf_output_put_handle(struc
> > *
> > * kernel user
> > *
> >- * READ ->data_tail READ ->data_head
> >- * smp_mb() (A) smp_rmb() (C)
> >- * WRITE $data READ $data
> >- * smp_wmb() (B) smp_mb() (D)
> >- * STORE ->data_head WRITE ->data_tail
> >+ * if (LOAD ->data_tail) { LOAD ->data_head
> >+ * (A) smp_rmb() (C)
> >+ * STORE $data LOAD $data
> >+ * smp_wmb() (B) smp_mb() (D)
> >+ * STORE ->data_head STORE ->data_tail
>
>
> I wasn't subscribed to linux-arch so missed the smp_store_release()
> outcome, if there was one.
>
> Are (B) and (D) still slated for changing to STORE.rel semantics,
> aka smp_store_release()?
The earlier proposal would have A and C be smp_load_acquire() and B and
D be smp_store_release().
> I realize that, for the perf ring buffer, (D) is in userspace but
> I'm also interested in non-perf situations where (D) would be in the
> kernel.
So we're still debating the exact semantics of smp_store_release(), it
now looks like it needs a heavier memory barrier than previously
thought. In which case using it wouldn't make sense for me anymore.
Note that C and D are in userspace and not in any hot path (usually)
They're only issued once to read an entire buffer backlog at once, so I
don't really care about them all that much.
A and B otoh are in kernel space and are issued for every single event
written, so I'm interested to get them as cheaply as possible.
With this proposed patch, we remove a full barrier, with the earlier
smp_load_acquire() / smp_store_release() patches we would only
downgrade the full barrier to an acquire barrier, which is still more
than no barrier at all.
And now it looks like the smp_store_release() would actually upgrade the
wmb to a full barrier on some systems at least.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2013-11-25 9:43 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-11-21 16:17 [RFC] Control dependencies Peter Zijlstra
2013-11-21 18:02 ` Paul E. McKenney
2013-11-21 19:28 ` Paul E. McKenney
2013-11-22 13:46 ` Peter Zijlstra
2013-11-22 17:59 ` Peter Hurley
2013-11-25 9:42 ` Peter Zijlstra
2013-11-22 18:16 ` Paul E. McKenney
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox