From mboxrd@z Thu Jan  1 00:00:00 1970
From: Waiman Long
Subject: Re: [RFC][PATCH 1/3] locking: Introduce smp_acquire__after_ctrl_dep
Date: Wed, 25 May 2016 11:20:42 -0400
Message-ID: <5745C2CA.4040003@hpe.com>
References: <20160524142723.178148277@infradead.org> <20160524143649.523586684@infradead.org> <57451581.6000700@hpe.com> <20160525045329.GQ4148@linux.vnet.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Peter Zijlstra , , , , , , , , , , , , , ,
To:
Return-path:
In-Reply-To: <20160525045329.GQ4148@linux.vnet.ibm.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: netfilter-devel.vger.kernel.org

On 05/25/2016 12:53 AM, Paul E. McKenney wrote:
> On Tue, May 24, 2016 at 11:01:21PM -0400, Waiman Long wrote:
>> On 05/24/2016 10:27 AM, Peter Zijlstra wrote:
>>> Introduce smp_acquire__after_ctrl_dep(), this construct is not
>>> uncommen, but the lack of this barrier is.
>>>
>>> Signed-off-by: Peter Zijlstra (Intel)
>>> ---
>>>   include/linux/compiler.h |   14 ++++++++++----
>>>   ipc/sem.c                |   14 ++------------
>>>   2 files changed, 12 insertions(+), 16 deletions(-)
>>>
>>> --- a/include/linux/compiler.h
>>> +++ b/include/linux/compiler.h
>>> @@ -305,20 +305,26 @@ static __always_inline void __write_once
>>>   })
>>>
>>>   /**
>>> + * smp_acquire__after_ctrl_dep() - Provide ACQUIRE ordering after a control dependency
>>> + *
>>> + * A control dependency provides a LOAD->STORE order, the additional RMB
>>> + * provides LOAD->LOAD order, together they provide LOAD->{LOAD,STORE} order,
>>> + * aka. ACQUIRE.
>>> + */
>>> +#define smp_acquire__after_ctrl_dep()	smp_rmb()
>>> +
>>> +/**
>>>   * smp_cond_acquire() - Spin wait for cond with ACQUIRE ordering
>>>   * @cond: boolean expression to wait for
>>>   *
>>>   * Equivalent to using smp_load_acquire() on the condition variable but employs
>>>   * the control dependency of the wait to reduce the barrier on many platforms.
>>>   *
>>> - * The control dependency provides a LOAD->STORE order, the additional RMB
>>> - * provides LOAD->LOAD order, together they provide LOAD->{LOAD,STORE} order,
>>> - * aka. ACQUIRE.
>>>   */
>>>   #define smp_cond_acquire(cond)	do {		\
>>>   	while (!(cond))				\
>>>   		cpu_relax();			\
>>> -	smp_rmb(); /* ctrl + rmb := acquire */	\
>>> +	smp_acquire__after_ctrl_dep();		\
>>>   } while (0)
>>>
>>>
>> I have a question about the claim that control dependence + rmb is
>> equivalent to an acquire memory barrier. For example,
>>
>> S1:	if (a)
>> S2:		b = 1;
>> 	smp_rmb()
>> S3:	c = 2;
>>
>> Since c is independent of both a and b, is it possible that the cpu
>> may reorder to execute store statement S3 first before S1 and S2?
> The CPUs I know of won't do, nor should the compiler, at least assuming
> "a" (AKA "cond") includes READ_ONCE(). Ditto "b" and WRITE_ONCE().
> Otherwise, the compiler could do quite a few "interesting" things,
> especially if it knows the value of "b". For example, if the compiler
> knows that b==1, without the volatile casts, the compiler could just
> throw away both S1 and S2, eliminating any ordering. This can get
> quite tricky -- see memory-barriers.txt for more mischief.
>
> The smp_rmb() is not needed in this example because S3 is a write, not
> a read. Perhaps you meant something more like this:
>
> 	if (READ_ONCE(a))
> 		WRITE_ONCE(b, 1);
> 	smp_rmb();
> 	r1 = READ_ONCE(c);
>
> This sequence would guarantee that "a" was read before "c".
>
> 							Thanx, Paul
>

The smp_rmb() in Linux should be a compiler barrier.
So the compiler should not reorder it above the smp_rmb(). However, what I am
wondering is whether a control dependency + rmb combination can be considered
a real acquire memory barrier from the CPU's point of view, which would
require that the CPU cannot reorder the data store in S3 above S1 and S2.
That is the part I am not so sure about; a concrete sketch of the pattern I
have in mind is appended below.

Cheers,
Longman
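
For concreteness, this is roughly the pattern in question, rewritten with the
volatile casts Paul mentioned. It is purely illustrative -- a, b and c are
just the placeholder variables from the example above, not actual kernel
code:

	if (READ_ONCE(a))		/* S1: load heading the control dependency */
		WRITE_ONCE(b, 1);	/* S2: store, ordered after S1 by the ctrl dep */
	smp_rmb();			/* the "ctrl + rmb := acquire" barrier */
	WRITE_ONCE(c, 2);		/* S3: can the CPU hoist this store above S1/S2? */

With smp_load_acquire() on a, the store to c clearly could not be reordered
before the load of a; the question is whether the control dependency plus
smp_rmb() gives that same LOAD->STORE guarantee for S3.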