From mboxrd@z Thu Jan 1 00:00:00 1970 From: Daniel Borkmann Subject: Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb} Date: Fri, 19 Oct 2018 12:37:18 +0200 Message-ID: References: <20181017144156.16639-1-daniel@iogearbox.net> <20181017144156.16639-3-daniel@iogearbox.net> <20181017155050.GM3121@hirez.programming.kicks-ass.net> <55f86215-44a8-2bb8-b1d0-a77a142dc697@iogearbox.net> <20181018081434.GT3121@hirez.programming.kicks-ass.net> <20181019094417.GE3121@hirez.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Cc: alexei.starovoitov@gmail.com, paulmck@linux.vnet.ibm.com, will.deacon@arm.com, acme@redhat.com, yhs@fb.com, john.fastabend@gmail.com, netdev@vger.kernel.org To: Peter Zijlstra Return-path: Received: from www62.your-server.de ([213.133.104.62]:60570 "EHLO www62.your-server.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726698AbeJSSnC (ORCPT ); Fri, 19 Oct 2018 14:43:02 -0400 In-Reply-To: <20181019094417.GE3121@hirez.programming.kicks-ass.net> Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 10/19/2018 11:44 AM, Peter Zijlstra wrote: > On Thu, Oct 18, 2018 at 05:04:34PM +0200, Daniel Borkmann wrote: >> diff --git a/tools/include/linux/ring_buffer.h b/tools/include/linux/ring_buffer.h >> new file mode 100644 >> index 0000000..48200e0 >> --- /dev/null >> +++ b/tools/include/linux/ring_buffer.h >> @@ -0,0 +1,69 @@ >> +#ifndef _TOOLS_LINUX_RING_BUFFER_H_ >> +#define _TOOLS_LINUX_RING_BUFFER_H_ >> + >> +#include >> +#include >> + >> +/* >> + * Below barriers pair as follows (kernel/events/ring_buffer.c): >> + * >> + * Since the mmap() consumer (userspace) can run on a different CPU: >> + * >> + * kernel user >> + * >> + * if (LOAD ->data_tail) { LOAD ->data_head >> + * (A) smp_rmb() (C) >> + * STORE $data LOAD $data >> + * smp_wmb() (B) smp_mb() (D) >> + * STORE ->data_head STORE ->data_tail >> + * } >> + * >> + * Where A pairs with D, and B pairs with C. >> + * >> + * In our case A is a control dependency that separates the load >> + * of the ->data_tail and the stores of $data. In case ->data_tail >> + * indicates there is no room in the buffer to store $data we do not. >> + * >> + * D needs to be a full barrier since it separates the data READ >> + * from the tail WRITE. >> + * >> + * For B a WMB is sufficient since it separates two WRITEs, and for >> + * C an RMB is sufficient since it separates two READs. >> + */ >> + >> +/* >> + * Note, instead of B, C, D we could also use smp_store_release() >> + * in B and D as well as smp_load_acquire() in C. However, this >> + * optimization makes sense not for all architectures since it >> + * would resolve into READ_ONCE() + smp_mb() pair for smp_load_acquire() >> + * and smp_mb() + WRITE_ONCE() pair for smp_store_release(), thus >> + * for those smp_wmb() in B and smp_rmb() in C would still be less >> + * expensive. For the case of D this has either the same cost or >> + * is less expensive. For example, due to TSO (total store order), >> + * x86 can avoid the CPU barrier entirely. >> + */ >> + >> +static inline u64 ring_buffer_read_head(struct perf_event_mmap_page *base) >> +{ >> +/* >> + * Architectures where smp_load_acquire() does not fallback to >> + * READ_ONCE() + smp_mb() pair. >> + */ >> +#if defined(__x86_64__) || defined(__aarch64__) || defined(__powerpc64__) || \ >> + defined(__ia64__) || defined(__sparc__) && defined(__arch64__) >> + return smp_load_acquire(&base->data_head); >> +#else >> + u64 head = READ_ONCE(base->data_head); >> + >> + smp_rmb(); >> + return head; >> +#endif >> +} >> + >> +static inline void ring_buffer_write_tail(struct perf_event_mmap_page *base, >> + u64 tail) >> +{ >> + smp_store_release(&base->data_tail, tail); >> +} >> + >> +#endif /* _TOOLS_LINUX_RING_BUFFER_H_ */ > > (for the whole patch, but in particular the above) > > Acked-by: Peter Zijlstra (Intel) Great, thanks a lot, Peter! Will flush out v2 in a bit.