Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb}

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Daniel Borkmann <daniel@iogearbox.net>
To: Peter Zijlstra <peterz@infradead.org>
Cc: alexei.starovoitov@gmail.com, paulmck@linux.vnet.ibm.com,
	will.deacon@arm.com, acme@redhat.com, yhs@fb.com,
	john.fastabend@gmail.com, netdev@vger.kernel.org
Subject: Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb}
Date: Fri, 19 Oct 2018 12:37:18 +0200	[thread overview]
Message-ID: <e98e4ba1-109c-bae9-16db-5cfad88d4f64@iogearbox.net> (raw)
In-Reply-To: <20181019094417.GE3121@hirez.programming.kicks-ass.net>

On 10/19/2018 11:44 AM, Peter Zijlstra wrote:
> On Thu, Oct 18, 2018 at 05:04:34PM +0200, Daniel Borkmann wrote:
>> diff --git a/tools/include/linux/ring_buffer.h b/tools/include/linux/ring_buffer.h
>> new file mode 100644
>> index 0000000..48200e0
>> --- /dev/null
>> +++ b/tools/include/linux/ring_buffer.h
>> @@ -0,0 +1,69 @@
>> +#ifndef _TOOLS_LINUX_RING_BUFFER_H_
>> +#define _TOOLS_LINUX_RING_BUFFER_H_
>> +
>> +#include <linux/compiler.h>
>> +#include <asm/barrier.h>
>> +
>> +/*
>> + * Below barriers pair as follows (kernel/events/ring_buffer.c):
>> + *
>> + * Since the mmap() consumer (userspace) can run on a different CPU:
>> + *
>> + *   kernel                             user
>> + *
>> + *   if (LOAD ->data_tail) {            LOAD ->data_head
>> + *                      (A)             smp_rmb()       (C)
>> + *      STORE $data                     LOAD $data
>> + *      smp_wmb()       (B)             smp_mb()        (D)
>> + *      STORE ->data_head               STORE ->data_tail
>> + *   }
>> + *
>> + * Where A pairs with D, and B pairs with C.
>> + *
>> + * In our case A is a control dependency that separates the load
>> + * of the ->data_tail and the stores of $data. In case ->data_tail
>> + * indicates there is no room in the buffer to store $data we do not.
>> + *
>> + * D needs to be a full barrier since it separates the data READ
>> + * from the tail WRITE.
>> + *
>> + * For B a WMB is sufficient since it separates two WRITEs, and for
>> + * C an RMB is sufficient since it separates two READs.
>> + */
>> +
>> +/*
>> + * Note, instead of B, C, D we could also use smp_store_release()
>> + * in B and D as well as smp_load_acquire() in C. However, this
>> + * optimization makes sense not for all architectures since it
>> + * would resolve into READ_ONCE() + smp_mb() pair for smp_load_acquire()
>> + * and smp_mb() + WRITE_ONCE() pair for smp_store_release(), thus
>> + * for those smp_wmb() in B and smp_rmb() in C would still be less
>> + * expensive. For the case of D this has either the same cost or
>> + * is less expensive. For example, due to TSO (total store order),
>> + * x86 can avoid the CPU barrier entirely.
>> + */
>> +
>> +static inline u64 ring_buffer_read_head(struct perf_event_mmap_page *base)
>> +{
>> +/*
>> + * Architectures where smp_load_acquire() does not fallback to
>> + * READ_ONCE() + smp_mb() pair.
>> + */
>> +#if defined(__x86_64__) || defined(__aarch64__) || defined(__powerpc64__) || \
>> +    defined(__ia64__) || defined(__sparc__) && defined(__arch64__)
>> +	return smp_load_acquire(&base->data_head);
>> +#else
>> +	u64 head = READ_ONCE(base->data_head);
>> +
>> +	smp_rmb();
>> +	return head;
>> +#endif
>> +}
>> +
>> +static inline void ring_buffer_write_tail(struct perf_event_mmap_page *base,
>> +					  u64 tail)
>> +{
>> +	smp_store_release(&base->data_tail, tail);
>> +}
>> +
>> +#endif /* _TOOLS_LINUX_RING_BUFFER_H_ */
> 
> (for the whole patch, but in particular the above)
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Great, thanks a lot, Peter! Will flush out v2 in a bit.

next prev parent reply	other threads:[~2018-10-19 18:43 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-17 14:41 [PATCH bpf-next 0/3] improve and fix barriers for walking perf rb Daniel Borkmann
2018-10-17 14:41 ` [PATCH bpf-next 1/3] tools: add smp_* barrier variants to include infrastructure Daniel Borkmann
2018-10-17 14:41 ` [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb} Daniel Borkmann
2018-10-17 15:50   ` Peter Zijlstra
2018-10-17 23:10     ` Daniel Borkmann
2018-10-18  8:14       ` Peter Zijlstra
2018-10-18 15:04         ` Daniel Borkmann
2018-10-18 15:33           ` Alexei Starovoitov
2018-10-18 19:00             ` Daniel Borkmann
2018-10-19  3:53               ` Alexei Starovoitov
2018-10-19 11:02                 ` Will Deacon
2018-10-19 11:56                   ` Paul E. McKenney
2018-10-19  8:04             ` Peter Zijlstra
2018-10-19  9:44           ` Peter Zijlstra
2018-10-19 10:37             ` Daniel Borkmann [this message]
2018-10-17 14:41 ` [PATCH bpf-next 3/3] bpf, libbpf: use proper barriers in perf ring buffer walk Daniel Borkmann
2018-10-17 15:51   ` Peter Zijlstra
2018-10-17 15:03 ` [PATCH bpf-next 0/3] improve and fix barriers for walking perf rb Arnaldo Carvalho de Melo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e98e4ba1-109c-bae9-16db-5cfad88d4f64@iogearbox.net \
    --to=daniel@iogearbox.net \
    --cc=acme@redhat.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=john.fastabend@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=will.deacon@arm.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).