From: Yonghong Song <yhs@fb.com>
To: Jon Doron <arilou@gmail.com>,
bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
daniel@iogearbox.net
Cc: Jon Doron <jond@wiz.io>
Subject: Re: [PATCH bpf-next v3 1/1] libbpf: perfbuf: Add API to get the ring buffer
Date: Fri, 15 Jul 2022 09:58:39 -0700
Message-ID: <36d47140-b144-0f72-d79c-18b8f3d3be5e@fb.com>
In-Reply-To: <20220715141835.93513-2-arilou@gmail.com>
On 7/15/22 7:18 AM, Jon Doron wrote:
> From: Jon Doron <jond@wiz.io>
>
> Add support for writing a custom event reader, by exposing the ring
> buffer.
>
> A few simple examples of where this is needed:
> 1. perf_event_read_simple allocates using malloc; perhaps you want
> to handle the wrap-around in some other way.
> 2. Since the perf buffer is per-CPU, the order of the events is not
> guaranteed, for example:
> Given 3 events where each event has a timestamp t0 < t1 < t2,
> and the events are spread over more than 1 CPU, then we can end
> up with the following state in the ring buf:
> CPU[0] => [t0, t2]
> CPU[1] => [t1]
> When you consume the events from CPU[0], you can tell that t1 is
> missing (assuming there are no drops, and your event data
> contains a sequential index).
> So now one can simply do the following: for CPU[0], store
> the addresses of t0 and t2 in an array (without moving the tail, so
> the data is not overwritten), then move on to CPU[1] and set the
> address of t1 in the same array.
> So you end up with something like:
> void **arr[] = [&t0, &t1, &t2], now you can consume it in order
> and move the tails as you process each event.
> 3. Assuming there are multiple CPUs and we want to start draining the
> messages from them, then we can "pick" which one to start with
> according to the remaining free space in the ring buffer.
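
For reference, the in-order consumption described in item 2 could be sketched
roughly as follows. This is a standalone simulation, not libbpf code: struct
event, struct cpu_queue, pick_next() and drain_in_order() are hypothetical
names, and the per-queue tail index only stands in for the ring's data_tail.
It assumes no drops and a sequential index in every event, as the commit
message does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event layout: only the sequential index matters here. */
struct event {
	uint64_t seq;
};

/* Stand-in for one per-CPU ring: pending events plus a consumer cursor
 * that plays the role of the ring's tail. */
struct cpu_queue {
	const struct event *events;
	size_t cnt;
	size_t tail;
};

/* Return the index of the queue whose next unconsumed event has the
 * smallest sequence number, or -1 when every queue is drained. */
static int pick_next(const struct cpu_queue *q, size_t nr_cpus)
{
	int best = -1;
	size_t i;

	for (i = 0; i < nr_cpus; i++) {
		if (q[i].tail == q[i].cnt)
			continue;
		if (best < 0 ||
		    q[i].events[q[i].tail].seq < q[best].events[q[best].tail].seq)
			best = (int)i;
	}
	return best;
}

/* Hand out events in global sequence order. A queue's tail advances only
 * once its event has been handed out. Returns the number of events
 * written to out. */
static size_t drain_in_order(struct cpu_queue *q, size_t nr_cpus,
			     uint64_t *out, size_t out_cap)
{
	size_t n = 0;
	int cpu;

	assert(q != NULL && out != NULL);
	while (n < out_cap && (cpu = pick_next(q, nr_cpus)) >= 0) {
		out[n++] = q[cpu].events[q[cpu].tail].seq;
		q[cpu].tail++;
	}
	return n;
}
```

With CPU[0] => [t0, t2] and CPU[1] => [t1], drain_in_order() hands out t0, t1,
t2 and leaves both tails at the end of their queues. Picking the smallest
pending index at each step avoids the intermediate pointer array, at the cost
of scanning every queue head per event.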
>
> Signed-off-by: Jon Doron <jond@wiz.io>
> ---
> tools/lib/bpf/libbpf.c | 26 ++++++++++++++++++++++++++
> tools/lib/bpf/libbpf.h | 2 ++
> tools/lib/bpf/libbpf.map | 1 +
> 3 files changed, 29 insertions(+)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index e89cc9c885b3..250263812194 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -12485,6 +12485,32 @@ int perf_buffer__buffer_fd(const struct perf_buffer *pb, size_t buf_idx)
> return cpu_buf->fd;
> }
>
> +/*
> + * Return the memory region of a ring buffer in *buf_idx* slot of
> + * PERF_EVENT_ARRAY BPF map. This ring buffer can be used to implement
> + * a custom events consumer.
> + * The ring buffer starts with the *struct perf_event_mmap_page*, which
> + * holds the ring buffer management fields. When accessing the header
> + * structure it's important to be SMP aware.
> + * You can refer to *perf_event_read_simple* for a simple example.
> + */
> +int perf_buffer__buffer(struct perf_buffer *pb, int buf_idx, void **buf,
> + size_t *buf_size)
> +{
> + struct perf_cpu_buf *cpu_buf;
> +
> + if (buf_idx >= pb->cpu_cnt)
> + return libbpf_err(-EINVAL);
> +
> + cpu_buf = pb->cpu_bufs[buf_idx];
> + if (!cpu_buf)
> + return libbpf_err(-ENOENT);
> +
> + *buf = cpu_buf->base;
> + *buf_size = pb->mmap_size;
> + return 0;
> +}
> +
> /*
> * Consume data from perf ring buffer corresponding to slot *buf_idx* in
> * PERF_EVENT_ARRAY BPF map without waiting/polling. If there is no data to
> diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> index 9e9a3fd3edd8..78a7ab8f610a 100644
> --- a/tools/lib/bpf/libbpf.h
> +++ b/tools/lib/bpf/libbpf.h
> @@ -1381,6 +1381,8 @@ LIBBPF_API int perf_buffer__consume(struct perf_buffer *pb);
> LIBBPF_API int perf_buffer__consume_buffer(struct perf_buffer *pb, size_t buf_idx);
> LIBBPF_API size_t perf_buffer__buffer_cnt(const struct perf_buffer *pb);
> LIBBPF_API int perf_buffer__buffer_fd(const struct perf_buffer *pb, size_t buf_idx);
> +LIBBPF_API int perf_buffer__buffer(struct perf_buffer *pb, int buf_idx, void **buf,
> + size_t *buf_size);
>
> typedef enum bpf_perf_event_ret
> (*bpf_perf_event_print_t)(struct perf_event_header *hdr,
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index 52973cffc20c..971072c6dfd8 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -458,6 +458,7 @@ LIBBPF_0.8.0 {
> bpf_program__set_insns;
> libbpf_register_prog_handler;
> libbpf_unregister_prog_handler;
> + perf_buffer__buffer;
You cannot add to LIBBPF_0.8.0, which has already been released.
Please add it to LIBBPF_1.0.0 instead.
> } LIBBPF_0.7.0;
>
> LIBBPF_1.0.0 {
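
As a rough illustration of the custom reader that the new API comment alludes
to, here is a sketch of copying one record out of the data area into a
caller-supplied buffer, stitching it together when it wraps, instead of
allocating with malloc. It is an assumption-laden sketch, not libbpf code:
struct rec_hdr is a local mirror of struct perf_event_header from
<linux/perf_event.h>, copy_wrapped() and ring_peek() are hypothetical helpers,
and data is assumed to point one page past the returned buffer, which is how
perf_event_read_simple treats it. The data area size is assumed to be a power
of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the 8-byte perf record header (type, misc, size). */
struct rec_hdr {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size, header included */
};

/* Copy len bytes starting at ring position pos, wrapping at data_size. */
static void copy_wrapped(const uint8_t *data, uint64_t data_size,
			 uint64_t pos, void *dst, size_t len)
{
	uint64_t off = pos & (data_size - 1);
	size_t first = (size_t)(data_size - off);

	if (first > len)
		first = len;
	memcpy(dst, data + off, first);
	memcpy((uint8_t *)dst + first, data, len - first);
}

/*
 * Copy the record at tail into dst without consuming it.
 * Returns the record size, 0 if no complete record is available, or -1 if
 * dst is too small. In a real consumer, head comes from an acquire load of
 * perf_event_mmap_page.data_head, and the caller publishes the advanced
 * tail to data_tail with a release store once it is done with the record.
 */
static int ring_peek(const uint8_t *data, uint64_t data_size,
		     uint64_t head, uint64_t tail,
		     void *dst, size_t dst_cap)
{
	struct rec_hdr hdr;

	assert(data_size != 0 && (data_size & (data_size - 1)) == 0);
	if (head - tail < sizeof(hdr))
		return 0;
	copy_wrapped(data, data_size, tail, &hdr, sizeof(hdr));
	if (hdr.size < sizeof(hdr) || hdr.size > head - tail)
		return 0;
	if (hdr.size > dst_cap)
		return -1;
	copy_wrapped(data, data_size, tail, dst, hdr.size);
	return hdr.size;
}
```

Because ring_peek() leaves the tail untouched, the caller decides when the
record may be overwritten, which is what the ordering use case in the commit
message relies on.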