From: Francis Laniel <flaniel@linux.microsoft.com>
To: bpf@vger.kernel.org
Cc: linux-kernel@vger.kernel.org,
	Francis Laniel <flaniel@linux.microsoft.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Joanne Koong <joannelkoong@gmail.com>,
	Dave Marchevsky <davemarchevsky@fb.com>,
	Lorenzo Bianconi <lorenzo@kernel.org>,
	Geliang Tang <geliang.tang@suse.com>,
	Hengqi Chen <hengqi.chen@gmail.com>
Subject: [RFC PATCH v1 3/3] libbpf: Make bpf ring buffer overwritable.
Date: Wed, 10 Aug 2022 19:16:54 +0200
Message-ID: <20220810171702.74932-4-flaniel@linux.microsoft.com>
In-Reply-To: <20220810171702.74932-1-flaniel@linux.microsoft.com>

This patch allows userspace to use the overwritable BPF ring buffer
feature.

Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
---
 tools/include/uapi/linux/bpf.h |  3 +++
 tools/lib/bpf/ringbuf.c        | 35 +++++++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index ef78e0e1a754..19c7039265d8 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1226,6 +1226,9 @@ enum {

 /* Create a map that is suitable to be an inner map with dynamic max entries */
 	BPF_F_INNER_MAP		= (1U << 12),
+
+/* Create an overwritable BPF_RINGBUF */
+	BFP_F_RB_OVER_WRITABLE	= (1U << 13),
 };

 /* Flags for BPF_PROG_QUERY. */
diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index 8bc117bcc7bc..2bd584f7250b 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -23,6 +23,8 @@

 struct ring {
 	ring_buffer_sample_fn sample_cb;
+	__u8 over_writable: 1,
+	     __reserved:    7;
 	void *ctx;
 	void *data;
 	unsigned long *consumer_pos;
@@ -95,6 +97,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
 	r->sample_cb = sample_cb;
 	r->ctx = ctx;
 	r->mask = info.max_entries - 1;
+	r->over_writable = !!(info.map_flags & BFP_F_RB_OVER_WRITABLE);

 	/* Map writable consumer page */
 	tmp = mmap(NULL, rb->page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
@@ -202,6 +205,11 @@ static inline int roundup_len(__u32 len)
 	return (len + 7) / 8 * 8;
 }

+static inline bool is_over_writable(struct ring *r)
+{
+	return !!r->over_writable;
+}
+
 static int64_t ringbuf_process_ring(struct ring* r)
 {
 	int *len_ptr, len, err;
@@ -209,12 +217,25 @@ static int64_t ringbuf_process_ring(struct ring* r)
 	int64_t cnt = 0;
 	unsigned long cons_pos, prod_pos;
 	bool got_new_data;
+	int rounded_len;
 	void *sample;

 	cons_pos = smp_load_acquire(r->consumer_pos);
 	do {
 		got_new_data = false;
 		prod_pos = smp_load_acquire(r->producer_pos);
+
+		/*
+		 * If the difference between the producer position and that of
+		 * the consumer is higher than the buffer size, it means the
+		 * producer already looped over the buffer.
+		 * So, data at the consumer position was already overwritten.
+		 * We can then bump consumer position to be that of the producer
+		 * minus the buffer size.
+		 */
+		if (is_over_writable(r) && prod_pos - cons_pos > r->mask)
+			cons_pos = prod_pos - (r->mask + 1);
+
 		while (cons_pos < prod_pos) {
 			len_ptr = r->data + (cons_pos & r->mask);
 			len = smp_load_acquire(len_ptr);
@@ -224,7 +245,19 @@ static int64_t ringbuf_process_ring(struct ring* r)
 				goto done;

 			got_new_data = true;
-			cons_pos += roundup_len(len);
+			rounded_len = roundup_len(len);
+			cons_pos += rounded_len;
+
+			/*
+			 * rounded_len is a multiple of 8, but a multiple of 8
+			 * is not necessarily a multiple of 4096.
+			 * So, we need to round again to avoid writing at new
+			 * places.
+			 * See kernel implementation for more details.
+			 */
+			if (is_over_writable(r)) {
+				cons_pos -= (cons_pos & r->mask) % rounded_len;
+			}

 			if ((len & BPF_RINGBUF_DISCARD_BIT) == 0) {
 				sample = (void *)len_ptr + BPF_RINGBUF_HDR_SZ;
--
2.25.1

