[PATCH bpf 0/6] libbpf: Fix ring buffer consumption

BPF List
 help / color / mirror / Atom feed

* [PATCH bpf 0/6] libbpf: Fix ring buffer consumption
@ 2026-06-14  1:48 Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 1/6] libbpf: ringbuf: Honor zero consume bounds Tamir Duberstein
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

Fix several correctness issues in libbpf's ring buffer consumer.

A zero record bound currently consumes one record. A NULL callback is
accepted during manager construction but crashes when callback-based
consumption reaches the ring. Position counters stop consumption after
wrapping because they are compared by magnitude.

The consumer can also miss a readiness notification after publishing its
position and checking for new data without a full StoreLoad barrier. Use
compiler atomics and add the missing barrier, including when retrying a
busy record after publishing earlier records.

Callback traversal does not follow the overwrite position maintained by
BPF_F_RB_OVERWRITE maps. Reject callback consumption of those maps, as
discussed here:
https://lore.kernel.org/bpf/CAEf4Bzaq5drHWChXoRBnrmkb6reAsSVj8r=uByFSup31FMA7hw@mail.gmail.com/

Andrew Werner found the position-wrap and missed-wakeup failures while
implementing Aya's ring buffer reader. Aya's original implementation
contains the equality reasoning and edge-triggered regression test:
https://github.com/aya-rs/aya/commit/e2cf734490bc188bcedb1eac92d23d81123e42cd

Aya later corrected the consumer ordering with the same explicit fence:
https://github.com/aya-rs/aya/commit/7277a57ea8cdb74918d3096a4b22b6d814481973

Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
Tamir Duberstein (6):
      libbpf: ringbuf: Honor zero consume bounds
      libbpf: ringbuf: Prevent NULL callback crash
      libbpf: ringbuf: Handle position counter wrap
      libbpf: ringbuf: Use compiler atomics
      libbpf: ringbuf: Prevent missed wakeups
      libbpf: ringbuf: Reject overwrite callback use

 tools/lib/bpf/libbpf.h                           |  34 +++-
 tools/lib/bpf/ringbuf.c                          |  84 +++++++--
 tools/testing/selftests/bpf/prog_tests/ringbuf.c | 229 +++++++++++++++++++++++
 3 files changed, 323 insertions(+), 24 deletions(-)
---
base-commit: e7ae89a0c97ce2b68b0983cd01eda67cf373517d
change-id: 20260613-bpf-ringbuf-fixes-e9a8b3c6125b

Best regards,
--  
Tamir Duberstein <tamird@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH bpf 1/6] libbpf: ringbuf: Honor zero consume bounds
  2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
@ 2026-06-14  1:48 ` Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 2/6] libbpf: ringbuf: Prevent NULL callback crash Tamir Duberstein
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

ringbuf_process_ring() checks the record bound only after advancing the
consumer position and invoking the callback. A zero bound therefore
consumes the first available record.

Return before reading the ring positions when the bound is zero so
ring_buffer__consume_n() and ring__consume_n() leave all records queued.

Fixes: 4d22ea94ea33 ("libbpf: Add ring__consume_n / ring_buffer__consume_n")
Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
 tools/lib/bpf/ringbuf.c                          |  3 +++
 tools/testing/selftests/bpf/prog_tests/ringbuf.c | 13 +++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index 00ec4837a06d..f2bb619d5a75 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -240,6 +240,9 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 	bool got_new_data;
 	void *sample;
 
+	if (n == 0)
+		return 0;
+
 	cons_pos = smp_load_acquire(r->consumer_pos);
 	do {
 		got_new_data = false;
diff --git a/tools/testing/selftests/bpf/prog_tests/ringbuf.c b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
index 64520684d2cb..4f0558f14847 100644
--- a/tools/testing/selftests/bpf/prog_tests/ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
@@ -404,6 +404,7 @@ static int process_n_sample(void *ctx, void *data, size_t len)
 static void ringbuf_n_subtest(void)
 {
 	struct test_ringbuf_n_lskel *skel_n;
+	struct ring *ring;
 	int err, i;
 
 	skel_n = test_ringbuf_n_lskel__open();
@@ -431,6 +432,18 @@ static void ringbuf_n_subtest(void)
 	for (i = 0; i < N_TOT_SAMPLES; i++)
 		syscall(__NR_getpgid);
 
+	ring = ring_buffer__ring(ringbuf, 0);
+	if (!ASSERT_OK_PTR(ring, "ring_buffer__ring"))
+		goto cleanup_ringbuf;
+
+	err = ring_buffer__consume_n(ringbuf, 0);
+	if (!ASSERT_EQ(err, 0, "ringbuf_consume_zero"))
+		goto cleanup_ringbuf;
+
+	err = ring__consume_n(ring, 0);
+	if (!ASSERT_EQ(err, 0, "ring_consume_zero"))
+		goto cleanup_ringbuf;
+
 	/* Consume all samples from the ring buffer in batches of N_SAMPLES */
 	for (i = 0; i < N_TOT_SAMPLES; i += err) {
 		err = ring_buffer__consume_n(ringbuf, N_SAMPLES);

-- 
2.55.0.rc0.96.gc050c23164


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bpf 2/6] libbpf: ringbuf: Prevent NULL callback crash
  2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 1/6] libbpf: ringbuf: Honor zero consume bounds Tamir Duberstein
@ 2026-06-14  1:48 ` Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 3/6] libbpf: ringbuf: Handle position counter wrap Tamir Duberstein
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

ring_buffer__new() and ring_buffer__add() allow a NULL sample
callback. When callback-based consumption reaches such a ring, it calls
through the NULL function pointer and crashes.

Validate every ring in a manager before polling or consuming. Return
-EINVAL without consuming records from an earlier valid ring or waiting
for an event. Perform the same check before honoring a zero record bound
so invalid callback consumption consistently reports the error.

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
 tools/lib/bpf/libbpf.h                           | 11 ++-
 tools/lib/bpf/ringbuf.c                          | 41 +++++++++--
 tools/testing/selftests/bpf/prog_tests/ringbuf.c | 93 ++++++++++++++++++++++++
 3 files changed, 134 insertions(+), 11 deletions(-)

diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index bba4e8464396..9ba6b9ad3498 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1526,18 +1526,17 @@ LIBBPF_API int ring__map_fd(const struct ring *r);
  *
  * @param r A ringbuffer object.
  * @return The number of records consumed (or INT_MAX, whichever is less), or
- * a negative number if any of the callbacks return an error.
+ * a negative error code on failure.
  */
 LIBBPF_API int ring__consume(struct ring *r);
 
 /**
- * @brief **ring__consume_n()** consumes up to a requested amount of items from
- * a ringbuffer without event polling.
+ * @brief **ring__consume_n()** consumes up to a requested number of records
+ * from a ringbuffer without event polling.
  *
  * @param r A ringbuffer object.
- * @param n Maximum amount of items to consume.
- * @return The number of items consumed, or a negative number if any of the
- * callbacks return an error.
+ * @param n Maximum number of records to consume.
+ * @return The number of records consumed, or a negative error code on failure.
  */
 LIBBPF_API int ring__consume_n(struct ring *r, size_t n);
 
diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index f2bb619d5a75..ae7fa79b6217 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -231,6 +231,24 @@ static inline int roundup_len(__u32 len)
 	return (len + 7) / 8 * 8;
 }
 
+static int ringbuf_validate(const struct ring *r)
+{
+	return r->sample_cb ? 0 : -EINVAL;
+}
+
+static int ringbuf_validate_callbacks(const struct ring_buffer *rb)
+{
+	int i, err;
+
+	for (i = 0; i < rb->ring_cnt; i++) {
+		err = ringbuf_validate(rb->rings[i]);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 {
 	int *len_ptr, len, err;
@@ -240,6 +258,9 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 	bool got_new_data;
 	void *sample;
 
+	err = ringbuf_validate(r);
+	if (err)
+		return err;
 	if (n == 0)
 		return 0;
 
@@ -284,14 +305,17 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
  * records.
  *
  * Returns number of records consumed across all registered ring buffers (or
- * n, whichever is less), or negative number if any of the callbacks return
- * error.
+ * n, whichever is less), or a negative error code on failure.
  */
 int ring_buffer__consume_n(struct ring_buffer *rb, size_t n)
 {
 	int64_t err, res = 0;
 	int i;
 
+	err = ringbuf_validate_callbacks(rb);
+	if (err)
+		return libbpf_err(err);
+
 	for (i = 0; i < rb->ring_cnt; i++) {
 		struct ring *ring = rb->rings[i];
 
@@ -309,14 +333,17 @@ int ring_buffer__consume_n(struct ring_buffer *rb, size_t n)
 
 /* Consume available ring buffer(s) data without event polling.
  * Returns number of records consumed across all registered ring buffers (or
- * INT_MAX, whichever is less), or negative number if any of the callbacks
- * return error.
+ * INT_MAX, whichever is less), or a negative error code on failure.
  */
 int ring_buffer__consume(struct ring_buffer *rb)
 {
 	int64_t err, res = 0;
 	int i;
 
+	err = ringbuf_validate_callbacks(rb);
+	if (err)
+		return libbpf_err(err);
+
 	for (i = 0; i < rb->ring_cnt; i++) {
 		struct ring *ring = rb->rings[i];
 
@@ -334,13 +361,17 @@ int ring_buffer__consume(struct ring_buffer *rb)
 
 /* Poll for available data and consume records, if any are available.
  * Returns number of records consumed (or INT_MAX, whichever is less), or
- * negative number, if any of the registered callbacks returned error.
+ * a negative error code on failure.
  */
 int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms)
 {
 	int i, cnt;
 	int64_t err, res = 0;
 
+	err = ringbuf_validate_callbacks(rb);
+	if (err)
+		return libbpf_err(err);
+
 	cnt = epoll_wait(rb->epoll_fd, rb->events, rb->ring_cnt, timeout_ms);
 	if (cnt < 0)
 		return libbpf_err(-errno);
diff --git a/tools/testing/selftests/bpf/prog_tests/ringbuf.c b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
index 4f0558f14847..9ce996bcea8c 100644
--- a/tools/testing/selftests/bpf/prog_tests/ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
@@ -401,6 +401,97 @@ static int process_n_sample(void *ctx, void *data, size_t len)
 	return 0;
 }
 
+static int process_noop_sample(void *ctx, void *data, size_t len)
+{
+	return 0;
+}
+
+static void ringbuf_null_cb_subtest(void)
+{
+	struct test_ringbuf_n_lskel *skel_n;
+	struct ring_buffer *ringbuf = NULL;
+	struct ring *ring;
+	unsigned long consumer_pos;
+	int no_cb_map_fd = -1;
+	int err;
+
+	skel_n = test_ringbuf_n_lskel__open();
+	if (!ASSERT_OK_PTR(skel_n, "test_ringbuf_n_lskel__open"))
+		return;
+
+	skel_n->maps.ringbuf.max_entries = getpagesize();
+	skel_n->bss->pid = getpid();
+	skel_n->bss->value = SAMPLE_VALUE;
+
+	err = test_ringbuf_n_lskel__load(skel_n);
+	if (!ASSERT_OK(err, "test_ringbuf_n_lskel__load"))
+		goto cleanup;
+
+	err = test_ringbuf_n_lskel__attach(skel_n);
+	if (!ASSERT_OK(err, "test_ringbuf_n_lskel__attach"))
+		goto cleanup;
+
+	syscall(__NR_getpgid);
+
+	no_cb_map_fd = bpf_map_create(BPF_MAP_TYPE_RINGBUF, NULL, 0, 0,
+				      getpagesize(), NULL);
+	if (!ASSERT_OK_FD(no_cb_map_fd, "bpf_map_create"))
+		goto cleanup;
+
+	/* Manager APIs must validate all rings before consuming any of them. */
+	ringbuf = ring_buffer__new(skel_n->maps.ringbuf.map_fd,
+				   process_noop_sample, NULL, NULL);
+	if (!ASSERT_OK_PTR(ringbuf, "ring_buffer__new"))
+		goto cleanup_fd;
+
+	ring = ring_buffer__ring(ringbuf, 0);
+	if (!ASSERT_OK_PTR(ring, "ring_buffer__ring"))
+		goto cleanup_ringbuf;
+
+	err = ring_buffer__add(ringbuf, no_cb_map_fd, NULL, NULL);
+	if (!ASSERT_OK(err, "ring_buffer__add_no_cb"))
+		goto cleanup_ringbuf;
+
+	consumer_pos = ring__consumer_pos(ring);
+	ASSERT_GT(ring__producer_pos(ring), consumer_pos,
+		  "producer_pos_mixed_cb");
+
+	err = ring_buffer__consume_n(ringbuf, 0);
+	ASSERT_EQ(err, -EINVAL, "ringbuf_consume_zero_mixed_cb");
+	err = ring_buffer__consume(ringbuf);
+	ASSERT_EQ(err, -EINVAL, "ringbuf_consume_mixed_cb");
+	err = ring_buffer__poll(ringbuf, 0);
+	ASSERT_EQ(err, -EINVAL, "ringbuf_poll_mixed_cb");
+	ASSERT_EQ(ring__consumer_pos(ring), consumer_pos,
+		  "consumer_pos_mixed_cb");
+
+	ring_buffer__free(ringbuf);
+	ringbuf =
+		ring_buffer__new(skel_n->maps.ringbuf.map_fd, NULL, NULL, NULL);
+	if (!ASSERT_OK_PTR(ringbuf, "ring_buffer__new_no_cb"))
+		goto cleanup_fd;
+
+	ring = ring_buffer__ring(ringbuf, 0);
+	if (!ASSERT_OK_PTR(ring, "ring_buffer__ring_no_cb"))
+		goto cleanup_ringbuf;
+	consumer_pos = ring__consumer_pos(ring);
+
+	err = ring_buffer__consume_n(ringbuf, 0);
+	ASSERT_EQ(err, -EINVAL, "ringbuf_consume_zero_no_cb");
+	err = ring__consume_n(ring, 0);
+	ASSERT_EQ(err, -EINVAL, "ring_consume_zero_no_cb");
+	err = ring__consume(ring);
+	ASSERT_EQ(err, -EINVAL, "ring_consume_no_cb");
+	ASSERT_EQ(ring__consumer_pos(ring), consumer_pos, "consumer_pos_no_cb");
+
+cleanup_ringbuf:
+	ring_buffer__free(ringbuf);
+cleanup_fd:
+	close(no_cb_map_fd);
+cleanup:
+	test_ringbuf_n_lskel__destroy(skel_n);
+}
+
 static void ringbuf_n_subtest(void)
 {
 	struct test_ringbuf_n_lskel *skel_n;
@@ -579,6 +670,8 @@ void test_ringbuf(void)
 		ringbuf_subtest();
 	if (test__start_subtest("ringbuf_n"))
 		ringbuf_n_subtest();
+	if (test__start_subtest("ringbuf_null_cb"))
+		ringbuf_null_cb_subtest();
 	if (test__start_subtest("ringbuf_map_key"))
 		ringbuf_map_key_subtest();
 	if (test__start_subtest("ringbuf_write"))

-- 
2.55.0.rc0.96.gc050c23164


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bpf 3/6] libbpf: ringbuf: Handle position counter wrap
  2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 1/6] libbpf: ringbuf: Honor zero consume bounds Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 2/6] libbpf: ringbuf: Prevent NULL callback crash Tamir Duberstein
@ 2026-06-14  1:48 ` Tamir Duberstein
  2026-06-14  2:05   ` sashiko-bot
  2026-06-14  1:48 ` [PATCH bpf 4/6] libbpf: ringbuf: Use compiler atomics Tamir Duberstein
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

Ring buffer positions are unsigned long counters and can wrap on 32-bit
systems. ringbuf_process_ring() stops consuming when producer_pos wraps
below consumer_pos because it compares the counters by magnitude.

Compare the positions for equality instead. The producer cannot move
logically behind the consumer in a non-overwrite ring.

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Reported-by: Andrew Werner <awerner32@gmail.com>
Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
 tools/lib/bpf/ringbuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index ae7fa79b6217..b7adce37b519 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -268,7 +268,7 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 	do {
 		got_new_data = false;
 		prod_pos = smp_load_acquire(r->producer_pos);
-		while (cons_pos < prod_pos) {
+		while (cons_pos != prod_pos) {
 			len_ptr = r->data + (cons_pos & r->mask);
 			len = smp_load_acquire(len_ptr);
 

-- 
2.55.0.rc0.96.gc050c23164


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bpf 4/6] libbpf: ringbuf: Use compiler atomics
  2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
                   ` (2 preceding siblings ...)
  2026-06-14  1:48 ` [PATCH bpf 3/6] libbpf: ringbuf: Handle position counter wrap Tamir Duberstein
@ 2026-06-14  1:48 ` Tamir Duberstein
  2026-06-14  1:59   ` sashiko-bot
  2026-06-14  1:48 ` [PATCH bpf 5/6] libbpf: ringbuf: Prevent missed wakeups Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 6/6] libbpf: ringbuf: Reject overwrite callback use Tamir Duberstein
  5 siblings, 1 reply; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

Consumer-side ring buffer code uses architecture-specific smp_* helpers
for shared memory accesses.

Use compiler atomics instead. They provide equivalent acquire and
release ordering through a portable userspace interface and allow the
next commit to use compiler fences in the wakeup protocol without mixing
atomic interfaces.

Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
 tools/lib/bpf/ringbuf.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index b7adce37b519..1c24a83f59d5 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -264,13 +264,13 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 	if (n == 0)
 		return 0;
 
-	cons_pos = smp_load_acquire(r->consumer_pos);
+	cons_pos = __atomic_load_n(r->consumer_pos, __ATOMIC_ACQUIRE);
 	do {
 		got_new_data = false;
-		prod_pos = smp_load_acquire(r->producer_pos);
+		prod_pos = __atomic_load_n(r->producer_pos, __ATOMIC_ACQUIRE);
 		while (cons_pos != prod_pos) {
 			len_ptr = r->data + (cons_pos & r->mask);
-			len = smp_load_acquire(len_ptr);
+			len = __atomic_load_n(len_ptr, __ATOMIC_ACQUIRE);
 
 			/* sample not committed yet, bail out for now */
 			if (len & BPF_RINGBUF_BUSY_BIT)
@@ -284,14 +284,16 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 				err = r->sample_cb(r->ctx, sample, len);
 				if (err < 0) {
 					/* update consumer pos and bail out */
-					smp_store_release(r->consumer_pos,
-							  cons_pos);
+					__atomic_store_n(r->consumer_pos,
+							 cons_pos,
+							 __ATOMIC_RELEASE);
 					return err;
 				}
 				cnt++;
 			}
 
-			smp_store_release(r->consumer_pos, cons_pos);
+			__atomic_store_n(r->consumer_pos, cons_pos,
+					 __ATOMIC_RELEASE);
 
 			if (cnt >= n)
 				goto done;
@@ -406,8 +408,8 @@ struct ring *ring_buffer__ring(struct ring_buffer *rb, unsigned int idx)
 
 unsigned long ring__consumer_pos(const struct ring *r)
 {
-	/* Synchronizes with smp_store_release() in ringbuf_process_ring(). */
-	return smp_load_acquire(r->consumer_pos);
+	/* Synchronizes with the release store in ringbuf_process_ring(). */
+	return __atomic_load_n(r->consumer_pos, __ATOMIC_ACQUIRE);
 }
 
 unsigned long ring__producer_pos(const struct ring *r)
@@ -415,7 +417,7 @@ unsigned long ring__producer_pos(const struct ring *r)
 	/* Synchronizes with smp_store_release() in __bpf_ringbuf_reserve() in
 	 * the kernel.
 	 */
-	return smp_load_acquire(r->producer_pos);
+	return __atomic_load_n(r->producer_pos, __ATOMIC_ACQUIRE);
 }
 
 size_t ring__avail_data_size(const struct ring *r)

-- 
2.55.0.rc0.96.gc050c23164


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bpf 5/6] libbpf: ringbuf: Prevent missed wakeups
  2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
                   ` (3 preceding siblings ...)
  2026-06-14  1:48 ` [PATCH bpf 4/6] libbpf: ringbuf: Use compiler atomics Tamir Duberstein
@ 2026-06-14  1:48 ` Tamir Duberstein
  2026-06-14  1:57   ` sashiko-bot
  2026-06-14  1:48 ` [PATCH bpf 6/6] libbpf: ringbuf: Reject overwrite callback use Tamir Duberstein
  5 siblings, 1 reply; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

After consuming the last visible record, ringbuf_process_ring()
publishes the consumer position and checks the producer position. These
operations lack a full StoreLoad barrier. A producer can therefore
commit a new record but read the old consumer position while the
consumer reads the old producer position. The producer sends no
notification and the consumer waits despite a queued record.

Insert a full barrier before checking for new data, ensuring that either
the consumer observes the producer update or the producer observes the
consumer update and sends a notification. Apply the same handshake when
a busy record follows records whose consumer position was published.

Add an edge-triggered epoll test with a concurrent producer. Without the
barrier, a missed notification leaves the producer dropping records from
a full ring while the consumer times out. Document that bounded
consumers and callbacks that terminate consumption must drain before
waiting again.

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Reported-by: Andrew Werner <awerner32@gmail.com>
Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
 tools/lib/bpf/libbpf.h                           | 22 +++++++
 tools/lib/bpf/ringbuf.c                          | 14 +++-
 tools/testing/selftests/bpf/prog_tests/ringbuf.c | 84 ++++++++++++++++++++++++
 3 files changed, 117 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 9ba6b9ad3498..a3b8f606a91d 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1439,6 +1439,10 @@ struct ring_buffer;
 struct ring;
 struct user_ring_buffer;
 
+/* A negative return stops consumption; non-negative values continue. Stopping
+ * can leave records queued without a new readiness notification. Before
+ * waiting for readiness again, consume until no records remain.
+ */
 typedef int (*ring_buffer_sample_fn)(void *ctx, void *data, size_t size);
 
 struct ring_buffer_opts {
@@ -1455,6 +1459,20 @@ LIBBPF_API int ring_buffer__add(struct ring_buffer *rb, int map_fd,
 				ring_buffer_sample_fn sample_cb, void *ctx);
 LIBBPF_API int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms);
 LIBBPF_API int ring_buffer__consume(struct ring_buffer *rb);
+
+/**
+ * @brief **ring_buffer__consume_n()** consumes up to a requested number of
+ * records from a ring buffer manager without event polling.
+ *
+ * @param rb A ring buffer manager object.
+ * @param n Maximum number of records to consume.
+ * @return The number of records consumed, or a negative error code on failure.
+ *
+ * Reaching the requested bound does not establish that every ring is empty.
+ * Records can remain queued without a new readiness notification. Before
+ * waiting on ring_buffer__epoll_fd(), call ring_buffer__consume() until it
+ * returns 0.
+ */
 LIBBPF_API int ring_buffer__consume_n(struct ring_buffer *rb, size_t n);
 LIBBPF_API int ring_buffer__epoll_fd(const struct ring_buffer *rb);
 
@@ -1537,6 +1555,10 @@ LIBBPF_API int ring__consume(struct ring *r);
  * @param r A ringbuffer object.
  * @param n Maximum number of records to consume.
  * @return The number of records consumed, or a negative error code on failure.
+ *
+ * Reaching the requested bound does not establish that the ring is empty.
+ * Records can remain queued without a new readiness notification. Before
+ * waiting on ring__map_fd(), call ring__consume() until it returns 0.
  */
 LIBBPF_API int ring__consume_n(struct ring *r, size_t n);
 
diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index 1c24a83f59d5..ea8909fec4e9 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -255,7 +255,7 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 	/* 64-bit to avoid overflow in case of extreme application behavior */
 	int64_t cnt = 0;
 	unsigned long cons_pos, prod_pos;
-	bool got_new_data;
+	bool got_new_data, needs_wakeup = false;
 	void *sample;
 
 	err = ringbuf_validate(r);
@@ -267,14 +267,21 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 	cons_pos = __atomic_load_n(r->consumer_pos, __ATOMIC_ACQUIRE);
 	do {
 		got_new_data = false;
+		if (needs_wakeup) {
+			/* Ensure either this sees a new record or its producer sees
+			 * the updated consumer position and sends a notification.
+			 */
+			__atomic_thread_fence(__ATOMIC_SEQ_CST);
+			needs_wakeup = false;
+		}
 		prod_pos = __atomic_load_n(r->producer_pos, __ATOMIC_ACQUIRE);
 		while (cons_pos != prod_pos) {
 			len_ptr = r->data + (cons_pos & r->mask);
 			len = __atomic_load_n(len_ptr, __ATOMIC_ACQUIRE);
 
-			/* sample not committed yet, bail out for now */
+			/* Retry a busy record once after publishing prior records. */
 			if (len & BPF_RINGBUF_BUSY_BIT)
-				goto done;
+				break;
 
 			got_new_data = true;
 			cons_pos += roundup_len(len);
@@ -294,6 +301,7 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 
 			__atomic_store_n(r->consumer_pos, cons_pos,
 					 __ATOMIC_RELEASE);
+			needs_wakeup = true;
 
 			if (cnt >= n)
 				goto done;
diff --git a/tools/testing/selftests/bpf/prog_tests/ringbuf.c b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
index 9ce996bcea8c..5f0c679bf9a6 100644
--- a/tools/testing/selftests/bpf/prog_tests/ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
@@ -492,6 +492,88 @@ static void ringbuf_null_cb_subtest(void)
 	test_ringbuf_n_lskel__destroy(skel_n);
 }
 
+#define N_WAKEUP_SAMPLES 20000
+
+struct wakeup_ctx {
+	bool stop;
+};
+
+static void *wakeup_producer(void *arg)
+{
+	struct wakeup_ctx *ctx = arg;
+
+	while (!__atomic_load_n(&ctx->stop, __ATOMIC_RELAXED))
+		syscall(__NR_getpgid);
+	return NULL;
+}
+
+static void ringbuf_wakeup_subtest(void)
+{
+	struct test_ringbuf_n_lskel *skel_n;
+	struct ring_buffer *ringbuf = NULL;
+	struct epoll_event event = {
+		.events = EPOLLIN | EPOLLET,
+	};
+	struct wakeup_ctx ctx = {};
+	pthread_t producer;
+	int epoll_fd = -1;
+	int err, total = 0;
+
+	skel_n = test_ringbuf_n_lskel__open();
+	if (!ASSERT_OK_PTR(skel_n, "test_ringbuf_n_lskel__open"))
+		return;
+
+	skel_n->maps.ringbuf.max_entries = getpagesize();
+	skel_n->bss->pid = getpid();
+	skel_n->bss->value = SAMPLE_VALUE;
+
+	err = test_ringbuf_n_lskel__load(skel_n);
+	if (!ASSERT_OK(err, "test_ringbuf_n_lskel__load"))
+		goto cleanup;
+
+	err = test_ringbuf_n_lskel__attach(skel_n);
+	if (!ASSERT_OK(err, "test_ringbuf_n_lskel__attach"))
+		goto cleanup;
+
+	ringbuf = ring_buffer__new(skel_n->maps.ringbuf.map_fd,
+				   process_noop_sample, NULL, NULL);
+	if (!ASSERT_OK_PTR(ringbuf, "ring_buffer__new"))
+		goto cleanup;
+
+	epoll_fd = epoll_create1(EPOLL_CLOEXEC);
+	if (!ASSERT_OK_FD(epoll_fd, "epoll_create1"))
+		goto cleanup_ringbuf;
+
+	err = epoll_ctl(epoll_fd, EPOLL_CTL_ADD, skel_n->maps.ringbuf.map_fd,
+			&event);
+	if (!ASSERT_OK(err, "epoll_ctl"))
+		goto cleanup_epoll;
+
+	err = pthread_create(&producer, NULL, wakeup_producer, &ctx);
+	if (!ASSERT_OK(err, "pthread_create"))
+		goto cleanup_epoll;
+
+	while (total < N_WAKEUP_SAMPLES) {
+		err = epoll_wait(epoll_fd, &event, 1, 1000);
+		if (!ASSERT_EQ(err, 1, "epoll_wait"))
+			goto cleanup_thread;
+		while ((err = ring_buffer__consume(ringbuf)) > 0)
+			total += err;
+		if (!ASSERT_OK(err, "ring_buffer__consume"))
+			goto cleanup_thread;
+	}
+
+cleanup_thread:
+	__atomic_store_n(&ctx.stop, true, __ATOMIC_RELAXED);
+	pthread_join(producer, NULL);
+cleanup_epoll:
+	close(epoll_fd);
+cleanup_ringbuf:
+	ring_buffer__free(ringbuf);
+cleanup:
+	test_ringbuf_n_lskel__destroy(skel_n);
+}
+
 static void ringbuf_n_subtest(void)
 {
 	struct test_ringbuf_n_lskel *skel_n;
@@ -672,6 +754,8 @@ void test_ringbuf(void)
 		ringbuf_n_subtest();
 	if (test__start_subtest("ringbuf_null_cb"))
 		ringbuf_null_cb_subtest();
+	if (test__start_subtest("ringbuf_wakeup"))
+		ringbuf_wakeup_subtest();
 	if (test__start_subtest("ringbuf_map_key"))
 		ringbuf_map_key_subtest();
 	if (test__start_subtest("ringbuf_write"))

-- 
2.55.0.rc0.96.gc050c23164


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bpf 6/6] libbpf: ringbuf: Reject overwrite callback use
  2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
                   ` (4 preceding siblings ...)
  2026-06-14  1:48 ` [PATCH bpf 5/6] libbpf: ringbuf: Prevent missed wakeups Tamir Duberstein
@ 2026-06-14  1:48 ` Tamir Duberstein
  5 siblings, 0 replies; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

BPF_F_RB_OVERWRITE can advance overwrite_pos past consumer_pos.
Callback traversal does not read overwrite_pos, so after the producer
laps the consumer it can treat overwritten data as a record header.

An earlier proposal[0] copied the readable window before invoking
callbacks. Review concluded that callbacks are a poor fit because
copying penalizes zero-copy users and the API cannot report skipped
records.

Record the map flag and reject callback consumption with -EOPNOTSUPP.

Link: https://lore.kernel.org/bpf/CAEf4Bzaq5drHWChXoRBnrmkb6reAsSVj8r=uByFSup31FMA7hw@mail.gmail.com/ [0]
Fixes: feeaf1346f80 ("bpf: Add overwrite mode for BPF ring buffer")
Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
 tools/lib/bpf/libbpf.h                           |  1 +
 tools/lib/bpf/ringbuf.c                          |  4 +++
 tools/testing/selftests/bpf/prog_tests/ringbuf.c | 39 ++++++++++++++++++++++++
 3 files changed, 44 insertions(+)

diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index a3b8f606a91d..899457d5d536 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1439,6 +1439,7 @@ struct ring_buffer;
 struct ring;
 struct user_ring_buffer;
 
+/* Callback-based consumption is unsupported for BPF_F_RB_OVERWRITE maps. */
 /* A negative return stops consumption; non-negative values continue. Stopping
  * can leave records queued without a new readiness notification. Before
  * waiting for readiness again, consume until no records remain.
diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index ea8909fec4e9..f7972eae05ba 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -30,6 +30,7 @@ struct ring {
 	unsigned long *producer_pos;
 	unsigned long mask;
 	int map_fd;
+	bool overwrite;
 };
 
 struct ring_buffer {
@@ -118,6 +119,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
 	r->sample_cb = sample_cb;
 	r->ctx = ctx;
 	r->mask = info.max_entries - 1;
+	r->overwrite = info.map_flags & BPF_F_RB_OVERWRITE;
 
 	/* Map writable consumer page */
 	tmp = mmap(NULL, rb->page_size, PROT_READ | PROT_WRITE, MAP_SHARED, map_fd, 0);
@@ -233,6 +235,8 @@ static inline int roundup_len(__u32 len)
 
 static int ringbuf_validate(const struct ring *r)
 {
+	if (r->overwrite)
+		return -EOPNOTSUPP;
 	return r->sample_cb ? 0 : -EINVAL;
 }
 
diff --git a/tools/testing/selftests/bpf/prog_tests/ringbuf.c b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
index 5f0c679bf9a6..a6c707af1134 100644
--- a/tools/testing/selftests/bpf/prog_tests/ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
@@ -684,6 +684,43 @@ static void ringbuf_map_key_subtest(void)
 	test_ringbuf_map_key_lskel__destroy(skel_map_key);
 }
 
+static void ringbuf_overwrite_callback_subtest(void)
+{
+	LIBBPF_OPTS(bpf_map_create_opts, opts, .map_flags = BPF_F_RB_OVERWRITE);
+	struct ring_buffer *ringbuf;
+	struct ring *ring;
+	int map_fd, err;
+
+	map_fd = bpf_map_create(BPF_MAP_TYPE_RINGBUF, NULL, 0, 0, getpagesize(),
+				&opts);
+	if (!ASSERT_OK_FD(map_fd, "bpf_map_create"))
+		return;
+
+	ringbuf = ring_buffer__new(map_fd, process_noop_sample, NULL, NULL);
+	if (!ASSERT_OK_PTR(ringbuf, "ring_buffer__new"))
+		goto cleanup_fd;
+
+	ring = ring_buffer__ring(ringbuf, 0);
+	if (!ASSERT_OK_PTR(ring, "ring_buffer__ring"))
+		goto cleanup_ringbuf;
+
+	err = ring_buffer__consume_n(ringbuf, 0);
+	ASSERT_EQ(err, -EOPNOTSUPP, "ringbuf_consume_zero");
+	err = ring_buffer__consume(ringbuf);
+	ASSERT_EQ(err, -EOPNOTSUPP, "ringbuf_consume");
+	err = ring_buffer__poll(ringbuf, 0);
+	ASSERT_EQ(err, -EOPNOTSUPP, "ringbuf_poll");
+	err = ring__consume_n(ring, 0);
+	ASSERT_EQ(err, -EOPNOTSUPP, "ring_consume_zero");
+	err = ring__consume(ring);
+	ASSERT_EQ(err, -EOPNOTSUPP, "ring_consume");
+
+cleanup_ringbuf:
+	ring_buffer__free(ringbuf);
+cleanup_fd:
+	close(map_fd);
+}
+
 static void ringbuf_overwrite_mode_subtest(void)
 {
 	unsigned long size, len1, len2, len3, len4, len5;
@@ -760,6 +797,8 @@ void test_ringbuf(void)
 		ringbuf_map_key_subtest();
 	if (test__start_subtest("ringbuf_write"))
 		ringbuf_write_subtest();
+	if (test__start_subtest("ringbuf_overwrite_callback"))
+		ringbuf_overwrite_callback_subtest();
 	if (test__start_subtest("ringbuf_overwrite_mode"))
 		ringbuf_overwrite_mode_subtest();
 }

-- 
2.55.0.rc0.96.gc050c23164


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf 5/6] libbpf: ringbuf: Prevent missed wakeups
  2026-06-14  1:48 ` [PATCH bpf 5/6] libbpf: ringbuf: Prevent missed wakeups Tamir Duberstein
@ 2026-06-14  1:57   ` sashiko-bot
  0 siblings, 0 replies; 10+ messages in thread
From: sashiko-bot @ 2026-06-14  1:57 UTC (permalink / raw)
  To: Tamir Duberstein; +Cc: bpf

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Early returns in `ringbuf_process_ring()` bypass the newly added StoreLoad memory barrier, leaving bounded consumers vulnerable to missed wakeups.
- [Low] Multi-line comments introduced by the patch violate the BPF subsystem comment style.
--

commit 1b4f252cfd5df3b60a00d140737f0f9bc3b4ad87
Author: Tamir Duberstein <tamird@kernel.org>
Subject: libbpf: ringbuf: Prevent missed wakeups

This commit addresses missed wakeups in the BPF ring buffer by adding a full
StoreLoad barrier before checking for new data. It ensures either the
consumer observes the producer's update, or the producer observes the
consumer's update and sends a notification. It also updates documentation
and adds a selftest.

> diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> index 9ba6b9ad34983..a3b8f606a91d0 100644
> --- a/tools/lib/bpf/libbpf.h
> +++ b/tools/lib/bpf/libbpf.h
> @@ -1439,6 +1439,10 @@ struct ring_buffer;
>  struct ring;
>  struct user_ring_buffer;
>  
> +/* A negative return stops consumption; non-negative values continue. Stopping
> + * can leave records queued without a new readiness notification. Before
> + * waiting for readiness again, consume until no records remain.
> + */

[Severity: Low]
This isn't a bug, but does this multi-line comment follow the BPF subsystem
style? The guidelines prefer the opening /* to be on its own line.

[ ... ]

> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> index 1c24a83f59d5c..ea8909fec4e98 100644
> --- a/tools/lib/bpf/ringbuf.c
> +++ b/tools/lib/bpf/ringbuf.c
[ ... ]
> @@ -267,14 +267,21 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
>  	cons_pos = __atomic_load_n(r->consumer_pos, __ATOMIC_ACQUIRE);
>  	do {
>  		got_new_data = false;
> +		if (needs_wakeup) {
> +			/* Ensure either this sees a new record or its producer sees
> +			 * the updated consumer position and sends a notification.
> +			 */

[Severity: Low]
This isn't a bug, but should this multi-line comment have the opening /* on
its own line to match the BPF subsystem comment style?

> +			__atomic_thread_fence(__ATOMIC_SEQ_CST);
> +			needs_wakeup = false;
> +		}
>  		prod_pos = __atomic_load_n(r->producer_pos, __ATOMIC_ACQUIRE);
>  		while (cons_pos != prod_pos) {
[ ... ]
> @@ -294,6 +301,7 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
>  
>  			__atomic_store_n(r->consumer_pos, cons_pos,
>  					 __ATOMIC_RELEASE);
> +			needs_wakeup = true;
>  
>  			if (cnt >= n)
>  				goto done;

[Severity: High]
If this loop terminates early by reaching the bound (cnt >= n) or due to a
negative callback return, consumer_pos is updated and the function returns,
discarding the local needs_wakeup variable.

When ring_buffer__consume() is called again to drain the remaining records
before sleeping, it will initialize needs_wakeup = false. This causes the
__ATOMIC_SEQ_CST fence to be skipped prior to loading producer_pos.

Could this allow the CPU to delay the visibility of the consumer_pos store
while reading a stale producer_pos, resulting in the consumer incorrectly
sleeping and the concurrent producer skipping the wakeup notification?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260613-bpf-ringbuf-fixes-v1-0-e623481cb724@kernel.org?part=5

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf 4/6] libbpf: ringbuf: Use compiler atomics
  2026-06-14  1:48 ` [PATCH bpf 4/6] libbpf: ringbuf: Use compiler atomics Tamir Duberstein
@ 2026-06-14  1:59   ` sashiko-bot
  0 siblings, 0 replies; 10+ messages in thread
From: sashiko-bot @ 2026-06-14  1:59 UTC (permalink / raw)
  To: Tamir Duberstein; +Cc: bpf

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] 32-bit monitoring applications using the BPF ring buffer on a 64-bit kernel will permanently stall after processing 4GB of events due to an unsigned long pointer type mismatch for producer/consumer positions.
--

commit b302f5c853774170c0ab9e37b85abcc4d647ac3b
Author: Tamir Duberstein <tamird@kernel.org>

libbpf: ringbuf: Use compiler atomics

This commit replaces architecture-specific smp_* helpers with portable compiler
atomics for shared memory accesses in the consumer-side ring buffer code.

> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> index b7adce37b5196..1c24a83f59d5c 100644
> --- a/tools/lib/bpf/ringbuf.c
> +++ b/tools/lib/bpf/ringbuf.c
[ ... ]
> @@ -284,14 +284,16 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
>  				err = r->sample_cb(r->ctx, sample, len);
>  				if (err < 0) {
>  					/* update consumer pos and bail out */
> -					smp_store_release(r->consumer_pos,
> -							  cons_pos);
> +					__atomic_store_n(r->consumer_pos,
> +							 cons_pos,
> +							 __ATOMIC_RELEASE);
>  					return err;
>  				}
>  				cnt++;
>  			}
>  
> -			smp_store_release(r->consumer_pos, cons_pos);
> +			__atomic_store_n(r->consumer_pos, cons_pos,
> +					 __ATOMIC_RELEASE);

[Severity: High]
This isn't a bug introduced by this patch, but does this write to
consumer_pos safely handle a 32-bit userspace application running on a
64-bit kernel?

In the kernel, struct bpf_ringbuf defines consumer_pos and producer_pos as
unsigned long, which are 64 bits. However, in libbpf's struct ring,
consumer_pos is mapped as an unsigned long pointer, which is only 32 bits
in a 32-bit userspace environment.

When the 32-bit user process updates consumer_pos via __atomic_store_n(), it
only overwrites the lower 32 bits of the 64-bit value on little-endian
systems. Once the true consumer position exceeds 4GB, the userspace 32-bit
value wraps to 0. Writing 0 to the lower 32 bits while the upper 32 bits
remain 0 causes the kernel to see consumer_pos drop to 0, while the 64-bit
producer_pos continues climbing past 4GB.

This causes the kernel's bpf_ringbuf_has_space() to see a delta larger than
the ring buffer mask, incorrectly treating the buffer as completely full and
permanently dropping all subsequent events for that buffer.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260613-bpf-ringbuf-fixes-v1-0-e623481cb724@kernel.org?part=4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf 3/6] libbpf: ringbuf: Handle position counter wrap
  2026-06-14  1:48 ` [PATCH bpf 3/6] libbpf: ringbuf: Handle position counter wrap Tamir Duberstein
@ 2026-06-14  2:05   ` sashiko-bot
  0 siblings, 0 replies; 10+ messages in thread
From: sashiko-bot @ 2026-06-14  2:05 UTC (permalink / raw)
  To: Tamir Duberstein; +Cc: bpf

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [High] `__bpf_ringbuf_reserve` in the kernel permanently fails allocations after 32-bit position wrap-around.
- [High] Kernel-side `__bpf_user_ringbuf_peek` incorrectly handles 32-bit wrap-around, leading to permanent stalls and bypassing bounds checks.
- [High] 32-bit userspace running on a 64-bit kernel permanently breaks BPF ring buffers after 4GB of data is processed due to pointer size mismatches.
--

commit ce5e8d811804f9ff598195a7f394512f77c373b0
Author: Tamir Duberstein <tamird@kernel.org>

libbpf: ringbuf: Handle position counter wrap

Ring buffer positions are unsigned long counters and can wrap on 32-bit
systems. ringbuf_process_ring() stops consuming when producer_pos wraps
below consumer_pos because it compares the counters by magnitude.

> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> index ae7fa79b62170..b7adce37b5196 100644
> --- a/tools/lib/bpf/ringbuf.c
> +++ b/tools/lib/bpf/ringbuf.c
> @@ -268,7 +268,7 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
>  	do {
>  		got_new_data = false;
>  		prod_pos = smp_load_acquire(r->producer_pos);
> -		while (cons_pos < prod_pos) {
> +		while (cons_pos != prod_pos) {
>  			len_ptr = r->data + (cons_pos & r->mask);
>  			len = smp_load_acquire(len_ptr);
> 

[Severity: High]
This is a pre-existing issue, but does this fix fully resolve the behavior
for 32-bit userspace applications running on 64-bit kernels?

Looking further down in ringbuf_process_ring(), userspace updates the
consumer position:

    __atomic_store_n(r->consumer_pos, cons_pos, __ATOMIC_RELEASE);

On a 32-bit architecture, this writes 4 bytes. However, the 64-bit kernel
defines consumer_pos as an 8-byte unsigned long. When the 32-bit userspace
counter wraps at 4GB, it writes 0 to the lower 4 bytes while leaving the
upper 4 bytes unchanged.

Could this cause the kernel's bpf_ringbuf_has_space() to perceive an
instant drop in the consumer position, making it evaluate the ring buffer
as permanently full?

[Severity: High]
This is a pre-existing issue, but does the kernel side have a similar
wrap-around issue in __bpf_ringbuf_reserve()?

kernel/bpf/ringbuf.c:__bpf_ringbuf_reserve() {
    ...
    while (pend_pos < prod_pos) {
        ...
    }
    ...
}

Since pend_pos and prod_pos are unsigned longs, on 32-bit kernels, does
this magnitude comparison fail when producer_pos wraps around to a small
value? It appears pend_pos would freeze at its pre-wrap value, causing
bpf_ringbuf_has_space() to permanently return false and deny future
allocations.

[Severity: High]
This is also a pre-existing issue, but does __bpf_user_ringbuf_peek() have
a related wrap-around bug on 32-bit systems?

kernel/bpf/ringbuf.c:__bpf_user_ringbuf_peek() {
    u64 cons_pos, prod_pos;
    ...
    prod_pos = smp_load_acquire(&rb->producer_pos);
    ...
    cons_pos = smp_load_acquire(&rb->consumer_pos);
    if (cons_pos >= prod_pos)
        ...
}

Because rb->producer_pos and rb->consumer_pos are 32-bit unsigned longs on
32-bit systems, are they zero-extended to 64-bit u64 variables? If
producer_pos wraps around while consumer_pos is still large, does the
64-bit promotion cause the magnitude comparison to evaluate to true,
permanently stalling consumption by returning -ENODATA?

Additionally, would the 64-bit subtraction (prod_pos - cons_pos) yield a
massive value that defeats the bounds check?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260613-bpf-ringbuf-fixes-v1-0-e623481cb724@kernel.org?part=3

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2026-06-14  2:05 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
2026-06-14  1:48 ` [PATCH bpf 1/6] libbpf: ringbuf: Honor zero consume bounds Tamir Duberstein
2026-06-14  1:48 ` [PATCH bpf 2/6] libbpf: ringbuf: Prevent NULL callback crash Tamir Duberstein
2026-06-14  1:48 ` [PATCH bpf 3/6] libbpf: ringbuf: Handle position counter wrap Tamir Duberstein
2026-06-14  2:05   ` sashiko-bot
2026-06-14  1:48 ` [PATCH bpf 4/6] libbpf: ringbuf: Use compiler atomics Tamir Duberstein
2026-06-14  1:59   ` sashiko-bot
2026-06-14  1:48 ` [PATCH bpf 5/6] libbpf: ringbuf: Prevent missed wakeups Tamir Duberstein
2026-06-14  1:57   ` sashiko-bot
2026-06-14  1:48 ` [PATCH bpf 6/6] libbpf: ringbuf: Reject overwrite callback use Tamir Duberstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox