[PATCH bpf 0/6] libbpf: Fix ring buffer consumption

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH bpf 0/6] libbpf: Fix ring buffer consumption
@ 2026-06-14  1:48 Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 1/6] libbpf: ringbuf: Honor zero consume bounds Tamir Duberstein
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

Fix several correctness issues in libbpf's ring buffer consumer.

A zero record bound currently consumes one record. A NULL callback is
accepted during manager construction but crashes when callback-based
consumption reaches the ring. Position counters stop consumption after
wrapping because they are compared by magnitude.

The consumer can also miss a readiness notification after publishing its
position and checking for new data without a full StoreLoad barrier. Use
compiler atomics and add the missing barrier, including when retrying a
busy record after publishing earlier records.

Callback traversal does not follow the overwrite position maintained by
BPF_F_RB_OVERWRITE maps. Reject callback consumption of those maps, as
discussed here:
https://lore.kernel.org/bpf/CAEf4Bzaq5drHWChXoRBnrmkb6reAsSVj8r=uByFSup31FMA7hw@mail.gmail.com/

Andrew Werner found the position-wrap and missed-wakeup failures while
implementing Aya's ring buffer reader. Aya's original implementation
contains the equality reasoning and edge-triggered regression test:
https://github.com/aya-rs/aya/commit/e2cf734490bc188bcedb1eac92d23d81123e42cd

Aya later corrected the consumer ordering with the same explicit fence:
https://github.com/aya-rs/aya/commit/7277a57ea8cdb74918d3096a4b22b6d814481973

Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
Tamir Duberstein (6):
      libbpf: ringbuf: Honor zero consume bounds
      libbpf: ringbuf: Prevent NULL callback crash
      libbpf: ringbuf: Handle position counter wrap
      libbpf: ringbuf: Use compiler atomics
      libbpf: ringbuf: Prevent missed wakeups
      libbpf: ringbuf: Reject overwrite callback use

 tools/lib/bpf/libbpf.h                           |  34 +++-
 tools/lib/bpf/ringbuf.c                          |  84 +++++++--
 tools/testing/selftests/bpf/prog_tests/ringbuf.c | 229 +++++++++++++++++++++++
 3 files changed, 323 insertions(+), 24 deletions(-)
---
base-commit: e7ae89a0c97ce2b68b0983cd01eda67cf373517d
change-id: 20260613-bpf-ringbuf-fixes-e9a8b3c6125b

Best regards,
--  
Tamir Duberstein <tamird@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH bpf 1/6] libbpf: ringbuf: Honor zero consume bounds
  2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
@ 2026-06-14  1:48 ` Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 2/6] libbpf: ringbuf: Prevent NULL callback crash Tamir Duberstein
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

ringbuf_process_ring() checks the record bound only after advancing the
consumer position and invoking the callback. A zero bound therefore
consumes the first available record.

Return before reading the ring positions when the bound is zero so
ring_buffer__consume_n() and ring__consume_n() leave all records queued.

Fixes: 4d22ea94ea33 ("libbpf: Add ring__consume_n / ring_buffer__consume_n")
Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
 tools/lib/bpf/ringbuf.c                          |  3 +++
 tools/testing/selftests/bpf/prog_tests/ringbuf.c | 13 +++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index 00ec4837a06d..f2bb619d5a75 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -240,6 +240,9 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 	bool got_new_data;
 	void *sample;
 
+	if (n == 0)
+		return 0;
+
 	cons_pos = smp_load_acquire(r->consumer_pos);
 	do {
 		got_new_data = false;
diff --git a/tools/testing/selftests/bpf/prog_tests/ringbuf.c b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
index 64520684d2cb..4f0558f14847 100644
--- a/tools/testing/selftests/bpf/prog_tests/ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
@@ -404,6 +404,7 @@ static int process_n_sample(void *ctx, void *data, size_t len)
 static void ringbuf_n_subtest(void)
 {
 	struct test_ringbuf_n_lskel *skel_n;
+	struct ring *ring;
 	int err, i;
 
 	skel_n = test_ringbuf_n_lskel__open();
@@ -431,6 +432,18 @@ static void ringbuf_n_subtest(void)
 	for (i = 0; i < N_TOT_SAMPLES; i++)
 		syscall(__NR_getpgid);
 
+	ring = ring_buffer__ring(ringbuf, 0);
+	if (!ASSERT_OK_PTR(ring, "ring_buffer__ring"))
+		goto cleanup_ringbuf;
+
+	err = ring_buffer__consume_n(ringbuf, 0);
+	if (!ASSERT_EQ(err, 0, "ringbuf_consume_zero"))
+		goto cleanup_ringbuf;
+
+	err = ring__consume_n(ring, 0);
+	if (!ASSERT_EQ(err, 0, "ring_consume_zero"))
+		goto cleanup_ringbuf;
+
 	/* Consume all samples from the ring buffer in batches of N_SAMPLES */
 	for (i = 0; i < N_TOT_SAMPLES; i += err) {
 		err = ring_buffer__consume_n(ringbuf, N_SAMPLES);

-- 
2.55.0.rc0.96.gc050c23164


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bpf 2/6] libbpf: ringbuf: Prevent NULL callback crash
  2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 1/6] libbpf: ringbuf: Honor zero consume bounds Tamir Duberstein
@ 2026-06-14  1:48 ` Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 3/6] libbpf: ringbuf: Handle position counter wrap Tamir Duberstein
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

ring_buffer__new() and ring_buffer__add() allow a NULL sample
callback. When callback-based consumption reaches such a ring, it calls
through the NULL function pointer and crashes.

Validate every ring in a manager before polling or consuming. Return
-EINVAL without consuming records from an earlier valid ring or waiting
for an event. Perform the same check before honoring a zero record bound
so invalid callback consumption consistently reports the error.

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
 tools/lib/bpf/libbpf.h                           | 11 ++-
 tools/lib/bpf/ringbuf.c                          | 41 +++++++++--
 tools/testing/selftests/bpf/prog_tests/ringbuf.c | 93 ++++++++++++++++++++++++
 3 files changed, 134 insertions(+), 11 deletions(-)

diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index bba4e8464396..9ba6b9ad3498 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1526,18 +1526,17 @@ LIBBPF_API int ring__map_fd(const struct ring *r);
  *
  * @param r A ringbuffer object.
  * @return The number of records consumed (or INT_MAX, whichever is less), or
- * a negative number if any of the callbacks return an error.
+ * a negative error code on failure.
  */
 LIBBPF_API int ring__consume(struct ring *r);
 
 /**
- * @brief **ring__consume_n()** consumes up to a requested amount of items from
- * a ringbuffer without event polling.
+ * @brief **ring__consume_n()** consumes up to a requested number of records
+ * from a ringbuffer without event polling.
  *
  * @param r A ringbuffer object.
- * @param n Maximum amount of items to consume.
- * @return The number of items consumed, or a negative number if any of the
- * callbacks return an error.
+ * @param n Maximum number of records to consume.
+ * @return The number of records consumed, or a negative error code on failure.
  */
 LIBBPF_API int ring__consume_n(struct ring *r, size_t n);
 
diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index f2bb619d5a75..ae7fa79b6217 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -231,6 +231,24 @@ static inline int roundup_len(__u32 len)
 	return (len + 7) / 8 * 8;
 }
 
+static int ringbuf_validate(const struct ring *r)
+{
+	return r->sample_cb ? 0 : -EINVAL;
+}
+
+static int ringbuf_validate_callbacks(const struct ring_buffer *rb)
+{
+	int i, err;
+
+	for (i = 0; i < rb->ring_cnt; i++) {
+		err = ringbuf_validate(rb->rings[i]);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 {
 	int *len_ptr, len, err;
@@ -240,6 +258,9 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 	bool got_new_data;
 	void *sample;
 
+	err = ringbuf_validate(r);
+	if (err)
+		return err;
 	if (n == 0)
 		return 0;
 
@@ -284,14 +305,17 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
  * records.
  *
  * Returns number of records consumed across all registered ring buffers (or
- * n, whichever is less), or negative number if any of the callbacks return
- * error.
+ * n, whichever is less), or a negative error code on failure.
  */
 int ring_buffer__consume_n(struct ring_buffer *rb, size_t n)
 {
 	int64_t err, res = 0;
 	int i;
 
+	err = ringbuf_validate_callbacks(rb);
+	if (err)
+		return libbpf_err(err);
+
 	for (i = 0; i < rb->ring_cnt; i++) {
 		struct ring *ring = rb->rings[i];
 
@@ -309,14 +333,17 @@ int ring_buffer__consume_n(struct ring_buffer *rb, size_t n)
 
 /* Consume available ring buffer(s) data without event polling.
  * Returns number of records consumed across all registered ring buffers (or
- * INT_MAX, whichever is less), or negative number if any of the callbacks
- * return error.
+ * INT_MAX, whichever is less), or a negative error code on failure.
  */
 int ring_buffer__consume(struct ring_buffer *rb)
 {
 	int64_t err, res = 0;
 	int i;
 
+	err = ringbuf_validate_callbacks(rb);
+	if (err)
+		return libbpf_err(err);
+
 	for (i = 0; i < rb->ring_cnt; i++) {
 		struct ring *ring = rb->rings[i];
 
@@ -334,13 +361,17 @@ int ring_buffer__consume(struct ring_buffer *rb)
 
 /* Poll for available data and consume records, if any are available.
  * Returns number of records consumed (or INT_MAX, whichever is less), or
- * negative number, if any of the registered callbacks returned error.
+ * a negative error code on failure.
  */
 int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms)
 {
 	int i, cnt;
 	int64_t err, res = 0;
 
+	err = ringbuf_validate_callbacks(rb);
+	if (err)
+		return libbpf_err(err);
+
 	cnt = epoll_wait(rb->epoll_fd, rb->events, rb->ring_cnt, timeout_ms);
 	if (cnt < 0)
 		return libbpf_err(-errno);
diff --git a/tools/testing/selftests/bpf/prog_tests/ringbuf.c b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
index 4f0558f14847..9ce996bcea8c 100644
--- a/tools/testing/selftests/bpf/prog_tests/ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
@@ -401,6 +401,97 @@ static int process_n_sample(void *ctx, void *data, size_t len)
 	return 0;
 }
 
+static int process_noop_sample(void *ctx, void *data, size_t len)
+{
+	return 0;
+}
+
+static void ringbuf_null_cb_subtest(void)
+{
+	struct test_ringbuf_n_lskel *skel_n;
+	struct ring_buffer *ringbuf = NULL;
+	struct ring *ring;
+	unsigned long consumer_pos;
+	int no_cb_map_fd = -1;
+	int err;
+
+	skel_n = test_ringbuf_n_lskel__open();
+	if (!ASSERT_OK_PTR(skel_n, "test_ringbuf_n_lskel__open"))
+		return;
+
+	skel_n->maps.ringbuf.max_entries = getpagesize();
+	skel_n->bss->pid = getpid();
+	skel_n->bss->value = SAMPLE_VALUE;
+
+	err = test_ringbuf_n_lskel__load(skel_n);
+	if (!ASSERT_OK(err, "test_ringbuf_n_lskel__load"))
+		goto cleanup;
+
+	err = test_ringbuf_n_lskel__attach(skel_n);
+	if (!ASSERT_OK(err, "test_ringbuf_n_lskel__attach"))
+		goto cleanup;
+
+	syscall(__NR_getpgid);
+
+	no_cb_map_fd = bpf_map_create(BPF_MAP_TYPE_RINGBUF, NULL, 0, 0,
+				      getpagesize(), NULL);
+	if (!ASSERT_OK_FD(no_cb_map_fd, "bpf_map_create"))
+		goto cleanup;
+
+	/* Manager APIs must validate all rings before consuming any of them. */
+	ringbuf = ring_buffer__new(skel_n->maps.ringbuf.map_fd,
+				   process_noop_sample, NULL, NULL);
+	if (!ASSERT_OK_PTR(ringbuf, "ring_buffer__new"))
+		goto cleanup_fd;
+
+	ring = ring_buffer__ring(ringbuf, 0);
+	if (!ASSERT_OK_PTR(ring, "ring_buffer__ring"))
+		goto cleanup_ringbuf;
+
+	err = ring_buffer__add(ringbuf, no_cb_map_fd, NULL, NULL);
+	if (!ASSERT_OK(err, "ring_buffer__add_no_cb"))
+		goto cleanup_ringbuf;
+
+	consumer_pos = ring__consumer_pos(ring);
+	ASSERT_GT(ring__producer_pos(ring), consumer_pos,
+		  "producer_pos_mixed_cb");
+
+	err = ring_buffer__consume_n(ringbuf, 0);
+	ASSERT_EQ(err, -EINVAL, "ringbuf_consume_zero_mixed_cb");
+	err = ring_buffer__consume(ringbuf);
+	ASSERT_EQ(err, -EINVAL, "ringbuf_consume_mixed_cb");
+	err = ring_buffer__poll(ringbuf, 0);
+	ASSERT_EQ(err, -EINVAL, "ringbuf_poll_mixed_cb");
+	ASSERT_EQ(ring__consumer_pos(ring), consumer_pos,
+		  "consumer_pos_mixed_cb");
+
+	ring_buffer__free(ringbuf);
+	ringbuf =
+		ring_buffer__new(skel_n->maps.ringbuf.map_fd, NULL, NULL, NULL);
+	if (!ASSERT_OK_PTR(ringbuf, "ring_buffer__new_no_cb"))
+		goto cleanup_fd;
+
+	ring = ring_buffer__ring(ringbuf, 0);
+	if (!ASSERT_OK_PTR(ring, "ring_buffer__ring_no_cb"))
+		goto cleanup_ringbuf;
+	consumer_pos = ring__consumer_pos(ring);
+
+	err = ring_buffer__consume_n(ringbuf, 0);
+	ASSERT_EQ(err, -EINVAL, "ringbuf_consume_zero_no_cb");
+	err = ring__consume_n(ring, 0);
+	ASSERT_EQ(err, -EINVAL, "ring_consume_zero_no_cb");
+	err = ring__consume(ring);
+	ASSERT_EQ(err, -EINVAL, "ring_consume_no_cb");
+	ASSERT_EQ(ring__consumer_pos(ring), consumer_pos, "consumer_pos_no_cb");
+
+cleanup_ringbuf:
+	ring_buffer__free(ringbuf);
+cleanup_fd:
+	close(no_cb_map_fd);
+cleanup:
+	test_ringbuf_n_lskel__destroy(skel_n);
+}
+
 static void ringbuf_n_subtest(void)
 {
 	struct test_ringbuf_n_lskel *skel_n;
@@ -579,6 +670,8 @@ void test_ringbuf(void)
 		ringbuf_subtest();
 	if (test__start_subtest("ringbuf_n"))
 		ringbuf_n_subtest();
+	if (test__start_subtest("ringbuf_null_cb"))
+		ringbuf_null_cb_subtest();
 	if (test__start_subtest("ringbuf_map_key"))
 		ringbuf_map_key_subtest();
 	if (test__start_subtest("ringbuf_write"))

-- 
2.55.0.rc0.96.gc050c23164


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bpf 3/6] libbpf: ringbuf: Handle position counter wrap
  2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 1/6] libbpf: ringbuf: Honor zero consume bounds Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 2/6] libbpf: ringbuf: Prevent NULL callback crash Tamir Duberstein
@ 2026-06-14  1:48 ` Tamir Duberstein
  2026-06-14  2:05   ` sashiko-bot
  2026-06-14  1:48 ` [PATCH bpf 4/6] libbpf: ringbuf: Use compiler atomics Tamir Duberstein
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

Ring buffer positions are unsigned long counters and can wrap on 32-bit
systems. ringbuf_process_ring() stops consuming when producer_pos wraps
below consumer_pos because it compares the counters by magnitude.

Compare the positions for equality instead. The producer cannot move
logically behind the consumer in a non-overwrite ring.

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Reported-by: Andrew Werner <awerner32@gmail.com>
Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
 tools/lib/bpf/ringbuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index ae7fa79b6217..b7adce37b519 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -268,7 +268,7 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 	do {
 		got_new_data = false;
 		prod_pos = smp_load_acquire(r->producer_pos);
-		while (cons_pos < prod_pos) {
+		while (cons_pos != prod_pos) {
 			len_ptr = r->data + (cons_pos & r->mask);
 			len = smp_load_acquire(len_ptr);
 

-- 
2.55.0.rc0.96.gc050c23164


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bpf 4/6] libbpf: ringbuf: Use compiler atomics
  2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
                   ` (2 preceding siblings ...)
  2026-06-14  1:48 ` [PATCH bpf 3/6] libbpf: ringbuf: Handle position counter wrap Tamir Duberstein
@ 2026-06-14  1:48 ` Tamir Duberstein
  2026-06-14  1:59   ` sashiko-bot
  2026-06-14  1:48 ` [PATCH bpf 5/6] libbpf: ringbuf: Prevent missed wakeups Tamir Duberstein
  2026-06-14  1:48 ` [PATCH bpf 6/6] libbpf: ringbuf: Reject overwrite callback use Tamir Duberstein
  5 siblings, 1 reply; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

Consumer-side ring buffer code uses architecture-specific smp_* helpers
for shared memory accesses.

Use compiler atomics instead. They provide equivalent acquire and
release ordering through a portable userspace interface and allow the
next commit to use compiler fences in the wakeup protocol without mixing
atomic interfaces.

Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
 tools/lib/bpf/ringbuf.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index b7adce37b519..1c24a83f59d5 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -264,13 +264,13 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 	if (n == 0)
 		return 0;
 
-	cons_pos = smp_load_acquire(r->consumer_pos);
+	cons_pos = __atomic_load_n(r->consumer_pos, __ATOMIC_ACQUIRE);
 	do {
 		got_new_data = false;
-		prod_pos = smp_load_acquire(r->producer_pos);
+		prod_pos = __atomic_load_n(r->producer_pos, __ATOMIC_ACQUIRE);
 		while (cons_pos != prod_pos) {
 			len_ptr = r->data + (cons_pos & r->mask);
-			len = smp_load_acquire(len_ptr);
+			len = __atomic_load_n(len_ptr, __ATOMIC_ACQUIRE);
 
 			/* sample not committed yet, bail out for now */
 			if (len & BPF_RINGBUF_BUSY_BIT)
@@ -284,14 +284,16 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 				err = r->sample_cb(r->ctx, sample, len);
 				if (err < 0) {
 					/* update consumer pos and bail out */
-					smp_store_release(r->consumer_pos,
-							  cons_pos);
+					__atomic_store_n(r->consumer_pos,
+							 cons_pos,
+							 __ATOMIC_RELEASE);
 					return err;
 				}
 				cnt++;
 			}
 
-			smp_store_release(r->consumer_pos, cons_pos);
+			__atomic_store_n(r->consumer_pos, cons_pos,
+					 __ATOMIC_RELEASE);
 
 			if (cnt >= n)
 				goto done;
@@ -406,8 +408,8 @@ struct ring *ring_buffer__ring(struct ring_buffer *rb, unsigned int idx)
 
 unsigned long ring__consumer_pos(const struct ring *r)
 {
-	/* Synchronizes with smp_store_release() in ringbuf_process_ring(). */
-	return smp_load_acquire(r->consumer_pos);
+	/* Synchronizes with the release store in ringbuf_process_ring(). */
+	return __atomic_load_n(r->consumer_pos, __ATOMIC_ACQUIRE);
 }
 
 unsigned long ring__producer_pos(const struct ring *r)
@@ -415,7 +417,7 @@ unsigned long ring__producer_pos(const struct ring *r)
 	/* Synchronizes with smp_store_release() in __bpf_ringbuf_reserve() in
 	 * the kernel.
 	 */
-	return smp_load_acquire(r->producer_pos);
+	return __atomic_load_n(r->producer_pos, __ATOMIC_ACQUIRE);
 }
 
 size_t ring__avail_data_size(const struct ring *r)

-- 
2.55.0.rc0.96.gc050c23164


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bpf 5/6] libbpf: ringbuf: Prevent missed wakeups
  2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
                   ` (3 preceding siblings ...)
  2026-06-14  1:48 ` [PATCH bpf 4/6] libbpf: ringbuf: Use compiler atomics Tamir Duberstein
@ 2026-06-14  1:48 ` Tamir Duberstein
  2026-06-14  1:57   ` sashiko-bot
  2026-06-14  1:48 ` [PATCH bpf 6/6] libbpf: ringbuf: Reject overwrite callback use Tamir Duberstein
  5 siblings, 1 reply; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

After consuming the last visible record, ringbuf_process_ring()
publishes the consumer position and checks the producer position. These
operations lack a full StoreLoad barrier. A producer can therefore
commit a new record but read the old consumer position while the
consumer reads the old producer position. The producer sends no
notification and the consumer waits despite a queued record.

Insert a full barrier before checking for new data, ensuring that either
the consumer observes the producer update or the producer observes the
consumer update and sends a notification. Apply the same handshake when
a busy record follows records whose consumer position was published.

Add an edge-triggered epoll test with a concurrent producer. Without the
barrier, a missed notification leaves the producer dropping records from
a full ring while the consumer times out. Document that bounded
consumers and callbacks that terminate consumption must drain before
waiting again.

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Reported-by: Andrew Werner <awerner32@gmail.com>
Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
 tools/lib/bpf/libbpf.h                           | 22 +++++++
 tools/lib/bpf/ringbuf.c                          | 14 +++-
 tools/testing/selftests/bpf/prog_tests/ringbuf.c | 84 ++++++++++++++++++++++++
 3 files changed, 117 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 9ba6b9ad3498..a3b8f606a91d 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1439,6 +1439,10 @@ struct ring_buffer;
 struct ring;
 struct user_ring_buffer;
 
+/* A negative return stops consumption; non-negative values continue. Stopping
+ * can leave records queued without a new readiness notification. Before
+ * waiting for readiness again, consume until no records remain.
+ */
 typedef int (*ring_buffer_sample_fn)(void *ctx, void *data, size_t size);
 
 struct ring_buffer_opts {
@@ -1455,6 +1459,20 @@ LIBBPF_API int ring_buffer__add(struct ring_buffer *rb, int map_fd,
 				ring_buffer_sample_fn sample_cb, void *ctx);
 LIBBPF_API int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms);
 LIBBPF_API int ring_buffer__consume(struct ring_buffer *rb);
+
+/**
+ * @brief **ring_buffer__consume_n()** consumes up to a requested number of
+ * records from a ring buffer manager without event polling.
+ *
+ * @param rb A ring buffer manager object.
+ * @param n Maximum number of records to consume.
+ * @return The number of records consumed, or a negative error code on failure.
+ *
+ * Reaching the requested bound does not establish that every ring is empty.
+ * Records can remain queued without a new readiness notification. Before
+ * waiting on ring_buffer__epoll_fd(), call ring_buffer__consume() until it
+ * returns 0.
+ */
 LIBBPF_API int ring_buffer__consume_n(struct ring_buffer *rb, size_t n);
 LIBBPF_API int ring_buffer__epoll_fd(const struct ring_buffer *rb);
 
@@ -1537,6 +1555,10 @@ LIBBPF_API int ring__consume(struct ring *r);
  * @param r A ringbuffer object.
  * @param n Maximum number of records to consume.
  * @return The number of records consumed, or a negative error code on failure.
+ *
+ * Reaching the requested bound does not establish that the ring is empty.
+ * Records can remain queued without a new readiness notification. Before
+ * waiting on ring__map_fd(), call ring__consume() until it returns 0.
  */
 LIBBPF_API int ring__consume_n(struct ring *r, size_t n);
 
diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index 1c24a83f59d5..ea8909fec4e9 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -255,7 +255,7 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 	/* 64-bit to avoid overflow in case of extreme application behavior */
 	int64_t cnt = 0;
 	unsigned long cons_pos, prod_pos;
-	bool got_new_data;
+	bool got_new_data, needs_wakeup = false;
 	void *sample;
 
 	err = ringbuf_validate(r);
@@ -267,14 +267,21 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 	cons_pos = __atomic_load_n(r->consumer_pos, __ATOMIC_ACQUIRE);
 	do {
 		got_new_data = false;
+		if (needs_wakeup) {
+			/* Ensure either this sees a new record or its producer sees
+			 * the updated consumer position and sends a notification.
+			 */
+			__atomic_thread_fence(__ATOMIC_SEQ_CST);
+			needs_wakeup = false;
+		}
 		prod_pos = __atomic_load_n(r->producer_pos, __ATOMIC_ACQUIRE);
 		while (cons_pos != prod_pos) {
 			len_ptr = r->data + (cons_pos & r->mask);
 			len = __atomic_load_n(len_ptr, __ATOMIC_ACQUIRE);
 
-			/* sample not committed yet, bail out for now */
+			/* Retry a busy record once after publishing prior records. */
 			if (len & BPF_RINGBUF_BUSY_BIT)
-				goto done;
+				break;
 
 			got_new_data = true;
 			cons_pos += roundup_len(len);
@@ -294,6 +301,7 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
 
 			__atomic_store_n(r->consumer_pos, cons_pos,
 					 __ATOMIC_RELEASE);
+			needs_wakeup = true;
 
 			if (cnt >= n)
 				goto done;
diff --git a/tools/testing/selftests/bpf/prog_tests/ringbuf.c b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
index 9ce996bcea8c..5f0c679bf9a6 100644
--- a/tools/testing/selftests/bpf/prog_tests/ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
@@ -492,6 +492,88 @@ static void ringbuf_null_cb_subtest(void)
 	test_ringbuf_n_lskel__destroy(skel_n);
 }
 
+#define N_WAKEUP_SAMPLES 20000
+
+struct wakeup_ctx {
+	bool stop;
+};
+
+static void *wakeup_producer(void *arg)
+{
+	struct wakeup_ctx *ctx = arg;
+
+	while (!__atomic_load_n(&ctx->stop, __ATOMIC_RELAXED))
+		syscall(__NR_getpgid);
+	return NULL;
+}
+
+static void ringbuf_wakeup_subtest(void)
+{
+	struct test_ringbuf_n_lskel *skel_n;
+	struct ring_buffer *ringbuf = NULL;
+	struct epoll_event event = {
+		.events = EPOLLIN | EPOLLET,
+	};
+	struct wakeup_ctx ctx = {};
+	pthread_t producer;
+	int epoll_fd = -1;
+	int err, total = 0;
+
+	skel_n = test_ringbuf_n_lskel__open();
+	if (!ASSERT_OK_PTR(skel_n, "test_ringbuf_n_lskel__open"))
+		return;
+
+	skel_n->maps.ringbuf.max_entries = getpagesize();
+	skel_n->bss->pid = getpid();
+	skel_n->bss->value = SAMPLE_VALUE;
+
+	err = test_ringbuf_n_lskel__load(skel_n);
+	if (!ASSERT_OK(err, "test_ringbuf_n_lskel__load"))
+		goto cleanup;
+
+	err = test_ringbuf_n_lskel__attach(skel_n);
+	if (!ASSERT_OK(err, "test_ringbuf_n_lskel__attach"))
+		goto cleanup;
+
+	ringbuf = ring_buffer__new(skel_n->maps.ringbuf.map_fd,
+				   process_noop_sample, NULL, NULL);
+	if (!ASSERT_OK_PTR(ringbuf, "ring_buffer__new"))
+		goto cleanup;
+
+	epoll_fd = epoll_create1(EPOLL_CLOEXEC);
+	if (!ASSERT_OK_FD(epoll_fd, "epoll_create1"))
+		goto cleanup_ringbuf;
+
+	err = epoll_ctl(epoll_fd, EPOLL_CTL_ADD, skel_n->maps.ringbuf.map_fd,
+			&event);
+	if (!ASSERT_OK(err, "epoll_ctl"))
+		goto cleanup_epoll;
+
+	err = pthread_create(&producer, NULL, wakeup_producer, &ctx);
+	if (!ASSERT_OK(err, "pthread_create"))
+		goto cleanup_epoll;
+
+	while (total < N_WAKEUP_SAMPLES) {
+		err = epoll_wait(epoll_fd, &event, 1, 1000);
+		if (!ASSERT_EQ(err, 1, "epoll_wait"))
+			goto cleanup_thread;
+		while ((err = ring_buffer__consume(ringbuf)) > 0)
+			total += err;
+		if (!ASSERT_OK(err, "ring_buffer__consume"))
+			goto cleanup_thread;
+	}
+
+cleanup_thread:
+	__atomic_store_n(&ctx.stop, true, __ATOMIC_RELAXED);
+	pthread_join(producer, NULL);
+cleanup_epoll:
+	close(epoll_fd);
+cleanup_ringbuf:
+	ring_buffer__free(ringbuf);
+cleanup:
+	test_ringbuf_n_lskel__destroy(skel_n);
+}
+
 static void ringbuf_n_subtest(void)
 {
 	struct test_ringbuf_n_lskel *skel_n;
@@ -672,6 +754,8 @@ void test_ringbuf(void)
 		ringbuf_n_subtest();
 	if (test__start_subtest("ringbuf_null_cb"))
 		ringbuf_null_cb_subtest();
+	if (test__start_subtest("ringbuf_wakeup"))
+		ringbuf_wakeup_subtest();
 	if (test__start_subtest("ringbuf_map_key"))
 		ringbuf_map_key_subtest();
 	if (test__start_subtest("ringbuf_write"))

-- 
2.55.0.rc0.96.gc050c23164


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bpf 6/6] libbpf: ringbuf: Reject overwrite callback use
  2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
                   ` (4 preceding siblings ...)
  2026-06-14  1:48 ` [PATCH bpf 5/6] libbpf: ringbuf: Prevent missed wakeups Tamir Duberstein
@ 2026-06-14  1:48 ` Tamir Duberstein
  5 siblings, 0 replies; 10+ messages in thread
From: Tamir Duberstein @ 2026-06-14  1:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Andrea Righi,
	Xu Kuohai, Andrea Righi
  Cc: bpf, linux-kernel, linux-kselftest, Andrew Werner, Zvi Effron,
	Andrii Nakryiko, Tamir Duberstein

BPF_F_RB_OVERWRITE can advance overwrite_pos past consumer_pos.
Callback traversal does not read overwrite_pos, so after the producer
laps the consumer it can treat overwritten data as a record header.

An earlier proposal[0] copied the readable window before invoking
callbacks. Review concluded that callbacks are a poor fit because
copying penalizes zero-copy users and the API cannot report skipped
records.

Record the map flag and reject callback consumption with -EOPNOTSUPP.

Link: https://lore.kernel.org/bpf/CAEf4Bzaq5drHWChXoRBnrmkb6reAsSVj8r=uByFSup31FMA7hw@mail.gmail.com/ [0]
Fixes: feeaf1346f80 ("bpf: Add overwrite mode for BPF ring buffer")
Assisted-by: Codex:gpt-5.5
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
---
 tools/lib/bpf/libbpf.h                           |  1 +
 tools/lib/bpf/ringbuf.c                          |  4 +++
 tools/testing/selftests/bpf/prog_tests/ringbuf.c | 39 ++++++++++++++++++++++++
 3 files changed, 44 insertions(+)

diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index a3b8f606a91d..899457d5d536 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1439,6 +1439,7 @@ struct ring_buffer;
 struct ring;
 struct user_ring_buffer;
 
+/* Callback-based consumption is unsupported for BPF_F_RB_OVERWRITE maps. */
 /* A negative return stops consumption; non-negative values continue. Stopping
  * can leave records queued without a new readiness notification. Before
  * waiting for readiness again, consume until no records remain.
diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index ea8909fec4e9..f7972eae05ba 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -30,6 +30,7 @@ struct ring {
 	unsigned long *producer_pos;
 	unsigned long mask;
 	int map_fd;
+	bool overwrite;
 };
 
 struct ring_buffer {
@@ -118,6 +119,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
 	r->sample_cb = sample_cb;
 	r->ctx = ctx;
 	r->mask = info.max_entries - 1;
+	r->overwrite = info.map_flags & BPF_F_RB_OVERWRITE;
 
 	/* Map writable consumer page */
 	tmp = mmap(NULL, rb->page_size, PROT_READ | PROT_WRITE, MAP_SHARED, map_fd, 0);
@@ -233,6 +235,8 @@ static inline int roundup_len(__u32 len)
 
 static int ringbuf_validate(const struct ring *r)
 {
+	if (r->overwrite)
+		return -EOPNOTSUPP;
 	return r->sample_cb ? 0 : -EINVAL;
 }
 
diff --git a/tools/testing/selftests/bpf/prog_tests/ringbuf.c b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
index 5f0c679bf9a6..a6c707af1134 100644
--- a/tools/testing/selftests/bpf/prog_tests/ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/ringbuf.c
@@ -684,6 +684,43 @@ static void ringbuf_map_key_subtest(void)
 	test_ringbuf_map_key_lskel__destroy(skel_map_key);
 }
 
+static void ringbuf_overwrite_callback_subtest(void)
+{
+	LIBBPF_OPTS(bpf_map_create_opts, opts, .map_flags = BPF_F_RB_OVERWRITE);
+	struct ring_buffer *ringbuf;
+	struct ring *ring;
+	int map_fd, err;
+
+	map_fd = bpf_map_create(BPF_MAP_TYPE_RINGBUF, NULL, 0, 0, getpagesize(),
+				&opts);
+	if (!ASSERT_OK_FD(map_fd, "bpf_map_create"))
+		return;
+
+	ringbuf = ring_buffer__new(map_fd, process_noop_sample, NULL, NULL);
+	if (!ASSERT_OK_PTR(ringbuf, "ring_buffer__new"))
+		goto cleanup_fd;
+
+	ring = ring_buffer__ring(ringbuf, 0);
+	if (!ASSERT_OK_PTR(ring, "ring_buffer__ring"))
+		goto cleanup_ringbuf;
+
+	err = ring_buffer__consume_n(ringbuf, 0);
+	ASSERT_EQ(err, -EOPNOTSUPP, "ringbuf_consume_zero");
+	err = ring_buffer__consume(ringbuf);
+	ASSERT_EQ(err, -EOPNOTSUPP, "ringbuf_consume");
+	err = ring_buffer__poll(ringbuf, 0);
+	ASSERT_EQ(err, -EOPNOTSUPP, "ringbuf_poll");
+	err = ring__consume_n(ring, 0);
+	ASSERT_EQ(err, -EOPNOTSUPP, "ring_consume_zero");
+	err = ring__consume(ring);
+	ASSERT_EQ(err, -EOPNOTSUPP, "ring_consume");
+
+cleanup_ringbuf:
+	ring_buffer__free(ringbuf);
+cleanup_fd:
+	close(map_fd);
+}
+
 static void ringbuf_overwrite_mode_subtest(void)
 {
 	unsigned long size, len1, len2, len3, len4, len5;
@@ -760,6 +797,8 @@ void test_ringbuf(void)
 		ringbuf_map_key_subtest();
 	if (test__start_subtest("ringbuf_write"))
 		ringbuf_write_subtest();
+	if (test__start_subtest("ringbuf_overwrite_callback"))
+		ringbuf_overwrite_callback_subtest();
 	if (test__start_subtest("ringbuf_overwrite_mode"))
 		ringbuf_overwrite_mode_subtest();
 }

-- 
2.55.0.rc0.96.gc050c23164


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf 5/6] libbpf: ringbuf: Prevent missed wakeups
  2026-06-14  1:48 ` [PATCH bpf 5/6] libbpf: ringbuf: Prevent missed wakeups Tamir Duberstein
@ 2026-06-14  1:57   ` sashiko-bot
  0 siblings, 0 replies; 10+ messages in thread
From: sashiko-bot @ 2026-06-14  1:57 UTC (permalink / raw)
  To: Tamir Duberstein; +Cc: bpf

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Early returns in `ringbuf_process_ring()` bypass the newly added StoreLoad memory barrier, leaving bounded consumers vulnerable to missed wakeups.
- [Low] Multi-line comments introduced by the patch violate the BPF subsystem comment style.
--

commit 1b4f252cfd5df3b60a00d140737f0f9bc3b4ad87
Author: Tamir Duberstein <tamird@kernel.org>
Subject: libbpf: ringbuf: Prevent missed wakeups

This commit addresses missed wakeups in the BPF ring buffer by adding a full
StoreLoad barrier before checking for new data. It ensures either the
consumer observes the producer's update, or the producer observes the
consumer's update and sends a notification. It also updates documentation
and adds a selftest.

> diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> index 9ba6b9ad34983..a3b8f606a91d0 100644
> --- a/tools/lib/bpf/libbpf.h
> +++ b/tools/lib/bpf/libbpf.h
> @@ -1439,6 +1439,10 @@ struct ring_buffer;
>  struct ring;
>  struct user_ring_buffer;
>  
> +/* A negative return stops consumption; non-negative values continue. Stopping
> + * can leave records queued without a new readiness notification. Before
> + * waiting for readiness again, consume until no records remain.
> + */

[Severity: Low]
This isn't a bug, but does this multi-line comment follow the BPF subsystem
style? The guidelines prefer the opening /* to be on its own line.

[ ... ]

> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> index 1c24a83f59d5c..ea8909fec4e98 100644
> --- a/tools/lib/bpf/ringbuf.c
> +++ b/tools/lib/bpf/ringbuf.c
[ ... ]
> @@ -267,14 +267,21 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
>  	cons_pos = __atomic_load_n(r->consumer_pos, __ATOMIC_ACQUIRE);
>  	do {
>  		got_new_data = false;
> +		if (needs_wakeup) {
> +			/* Ensure either this sees a new record or its producer sees
> +			 * the updated consumer position and sends a notification.
> +			 */

[Severity: Low]
This isn't a bug, but should this multi-line comment have the opening /* on
its own line to match the BPF subsystem comment style?

> +			__atomic_thread_fence(__ATOMIC_SEQ_CST);
> +			needs_wakeup = false;
> +		}
>  		prod_pos = __atomic_load_n(r->producer_pos, __ATOMIC_ACQUIRE);
>  		while (cons_pos != prod_pos) {
[ ... ]
> @@ -294,6 +301,7 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
>  
>  			__atomic_store_n(r->consumer_pos, cons_pos,
>  					 __ATOMIC_RELEASE);
> +			needs_wakeup = true;
>  
>  			if (cnt >= n)
>  				goto done;

[Severity: High]
If this loop terminates early by reaching the bound (cnt >= n) or due to a
negative callback return, consumer_pos is updated and the function returns,
discarding the local needs_wakeup variable.

When ring_buffer__consume() is called again to drain the remaining records
before sleeping, it will initialize needs_wakeup = false. This causes the
__ATOMIC_SEQ_CST fence to be skipped prior to loading producer_pos.

Could this allow the CPU to delay the visibility of the consumer_pos store
while reading a stale producer_pos, resulting in the consumer incorrectly
sleeping and the concurrent producer skipping the wakeup notification?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260613-bpf-ringbuf-fixes-v1-0-e623481cb724@kernel.org?part=5

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf 4/6] libbpf: ringbuf: Use compiler atomics
  2026-06-14  1:48 ` [PATCH bpf 4/6] libbpf: ringbuf: Use compiler atomics Tamir Duberstein
@ 2026-06-14  1:59   ` sashiko-bot
  0 siblings, 0 replies; 10+ messages in thread
From: sashiko-bot @ 2026-06-14  1:59 UTC (permalink / raw)
  To: Tamir Duberstein; +Cc: bpf

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] 32-bit monitoring applications using the BPF ring buffer on a 64-bit kernel will permanently stall after processing 4GB of events due to an unsigned long pointer type mismatch for producer/consumer positions.
--

commit b302f5c853774170c0ab9e37b85abcc4d647ac3b
Author: Tamir Duberstein <tamird@kernel.org>

libbpf: ringbuf: Use compiler atomics

This commit replaces architecture-specific smp_* helpers with portable compiler
atomics for shared memory accesses in the consumer-side ring buffer code.

> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> index b7adce37b5196..1c24a83f59d5c 100644
> --- a/tools/lib/bpf/ringbuf.c
> +++ b/tools/lib/bpf/ringbuf.c
[ ... ]
> @@ -284,14 +284,16 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
>  				err = r->sample_cb(r->ctx, sample, len);
>  				if (err < 0) {
>  					/* update consumer pos and bail out */
> -					smp_store_release(r->consumer_pos,
> -							  cons_pos);
> +					__atomic_store_n(r->consumer_pos,
> +							 cons_pos,
> +							 __ATOMIC_RELEASE);
>  					return err;
>  				}
>  				cnt++;
>  			}
>  
> -			smp_store_release(r->consumer_pos, cons_pos);
> +			__atomic_store_n(r->consumer_pos, cons_pos,
> +					 __ATOMIC_RELEASE);

[Severity: High]
This isn't a bug introduced by this patch, but does this write to
consumer_pos safely handle a 32-bit userspace application running on a
64-bit kernel?

In the kernel, struct bpf_ringbuf defines consumer_pos and producer_pos as
unsigned long, which are 64 bits. However, in libbpf's struct ring,
consumer_pos is mapped as an unsigned long pointer, which is only 32 bits
in a 32-bit userspace environment.

When the 32-bit user process updates consumer_pos via __atomic_store_n(), it
only overwrites the lower 32 bits of the 64-bit value on little-endian
systems. Once the true consumer position exceeds 4GB, the userspace 32-bit
value wraps to 0. Writing 0 to the lower 32 bits while the upper 32 bits
remain 0 causes the kernel to see consumer_pos drop to 0, while the 64-bit
producer_pos continues climbing past 4GB.

This causes the kernel's bpf_ringbuf_has_space() to see a delta larger than
the ring buffer mask, incorrectly treating the buffer as completely full and
permanently dropping all subsequent events for that buffer.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260613-bpf-ringbuf-fixes-v1-0-e623481cb724@kernel.org?part=4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bpf 3/6] libbpf: ringbuf: Handle position counter wrap
  2026-06-14  1:48 ` [PATCH bpf 3/6] libbpf: ringbuf: Handle position counter wrap Tamir Duberstein
@ 2026-06-14  2:05   ` sashiko-bot
  0 siblings, 0 replies; 10+ messages in thread
From: sashiko-bot @ 2026-06-14  2:05 UTC (permalink / raw)
  To: Tamir Duberstein; +Cc: bpf

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [High] `__bpf_ringbuf_reserve` in the kernel permanently fails allocations after 32-bit position wrap-around.
- [High] Kernel-side `__bpf_user_ringbuf_peek` incorrectly handles 32-bit wrap-around, leading to permanent stalls and bypassing bounds checks.
- [High] 32-bit userspace running on a 64-bit kernel permanently breaks BPF ring buffers after 4GB of data is processed due to pointer size mismatches.
--

commit ce5e8d811804f9ff598195a7f394512f77c373b0
Author: Tamir Duberstein <tamird@kernel.org>

libbpf: ringbuf: Handle position counter wrap

Ring buffer positions are unsigned long counters and can wrap on 32-bit
systems. ringbuf_process_ring() stops consuming when producer_pos wraps
below consumer_pos because it compares the counters by magnitude.

> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> index ae7fa79b62170..b7adce37b5196 100644
> --- a/tools/lib/bpf/ringbuf.c
> +++ b/tools/lib/bpf/ringbuf.c
> @@ -268,7 +268,7 @@ static int64_t ringbuf_process_ring(struct ring *r, size_t n)
>  	do {
>  		got_new_data = false;
>  		prod_pos = smp_load_acquire(r->producer_pos);
> -		while (cons_pos < prod_pos) {
> +		while (cons_pos != prod_pos) {
>  			len_ptr = r->data + (cons_pos & r->mask);
>  			len = smp_load_acquire(len_ptr);
> 

[Severity: High]
This is a pre-existing issue, but does this fix fully resolve the behavior
for 32-bit userspace applications running on 64-bit kernels?

Looking further down in ringbuf_process_ring(), userspace updates the
consumer position:

    __atomic_store_n(r->consumer_pos, cons_pos, __ATOMIC_RELEASE);

On a 32-bit architecture, this writes 4 bytes. However, the 64-bit kernel
defines consumer_pos as an 8-byte unsigned long. When the 32-bit userspace
counter wraps at 4GB, it writes 0 to the lower 4 bytes while leaving the
upper 4 bytes unchanged.

Could this cause the kernel's bpf_ringbuf_has_space() to perceive an
instant drop in the consumer position, making it evaluate the ring buffer
as permanently full?

[Severity: High]
This is a pre-existing issue, but does the kernel side have a similar
wrap-around issue in __bpf_ringbuf_reserve()?

kernel/bpf/ringbuf.c:__bpf_ringbuf_reserve() {
    ...
    while (pend_pos < prod_pos) {
        ...
    }
    ...
}

Since pend_pos and prod_pos are unsigned longs, on 32-bit kernels, does
this magnitude comparison fail when producer_pos wraps around to a small
value? It appears pend_pos would freeze at its pre-wrap value, causing
bpf_ringbuf_has_space() to permanently return false and deny future
allocations.

[Severity: High]
This is also a pre-existing issue, but does __bpf_user_ringbuf_peek() have
a related wrap-around bug on 32-bit systems?

kernel/bpf/ringbuf.c:__bpf_user_ringbuf_peek() {
    u64 cons_pos, prod_pos;
    ...
    prod_pos = smp_load_acquire(&rb->producer_pos);
    ...
    cons_pos = smp_load_acquire(&rb->consumer_pos);
    if (cons_pos >= prod_pos)
        ...
}

Because rb->producer_pos and rb->consumer_pos are 32-bit unsigned longs on
32-bit systems, are they zero-extended to 64-bit u64 variables? If
producer_pos wraps around while consumer_pos is still large, does the
64-bit promotion cause the magnitude comparison to evaluate to true,
permanently stalling consumption by returning -ENODATA?

Additionally, would the 64-bit subtraction (prod_pos - cons_pos) yield a
massive value that defeats the bounds check?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260613-bpf-ringbuf-fixes-v1-0-e623481cb724@kernel.org?part=3

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2026-06-14  2:05 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-14  1:48 [PATCH bpf 0/6] libbpf: Fix ring buffer consumption Tamir Duberstein
2026-06-14  1:48 ` [PATCH bpf 1/6] libbpf: ringbuf: Honor zero consume bounds Tamir Duberstein
2026-06-14  1:48 ` [PATCH bpf 2/6] libbpf: ringbuf: Prevent NULL callback crash Tamir Duberstein
2026-06-14  1:48 ` [PATCH bpf 3/6] libbpf: ringbuf: Handle position counter wrap Tamir Duberstein
2026-06-14  2:05   ` sashiko-bot
2026-06-14  1:48 ` [PATCH bpf 4/6] libbpf: ringbuf: Use compiler atomics Tamir Duberstein
2026-06-14  1:59   ` sashiko-bot
2026-06-14  1:48 ` [PATCH bpf 5/6] libbpf: ringbuf: Prevent missed wakeups Tamir Duberstein
2026-06-14  1:57   ` sashiko-bot
2026-06-14  1:48 ` [PATCH bpf 6/6] libbpf: ringbuf: Reject overwrite callback use Tamir Duberstein

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.