public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next] bpf: warn against BPF_RB_NO_WAKEUP in bpf_ringbuf_discard()
@ 2026-03-30 12:17 Eyal Birger
  2026-03-30 13:39 ` Puranjay Mohan
  2026-03-31  0:27 ` Andrii Nakryiko
  0 siblings, 2 replies; 5+ messages in thread
From: Eyal Birger @ 2026-03-30 12:17 UTC (permalink / raw)
  To: ast, daniel, andrii, martin.lau, eddyz87, memxor, song,
	yonghong.song, jolsa, paul.chaignon, chen.dylane, kpsingh,
	a.s.protopopov, yatsenko, ameryhung, tklauser, shmulik.ladkani
  Cc: bpf, linux-kernel, Eyal Birger

Document that BPF_RB_NO_WAKEUP is not recommended for
bpf_ringbuf_discard().

A discard done with BPF_RB_NO_WAKEUP can suppress a later adaptive
wakeup from a valid record, leaving an epoll-based userspace consumer
asleep even though data is available in the ring buffer.

Scenario:

epoll_wait(rb_fd);                     // blocks

rec = bpf_ringbuf_reserve(&rb, ...);
bpf_ringbuf_discard(rec, BPF_RB_NO_WAKEUP);

rec = bpf_ringbuf_reserve(&rb, ...);
bpf_ringbuf_submit(rec, 0);           // valid record, but no wakeup

This behavior is surprising in the context of bpf_ringbuf_discard()
as it seems natural not to want to wake userspace.

Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
---
 include/uapi/linux/bpf.h       | 3 ++-
 tools/include/uapi/linux/bpf.h | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c8d400b7680a..c46b06d45904 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4645,7 +4645,8 @@ union bpf_attr {
  * 	Description
  * 		Discard reserved ring buffer sample, pointed to by *data*.
  * 		If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
- * 		of new data availability is sent.
+ * 		of new data availability is sent, which is not recommended as
+ * 		it can suppress a later adaptive wakeup from a subsequent submit.
  * 		If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
  * 		of new data availability is sent unconditionally.
  * 		If **0** is specified in *flags*, an adaptive notification
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 5e38b4887de6..96de37c3b896 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4645,7 +4645,8 @@ union bpf_attr {
  * 	Description
  * 		Discard reserved ring buffer sample, pointed to by *data*.
  * 		If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
- * 		of new data availability is sent.
+ * 		of new data availability is sent, which is not recommended as
+ * 		it can suppress a later adaptive wakeup from a subsequent submit.
  * 		If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
  * 		of new data availability is sent unconditionally.
  * 		If **0** is specified in *flags*, an adaptive notification
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH bpf-next] bpf: warn against BPF_RB_NO_WAKEUP in bpf_ringbuf_discard()
  2026-03-30 12:17 [PATCH bpf-next] bpf: warn against BPF_RB_NO_WAKEUP in bpf_ringbuf_discard() Eyal Birger
@ 2026-03-30 13:39 ` Puranjay Mohan
  2026-03-30 13:53   ` Eyal Birger
  2026-03-31  0:27 ` Andrii Nakryiko
  1 sibling, 1 reply; 5+ messages in thread
From: Puranjay Mohan @ 2026-03-30 13:39 UTC (permalink / raw)
  To: Eyal Birger, ast, daniel, andrii, martin.lau, eddyz87, memxor,
	song, yonghong.song, jolsa, paul.chaignon, chen.dylane, kpsingh,
	a.s.protopopov, yatsenko, ameryhung, tklauser, shmulik.ladkani
  Cc: bpf, linux-kernel, Eyal Birger

Eyal Birger <eyal.birger@gmail.com> writes:

> Document that BPF_RB_NO_WAKEUP is not recommended for
> bpf_ringbuf_discard().
>
> A discard done with BPF_RB_NO_WAKEUP can suppress a later adaptive
> wakeup from a valid record, leaving an epoll-based userspace consumer
> asleep even though data is available in the ring buffer.
>
> Scenario:
>
> epoll_wait(rb_fd);                     // blocks
>
> rec = bpf_ringbuf_reserve(&rb, ...);
> bpf_ringbuf_discard(rec, BPF_RB_NO_WAKEUP);
>
> rec = bpf_ringbuf_reserve(&rb, ...);
> bpf_ringbuf_submit(rec, 0);           // valid record, but no wakeup
>
> This behavior is surprising in the context of bpf_ringbuf_discard()
> as it seems natural not to want to wake userspace.

It appears that once you do a submit or discard with NO_WAKEUP, you can
never go back to adaptive mode. You will need to do an explicit
BPF_RB_FORCE_WAKEUP to do a wakeup. 

This looks like expected behaviour (by design) but for discard the
programmer might think: "I'm discarding and there's no data, why would I
wake anyone?  Let me pass BPF_RB_NO_WAKEUP to be a good citizen." 

I am not sure if we just want to change the description of the helper or
also fix the code to ignore the BPF_RB_NO_WAKEUP for discard?

Or a better approach would be to add a wakeup_needed flag to struct
bpf_ringbuf. When NO_WAKEUP suppresses a wakeup that would have fired
(cons_pos == rec_pos), it sets the flag. Any subsequent adaptive commit
sees the flag and sends the wakeup, clearing it. FORCE_WAKEUP also
clears it. But this could change how the ringbuf behaves for existing
programs.

I will let others comment on this.

> Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> ---
>  include/uapi/linux/bpf.h       | 3 ++-
>  tools/include/uapi/linux/bpf.h | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c8d400b7680a..c46b06d45904 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -4645,7 +4645,8 @@ union bpf_attr {
>   * 	Description
>   * 		Discard reserved ring buffer sample, pointed to by *data*.
>   * 		If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> - * 		of new data availability is sent.
> + * 		of new data availability is sent, which is not recommended as
> + * 		it can suppress a later adaptive wakeup from a subsequent submit.
>   * 		If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
>   * 		of new data availability is sent unconditionally.
>   * 		If **0** is specified in *flags*, an adaptive notification
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 5e38b4887de6..96de37c3b896 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -4645,7 +4645,8 @@ union bpf_attr {
>   * 	Description
>   * 		Discard reserved ring buffer sample, pointed to by *data*.
>   * 		If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> - * 		of new data availability is sent.
> + * 		of new data availability is sent, which is not recommended as
> + * 		it can suppress a later adaptive wakeup from a subsequent submit.
>   * 		If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
>   * 		of new data availability is sent unconditionally.
>   * 		If **0** is specified in *flags*, an adaptive notification

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH bpf-next] bpf: warn against BPF_RB_NO_WAKEUP in bpf_ringbuf_discard()
  2026-03-30 13:39 ` Puranjay Mohan
@ 2026-03-30 13:53   ` Eyal Birger
  0 siblings, 0 replies; 5+ messages in thread
From: Eyal Birger @ 2026-03-30 13:53 UTC (permalink / raw)
  To: Puranjay Mohan
  Cc: ast, daniel, andrii, martin.lau, eddyz87, memxor, song,
	yonghong.song, jolsa, paul.chaignon, chen.dylane, kpsingh,
	a.s.protopopov, yatsenko, ameryhung, tklauser, shmulik.ladkani,
	bpf, linux-kernel

Hi,

On Mon, Mar 30, 2026 at 6:39 AM Puranjay Mohan <puranjay@kernel.org> wrote:
>
> Eyal Birger <eyal.birger@gmail.com> writes:
>
> > Document that BPF_RB_NO_WAKEUP is not recommended for
> > bpf_ringbuf_discard().
> >
> > A discard done with BPF_RB_NO_WAKEUP can suppress a later adaptive
> > wakeup from a valid record, leaving an epoll-based userspace consumer
> > asleep even though data is available in the ring buffer.
> >
> > Scenario:
> >
> > epoll_wait(rb_fd);                     // blocks
> >
> > rec = bpf_ringbuf_reserve(&rb, ...);
> > bpf_ringbuf_discard(rec, BPF_RB_NO_WAKEUP);
> >
> > rec = bpf_ringbuf_reserve(&rb, ...);
> > bpf_ringbuf_submit(rec, 0);           // valid record, but no wakeup
> >
> > This behavior is surprising in the context of bpf_ringbuf_discard()
> > as it seems natural not to want to wake userspace.
>
> It appears that once you do a submit or discard with NO_WAKEUP, you can
> never go back to adaptive mode. You will need to do an explicit
> BPF_RB_FORCE_WAKEUP to do a wakeup.
>
> This looks like expected behaviour (by design) but for discard the
> programmer might think: "I'm discarding and there's no data, why would I
> wake anyone?  Let me pass BPF_RB_NO_WAKEUP to be a good citizen."

This is exactly what I've seen and it created real bugs.

>
> I am not sure if we just want to change the description of the helper or
> also fix the code to ignore the BPF_RB_NO_WAKEUP for discard?

I wanted to avoid breaking something that might be in use by advanced use-cases.

>
> Or a better approach would be to add a wakeup_needed flag to struct
> bpf_ringbuf. When NO_WAKEUP suppresses a wakeup that would have fired
> (cons_pos == rec_pos), it sets the flag. Any subsequent adaptive commit
> sees the flag and sends the wakeup, clearing it. FORCE_WAKEUP also
> clears it. But this could change how the ringbuf behaves for existing
> programs.

Considered this, but wanted to avoid a runtime penalty for this edge case.

>
> I will let others comment on this.

Thanks for your input!

Eyal.
>
> > Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> > Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> > ---
> >  include/uapi/linux/bpf.h       | 3 ++-
> >  tools/include/uapi/linux/bpf.h | 3 ++-
> >  2 files changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index c8d400b7680a..c46b06d45904 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -4645,7 +4645,8 @@ union bpf_attr {
> >   *   Description
> >   *           Discard reserved ring buffer sample, pointed to by *data*.
> >   *           If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> > - *           of new data availability is sent.
> > + *           of new data availability is sent, which is not recommended as
> > + *           it can suppress a later adaptive wakeup from a subsequent submit.
> >   *           If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
> >   *           of new data availability is sent unconditionally.
> >   *           If **0** is specified in *flags*, an adaptive notification
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index 5e38b4887de6..96de37c3b896 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -4645,7 +4645,8 @@ union bpf_attr {
> >   *   Description
> >   *           Discard reserved ring buffer sample, pointed to by *data*.
> >   *           If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> > - *           of new data availability is sent.
> > + *           of new data availability is sent, which is not recommended as
> > + *           it can suppress a later adaptive wakeup from a subsequent submit.
> >   *           If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
> >   *           of new data availability is sent unconditionally.
> >   *           If **0** is specified in *flags*, an adaptive notification

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH bpf-next] bpf: warn against BPF_RB_NO_WAKEUP in bpf_ringbuf_discard()
  2026-03-30 12:17 [PATCH bpf-next] bpf: warn against BPF_RB_NO_WAKEUP in bpf_ringbuf_discard() Eyal Birger
  2026-03-30 13:39 ` Puranjay Mohan
@ 2026-03-31  0:27 ` Andrii Nakryiko
  2026-03-31 12:57   ` Eyal Birger
  1 sibling, 1 reply; 5+ messages in thread
From: Andrii Nakryiko @ 2026-03-31  0:27 UTC (permalink / raw)
  To: Eyal Birger
  Cc: ast, daniel, andrii, martin.lau, eddyz87, memxor, song,
	yonghong.song, jolsa, paul.chaignon, chen.dylane, kpsingh,
	a.s.protopopov, yatsenko, ameryhung, tklauser, shmulik.ladkani,
	bpf, linux-kernel

On Mon, Mar 30, 2026 at 5:18 AM Eyal Birger <eyal.birger@gmail.com> wrote:
>
> Document that BPF_RB_NO_WAKEUP is not recommended for
> bpf_ringbuf_discard().
>
> A discard done with BPF_RB_NO_WAKEUP can suppress a later adaptive
> wakeup from a valid record, leaving an epoll-based userspace consumer
> asleep even though data is available in the ring buffer.
>
> Scenario:
>
> epoll_wait(rb_fd);                     // blocks
>
> rec = bpf_ringbuf_reserve(&rb, ...);
> bpf_ringbuf_discard(rec, BPF_RB_NO_WAKEUP);
>
> rec = bpf_ringbuf_reserve(&rb, ...);
> bpf_ringbuf_submit(rec, 0);           // valid record, but no wakeup
>
> This behavior is surprising in the context of bpf_ringbuf_discard()
> as it seems natural not to want to wake userspace.

why? discarded record still takes space in ringbuf and needs to be
consumed by user-space. So all the same concerns apply here whether
it's discarded or submitted ringbuf record.

>
> Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> ---
>  include/uapi/linux/bpf.h       | 3 ++-
>  tools/include/uapi/linux/bpf.h | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c8d400b7680a..c46b06d45904 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -4645,7 +4645,8 @@ union bpf_attr {
>   *     Description
>   *             Discard reserved ring buffer sample, pointed to by *data*.
>   *             If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> - *             of new data availability is sent.
> + *             of new data availability is sent, which is not recommended as
> + *             it can suppress a later adaptive wakeup from a subsequent submit.

NAK, it's not "not recommended". If you are using NO_WAKEUP you
normally would use it with FORCE_WAKEUP as well. It's not a rule, but
that's the most practical combination. But if you are fancy and know
what you are doing, you are free to combine NO_WAKEUP, FORCE_WAKEUP
and "adaptive wakeup" in any form or shape you want.


If anything, instead, it might maybe help to emphasize that
bpf_ringbuf_discard() does live data in the ringbuf that needs to be
consumed by user-space code (normally transparently in libbpf or other
BPF libraries), and the only distinction between submitted and
discarded record is whether application's user code sees that record
or it's silently skipped.

>   *             If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
>   *             of new data availability is sent unconditionally.
>   *             If **0** is specified in *flags*, an adaptive notification
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 5e38b4887de6..96de37c3b896 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -4645,7 +4645,8 @@ union bpf_attr {
>   *     Description
>   *             Discard reserved ring buffer sample, pointed to by *data*.
>   *             If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> - *             of new data availability is sent.
> + *             of new data availability is sent, which is not recommended as
> + *             it can suppress a later adaptive wakeup from a subsequent submit.
>   *             If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
>   *             of new data availability is sent unconditionally.
>   *             If **0** is specified in *flags*, an adaptive notification
> --
> 2.43.0
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH bpf-next] bpf: warn against BPF_RB_NO_WAKEUP in bpf_ringbuf_discard()
  2026-03-31  0:27 ` Andrii Nakryiko
@ 2026-03-31 12:57   ` Eyal Birger
  0 siblings, 0 replies; 5+ messages in thread
From: Eyal Birger @ 2026-03-31 12:57 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: ast, daniel, andrii, martin.lau, eddyz87, memxor, song,
	yonghong.song, jolsa, paul.chaignon, chen.dylane, kpsingh,
	a.s.protopopov, yatsenko, ameryhung, tklauser, shmulik.ladkani,
	bpf, linux-kernel

Hi,

On Mon, Mar 30, 2026 at 5:27 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Mar 30, 2026 at 5:18 AM Eyal Birger <eyal.birger@gmail.com> wrote:
> >
> > Document that BPF_RB_NO_WAKEUP is not recommended for
> > bpf_ringbuf_discard().
> >
> > A discard done with BPF_RB_NO_WAKEUP can suppress a later adaptive
> > wakeup from a valid record, leaving an epoll-based userspace consumer
> > asleep even though data is available in the ring buffer.
> >
> > Scenario:
> >
> > epoll_wait(rb_fd);                     // blocks
> >
> > rec = bpf_ringbuf_reserve(&rb, ...);
> > bpf_ringbuf_discard(rec, BPF_RB_NO_WAKEUP);
> >
> > rec = bpf_ringbuf_reserve(&rb, ...);
> > bpf_ringbuf_submit(rec, 0);           // valid record, but no wakeup
> >
> > This behavior is surprising in the context of bpf_ringbuf_discard()
> > as it seems natural not to want to wake userspace.
>
> why? discarded record still takes space in ringbuf and needs to be
> consumed by user-space. So all the same concerns apply here whether
> it's discarded or submitted ringbuf record.

Right, but sadly this isn't obvious to someone who hasn't carefully
inspected the implementation.

>
> >
> > Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> > Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> > ---
> >  include/uapi/linux/bpf.h       | 3 ++-
> >  tools/include/uapi/linux/bpf.h | 3 ++-
> >  2 files changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index c8d400b7680a..c46b06d45904 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -4645,7 +4645,8 @@ union bpf_attr {
> >   *     Description
> >   *             Discard reserved ring buffer sample, pointed to by *data*.
> >   *             If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> > - *             of new data availability is sent.
> > + *             of new data availability is sent, which is not recommended as
> > + *             it can suppress a later adaptive wakeup from a subsequent submit.
>
> NAK, it's not "not recommended". If you are using NO_WAKEUP you
> normally would use it with FORCE_WAKEUP as well. It's not a rule, but
> that's the most practical combination. But if you are fancy and know
> what you are doing, you are free to combine NO_WAKEUP, FORCE_WAKEUP
> and "adaptive wakeup" in any form or shape you want.

There are still possible corner cases - for example, multiple silent
discards might fill the ringbuf preventing a later placement of a
valid record - even if it's meant to use "force".
Point is, the user should be extra careful, that fact is documented
in other places, and imho should be noted here too.

>
>
> If anything, instead, it might maybe help to emphasize that
> bpf_ringbuf_discard() does live data in the ringbuf that needs to be
> consumed by user-space code (normally transparently in libbpf or other
> BPF libraries), and the only distinction between submitted and
> discarded record is whether application's user code sees that record
> or it's silently skipped.

Ack. Will send a v2 in that spirit.

Eyal.
>
> >   *             If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
> >   *             of new data availability is sent unconditionally.
> >   *             If **0** is specified in *flags*, an adaptive notification
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index 5e38b4887de6..96de37c3b896 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -4645,7 +4645,8 @@ union bpf_attr {
> >   *     Description
> >   *             Discard reserved ring buffer sample, pointed to by *data*.
> >   *             If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> > - *             of new data availability is sent.
> > + *             of new data availability is sent, which is not recommended as
> > + *             it can suppress a later adaptive wakeup from a subsequent submit.
> >   *             If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
> >   *             of new data availability is sent unconditionally.
> >   *             If **0** is specified in *flags*, an adaptive notification
> > --
> > 2.43.0
> >

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-03-31 12:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-30 12:17 [PATCH bpf-next] bpf: warn against BPF_RB_NO_WAKEUP in bpf_ringbuf_discard() Eyal Birger
2026-03-30 13:39 ` Puranjay Mohan
2026-03-30 13:53   ` Eyal Birger
2026-03-31  0:27 ` Andrii Nakryiko
2026-03-31 12:57   ` Eyal Birger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox