All of lore.kernel.org
 help / color / mirror / Atom feed
From: Puranjay Mohan <puranjay@kernel.org>
To: Eyal Birger <eyal.birger@gmail.com>,
	ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	martin.lau@linux.dev, eddyz87@gmail.com, memxor@gmail.com,
	song@kernel.org, yonghong.song@linux.dev, jolsa@kernel.org,
	paul.chaignon@gmail.com, chen.dylane@linux.dev,
	kpsingh@kernel.org, a.s.protopopov@gmail.com, yatsenko@meta.com,
	ameryhung@gmail.com, tklauser@distanz.ch,
	shmulik.ladkani@gmail.com
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	Eyal Birger <eyal.birger@gmail.com>
Subject: Re: [PATCH bpf-next] bpf: warn against BPF_RB_NO_WAKEUP in bpf_ringbuf_discard()
Date: Mon, 30 Mar 2026 14:39:23 +0100	[thread overview]
Message-ID: <m2qzp1zbas.fsf@kernel.org> (raw)
In-Reply-To: <20260330121732.1601352-1-eyal.birger@gmail.com>

Eyal Birger <eyal.birger@gmail.com> writes:

> Document that BPF_RB_NO_WAKEUP is not recommended for
> bpf_ringbuf_discard().
>
> A discard done with BPF_RB_NO_WAKEUP can suppress a later adaptive
> wakeup from a valid record, leaving an epoll-based userspace consumer
> asleep even though data is available in the ring buffer.
>
> Scenario:
>
> epoll_wait(rb_fd);                     // blocks
>
> rec = bpf_ringbuf_reserve(&rb, ...);
> bpf_ringbuf_discard(rec, BPF_RB_NO_WAKEUP);
>
> rec = bpf_ringbuf_reserve(&rb, ...);
> bpf_ringbuf_submit(rec, 0);           // valid record, but no wakeup
>
> This behavior is surprising in the context of bpf_ringbuf_discard()
> as it seems natural not to want to wake userspace.

It appears that once you do a submit or discard with NO_WAKEUP, you can
never go back to adaptive mode. You will need to do an explicit
BPF_RB_FORCE_WAKEUP to do a wakeup. 

This looks like expected behaviour (by design) but for discard the
programmer might think: "I'm discarding and there's no data, why would I
wake anyone?  Let me pass BPF_RB_NO_WAKEUP to be a good citizen." 

I am not sure if we just want to change the description of the helper or
also fix the code to ignore the BPF_RB_NO_WAKEUP for discard?

Or a better approach would be to add a wakeup_needed flag to struct
bpf_ringbuf. When NO_WAKEUP suppresses a wakeup that would have fired
(cons_pos == rec_pos), it sets the flag. Any subsequent adaptive commit
sees the flag and sends the wakeup, clearing it. FORCE_WAKEUP also
clears it. But this could change how the ringbuf behaves for existing
programs.

I will let others comment on this.

> Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> ---
>  include/uapi/linux/bpf.h       | 3 ++-
>  tools/include/uapi/linux/bpf.h | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c8d400b7680a..c46b06d45904 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -4645,7 +4645,8 @@ union bpf_attr {
>   * 	Description
>   * 		Discard reserved ring buffer sample, pointed to by *data*.
>   * 		If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> - * 		of new data availability is sent.
> + * 		of new data availability is sent, which is not recommended as
> + * 		it can suppress a later adaptive wakeup from a subsequent submit.
>   * 		If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
>   * 		of new data availability is sent unconditionally.
>   * 		If **0** is specified in *flags*, an adaptive notification
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 5e38b4887de6..96de37c3b896 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -4645,7 +4645,8 @@ union bpf_attr {
>   * 	Description
>   * 		Discard reserved ring buffer sample, pointed to by *data*.
>   * 		If **BPF_RB_NO_WAKEUP** is specified in *flags*, no notification
> - * 		of new data availability is sent.
> + * 		of new data availability is sent, which is not recommended as
> + * 		it can suppress a later adaptive wakeup from a subsequent submit.
>   * 		If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
>   * 		of new data availability is sent unconditionally.
>   * 		If **0** is specified in *flags*, an adaptive notification

  reply	other threads:[~2026-03-30 13:39 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-30 12:17 [PATCH bpf-next] bpf: warn against BPF_RB_NO_WAKEUP in bpf_ringbuf_discard() Eyal Birger
2026-03-30 13:39 ` Puranjay Mohan [this message]
2026-03-30 13:53   ` Eyal Birger
2026-03-31  0:27 ` Andrii Nakryiko
2026-03-31 12:57   ` Eyal Birger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m2qzp1zbas.fsf@kernel.org \
    --to=puranjay@kernel.org \
    --cc=a.s.protopopov@gmail.com \
    --cc=ameryhung@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=chen.dylane@linux.dev \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=eyal.birger@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=memxor@gmail.com \
    --cc=paul.chaignon@gmail.com \
    --cc=shmulik.ladkani@gmail.com \
    --cc=song@kernel.org \
    --cc=tklauser@distanz.ch \
    --cc=yatsenko@meta.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.