BPF List
 help / color / mirror / Atom feed
From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jordan Rife <jordan@jrife.io>
Cc: Aditi Ghag <aditi.ghag@isovalent.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	Kuniyuki Iwashima <kuniyu@amazon.com>,
	bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH v2 bpf-next 2/5] bpf: udp: Propagate ENOMEM up from bpf_iter_udp_batch
Date: Fri, 11 Apr 2025 15:10:51 -0700	[thread overview]
Message-ID: <7ed28273-a716-4638-912d-f86f965e54bb@linux.dev> (raw)
In-Reply-To: <20250411173551.772577-3-jordan@jrife.io>

On 4/11/25 10:35 AM, Jordan Rife wrote:
> Stop iteration if the current bucket can't be contained in a single
> batch to avoid choosing between skipping or repeating sockets. In cases
> where none of the saved cookies can be found in the current bucket and

Since the cookie change is in the next patch, it will be useful to mention it is 
a prep work for the next patch.

> the batch isn't big enough to contain all the sockets in the bucket,
> there are really only two choices, neither of which is desirable:
> 
> 1. Start from the beginning, assuming we haven't seen any sockets in the
>     current bucket, in which case we might repeat a socket we've already
>     seen.
> 2. Go to the next bucket to avoid repeating a socket we may have already
>     seen, in which case we may accidentally skip a socket that we didn't
>     yet visit.
> 
> To avoid this tradeoff, enforce the invariant that the batch always
> contains a full snapshot of the bucket from last time by returning
> -ENOMEM if bpf_iter_udp_realloc_batch() can't grab enough memory to fit
> all sockets in the current bucket.
> 
> To test this code path, I forced bpf_iter_udp_realloc_batch() to return
> -ENOMEM when called from within bpf_iter_udp_batch() and observed that
> read() fails in userspace with errno set to ENOMEM. Otherwise, it's a
> bit hard to test this scenario.
> 
> Link: https://lore.kernel.org/bpf/CABi4-ogUtMrH8-NVB6W8Xg_F_KDLq=yy-yu-tKr2udXE2Mu1Lg@mail.gmail.com/
> 
> Signed-off-by: Jordan Rife <jordan@jrife.io>
> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
>   net/ipv4/udp.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 59c3281962b9..1e8ae08d24db 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -3410,6 +3410,7 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
>   	unsigned int batch_sks = 0;
>   	bool resized = false;
>   	struct sock *sk;
> +	int err = 0;
>   
>   	resume_bucket = state->bucket;
>   	resume_offset = iter->offset;
> @@ -3475,7 +3476,11 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
>   		iter->st_bucket_done = true;
>   		goto done;
>   	}
> -	if (!resized && !bpf_iter_udp_realloc_batch(iter, batch_sks * 3 / 2)) {
> +	if (!resized) {

The resized == true case will have a similar issue. Meaning the next 
bpf_iter_udp_batch() will end up skipping the remaining sk in that bucket, e.g. 
the partial-bucket batch has been consumed, so cur_sk == end_sk but 
st_bucket_done == false and bpf_iter_udp_resume() returns NULL. It is sort of a 
regression from the current "offset" implementation for this case. Any thought 
on how to make it better?

> +		err = bpf_iter_udp_realloc_batch(iter, batch_sks * 3 / 2);
> +		if (err)
> +			return ERR_PTR(err);
> +
>   		resized = true;
>   		/* After allocating a larger batch, retry one more time to grab
>   		 * the whole bucket.


  reply	other threads:[~2025-04-11 22:11 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-11 17:35 [PATCH v2 bpf-next 0/5] bpf: udp: Exactly-once socket iteration Jordan Rife
2025-04-11 17:35 ` [PATCH v2 bpf-next 1/5] bpf: udp: Use bpf_udp_iter_batch_item for bpf_udp_iter_state batch items Jordan Rife
2025-04-11 17:35 ` [PATCH v2 bpf-next 2/5] bpf: udp: Propagate ENOMEM up from bpf_iter_udp_batch Jordan Rife
2025-04-11 22:10   ` Martin KaFai Lau [this message]
2025-04-11 23:31     ` Jordan Rife
2025-04-12  3:47       ` Martin KaFai Lau
2025-04-14  0:02         ` Jordan Rife
2025-04-14 21:54           ` Martin KaFai Lau
2025-04-14 23:59             ` Jordan Rife
2025-04-11 17:35 ` [PATCH v2 bpf-next 3/5] bpf: udp: Avoid socket skips and repeats during iteration Jordan Rife
2025-04-11 20:12   ` Kuniyuki Iwashima
2025-04-11 17:35 ` [PATCH v2 bpf-next 4/5] selftests/bpf: Return socket cookies from sock_iter_batch progs Jordan Rife
2025-04-11 17:35 ` [PATCH v2 bpf-next 5/5] selftests/bpf: Add tests for bucket resume logic in UDP socket iterators Jordan Rife
2025-04-11 22:32   ` Martin KaFai Lau
2025-04-15  0:04     ` Jordan Rife

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7ed28273-a716-4638-912d-f86f965e54bb@linux.dev \
    --to=martin.lau@linux.dev \
    --cc=aditi.ghag@isovalent.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jordan@jrife.io \
    --cc=kuniyu@amazon.com \
    --cc=netdev@vger.kernel.org \
    --cc=willemdebruijn.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox