From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jordan Rife <jrife@google.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
Daniel Borkmann <daniel@iogearbox.net>,
Yonghong Song <yonghong.song@linux.dev>,
Aditi Ghag <aditi.ghag@isovalent.com>
Subject: Re: [RFC PATCH bpf-next 0/3] Avoid skipping sockets with socket iterators
Date: Thu, 20 Mar 2025 22:46:55 -0700
Message-ID: <1974322e-8c30-4c01-a566-642ed2bc7086@linux.dev>
In-Reply-To: <CADKFtnThYT4Jp1Nio8iW+uEdj8+khGmAYaLxW-w5LO4tnLZdkA@mail.gmail.com>

On 3/18/25 5:23 PM, Jordan Rife wrote:
>> imo, this is not a problem for bpf. The bpf prog has access to many fields of a
>> udp_sock (ip addresses, ports, state...etc) to make the right decision. The bpf
>> prog can decide if that rehashed socket needs to be bpf_sock_destroy(), e.g. the
>> saddr in this case because of inet_reset_saddr(sk) before the rehash. From the
>> bpf prog's pov, the rehashed udp_sock is not much different from a new udp_sock
>> getting added from the userspace into the later bucket.
>
> As a user of BPF iterators, I would, and did, find this behavior quite
> surprising. If BPF iterators make no promises about visiting each
> thing exactly once, then should that be made explicit somewhere (maybe
> it already is?)? I think the natural thing for a user is to assume
> that an iterator will only visit each "thing" once and to write their
> code accordingly. Using my example from before, counting the number of
> sockets I destroyed, needs to be implemented differently if I might
> revisit the same socket during iteration by explicitly filtering for
> duplicates inside the BPF program (possibly by filtering out sockets
> where the state is TCP_CLOSE, for example) or userspace. While in this
> particular example it isn't all that important if I get the count
> wrong, how do we know other users of BPF iterators won't make the same
> assumption where repeats matter more? I still think it would be nice
> if iterators themselves guaranteed exactly-once semantics but
> understand if this isn't the direction you want BPF iterators to go.

I can see the argument that the bpf_sock_destroy() kfunc does not work as
expected if the expectation is that the sk will not be rehashed. Is that your
use case? I am open to having another bpf_sock_destroy() kfunc that disallows
the rehash, but that would differ from the current udp_disconnect() behavior
and would need a separate discussion. I currently don't have this use case
though.
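
For illustration only, the userspace duplicate filtering mentioned in the
quoted text could look roughly like the sketch below. It is not code from the
patch series. It assumes the BPF program reports one record per visited socket
carrying a stable per-socket identifier such as the socket cookie; the names
sock_visit, seen_set and count_unique_visits, and the MAX_SEEN cap, are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEEN 64 /* hypothetical cap, large enough for a sketch */

/* Hypothetical record of one iterator visit as user space might receive it. */
struct sock_visit {
	uint64_t cookie; /* assumed stable per-socket id, e.g. the socket cookie */
};

/* Small linear set of cookies already counted. */
struct seen_set {
	uint64_t cookies[MAX_SEEN];
	size_t len;
};

/* Returns true the first time a cookie is added, false on a repeat. */
static bool seen_set_add(struct seen_set *s, uint64_t cookie)
{
	for (size_t i = 0; i < s->len; i++)
		if (s->cookies[i] == cookie)
			return false;

	assert(s->len < MAX_SEEN);
	s->cookies[s->len++] = cookie;
	return true;
}

/* Counts distinct sockets among the visits, ignoring revisits of a socket. */
static size_t count_unique_visits(const struct sock_visit *visits, size_t n)
{
	struct seen_set seen = { .len = 0 };
	size_t unique = 0;

	for (size_t i = 0; i < n; i++)
		if (seen_set_add(&seen, visits[i].cookie))
			unique++;

	return unique;
}
```

With visits carrying cookies 1, 2, 1, 3 (the socket with cookie 1 visited
again after a rehash), count_unique_visits() reports three sockets rather
than four.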