From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jordan Rife <jrife@google.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
Daniel Borkmann <daniel@iogearbox.net>,
Yonghong Song <yonghong.song@linux.dev>,
Aditi Ghag <aditi.ghag@isovalent.com>
Subject: Re: [RFC PATCH bpf-next 0/3] Avoid skipping sockets with socket iterators
Date: Thu, 20 Mar 2025 22:46:55 -0700
Message-ID: <1974322e-8c30-4c01-a566-642ed2bc7086@linux.dev>
In-Reply-To: <CADKFtnThYT4Jp1Nio8iW+uEdj8+khGmAYaLxW-w5LO4tnLZdkA@mail.gmail.com>

On 3/18/25 5:23 PM, Jordan Rife wrote:
>> imo, this is not a problem for bpf. The bpf prog has access to many fields of a
>> udp_sock (ip addresses, ports, state...etc) to make the right decision. The bpf
>> prog can decide if that rehashed socket needs to be bpf_sock_destroy(), e.g. the
>> saddr in this case because of inet_reset_saddr(sk) before the rehash. From the
>> bpf prog's pov, the rehashed udp_sock is not much different from a new udp_sock
>> getting added from the userspace into the later bucket.
>
> As a user of BPF iterators, I would, and did, find this behavior quite
> surprising. If BPF iterators make no promises about visiting each
> thing exactly once, then should that be made explicit somewhere (maybe
> it already is?)? I think the natural thing for a user is to assume
> that an iterator will only visit each "thing" once and to write their
> code accordingly. Using my example from before, counting the number of
> sockets I destroyed, needs to be implemented differently if I might
> revisit the same socket during iteration by explicitly filtering for
> duplicates inside the BPF program (possibly by filtering out sockets
> where the state is TCP_CLOSE, for example) or userspace. While in this
> particular example it isn't all that important if I get the count
> wrong, how do we know other users of BPF iterators won't make the same
> assumption where repeats matter more? I still think it would be nice
> if iterators themselves guaranteed exactly-once semantics but
> understand if this isn't the direction you want BPF iterators to go.

I can see the argument that the bpf_sock_destroy() kfunc does not work as
expected if the expectation is that the sk will not be rehashed. Is it your use
case? I am open to having another bpf_sock_destroy() kfunc that disallows the
rehash, but that will be different from the current udp_disconnect() behavior
and will need a separate discussion. I currently don't have this use case though.
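
The duplicate-visit concern in the quoted text can be illustrated with a
small userspace simulation. This is only a sketch under stated assumptions,
not kernel or BPF code: fake_sock, fake_table, fake_destroy() and
count_destroyed() are hypothetical names, and the model assumes that
destroying a socket marks it closed and rehashes it into the last bucket,
where a bucket-by-bucket walk will meet it again.

```c
#include <assert.h>
#include <string.h>

#define NBUCKETS   4
#define BUCKET_CAP 8

/* Hypothetical stand-ins; the values mirror TCP_ESTABLISHED (1) and TCP_CLOSE (7). */
enum { ST_ESTABLISHED = 1, ST_CLOSE = 7 };

struct fake_sock { int id; int state; };

struct fake_table {
    struct fake_sock *slot[NBUCKETS][BUCKET_CAP];
    int len[NBUCKETS];
};

static void table_add(struct fake_table *t, int b, struct fake_sock *sk)
{
    assert(t->len[b] < BUCKET_CAP);
    t->slot[b][t->len[b]++] = sk;
}

static void table_del(struct fake_table *t, int b, struct fake_sock *sk)
{
    for (int i = 0; i < t->len[b]; i++) {
        if (t->slot[b][i] == sk) {
            /* Fill the hole with the last entry; order is not preserved. */
            t->slot[b][i] = t->slot[b][--t->len[b]];
            return;
        }
    }
}

/* Assumption: a destroy closes the socket and rehashes it into the last bucket. */
static void fake_destroy(struct fake_table *t, int b, struct fake_sock *sk)
{
    sk->state = ST_CLOSE;
    table_del(t, b, sk);
    table_add(t, NBUCKETS - 1, sk);
}

/* Walk bucket by bucket over a snapshot of each bucket and destroy what we see. */
static int walk_and_destroy(struct fake_table *t, int skip_closed)
{
    int destroyed = 0;

    for (int b = 0; b < NBUCKETS; b++) {
        struct fake_sock *batch[BUCKET_CAP];
        int n = t->len[b];

        memcpy(batch, t->slot[b], n * sizeof(batch[0]));
        for (int i = 0; i < n; i++) {
            /* Duplicate filter: a closed socket was already destroyed earlier. */
            if (skip_closed && batch[i]->state == ST_CLOSE)
                continue;
            fake_destroy(t, b, batch[i]);
            destroyed++;
        }
    }
    return destroyed;
}

/* Three sockets in buckets 0, 1 and the last bucket; returns the destroy count. */
int count_destroyed(int skip_closed)
{
    struct fake_sock socks[3] = {
        { 1, ST_ESTABLISHED }, { 2, ST_ESTABLISHED }, { 3, ST_ESTABLISHED },
    };
    struct fake_table t;

    memset(&t, 0, sizeof(t));
    table_add(&t, 0, &socks[0]);
    table_add(&t, 1, &socks[1]);
    table_add(&t, NBUCKETS - 1, &socks[2]);
    return walk_and_destroy(&t, skip_closed);
}
```

In this model count_destroyed(0) counts the two rehashed sockets a second
time when the walk reaches the last bucket, while count_destroyed(1) skips
them because their state is already closed, so each socket is counted once.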