BPF List
* [RFC PATCH bpf-next 0/3] Exactly-once UDP socket iteration
@ 2025-04-04 22:02 Jordan Rife
From: Jordan Rife @ 2025-04-04 22:02 UTC (permalink / raw)
  To: netdev, bpf
  Cc: Jordan Rife, Aditi Ghag, Daniel Borkmann, Martin KaFai Lau,
	Willem de Bruijn

Both UDP and TCP socket iterators use iter->offset to track progress
through a bucket. It counts the matching sockets from the current bucket
that the iterator has already seen or processed. On subsequent
iterations, if the current bucket has unprocessed items, we skip at
least iter->offset matching items in the bucket before adding any
remaining items to the next batch. However, iter->offset isn't always an
accurate measure of "things already seen" when the underlying bucket
changes between reads, which can lead to repeated or skipped sockets.
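
To make the failure mode concrete, here is a minimal userspace sketch of
offset-based resume. It is not the kernel code: struct bucket, the
helper names, and the assumption that new sockets land at the head of
the bucket are all hypothetical and only model the behavior described
above.

```c
#include <assert.h>
#include <string.h>

#define MAX_SOCKS 8

/* Hypothetical stand-in for one hash bucket: an ordered list of socket IDs. */
struct bucket {
	int ids[MAX_SOCKS];
	int len;
};

/* Model a socket being closed: drop the entry at pos. */
static void bucket_remove(struct bucket *b, int pos)
{
	assert(pos >= 0 && pos < b->len);
	memmove(&b->ids[pos], &b->ids[pos + 1],
		(size_t)(b->len - pos - 1) * sizeof(b->ids[0]));
	b->len--;
}

/* Model a new socket being hashed at the head of the bucket (assumption). */
static void bucket_add_head(struct bucket *b, int id)
{
	assert(b->len < MAX_SOCKS);
	memmove(&b->ids[1], &b->ids[0], (size_t)b->len * sizeof(b->ids[0]));
	b->ids[0] = id;
	b->len++;
}

/* Offset-based resume: skip `offset` entries, then copy up to `max` IDs. */
static int read_batch(const struct bucket *b, int offset, int max, int *out)
{
	int n = 0;

	for (int i = offset; i < b->len && n < max; i++)
		out[n++] = b->ids[i];
	return n;
}

/* Returns the first ID seen by the second read after socket 1 goes away. */
static int resume_after_remove(void)
{
	struct bucket b = { .ids = { 1, 2, 3, 4 }, .len = 4 };
	int out[MAX_SOCKS];
	int offset = read_batch(&b, 0, 2, out);	/* first read sees 1, 2 */

	bucket_remove(&b, 0);			/* bucket is now 2, 3, 4 */
	read_batch(&b, offset, 2, out);		/* skips 2, 3: sees only 4 */
	return out[0];
}

/* Returns the first ID seen by the second read after socket 9 appears. */
static int resume_after_add(void)
{
	struct bucket b = { .ids = { 1, 2, 3, 4 }, .len = 4 };
	int out[MAX_SOCKS];
	int offset = read_batch(&b, 0, 2, out);	/* first read sees 1, 2 */

	bucket_add_head(&b, 9);			/* bucket is now 9, 1, 2, 3, 4 */
	read_batch(&b, offset, 2, out);		/* skips 9, 1: sees 2 again */
	return out[0];
}
```

In this model, resume_after_remove() resumes at socket 4, so socket 3 is
never visited, while resume_after_add() resumes at socket 2, which the
first read already returned.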

In my original RFC [1], I proposed a solution that added a new index
field to struct sock_common, but the general feedback was that we should
avoid this. After some discussion, Martin suggested using socket cookies
to keep track of what we haven't seen yet in the current bucket. This
series is a follow-up to that discussion and implements a PoC of this
approach.

This series replaces struct sock **batch inside struct
bpf_udp_iter_state with union bpf_udp_iter_batch_item *batch, where
union bpf_udp_iter_batch_item can contain either a pointer to a socket
or a socket cookie. During reads, batch contains pointers to all sockets
in the current batch, while between reads, batch contains the cookies of
all sockets in the current bucket that have yet to be processed. On
subsequent reads, when iteration resumes, bpf_iter_udp_batch finds the
first saved cookie that matches a socket in the bucket's socket list and
picks up from there to construct the next batch. In the common case,
assuming it's rare that the next socket disappears before the next read
occurs, we should only need to scan as far as we did with the
offset-based approach to find the starting point. If the next socket is
no longer there, we keep scanning through the saved cookies list until
we find a match. The worst case is when none of the sockets from last
time exist anymore, but again, this should be rare.
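
The resume step described above can be sketched in userspace as follows.
This is an illustration under stated assumptions, not the code from the
patches: struct fake_sock, union batch_item and its field names,
find_resume_point(), and demo_resume_cookie() are all hypothetical, and
what the iterator does when no saved cookie matches is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct sock; only the cookie matters. */
struct fake_sock {
	uint64_t cookie;
	struct fake_sock *next;	/* next socket in the bucket's list */
};

/* Models the idea behind union bpf_udp_iter_batch_item; names are assumed. */
union batch_item {
	struct fake_sock *sock;	/* meaningful while a read is in progress */
	uint64_t cookie;	/* meaningful between reads */
};

/*
 * Find where to resume: try each saved cookie in order and return the
 * first socket in the bucket that matches one. Returns NULL if every
 * remembered socket is gone.
 */
static struct fake_sock *find_resume_point(struct fake_sock *head,
					   const union batch_item *saved,
					   size_t nsaved)
{
	for (size_t i = 0; i < nsaved; i++) {
		for (struct fake_sock *sk = head; sk; sk = sk->next) {
			if (sk->cookie == saved[i].cookie)
				return sk;
		}
	}
	return NULL;
}

/*
 * Scenario: the previous read left cookies 20, 30, 40 unprocessed, then
 * socket 20 was closed, so the bucket now holds 10 -> 30 -> 40.
 * Returns the cookie of the socket where iteration resumes, or 0 if none.
 */
static uint64_t demo_resume_cookie(void)
{
	struct fake_sock s40 = { .cookie = 40, .next = NULL };
	struct fake_sock s30 = { .cookie = 30, .next = &s40 };
	struct fake_sock s10 = { .cookie = 10, .next = &s30 };
	union batch_item saved[] = {
		{ .cookie = 20 }, { .cookie = 30 }, { .cookie = 40 },
	};
	struct fake_sock *sk = find_resume_point(&s10, saved, 3);

	return sk ? sk->cookie : 0;
}
```

In this scenario the first saved cookie (20) no longer matches anything,
so the scan falls through to the second one and iteration resumes at the
socket with cookie 30. When the first saved socket is still present,
only one pass over the bucket is needed.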

[1]: https://lore.kernel.org/bpf/20250313233615.2329869-1-jrife@google.com/

Jordan Rife (3):
  bpf: udp: Use bpf_udp_iter_batch_item for bpf_udp_iter_state batch
    items
  bpf: udp: Avoid socket skips and repeats during iteration
  selftests/bpf: Add tests for bucket resume logic in UDP socket
    iterators

 include/linux/udp.h                           |   3 +
 net/ipv4/udp.c                                |  94 +++-
 .../bpf/prog_tests/sock_iter_batch.c          | 452 +++++++++++++++++-
 .../selftests/bpf/progs/bpf_tracing_net.h     |   1 +
 .../selftests/bpf/progs/sock_iter_batch.c     |  24 +-
 5 files changed, 533 insertions(+), 41 deletions(-)

-- 
2.43.0



Thread overview: 12+ messages
2025-04-04 22:02 [RFC PATCH bpf-next 0/3] Exactly-once UDP socket iteration Jordan Rife
2025-04-04 22:02 ` [RFC PATCH bpf-next 1/3] bpf: udp: Use bpf_udp_iter_batch_item for bpf_udp_iter_state batch items Jordan Rife
2025-04-04 22:02 ` [RFC PATCH bpf-next 2/3] bpf: udp: Avoid socket skips and repeats during iteration Jordan Rife
2025-04-04 23:20   ` Kuniyuki Iwashima
2025-04-07 23:30     ` Jordan Rife
2025-04-08  0:16       ` Kuniyuki Iwashima
2025-04-08  2:39         ` Jordan Rife
2025-04-08  5:23           ` Martin KaFai Lau
2025-04-09  0:11             ` Jordan Rife
2025-04-07 21:56   ` Martin KaFai Lau
2025-04-07 23:39     ` Jordan Rife
2025-04-04 22:02 ` [RFC PATCH bpf-next 3/3] selftests/bpf: Add tests for bucket resume logic in UDP socket iterators Jordan Rife
