public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v3 0/4] udp: Preserve UDP socket addresses on abort
@ 2026-03-30 21:57 Jordan Rife
  2026-03-30 21:57 ` [PATCH net-next v3 1/4] udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED Jordan Rife
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Jordan Rife @ 2026-03-30 21:57 UTC (permalink / raw)
  To: netdev
  Cc: Jordan Rife, bpf, Willem de Bruijn, Eric Dumazet, Daniel Borkmann,
	Martin KaFai Lau, Stanislav Fomichev, Andrii Nakryiko,
	Yusuke Suzuki, Jakub Kicinski, Kuniyuki Iwashima

BPF cgroup/sock_release hooks can be useful for performing cleanup,
map maintenance, or other bookkeeping when sockets are released. Cilium
uses cgroup/sock_release hooks to do just that, cleaning up maps keyed
by socket destination addresses. This works fine for TCP and connected
UDP sockets when they're closed gracefully, but Yusuke reported in [1]
that behavior is inconsistent following a socket abort.

From the perspective of a BPF sock_release hook, the state of sockets
following a socket abort (udp_abort, tcp_abort) is inconsistent and
surprising. After a sequence like the following,

1. connect(127.0.0.1:10001) or connect(::1:10001)
2. abort (diag_destroy() -> tcp_abort() or udp_abort())
3. close() -> sock_release hook runs

, the state of the sock_release program context differs depending on the
protocol and IP version.

   +--------------------------------------------------------------+
   |        Configuration        |           ctx fields           |
   +------+----------+-----------+-----------+---------+----------+
   | Case | Protocol | IP Family | dst_ip4   | dst_ip6 | dst_port |
   +------+----------+-----------+-----------+---------+----------+
   | 1    | TCP      | IPv4      | 127.0.0.1 | -       | 10001    |
   | 2    | TCP      | IPv6      | -         | ::1     | 10001    |
   | 3    | UDP      | IPv4      | 0         | -       | 0        |
   | 4    | UDP      | IPv6      | -         | ::1     | 0        |
   +------+----------+-----------+-----------+---------+----------+

For TCP, the state of dst_ip4/dst_ip6/dst_port are preserved. In the
case of UDP, both dst_ip4 and dst_port are cleared for IPv4 while
dst_ip6 remains intact for IPv6. This can be confusing for users of BPF
like Cilium where a sock_release hook makes use of these fields and
expects them to match a socket's state prior to abort. This series aims
to make the behavior consistent across the board to eliminate this
pitfall by preserving the state of dst_ip4 and dst_port after an abort.

[1]: https://github.com/cilium/cilium/issues/42649

v2: https://lore.kernel.org/netdev/20260303170106.129698-1-jrife@google.com/

CHANGES
=======
v2 -> v3:
* Expand selftests to add coverage for scenarios where a socket is bound
  and connected.
* Simply unhashing a socket in udp_abort() as in v2 immediately releases
  any ports the socket is bound to. Instead, the socket should hold onto
  its port until it is closed just as before (Kuniyuki). v3 employs a
  new strategy where inet_daddr and inet_dport are left intact in
  __udp_disconnect during udp_abort and bound sockets remain in the
  primary and portaddr hashes just as before. At the same time, the
  logic in hash lookups is adjusted so that inet_daddr/inet_dport are
  ignored unless the socket is currently connected (sk_state ==
  TCP_ESTABLISHED).
    Kuniyuki mentioned that performance may be a concern with changes to
    logic in the fast path. I've tried to address these concerns by
    testing the performance of udp4_lib_lookup2 before and after making
    the changes.
v1 -> v2:
* Set connect_fd back to -1 after calling destroy() in the selftest
  (Jakub).

Jordan Rife (4):
  udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED
  udp: Remove disconnected sockets from the 4-tuple hash
  udp: Preserve destination address info after abort
  selftests/bpf: Ensure dst addr/port are preserved after socket abort

 net/ipv4/udp.c                                |  47 +++--
 net/ipv6/udp.c                                |  18 +-
 .../bpf/prog_tests/sock_destroy_release.c     | 180 ++++++++++++++++++
 .../bpf/progs/sock_destroy_release.c          |  56 ++++++
 4 files changed, 278 insertions(+), 23 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/sock_destroy_release.c
 create mode 100644 tools/testing/selftests/bpf/progs/sock_destroy_release.c

-- 
2.53.0.1118.gaef5881109-goog


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2026-04-01 20:50 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-30 21:57 [PATCH net-next v3 0/4] udp: Preserve UDP socket addresses on abort Jordan Rife
2026-03-30 21:57 ` [PATCH net-next v3 1/4] udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED Jordan Rife
2026-03-31  1:21   ` Kuniyuki Iwashima
2026-04-01 20:50     ` Jordan Rife
2026-03-30 21:57 ` [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash Jordan Rife
2026-03-31 16:51   ` kernel test robot
2026-03-31 17:33   ` kernel test robot
2026-03-31 17:42   ` kernel test robot
2026-03-31 17:55   ` kernel test robot
2026-03-31 18:49   ` kernel test robot
2026-03-30 21:57 ` [PATCH net-next v3 3/4] udp: Preserve destination address info after abort Jordan Rife
2026-03-30 21:57 ` [PATCH net-next v3 4/4] selftests/bpf: Ensure dst addr/port are preserved after socket abort Jordan Rife

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox