public inbox for netdev@vger.kernel.org
* [PATCH net-next v3 0/4] udp: Preserve UDP socket addresses on abort
@ 2026-03-30 21:57 Jordan Rife
  2026-03-30 21:57 ` [PATCH net-next v3 1/4] udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED Jordan Rife
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Jordan Rife @ 2026-03-30 21:57 UTC
  To: netdev
  Cc: Jordan Rife, bpf, Willem de Bruijn, Eric Dumazet, Daniel Borkmann,
	Martin KaFai Lau, Stanislav Fomichev, Andrii Nakryiko,
	Yusuke Suzuki, Jakub Kicinski, Kuniyuki Iwashima

BPF cgroup/sock_release hooks can be useful for performing cleanup,
map maintenance, or other bookkeeping when sockets are released. Cilium
uses cgroup/sock_release hooks to do just that, cleaning up maps keyed
by socket destination addresses. This works fine for TCP and connected
UDP sockets when they're closed gracefully, but Yusuke reported in [1]
that behavior is inconsistent following a socket abort.

From the perspective of a BPF sock_release hook, the state of sockets
following a socket abort (udp_abort, tcp_abort) is inconsistent and
surprising. After a sequence like the following,

1. connect(127.0.0.1:10001) or connect(::1:10001)
2. abort (diag_destroy() -> tcp_abort() or udp_abort())
3. close() -> sock_release hook runs

the state of the sock_release program context differs depending on the
protocol and IP version.

   +--------------------------------------------------------------+
   |        Configuration        |           ctx fields           |
   +------+----------+-----------+-----------+---------+----------+
   | Case | Protocol | IP Family | dst_ip4   | dst_ip6 | dst_port |
   +------+----------+-----------+-----------+---------+----------+
   | 1    | TCP      | IPv4      | 127.0.0.1 | -       | 10001    |
   | 2    | TCP      | IPv6      | -         | ::1     | 10001    |
   | 3    | UDP      | IPv4      | 0         | -       | 0        |
   | 4    | UDP      | IPv6      | -         | ::1     | 0        |
   +------+----------+-----------+-----------+---------+----------+

For TCP, dst_ip4/dst_ip6/dst_port are all preserved. For UDP over IPv4,
both dst_ip4 and dst_port are cleared; for UDP over IPv6, dst_port is
cleared while dst_ip6 remains intact. This can be confusing for users of
BPF like Cilium, where a sock_release hook makes use of these fields and
expects them to match a socket's state prior to abort. This series aims
to eliminate this pitfall by making the behavior consistent: dst_ip4 and
dst_port are preserved after an abort.

[1]: https://github.com/cilium/cilium/issues/42649

v2: https://lore.kernel.org/netdev/20260303170106.129698-1-jrife@google.com/

CHANGES
=======
v2 -> v3:
* Expand selftests to add coverage for scenarios where a socket is bound
  and connected.
* Simply unhashing a socket in udp_abort() as in v2 immediately releases
  any ports the socket is bound to. Instead, the socket should hold onto
  its port until it is closed just as before (Kuniyuki). v3 employs a
  new strategy where inet_daddr and inet_dport are left intact in
  __udp_disconnect during udp_abort and bound sockets remain in the
  primary and portaddr hashes just as before. At the same time, the
  logic in hash lookups is adjusted so that inet_daddr/inet_dport are
  ignored unless the socket is currently connected (sk_state ==
  TCP_ESTABLISHED).
    Kuniyuki mentioned that performance may be a concern with changes to
    logic in the fast path. I've tried to address these concerns by
    testing the performance of udp4_lib_lookup2 before and after making
    the changes.
v1 -> v2:
* Set connect_fd back to -1 after calling destroy() in the selftest
  (Jakub).

Jordan Rife (4):
  udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED
  udp: Remove disconnected sockets from the 4-tuple hash
  udp: Preserve destination address info after abort
  selftests/bpf: Ensure dst addr/port are preserved after socket abort

 net/ipv4/udp.c                                |  47 +++--
 net/ipv6/udp.c                                |  18 +-
 .../bpf/prog_tests/sock_destroy_release.c     | 180 ++++++++++++++++++
 .../bpf/progs/sock_destroy_release.c          |  56 ++++++
 4 files changed, 278 insertions(+), 23 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/sock_destroy_release.c
 create mode 100644 tools/testing/selftests/bpf/progs/sock_destroy_release.c

-- 
2.53.0.1118.gaef5881109-goog



* [PATCH net-next v3 1/4] udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED
  2026-03-30 21:57 [PATCH net-next v3 0/4] udp: Preserve UDP socket addresses on abort Jordan Rife
@ 2026-03-30 21:57 ` Jordan Rife
  2026-03-31  1:21   ` Kuniyuki Iwashima
  2026-03-30 21:57 ` [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash Jordan Rife
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Jordan Rife @ 2026-03-30 21:57 UTC
  To: netdev
  Cc: Jordan Rife, bpf, Willem de Bruijn, Eric Dumazet, Daniel Borkmann,
	Martin KaFai Lau, Stanislav Fomichev, Andrii Nakryiko,
	Yusuke Suzuki, Jakub Kicinski, Kuniyuki Iwashima

Adjust lookups and scoring to keep their results equivalent to before
even if inet_daddr+inet_dport are left intact after disconnecting a
socket (sk_state == TCP_CLOSE). sk_state == TCP_ESTABLISHED implies that
*daddr is non-zero, so remove redundant checks for that at the same
time. Note that __udp6_lib_demux_lookup already checks if sk_state ==
TCP_ESTABLISHED, so no change was needed there [1].
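
For illustration only (this is not part of the patch), the IPv4 scoring
change can be modeled in userspace roughly as below. The names
model_sock, score_before, and score_after are hypothetical, and the
family, local address, and device terms of compute_score are left out.
Addresses and ports are kept in host byte order for readability.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model; the values mirror the kernel's TCP state enum. */
enum { MODEL_TCP_ESTABLISHED = 1, MODEL_TCP_CLOSE = 7 };

struct model_sock {
	int state;
	uint32_t daddr;
	uint16_t dport;
};

/* Old logic: any non-zero daddr/dport must match the packet's source. */
static int score_before(const struct model_sock *sk, uint32_t saddr,
			uint16_t sport)
{
	int score = 0;

	assert(sk);
	if (sk->daddr) {
		if (sk->daddr != saddr)
			return -1;
		score += 4;
	}
	if (sk->dport) {
		if (sk->dport != sport)
			return -1;
		score += 4;
	}
	return score;
}

/* New logic: daddr/dport are only consulted while the socket is connected. */
static int score_after(const struct model_sock *sk, uint32_t saddr,
		       uint16_t sport)
{
	int score = 0;

	assert(sk);
	if (sk->state == MODEL_TCP_ESTABLISHED) {
		if (sk->daddr != saddr)
			return -1;
		score += 4;

		if (sk->dport) {
			if (sk->dport != sport)
				return -1;
			score += 4;
		}
	}
	return score;
}
```

In this model both functions agree for a connected socket and for a
disconnected socket whose fields were cleared. They differ only when a
socket is in the closed state with daddr/dport still set, which is the
case the later patches in this series introduce.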

I could find no discernible difference in performance in
udp4_lib_lookup2 before and after the change in compute_score.

(AMD Ryzen 9 9900X)

kprobe:udp4_lib_lookup2 {
        @start[cpu] = nsecs;
}
kretprobe:udp4_lib_lookup2 {
        @lookup[cpu] = hist(nsecs - @start[cpu], 2);
}

BEFORE
======
@lookup[11]:
[80, 96)         1387077 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[96, 112)         364973 |@@@@@@@@@@@@@                                       |
[112, 128)         34261 |@                                                   |
[128, 160)          7246 |                                                    |
[160, 192)           215 |                                                    |
[192, 224)           126 |                                                    |

AFTER
=====
@lookup[11]:
[80, 96)         1408594 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[96, 112)         340568 |@@@@@@@@@@@@                                        |
[112, 128)         30753 |@                                                   |
[128, 160)          8019 |                                                    |
[160, 192)           231 |                                                    |
[192, 224)           157 |                                                    |

[1]: https://lore.kernel.org/netdev/20170623222537.130493-1-tracywwnj@gmail.com/

Signed-off-by: Jordan Rife <jrife@google.com>
---
 net/ipv4/udp.c | 20 +++++++++++---------
 net/ipv6/udp.c | 18 +++++++++---------
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index b60fad393e18..d91c587c3657 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -385,16 +385,16 @@ static int compute_score(struct sock *sk, const struct net *net,
 	score = (sk->sk_family == PF_INET) ? 2 : 1;
 
 	inet = inet_sk(sk);
-	if (inet->inet_daddr) {
+	if (sk->sk_state == TCP_ESTABLISHED) {
 		if (inet->inet_daddr != saddr)
 			return -1;
 		score += 4;
-	}
 
-	if (inet->inet_dport) {
-		if (inet->inet_dport != sport)
-			return -1;
-		score += 4;
+		if (inet->inet_dport) {
+			if (inet->inet_dport != sport)
+				return -1;
+			score += 4;
+		}
 	}
 
 	dev_match = udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if,
@@ -796,8 +796,9 @@ static inline bool __udp_is_mcast_sock(struct net *net, const struct sock *sk,
 
 	if (!net_eq(sock_net(sk), net) ||
 	    udp_sk(sk)->udp_port_hash != hnum ||
-	    (inet->inet_daddr && inet->inet_daddr != rmt_addr) ||
-	    (inet->inet_dport != rmt_port && inet->inet_dport) ||
+	    (sk->sk_state == TCP_ESTABLISHED &&
+	     (inet->inet_daddr != rmt_addr ||
+	     (inet->inet_dport != rmt_port && inet->inet_dport))) ||
 	    (inet->inet_rcv_saddr && inet->inet_rcv_saddr != loc_addr) ||
 	    ipv6_only_sock(sk) ||
 	    !udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif))
@@ -2854,7 +2855,8 @@ static struct sock *__udp4_lib_demux_lookup(struct net *net,
 	ports = INET_COMBINED_PORTS(rmt_port, hnum);
 
 	udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
-		if (inet_match(net, sk, acookie, ports, dif, sdif))
+		if (sk->sk_state == TCP_ESTABLISHED &&
+		    inet_match(net, sk, acookie, ports, dif, sdif))
 			return sk;
 		/* Only check first socket in chain */
 		break;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 010b909275dd..b93a9a3e7678 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -147,16 +147,16 @@ static int compute_score(struct sock *sk, const struct net *net,
 	score = 0;
 	inet = inet_sk(sk);
 
-	if (inet->inet_dport) {
+	if (sk->sk_state == TCP_ESTABLISHED) {
 		if (inet->inet_dport != sport)
 			return -1;
 		score++;
-	}
 
-	if (!ipv6_addr_any(&sk->sk_v6_daddr)) {
-		if (!ipv6_addr_equal(&sk->sk_v6_daddr, saddr))
-			return -1;
-		score++;
+		if (!ipv6_addr_any(&sk->sk_v6_daddr)) {
+			if (!ipv6_addr_equal(&sk->sk_v6_daddr, saddr))
+				return -1;
+			score++;
+		}
 	}
 
 	bound_dev_if = READ_ONCE(sk->sk_bound_dev_if);
@@ -949,9 +949,9 @@ static bool __udp_v6_is_mcast_sock(struct net *net, const struct sock *sk,
 
 	if (udp_sk(sk)->udp_port_hash != hnum ||
 	    sk->sk_family != PF_INET6 ||
-	    (inet->inet_dport && inet->inet_dport != rmt_port) ||
-	    (!ipv6_addr_any(&sk->sk_v6_daddr) &&
-		    !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr)) ||
+	    (sk->sk_state == TCP_ESTABLISHED &&
+	     ((inet->inet_dport && inet->inet_dport != rmt_port) ||
+	     !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr))) ||
 	    !udp_sk_bound_dev_eq(net, READ_ONCE(sk->sk_bound_dev_if), dif, sdif) ||
 	    (!ipv6_addr_any(&sk->sk_v6_rcv_saddr) &&
 		    !ipv6_addr_equal(&sk->sk_v6_rcv_saddr, loc_addr)))
-- 
2.53.0.1118.gaef5881109-goog



* [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash
  2026-03-30 21:57 [PATCH net-next v3 0/4] udp: Preserve UDP socket addresses on abort Jordan Rife
  2026-03-30 21:57 ` [PATCH net-next v3 1/4] udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED Jordan Rife
@ 2026-03-30 21:57 ` Jordan Rife
  2026-03-31 16:51   ` kernel test robot
                     ` (4 more replies)
  2026-03-30 21:57 ` [PATCH net-next v3 3/4] udp: Preserve destination address info after abort Jordan Rife
  2026-03-30 21:57 ` [PATCH net-next v3 4/4] selftests/bpf: Ensure dst addr/port are preserved after socket abort Jordan Rife
  3 siblings, 5 replies; 12+ messages in thread
From: Jordan Rife @ 2026-03-30 21:57 UTC
  To: netdev
  Cc: Jordan Rife, bpf, Willem de Bruijn, Eric Dumazet, Daniel Borkmann,
	Martin KaFai Lau, Stanislav Fomichev, Andrii Nakryiko,
	Yusuke Suzuki, Jakub Kicinski, Kuniyuki Iwashima

Currently, previously connected sockets stay in the 4-tuple hash after
__udp_disconnect if the socket was bound to a specific port or port:addr
pair. This is benign if inet_daddr/inet_dport are cleared as well, since
lookups matching the old 4-tuple will not find this socket in the hash.
To maintain the same behavior as before if inet_daddr/inet_dport are not
cleared in __udp_disconnect, always remove a socket from the hash on
disconnect or abort to prevent a lookup for the original 4-tuple from
finding the socket.
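
For illustration only (not part of the patch), the reasoning above can
be modeled in userspace with a toy exact-match table. The names
tuple4_sock, lookup4, and disconnect4 are hypothetical, and the model
assumes, as the description above implies, that the 4-tuple lookup does
not inspect connection state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a socket that may be linked into a 4-tuple table. */
struct tuple4_sock {
	uint32_t laddr;
	uint16_t lport;
	uint32_t daddr;
	uint16_t dport;
	bool hashed4;	/* true while linked into the model's 4-tuple table */
};

/* Exact 4-tuple match over hashed entries; connection state is never checked. */
static struct tuple4_sock *lookup4(struct tuple4_sock *socks, size_t n,
				   uint32_t laddr, uint16_t lport,
				   uint32_t daddr, uint16_t dport)
{
	assert(socks || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct tuple4_sock *sk = &socks[i];

		if (sk->hashed4 && sk->laddr == laddr && sk->lport == lport &&
		    sk->daddr == daddr && sk->dport == dport)
			return sk;
	}
	return NULL;
}

/* Disconnect that optionally clears the destination and optionally unhashes. */
static void disconnect4(struct tuple4_sock *sk, bool clear_dest, bool unhash)
{
	assert(sk);
	if (clear_dest) {
		sk->daddr = 0;
		sk->dport = 0;
	}
	if (unhash)
		sk->hashed4 = false;
}
```

In this model, clearing the destination without unhashing is enough to
make a lookup for the old 4-tuple miss. Once the destination is kept,
only unhashing prevents the stale hit.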

Signed-off-by: Jordan Rife <jrife@google.com>
---
 net/ipv4/udp.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index d91c587c3657..6e5ba2ce9314 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2235,10 +2235,20 @@ int __udp_disconnect(struct sock *sk, int flags)
 }
 EXPORT_SYMBOL(__udp_disconnect);
 
+static int udp_disconnect_unhash4(struct sock *sk, int flags)
+{
+	struct udp_table *udptable = udp_get_table_prot(sk);
+
+	udp_unhash4(udptable, sk);
+	__udp_disconnect(sk, flags);
+
+	return 0;
+}
+
 int udp_disconnect(struct sock *sk, int flags)
 {
 	lock_sock(sk);
-	__udp_disconnect(sk, flags);
+	udp_disconnect_unhash4(sk, flags);
 	release_sock(sk);
 	return 0;
 }
@@ -3254,7 +3264,7 @@ int udp_abort(struct sock *sk, int err)
 
 	sk->sk_err = err;
 	sk_error_report(sk);
-	__udp_disconnect(sk, 0);
+	udp_disconnect_unhash4(sk, 0);
 
 out:
 	if (!has_current_bpf_ctx())
-- 
2.53.0.1118.gaef5881109-goog



* [PATCH net-next v3 3/4] udp: Preserve destination address info after abort
  2026-03-30 21:57 [PATCH net-next v3 0/4] udp: Preserve UDP socket addresses on abort Jordan Rife
  2026-03-30 21:57 ` [PATCH net-next v3 1/4] udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED Jordan Rife
  2026-03-30 21:57 ` [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash Jordan Rife
@ 2026-03-30 21:57 ` Jordan Rife
  2026-03-30 21:57 ` [PATCH net-next v3 4/4] selftests/bpf: Ensure dst addr/port are preserved after socket abort Jordan Rife
  3 siblings, 0 replies; 12+ messages in thread
From: Jordan Rife @ 2026-03-30 21:57 UTC
  To: netdev
  Cc: Jordan Rife, bpf, Willem de Bruijn, Eric Dumazet, Daniel Borkmann,
	Martin KaFai Lau, Stanislav Fomichev, Andrii Nakryiko,
	Yusuke Suzuki, Jakub Kicinski, Kuniyuki Iwashima

For explicit disconnections using connect(AF_UNSPEC), behavior remains
unchanged, while udp_abort now avoids clearing inet_daddr and inet_dport.
This is safe to do without changing behavior elsewhere, since lookups
only consult these fields if the socket is currently connected (sk_state
== TCP_ESTABLISHED). The behavior of getpeername doesn't change w.r.t.
aborted sockets, since it returns -ENOTCONN as long as sk_state ==
TCP_CLOSE. Behavior of BPF socket iterators and /proc/net/udp /does/
change, with both now seeing the non-cleared daddr+dport pair after
an abort. Behavior of BPF socket lookup helpers which invoke
__udp*_lib_lookup doesn't change, since the result of compute_score
should be the same as before.
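
For illustration only (not part of the patch), the split between
explicit disconnect and abort can be modeled in userspace roughly as
below. The names aborted_sock, model_disconnect, and model_getpeername
are hypothetical, and the getpeername check is reduced to the single
condition described above. Addresses and ports are kept in host byte
order for readability.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; the values mirror the kernel's TCP state enum. */
enum { MODEL_TCP_ESTABLISHED = 1, MODEL_TCP_CLOSE = 7 };

struct aborted_sock {
	int state;
	uint32_t daddr;
	uint16_t dport;
};

/*
 * clear_dest == true models connect(AF_UNSPEC); clear_dest == false
 * models the abort path, which keeps daddr/dport for later inspection.
 */
static void model_disconnect(struct aborted_sock *sk, bool clear_dest)
{
	assert(sk);
	sk->state = MODEL_TCP_CLOSE;
	if (clear_dest) {
		sk->daddr = 0;
		sk->dport = 0;
	}
}

/* Reports the peer only while connected, regardless of the stored fields. */
static int model_getpeername(const struct aborted_sock *sk, uint32_t *addr,
			     uint16_t *port)
{
	assert(sk && addr && port);
	if (sk->state == MODEL_TCP_CLOSE)
		return -ENOTCONN;
	*addr = sk->daddr;
	*port = sk->dport;
	return 0;
}
```

In this model an aborted socket still carries its destination, which is
what a sock_release hook would read, while model_getpeername keeps
returning -ENOTCONN because the state is closed.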

Reported-by: Yusuke Suzuki <yusuke.suzuki@isovalent.com>
Signed-off-by: Jordan Rife <jrife@google.com>
---
 net/ipv4/udp.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 6e5ba2ce9314..043496a249ca 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2207,7 +2207,7 @@ static int udp_connect(struct sock *sk, struct sockaddr_unsized *uaddr,
 	return res;
 }
 
-int __udp_disconnect(struct sock *sk, int flags)
+static int ___udp_disconnect(struct sock *sk, int flags, bool clear_dest)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	/*
@@ -2215,8 +2215,10 @@ int __udp_disconnect(struct sock *sk, int flags)
 	 */
 
 	sk->sk_state = TCP_CLOSE;
-	inet->inet_daddr = 0;
-	inet->inet_dport = 0;
+	if (clear_dest) {
+		inet->inet_daddr = 0;
+		inet->inet_dport = 0;
+	}
 	sock_rps_reset_rxhash(sk);
 	sk->sk_bound_dev_if = 0;
 	if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK)) {
@@ -2233,14 +2235,19 @@ int __udp_disconnect(struct sock *sk, int flags)
 	sk_dst_reset(sk);
 	return 0;
 }
+
+int __udp_disconnect(struct sock *sk, int flags)
+{
+	return ___udp_disconnect(sk, flags, true);
+}
 EXPORT_SYMBOL(__udp_disconnect);
 
-static int udp_disconnect_unhash4(struct sock *sk, int flags)
+static int udp_disconnect_unhash4(struct sock *sk, int flags, bool clear_dest)
 {
 	struct udp_table *udptable = udp_get_table_prot(sk);
 
 	udp_unhash4(udptable, sk);
-	__udp_disconnect(sk, flags);
+	___udp_disconnect(sk, flags, clear_dest);
 
 	return 0;
 }
@@ -2248,7 +2255,7 @@ static int udp_disconnect_unhash4(struct sock *sk, int flags)
 int udp_disconnect(struct sock *sk, int flags)
 {
 	lock_sock(sk);
-	udp_disconnect_unhash4(sk, flags);
+	udp_disconnect_unhash4(sk, flags, true);
 	release_sock(sk);
 	return 0;
 }
@@ -3264,7 +3271,7 @@ int udp_abort(struct sock *sk, int err)
 
 	sk->sk_err = err;
 	sk_error_report(sk);
-	udp_disconnect_unhash4(sk, 0);
+	udp_disconnect_unhash4(sk, 0, false);
 
 out:
 	if (!has_current_bpf_ctx())
-- 
2.53.0.1118.gaef5881109-goog



* [PATCH net-next v3 4/4] selftests/bpf: Ensure dst addr/port are preserved after socket abort
  2026-03-30 21:57 [PATCH net-next v3 0/4] udp: Preserve UDP socket addresses on abort Jordan Rife
                   ` (2 preceding siblings ...)
  2026-03-30 21:57 ` [PATCH net-next v3 3/4] udp: Preserve destination address info after abort Jordan Rife
@ 2026-03-30 21:57 ` Jordan Rife
  3 siblings, 0 replies; 12+ messages in thread
From: Jordan Rife @ 2026-03-30 21:57 UTC
  To: netdev
  Cc: Jordan Rife, bpf, Willem de Bruijn, Eric Dumazet, Daniel Borkmann,
	Martin KaFai Lau, Stanislav Fomichev, Andrii Nakryiko,
	Yusuke Suzuki, Jakub Kicinski, Kuniyuki Iwashima

Ensure that sock_release hooks see the original dst_ip4, dst_ip6, and
dst_port values for connected UDP and TCP sockets following a socket
abort.

Signed-off-by: Jordan Rife <jrife@google.com>
---
 .../bpf/prog_tests/sock_destroy_release.c     | 180 ++++++++++++++++++
 .../bpf/progs/sock_destroy_release.c          |  56 ++++++
 2 files changed, 236 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/sock_destroy_release.c
 create mode 100644 tools/testing/selftests/bpf/progs/sock_destroy_release.c

diff --git a/tools/testing/selftests/bpf/prog_tests/sock_destroy_release.c b/tools/testing/selftests/bpf/prog_tests/sock_destroy_release.c
new file mode 100644
index 000000000000..022031332b83
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/sock_destroy_release.c
@@ -0,0 +1,180 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <test_progs.h>
+#include "network_helpers.h"
+#include "sock_destroy_release.skel.h"
+
+#define TEST_NS "sock_destroy_release_netns"
+#define BIND_ADDR4 "127.0.0.1"
+#define BIND_ADDR6 "::1"
+#define ANY_ADDR4 "0.0.0.0"
+#define ANY_ADDR6 "::"
+
+static __u64 socket_cookie(int fd)
+{
+	__u64 cookie;
+	socklen_t cookie_len = sizeof(cookie);
+
+	if (!ASSERT_OK(getsockopt(fd, SOL_SOCKET, SO_COOKIE, &cookie,
+				  &cookie_len), "getsockopt(SO_COOKIE)"))
+		return 0;
+	return cookie;
+}
+
+static void destroy(struct sock_destroy_release *skel, int fd, int sock_type)
+{
+	__u64 cookie = socket_cookie(fd);
+	struct bpf_link *link = NULL;
+	int iter_fd = -1;
+	int nread;
+	__u64 out;
+
+	skel->bss->abort_cookie = cookie;
+
+	link = bpf_program__attach_iter(sock_type == SOCK_STREAM ?
+					skel->progs.abort_tcp :
+					skel->progs.abort_udp, NULL);
+	if (!ASSERT_OK_PTR(link, "bpf_program__attach_iter"))
+		goto done;
+
+	iter_fd = bpf_iter_create(bpf_link__fd(link));
+	if (!ASSERT_OK_FD(iter_fd, "bpf_iter_create"))
+		goto done;
+
+	/* Delete matching socket. */
+	nread = read(iter_fd, &out, sizeof(out));
+	ASSERT_GE(nread, 0, "nread");
+	if (nread)
+		ASSERT_EQ(out, cookie, "cookie matches");
+done:
+	if (iter_fd >= 0)
+		close(iter_fd);
+	bpf_link__destroy(link);
+}
+
+static void do_test(struct sock_destroy_release *skel, int sock_type,
+		    int family, const char *bind_addr_str, const int bind_port)
+{
+	const char *addr = family == AF_INET ? BIND_ADDR4 : BIND_ADDR6;
+	int listen_fd = -1, connect_fd = -1, accept_fd = -1;
+	struct sockaddr_storage bind_addr;
+	static const int port = 10001;
+	socklen_t bind_addr_len;
+
+	listen_fd = start_server(family, sock_type, addr, port, 0);
+	if (!ASSERT_OK_FD(listen_fd, "start_server"))
+		goto cleanup;
+
+	connect_fd = client_socket(family, sock_type, NULL);
+	if (!ASSERT_OK_FD(connect_fd, "client_socket"))
+		goto cleanup;
+
+	if (bind_addr_str) {
+		if (!ASSERT_OK(make_sockaddr(family, bind_addr_str, bind_port,
+					     &bind_addr, &bind_addr_len),
+			       "make_sockaddr"))
+			goto cleanup;
+		if (!ASSERT_OK(bind(connect_fd, (struct sockaddr *)&bind_addr,
+				    bind_addr_len), "bind"))
+			goto cleanup;
+	}
+
+	if (!ASSERT_OK(connect_fd_to_fd(connect_fd, listen_fd, 0),
+		       "connect_fd_to_fd"))
+		goto cleanup;
+
+	memset(&skel->bss->sk, 0, sizeof(skel->bss->sk));
+	destroy(skel, connect_fd, sock_type);
+	close(connect_fd);
+	connect_fd = -1;
+	ASSERT_EQ(ntohs(skel->bss->sk.dst_port), port, "dst_port");
+	if (family == AF_INET) {
+		ASSERT_EQ(ntohl(skel->bss->sk.dst_ip4), 0x7f000001, "dst_ip4");
+	} else {
+		ASSERT_EQ(skel->bss->sk.dst_ip6[0], 0, "dst_ip6[0]");
+		ASSERT_EQ(skel->bss->sk.dst_ip6[1], 0, "dst_ip6[1]");
+		ASSERT_EQ(skel->bss->sk.dst_ip6[2], 0, "dst_ip6[2]");
+		ASSERT_EQ(ntohl(skel->bss->sk.dst_ip6[3]), 0x1, "dst_ip6[3]");
+	}
+cleanup:
+	if (connect_fd >= 0)
+		close(connect_fd);
+	if (accept_fd >= 0)
+		close(accept_fd);
+	if (listen_fd >= 0)
+		close(listen_fd);
+}
+
+static void do_tests(struct sock_destroy_release *skel, int sock_type,
+		     int family, const char * const *bind_addrs,
+		     size_t bind_addrs_len, const int *bind_ports,
+		     size_t bind_ports_len)
+{
+	const char *protocol_name = sock_type == SOCK_STREAM ? "tcp" : "udp";
+	const char *family_name = family == AF_INET ? "ipv4" : "ipv6";
+	char name[256];
+
+	for (size_t i = 0; i < bind_addrs_len; i++) {
+		for (size_t j = 0; j < bind_ports_len; j++) {
+			snprintf(name, sizeof(name), "%s/%s/destroy/%s:%d",
+				 protocol_name, family_name, bind_addrs[i],
+				 bind_ports[j]);
+			if (test__start_subtest(name))
+				do_test(skel, sock_type, family, bind_addrs[i],
+					bind_ports[j]);
+		}
+	}
+}
+
+void test_sock_destroy_release(void)
+{
+	static const char * const bind4_addresses[] = {NULL, ANY_ADDR4,
+						       BIND_ADDR4};
+	static const char * const bind6_addresses[] = {NULL, ANY_ADDR6,
+						       BIND_ADDR6};
+	static const int bind_ports[] = {0, 10002};
+	struct sock_destroy_release *skel = NULL;
+	struct nstoken *nstoken = NULL;
+	int cgroup_fd = -1;
+
+	skel = sock_destroy_release__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open_and_load"))
+		goto done;
+
+	cgroup_fd = test__join_cgroup("/sock_destroy_release");
+	if (!ASSERT_OK_FD(cgroup_fd, "join_cgroup"))
+		goto done;
+
+	skel->links.sock_release = bpf_program__attach_cgroup(
+		skel->progs.sock_release, cgroup_fd);
+	if (!ASSERT_OK_PTR(skel->links.sock_release, "attach_cgroup"))
+		goto done;
+
+	SYS_NOFAIL("ip netns del " TEST_NS);
+	SYS(done, "ip netns add %s", TEST_NS);
+	SYS(done, "ip -net %s link set dev lo up", TEST_NS);
+
+	nstoken = open_netns(TEST_NS);
+	if (!ASSERT_OK_PTR(nstoken, "open_netns"))
+		goto done;
+
+	do_tests(skel, SOCK_STREAM, AF_INET, bind4_addresses,
+		 ARRAY_SIZE(bind4_addresses), bind_ports,
+		 ARRAY_SIZE(bind_ports));
+	do_tests(skel, SOCK_STREAM, AF_INET6, bind6_addresses,
+		 ARRAY_SIZE(bind6_addresses), bind_ports,
+		 ARRAY_SIZE(bind_ports));
+	do_tests(skel, SOCK_DGRAM, AF_INET, bind4_addresses,
+		 ARRAY_SIZE(bind4_addresses), bind_ports,
+		 ARRAY_SIZE(bind_ports));
+	do_tests(skel, SOCK_DGRAM, AF_INET6, bind6_addresses,
+		 ARRAY_SIZE(bind6_addresses), bind_ports,
+		 ARRAY_SIZE(bind_ports));
+done:
+	if (nstoken)
+		close_netns(nstoken);
+	if (cgroup_fd >= 0)
+		close(cgroup_fd);
+	SYS_NOFAIL("ip netns del " TEST_NS);
+	sock_destroy_release__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/sock_destroy_release.c b/tools/testing/selftests/bpf/progs/sock_destroy_release.c
new file mode 100644
index 000000000000..5389f79226f9
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/sock_destroy_release.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+
+volatile __u64 abort_cookie;
+
+void maybe_abort(struct sock_common *sk, struct seq_file *seq)
+{
+	__u64 sock_cookie;
+
+	if (!sk)
+		return;
+
+	sock_cookie = bpf_get_socket_cookie(sk);
+	if (sock_cookie != abort_cookie)
+		return;
+
+	bpf_sock_destroy(sk);
+	bpf_seq_write(seq, &sock_cookie, sizeof(sock_cookie));
+}
+
+SEC("iter/udp")
+int abort_udp(struct bpf_iter__udp *ctx)
+{
+	maybe_abort((struct sock_common *)ctx->udp_sk,
+		    ctx->meta->seq);
+
+	return 0;
+}
+
+SEC("iter/tcp")
+int abort_tcp(struct bpf_iter__tcp *ctx)
+{
+	maybe_abort((struct sock_common *)ctx->sk_common,
+		    ctx->meta->seq);
+
+	return 0;
+}
+
+struct bpf_sock sk = {};
+
+SEC("cgroup/sock_release")
+int sock_release(struct bpf_sock *ctx)
+{
+	sk.dst_ip4 = ctx->dst_ip4;
+	sk.dst_ip6[0] = ctx->dst_ip6[0];
+	sk.dst_ip6[1] = ctx->dst_ip6[1];
+	sk.dst_ip6[2] = ctx->dst_ip6[2];
+	sk.dst_ip6[3] = ctx->dst_ip6[3];
+	sk.dst_port = ctx->dst_port;
+
+	return 1;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.53.0.1118.gaef5881109-goog



* Re: [PATCH net-next v3 1/4] udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED
  2026-03-30 21:57 ` [PATCH net-next v3 1/4] udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED Jordan Rife
@ 2026-03-31  1:21   ` Kuniyuki Iwashima
  2026-04-01 20:50     ` Jordan Rife
  0 siblings, 1 reply; 12+ messages in thread
From: Kuniyuki Iwashima @ 2026-03-31  1:21 UTC
  To: Jordan Rife
  Cc: netdev, bpf, Willem de Bruijn, Eric Dumazet, Daniel Borkmann,
	Martin KaFai Lau, Stanislav Fomichev, Andrii Nakryiko,
	Yusuke Suzuki, Jakub Kicinski

On Mon, Mar 30, 2026 at 2:57 PM Jordan Rife <jrife@google.com> wrote:
>
> Adjust lookups and scoring to keep their results equivalent to before
> even if inet_daddr+inet_dport are left intact after disconnecting a
> socket (sk_state == TCP_CLOSE). sk_state == TCP_ESTABLISHED implies that
> *daddr is non-zero, so remove redundant checks for that at the same
> time. Note that __udp6_lib_demux_lookup already checks if sk_state ==
> TCP_ESTABLISHED, so no change was needed there [1].
>
> I could find no discernible difference in performance in
> udp4_lib_lookup2 before and after the change in compute_score.

What workload did you test the series with?

I think we want to see results under DDoS.


>
> (AMD Ryzen 9 9900X)
>
> kprobe:udp4_lib_lookup2 {
>         @start[cpu] = nsecs;
> }
> kretprobe:udp4_lib_lookup2 {
>         @lookup[cpu] = hist(nsecs - @start[cpu], 2);
> }
>
> BEFORE
> ======
> @lookup[11]:
> [80, 96)         1387077 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [96, 112)         364973 |@@@@@@@@@@@@@                                       |
> [112, 128)         34261 |@                                                   |
> [128, 160)          7246 |                                                    |
> [160, 192)           215 |                                                    |
> [192, 224)           126 |                                                    |
>
> AFTER
> =====
> @lookup[11]:
> [80, 96)         1408594 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [96, 112)         340568 |@@@@@@@@@@@@                                        |
> [112, 128)         30753 |@                                                   |
> [128, 160)          8019 |                                                    |
> [160, 192)           231 |                                                    |
> [192, 224)           157 |                                                    |
>
> [1]: https://lore.kernel.org/netdev/20170623222537.130493-1-tracywwnj@gmail.com/
>
> Signed-off-by: Jordan Rife <jrife@google.com>
> ---
>  net/ipv4/udp.c | 20 +++++++++++---------
>  net/ipv6/udp.c | 18 +++++++++---------
>  2 files changed, 20 insertions(+), 18 deletions(-)
>
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index b60fad393e18..d91c587c3657 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -385,16 +385,16 @@ static int compute_score(struct sock *sk, const struct net *net,
>         score = (sk->sk_family == PF_INET) ? 2 : 1;
>
>         inet = inet_sk(sk);
> -       if (inet->inet_daddr) {
> +       if (sk->sk_state == TCP_ESTABLISHED) {
>                 if (inet->inet_daddr != saddr)
>                         return -1;
>                 score += 4;
> -       }
>
> -       if (inet->inet_dport) {
> -               if (inet->inet_dport != sport)
> -                       return -1;
> -               score += 4;
> +               if (inet->inet_dport) {
> +                       if (inet->inet_dport != sport)
> +                               return -1;
> +                       score += 4;
> +               }
>         }
>
>         dev_match = udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if,
> @@ -796,8 +796,9 @@ static inline bool __udp_is_mcast_sock(struct net *net, const struct sock *sk,
>
>         if (!net_eq(sock_net(sk), net) ||
>             udp_sk(sk)->udp_port_hash != hnum ||
> -           (inet->inet_daddr && inet->inet_daddr != rmt_addr) ||
> -           (inet->inet_dport != rmt_port && inet->inet_dport) ||
> +           (sk->sk_state == TCP_ESTABLISHED &&
> +            (inet->inet_daddr != rmt_addr ||
> +            (inet->inet_dport != rmt_port && inet->inet_dport))) ||
>             (inet->inet_rcv_saddr && inet->inet_rcv_saddr != loc_addr) ||
>             ipv6_only_sock(sk) ||
>             !udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif))
> @@ -2854,7 +2855,8 @@ static struct sock *__udp4_lib_demux_lookup(struct net *net,
>         ports = INET_COMBINED_PORTS(rmt_port, hnum);
>
>         udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
> -               if (inet_match(net, sk, acookie, ports, dif, sdif))
> +               if (sk->sk_state == TCP_ESTABLISHED &&
> +                   inet_match(net, sk, acookie, ports, dif, sdif))
>                         return sk;
>                 /* Only check first socket in chain */
>                 break;
> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> index 010b909275dd..b93a9a3e7678 100644
> --- a/net/ipv6/udp.c
> +++ b/net/ipv6/udp.c
> @@ -147,16 +147,16 @@ static int compute_score(struct sock *sk, const struct net *net,
>         score = 0;
>         inet = inet_sk(sk);
>
> -       if (inet->inet_dport) {
> +       if (sk->sk_state == TCP_ESTABLISHED) {
>                 if (inet->inet_dport != sport)
>                         return -1;
>                 score++;
> -       }
>
> -       if (!ipv6_addr_any(&sk->sk_v6_daddr)) {
> -               if (!ipv6_addr_equal(&sk->sk_v6_daddr, saddr))
> -                       return -1;
> -               score++;
> +               if (!ipv6_addr_any(&sk->sk_v6_daddr)) {

This looks unnecessary.


> +                       if (!ipv6_addr_equal(&sk->sk_v6_daddr, saddr))
> +                               return -1;
> +                       score++;
> +               }
>         }
>
>         bound_dev_if = READ_ONCE(sk->sk_bound_dev_if);
> @@ -949,9 +949,9 @@ static bool __udp_v6_is_mcast_sock(struct net *net, const struct sock *sk,
>
>         if (udp_sk(sk)->udp_port_hash != hnum ||
>             sk->sk_family != PF_INET6 ||
> -           (inet->inet_dport && inet->inet_dport != rmt_port) ||
> -           (!ipv6_addr_any(&sk->sk_v6_daddr) &&
> -                   !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr)) ||
> +           (sk->sk_state == TCP_ESTABLISHED &&
> +            ((inet->inet_dport && inet->inet_dport != rmt_port) ||
> +            !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr))) ||
>             !udp_sk_bound_dev_eq(net, READ_ONCE(sk->sk_bound_dev_if), dif, sdif) ||
>             (!ipv6_addr_any(&sk->sk_v6_rcv_saddr) &&
>                     !ipv6_addr_equal(&sk->sk_v6_rcv_saddr, loc_addr)))
> --
> 2.53.0.1118.gaef5881109-goog
>


* Re: [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash
  2026-03-30 21:57 ` [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash Jordan Rife
@ 2026-03-31 16:51   ` kernel test robot
  2026-03-31 17:33   ` kernel test robot
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2026-03-31 16:51 UTC (permalink / raw)
  To: Jordan Rife, netdev
  Cc: llvm, oe-kbuild-all, Jordan Rife, bpf, Willem de Bruijn,
	Eric Dumazet, Daniel Borkmann, Martin KaFai Lau,
	Stanislav Fomichev, Andrii Nakryiko, Yusuke Suzuki,
	Jakub Kicinski, Kuniyuki Iwashima

Hi Jordan,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Jordan-Rife/udp-Only-compare-daddr-dport-when-sk_state-TCP_ESTABLISHED/20260331-082300
base:   net-next/main
patch link:    https://lore.kernel.org/r/20260330215707.2374657-3-jrife%40google.com
patch subject: [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20260331/202603311827.nefybt0I-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260331/202603311827.nefybt0I-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202603311827.nefybt0I-lkp@intel.com/

All errors (new ones prefixed by >>):

>> net/ipv4/udp.c:2182:31: error: call to undeclared function 'udp_get_table_prot'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
    2182 |         struct udp_table *udptable = udp_get_table_prot(sk);
         |                                      ^
   net/ipv4/udp.c:2182:31: note: did you mean 'vm_get_page_prot'?
   include/linux/mm.h:4105:10: note: 'vm_get_page_prot' declared here
    4105 | pgprot_t vm_get_page_prot(vm_flags_t vm_flags);
         |          ^
>> net/ipv4/udp.c:2182:20: error: incompatible integer to pointer conversion initializing 'struct udp_table *' with an expression of type 'int' [-Wint-conversion]
    2182 |         struct udp_table *udptable = udp_get_table_prot(sk);
         |                           ^          ~~~~~~~~~~~~~~~~~~~~~~
   2 errors generated.


vim +/udp_get_table_prot +2182 net/ipv4/udp.c

  2179	
  2180	static int udp_disconnect_unhash4(struct sock *sk, int flags)
  2181	{
> 2182		struct udp_table *udptable = udp_get_table_prot(sk);
  2183	
  2184		udp_unhash4(udptable, sk);
  2185		__udp_disconnect(sk, flags);
  2186	
  2187		return 0;
  2188	}
  2189	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash
  2026-03-30 21:57 ` [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash Jordan Rife
  2026-03-31 16:51   ` kernel test robot
@ 2026-03-31 17:33   ` kernel test robot
  2026-03-31 17:42   ` kernel test robot
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2026-03-31 17:33 UTC (permalink / raw)
  To: Jordan Rife, netdev
  Cc: oe-kbuild-all, Jordan Rife, bpf, Willem de Bruijn, Eric Dumazet,
	Daniel Borkmann, Martin KaFai Lau, Stanislav Fomichev,
	Andrii Nakryiko, Yusuke Suzuki, Jakub Kicinski, Kuniyuki Iwashima

Hi Jordan,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Jordan-Rife/udp-Only-compare-daddr-dport-when-sk_state-TCP_ESTABLISHED/20260331-082300
base:   net-next/main
patch link:    https://lore.kernel.org/r/20260330215707.2374657-3-jrife%40google.com
patch subject: [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash
config: openrisc-defconfig (https://download.01.org/0day-ci/archive/20260401/202604010147.jxlA7slL-lkp@intel.com/config)
compiler: or1k-linux-gcc (GCC) 15.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260401/202604010147.jxlA7slL-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202604010147.jxlA7slL-lkp@intel.com/

All errors (new ones prefixed by >>):

   net/ipv4/udp.c: In function 'udp_disconnect_unhash4':
>> net/ipv4/udp.c:2182:38: error: implicit declaration of function 'udp_get_table_prot'; did you mean 'vm_get_page_prot'? [-Wimplicit-function-declaration]
    2182 |         struct udp_table *udptable = udp_get_table_prot(sk);
         |                                      ^~~~~~~~~~~~~~~~~~
         |                                      vm_get_page_prot
>> net/ipv4/udp.c:2182:38: error: initialization of 'struct udp_table *' from 'int' makes pointer from integer without a cast [-Wint-conversion]


vim +2182 net/ipv4/udp.c

  2179	
  2180	static int udp_disconnect_unhash4(struct sock *sk, int flags)
  2181	{
> 2182		struct udp_table *udptable = udp_get_table_prot(sk);
  2183	
  2184		udp_unhash4(udptable, sk);
  2185		__udp_disconnect(sk, flags);
  2186	
  2187		return 0;
  2188	}
  2189	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash
  2026-03-30 21:57 ` [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash Jordan Rife
  2026-03-31 16:51   ` kernel test robot
  2026-03-31 17:33   ` kernel test robot
@ 2026-03-31 17:42   ` kernel test robot
  2026-03-31 17:55   ` kernel test robot
  2026-03-31 18:49   ` kernel test robot
  4 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2026-03-31 17:42 UTC (permalink / raw)
  To: Jordan Rife, netdev
  Cc: oe-kbuild-all, Jordan Rife, bpf, Willem de Bruijn, Eric Dumazet,
	Daniel Borkmann, Martin KaFai Lau, Stanislav Fomichev,
	Andrii Nakryiko, Yusuke Suzuki, Jakub Kicinski, Kuniyuki Iwashima

Hi Jordan,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Jordan-Rife/udp-Only-compare-daddr-dport-when-sk_state-TCP_ESTABLISHED/20260331-082300
base:   net-next/main
patch link:    https://lore.kernel.org/r/20260330215707.2374657-3-jrife%40google.com
patch subject: [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash
config: x86_64-rhel-9.4 (https://download.01.org/0day-ci/archive/20260331/202603311954.k4BkWpJN-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260331/202603311954.k4BkWpJN-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202603311954.k4BkWpJN-lkp@intel.com/

All errors (new ones prefixed by >>):

   net/ipv4/udp.c: In function 'udp_disconnect_unhash4':
>> net/ipv4/udp.c:2182:38: error: implicit declaration of function 'udp_get_table_prot'; did you mean 'vm_get_page_prot'? [-Wimplicit-function-declaration]
    2182 |         struct udp_table *udptable = udp_get_table_prot(sk);
         |                                      ^~~~~~~~~~~~~~~~~~
         |                                      vm_get_page_prot
>> net/ipv4/udp.c:2182:38: error: initialization of 'struct udp_table *' from 'int' makes pointer from integer without a cast [-Wint-conversion]


vim +2182 net/ipv4/udp.c

  2179	
  2180	static int udp_disconnect_unhash4(struct sock *sk, int flags)
  2181	{
> 2182		struct udp_table *udptable = udp_get_table_prot(sk);
  2183	
  2184		udp_unhash4(udptable, sk);
  2185		__udp_disconnect(sk, flags);
  2186	
  2187		return 0;
  2188	}
  2189	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash
  2026-03-30 21:57 ` [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash Jordan Rife
                     ` (2 preceding siblings ...)
  2026-03-31 17:42   ` kernel test robot
@ 2026-03-31 17:55   ` kernel test robot
  2026-03-31 18:49   ` kernel test robot
  4 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2026-03-31 17:55 UTC (permalink / raw)
  To: Jordan Rife, netdev
  Cc: oe-kbuild-all, Jordan Rife, bpf, Willem de Bruijn, Eric Dumazet,
	Daniel Borkmann, Martin KaFai Lau, Stanislav Fomichev,
	Andrii Nakryiko, Yusuke Suzuki, Jakub Kicinski, Kuniyuki Iwashima

Hi Jordan,

kernel test robot noticed the following build warnings:

[auto build test WARNING on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Jordan-Rife/udp-Only-compare-daddr-dport-when-sk_state-TCP_ESTABLISHED/20260331-082300
base:   net-next/main
patch link:    https://lore.kernel.org/r/20260330215707.2374657-3-jrife%40google.com
patch subject: [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash
config: sparc64-randconfig-r071-20260331 (https://download.01.org/0day-ci/archive/20260401/202604010154.iewJbUya-lkp@intel.com/config)
compiler: sparc64-linux-gcc (GCC) 8.5.0
smatch: v0.5.0-9004-gb810ac53
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260401/202604010154.iewJbUya-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202604010154.iewJbUya-lkp@intel.com/

All warnings (new ones prefixed by >>):

   net/ipv4/udp.c: In function 'udp_disconnect_unhash4':
   net/ipv4/udp.c:2182:31: error: implicit declaration of function 'udp_get_table_prot'; did you mean 'vm_get_page_prot'? [-Werror=implicit-function-declaration]
     struct udp_table *udptable = udp_get_table_prot(sk);
                                  ^~~~~~~~~~~~~~~~~~
                                  vm_get_page_prot
>> net/ipv4/udp.c:2182:31: warning: initialization of 'struct udp_table *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
   cc1: some warnings being treated as errors


vim +2182 net/ipv4/udp.c

  2179	
  2180	static int udp_disconnect_unhash4(struct sock *sk, int flags)
  2181	{
> 2182		struct udp_table *udptable = udp_get_table_prot(sk);
  2183	
  2184		udp_unhash4(udptable, sk);
  2185		__udp_disconnect(sk, flags);
  2186	
  2187		return 0;
  2188	}
  2189	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash
  2026-03-30 21:57 ` [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash Jordan Rife
                     ` (3 preceding siblings ...)
  2026-03-31 17:55   ` kernel test robot
@ 2026-03-31 18:49   ` kernel test robot
  4 siblings, 0 replies; 12+ messages in thread
From: kernel test robot @ 2026-03-31 18:49 UTC (permalink / raw)
  To: Jordan Rife, netdev
  Cc: llvm, oe-kbuild-all, Jordan Rife, bpf, Willem de Bruijn,
	Eric Dumazet, Daniel Borkmann, Martin KaFai Lau,
	Stanislav Fomichev, Andrii Nakryiko, Yusuke Suzuki,
	Jakub Kicinski, Kuniyuki Iwashima

Hi Jordan,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Jordan-Rife/udp-Only-compare-daddr-dport-when-sk_state-TCP_ESTABLISHED/20260331-082300
base:   net-next/main
patch link:    https://lore.kernel.org/r/20260330215707.2374657-3-jrife%40google.com
patch subject: [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20260401/202604010240.xuggtnr1-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260401/202604010240.xuggtnr1-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202604010240.xuggtnr1-lkp@intel.com/

All errors (new ones prefixed by >>):

>> net/ipv4/udp.c:2182:31: error: call to undeclared function 'udp_get_table_prot'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
    2182 |         struct udp_table *udptable = udp_get_table_prot(sk);
         |                                      ^
   net/ipv4/udp.c:2182:31: note: did you mean 'vm_get_page_prot'?
   include/linux/mm.h:4105:10: note: 'vm_get_page_prot' declared here
    4105 | pgprot_t vm_get_page_prot(vm_flags_t vm_flags);
         |          ^
>> net/ipv4/udp.c:2182:20: error: incompatible integer to pointer conversion initializing 'struct udp_table *' with an expression of type 'int' [-Wint-conversion]
    2182 |         struct udp_table *udptable = udp_get_table_prot(sk);
         |                           ^          ~~~~~~~~~~~~~~~~~~~~~~
   2 errors generated.


vim +/udp_get_table_prot +2182 net/ipv4/udp.c

  2179	
  2180	static int udp_disconnect_unhash4(struct sock *sk, int flags)
  2181	{
> 2182		struct udp_table *udptable = udp_get_table_prot(sk);
  2183	
  2184		udp_unhash4(udptable, sk);
  2185		__udp_disconnect(sk, flags);
  2186	
  2187		return 0;
  2188	}
  2189	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH net-next v3 1/4] udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED
  2026-03-31  1:21   ` Kuniyuki Iwashima
@ 2026-04-01 20:50     ` Jordan Rife
  0 siblings, 0 replies; 12+ messages in thread
From: Jordan Rife @ 2026-04-01 20:50 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: netdev, bpf, Willem de Bruijn, Eric Dumazet, Daniel Borkmann,
	Martin KaFai Lau, Stanislav Fomichev, Andrii Nakryiko,
	Yusuke Suzuki, Jakub Kicinski

On Mon, Mar 30, 2026 at 06:21:39PM -0700, Kuniyuki Iwashima wrote:
> On Mon, Mar 30, 2026 at 2:57 PM Jordan Rife <jrife@google.com> wrote:
> >
> > Adjust lookups and scoring to keep their results equivalent to before
> > even if inet_daddr+inet_dport are left intact after disconnecting a
> > socket (sk_state == TCP_CLOSE). sk_state == TCP_ESTABLISHED implies that
> > *daddr is non-zero, so remove redundant checks for that at the same
> > time. Note that __udp6_lib_demux_lookup already checks if sk_state ==
> > TCP_ESTABLISHED, so no change was needed there [1].
> >
> > I could find no discernible difference in performance in
> > udp4_lib_lookup2 before and after the change in compute_score.
> 
> What workload did you test the series with ?

These measurements were taken on the server side while running a netperf
UDP_STREAM test over a 100 Gbps link.

> I think we want to see results under DDoS.

Intuitively, it seems like the performance should be similar.
sk_state resides in the same cache line as inet_daddr, inet_dport, and
sk_v6_daddr, and we trade a comparison with inet_daddr/sk_v6_daddr for
one with sk_state. Of course, code-level intuition can be wrong, so I'm
happy to do some more extensive testing if you feel it's warranted to
make sure that performance isn't regressing.
 
> >
> > (AMD Ryzen 9 9900X)
> >
> > kprobe:udp4_lib_lookup2 {
> >         @start[cpu] = nsecs;
> > }
> > kretprobe:udp4_lib_lookup2 {
> >         @lookup[cpu] = hist(nsecs - @start[cpu], 2);
> > }
> >
> > BEFORE
> > ======
> > @lookup[11]:
> > [80, 96)         1387077 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> > [96, 112)         364973 |@@@@@@@@@@@@@                                       |
> > [112, 128)         34261 |@                                                   |
> > [128, 160)          7246 |                                                    |
> > [160, 192)           215 |                                                    |
> > [192, 224)           126 |                                                    |
> >
> > AFTER
> > =====
> > @lookup[11]:
> > [80, 96)         1408594 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> > [96, 112)         340568 |@@@@@@@@@@@@                                        |
> > [112, 128)         30753 |@                                                   |
> > [128, 160)          8019 |                                                    |
> > [160, 192)           231 |                                                    |
> > [192, 224)           157 |                                                    |
> >
> > [1]: https://lore.kernel.org/netdev/20170623222537.130493-1-tracywwnj@gmail.com/
> >
> > Signed-off-by: Jordan Rife <jrife@google.com>
> > ---
> >  net/ipv4/udp.c | 20 +++++++++++---------
> >  net/ipv6/udp.c | 18 +++++++++---------
> >  2 files changed, 20 insertions(+), 18 deletions(-)
> >
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index b60fad393e18..d91c587c3657 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -385,16 +385,16 @@ static int compute_score(struct sock *sk, const struct net *net,
> >         score = (sk->sk_family == PF_INET) ? 2 : 1;
> >
> >         inet = inet_sk(sk);
> > -       if (inet->inet_daddr) {
> > +       if (sk->sk_state == TCP_ESTABLISHED) {
> >                 if (inet->inet_daddr != saddr)
> >                         return -1;
> >                 score += 4;
> > -       }
> >
> > -       if (inet->inet_dport) {
> > -               if (inet->inet_dport != sport)
> > -                       return -1;
> > -               score += 4;
> > +               if (inet->inet_dport) {
> > +                       if (inet->inet_dport != sport)
> > +                               return -1;
> > +                       score += 4;
> > +               }
> >         }
> >
> >         dev_match = udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if,
> > @@ -796,8 +796,9 @@ static inline bool __udp_is_mcast_sock(struct net *net, const struct sock *sk,
> >
> >         if (!net_eq(sock_net(sk), net) ||
> >             udp_sk(sk)->udp_port_hash != hnum ||
> > -           (inet->inet_daddr && inet->inet_daddr != rmt_addr) ||
> > -           (inet->inet_dport != rmt_port && inet->inet_dport) ||
> > +           (sk->sk_state == TCP_ESTABLISHED &&
> > +            (inet->inet_daddr != rmt_addr ||
> > +            (inet->inet_dport != rmt_port && inet->inet_dport))) ||
> >             (inet->inet_rcv_saddr && inet->inet_rcv_saddr != loc_addr) ||
> >             ipv6_only_sock(sk) ||
> >             !udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif))
> > @@ -2854,7 +2855,8 @@ static struct sock *__udp4_lib_demux_lookup(struct net *net,
> >         ports = INET_COMBINED_PORTS(rmt_port, hnum);
> >
> >         udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
> > -               if (inet_match(net, sk, acookie, ports, dif, sdif))
> > +               if (sk->sk_state == TCP_ESTABLISHED &&
> > +                   inet_match(net, sk, acookie, ports, dif, sdif))
> >                         return sk;
> >                 /* Only check first socket in chain */
> >                 break;
> > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> > index 010b909275dd..b93a9a3e7678 100644
> > --- a/net/ipv6/udp.c
> > +++ b/net/ipv6/udp.c
> > @@ -147,16 +147,16 @@ static int compute_score(struct sock *sk, const struct net *net,
> >         score = 0;
> >         inet = inet_sk(sk);
> >
> > -       if (inet->inet_dport) {
> > +       if (sk->sk_state == TCP_ESTABLISHED) {
> >                 if (inet->inet_dport != sport)
> >                         return -1;
> >                 score++;
> > -       }
> >
> > -       if (!ipv6_addr_any(&sk->sk_v6_daddr)) {
> > -               if (!ipv6_addr_equal(&sk->sk_v6_daddr, saddr))
> > -                       return -1;
> > -               score++;
> > +               if (!ipv6_addr_any(&sk->sk_v6_daddr)) {
> 
> This looks unnecessary.

Yep, thanks. This is inverted. It should have been similar to the ipv4
logic, where the unnecessary `if (inet->inet_daddr)` check was removed
after adding `if (sk->sk_state == TCP_ESTABLISHED)`:

	if (sk->sk_state == TCP_ESTABLISHED) {
		if (inet->inet_dport) {
			if (inet->inet_dport != sport)
				return -1;
			score++;
		}

		if (!ipv6_addr_equal(&sk->sk_v6_daddr, saddr))
			return -1;
		score++;
	}
 
> 
> > +                       if (!ipv6_addr_equal(&sk->sk_v6_daddr, saddr))
> > +                               return -1;
> > +                       score++;
> > +               }
> >         }
> >
> >         bound_dev_if = READ_ONCE(sk->sk_bound_dev_if);
> > @@ -949,9 +949,9 @@ static bool __udp_v6_is_mcast_sock(struct net *net, const struct sock *sk,
> >
> >         if (udp_sk(sk)->udp_port_hash != hnum ||
> >             sk->sk_family != PF_INET6 ||
> > -           (inet->inet_dport && inet->inet_dport != rmt_port) ||
> > -           (!ipv6_addr_any(&sk->sk_v6_daddr) &&
> > -                   !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr)) ||
> > +           (sk->sk_state == TCP_ESTABLISHED &&
> > +            ((inet->inet_dport && inet->inet_dport != rmt_port) ||
> > +            !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr))) ||
> >             !udp_sk_bound_dev_eq(net, READ_ONCE(sk->sk_bound_dev_if), dif, sdif) ||
> >             (!ipv6_addr_any(&sk->sk_v6_rcv_saddr) &&
> >                     !ipv6_addr_equal(&sk->sk_v6_rcv_saddr, loc_addr)))
> > --
> > 2.53.0.1118.gaef5881109-goog
> >

Thanks,
Jordan


Thread overview: 12+ messages
2026-03-30 21:57 [PATCH net-next v3 0/4] udp: Preserve UDP socket addresses on abort Jordan Rife
2026-03-30 21:57 ` [PATCH net-next v3 1/4] udp: Only compare daddr/dport when sk_state == TCP_ESTABLISHED Jordan Rife
2026-03-31  1:21   ` Kuniyuki Iwashima
2026-04-01 20:50     ` Jordan Rife
2026-03-30 21:57 ` [PATCH net-next v3 2/4] udp: Remove disconnected sockets from the 4-tuple hash Jordan Rife
2026-03-31 16:51   ` kernel test robot
2026-03-31 17:33   ` kernel test robot
2026-03-31 17:42   ` kernel test robot
2026-03-31 17:55   ` kernel test robot
2026-03-31 18:49   ` kernel test robot
2026-03-30 21:57 ` [PATCH net-next v3 3/4] udp: Preserve destination address info after abort Jordan Rife
2026-03-30 21:57 ` [PATCH net-next v3 4/4] selftests/bpf: Ensure dst addr/port are preserved after socket abort Jordan Rife
