* [PATCH net-next 0/2] inet: better inet_sock_set_state() for passive flows
@ 2025-02-12 13:13 Eric Dumazet
  2025-02-12 13:13 ` [PATCH net-next 1/2] inet: reduce inet_csk_clone_lock() indent level Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Eric Dumazet @ 2025-02-12 13:13 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Simon Horman, Neal Cardwell, Kuniyuki Iwashima,
	eric.dumazet, Eric Dumazet

Small series to make the inet_sock_set_state() trace more informative for
LISTEN -> TCP_SYN_RECV changes: the 4-tuple parts are now correct.

First patch is a cleanup.

Eric Dumazet (2):
  inet: reduce inet_csk_clone_lock() indent level
  inet: consolidate inet_csk_clone_lock()

 net/dccp/ipv4.c                 |  3 --
 net/dccp/ipv6.c                 |  9 ++---
 net/ipv4/inet_connection_sock.c | 66 +++++++++++++++++++++------------
 net/ipv4/tcp_ipv4.c             |  4 --
 net/ipv6/tcp_ipv6.c             |  8 +---
 5 files changed, 48 insertions(+), 42 deletions(-)

-- 
2.48.1.502.g6dc24dfdaf-goog



* [PATCH net-next 1/2] inet: reduce inet_csk_clone_lock() indent level
  2025-02-12 13:13 [PATCH net-next 0/2] inet: better inet_sock_set_state() for passive flows Eric Dumazet
@ 2025-02-12 13:13 ` Eric Dumazet
  2025-02-13  2:54   ` Kuniyuki Iwashima
  2025-02-12 13:13 ` [PATCH net-next 2/2] inet: consolidate inet_csk_clone_lock() Eric Dumazet
  2025-02-14 21:50 ` [PATCH net-next 0/2] inet: better inet_sock_set_state() for passive flows patchwork-bot+netdevbpf
  2 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2025-02-12 13:13 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Simon Horman, Neal Cardwell, Kuniyuki Iwashima,
	eric.dumazet, Eric Dumazet

Return early from inet_csk_clone_lock() if the socket
allocation failed, to reduce the indentation level.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/inet_connection_sock.c | 50 ++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 23 deletions(-)

diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 2b7775b90a0907727fa3e4d04cfa77f6e76e82b0..1c00069552ccfbf8c0d0d91d14cf951a39711273 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -1237,39 +1237,43 @@ struct sock *inet_csk_clone_lock(const struct sock *sk,
 				 const gfp_t priority)
 {
 	struct sock *newsk = sk_clone_lock(sk, priority);
+	struct inet_connection_sock *newicsk;
 
-	if (newsk) {
-		struct inet_connection_sock *newicsk = inet_csk(newsk);
+	if (!newsk)
+		return NULL;
 
-		inet_sk_set_state(newsk, TCP_SYN_RECV);
-		newicsk->icsk_bind_hash = NULL;
-		newicsk->icsk_bind2_hash = NULL;
+	newicsk = inet_csk(newsk);
 
-		inet_sk(newsk)->inet_dport = inet_rsk(req)->ir_rmt_port;
-		inet_sk(newsk)->inet_num = inet_rsk(req)->ir_num;
-		inet_sk(newsk)->inet_sport = htons(inet_rsk(req)->ir_num);
+	inet_sk_set_state(newsk, TCP_SYN_RECV);
+	newicsk->icsk_bind_hash = NULL;
+	newicsk->icsk_bind2_hash = NULL;
 
-		/* listeners have SOCK_RCU_FREE, not the children */
-		sock_reset_flag(newsk, SOCK_RCU_FREE);
+	inet_sk(newsk)->inet_dport = inet_rsk(req)->ir_rmt_port;
+	inet_sk(newsk)->inet_num = inet_rsk(req)->ir_num;
+	inet_sk(newsk)->inet_sport = htons(inet_rsk(req)->ir_num);
 
-		inet_sk(newsk)->mc_list = NULL;
+	/* listeners have SOCK_RCU_FREE, not the children */
+	sock_reset_flag(newsk, SOCK_RCU_FREE);
 
-		newsk->sk_mark = inet_rsk(req)->ir_mark;
-		atomic64_set(&newsk->sk_cookie,
-			     atomic64_read(&inet_rsk(req)->ir_cookie));
+	inet_sk(newsk)->mc_list = NULL;
 
-		newicsk->icsk_retransmits = 0;
-		newicsk->icsk_backoff	  = 0;
-		newicsk->icsk_probes_out  = 0;
-		newicsk->icsk_probes_tstamp = 0;
+	newsk->sk_mark = inet_rsk(req)->ir_mark;
+	atomic64_set(&newsk->sk_cookie,
+		     atomic64_read(&inet_rsk(req)->ir_cookie));
 
-		/* Deinitialize accept_queue to trap illegal accesses. */
-		memset(&newicsk->icsk_accept_queue, 0, sizeof(newicsk->icsk_accept_queue));
+	newicsk->icsk_retransmits = 0;
+	newicsk->icsk_backoff	  = 0;
+	newicsk->icsk_probes_out  = 0;
+	newicsk->icsk_probes_tstamp = 0;
 
-		inet_clone_ulp(req, newsk, priority);
+	/* Deinitialize accept_queue to trap illegal accesses. */
+	memset(&newicsk->icsk_accept_queue, 0,
+	       sizeof(newicsk->icsk_accept_queue));
+
+	inet_clone_ulp(req, newsk, priority);
+
+	security_inet_csk_clone(newsk, req);
 
-		security_inet_csk_clone(newsk, req);
-	}
 	return newsk;
 }
 EXPORT_SYMBOL_GPL(inet_csk_clone_lock);
-- 
2.48.1.502.g6dc24dfdaf-goog



* [PATCH net-next 2/2] inet: consolidate inet_csk_clone_lock()
  2025-02-12 13:13 [PATCH net-next 0/2] inet: better inet_sock_set_state() for passive flows Eric Dumazet
  2025-02-12 13:13 ` [PATCH net-next 1/2] inet: reduce inet_csk_clone_lock() indent level Eric Dumazet
@ 2025-02-12 13:13 ` Eric Dumazet
  2025-02-13  3:26   ` Kuniyuki Iwashima
  2025-02-14 21:50 ` [PATCH net-next 0/2] inet: better inet_sock_set_state() for passive flows patchwork-bot+netdevbpf
  2 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2025-02-12 13:13 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Simon Horman, Neal Cardwell, Kuniyuki Iwashima,
	eric.dumazet, Eric Dumazet

The current inet_sock_set_state trace from inet_csk_clone_lock() is
missing many details:

... sock:inet_sock_set_state: family=AF_INET6 protocol=IPPROTO_TCP \
    sport=4901 dport=0 \
    saddr=127.0.0.6 daddr=0.0.0.0 \
    saddrv6=:: daddrv6=:: \
    oldstate=TCP_LISTEN newstate=TCP_SYN_RECV

Only the sport gives the listener port; no other parts of the n-tuple
are correct.

In this patch, I initialize the relevant fields of the new socket before
calling inet_sk_set_state(newsk, TCP_SYN_RECV).

We now have a trace including all the source/destination bits:

... sock:inet_sock_set_state: family=AF_INET6 protocol=IPPROTO_TCP \
    sport=4901 dport=47648 \
    saddr=127.0.0.6 daddr=127.0.0.6 \
    saddrv6=2002:a05:6830:1f85:: daddrv6=2001:4860:f803:65::3 \
    oldstate=TCP_LISTEN newstate=TCP_SYN_RECV

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/dccp/ipv4.c                 |  3 ---
 net/dccp/ipv6.c                 |  9 +++------
 net/ipv4/inet_connection_sock.c | 24 ++++++++++++++++++++----
 net/ipv4/tcp_ipv4.c             |  4 ----
 net/ipv6/tcp_ipv6.c             |  8 ++------
 5 files changed, 25 insertions(+), 23 deletions(-)

diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index be515ba821e2d4d6a7bca973b5e7c2363a2f13cc..bfa529a54acab6abd279c9e4a600e699e8904d8a 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -426,9 +426,6 @@ struct sock *dccp_v4_request_recv_sock(const struct sock *sk,
 
 	newinet		   = inet_sk(newsk);
 	ireq		   = inet_rsk(req);
-	sk_daddr_set(newsk, ireq->ir_rmt_addr);
-	sk_rcv_saddr_set(newsk, ireq->ir_loc_addr);
-	newinet->inet_saddr	= ireq->ir_loc_addr;
 	RCU_INIT_POINTER(newinet->inet_opt, rcu_dereference(ireq->ireq_opt));
 	newinet->mc_index  = inet_iif(skb);
 	newinet->mc_ttl	   = ip_hdr(skb)->ttl;
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index d6649246188d72b3df6c74750779b7aa5910dcb7..39ae9d89d7d43fc8730dd2ec20d6e1cf72d20bf3 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -365,6 +365,9 @@ static int dccp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
 	ireq = inet_rsk(req);
 	ireq->ir_v6_rmt_addr = ipv6_hdr(skb)->saddr;
 	ireq->ir_v6_loc_addr = ipv6_hdr(skb)->daddr;
+	ireq->ir_rmt_addr = LOOPBACK4_IPV6;
+	ireq->ir_loc_addr = LOOPBACK4_IPV6;
+
 	ireq->ireq_family = AF_INET6;
 	ireq->ir_mark = inet_request_mark(sk, skb);
 
@@ -504,10 +507,7 @@ static struct sock *dccp_v6_request_recv_sock(const struct sock *sk,
 
 	memcpy(newnp, np, sizeof(struct ipv6_pinfo));
 
-	newsk->sk_v6_daddr	= ireq->ir_v6_rmt_addr;
 	newnp->saddr		= ireq->ir_v6_loc_addr;
-	newsk->sk_v6_rcv_saddr	= ireq->ir_v6_loc_addr;
-	newsk->sk_bound_dev_if	= ireq->ir_iif;
 
 	/* Now IPv6 options...
 
@@ -546,9 +546,6 @@ static struct sock *dccp_v6_request_recv_sock(const struct sock *sk,
 
 	dccp_sync_mss(newsk, dst_mtu(dst));
 
-	newinet->inet_daddr = newinet->inet_saddr = LOOPBACK4_IPV6;
-	newinet->inet_rcv_saddr = LOOPBACK4_IPV6;
-
 	if (__inet_inherit_port(sk, newsk) < 0) {
 		inet_csk_prepare_forced_close(newsk);
 		dccp_done(newsk);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 1c00069552ccfbf8c0d0d91d14cf951a39711273..bf9ce0c196575910b4b03fca13001979d4326297 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -1238,19 +1238,33 @@ struct sock *inet_csk_clone_lock(const struct sock *sk,
 {
 	struct sock *newsk = sk_clone_lock(sk, priority);
 	struct inet_connection_sock *newicsk;
+	struct inet_request_sock *ireq;
+	struct inet_sock *newinet;
 
 	if (!newsk)
 		return NULL;
 
 	newicsk = inet_csk(newsk);
+	newinet = inet_sk(newsk);
+	ireq = inet_rsk(req);
 
-	inet_sk_set_state(newsk, TCP_SYN_RECV);
 	newicsk->icsk_bind_hash = NULL;
 	newicsk->icsk_bind2_hash = NULL;
 
-	inet_sk(newsk)->inet_dport = inet_rsk(req)->ir_rmt_port;
-	inet_sk(newsk)->inet_num = inet_rsk(req)->ir_num;
-	inet_sk(newsk)->inet_sport = htons(inet_rsk(req)->ir_num);
+	newinet->inet_dport = ireq->ir_rmt_port;
+	newinet->inet_num = ireq->ir_num;
+	newinet->inet_sport = htons(ireq->ir_num);
+
+	newsk->sk_bound_dev_if = ireq->ir_iif;
+
+	newsk->sk_daddr = ireq->ir_rmt_addr;
+	newsk->sk_rcv_saddr = ireq->ir_loc_addr;
+	newinet->inet_saddr = ireq->ir_loc_addr;
+
+#if IS_ENABLED(CONFIG_IPV6)
+	newsk->sk_v6_daddr = ireq->ir_v6_rmt_addr;
+	newsk->sk_v6_rcv_saddr = ireq->ir_v6_loc_addr;
+#endif
 
 	/* listeners have SOCK_RCU_FREE, not the children */
 	sock_reset_flag(newsk, SOCK_RCU_FREE);
@@ -1270,6 +1284,8 @@ struct sock *inet_csk_clone_lock(const struct sock *sk,
 	memset(&newicsk->icsk_accept_queue, 0,
 	       sizeof(newicsk->icsk_accept_queue));
 
+	inet_sk_set_state(newsk, TCP_SYN_RECV);
+
 	inet_clone_ulp(req, newsk, priority);
 
 	security_inet_csk_clone(newsk, req);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index d1fd2128ac6cce9b845b1f8d278a194c511db87b..56949eb289ce330448b771f91d5c3130d6b2ac96 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1769,10 +1769,6 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
 	newtp		      = tcp_sk(newsk);
 	newinet		      = inet_sk(newsk);
 	ireq		      = inet_rsk(req);
-	sk_daddr_set(newsk, ireq->ir_rmt_addr);
-	sk_rcv_saddr_set(newsk, ireq->ir_loc_addr);
-	newsk->sk_bound_dev_if = ireq->ir_iif;
-	newinet->inet_saddr   = ireq->ir_loc_addr;
 	inet_opt	      = rcu_dereference(ireq->ireq_opt);
 	RCU_INIT_POINTER(newinet->inet_opt, inet_opt);
 	newinet->mc_index     = inet_iif(skb);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 2debdf085a3b4d2452b2b316cb5368507b17efc8..a806082602985fd351c5184f52dc3011c71540a9 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -798,6 +798,8 @@ static void tcp_v6_init_req(struct request_sock *req,
 
 	ireq->ir_v6_rmt_addr = ipv6_hdr(skb)->saddr;
 	ireq->ir_v6_loc_addr = ipv6_hdr(skb)->daddr;
+	ireq->ir_rmt_addr = LOOPBACK4_IPV6;
+	ireq->ir_loc_addr = LOOPBACK4_IPV6;
 
 	/* So that link locals have meaning */
 	if ((!sk_listener->sk_bound_dev_if || l3_slave) &&
@@ -1451,10 +1453,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
 
 	ip6_dst_store(newsk, dst, NULL, NULL);
 
-	newsk->sk_v6_daddr = ireq->ir_v6_rmt_addr;
 	newnp->saddr = ireq->ir_v6_loc_addr;
-	newsk->sk_v6_rcv_saddr = ireq->ir_v6_loc_addr;
-	newsk->sk_bound_dev_if = ireq->ir_iif;
 
 	/* Now IPv6 options...
 
@@ -1507,9 +1506,6 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
 
 	tcp_initialize_rcv_mss(newsk);
 
-	newinet->inet_daddr = newinet->inet_saddr = LOOPBACK4_IPV6;
-	newinet->inet_rcv_saddr = LOOPBACK4_IPV6;
-
 #ifdef CONFIG_TCP_MD5SIG
 	l3index = l3mdev_master_ifindex_by_index(sock_net(sk), ireq->ir_iif);
 
-- 
2.48.1.502.g6dc24dfdaf-goog
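
The ordering change in patch 2/2 can be illustrated outside the kernel.
What follows is a minimal userspace sketch, not the kernel code: struct
req_model, struct sock_model, trace_state(), fill_from_req() and
clone_child() are hypothetical names invented for this illustration, and
only the IPv4 half of the tuple is modelled. It assumes the child starts
as a copy of the listener and that the event records whatever the socket
holds at the instant it fires.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical stand-in for inet_request_sock: only the 4-tuple. */
struct req_model {
	uint32_t rmt_addr;	/* peer address, network byte order */
	uint32_t loc_addr;	/* local address, network byte order */
	uint16_t rmt_port;	/* peer port, network byte order */
	uint16_t num;		/* local port, host byte order */
};

/* Hypothetical stand-in for the socket fields the trace prints. */
struct sock_model {
	uint32_t daddr;		/* network byte order */
	uint32_t saddr;		/* network byte order */
	uint16_t dport;		/* network byte order */
	uint16_t sport;		/* network byte order */
};

/* Format the 4-tuple as it looks at the instant the event fires. */
void trace_state(const struct sock_model *sk, char *buf, size_t len)
{
	char s[INET_ADDRSTRLEN], d[INET_ADDRSTRLEN];
	struct in_addr a;

	a.s_addr = sk->saddr;
	inet_ntop(AF_INET, &a, s, sizeof(s));
	a.s_addr = sk->daddr;
	inet_ntop(AF_INET, &a, d, sizeof(d));
	snprintf(buf, len, "sport=%u dport=%u saddr=%s daddr=%s",
		 (unsigned)ntohs(sk->sport), (unsigned)ntohs(sk->dport), s, d);
}

/* Copy the 4-tuple from the request into the child. */
void fill_from_req(struct sock_model *child, const struct req_model *req)
{
	child->dport = req->rmt_port;
	child->sport = htons(req->num);
	child->daddr = req->rmt_addr;
	child->saddr = req->loc_addr;
}

/*
 * Clone the listener and record the state-change event.
 * fill_first == 0 models the old ordering (event, then fields);
 * fill_first != 0 models the ordering after this patch.
 */
struct sock_model clone_child(const struct sock_model *listener,
			      const struct req_model *req,
			      int fill_first, char *buf, size_t len)
{
	struct sock_model child = *listener;	/* clone starts as a copy */

	if (fill_first)
		fill_from_req(&child, req);
	trace_state(&child, buf, len);
	if (!fill_first)
		fill_from_req(&child, req);
	return child;
}
```

With a listener on port 4901 whose saddr is 127.0.0.6 (0x7f000006) and a
request from port 47648, the old ordering records dport=0 and
daddr=0.0.0.0, while the new ordering records dport=47648 and
daddr=127.0.0.6. The child ends up with identical fields either way;
only what the event observes differs.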



* Re: [PATCH net-next 1/2] inet: reduce inet_csk_clone_lock() indent level
  2025-02-12 13:13 ` [PATCH net-next 1/2] inet: reduce inet_csk_clone_lock() indent level Eric Dumazet
@ 2025-02-13  2:54   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 6+ messages in thread
From: Kuniyuki Iwashima @ 2025-02-13  2:54 UTC (permalink / raw)
  To: edumazet
  Cc: davem, eric.dumazet, horms, kuba, kuniyu, ncardwell, netdev,
	pabeni

From: Eric Dumazet <edumazet@google.com>
Date: Wed, 12 Feb 2025 13:13:27 +0000
> Return early from inet_csk_clone_lock() if the socket
> allocation failed, to reduce the indentation level.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>


* Re: [PATCH net-next 2/2] inet: consolidate inet_csk_clone_lock()
  2025-02-12 13:13 ` [PATCH net-next 2/2] inet: consolidate inet_csk_clone_lock() Eric Dumazet
@ 2025-02-13  3:26   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 6+ messages in thread
From: Kuniyuki Iwashima @ 2025-02-13  3:26 UTC (permalink / raw)
  To: edumazet
  Cc: davem, eric.dumazet, horms, kuba, kuniyu, ncardwell, netdev,
	pabeni

From: Eric Dumazet <edumazet@google.com>
Date: Wed, 12 Feb 2025 13:13:28 +0000
> The current inet_sock_set_state trace from inet_csk_clone_lock() is
> missing many details:
> 
> ... sock:inet_sock_set_state: family=AF_INET6 protocol=IPPROTO_TCP \
>     sport=4901 dport=0 \
>     saddr=127.0.0.6 daddr=0.0.0.0 \
>     saddrv6=:: daddrv6=:: \
>     oldstate=TCP_LISTEN newstate=TCP_SYN_RECV
> 
> Only the sport gives the listener port; no other parts of the n-tuple
> are correct.
> 
> In this patch, I initialize the relevant fields of the new socket before
> calling inet_sk_set_state(newsk, TCP_SYN_RECV).
> 
> We now have a trace including all the source/destination bits:
> 
> ... sock:inet_sock_set_state: family=AF_INET6 protocol=IPPROTO_TCP \
>     sport=4901 dport=47648 \
>     saddr=127.0.0.6 daddr=127.0.0.6 \
>     saddrv6=2002:a05:6830:1f85:: daddrv6=2001:4860:f803:65::3 \
>     oldstate=TCP_LISTEN newstate=TCP_SYN_RECV
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>


* Re: [PATCH net-next 0/2] inet: better inet_sock_set_state() for passive flows
  2025-02-12 13:13 [PATCH net-next 0/2] inet: better inet_sock_set_state() for passive flows Eric Dumazet
  2025-02-12 13:13 ` [PATCH net-next 1/2] inet: reduce inet_csk_clone_lock() indent level Eric Dumazet
  2025-02-12 13:13 ` [PATCH net-next 2/2] inet: consolidate inet_csk_clone_lock() Eric Dumazet
@ 2025-02-14 21:50 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-02-14 21:50 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: davem, kuba, pabeni, netdev, horms, ncardwell, kuniyu,
	eric.dumazet

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 12 Feb 2025 13:13:26 +0000 you wrote:
> Small series to make the inet_sock_set_state() trace more informative for
> LISTEN -> TCP_SYN_RECV changes: the 4-tuple parts are now correct.
> 
> First patch is a cleanup.
> 
> Eric Dumazet (2):
>   inet: reduce inet_csk_clone_lock() indent level
>   inet: consolidate inet_csk_clone_lock()
> 
> [...]

Here is the summary with links:
  - [net-next,1/2] inet: reduce inet_csk_clone_lock() indent level
    https://git.kernel.org/netdev/net-next/c/55250b83b02a
  - [net-next,2/2] inet: consolidate inet_csk_clone_lock()
    https://git.kernel.org/netdev/net-next/c/a3a128f611a9

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



