* [PATCH net-next 0/2] inet: better inet_sock_set_state() for passive flows
From: Eric Dumazet @ 2025-02-12 13:13 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: netdev, Simon Horman, Neal Cardwell, Kuniyuki Iwashima,
eric.dumazet, Eric Dumazet
Small series to make the inet_sock_set_state() trace more useful for
TCP_LISTEN -> TCP_SYN_RECV transitions: the 4-tuple is now reported correctly.
The first patch is a cleanup.
Eric Dumazet (2):
inet: reduce inet_csk_clone_lock() indent level
inet: consolidate inet_csk_clone_lock()
net/dccp/ipv4.c | 3 --
net/dccp/ipv6.c | 9 ++---
net/ipv4/inet_connection_sock.c | 66 +++++++++++++++++++++------------
net/ipv4/tcp_ipv4.c | 4 --
net/ipv6/tcp_ipv6.c | 8 +---
5 files changed, 48 insertions(+), 42 deletions(-)
--
2.48.1.502.g6dc24dfdaf-goog
* [PATCH net-next 1/2] inet: reduce inet_csk_clone_lock() indent level
From: Eric Dumazet @ 2025-02-12 13:13 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: netdev, Simon Horman, Neal Cardwell, Kuniyuki Iwashima,
eric.dumazet, Eric Dumazet
Return early from inet_csk_clone_lock() if the socket
allocation failed, to reduce the indentation level.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/inet_connection_sock.c | 50 ++++++++++++++++++---------------
1 file changed, 27 insertions(+), 23 deletions(-)
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 2b7775b90a0907727fa3e4d04cfa77f6e76e82b0..1c00069552ccfbf8c0d0d91d14cf951a39711273 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -1237,39 +1237,43 @@ struct sock *inet_csk_clone_lock(const struct sock *sk,
const gfp_t priority)
{
struct sock *newsk = sk_clone_lock(sk, priority);
+ struct inet_connection_sock *newicsk;
- if (newsk) {
- struct inet_connection_sock *newicsk = inet_csk(newsk);
+ if (!newsk)
+ return NULL;
- inet_sk_set_state(newsk, TCP_SYN_RECV);
- newicsk->icsk_bind_hash = NULL;
- newicsk->icsk_bind2_hash = NULL;
+ newicsk = inet_csk(newsk);
- inet_sk(newsk)->inet_dport = inet_rsk(req)->ir_rmt_port;
- inet_sk(newsk)->inet_num = inet_rsk(req)->ir_num;
- inet_sk(newsk)->inet_sport = htons(inet_rsk(req)->ir_num);
+ inet_sk_set_state(newsk, TCP_SYN_RECV);
+ newicsk->icsk_bind_hash = NULL;
+ newicsk->icsk_bind2_hash = NULL;
- /* listeners have SOCK_RCU_FREE, not the children */
- sock_reset_flag(newsk, SOCK_RCU_FREE);
+ inet_sk(newsk)->inet_dport = inet_rsk(req)->ir_rmt_port;
+ inet_sk(newsk)->inet_num = inet_rsk(req)->ir_num;
+ inet_sk(newsk)->inet_sport = htons(inet_rsk(req)->ir_num);
- inet_sk(newsk)->mc_list = NULL;
+ /* listeners have SOCK_RCU_FREE, not the children */
+ sock_reset_flag(newsk, SOCK_RCU_FREE);
- newsk->sk_mark = inet_rsk(req)->ir_mark;
- atomic64_set(&newsk->sk_cookie,
- atomic64_read(&inet_rsk(req)->ir_cookie));
+ inet_sk(newsk)->mc_list = NULL;
- newicsk->icsk_retransmits = 0;
- newicsk->icsk_backoff = 0;
- newicsk->icsk_probes_out = 0;
- newicsk->icsk_probes_tstamp = 0;
+ newsk->sk_mark = inet_rsk(req)->ir_mark;
+ atomic64_set(&newsk->sk_cookie,
+ atomic64_read(&inet_rsk(req)->ir_cookie));
- /* Deinitialize accept_queue to trap illegal accesses. */
- memset(&newicsk->icsk_accept_queue, 0, sizeof(newicsk->icsk_accept_queue));
+ newicsk->icsk_retransmits = 0;
+ newicsk->icsk_backoff = 0;
+ newicsk->icsk_probes_out = 0;
+ newicsk->icsk_probes_tstamp = 0;
- inet_clone_ulp(req, newsk, priority);
+ /* Deinitialize accept_queue to trap illegal accesses. */
+ memset(&newicsk->icsk_accept_queue, 0,
+ sizeof(newicsk->icsk_accept_queue));
+
+ inet_clone_ulp(req, newsk, priority);
+
+ security_inet_csk_clone(newsk, req);
- security_inet_csk_clone(newsk, req);
- }
return newsk;
}
EXPORT_SYMBOL_GPL(inet_csk_clone_lock);
--
2.48.1.502.g6dc24dfdaf-goog
* [PATCH net-next 2/2] inet: consolidate inet_csk_clone_lock()
From: Eric Dumazet @ 2025-02-12 13:13 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: netdev, Simon Horman, Neal Cardwell, Kuniyuki Iwashima,
eric.dumazet, Eric Dumazet
The current inet_sock_set_state trace emitted from inet_csk_clone_lock()
is missing many details:
... sock:inet_sock_set_state: family=AF_INET6 protocol=IPPROTO_TCP \
sport=4901 dport=0 \
saddr=127.0.0.6 daddr=0.0.0.0 \
saddrv6=:: daddrv6=:: \
oldstate=TCP_LISTEN newstate=TCP_SYN_RECV
Only sport, which gives the listener port, is correct; no other part of
the n-tuple is.
In this patch, I initialize the relevant fields of the new socket before
calling inet_sk_set_state(newsk, TCP_SYN_RECV).
The trace now includes all the source and destination fields:
... sock:inet_sock_set_state: family=AF_INET6 protocol=IPPROTO_TCP \
sport=4901 dport=47648 \
saddr=127.0.0.6 daddr=127.0.0.6 \
saddrv6=2002:a05:6830:1f85:: daddrv6=2001:4860:f803:65::3 \
oldstate=TCP_LISTEN newstate=TCP_SYN_RECV
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/dccp/ipv4.c | 3 ---
net/dccp/ipv6.c | 9 +++------
net/ipv4/inet_connection_sock.c | 24 ++++++++++++++++++++----
net/ipv4/tcp_ipv4.c | 4 ----
net/ipv6/tcp_ipv6.c | 8 ++------
5 files changed, 25 insertions(+), 23 deletions(-)
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index be515ba821e2d4d6a7bca973b5e7c2363a2f13cc..bfa529a54acab6abd279c9e4a600e699e8904d8a 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -426,9 +426,6 @@ struct sock *dccp_v4_request_recv_sock(const struct sock *sk,
newinet = inet_sk(newsk);
ireq = inet_rsk(req);
- sk_daddr_set(newsk, ireq->ir_rmt_addr);
- sk_rcv_saddr_set(newsk, ireq->ir_loc_addr);
- newinet->inet_saddr = ireq->ir_loc_addr;
RCU_INIT_POINTER(newinet->inet_opt, rcu_dereference(ireq->ireq_opt));
newinet->mc_index = inet_iif(skb);
newinet->mc_ttl = ip_hdr(skb)->ttl;
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index d6649246188d72b3df6c74750779b7aa5910dcb7..39ae9d89d7d43fc8730dd2ec20d6e1cf72d20bf3 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -365,6 +365,9 @@ static int dccp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
ireq = inet_rsk(req);
ireq->ir_v6_rmt_addr = ipv6_hdr(skb)->saddr;
ireq->ir_v6_loc_addr = ipv6_hdr(skb)->daddr;
+ ireq->ir_rmt_addr = LOOPBACK4_IPV6;
+ ireq->ir_loc_addr = LOOPBACK4_IPV6;
+
ireq->ireq_family = AF_INET6;
ireq->ir_mark = inet_request_mark(sk, skb);
@@ -504,10 +507,7 @@ static struct sock *dccp_v6_request_recv_sock(const struct sock *sk,
memcpy(newnp, np, sizeof(struct ipv6_pinfo));
- newsk->sk_v6_daddr = ireq->ir_v6_rmt_addr;
newnp->saddr = ireq->ir_v6_loc_addr;
- newsk->sk_v6_rcv_saddr = ireq->ir_v6_loc_addr;
- newsk->sk_bound_dev_if = ireq->ir_iif;
/* Now IPv6 options...
@@ -546,9 +546,6 @@ static struct sock *dccp_v6_request_recv_sock(const struct sock *sk,
dccp_sync_mss(newsk, dst_mtu(dst));
- newinet->inet_daddr = newinet->inet_saddr = LOOPBACK4_IPV6;
- newinet->inet_rcv_saddr = LOOPBACK4_IPV6;
-
if (__inet_inherit_port(sk, newsk) < 0) {
inet_csk_prepare_forced_close(newsk);
dccp_done(newsk);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 1c00069552ccfbf8c0d0d91d14cf951a39711273..bf9ce0c196575910b4b03fca13001979d4326297 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -1238,19 +1238,33 @@ struct sock *inet_csk_clone_lock(const struct sock *sk,
{
struct sock *newsk = sk_clone_lock(sk, priority);
struct inet_connection_sock *newicsk;
+ struct inet_request_sock *ireq;
+ struct inet_sock *newinet;
if (!newsk)
return NULL;
newicsk = inet_csk(newsk);
+ newinet = inet_sk(newsk);
+ ireq = inet_rsk(req);
- inet_sk_set_state(newsk, TCP_SYN_RECV);
newicsk->icsk_bind_hash = NULL;
newicsk->icsk_bind2_hash = NULL;
- inet_sk(newsk)->inet_dport = inet_rsk(req)->ir_rmt_port;
- inet_sk(newsk)->inet_num = inet_rsk(req)->ir_num;
- inet_sk(newsk)->inet_sport = htons(inet_rsk(req)->ir_num);
+ newinet->inet_dport = ireq->ir_rmt_port;
+ newinet->inet_num = ireq->ir_num;
+ newinet->inet_sport = htons(ireq->ir_num);
+
+ newsk->sk_bound_dev_if = ireq->ir_iif;
+
+ newsk->sk_daddr = ireq->ir_rmt_addr;
+ newsk->sk_rcv_saddr = ireq->ir_loc_addr;
+ newinet->inet_saddr = ireq->ir_loc_addr;
+
+#if IS_ENABLED(CONFIG_IPV6)
+ newsk->sk_v6_daddr = ireq->ir_v6_rmt_addr;
+ newsk->sk_v6_rcv_saddr = ireq->ir_v6_loc_addr;
+#endif
/* listeners have SOCK_RCU_FREE, not the children */
sock_reset_flag(newsk, SOCK_RCU_FREE);
@@ -1270,6 +1284,8 @@ struct sock *inet_csk_clone_lock(const struct sock *sk,
memset(&newicsk->icsk_accept_queue, 0,
sizeof(newicsk->icsk_accept_queue));
+ inet_sk_set_state(newsk, TCP_SYN_RECV);
+
inet_clone_ulp(req, newsk, priority);
security_inet_csk_clone(newsk, req);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index d1fd2128ac6cce9b845b1f8d278a194c511db87b..56949eb289ce330448b771f91d5c3130d6b2ac96 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1769,10 +1769,6 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
newtp = tcp_sk(newsk);
newinet = inet_sk(newsk);
ireq = inet_rsk(req);
- sk_daddr_set(newsk, ireq->ir_rmt_addr);
- sk_rcv_saddr_set(newsk, ireq->ir_loc_addr);
- newsk->sk_bound_dev_if = ireq->ir_iif;
- newinet->inet_saddr = ireq->ir_loc_addr;
inet_opt = rcu_dereference(ireq->ireq_opt);
RCU_INIT_POINTER(newinet->inet_opt, inet_opt);
newinet->mc_index = inet_iif(skb);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 2debdf085a3b4d2452b2b316cb5368507b17efc8..a806082602985fd351c5184f52dc3011c71540a9 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -798,6 +798,8 @@ static void tcp_v6_init_req(struct request_sock *req,
ireq->ir_v6_rmt_addr = ipv6_hdr(skb)->saddr;
ireq->ir_v6_loc_addr = ipv6_hdr(skb)->daddr;
+ ireq->ir_rmt_addr = LOOPBACK4_IPV6;
+ ireq->ir_loc_addr = LOOPBACK4_IPV6;
/* So that link locals have meaning */
if ((!sk_listener->sk_bound_dev_if || l3_slave) &&
@@ -1451,10 +1453,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
ip6_dst_store(newsk, dst, NULL, NULL);
- newsk->sk_v6_daddr = ireq->ir_v6_rmt_addr;
newnp->saddr = ireq->ir_v6_loc_addr;
- newsk->sk_v6_rcv_saddr = ireq->ir_v6_loc_addr;
- newsk->sk_bound_dev_if = ireq->ir_iif;
/* Now IPv6 options...
@@ -1507,9 +1506,6 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
tcp_initialize_rcv_mss(newsk);
- newinet->inet_daddr = newinet->inet_saddr = LOOPBACK4_IPV6;
- newinet->inet_rcv_saddr = LOOPBACK4_IPV6;
-
#ifdef CONFIG_TCP_MD5SIG
l3index = l3mdev_master_ifindex_by_index(sock_net(sk), ireq->ir_iif);
--
2.48.1.502.g6dc24dfdaf-goog
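The ordering change in patch 2/2 can be illustrated with a small userspace
model. This is only a sketch under stated assumptions: every name prefixed
with model_ is hypothetical and merely stands in for the kernel structures,
and the "tracepoint" is reduced to a snapshot taken at the moment the state
changes. It shows why a state change that fires before the tuple is copied
records dport=0 and daddr=0, while the reordered version records the full
tuple.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_req {               /* plays the role of inet_request_sock */
	uint16_t rmt_port;
	uint32_t rmt_addr;
	uint32_t loc_addr;
};

struct model_sock {              /* plays the role of struct sock */
	int      state;
	uint16_t sport, dport;
	uint32_t saddr, daddr;
};

struct model_trace {             /* what the modelled tracepoint recorded */
	int      oldstate, newstate;
	uint16_t sport, dport;
	uint32_t saddr, daddr;
};

enum { MODEL_SYN_RECV = 1, MODEL_LISTEN = 2 };  /* arbitrary values */

/* Change state and snapshot the socket as it looks at this instant. */
static struct model_trace model_set_state(struct model_sock *sk, int newstate)
{
	struct model_trace t = {
		.oldstate = sk->state, .newstate = newstate,
		.sport = sk->sport, .dport = sk->dport,
		.saddr = sk->saddr, .daddr = sk->daddr,
	};

	sk->state = newstate;
	return t;
}

/* Copy the connection identity from the request into the child. */
static void model_copy_tuple(struct model_sock *child,
			     const struct model_req *req)
{
	assert(child && req);
	child->dport = req->rmt_port;
	child->daddr = req->rmt_addr;
	child->saddr = req->loc_addr;
}

/* Old ordering: the state change fires before the tuple is filled in. */
static struct model_trace
model_clone_state_first(const struct model_sock *listener,
			const struct model_req *req)
{
	struct model_sock child = *listener;
	struct model_trace t = model_set_state(&child, MODEL_SYN_RECV);

	model_copy_tuple(&child, req);
	return t;
}

/* New ordering: fill in the tuple first, then change state. */
static struct model_trace
model_clone_tuple_first(const struct model_sock *listener,
			const struct model_req *req)
{
	struct model_sock child = *listener;

	model_copy_tuple(&child, req);
	return model_set_state(&child, MODEL_SYN_RECV);
}
```

Calling both helpers with a listener on port 4901 and a request from port
47648 gives a snapshot with dport == 0 for the first and dport == 47648 for
the second, which mirrors the two traces quoted in the commit message.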
* Re: [PATCH net-next 1/2] inet: reduce inet_csk_clone_lock() indent level
From: Kuniyuki Iwashima @ 2025-02-13 2:54 UTC (permalink / raw)
To: edumazet
Cc: davem, eric.dumazet, horms, kuba, kuniyu, ncardwell, netdev,
pabeni
From: Eric Dumazet <edumazet@google.com>
Date: Wed, 12 Feb 2025 13:13:27 +0000
> Return early from inet_csk_clone_lock() if the socket
> allocation failed, to reduce the indentation level.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
* Re: [PATCH net-next 2/2] inet: consolidate inet_csk_clone_lock()
From: Kuniyuki Iwashima @ 2025-02-13 3:26 UTC (permalink / raw)
To: edumazet
Cc: davem, eric.dumazet, horms, kuba, kuniyu, ncardwell, netdev,
pabeni
From: Eric Dumazet <edumazet@google.com>
Date: Wed, 12 Feb 2025 13:13:28 +0000
> The current inet_sock_set_state trace emitted from inet_csk_clone_lock()
> is missing many details:
>
> ... sock:inet_sock_set_state: family=AF_INET6 protocol=IPPROTO_TCP \
> sport=4901 dport=0 \
> saddr=127.0.0.6 daddr=0.0.0.0 \
> saddrv6=:: daddrv6=:: \
> oldstate=TCP_LISTEN newstate=TCP_SYN_RECV
>
> Only sport, which gives the listener port, is correct; no other part of
> the n-tuple is.
>
> In this patch, I initialize the relevant fields of the new socket before
> calling inet_sk_set_state(newsk, TCP_SYN_RECV).
>
> The trace now includes all the source and destination fields:
>
> ... sock:inet_sock_set_state: family=AF_INET6 protocol=IPPROTO_TCP \
> sport=4901 dport=47648 \
> saddr=127.0.0.6 daddr=127.0.0.6 \
> saddrv6=2002:a05:6830:1f85:: daddrv6=2001:4860:f803:65::3 \
> oldstate=TCP_LISTEN newstate=TCP_SYN_RECV
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
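A note for readers of the traces above: the saddr=127.0.0.6 and
daddr=127.0.0.6 values on an AF_INET6 socket are the LOOPBACK4_IPV6
placeholder that the patch now stores in ir_rmt_addr and ir_loc_addr, not a
real IPv4 peer. The sketch below is a userspace illustration, assuming the
kernel constant is cpu_to_be32(0x7f000006) as I recall it from
include/net/ipv6.h; the MODEL_ and model_ names are hypothetical helpers, not
kernel API.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Userspace copy of the placeholder, in network byte order (assumed value). */
#define MODEL_LOOPBACK4_IPV6 htonl(0x7f000006)

/* Format a network-byte-order IPv4 address; returns buf, or NULL on error. */
static const char *model_fmt_ipv4(uint32_t be_addr, char *buf, size_t len)
{
	struct in_addr a = { .s_addr = be_addr };

	assert(buf != NULL);
	return inet_ntop(AF_INET, &a, buf, (socklen_t)len);
}

/* Return 1 if a dotted-quad trace field is the AF_INET6 placeholder. */
static int model_is_v6_placeholder(const char *dotted)
{
	char buf[INET_ADDRSTRLEN];

	return model_fmt_ipv4(MODEL_LOOPBACK4_IPV6, buf, sizeof(buf)) != NULL &&
	       strcmp(buf, dotted) == 0;
}
```

A trace consumer could use a check like model_is_v6_placeholder() to skip the
IPv4 fields and rely on saddrv6/daddrv6 when family=AF_INET6.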
* Re: [PATCH net-next 0/2] inet: better inet_sock_set_state() for passive flows
From: patchwork-bot+netdevbpf @ 2025-02-14 21:50 UTC (permalink / raw)
To: Eric Dumazet
Cc: davem, kuba, pabeni, netdev, horms, ncardwell, kuniyu,
eric.dumazet
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Wed, 12 Feb 2025 13:13:26 +0000 you wrote:
> Small series to make the inet_sock_set_state() trace more useful for
> TCP_LISTEN -> TCP_SYN_RECV transitions: the 4-tuple is now reported correctly.
>
> The first patch is a cleanup.
>
> Eric Dumazet (2):
> inet: reduce inet_csk_clone_lock() indent level
> inet: consolidate inet_csk_clone_lock()
>
> [...]
Here is the summary with links:
- [net-next,1/2] inet: reduce inet_csk_clone_lock() indent level
https://git.kernel.org/netdev/net-next/c/55250b83b02a
- [net-next,2/2] inet: consolidate inet_csk_clone_lock()
https://git.kernel.org/netdev/net-next/c/a3a128f611a9
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
Thread overview: 6+ messages
2025-02-12 13:13 [PATCH net-next 0/2] inet: better inet_sock_set_state() for passive flows Eric Dumazet
2025-02-12 13:13 ` [PATCH net-next 1/2] inet: reduce inet_csk_clone_lock() indent level Eric Dumazet
2025-02-13 2:54 ` Kuniyuki Iwashima
2025-02-12 13:13 ` [PATCH net-next 2/2] inet: consolidate inet_csk_clone_lock() Eric Dumazet
2025-02-13 3:26 ` Kuniyuki Iwashima
2025-02-14 21:50 ` [PATCH net-next 0/2] inet: better inet_sock_set_state() for passive flows patchwork-bot+netdevbpf