* RE: Reply: kernel tcp sockets stuck in FIN_WAIT1 after call tcp_close
@ 2024-11-19 8:38 mengkanglai
2024-11-20 2:38 ` Kuniyuki Iwashima
0 siblings, 1 reply; 4+ messages in thread
From: mengkanglai @ 2024-11-19 8:38 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: davem@davemloft.net, dsahern@kernel.org, edumazet@google.com,
Fengtao (fengtao, Euler), kuba@kernel.org,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
pabeni@redhat.com, Yanan (Euler)
> -----Original Message-----
> From: Kuniyuki Iwashima <kuniyu@amazon.com>
> Sent: November 14, 2024 2:56
> To: mengkanglai <mengkanglai2@huawei.com>
> Cc: davem@davemloft.net; dsahern@kernel.org; edumazet@google.com; Fengtao (fengtao, Euler) <fengtao40@huawei.com>; kuba@kernel.org; linux-kernel@vger.kernel.org; netdev@vger.kernel.org; pabeni@redhat.com; Yanan (Euler) <yanan@huawei.com>; kuniyu@amazon.com
> Subject: Re: kernel tcp sockets stuck in FIN_WAIT1 after call tcp_close
>
> From: mengkanglai <mengkanglai2@huawei.com>
> Date: Wed, 13 Nov 2024 12:40:34 +0000
> > Hello, Eric:
> > Commit 151c9c724d05 ("tcp: properly terminate timers for kernel
> > sockets") introduced a call to inet_csk_clear_xmit_timers_sync() in
> > tcp_close(). Kernel sockets do not hold a netns reference
> > (sk->sk_net_refcnt is not set). When such a TCP socket is closed,
> > __tcp_close() calls tcp_send_fin() to send a FIN packet to the
> > remote server.
>
> Just curious which subsystem the kernel socket is created by.
>
> Recently, CIFS and sunrpc are (being) converted to hold net refcnt.
>
> CIFS: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ef7134c7fc48e1441b398e55a862232868a6f0a7
> sunrpc: https://lore.kernel.org/netdev/20241112135434.803890-1-liujian56@huawei.com/
>
> I remember RDS's listener does not hold refcnt but other client sockets (SMC, RDS, MPTCP, CIFS, sunrpc) do.
>
> I think all TCP kernel sockets should hold netns refcnt except for one created at pernet_operations.init() hook like RDS.
>
> > If this FIN packet is lost due to a network fault, TCP should
> > retransmit it, but the TCP timers have been stopped by
> > inet_csk_clear_xmit_timers_sync(). The socket then stays stuck in
> > FIN_WAIT1 and never goes away. I don't think this is right.
I found this problem while testing NFS. The sunrpc patch at https://lore.kernel.org/netdev/20241112135434.803890-1-liujian56@huawei.com/ will solve it.
I agree that all TCP kernel sockets should hold a netns refcnt.
However, for kernel TCP sockets created by other kernel modules through sock_create_kern() or sk_alloc(kern=1), it means that they must now hold sk_net_refcnt; otherwise the FIN will be sent only once and will not be retransmitted when the socket is released. But other modules that use TCP may not be aware that they must hold sk_net_refcnt. Should we add a check in tcp_close()?
---
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index fb920369c..6b92026a4 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2804,7 +2804,7 @@ void tcp_close(struct sock *sk, long timeout)
 	lock_sock(sk);
 	__tcp_close(sk, timeout);
 	release_sock(sk);
-	if (!sk->sk_net_refcnt)
+	if (!net_eq(sock_net(sk), &init_net) && !sk->sk_net_refcnt)
 		inet_csk_clear_xmit_timers_sync(sk);
 	sock_put(sk);
 }
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: kernel tcp sockets stuck in FIN_WAIT1 after call tcp_close
2024-11-19 8:38 RE: Reply: kernel tcp sockets stuck in FIN_WAIT1 after call tcp_close mengkanglai
@ 2024-11-20 2:38 ` Kuniyuki Iwashima
0 siblings, 0 replies; 4+ messages in thread
From: Kuniyuki Iwashima @ 2024-11-20 2:38 UTC (permalink / raw)
To: mengkanglai2
Cc: davem, dsahern, edumazet, fengtao40, kuba, kuniyu, linux-kernel,
netdev, pabeni, yanan
From: mengkanglai <mengkanglai2@huawei.com>
Date: Tue, 19 Nov 2024 08:38:26 +0000
> >
> > From: mengkanglai <mengkanglai2@huawei.com>
> > Date: Wed, 13 Nov 2024 12:40:34 +0000
> > > Hello, Eric:
> > > Commit 151c9c724d05 ("tcp: properly terminate timers for kernel
> > > sockets") introduced a call to inet_csk_clear_xmit_timers_sync() in
> > > tcp_close(). Kernel sockets do not hold a netns reference
> > > (sk->sk_net_refcnt is not set). When such a TCP socket is closed,
> > > __tcp_close() calls tcp_send_fin() to send a FIN packet to the
> > > remote server.
> >
> > Just curious which subsystem the kernel socket is created by.
> >
> > Recently, CIFS and sunrpc are (being) converted to hold net refcnt.
> >
> > CIFS: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ef7134c7fc48e1441b398e55a862232868a6f0a7
> > sunrpc: https://lore.kernel.org/netdev/20241112135434.803890-1-liujian56@huawei.com/
> >
> > I remember RDS's listener does not hold refcnt but other client sockets (SMC, RDS, MPTCP, CIFS, sunrpc) do.
> >
> > I think all TCP kernel sockets should hold netns refcnt except for one created at pernet_operations.init() hook like RDS.
> >
> > > If this FIN packet is lost due to a network fault, TCP should
> > > retransmit it, but the TCP timers have been stopped by
> > > inet_csk_clear_xmit_timers_sync(). The socket then stays stuck in
> > > FIN_WAIT1 and never goes away. I don't think this is right.
>
>
> I found this problem while testing NFS. The sunrpc patch at https://lore.kernel.org/netdev/20241112135434.803890-1-liujian56@huawei.com/ will solve it.
> I agree that all TCP kernel sockets should hold a netns refcnt.
> However, for kernel TCP sockets created by other kernel modules through
> sock_create_kern() or sk_alloc(kern=1),
In the next cycle, I'll rename sock_create_kern() to sock_create_net_noref()
and add sock_create_net(), so that out-of-tree modules will fail to build and
their authors will notice that sock_create_net_noref() can trigger this issue.
https://github.com/q2ven/linux/commits/427_2
> it means that they must now hold
> sk_net_refcnt; otherwise the FIN will be sent only once and will not be
> retransmitted when the socket is released. But other modules that use TCP
> may not be aware that they must hold sk_net_refcnt. Should we add a check
> in tcp_close()?
The check doesn't fix the issue for in-netns users.
I'd rather print the allocator and change it to use
sock_create_net() instead.
---8<---
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 0d704bda6c41..7d6a1faa05a3 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3220,8 +3220,12 @@ void tcp_close(struct sock *sk, long timeout)
 	lock_sock(sk);
 	__tcp_close(sk, timeout);
 	release_sock(sk);
+
+#ifdef CONFIG_NET_NS_REFCNT_TRACKER
 	if (!sk->sk_net_refcnt)
-		inet_csk_clear_xmit_timers_sync(sk);
+		stack_depot_print(sk->ns_tracker);
+#endif
+
 	sock_put(sk);
 }
 EXPORT_SYMBOL(tcp_close);
---8<---
>
> ---
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index fb920369c..6b92026a4 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2804,7 +2804,7 @@ void tcp_close(struct sock *sk, long timeout)
>  	lock_sock(sk);
>  	__tcp_close(sk, timeout);
>  	release_sock(sk);
> -	if (!sk->sk_net_refcnt)
> +	if (!net_eq(sock_net(sk), &init_net) && !sk->sk_net_refcnt)
>  		inet_csk_clear_xmit_timers_sync(sk);
>  	sock_put(sk);
>  }
^ permalink raw reply related [flat|nested] 4+ messages in thread
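The objection in the reply above (the init_net check does not help sockets inside other network namespaces) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_net`, `toy_init_net`, `ns_sock`, `ns_sock_make()` and the two `clears_timers_*()` helpers are hypothetical names invented for illustration, not kernel APIs. They mirror only the two conditions discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct net and init_net (not kernel types). */
struct toy_net { int id; };
static struct toy_net toy_init_net = { 0 };

/* Hypothetical stand-in for the two struct sock properties discussed. */
struct ns_sock {
	struct toy_net *net;	/* namespace the socket lives in */
	bool net_refcnt;	/* models sk->sk_net_refcnt */
};

static struct ns_sock ns_sock_make(struct toy_net *net, bool net_refcnt)
{
	assert(net != NULL);
	return (struct ns_sock){ .net = net, .net_refcnt = net_refcnt };
}

/* Condition currently in tcp_close(): !sk->sk_net_refcnt */
static bool clears_timers_current(const struct ns_sock *sk)
{
	return !sk->net_refcnt;
}

/* Condition from the proposed patch: also require a non-initial netns. */
static bool clears_timers_proposed(const struct ns_sock *sk)
{
	return sk->net != &toy_init_net && !sk->net_refcnt;
}
```

In this model, a socket without a netns reference in the initial namespace keeps its timers under the proposed condition, but the same kind of socket in any other namespace still has them cleared, so a lost FIN there would still not be retransmitted.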
* kernel tcp sockets stuck in FIN_WAIT1 after call tcp_close
@ 2024-11-13 12:40 mengkanglai
2024-11-13 18:56 ` Kuniyuki Iwashima
0 siblings, 1 reply; 4+ messages in thread
From: mengkanglai @ 2024-11-13 12:40 UTC (permalink / raw)
To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Fengtao (fengtao, Euler), Yanan (Euler)
Hello, Eric:
Commit 151c9c724d05 ("tcp: properly terminate timers for kernel sockets") introduced a call to inet_csk_clear_xmit_timers_sync() in tcp_close().
Kernel sockets do not hold a netns reference (sk->sk_net_refcnt is not set). When such a TCP socket is closed, __tcp_close() calls tcp_send_fin() to send a FIN packet to the remote server.
If this FIN packet is lost due to a network fault, TCP should retransmit it, but the TCP timers have been stopped by inet_csk_clear_xmit_timers_sync().
The socket then stays stuck in FIN_WAIT1 and never goes away. I don't think this is right.
Best wishes!
^ permalink raw reply [flat|nested] 4+ messages in thread
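The sequence reported above can be sketched as a toy userspace state machine. It is a minimal model, not kernel code: `toy_sock`, `toy_close()` and `toy_run()` are hypothetical names, and the network is assumed to drop the first `drops` FIN packets and deliver every later one. The only behavior taken from the thread is that closing sends one FIN and that the timers are stopped when the socket holds no netns reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these types are the kernel's. */
enum toy_state { TOY_ESTABLISHED, TOY_FIN_WAIT1, TOY_FIN_WAIT2 };

struct toy_sock {
	enum toy_state state;
	bool net_refcnt;	/* models sk->sk_net_refcnt */
	bool rto_armed;		/* models a pending retransmit timer */
	int fin_sent;		/* number of FIN transmissions so far */
};

/* Models tcp_close(): send the FIN once, then stop the timers for
 * sockets that hold no netns reference. */
static void toy_close(struct toy_sock *sk)
{
	assert(sk->state == TOY_ESTABLISHED);
	sk->state = TOY_FIN_WAIT1;
	sk->fin_sent++;
	sk->rto_armed = true;
	if (!sk->net_refcnt)
		sk->rto_armed = false;	/* inet_csk_clear_xmit_timers_sync() */
}

/* Closes a socket and runs up to `ticks` timer periods on a network
 * that drops the first `drops` FINs. Returns the final state. */
static enum toy_state toy_run(bool net_refcnt, int drops, int ticks)
{
	struct toy_sock sk = {
		.state = TOY_ESTABLISHED,
		.net_refcnt = net_refcnt,
	};

	toy_close(&sk);
	for (int i = 0; i < ticks; i++) {
		if (sk.fin_sent > drops) {
			/* A FIN got through and was ACKed by the peer. */
			sk.state = TOY_FIN_WAIT2;
			sk.rto_armed = false;
			break;
		}
		if (sk.rto_armed)
			sk.fin_sent++;	/* timer fires: retransmit the FIN */
		/* With no timer armed, nothing ever resends the FIN. */
	}
	return sk.state;
}
```

With one dropped FIN, `toy_run(true, 1, 10)` leaves FIN_WAIT1 because the timer retransmits, while `toy_run(false, 1, 10)` stays in FIN_WAIT1 for as many ticks as the model is run.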
* Re: kernel tcp sockets stuck in FIN_WAIT1 after call tcp_close
2024-11-13 12:40 mengkanglai
@ 2024-11-13 18:56 ` Kuniyuki Iwashima
0 siblings, 0 replies; 4+ messages in thread
From: Kuniyuki Iwashima @ 2024-11-13 18:56 UTC (permalink / raw)
To: mengkanglai2
Cc: davem, dsahern, edumazet, fengtao40, kuba, linux-kernel, netdev,
pabeni, yanan, kuniyu
From: mengkanglai <mengkanglai2@huawei.com>
Date: Wed, 13 Nov 2024 12:40:34 +0000
> Hello, Eric:
> Commit 151c9c724d05 ("tcp: properly terminate timers for kernel sockets")
> introduced a call to inet_csk_clear_xmit_timers_sync() in tcp_close().
> Kernel sockets do not hold a netns reference (sk->sk_net_refcnt is not
> set). When such a TCP socket is closed, __tcp_close() calls
> tcp_send_fin() to send a FIN packet to the remote server.
Just curious which subsystem the kernel socket is created by.
Recently, CIFS and sunrpc are (being) converted to hold net refcnt.
CIFS: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ef7134c7fc48e1441b398e55a862232868a6f0a7
sunrpc: https://lore.kernel.org/netdev/20241112135434.803890-1-liujian56@huawei.com/
I remember RDS's listener does not hold refcnt but other client sockets
(SMC, RDS, MPTCP, CIFS, sunrpc) do.
I think all TCP kernel sockets should hold netns refcnt except for one
created at pernet_operations.init() hook like RDS.
> If this FIN packet is lost due to a network fault, TCP should retransmit
> it, but the TCP timers have been stopped by
> inet_csk_clear_xmit_timers_sync(). The socket then stays stuck in
> FIN_WAIT1 and never goes away. I don't think this is right.
^ permalink raw reply [flat|nested] 4+ messages in thread