netdev.vger.kernel.org archive mirror
From: Kuniyuki Iwashima <kuniyu@amazon.com>
To: <mengkanglai2@huawei.com>
Cc: <davem@davemloft.net>, <dsahern@kernel.org>,
	<edumazet@google.com>, <fengtao40@huawei.com>, <kuba@kernel.org>,
	<kuniyu@amazon.com>, <linux-kernel@vger.kernel.org>,
	<netdev@vger.kernel.org>, <pabeni@redhat.com>, <yanan@huawei.com>
Subject: Re: kernel tcp sockets stuck in FIN_WAIT1 after call tcp_close
Date: Tue, 19 Nov 2024 18:38:28 -0800
Message-ID: <20241120023828.907-1-kuniyu@amazon.com>
In-Reply-To: <d46151818b694dc79b488061817d3d73@huawei.com>

From: mengkanglai <mengkanglai2@huawei.com>
Date: Tue, 19 Nov 2024 08:38:26 +0000
> > 
> > From: mengkanglai <mengkanglai2@huawei.com>
> > Date: Wed, 13 Nov 2024 12:40:34 +0000
> > > Hello, Eric:
> > > Commit 151c9c724d05 ("tcp: properly terminate timers for kernel
> > > sockets") introduced inet_csk_clear_xmit_timers_sync() in tcp_close().
> > > A kernel socket does not hold sk->sk_net_refcnt. When a kernel TCP
> > > socket is closed, __tcp_close() calls tcp_send_fin() to send a FIN
> > > packet to the remote server,
> > 
> > Just curious which subsystem the kernel socket is created by.
> > 
> > Recently, CIFS and sunrpc are (being) converted to hold net refcnt.
> > 
> > CIFS: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ef7134c7fc48e1441b398e55a862232868a6f0a7
> > sunrpc: https://lore.kernel.org/netdev/20241112135434.803890-1-liujian56@huawei.com/
> > 
> > I remember RDS's listener does not hold refcnt but other client sockets (SMC, RDS, MPTCP, CIFS, sunrpc) do.
> > 
> > I think all TCP kernel sockets should hold a netns refcnt, except for ones created in a pernet_operations.init() hook, like RDS's.
> > 
> > > if this FIN packet is lost due to a network fault, TCP should
> > > retransmit it, but the TCP timer has been stopped by
> > > inet_csk_clear_xmit_timers_sync(). The socket will then be stuck in
> > > FIN_WAIT1 and never go away. I don't think that is right.
> 
> 
> I found this problem when testing NFS. The sunrpc patch https://lore.kernel.org/netdev/20241112135434.803890-1-liujian56@huawei.com/ will solve it.
> I agree that all TCP kernel sockets should hold a netns refcnt.
> However, for kernel TCP sockets created by other kernel modules through
> sock_create_kern or sk_alloc(kern=0),

In the next cycle, I'll rename sock_create_kern() to sock_create_net_noref()
and add sock_create_net(), so that out-of-tree modules will fail to build and
their authors will notice that sock_create_net_noref() can trigger this issue.

https://github.com/q2ven/linux/commits/427_2


> it means that they must now hold
> sk_net_refcnt, otherwise the FIN will only be sent once and will not be
> retransmitted when the socket is released. But other modules that use TCP
> may not be aware that they need to hold sk_net_refcnt. Should we add a
> check in tcp_close?

The check doesn't fix the issue for in-netns users.

I'd rather print the allocator and change it to use
sock_create_net() instead.

---8<---
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 0d704bda6c41..7d6a1faa05a3 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3220,8 +3220,12 @@ void tcp_close(struct sock *sk, long timeout)
 	lock_sock(sk);
 	__tcp_close(sk, timeout);
 	release_sock(sk);
+
+#ifdef CONFIG_NET_NS_REFCNT_TRACKER
 	if (!sk->sk_net_refcnt)
-		inet_csk_clear_xmit_timers_sync(sk);
+		stack_depot_print(sk->ns_tracker);
+#endif
+
 	sock_put(sk);
 }
 EXPORT_SYMBOL(tcp_close);
---8<---

> 
> ---
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index fb920369c..6b92026a4 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2804,7 +2804,7 @@ void tcp_close(struct sock *sk, long timeout)
>         lock_sock(sk);
>         __tcp_close(sk, timeout);
>         release_sock(sk);
> -       if (!sk->sk_net_refcnt)
> +       if (!net_eq(sock_net(sk), &init_net) && !sk->sk_net_refcnt)
>                 inet_csk_clear_xmit_timers_sync(sk);
>         sock_put(sk);
>  }
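
Illustration (not part of the original messages): the failure mode
discussed in this thread can be sketched as a toy userspace model. This
is not kernel code. Every toy_* name is hypothetical, "net_refcnt"
stands in for sk->sk_net_refcnt, and clearing "rto_armed" stands in for
inet_csk_clear_xmit_timers_sync(). The "network" delivers the FIN only
on transmission number deliver_on.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these types or functions exist in the kernel. */
enum toy_state { TOY_ESTABLISHED, TOY_FIN_WAIT1, TOY_FIN_WAIT2 };

struct toy_sock {
	enum toy_state state;
	bool net_refcnt;	/* models sk->sk_net_refcnt */
	bool rto_armed;		/* models the retransmit timer being armed */
	int fin_tx;		/* how many times the FIN has been sent */
};

/* Send the FIN; it is ACKed only on transmission number deliver_on. */
static void toy_send_fin(struct toy_sock *sk, int deliver_on)
{
	sk->fin_tx++;
	if (sk->fin_tx == deliver_on) {
		sk->state = TOY_FIN_WAIT2;	/* peer ACKed our FIN */
		sk->rto_armed = false;
	} else {
		sk->rto_armed = true;		/* lost: wait for the timer */
	}
}

/* Models tcp_close() as it appears in the diffs above. */
static void toy_close(struct toy_sock *sk, int deliver_on)
{
	assert(sk->state == TOY_ESTABLISHED);
	sk->state = TOY_FIN_WAIT1;
	toy_send_fin(sk, deliver_on);
	if (!sk->net_refcnt)
		sk->rto_armed = false;	/* timers cleared, no retransmit */
}

/* Fire the retransmit timer while it is armed, at most max_ticks times. */
static void toy_run_timers(struct toy_sock *sk, int deliver_on, int max_ticks)
{
	for (int i = 0; i < max_ticks && sk->rto_armed; i++)
		toy_send_fin(sk, deliver_on);
}

/* Close a fresh socket and return the state it ends up in. */
enum toy_state toy_scenario(bool net_refcnt, int deliver_on)
{
	struct toy_sock sk = {
		.state = TOY_ESTABLISHED,
		.net_refcnt = net_refcnt,
	};

	toy_close(&sk, deliver_on);
	toy_run_timers(&sk, deliver_on, 8);
	return sk.state;
}
```

With the first FIN lost (deliver_on = 2), toy_scenario(true, 2) reaches
TOY_FIN_WAIT2 through a retransmit, while toy_scenario(false, 2) stays
in TOY_FIN_WAIT1 because the timer was cleared right after the only
transmission. The model ignores everything else TCP does (timeouts,
orphan limits, RST handling).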
