netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: soukjin.bae@samsung.com,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"lorenzo@google.com" <lorenzo@google.com>
Cc: 유금환 <geumhwan.yu@samsung.com>
Subject: Re: FW: [Resource Leak] Suggesting patch for tcp_close
Date: Tue, 27 Nov 2018 20:57:03 -0800	[thread overview]
Message-ID: <a2dd1915-ce29-d892-c43a-c43555f596b7@gmail.com> (raw)
In-Reply-To: <20181128010903epcms1p7ef44541625212f6af3fc2d3e0d7eb390@epcms1p7>



On 11/27/2018 05:09 PM, 배석진 wrote:
> Hello all,
> 
> nobody concern about this? minor problem? :(
> we saw hundreds of not closed tcp session with FIN_WAIT1 and LAST_ACK.

These sessions should have a timer, and eventually disappear.

Do you have a test to demonstrate the issue ?

I know Lorenzo wrote tests, so presumably new tests are needed.

> 
> and easily reproduced in wifi network because of android gms... :p
> i heard that gms try connect to their server to confirm wifi connection.
> and they can't recieve the fin-ack frequently.
> 
> thanks,
> 
>  
> --------- Original Message ---------
> Sender : 배석진 <soukjin.bae@samsung.com> Staff Engineer/System개발2그룹(무선)/삼성전자
> Date   : 2018-11-23 16:22 (GMT+9)
> Title  : Suggesting patch for tcp_close
>  
> Dear all,
>  
>  
> This is Soukin Bae working on Samsung Elec. Mobile Division.
> we have a problem with tcp closing.
>  
> in shortly, 
> 1. on 4-way handshking to close session
> 2. if ack pkt is not arrived from opposite side
> 3. then session can't be closed forever
>  
>  
> in mobile device, condition 2 can be happend in various case.
> like as turn off wifi or mobile data. or bad condition of air network, etc..
>  
> this could be occur in both side of connection.
> when issue happened during active closing, the session remained with FIN_WAIT1 state.
> and at passive closing, remained with LAST_ACK state.
>  
> ---------------------------------------------------------------------------------------------
> below is test result after wifi on/off repetition (without mobile data).
> maybe 'Foreign Address' sent the fin-ack when wifi-off state.
> so device coun't recieve ack pkt further, and the session is remained permanently.
> and their count is growing up. this is resource leak.
>  
> ### turn on wifi
> D:\Test>adb shell netstat -npWae
> Proto Recv-Q Send-Q Local Address                                 Foreign Address               State       User       Inode       PID/Program Name
> tcp        0      0 127.0.0.1:5037                                0.0.0.0:*                     LISTEN      0          36357       6907/adbd
> tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:58660   2404:6800:4008:c00::bc:5228   ESTABLISHED 10041      74347       6523/com.google.android.gms.persistent
> tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148   2404:6800:4004:800::2003:80   LAST_ACK    0          0           -
> tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:37512   64:ff9b::3444:f3dc:443        ESTABLISHED 10137      77447       9522/com.samsung.android.game.gos
> tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294   2404:6800:4005:80c::2004:443  LAST_ACK    0          0           -
> tcp6       1      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260   64:ff9b::34d0:9421:80         LAST_ACK    0          0           -
>  
> ### turn off wifi
> D:\Test>adb shell netstat -npWae
> Proto Recv-Q Send-Q Local Address                                 Foreign Address               State       User       Inode       PID/Program Name
> tcp        0      0 127.0.0.1:5037                                0.0.0.0:*                     LISTEN      0          36357       6907/adbd
> tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148   2404:6800:4004:800::2003:80   LAST_ACK    0          0           -
> tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294   2404:6800:4005:80c::2004:443  LAST_ACK    0          0           -
> tcp6       1      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260   64:ff9b::34d0:9421:80         LAST_ACK    0          0           -
>  
>  
> ---------------------------------------------------------------------------------------------
> this is our analysis
> when app finished using the socket(tcp session), it calls sock_close.
> then tcp_close() makes sk->sk_state to LAST_ACK, and sock to SOCK_DEAD by excute sock_orphan().
>  
> 11-23 11:40:55.676 [5:      Thread-44:11210] TCP: bsj: tcp_set_state: TCP sk=ffffffc8a789c640, in:80092, State Close Wait -> Last ACK, [2404:6800:4004:800::2003]
> 11-23 11:40:55.676 [5:      Thread-44:11210] Call trace:
> 11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008ffdda8>] tcp_set_state+0x1b8/0x1f0
> 11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008ffe3f8>] tcp_close+0x484/0x534
> 11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff800902f830>] inet_release+0x60/0x74
> 11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8009074238>] inet6_release+0x30/0x48
> 11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008f35e4c>] __sock_release+0x40/0x104
> 11-23 11:40:55.676 [5:      Thread-44:11210] [<ffffff8008f3bab0>] sock_close+0x18/0x28
> 11-23 11:40:55.678 [5:      Thread-44:11210] TCP: bsj: sock_orphan: TCP sk=ffffffc8a789c640, in:80092, State Last ACK, [2404:6800:4004:800::2003]
>  
>  
> at this point, if the FIN_ACK comes, there's no problem. all is well~
> but without that and when turn off wifi,
> netd trying to close all the session by calling tcp_abort, sock_diag_destory.
>  
> 11-23 11:41:38.463 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a789c640, in:0, State Last ACK, caller: <sock_diag_destroy>, [2404:6800:4004:800::2003]
> 11-23 11:41:38.464 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a789b840, in:0, State Last ACK, caller: <sock_diag_destroy>, [2404:6800:4005:80c::2004]
> 11-23 11:41:38.464 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a7899c40, in:0, State Last ACK, caller: <sock_diag_destroy>, [64:ff9b::34d0:9421]
>  
> but because of this sock was already changed to SOCK_DEAD state by tcp_close(), tcp_done() can't be excuted.
> so this session can't be closed.
>  
> int tcp_abort(struct sock *sk, int err)
> {
>         ...
>         if (!sock_flag(sk, SOCK_DEAD)) {        //// when SOCK_DEAD, tcp_done() be skip.
>                 ...
>                 sk->sk_error_report(sk);
>                 if (tcp_need_reset(sk->sk_state))
>                         tcp_send_active_reset(sk, GFP_ATOMIC);
>                 tcp_done(sk);
>         }
>         ...
>         return 0;
> }
>  
>  
> ---------------------------------------------------------------------------------------------
> we thought that just sk_error_report have to reside in that condition, SOCK_DEAD.
> and send-reset and tcp_done should to be always.
> we fixed it like as below, and confirmed that issue resolved.
> please check this.
>  
> Best regards,
>  
>  
>  
> From 30f8d02f9d2ca8820a527444260e6dcf4862db8a Mon Sep 17 00:00:00 2001
> From: soukjin bae <soukjin.bae@samsung.com>
> Date: Fri, 23 Nov 2018 15:56:53 +0900
> Subject: [PATCH] net: close session always when tcp_abort
>  
> session before recieve the FIN_ACK couldn't be closed by tcp_abort
> they will remained permanently. close them
>  
> Signed-off-by: soukjin bae <soukjin.bae@samsung.com>
> Signed-off-by: geumhwan yu <geumhwan.yu@samsung.com>
> ---
>  net/ipv4/tcp.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>  
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 9e6bc4d6daa7..faf4a8bbec8e 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -3792,11 +3792,12 @@ int tcp_abort(struct sock *sk, int err)
>                  /* This barrier is coupled with smp_rmb() in tcp_poll() */
>                  smp_wmb();
>                  sk->sk_error_report(sk);
> -                if (tcp_need_reset(sk->sk_state))
> -                        tcp_send_active_reset(sk, GFP_ATOMIC);
> -                tcp_done(sk);
>          }
>  
> +        if (tcp_need_reset(sk->sk_state))
> +                tcp_send_active_reset(sk, GFP_ATOMIC);
> +        tcp_done(sk);
> +
>          bh_unlock_sock(sk);
>          local_bh_enable();
>          tcp_write_queue_purge(sk);
> -- 
> 2.13.0
>  
> 

  reply	other threads:[~2018-11-28 15:57 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p8>
2018-11-23  7:22 ` Suggesting patch for tcp_close 배석진
     [not found]   ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p7>
2018-11-28  1:09     ` FW: [Resource Leak] " 배석진
2018-11-28  4:57       ` Eric Dumazet [this message]
2018-11-28  4:58         ` Eric Dumazet
     [not found]       ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p1>
2018-11-28  6:17         ` 배석진
2018-11-28  7:36           ` (2) " Lorenzo Colitti
     [not found]           ` <CGME20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p6>
2018-11-28  8:49             ` 배석진
2018-11-28 14:40           ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a2dd1915-ce29-d892-c43a-c43555f596b7@gmail.com \
    --to=eric.dumazet@gmail.com \
    --cc=geumhwan.yu@samsung.com \
    --cc=lorenzo@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=soukjin.bae@samsung.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).