From mboxrd@z Thu Jan 1 00:00:00 1970 From: Eric Dumazet Subject: Re: FW: [Resource Leak] Suggesting patch for tcp_close Date: Tue, 27 Nov 2018 20:57:03 -0800 Message-ID: References: <20181123072258epcms1p8c41425598751be28f8cc05b91c7ac4b5@epcms1p8> <20181128010903epcms1p7ef44541625212f6af3fc2d3e0d7eb390@epcms1p7> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Cc: =?UTF-8?B?7Jyg6riI7ZmY?= To: soukjin.bae@samsung.com, "netdev@vger.kernel.org" , "lorenzo@google.com" Return-path: Received: from mail-pg1-f196.google.com ([209.85.215.196]:42000 "EHLO mail-pg1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727008AbeK1P5Y (ORCPT ); Wed, 28 Nov 2018 10:57:24 -0500 Received: by mail-pg1-f196.google.com with SMTP id d72so8914067pga.9 for ; Tue, 27 Nov 2018 20:57:05 -0800 (PST) In-Reply-To: <20181128010903epcms1p7ef44541625212f6af3fc2d3e0d7eb390@epcms1p7> Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 11/27/2018 05:09 PM, 배석진 wrote: > Hello all, > > nobody concern about this? minor problem? :( > we saw hundreds of not closed tcp session with FIN_WAIT1 and LAST_ACK. These sessions should have a timer, and eventually disappear. Do you have a test to demonstrate the issue ? I know Lorenzo wrote tests, so presumably new tests are needed. > > and easily reproduced in wifi network because of android gms... :p > i heard that gms try connect to their server to confirm wifi connection. > and they can't recieve the fin-ack frequently. > > thanks, > >   > --------- Original Message --------- > Sender : 배석진  Staff Engineer/System개발2그룹(무선)/삼성전자 > Date   : 2018-11-23 16:22 (GMT+9) > Title  : Suggesting patch for tcp_close >   > Dear all, >   >   > This is Soukin Bae working on Samsung Elec. Mobile Division. > we have a problem with tcp closing. >   > in shortly,  > 1. on 4-way handshking to close session > 2. if ack pkt is not arrived from opposite side > 3. then session can't be closed forever >   >   > in mobile device, condition 2 can be happend in various case. > like as turn off wifi or mobile data. or bad condition of air network, etc.. >   > this could be occur in both side of connection. > when issue happened during active closing, the session remained with FIN_WAIT1 state. > and at passive closing, remained with LAST_ACK state. >   > --------------------------------------------------------------------------------------------- > below is test result after wifi on/off repetition (without mobile data). > maybe 'Foreign Address' sent the fin-ack when wifi-off state. > so device coun't recieve ack pkt further, and the session is remained permanently. > and their count is growing up. this is resource leak. >   > ### turn on wifi > D:\Test>adb shell netstat -npWae > Proto Recv-Q Send-Q Local Address                                 Foreign Address               State       User       Inode       PID/Program Name > tcp        0      0 127.0.0.1:5037                                0.0.0.0:*                     LISTEN      0          36357       6907/adbd > tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:58660   2404:6800:4008:c00::bc:5228   ESTABLISHED 10041      74347       6523/com.google.android.gms.persistent > tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148   2404:6800:4004:800::2003:80   LAST_ACK    0          0           - > tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:37512   64:ff9b::3444:f3dc:443        ESTABLISHED 10137      77447       9522/com.samsung.android.game.gos > tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294   2404:6800:4005:80c::2004:443  LAST_ACK    0          0           - > tcp6       1      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260   64:ff9b::34d0:9421:80         LAST_ACK    0          0           - >   > ### turn off wifi > D:\Test>adb shell netstat -npWae > Proto Recv-Q Send-Q Local Address                                 Foreign Address               State       User       Inode       PID/Program Name > tcp        0      0 127.0.0.1:5037                                0.0.0.0:*                     LISTEN      0          36357       6907/adbd > tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148   2404:6800:4004:800::2003:80   LAST_ACK    0          0           - > tcp6       0      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294   2404:6800:4005:80c::2004:443  LAST_ACK    0          0           - > tcp6       1      0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260   64:ff9b::34d0:9421:80         LAST_ACK    0          0           - >   >   > --------------------------------------------------------------------------------------------- > this is our analysis > when app finished using the socket(tcp session), it calls sock_close. > then tcp_close() makes sk->sk_state to LAST_ACK, and sock to SOCK_DEAD by excute sock_orphan(). >   > 11-23 11:40:55.676 [5:      Thread-44:11210] TCP: bsj: tcp_set_state: TCP sk=ffffffc8a789c640, in:80092, State Close Wait -> Last ACK, [2404:6800:4004:800::2003] > 11-23 11:40:55.676 [5:      Thread-44:11210] Call trace: > 11-23 11:40:55.676 [5:      Thread-44:11210] [] tcp_set_state+0x1b8/0x1f0 > 11-23 11:40:55.676 [5:      Thread-44:11210] [] tcp_close+0x484/0x534 > 11-23 11:40:55.676 [5:      Thread-44:11210] [] inet_release+0x60/0x74 > 11-23 11:40:55.676 [5:      Thread-44:11210] [] inet6_release+0x30/0x48 > 11-23 11:40:55.676 [5:      Thread-44:11210] [] __sock_release+0x40/0x104 > 11-23 11:40:55.676 [5:      Thread-44:11210] [] sock_close+0x18/0x28 > 11-23 11:40:55.678 [5:      Thread-44:11210] TCP: bsj: sock_orphan: TCP sk=ffffffc8a789c640, in:80092, State Last ACK, [2404:6800:4004:800::2003] >   >   > at this point, if the FIN_ACK comes, there's no problem. all is well~ > but without that and when turn off wifi, > netd trying to close all the session by calling tcp_abort, sock_diag_destory. >   > 11-23 11:41:38.463 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a789c640, in:0, State Last ACK, caller: , [2404:6800:4004:800::2003] > 11-23 11:41:38.464 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a789b840, in:0, State Last ACK, caller: , [2404:6800:4005:80c::2004] > 11-23 11:41:38.464 [4:           netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!! : TCP sk=ffffffc8a7899c40, in:0, State Last ACK, caller: , [64:ff9b::34d0:9421] >   > but because of this sock was already changed to SOCK_DEAD state by tcp_close(), tcp_done() can't be excuted. > so this session can't be closed. >   > int tcp_abort(struct sock *sk, int err) > { >         ... >         if (!sock_flag(sk, SOCK_DEAD)) {        //// when SOCK_DEAD, tcp_done() be skip. >                 ... >                 sk->sk_error_report(sk); >                 if (tcp_need_reset(sk->sk_state)) >                         tcp_send_active_reset(sk, GFP_ATOMIC); >                 tcp_done(sk); >         } >         ... >         return 0; > } >   >   > --------------------------------------------------------------------------------------------- > we thought that just sk_error_report have to reside in that condition, SOCK_DEAD. > and send-reset and tcp_done should to be always. > we fixed it like as below, and confirmed that issue resolved. > please check this. >   > Best regards, >   >   >   > From 30f8d02f9d2ca8820a527444260e6dcf4862db8a Mon Sep 17 00:00:00 2001 > From: soukjin bae  > Date: Fri, 23 Nov 2018 15:56:53 +0900 > Subject: [PATCH] net: close session always when tcp_abort >   > session before recieve the FIN_ACK couldn't be closed by tcp_abort > they will remained permanently. close them >   > Signed-off-by: soukjin bae  > Signed-off-by: geumhwan yu  > --- >  net/ipv4/tcp.c | 7 ++++--- >  1 file changed, 4 insertions(+), 3 deletions(-) >   > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c > index 9e6bc4d6daa7..faf4a8bbec8e 100644 > --- a/net/ipv4/tcp.c > +++ b/net/ipv4/tcp.c > @@ -3792,11 +3792,12 @@ int tcp_abort(struct sock *sk, int err) >                  /* This barrier is coupled with smp_rmb() in tcp_poll() */ >                  smp_wmb(); >                  sk->sk_error_report(sk); > -                if (tcp_need_reset(sk->sk_state)) > -                        tcp_send_active_reset(sk, GFP_ATOMIC); > -                tcp_done(sk); >          } >   > +        if (tcp_need_reset(sk->sk_state)) > +                tcp_send_active_reset(sk, GFP_ATOMIC); > +        tcp_done(sk); > + >          bh_unlock_sock(sk); >          local_bh_enable(); >          tcp_write_queue_purge(sk); > --  > 2.13.0 >   >