* [PATCH] tcp: keepalive fixes
From: Enke Chen @ 2021-01-12 19:25 UTC
  To: Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski
  Cc: netdev, linux-kernel, enkechen2020

From: Enke Chen <enchen@paloaltonetworks.com>

In this patch two issues with TCP keepalives are fixed:

1) TCP keepalive does not timeout when there are data waiting to be
   delivered and then the connection got broken. The TCP keepalive
   timeout is not evaluated in that condition.

   The fix is to remove the code that prevents TCP keepalive from
   being evaluated for timeout.

2) With the fix for #1, TCP keepalive can erroneously timeout after
   the 0-window probe kicks in. The 0-window probe counter is wrongly
   applied to TCP keepalives.

   The fix is to use the elapsed time instead of the 0-window probe
   counter in evaluating TCP keepalive timeout.

Cc: stable@vger.kernel.org
Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
---
 net/ipv4/tcp_timer.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 6c62b9ea1320..40953aa40d53 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -696,12 +696,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
 	    ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_SYN_SENT)))
 		goto out;
 
-	elapsed = keepalive_time_when(tp);
-
-	/* It is alive without keepalive 8) */
-	if (tp->packets_out || !tcp_write_queue_empty(sk))
-		goto resched;
-
 	elapsed = keepalive_time_elapsed(tp);
 
 	if (elapsed >= keepalive_time_when(tp)) {
@@ -709,16 +703,15 @@ static void tcp_keepalive_timer (struct timer_list *t)
 		 * to determine when to timeout instead.
 		 */
 		if ((icsk->icsk_user_timeout != 0 &&
-		    elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout) &&
-		    icsk->icsk_probes_out > 0) ||
+		     elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout)) ||
 		    (icsk->icsk_user_timeout == 0 &&
-		    icsk->icsk_probes_out >= keepalive_probes(tp))) {
+		     (elapsed >= keepalive_time_when(tp) +
+		      keepalive_intvl_when(tp) * keepalive_probes(tp)))) {
 			tcp_send_active_reset(sk, GFP_ATOMIC);
 			tcp_write_err(sk);
 			goto out;
 		}
 		if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
-			icsk->icsk_probes_out++;
 			elapsed = keepalive_intvl_when(tp);
 		} else {
 			/* If keepalive was lost due to local congestion,
@@ -732,8 +725,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
 	}
 
 	sk_mem_reclaim(sk);
-
-resched:
 	inet_csk_reset_keepalive_timer (sk, elapsed);
 	goto out;
 
-- 
2.29.2
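As a rough illustration of the patched condition in tcp_keepalive_timer() above, the sketch below models the abort decision in Python. It is not kernel code: the function name keepalive_should_abort is hypothetical, and all times are assumed to be in one unit (seconds here), whereas the kernel compares jiffies and converts the user timeout from milliseconds.

```python
def keepalive_should_abort(elapsed, keepalive_time, keepalive_intvl,
                           keepalive_probes, user_timeout=0):
    """Model of the patched timeout test (hypothetical helper, not kernel API).

    All time arguments share one unit. Returns True when the connection
    would be reset under the patched logic.
    """
    # Nothing is evaluated until the idle time has passed.
    if elapsed < keepalive_time:
        return False
    # With a user timeout configured, only the elapsed time is compared
    # against it; the probe counter is no longer consulted.
    if user_timeout != 0:
        return elapsed >= user_timeout
    # Otherwise the probe budget is expressed as time:
    # idle time plus one interval per allowed probe.
    return elapsed >= keepalive_time + keepalive_intvl * keepalive_probes
```

With an idle time of 10, an interval of 10 and 5 probes, this model aborts once the elapsed time reaches 10 + 10 * 5 = 60.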
* Re: [PATCH] tcp: keepalive fixes
From: Yuchung Cheng @ 2021-01-12 22:48 UTC
  To: Enke Chen
  Cc: Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, netdev, LKML, Neal Cardwell

On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <enkechen2020@gmail.com> wrote:
>
> From: Enke Chen <enchen@paloaltonetworks.com>
>
> In this patch two issues with TCP keepalives are fixed:
>
> 1) TCP keepalive does not timeout when there are data waiting to be
>    delivered and then the connection got broken. The TCP keepalive
>    timeout is not evaluated in that condition.
hi enke
Do you have an example to demonstrate this issue -- in theory when
there is data inflight, an RTO timer should be pending (which
considers user-timeout setting). based on the user-timeout description
(man tcp), the user timeout should abort the socket per the specified
time after data commences. some data would help to understand the
issue.

>
>    The fix is to remove the code that prevents TCP keepalive from
>    being evaluated for timeout.
>
> 2) With the fix for #1, TCP keepalive can erroneously timeout after
>    the 0-window probe kicks in. The 0-window probe counter is wrongly
>    applied to TCP keepalives.
>
>    The fix is to use the elapsed time instead of the 0-window probe
>    counter in evaluating TCP keepalive timeout.
* Re: [PATCH] tcp: keepalive fixes
From: Eric Dumazet @ 2021-01-12 22:52 UTC
  To: Yuchung Cheng
  Cc: Enke Chen, David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Jakub Kicinski, netdev, LKML, Neal Cardwell

On Tue, Jan 12, 2021 at 11:48 PM Yuchung Cheng <ycheng@google.com> wrote:
>
> On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <enkechen2020@gmail.com> wrote:
> >
> > From: Enke Chen <enchen@paloaltonetworks.com>
> >
> > In this patch two issues with TCP keepalives are fixed:
> >
> > 1) TCP keepalive does not timeout when there are data waiting to be
> >    delivered and then the connection got broken. The TCP keepalive
> >    timeout is not evaluated in that condition.
> hi enke
> Do you have an example to demonstrate this issue -- in theory when
> there is data inflight, an RTO timer should be pending (which
> considers user-timeout setting). based on the user-timeout description
> (man tcp), the user timeout should abort the socket per the specified
> time after data commences. some data would help to understand the
> issue.
>

+1

A packetdrill test would be ideal.

Also, given that there is this ongoing issue with TCP_USER_TIMEOUT,
let's not mix things or risk added work for backports to stable versions.
* Re: [PATCH] tcp: keepalive fixes
From: Enke Chen @ 2021-01-13 20:06 UTC
  To: Eric Dumazet
  Cc: Yuchung Cheng, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, netdev, LKML, Neal Cardwell

Hi, Eric:

Just to clarify: the issues for tcp keepalive and TCP_USER_TIMEOUT are
separate issues, and the fixes would not conflict afaik.

Thanks.  -- Enke

On Tue, Jan 12, 2021 at 11:52:43PM +0100, Eric Dumazet wrote:
> On Tue, Jan 12, 2021 at 11:48 PM Yuchung Cheng <ycheng@google.com> wrote:
> >
> > On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <enkechen2020@gmail.com> wrote:
> > >
> > > From: Enke Chen <enchen@paloaltonetworks.com>
> > >
> > > In this patch two issues with TCP keepalives are fixed:
> > >
> > > 1) TCP keepalive does not timeout when there are data waiting to be
> > >    delivered and then the connection got broken. The TCP keepalive
> > >    timeout is not evaluated in that condition.
> > hi enke
> > Do you have an example to demonstrate this issue -- in theory when
> > there is data inflight, an RTO timer should be pending (which
> > considers user-timeout setting). based on the user-timeout description
> > (man tcp), the user timeout should abort the socket per the specified
> > time after data commences. some data would help to understand the
> > issue.
> >
>
> +1
>
> A packetdrill test would be ideal.
>
> Also, given that there is this ongoing issue with TCP_USER_TIMEOUT,
> let's not mix things
> or risk added work for backports to stable versions.
* Re: [PATCH] tcp: keepalive fixes
From: Enke Chen @ 2021-01-13 20:28 UTC
  To: Eric Dumazet
  Cc: Yuchung Cheng, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, netdev, LKML, Neal Cardwell,
	enkechen2020

On Wed, Jan 13, 2021 at 12:06:27PM -0800, Enke Chen wrote:
> Hi, Eric:
>
> Just to clarify: the issues for tcp keepalive and TCP_USER_TIMEOUT are
> separate issues, and the fixes would not conflict afaik.
>
> Thanks.  -- Enke

I have posted patches for both issues, and there is no conflict between
the patches.

Thanks.  -- Enke

>
> On Tue, Jan 12, 2021 at 11:52:43PM +0100, Eric Dumazet wrote:
> > On Tue, Jan 12, 2021 at 11:48 PM Yuchung Cheng <ycheng@google.com> wrote:
> > >
> > > On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <enkechen2020@gmail.com> wrote:
> > > >
> > > > From: Enke Chen <enchen@paloaltonetworks.com>
> > > >
> > > > In this patch two issues with TCP keepalives are fixed:
> > > >
> > > > 1) TCP keepalive does not timeout when there are data waiting to be
> > > >    delivered and then the connection got broken. The TCP keepalive
> > > >    timeout is not evaluated in that condition.
> > > hi enke
> > > Do you have an example to demonstrate this issue -- in theory when
> > > there is data inflight, an RTO timer should be pending (which
> > > considers user-timeout setting). based on the user-timeout description
> > > (man tcp), the user timeout should abort the socket per the specified
> > > time after data commences. some data would help to understand the
> > > issue.
> > >
> >
> > +1
> >
> > A packetdrill test would be ideal.
> >
> > Also, given that there is this ongoing issue with TCP_USER_TIMEOUT,
> > let's not mix things
> > or risk added work for backports to stable versions.
* Re: [PATCH] tcp: keepalive fixes
From: Enke Chen @ 2021-01-13  0:42 UTC
  To: Yuchung Cheng
  Cc: Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, netdev, LKML, Neal Cardwell,
	enkechen2020

Hi, Yuchung:

I have attached the python script that reproduces the keepalive issues.
The script is a slight modification of the one written by Marek Majkowski:

https://github.com/cloudflare/cloudflare-blog/blob/master/2019-09-tcp-keepalives/test-zero.py

Please note that only the TCP keepalive is configured, and not the user
timeout.

Thanks.  -- Enke

On Tue, Jan 12, 2021 at 02:48:01PM -0800, Yuchung Cheng wrote:
> On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <enkechen2020@gmail.com> wrote:
> >
> > From: Enke Chen <enchen@paloaltonetworks.com>
> >
> > In this patch two issues with TCP keepalives are fixed:
> >
> > 1) TCP keepalive does not timeout when there are data waiting to be
> >    delivered and then the connection got broken. The TCP keepalive
> >    timeout is not evaluated in that condition.
> hi enke
> Do you have an example to demonstrate this issue -- in theory when
> there is data inflight, an RTO timer should be pending (which
> considers user-timeout setting). based on the user-timeout description
> (man tcp), the user timeout should abort the socket per the specified
> time after data commences. some data would help to understand the
> issue.
>

------

#!/usr/bin/python

import io
import os
import select
import socket
import time
import utils
import ctypes

utils.new_ns()

port = 1

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
s.bind(('127.0.0.1', port))
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)

s.listen(16)

tcpdump = utils.tcpdump_start(port)
c = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
c.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)
c.connect(('127.0.0.1', port))

x, _ = s.accept()

if False:
    c.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT, 90*1000)

if True:
    c.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    c.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)
    c.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 10)
    c.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)

time.sleep(0.2)
print("[ ] c.send()")
import fcntl
TIOCOUTQ=0x5411
c.setblocking(False)
while True:
    bytes_avail = ctypes.c_int()
    fcntl.ioctl(c.fileno(), TIOCOUTQ, bytes_avail)
    if bytes_avail.value > 64*1024:
        break
    try:
        c.send(b"A" * 16384 * 4)
    except io.BlockingIOError:
        break
c.setblocking(True)
time.sleep(0.2)
utils.ss(port)
utils.check_buffer(c)

t0 = time.time()

if True:
    utils.drop_start(dport=port)
    utils.drop_start(sport=port)

poll = select.poll()
poll.register(c, select.POLLIN)
poll.poll()

utils.ss(port)

e = c.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
print("[ ] SO_ERROR = %s" % (e,))

t1 = time.time()
print("[ ] took: %f seconds" % (t1-t0,))
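The keepalive setup in the script above can be isolated into a small helper. The following is a minimal sketch, assuming Linux (where the socket module exposes TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT); the name enable_keepalive is hypothetical and not part of the script or of utils. The returned value applies the time-based bound from the patch (idle time plus interval times probe count) and is only an expectation under that logic, not a measured result.

```python
import socket


def enable_keepalive(sock, idle, intvl, cnt):
    """Turn on TCP keepalive for sock (hypothetical helper, Linux options).

    idle and intvl are in seconds, cnt is the number of probes.
    Returns the time bound implied by the patched check:
    idle + intvl * cnt.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, intvl)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, cnt)
    return idle + intvl * cnt
```

Called with the script's values, enable_keepalive(c, 10, 10, 5) returns 60, which is the elapsed time after which the patched timer would be expected to reset the connection when no user timeout is set.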
* Re: [PATCH] tcp: keepalive fixes
From: Enke Chen @ 2021-01-22 19:45 UTC
  To: Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski
  Cc: netdev, linux-kernel, Neal Cardwell, enkechen2020

Hi, Folks:

Please ignore this patch. I will split it into separate ones as suggested
off-list by Neal Cardwell <ncardwell@google.com>.

Thanks.  -- Enke

On Tue, Jan 12, 2021 at 11:25:44AM -0800, Enke Chen wrote:
> From: Enke Chen <enchen@paloaltonetworks.com>
>
> In this patch two issues with TCP keepalives are fixed:
>
> 1) TCP keepalive does not timeout when there are data waiting to be
>    delivered and then the connection got broken. The TCP keepalive
>    timeout is not evaluated in that condition.
>
>    The fix is to remove the code that prevents TCP keepalive from
>    being evaluated for timeout.
>
> 2) With the fix for #1, TCP keepalive can erroneously timeout after
>    the 0-window probe kicks in. The 0-window probe counter is wrongly
>    applied to TCP keepalives.
>
>    The fix is to use the elapsed time instead of the 0-window probe
>    counter in evaluating TCP keepalive timeout.
end of thread, other threads: [~2021-01-22 22:42 UTC]

Thread overview: 7+ messages
2021-01-12 19:25 [PATCH] tcp: keepalive fixes Enke Chen
2021-01-12 22:48 ` Yuchung Cheng
2021-01-12 22:52   ` Eric Dumazet
2021-01-13 20:06     ` Enke Chen
2021-01-13 20:28       ` Enke Chen
2021-01-13  0:42   ` Enke Chen
2021-01-22 19:45 ` Enke Chen