bug in tcp?

netdev.vger.kernel.org archive mirror

* bug in tcp?
@ 2007-04-16 22:28 Sebastian Kuzminsky
  2007-04-16 22:50 ` Stephen Hemminger
  2007-04-17  4:44 ` Philip Craig
  0 siblings, 2 replies; 13+ messages in thread
From: Sebastian Kuzminsky @ 2007-04-16 22:28 UTC (permalink / raw)
  To: netdev

I'm seeing some weird behavior in TCP.  The issue is perfectly
reproducible using netcat and other programs.  This is what I do:

    1.  Open a TCP connection over the loopback (over IPv4).

    2.  Send a couple of bytes of data each way.  No problems.

    3.  Wait about 120 hours with no writes on either side of the
        connection.

    4.  write() a few bytes to the server's socket.  I'd expect the data
        to go through, but it doesn't.  I see the TCP frame from the
        server to the client, but instead of an ACK, the client sends
        back a RST.  netstat shows the bytes sitting in the server's
        socket's send-buffer.

    5.  write a few bytes to the client's socket.  The server gets
        these immediately.

    6.  On the next server-to-client retransmit, the client gets the
        bytes from the server.  After this, the connection works normally.
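
The steps above can be sketched as a small Python script.  This is only
an illustration, not the program used in the report: the function name
idle_connection_probe and its idle_seconds parameter are made up here,
and the idle time defaults to zero so the sketch finishes immediately
(the report waits about 120 hours at that point).

```python
import socket
import time


def idle_connection_probe(idle_seconds: float = 0.0) -> bytes:
    """Open a loopback TCP connection, idle, then write from the server side."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())  # step 1
    server, _ = listener.accept()
    listener.close()
    try:
        # Step 2: a couple of bytes each way.
        client.sendall(b"hi")
        if server.recv(16) != b"hi":
            raise RuntimeError("client-to-server data did not arrive")
        server.sendall(b"yo")
        if client.recv(16) != b"yo":
            raise RuntimeError("server-to-client data did not arrive")

        # Step 3: no writes on either side (about 120 hours in the report).
        time.sleep(idle_seconds)

        # Step 4: write on the server's socket and see whether the client gets it.
        server.sendall(b"ping")
        client.settimeout(5.0)  # recv raises socket.timeout if nothing arrives
        return client.recv(16)
    finally:
        client.close()
        server.close()
```

Calling idle_connection_probe(120 * 3600) would follow the timing in the
report; with the default of zero it only checks that the plumbing works.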


The libpcap capture file is here (only shows steps 4-6):

    http://highlab.com/~seb/tcp-idleness-bug


The behavior is reproducible on all kernels I've tried: 2.4.32, 2.6.19.1,
and 2.6.20.4.  I don't think it's iptables-related, though I'm rerunning
the tests on a machine without iptables to be sure.  I'll have results
for you in 120 hours.  ;-)


-- 
Sebastian Kuzminsky

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: bug in tcp?
  2007-04-16 22:28 bug in tcp? Sebastian Kuzminsky
@ 2007-04-16 22:50 ` Stephen Hemminger
  2007-04-16 23:05   ` Sebastian Kuzminsky
  2007-04-17  4:44 ` Philip Craig
  1 sibling, 1 reply; 13+ messages in thread
From: Stephen Hemminger @ 2007-04-16 22:50 UTC (permalink / raw)
  To: Sebastian Kuzminsky; +Cc: netdev

What server? Some servers do application timeouts.

-- 
Stephen Hemminger <shemminger@linux-foundation.org>


* Re: bug in tcp?
  2007-04-16 22:50 ` Stephen Hemminger
@ 2007-04-16 23:05   ` Sebastian Kuzminsky
  2007-04-16 23:30     ` Stephen Hemminger
  0 siblings, 1 reply; 13+ messages in thread
From: Sebastian Kuzminsky @ 2007-04-16 23:05 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Stephen Hemminger <shemminger@linux-foundation.org> wrote:
> What server? Some servers do application timeouts.

I've observed the behavior with the server mode of nc, and with a homebrew
application which does not do app-level timeouts.

But anyway, application timeouts wouldn't explain the described behavior,
afaik.


-- 
Sebastian Kuzminsky


* Re: bug in tcp?
  2007-04-16 23:05   ` Sebastian Kuzminsky
@ 2007-04-16 23:30     ` Stephen Hemminger
  2007-04-17  4:19       ` Sebastian Kuzminsky
  2007-04-17  4:30       ` John Heffner
  0 siblings, 2 replies; 13+ messages in thread
From: Stephen Hemminger @ 2007-04-16 23:30 UTC (permalink / raw)
  To: Sebastian Kuzminsky; +Cc: netdev

On Mon, 16 Apr 2007 17:05:42 -0600
Sebastian Kuzminsky <seb@highlab.com> wrote:

> I've observed the behavior with the server mode of nc, and with a homebrew
> application which does not do app-level timeouts.
> 
> But anyway, application timeouts wouldn't explain the described behavior,
> afaik.

A guess: maybe something related to a PAWS wraparound problem.
Does turning off sysctl net.ipv4.tcp_timestamps fix it?
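
As a rough check on the wraparound idea, the arithmetic below estimates
how long a 32-bit TCP timestamp takes to advance by 2**31 ticks, the
point at which a signed comparison of two timestamps flips.  It is a
sketch under assumptions not stated in this thread: that the timestamp
clock ticks at the kernel's HZ, and that 100, 250 and 1000 are the HZ
values worth checking.  The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def paws_horizon_days(ticks_per_second: int) -> float:
    """Days until a 32-bit timestamp has advanced by 2**31 ticks."""
    return (2 ** 31) / ticks_per_second / SECONDS_PER_DAY


idle_days = 120 / 24  # the 120-hour idle period from the report

for hz in (100, 250, 1000):  # assumed tick rates, not taken from the thread
    horizon = paws_horizon_days(hz)
    print(f"HZ={hz}: horizon {horizon:.1f} days, "
          f"idle period shorter: {idle_days < horizon}")
```

Under these assumptions, even the fastest tick rate gives a horizon of
more than 24 days, well beyond the 5 days of idle time in the report.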



* Re: bug in tcp?
  2007-04-16 23:30     ` Stephen Hemminger
@ 2007-04-17  4:19       ` Sebastian Kuzminsky
  2007-04-17  4:30       ` John Heffner
  1 sibling, 0 replies; 13+ messages in thread
From: Sebastian Kuzminsky @ 2007-04-17  4:19 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Sebastian Kuzminsky

Stephen Hemminger <shemminger@linux-foundation.org> wrote:
> A guess: maybe something related to a PAWS wraparound problem.
> Does turning off sysctl net.ipv4.tcp_timestamps fix it?

I just started this test, I'll let you know in 5 days.  :-/

Any other things I should try, anyone?  I'm doing the new tests under
qemu, so it's easy to run lots of tests in parallel.  The previous
reports of this bug all occurred when running natively.

I'm doing the new tests with 2.6.18, but the behavior does not seem
sensitive to the kernel version.

-- 
Sebastian Kuzminsky


* Re: bug in tcp?
  2007-04-16 23:30     ` Stephen Hemminger
  2007-04-17  4:19       ` Sebastian Kuzminsky
@ 2007-04-17  4:30       ` John Heffner
  2007-04-17  4:42         ` Sebastian Kuzminsky
  1 sibling, 1 reply; 13+ messages in thread
From: John Heffner @ 2007-04-17  4:30 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Sebastian Kuzminsky, netdev

Stephen Hemminger wrote:
> A guess: maybe something related to a PAWS wraparound problem.
> Does turning off sysctl net.ipv4.tcp_timestamps fix it?

That was my first thought too (aside from netfilter), but a failed PAWS 
check should not result in a reset..

   -John


* Re: bug in tcp?
  2007-04-17  4:30       ` John Heffner
@ 2007-04-17  4:42         ` Sebastian Kuzminsky
  0 siblings, 0 replies; 13+ messages in thread
From: Sebastian Kuzminsky @ 2007-04-17  4:42 UTC (permalink / raw)
  To: John Heffner; +Cc: Stephen Hemminger, netdev, Sebastian Kuzminsky

John Heffner <jheffner@psc.edu> wrote:
> Stephen Hemminger wrote:
> > A guess: maybe something related to a PAWS wraparound problem.
> > Does turning off sysctl net.ipv4.tcp_timestamps fix it?
> 
> That was my first thought too (aside from netfilter), but a failed PAWS 
> check should not result in a reset..

All three of the systems where I saw the problems earlier did have some
netfilter rules up and running....

All tests I'm doing now have netfilter compiled in but no rules set
(INPUT, FORWARD, and OUTPUT are all empty, and the policies are all
ACCEPT).


-- 
Sebastian Kuzminsky


* Re: bug in tcp?
  2007-04-16 22:28 bug in tcp? Sebastian Kuzminsky
  2007-04-16 22:50 ` Stephen Hemminger
@ 2007-04-17  4:44 ` Philip Craig
  2007-04-17  5:35   ` Sebastian Kuzminsky
  1 sibling, 1 reply; 13+ messages in thread
From: Philip Craig @ 2007-04-17  4:44 UTC (permalink / raw)
  To: Sebastian Kuzminsky; +Cc: netdev

Sebastian Kuzminsky wrote:
> The behavior is reproducible on all kernels I've tried: 2.4.32, 2.6.19.1,
> and 2.6.20.4.  I don't think it's iptables-related, though I'm rerunning
> the tests on a machine without iptables to be sure.  I'll have results
> for you in 120 hours.  ;-)

It sounds like it could easily be iptables related, if you have iptables
rules that only allow new connections in the client to server direction,
which is quite normal.

The default netfilter conntrack timeout for established TCP connections
is 5 days (432000 seconds).  So after 5 days of idle, the conntrack entry
expires, any packets from the server will be treated as a new connection,
and the iptables rules will drop them.
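
The 5-day figure lines up exactly with the 120 hours in the report.  A
small sketch of that arithmetic, plus a way to read the timeout from a
running system, follows.  The two /proc paths are, as far as I know, the
older ip_conntrack and the newer nf_conntrack names for the same
setting; treat them as assumptions, since neither is named in this
thread.  The function returns None when neither file is present (for
example when conntrack is not loaded).

```python
from pathlib import Path
from typing import Optional

# Assumed sysctl locations: older ip_conntrack name, then newer nf_conntrack name.
CANDIDATE_PATHS = (
    Path("/proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_timeout_established"),
    Path("/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established"),
)

FIVE_DAYS_SECONDS = 5 * 24 * 60 * 60  # 432000
REPORTED_IDLE_SECONDS = 120 * 60 * 60  # the 120 hours from the report


def established_timeout_seconds() -> Optional[int]:
    """Return the conntrack timeout for established TCP flows, if readable."""
    for path in CANDIDATE_PATHS:
        try:
            return int(path.read_text().strip())
        except (OSError, ValueError):
            continue  # file missing, unreadable, or not a number
    return None


if __name__ == "__main__":
    print("5 days in seconds:", FIVE_DAYS_SECONDS)
    print("timeout on this machine:", established_timeout_seconds())
```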


* Re: bug in tcp?
  2007-04-17  4:44 ` Philip Craig
@ 2007-04-17  5:35   ` Sebastian Kuzminsky
  2007-04-17  6:13     ` Philip Craig
  0 siblings, 1 reply; 13+ messages in thread
From: Sebastian Kuzminsky @ 2007-04-17  5:35 UTC (permalink / raw)
  To: Philip Craig; +Cc: netdev, Sebastian Kuzminsky

Philip Craig <philipc@snapgear.com> wrote:
> It sounds like it could easily be iptables related, if you have iptables
> rules that only allow new connections in the client to server direction,
> which is quite normal.

Sure, I have those standard rules:

iptables -A INPUT -p tcp -m state --state ESTABLISHED -j ACCEPT
iptables -A INPUT -p tcp --syn --dport ssh -j ACCEPT
iptables -A INPUT -p tcp --syn --dport http -j ACCEPT
... etc


> The default iptables timeout for TCP connections is 5 days.
> So after 5 days of idle, any packets from the server will be treated
> as a new connection and the iptables rules will drop them.

Weird.  Why does sending a message from the client make it go again?

If that's the case, it seems like a simple fix would be to enable TCP
keepalive in my app, that would keep netfilter from timing out, right?
That seems better than extending the netfilter timeout.

How do people normally handle this?


-- 
Sebastian Kuzminsky


* Re: bug in tcp?
  2007-04-17  5:35   ` Sebastian Kuzminsky
@ 2007-04-17  6:13     ` Philip Craig
  2007-04-17 13:56       ` Sebastian Kuzminsky
  0 siblings, 1 reply; 13+ messages in thread
From: Philip Craig @ 2007-04-17  6:13 UTC (permalink / raw)
  To: Sebastian Kuzminsky; +Cc: netdev

Sebastian Kuzminsky wrote:
> Weird.  Why does sending a message from the client make it go again?

The rule that allows packets with an "ESTABLISHED" state only matches
packets for which the connection is in netfilter's conntrack table.
The connection is removed from the table after the 5 days of idle.
It is only added again if netfilter accepts a packet for that connection.
So the packet from the client will cause it to be added.
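
A toy model may make the sequence easier to follow.  It is not how
netfilter is implemented: the ToyFirewall class, the port numbers, and
in particular the rule that accepts packets addressed to the server port
are hypothetical stand-ins for whatever rule accepted the client's
packet in the real rule set.

```python
class ToyFirewall:
    """Toy model of a conntrack table in front of an ESTABLISHED-only rule set."""

    def __init__(self, server_port: int) -> None:
        self.server_port = server_port
        self.tracked = set()  # connections currently in the conntrack table

    @staticmethod
    def _key(src_port: int, dst_port: int) -> frozenset:
        # Direction-independent key: both directions share one table entry.
        return frozenset((src_port, dst_port))

    def track(self, src_port: int, dst_port: int) -> None:
        self.tracked.add(self._key(src_port, dst_port))

    def expire(self, src_port: int, dst_port: int) -> None:
        # Models the entry being removed after the idle timeout.
        self.tracked.discard(self._key(src_port, dst_port))

    def accept(self, src_port: int, dst_port: int) -> bool:
        key = self._key(src_port, dst_port)
        if key in self.tracked:
            return True  # matches the "--state ESTABLISHED" rule
        if dst_port == self.server_port:
            # HYPOTHETICAL rule: some other rule accepts client-to-server packets.
            self.tracked.add(key)  # an accepted packet re-creates the entry
            return True
        return False  # nothing matched, so the entry is not re-created


fw = ToyFirewall(server_port=5000)  # port numbers are arbitrary
fw.track(40000, 5000)               # connection established and tracked
fw.expire(40000, 5000)              # 5 days of idle

step4 = fw.accept(5000, 40000)      # server -> client: not tracked, rejected
step5 = fw.accept(40000, 5000)      # client -> server: accepted, entry re-added
step6 = fw.accept(5000, 40000)      # server retransmit: tracked again, accepted
print(step4, step5, step6)
```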

> If that's the case, it seems like a simple fix would be to enable TCP
> keepalive in my app, that would keep netfilter from timing out, right?
> That seems better than extending the netfilter timeout.

Yes, that would work.

> How do people normally handle this?

Change the timeout or use keepalives.  I can't think of any other way.
The 5 days is a compromise between keeping valid connections and
timing out dead connections.  There will always be connections for
which it times out too fast or too slow.  I don't think there are
any drawbacks to increasing the timeout if you aren't a router,
but as long as there is a timeout, you need keepalives to be sure.
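
For the keepalive option, a minimal sketch in Python follows.  The
helper name enable_keepalive and the timing values are illustrative
choices, not anything from this thread.  SO_KEEPALIVE is the portable
switch; TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are per-socket
tuning knobs that exist on Linux and are skipped here when the platform
does not expose them.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 3600,
                     interval: int = 60, probes: int = 5) -> None:
    """Turn on TCP keepalive; tune the timing where the platform allows it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket tuning (seconds before the first probe, seconds between
    # probes, unanswered probes before the connection is dropped).
    tuning = (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", probes),
    )
    for name, value in tuning:
        option = getattr(socket, name, None)  # None where not supported
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive on:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

Any idle value comfortably below the 5-day conntrack timeout should keep
the entry alive.  As far as I know, the Linux system-wide default for the
first probe (net.ipv4.tcp_keepalive_time) is 7200 seconds, so plain
SO_KEEPALIVE without tuning would already be enough.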


* Re: bug in tcp?
  2007-04-17  6:13     ` Philip Craig
@ 2007-04-17 13:56       ` Sebastian Kuzminsky
  2007-04-18  0:03         ` Philip Craig
  2007-04-23 18:45         ` bug in my understanding (was Re: bug in tcp?) Sebastian Kuzminsky
  0 siblings, 2 replies; 13+ messages in thread
From: Sebastian Kuzminsky @ 2007-04-17 13:56 UTC (permalink / raw)
  To: Philip Craig; +Cc: netdev, Sebastian Kuzminsky

Philip Craig <philipc@snapgear.com> wrote:
> Sebastian Kuzminsky wrote:
> > Weird.  Why does sending a message from the client make it go again?
> 
> The rule that allows packets with an "ESTABLISHED" state only matches
> packets for which the connection is in netfilter's conntrack table.
> The connection is removed from the table after the 5 days of idle.
> It is only added again if netfilter accepts a packet for that connection.
> So the packet from the client will cause it to be added.

Why did the packet from the client cause the connection to be added back
to the conntrack table, but the packet from the server did not?


> > How do people normally handle this?
> 
> Change the timeout or use keepalives.  I can't think of any other way.
> The 5 days is a compromise between keeping valid connections and
> timing out dead connections.  There will always be connections for
> which it times out too fast or too slow.  I don't think there are
> any drawbacks to increasing the timeout if you aren't a router,
> but as long as there is a timeout, you need keepalives to be sure.

Thanks!  I'll add keepalives and rerun the tests, and I expect the
problem to go away.


-- 
Sebastian Kuzminsky


* Re: bug in tcp?
  2007-04-17 13:56       ` Sebastian Kuzminsky
@ 2007-04-18  0:03         ` Philip Craig
  2007-04-23 18:45         ` bug in my understanding (was Re: bug in tcp?) Sebastian Kuzminsky
  1 sibling, 0 replies; 13+ messages in thread
From: Philip Craig @ 2007-04-18  0:03 UTC (permalink / raw)
  To: Sebastian Kuzminsky; +Cc: netdev

Sebastian Kuzminsky wrote:
> Why did the packet from the client cause the connection to be added back
> to the conntrack table, but the packet from the server did not?

Because the packet from the client was accepted (by a different
iptables rule).



* bug in my understanding (was Re: bug in tcp?)
  2007-04-17 13:56       ` Sebastian Kuzminsky
  2007-04-18  0:03         ` Philip Craig
@ 2007-04-23 18:45         ` Sebastian Kuzminsky
  1 sibling, 0 replies; 13+ messages in thread
From: Sebastian Kuzminsky @ 2007-04-23 18:45 UTC (permalink / raw)
  To: netdev

Sebastian Kuzminsky <seb@highlab.com> wrote:
> Philip Craig <philipc@snapgear.com> wrote:
> > Change the timeout or use keepalives.  I can't think of any other way.
> > The 5 days is a compromise between keeping valid connections and
> > timing out dead connections.  There will always be connections for
> > which it times out too fast or too slow.  I don't think there are
> > any drawbacks to increasing the timeout if you aren't a router,
> > but as long as there is a timeout, you need keepalives to be sure.
> 
> Thanks!  I'll add keepalives and rerun the tests, and I expect the
> problem to go away.

I reran the tests with keepalive enabled and it worked just fine.
Thanks for all your help, and sorry for the false alarm!


-- 
Sebastian Kuzminsky


