* TCP hangs in 2.4.20-pre11 (and 2.4.19)
From: Miquel van Smoorenburg @ 2002-10-29 16:51 UTC
To: linux-kernel
I have one machine rsync'ing a Debian mirror through another machine
which runs an HTTPS proxy - rsync can use the HTTP CONNECT method to
proxy-connect to a remote rsyncd.
On the gateway machine, the proxy consistently hangs in a write().
I've replaced the squid proxy with a simple perl script + nc to
make sure it isn't a squid-related problem.
# netstat -t | grep ftp.debian.nl
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address     State
tcp    70832  14480 cust.94.89.adsl.c:33876 ftp.debian.nl:rsync ESTABLISHED
# ps -eo pid,tt,user,fname,tmout,f,wchan | grep nc
26782 ? nobody nc - 100 wait_for_tcp_memory
# strace -fp 26782
write(3, "\236\3754\"OU\331\6\36d\33;\270\3776\221\21\364Z\370/\346"..., 3384 <unfinished ...>
# tcpdump -v -e -i eth1 -n host ftp.debian.nl
tcpdump: listening on eth1
17:36:40.946370 0:10:67:0:f8:8f 0:d0:b7:b2:67:82 0800 66: 195.64.82.85.873 > 195.64.94.89.33876: . [tcp sum ok] ack 2887846970 win 0 <nop,nop,timestamp 77491485 14953830> (DF) (ttl 61, id 19377, len 52)
17:36:40.946431 0:d0:b7:b2:67:82 0:10:67:0:f8:8f 0800 66: 195.64.94.89.33876 > 195.64.82.85.873: . [tcp sum ok] ack 1 win 0 <nop,nop,timestamp 14965835 77480589> (DF) (ttl 64, id 30547, len 52)
17:36:51.985379 0:d0:b7:b2:67:82 0:10:67:0:f8:8f 0800 66: 195.64.94.89.33876 > 195.64.82.85.873: . [tcp sum ok] ack 1 win 0 <nop,nop,timestamp 14966939 77480589> (DF) (ttl 64, id 30548, len 52)
17:36:52.032399 0:10:67:0:f8:8f 0:d0:b7:b2:67:82 0800 66: 195.64.82.85.873 > 195.64.94.89.33876: . [tcp sum ok] ack 1 win 0 <nop,nop,timestamp 77492593 14965835> (DF) (ttl 61, id 19378, len 52)
17:38:40.984930 0:10:67:0:f8:8f 0:d0:b7:b2:67:82 0800 66: 195.64.82.85.873 > 195.64.94.89.33876: . [tcp sum ok] ack 1 win 0 <nop,nop,timestamp 77503488 14965835> (DF) (ttl 61, id 19379, len 52)
17:38:40.984987 0:d0:b7:b2:67:82 0:10:67:0:f8:8f 0800 66: 195.64.94.89.33876 > 195.64.82.85.873: . [tcp sum ok] ack 1 win 0 <nop,nop,timestamp 14977838 77492593> (DF) (ttl 64, id 30549, len 52)
.. etc. It never recovers from this.
The machine has 2 eepro100 network cards. One is connected to the
office network (which has its own 2 Mbit connection to the outside
world), the other is connected to an ADSL 8 Mbit/s line. Both
connections to the outside world show the same problem. I've tried
both eepro100 and e100 drivers with the same results. 2.4.20-pre11
and 2.4.19 both behave the same. There is no firewall in the
path anywhere. And it doesn't matter which remote
rsync server I use, the connection always eventually hangs.
Wait - I do have a dual routing table setup, so that traffic originating
from the ADSL IP goes out over the ADSL line:
# ip rule show
0: from all lookup local
32700: from 195.64.94.89 lookup adsl
32766: from all lookup main
32767: from all lookup default
# ip route show table adsl
default via 195.64.94.1 dev eth1
The machine is a PII/300 with 192 MB of RAM, of which it has
plenty free.
What can this be? Especially the netstat output looks weird.
Mike.
* Re: TCP hangs in 2.4 - blocking write() in wait_for_tcp_memory
From: Miquel van Smoorenburg @ 2002-10-30 10:44 UTC
To: linux-kernel
In article <apme9u$n2n$1@ncc1701.cistron.net>,
Miquel van Smoorenburg <miquels@cistron.nl> wrote:
>On the gateway machine, the proxy consistently hangs in a write().
>I've replaced the squid proxy with a simple perl script + nc to
>make sure it isn't a squid-related problem.
Right, I found the cause of the problem, but I'm not sure if the
application or the kernel is wrong here.
On 2 machines do this:
machine1# socket -s 12345 < /dev/zero > /dev/null # server
machine2# socket -w machine1 12345 < /dev/zero # client
The first command starts a listening process on port 12345, that
sends an infinite stream of zeros to the remote side and sinks
all data received.
The second command connects to the first machine, sends an
infinite stream of zeros, but never does a read() on the socket
(the '-w' option).
The 'socket' program doesn't make the sockets non-blocking, it just
does a select() loop to find out readability/writeability on the
file descriptors.
This makes both socket programs hang in write(), in wait_for_tcp_memory.
Shouldn't the kernel return a short write, instead of hanging
both processes ? select() returned writeability.
As I described in my first mail, this happens in the real world
as well - an application is writing lots of data to the remote
side, while the remote side is sending data too - hang.
Oh, tested it on 2.4.19 and 2.4.20-pre11
Mike.
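
A minimal sketch of the mismatch described in this message, written in
Python rather than with socket(1). It assumes a local socketpair() stands
in for the two machines, and it puts the writer in non-blocking mode so
that the call returns a byte count instead of sleeping. The helper name
writable_then_send and the 8 MiB size are made up for illustration.

```python
import select
import socket

BIG = 8 * 1024 * 1024  # hypothetical size, chosen to exceed the send buffer


def writable_then_send(sock, size):
    """Return (select_said_writable, bytes_accepted) for one send() attempt.

    sock must be non-blocking, otherwise send() may sleep instead of
    returning a partial count.
    """
    # Poll for writability without waiting (timeout 0).
    _, writable, _ = select.select([], [sock], [], 0)
    try:
        accepted = sock.send(b"\0" * size)
    except BlockingIOError:
        # Nothing at all could be queued.
        accepted = 0
    return bool(writable), accepted


if __name__ == "__main__":
    writer, peer = socket.socketpair()  # peer never reads, like 'socket -w'
    writer.setblocking(False)
    ok, taken = writable_then_send(writer, BIG)
    print(f"select() writable: {ok}, asked for {BIG}, kernel took {taken}")
    writer.close()
    peer.close()
```

select() reporting writability only says that some buffer space is free,
not that the whole request fits. With O_NONBLOCK clear, the same send()
would be expected to sleep until the peer drains data, which is the hang
seen here when the peer is itself stuck in a write().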
* Re: TCP hangs in 2.4 - blocking write() in wait_for_tcp_memory
From: bert hubert @ 2002-10-30 12:28 UTC
To: Miquel van Smoorenburg; +Cc: linux-kernel
On Wed, Oct 30, 2002 at 10:44:20AM +0000, Miquel van Smoorenburg wrote:
> This makes both socket programs hang in write(), in wait_for_tcp_memory.
> Shouldn't the kernel return a short write, instead of hanging
> both processes ? select() returned writeability.
write(2) is allowed to do a short write on a blocking socket, but not
mandated to do so. In fact I've only seen short writes under
Linux on non-blocking sockets.
SUSv3 says:
Blocking/immediate: Blocking is only possible with O_NONBLOCK clear. If
there is enough space for all the data requested to be written immediately,
the implementation should do so. Otherwise, the process may block; that is,
pause until enough space is available for writing. The effective size of a
pipe or FIFO (the maximum amount that can be written in one operation
without blocking) may vary dynamically, depending on the implementation, so
it is not possible to specify a fixed value for it.
...
Partial and deferred writes are only possible with O_NONBLOCK set.
Regards,
bert
--
http://www.PowerDNS.com Versatile DNS Software & Services
http://lartc.org Linux Advanced Routing & Traffic Control HOWTO
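
One way an application can cope with the semantics quoted above is to set
O_NONBLOCK and treat every write as possibly partial. The following is a
sketch under that assumption, in Python; the function name send_all and
its timeout parameter are hypothetical and not taken from any program
mentioned in this thread.

```python
import select
import socket


def send_all(sock, data, timeout=5.0):
    """Write all of data to a non-blocking socket, tolerating short writes.

    Returns the number of send() calls that accepted fewer bytes than
    offered. Raises TimeoutError if the socket does not become writable
    within `timeout` seconds, for example because the peer is not reading.
    """
    view = memoryview(data)
    short_writes = 0
    while len(view) > 0:
        # Sleep in select(), never in send().
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            raise TimeoutError("socket not writable; peer may not be reading")
        try:
            sent = sock.send(view)
        except BlockingIOError:
            continue  # buffer filled between select() and send(); wait again
        if sent < len(view):
            short_writes += 1
        view = view[sent:]  # keep only the unsent tail
    return short_writes


if __name__ == "__main__":
    a, b = socket.socketpair()
    a.setblocking(False)
    print("short writes:", send_all(a, b"hello"))
    print("peer got:", b.recv(16))
    a.close()
    b.close()
```

The timeout turns a peer that never reads into an error the caller can
act on, instead of a process parked in write() indefinitely.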
* Re: TCP hangs in 2.4 - blocking write() in wait_for_tcp_memory
From: Miquel van Smoorenburg @ 2002-10-30 12:44 UTC
To: linux-kernel
In article <20021030122812.GA5182@outpost.ds9a.nl>,
bert hubert <ahu@ds9a.nl> wrote:
>On Wed, Oct 30, 2002 at 10:44:20AM +0000, Miquel van Smoorenburg wrote:
>
>> This makes both socket programs hang in write(), in wait_for_tcp_memory.
>> Shouldn't the kernel return a short write, instead of hanging
>> both processes ? select() returned writeability.
>
>write(2) is allowed to do a short write on a blocking socket, but not
>mandated to do so. In fact I've only seen short writes under
>Linux on non-blocking sockets.
>
>SUSv3 says:
Ah, that's interesting:
> Blocking/immediate: Blocking is only possible with O_NONBLOCK clear. If
> there is enough space for all the data requested to be written immediately,
> the implementation should do so. Otherwise, the process may block; that is,
Okay, _may_ block. That's what I needed to know. So it's not a kernel
bug, but a bug in applications like socket(1) and nc(1).
With squid I see corrupted downloads, which probably means squid does
use non-blocking sockets but doesn't handle short writes correctly.
> Partial and deferred writes are only possible with O_NONBLOCK set.
Thanks for the clarification.
Now I must go fix all those programs :/
Mike.
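
A sketch of what such a fix could look like for an nc-style relay, in
Python. It assumes two already-connected stream sockets and is not the
actual change made to socket(1), nc(1), or squid; the names relay and BUF
are hypothetical. Both sockets are non-blocking, unsent data is queued per
direction, and only the bytes that send() reports as accepted are dropped
from the queue.

```python
import select
import socket

BUF = 65536  # hypothetical read size and per-direction queue limit


def relay(a, b):
    """Copy bytes both ways between sockets a and b until both reach EOF.

    The loop sleeps only in select(), so a full send buffer in one
    direction cannot stop the other direction from being read.
    """
    a.setblocking(False)
    b.setblocking(False)
    pending = {a: bytearray(), b: bytearray()}  # bytes waiting to go out on key
    peer = {a: b, b: a}
    open_for_read = {a, b}
    shut = set()
    while open_for_read or pending[a] or pending[b]:
        # Read from a side only while the queue towards its peer is short.
        rlist = [s for s in open_for_read if len(pending[peer[s]]) < BUF]
        wlist = [s for s in (a, b) if pending[s]]
        readable, writable, _ = select.select(rlist, wlist, [])
        for s in readable:
            try:
                chunk = s.recv(BUF)
            except BlockingIOError:
                continue
            if chunk:
                pending[peer[s]] += chunk
            else:
                open_for_read.discard(s)  # EOF from this side
        for s in writable:
            try:
                sent = s.send(pending[s])
            except BlockingIOError:
                continue
            del pending[s][:sent]  # drop only what the kernel accepted
        for s in (a, b):
            # Source finished and queue flushed: pass the EOF on.
            if peer[s] not in open_for_read and not pending[s] and s not in shut:
                s.shutdown(socket.SHUT_WR)
                shut.add(s)
```

Error handling (for example a peer resetting the connection) is left out
to keep the sketch short. The point is the short-write handling: deleting
the whole queue after a partial send() would lose data, which would be
consistent with the corrupted downloads mentioned above.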