* Re: webserver stalls [was Re: bug in (linux) slattach]
From: jb1 @ 2002-10-22 10:16 UTC (permalink / raw)
  To: Harry Kalogirou; +Cc: Linux-8086

On 21 Oct 2002, Harry Kalogirou wrote:

> Mmm.. weird.. I probably got you tired with all this but can you try and
> see if the failures are really random? A good aid for this is the -p
> parameter of ping.

100 pings (200 packets) each of patterns 00, 55, aa, and ff had zero to 
five errors, too few to account for the 100 percent failure rate of 
certain webpage files. 55 had the most errors and was the only one with an 
error in the pattern data. Most of the other errors were something about 
the time-of-day going back; 00 had one extremely long response time 
(1074131 ms).


I think I can now prove that there's at least one IP Header sum-with-carry
that results in a reproducible checksum error. I discovered that if the
ELKS IP address were 192.168.1.135, all my test files could be read; large
files required a few tries, but I was even able to read one 4369 (0x1111)
bytes long! The unique property of the packets that never got ACK'ed is 
that their checksum-field contains 0xF6FF instead of the correct value 
0xF5FF (the complement of 0x0A00).

Each of the webpage files that stall produces a defective packet with this 
IP Header (the first twenty bytes of the packet):
	4500 003f 0000 0000 4006 f6ff c0a8 0164 c0a8 0205
The corresponding packet in the 99-byte file is one byte shorter (003e 
instead of 003f), consequently having a different IP Header Checksum 
(f600 instead of the erroneous f6ff):
	4500 003e 0000 0000 4006 f600 c0a8 0164 c0a8 0205
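The two headers above can be checked by hand or with a short script. The following is only a sketch in Python, not ELKS code; the function names are made up for illustration. It adds the ten 16-bit header words with end-around carry, treating the checksum field as zero, and complements the result:

```python
def ones_complement_sum(words):
    """Add 16-bit words, folding any carry back into the low 16 bits."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ip_header_checksum(header_hex):
    """Checksum a 20-byte IPv4 header should carry, given as ten hex words.

    The value already present in the checksum field is ignored.
    """
    words = [int(w, 16) for w in header_hex.split()]
    words[5] = 0  # word 5 (bytes 10-11) is the checksum field
    return ~ones_complement_sum(words) & 0xFFFF


# Headers copied from the message text.
stalled = "4500 003f 0000 0000 4006 f6ff c0a8 0164 c0a8 0205"
working = "4500 003e 0000 0000 4006 f600 c0a8 0164 c0a8 0205"

for name, header in (("stalled", stalled), ("working", working)):
    carried = header.split()[5]
    print("%s: carried %s, expected %04x" % (name, carried, ip_header_checksum(header)))
```

For the stalled header this gives an expected checksum of f5ff against the f6ff actually carried; for the working header both values are f600.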

Ping uses Protocol 01 instead of Protocol 06, so by changing the ELKS IP 
address from 192.168.1.100 to 192.168.1.105 I was able to produce the 
identical erroneous IP Header Checksum with the command:
	ping -s 35 192.168.1.105
resulting in the IP header:
	4500 003f 0000 0000 4001 f6ff c0a8 0169 c0a8 0205

To demonstrate that the problem is not the total packet size I added 1 to 
the packetsize and subtracted 1 from the ELKS IP address:
	ping -s 36 192.168.1.104
resulting in the IP Header:
	4500 0040 0000 0000 4001 f6ff c0a8 0168 c0a8 0205

Just for symmetry, I produced the same checksum as that for the 99-byte 
webpage file, but the same length as the 100- and 266-byte webpage files 
with:
	ping -s 35 192.168.1.104
resulting in the IP Header:
	4500 003f 0000 0000 4001 f600 c0a8 0168 c0a8 0205
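The three ping experiments can be recomputed the same way. The sketch below is Python, with a hypothetical helper name. It assumes the reply header has the fixed fields seen in the dumps (ID 0, no fragment bits, TTL 0x40, protocol 1), that the Total Length is 20 bytes of IP header plus 8 bytes of ICMP header plus the ping -s size, and that the other address is 192.168.2.5:

```python
import socket
import struct


def reply_header_checksum(payload_size, elks_ip, peer_ip="192.168.2.5"):
    """Return (total_length, expected_checksum) for an ICMP packet sent by ELKS.

    Field values other than length and addresses are assumptions taken
    from the header dumps in the message.
    """
    total_length = 20 + 8 + payload_size  # IP header + ICMP header + data
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0x00,            # version/IHL, type of service
        total_length,
        0, 0,                  # identification, flags/fragment offset
        0x40, 1,               # TTL, protocol (1 = ICMP)
        0,                     # checksum field zeroed for the calculation
        socket.inet_aton(elks_ip),
        socket.inet_aton(peer_ip),
    )
    total = sum(struct.unpack("!10H", header))
    while total >> 16:         # fold carries back in (end-around carry)
        total = (total & 0xFFFF) + (total >> 16)
    return total_length, ~total & 0xFFFF


for size, ip in ((35, "192.168.1.105"), (36, "192.168.1.104"), (35, "192.168.1.104")):
    length, checksum = reply_header_checksum(size, ip)
    print("ping -s %d %s: length %04x, expected checksum %04x" % (size, ip, length, checksum))
```

Under those assumptions the first two cases should carry f5ff (where f6ff was observed) and the third f600.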

In all cases, the pings with the defective checksum had 100% loss, while 
those with the good checksum succeeded. I didn't try manipulating the 
source IP address (c0a8 0205 = 192.168.2.5). If you can manipulate the 
packetsize and ELKS IP address so that the sum-with-carry of this header 
sans checksum-field is 0x0A00 (0x09C1 before the 0x003f length word is 
added) you should be able to reproduce my results; otherwise it's probably 
a quirk in Red Hat 7.0 Linux (or you're using a different version of some 
critical ELKS file).
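For anyone wanting to reproduce this with other addresses, a small search can list candidate ping commands. This is again only a Python sketch with hypothetical names. It assumes the same fixed header fields as the dumps above and the same peer address, and it takes the target to be a header sum of 0x0A00, the value whose complement (0xF5FF) is given earlier as the correct checksum:

```python
import socket
import struct

TARGET_SUM = 0x0A00  # header sum, checksum field zeroed, of the failing packets


def header_sum(payload_size, elks_ip, peer_ip="192.168.2.5"):
    """Sum-with-carry of the assumed 20-byte header, checksum field zeroed."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0x00, 28 + payload_size, 0, 0, 0x40, 1, 0,
        socket.inet_aton(elks_ip), socket.inet_aton(peer_ip),
    )
    total = sum(struct.unpack("!10H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def failing_ping_sizes(elks_ip, max_size=1472):
    """ping -s values whose header sum equals TARGET_SUM for this ELKS address."""
    return [n for n in range(max_size + 1) if header_sum(n, elks_ip) == TARGET_SUM]


for last_octet in range(100, 108):
    ip = "192.168.1.%d" % last_octet
    for size in failing_ping_sizes(ip):
        print("ping -s %d %s" % (size, ip))
```

If your peer address or header fields differ, change the defaults accordingly; the sizes listed will shift.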

SOURCE PACKAGES:
        elks-0.1.1.tar.gz, elkscmd_20020501.tar.gz, elksnet-0.1.1.tar.gz,
        Dev86src-0.16.0.tar.gz
CVS PATCHES:
        (none)
COMPILED UNDER:
        Red Hat 7.0 Linux, kernel 2.2.16-22

Note: I think bad packets consume memory. After several unsuccessful 
transfers I started seeing "Cannot fork" on the ELKS box when I issued 
commands ... eventually I'd have to reboot it. It might be a good idea to 
purge them after a minute or two.

Does anything other than the system time depend upon the CMOS clock? It 
obviously hasn't been read on any of the four machines on which I tried 
ELKS (yes, they all *have* standard, working CMOS clocks).


By the way, I received two copies of this message in addition to the copy 
sent from the mailing list.

