* 2.4.18 userspace seeing UDP packets with bad checksum?
From: bert hubert @ 2002-07-30 10:48 UTC
To: netdev

I'm under the strong impression that 2.4.18 lets userspace see packets with
incorrect UDP checksums. The packet attached arrived in userspace after
travelling all the way from Taiwan.

Is this policy?

Regards,

bert

--
http://www.PowerDNS.com          Versatile DNS Software & Services
http://www.tk                    the dot in .tk
http://lartc.org                 Linux Advanced Routing & Traffic Control HOWTO

* Re: 2.4.18 userspace seeing UDP packets with bad checksum?
From: kuznet @ 2002-07-30 12:40 UTC
To: bert hubert; +Cc: netdev

Hello!

> I'm under the strong impression that 2.4.18 lets userspace see packets with
> incorrect UDP checksums.

How did you get this impression?

> Is this policy?

This is impossible unless requested explicitly with SO_NO_CHECK
or a buggy hardware incorrectly reports checksum is valid.

Alexey

* 3c59x 2.4.18 userspace seeing UDP packets with bad checksum?
From: bert hubert @ 2002-07-30 13:14 UTC
To: kuznet; +Cc: netdev, akpm, jgarzik, becker

On Tue, Jul 30, 2002 at 04:40:38PM +0400, kuznet@ms2.inr.ac.ru wrote:
> Hello!
>
> > I'm under the strong impression that 2.4.18 lets userspace see packets with
> > incorrect UDP checksums.
>
> How did you get this impression?

The hardware:

3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
01:02.0: 3Com PCI 3c905C Tornado at 0xd800. Vers LK1.1.16

01:02.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 78)

The packet is subtly corrupted and contains an invalid DNS label which our
nameserver tripped over (oops). It looks like a single byte error.

PowerDNS lives inside a wrapper, once every second the wrapper calls wait(),
to see if the child is well:

Jul 30 07:04:44 knife pdns-powerdns[2983]: Our pdns instance (6595) exited after signal 11

These are the relevant packets, grouped by question/answer.

07:04:42.902162 200.171.175.165.14760 > 213.244.168.217.53: [udp sum ok] 4170 CNAME? ifm.com.br. [|domain] (ttl 112, id 41767, len 56)
07:04:42.902198 213.244.168.217.53 > 200.171.175.165.14760: [udp sum ok] 4170*- q: CNAME? ifm.com.br. 0/0/0 (28) (DF) (ttl 64, id 0, len 56)
==
07:04:43.147215 202.239.113.18.56146 > 213.244.168.217.53: [udp sum ok] 32295 A? failte.powernap.org. [|domain] (ttl 241, id 11501, len 65)
07:04:43.149494 213.244.168.217.53 > 202.239.113.18.56146: [udp sum ok] 32295*- q: A? failte.powernap.org. 1/0/0 failte.powernap.org. A 213.106.2.65 (53) (DF) (ttl 64, id 0, len 81)
==

This is the packet I mean. Note that no answers are sent out after this one:

07:04:43.505166 61.222.31.205.62361 > 213.244.168.217.53: [bad udp cksum 25f1!] 49 op5 [2a][|domain] (ttl 112, id 51330, len 109)
==
07:04:43.698853 194.25.2.147.34441 > 213.244.168.217.53: [udp sum ok] 53889 [1au] AAAA? DNS-EU1.POWERDNS.NET. . OPT UDPsize=4096 (49) (DF) (ttl 248, id 23358, len 77)
==
07:04:43.699074 194.25.2.147.34441 > 213.244.168.217.53: [udp sum ok] 32420 [1au] A6 ? DNS-EU1.POWERDNS.NET. . OPT UDPsize=4096 (49) (DF) (ttl 248, id 23359, len 77)
==

Until a few seconds later when the parent respawns a new PowerDNS.

> > Is this policy?
>
> This is impossible unless requested explicitly with SO_NO_CHECK
> or a buggy hardware incorrectly reports checksum is valid.

We don't supply SO_NO_CHECK. As the driver source mentions hardware
checksumming I've cc'd in Andrew, Donald & Jeff.

Regarding Andi's message, isn't it so that recvfrom() may return but in that
case returns -1 and sets errno to EAGAIN? I've seen that when trying to
reproduce this bug by using tcpreplay.

Anything I can do to help, let me know. I get in the order of 20 of these
corrupted packets each night from our Taiwanese friends at Hinet, and only
at night.

Netstat -s output after 81 days:

Udp:
    112974784 packets received
    10386 packets to unknown port received.
    3840 packet receive errors
    112889199 packets sent

Regards,

bert

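For readers following along, the check behind tcpdump's "[udp sum ok]" and
"[bad udp cksum ...]" annotations can be sketched in a few lines of C. This is
only a minimal illustration, assuming IPv4 and a datagram already sitting in
memory; the function names (csum_add, csum_fold, udp_checksum_ok) are made up
for this note and are not taken from the kernel or from tcpdump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add `len` bytes to a running one's-complement sum, reading the data as
 * big-endian 16-bit words; an odd trailing byte is padded with zero. */
static uint32_t csum_add(uint32_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len)
        sum += (uint32_t)p[0] << 8;
    return sum;
}

/* Fold the carries back in and return the 16-bit one's-complement sum. */
static uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Returns 1 if the UDP datagram (8-byte header plus payload, `len` bytes)
 * checks out against the IPv4 pseudo-header, 0 otherwise. Addresses are
 * 4 bytes each in network order. A zero checksum field means the sender
 * did not compute one, which is accepted here. */
int udp_checksum_ok(const uint8_t src[4], const uint8_t dst[4],
                    const uint8_t *udp, size_t len)
{
    assert(len >= 8 && len <= 0xffff);
    if (udp[6] == 0 && udp[7] == 0)
        return 1;

    uint8_t pseudo[12] = {
        src[0], src[1], src[2], src[3],
        dst[0], dst[1], dst[2], dst[3],
        0, 17,                              /* zero byte, protocol 17 = UDP */
        (uint8_t)(len >> 8), (uint8_t)len   /* UDP length */
    };
    uint32_t sum = csum_add(0, pseudo, sizeof pseudo);
    sum = csum_add(sum, udp, len);
    return csum_fold(sum) == 0xffff;        /* all ones when intact */
}
```

A single flipped bit in the payload changes the folded sum, so a "single byte
error" of the kind described above would normally be caught by this test when
it is actually run.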
* Re: 3c59x 2.4.18 userspace seeing UDP packets with bad checksum?
From: kuznet @ 2002-07-30 13:31 UTC
To: bert hubert; +Cc: netdev, akpm, jgarzik, becker

Hello!

> Regarding Andi's message, isn't it so that recvfrom() may return but in that
> case returns -1 and sets errno to EAGAIN?

It should if we calculated this checksum. But 3com pretends to do this
in hardware. :-)

> Anything I can do to help,

Well, try to prove that corrupted packet is really received by application.
This is not useless work, in any case, you have to cure this place,
it should not abort because of invalid data. :-)

As a faster hint try to disable rx checksumming in the driver
and look at the effect. I do not see module option to do this,
so probably you have just to comment out the place where
skb->ip_summed is set to CHECKSUM_UNNECESSARY.

Alexey

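The recvfrom()/EAGAIN behaviour discussed here matters to any server that
select()s on a UDP socket: readiness can be reported for a datagram that the
kernel then discards when the software checksum fails. A minimal sketch of a
receive helper that tolerates this, assuming a non-blocking socket, might look
as follows. The name read_one_datagram and its return convention are
hypothetical and are not PowerDNS code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Try to read one datagram from a non-blocking socket.
 * Returns  1 and stores the size in *len if a datagram was read,
 *          0 if nothing usable is queued (EAGAIN/EWOULDBLOCK), which can
 *            happen even right after select() reported the socket readable,
 *         -1 on any other error (errno is left set by recvfrom()). */
int read_one_datagram(int fd, void *buf, size_t cap, size_t *len)
{
    assert(buf != NULL && len != NULL);
    for (;;) {
        ssize_t n = recvfrom(fd, buf, cap, 0, NULL, NULL);
        if (n >= 0) {
            *len = (size_t)n;
            return 1;
        }
        if (errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return 0;               /* stale readiness: go back to select() */
        return -1;
    }
}
```

The caller treats a return of 0 as "nothing to do" rather than as an error,
so a dropped datagram costs one extra trip through the event loop.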
* Re: 3c59x 2.4.18 userspace seeing UDP packets with bad checksum?
From: jamal @ 2002-07-30 13:40 UTC
To: kuznet; +Cc: bert hubert, netdev, akpm, jgarzik, becker

On Tue, 30 Jul 2002 kuznet@ms2.inr.ac.ru wrote:

> Hello!
>
> > Regarding Andi's message, isn't it so that recvfrom() may return but in that
> > case returns -1 and sets errno to EAGAIN?
>
> It should if we calculated this checksum. But 3com pretends to do this
> in hardware. :-)
>

It doesn't seem like the NIC is kaput, given that he only sees the problems
with the Taiwanese site (unless everyone else sends no checksums).
More like: what is in the data these guys send that is activating this stuff?
Certainly that will become more obvious when he comments out
CHECKSUM_UNNECESSARY in the driver.

BTW, it would be nice if the driver also reported checksum errors.
It seems to report only checksum hits right now, on close().
Is this something we need to add into generic stats? It would also be nice
to see a breakdown by TCP and UDP. A NIC/driver that doesn't implement that
feature doesn't display those parameters.

cheers,
jamal

* Re: 3c59x 2.4.18 userspace seeing UDP packets with bad checksum?
From: bert hubert @ 2002-07-30 13:49 UTC
To: kuznet; +Cc: netdev, akpm, jgarzik, becker

On Tue, Jul 30, 2002 at 05:31:44PM +0400, kuznet@ms2.inr.ac.ru wrote:

> Well, try to prove that corrupted packet is really received by application.
> This is not useless work, in any case, you have to cure this place,
> it should not abort because of invalid data. :-)

We fixed the bug already :-) Signed/unsigned arithmetic problems.

I have six examples in my log of PowerDNS exiting just after receiving a
packet with a broken checksum. Now, I know that this is particle physics kind
of statistical evidence :-) but I think it is pretty solid.

Furthermore, I see answers being sent out to questions with broken
checksums. Even more enticing is that all bad checksums come from a small
number of hosts, 61.222.31.205, 64.58.142.2 and 202.106.0.21. So this would
be pretty unlikely for a tcpdump bug.

Here are some answers to questions with a bad udp checksum. Note that the DNS
question id (8, 9 and 8 in this case) matches over question and answer, which
is pretty conclusive evidence that userspace saw the packet.

08:06:53.097420 61.222.31.205.63240 > 213.244.168.217.53: [bad udp cksum 4c83!] 8 op5 [2a][|domain] (ttl 112, id 444, len 109)
08:06:53.097693 213.244.168.217.53 > 61.222.31.205.63240: [udp sum ok] 8 op5 NotImp*- q:[|domain] (DF) (ttl 64, id 0, len 52)
==
02:48:03.404865 61.222.31.205.62869 > 213.244.168.217.53: [bad udp cksum 7a20!] 9 op5 [2a][|domain] (ttl 112, id 1277, len 109)
02:48:03.405045 213.244.168.217.53 > 61.222.31.205.62869: [udp sum ok] 9 op5 NotImp*- q:[|domain] (DF) (ttl 64, id 0, len 52)
==
03:20:36.514897 61.222.31.205.63268 > 213.244.168.217.53: [bad udp cksum b94d!] 8 op5 [2a][|domain] (ttl 112, id 2317, len 109)
03:20:36.515094 213.244.168.217.53 > 61.222.31.205.63268: [udp sum ok] 8 op5 NotImp*- q:[|domain] (DF) (ttl 64, id 0, len 80)

> As a faster hint try to disable rx checksumming in the driver
> and look at the effect. I do not see module option to do this,
> so probably you have just to comment out the place where
> skb->ip_summed is set to CHECKSUM_UNNECESSARY.

Ok, will do when we are near to the machine again.

Regards,

bert hubert

* Re: 3c59x 2.4.18 userspace seeing UDP packets with bad checksum?
From: Donald Becker @ 2002-07-30 14:09 UTC
To: bert hubert; +Cc: kuznet, netdev, akpm, jgarzik

On Tue, 30 Jul 2002, bert hubert wrote:
> On Tue, Jul 30, 2002 at 05:31:44PM +0400, kuznet@ms2.inr.ac.ru wrote:
>
> > Well, try to prove that corrupted packet is really received by application.
> > This is not useless work, in any case, you have to cure this place,
> > it should not abort because of invalid data. :-)
>
> We fixed the bug already :-) Signed/unsigned arithmetic problems.

Were the arithmetic problems in your code, or the kernel's checksumming?
(The latter seems unlikely.)

> I have six examples in my log of PowerDNS exiting just after receiving a
> packet with a broken checksum.

Wouldn't this have also happened with valid checksums and invalid data?

This is a good time to fix your application data validity checking.

Note that some network equipment will re-write the IP checksums, just as
some equipment regenerates the link-level frame CRC. It's not
semantically correct, but claiming that such devices don't exist doesn't
make them go away.

--
Donald Becker                   becker@scyld.com
Scyld Computing Corporation     http://www.scyld.com
410 Severn Ave. Suite 210       Second Generation Beowulf Clusters
Annapolis MD 21403              410-990-9993

* Re: 3c59x 2.4.18 userspace seeing UDP packets with bad checksum?
From: bert hubert @ 2002-07-30 14:12 UTC
To: Donald Becker; +Cc: kuznet, netdev, akpm, jgarzik

On Tue, Jul 30, 2002 at 10:09:13AM -0400, Donald Becker wrote:

> > We fixed the bug already :-) Signed/unsigned arithmetic problems.
>
> Were the arithmetic problems in your code, or the kernel's checksumming?
> (The latter seems unlikely.)

We fixed *our* bug. I would never want to rely on checksums for input
validation :-) Our code should handle everything people care to throw at it
and I think that right now, it will.

> This is a good time to fix your application data validity checking.

Has been done. Watch freshmeat for the update :-)

Regards,

bert

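The thread does not show the PowerDNS fix itself, so the following is only an
illustrative sketch of the class of bug mentioned ("signed/unsigned
arithmetic") in the context of DNS labels. If a label length byte is read
through a plain `char` on a platform where `char` is signed, a corrupted byte
such as 0xc8 becomes negative and slips past a naive "length > 63" test.
Reading it as an unsigned value and bounds-checking against the packet avoids
that. The function name skip_dns_name is hypothetical, and compression
pointers are deliberately rejected rather than followed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walk an uncompressed DNS name starting at msg[off].
 * Returns the offset just past the terminating zero label, or -1 if the
 * name is malformed or runs past the end of the message. */
long skip_dns_name(const uint8_t *msg, size_t msglen, size_t off)
{
    assert(msg != NULL);
    size_t total = 0;                 /* length bytes plus label bytes so far */

    while (off < msglen) {
        unsigned int l = msg[off];    /* uint8_t: always 0..255, never negative */
        off++;
        if (l == 0)
            return (long)off;         /* root label ends the name */
        if (l > 63)
            return -1;                /* pointer or reserved bits: reject here */
        if (l > msglen - off)
            return -1;                /* label would overrun the packet */
        total += l + 1;
        if (total > 254)
            return -1;                /* whole name is limited to 255 octets */
        off += l;
    }
    return -1;                        /* ran out of data before the root label */
}
```

Checks like these are independent of the UDP checksum, which is the point
made above: the parser has to survive arbitrary bytes whether or not the
checksum happened to be verified.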
* Re: 3c59x 2.4.18 userspace seeing UDP packets with bad checksum?
From: Donald Becker @ 2002-07-30 13:57 UTC
To: kuznet; +Cc: bert hubert, netdev, akpm, jgarzik

On Tue, 30 Jul 2002 kuznet@ms2.inr.ac.ru wrote:

> Subject: Re: 3c59x 2.4.18 userspace seeing UDP packets with bad checksum?
>
> > Regarding Andi's message, isn't it so that recvfrom() may return but in that
> > case returns -1 and sets errno to EAGAIN?
>
> It should if we calculated this checksum. But 3com pretends to do this
> in hardware. :-)

I've verified that (at least some) 3Com cards don't just fake the
checksum test. They pass on packets with invalid checksums, but don't
indicate that the checksum is correct.

The driver code for this is

    int csum_bits = rx_status & 0xee000000;
    if (csum_bits &&
        (csum_bits == (IPChksumValid | TCPChksumValid) ||
         csum_bits == (IPChksumValid | UDPChksumValid))) {
        skb->ip_summed = CHECKSUM_UNNECESSARY;

Note that this relies on the kernel to make the final decision that the
checksum is invalid.

[[ This is the correct semantics: the driver might indicate that a protocol
level checksum is correct, but it should not discard packets based
on its (necessarily limited) understanding of higher level protocols.
]]

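The decision in the excerpt above can be isolated into a small, testable
function. The sketch below is a user-space model only: the bit positions of
the status flags are assumptions chosen to be consistent with the 0xee000000
mask, and classify_rx and the rx_verdict names are invented for this note.
The real driver sets skb->ip_summed rather than returning a value.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed flag layout; only the 0xee000000 mask comes from the excerpt. */
#define IPChksumErr     (UINT32_C(1) << 25)
#define TCPChksumErr    (UINT32_C(1) << 26)
#define UDPChksumErr    (UINT32_C(1) << 27)
#define IPChksumValid   (UINT32_C(1) << 29)
#define TCPChksumValid  (UINT32_C(1) << 30)
#define UDPChksumValid  (UINT32_C(1) << 31)
#define CSUM_MASK       UINT32_C(0xee000000)

enum rx_verdict {
    VERIFY_IN_SOFTWARE = 0,   /* leave the checksum for the stack to verify */
    HARDWARE_VOUCHED   = 1    /* driver would set CHECKSUM_UNNECESSARY */
};

enum rx_verdict classify_rx(uint32_t rx_status)
{
    /* The assumed flags must all lie inside the mask for this to make sense. */
    assert(((IPChksumValid | TCPChksumValid | UDPChksumValid |
             IPChksumErr | TCPChksumErr | UDPChksumErr) & ~CSUM_MASK) == 0);

    uint32_t csum_bits = rx_status & CSUM_MASK;

    /* Only the two exact "IP valid plus transport valid" patterns count;
     * any error bit, or no verdict at all, falls through to software. */
    if (csum_bits == (IPChksumValid | TCPChksumValid) ||
        csum_bits == (IPChksumValid | UDPChksumValid))
        return HARDWARE_VOUCHED;
    return VERIFY_IN_SOFTWARE;
}
```

Note that a frame is never dropped by this logic. A frame flagged with a
checksum error simply takes the software path, which matches the semantics
described in the message above.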
* Re: 3c59x 2.4.18 userspace seeing UDP packets with bad checksum?
From: jamal @ 2002-07-31 11:46 UTC
To: Donald Becker; +Cc: kuznet, bert hubert, netdev, akpm, jgarzik

On Tue, 30 Jul 2002, Donald Becker wrote:

>     int csum_bits = rx_status & 0xee000000;
>     if (csum_bits &&
>         (csum_bits == (IPChksumValid | TCPChksumValid) ||
>          csum_bits == (IPChksumValid | UDPChksumValid))) {
>         skb->ip_summed = CHECKSUM_UNNECESSARY;
>
> Note that this relies on the kernel to make the final decision that the
> checksum is invalid.
>
> [[ This is the correct semantics: the driver might indicate that a protocol
> level checksum is correct, but it should not discard packets based
> on its (necessarily limited) understanding of higher level protocols.
> ]]
>

Nod.
So is CHECKSUM_WRONG needed to give a hint to the stack?
It is possible IPcsum is right but not the transport; maybe ip_summed needs
to be a bitmap then?
You could use the information to move packets out of the stack fastpath
for example.

cheers,
jamal

* Re: 2.4.18 userspace seeing UDP packets with bad checksum?
From: Andi Kleen @ 2002-07-30 12:57 UTC
To: bert hubert; +Cc: netdev

On Tue, Jul 30, 2002 at 12:48:15PM +0200, bert hubert wrote:
> I'm under the strong impression that 2.4.18 lets userspace see packets with
> incorrect UDP checksums. The packet attached arrived in userspace after
> travelling all the way from Taiwan.
>
> Is this policy?

Yes. The checksum is checked while doing the copy to user space. When the
checksum fails you still have the already done copy, the kernel does not
try to undo it.

-Andi

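The copy-while-checksumming idea described above can be modelled in user space
as a single loop that both copies the bytes and accumulates the Internet
checksum. This is a simplified sketch, not the kernel's routine: copy_and_sum
is a hypothetical name, it works a byte at a time, and it leaves out the
pseudo-header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `len` bytes from src to dst and return the folded 16-bit
 * one's-complement sum of the data, computed in the same pass.
 * Bytes at even offsets are the high half of each big-endian word. */
uint16_t copy_and_sum(uint8_t *dst, const uint8_t *src, size_t len)
{
    assert(len <= 0xffff);            /* keeps the 32-bit accumulator safe */
    uint32_t sum = 0;

    for (size_t i = 0; i < len; i++) {
        dst[i] = src[i];              /* the copy happens unconditionally */
        sum += (i & 1) ? src[i] : (uint32_t)src[i] << 8;
    }
    while (sum >> 16)                 /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

By the time the caller can compare the returned sum against the expected
value, dst already holds the data. That is the situation described in the
message above: a failed check yields an error return, but the bytes have
already been written into the destination buffer.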
Thread overview: 11+ messages

2002-07-30 10:48 2.4.18 userspace seeing UDP packets with bad checksum? bert hubert
2002-07-30 12:40 ` kuznet
2002-07-30 13:14   ` 3c59x " bert hubert
2002-07-30 13:31     ` kuznet
2002-07-30 13:40       ` jamal
2002-07-30 13:49       ` bert hubert
2002-07-30 14:09         ` Donald Becker
2002-07-30 14:12           ` bert hubert
2002-07-30 13:57       ` Donald Becker
2002-07-31 11:46         ` jamal
2002-07-30 12:57 ` Andi Kleen