A case AGAINST checksum offload

netdev.vger.kernel.org archive mirror

* A case AGAINST checksum offload
@ 2004-11-12 23:46 John Heffner
  2004-11-12 23:49 ` David S. Miller
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: John Heffner @ 2004-11-12 23:46 UTC (permalink / raw)
  To: netdev

Currently with many common Ethernet devices in Linux, hardware TCP
checksumming is enabled by default.  This seems fairly dangerous to me.
Most link layer checksums are much stronger than the TCP/UDP checksum;
most bit errors are caught by these.  However, one of the primary purposes
of the TCP/UDP checksum is to detect errors occurring outside the
protection of the link layer checksums -- errors when data is reassembled
or copied across busses inside hosts and routers.  Hardware checksum
offload removes the ability to detect errors between the NIC and host
memory.

For some anecdotal evidence: One of my machines has fiber e1000 (82545GM)
and I observed corruptions in its TCP streams.  I actually caught this
because large SSH flows originating from this host would usually die after
<1 GB or so with a MAC error, indicating the TCP stream was somehow
corrupt.  I looked at some TCP statistics, which indicated no dropped
packets or checksum errors, but then I realized hardware checksumming was
on.  I turned off hardware checksumming and found the stream errors
disappeared, and it correctly started discarding the corrupt TCP segments.
Luckily for me, this machine is mainly used for testing, and the strong
authentication SSH uses caught the problems.
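As a rough illustration of why the SSH layer noticed what the TCP counters did not: a keyed MAC computed end to end by the application covers every hop, including the path between the NIC and host memory. The sketch below uses Python's standard `hmac` module. The key, payload, and helper names are made up for illustration, and this is not SSH's actual packet format.

```python
import hashlib
import hmac

KEY = b"hypothetical-session-key"  # illustrative only, not a real SSH key


def tag(payload: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the payload."""
    return hmac.new(KEY, payload, hashlib.sha256).digest()


def verify(payload: bytes, received_tag: bytes) -> bool:
    """Recompute the tag on the receiving side and compare in constant time."""
    return hmac.compare_digest(tag(payload), received_tag)


message = b"block of an SSH-like stream"
sent_tag = tag(message)

# Simulate a single flipped bit somewhere between sender and receiver.
damaged = bytearray(message)
damaged[5] ^= 0x01
damaged = bytes(damaged)

intact_ok = verify(message, sent_tag)   # untouched payload verifies
damaged_ok = verify(damaged, sent_tag)  # corrupted payload does not
```

Because the tag is computed and checked in application memory, it does not matter which component introduced the flipped bit.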

Though I don't have any definitive references, I've heard stories that Sun
turned off UDP checksums on LANs to increase NFS performance, only to
re-enable checksumming by default after problems similar to mine caused
corruptions of some critical databases.

Since TCP checksum offload should only really help the zero-copy case in
terms of performance, it seems safer to turn off hardware checksumming by
default, or perhaps only enable it if an application is doing a zero-copy
send.

  -John


* Re: A case AGAINST checksum offload
  2004-11-12 23:46 A case AGAINST checksum offload John Heffner
@ 2004-11-12 23:49 ` David S. Miller
  2004-11-13  0:36   ` John Heffner
  2004-11-12 23:53 ` Dave Hansen
  2004-11-14 20:01 ` Florian Weimer
  2 siblings, 1 reply; 9+ messages in thread
From: David S. Miller @ 2004-11-12 23:49 UTC (permalink / raw)
  To: John Heffner; +Cc: netdev

On Fri, 12 Nov 2004 18:46:11 -0500 (EST)
John Heffner <jheffner@psc.edu> wrote:

> Though I don't have any definitive references, I've heard stories that Sun
> turned off UDP checksums on LANs to increase NFS performance, only to
> re-enable checksumming by default after problems similar to mine caused
> corruptions of some critical databases.

That story about Sun is true.  But it is an entirely different matter
to disable checksums altogether vs. disabling HW assisted checksumming.

> Since TCP checksum offload should only really help the zero-copy case in
> terms of performance, it seems safer to turn off hardware checksumming by
> default, or perhaps only enable it if an application is doing a zero-copy
> send.

I disagree.

What is the difference between the CPU (a bus agent with computational
abilities), and a networking card (again, a bus agent with computational
abilities) computing the checksums?

In your listed case you found a bug, and it appears that what happened
is that the DMA transfer got corrupted to the networking card yet a
properly checksummed packet went out because the card computed the
checksum.

What would happen if this happened on a block device?  Your filesystem
would get corrupted, perhaps irreparably.

How is this any different?  It's a hard error for the DMA data to be
corrupted.

The data could just as easily be corrupted on the way to the CPU when
doing a copy+checksum operation.  It's the same problem you say exists
with your networking card case except the path of the corruption is
RAM-->CPU instead of RAM-->PCI Controller-->Networking Card

I really don't buy this. :-)
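The disagreement above comes down to where the checksum is computed relative to where the corruption happens. A minimal simulation, assuming the standard 16-bit one's-complement Internet checksum and a single bit flipped during the (simulated) DMA transfer; the function names are hypothetical, and the TCP pseudo-header is left out for brevity:

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def receiver_accepts(payload: bytes, checksum: int) -> bool:
    return internet_checksum(payload) == checksum


payload = b"data leaving host memory for the wire"
on_wire = flip_bit(payload, 13)  # corruption on the RAM --> NIC path

# Host checksumming: the sum is taken in host memory, before the corruption.
host_sum = internet_checksum(payload)
host_case_accepted = receiver_accepts(on_wire, host_sum)

# Offload: the NIC sums whatever arrived over the bus, after the corruption.
nic_sum = internet_checksum(on_wire)
offload_case_accepted = receiver_accepts(on_wire, nic_sum)
```

With host checksumming the receiver rejects the damaged segment; with offload it accepts it. The same blind spot exists if the bit flips on the RAM-->CPU path before the host computes its sum, which is the counterargument made here.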


* Re: A case AGAINST checksum offload
  2004-11-12 23:46 A case AGAINST checksum offload John Heffner
  2004-11-12 23:49 ` David S. Miller
@ 2004-11-12 23:53 ` Dave Hansen
  2004-11-12 23:56   ` John Heffner
  2004-11-14 20:01 ` Florian Weimer
  2 siblings, 1 reply; 9+ messages in thread
From: Dave Hansen @ 2004-11-12 23:53 UTC (permalink / raw)
  To: John Heffner; +Cc: netdev

On Fri, 2004-11-12 at 15:46, John Heffner wrote:
> Currently with many common Ethernet devices in Linux, hardware TCP
> checksumming is enabled by default.  This seems fairly dangerous to me.
> Most link layer checksums are much stronger than the TCP/UDP checksum;
> most bit errors are caught by these.  However, one of the primary purposes
> of the TCP/UDP checksum is to detect errors occurring outside the
> protection of the link layer checksums -- errors when data is reassembled
> or copied across busses inside hosts and routers.

If you're getting errors copying things on buses inside of the machine,
don't you have bigger problems than corrupt packets?  For instance, why
doesn't your disk controller have the same problem?

Just curious.

-- Dave


* Re: A case AGAINST checksum offload
  2004-11-12 23:53 ` Dave Hansen
@ 2004-11-12 23:56   ` John Heffner
  2004-11-13 11:32     ` Francois Romieu
  0 siblings, 1 reply; 9+ messages in thread
From: John Heffner @ 2004-11-12 23:56 UTC (permalink / raw)
  To: Dave Hansen; +Cc: netdev

On Fri, 12 Nov 2004, Dave Hansen wrote:

> On Fri, 2004-11-12 at 15:46, John Heffner wrote:
> > Currently with many common Ethernet devices in Linux, hardware TCP
> > checksumming is enabled by default.  This seems fairly dangerous to me.
> > Most link layer checksums are much stronger than the TCP/UDP checksum;
> > most bit errors are caught by these.  However, one of the primary purposes
> > of the TCP/UDP checksum is to detect errors occurring outside the
> > protection of the link layer checksums -- errors when data is reassembled
> > or copied across busses inside hosts and routers.
>
> If you're getting errors copying things on buses inside of the machine,
> don't you have bigger problems than corrupt packets?  For instance, why
> doesn't your disk controller have the same problem?
>
> Just curious.

It's a problem with the NIC.  It did the same thing in a different machine.
The point is that I think it's a good idea to mitigate the effects of
faulty hardware, especially if you can do so nearly for free.

  -John


* Re: A case AGAINST checksum offload
  2004-11-13  0:36   ` John Heffner
@ 2004-11-13  0:29     ` David S. Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David S. Miller @ 2004-11-13  0:29 UTC (permalink / raw)
  To: John Heffner; +Cc: netdev

On Fri, 12 Nov 2004 19:36:35 -0500 (EST)
John Heffner <jheffner@psc.edu> wrote:

> Probably not a big deal (yeah, something's buggy anyway and needs to be
> fixed or replaced), but I thought it worth pointing out.

It certainly was an interesting viewpoint.  I haven't totally
dismissed your ideas, so please don't get that impression.


* Re: A case AGAINST checksum offload
  2004-11-12 23:49 ` David S. Miller
@ 2004-11-13  0:36   ` John Heffner
  2004-11-13  0:29     ` David S. Miller
  0 siblings, 1 reply; 9+ messages in thread
From: John Heffner @ 2004-11-13  0:36 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

On Fri, 12 Nov 2004, David S. Miller wrote:

> The data could just as easily be corrupted on the way to the CPU when
> doing a copy+checksum operation.  It's the same problem you say exists
> with your networking card case except the path of the corruption is
> RAM-->CPU instead of RAM-->PCI Controller-->Networking Card
>
> I really don't buy this. :-)

Sure.  But we can get a check on one point of failure nearly free.  I've
measured about a 1% difference in CPU use with checksum on vs. off at 1
gigabit.

Probably not a big deal (yeah, something's buggy anyway and needs to be
fixed or replaced), but I thought it worth pointing out.

  -John


* Re: A case AGAINST checksum offload
  2004-11-12 23:56   ` John Heffner
@ 2004-11-13 11:32     ` Francois Romieu
  0 siblings, 0 replies; 9+ messages in thread
From: Francois Romieu @ 2004-11-13 11:32 UTC (permalink / raw)
  To: John Heffner; +Cc: Dave Hansen, netdev

John Heffner <jheffner@psc.edu> :
[...]
> The point is that I think it's a good idea to mitigate the effects of
> faulty hardware, especially if you can do so nearly for free.

Perhaps it could go as a new option for ethtool -t.

--
Ueimor


* Re: A case AGAINST checksum offload
  2004-11-12 23:46 A case AGAINST checksum offload John Heffner
  2004-11-12 23:49 ` David S. Miller
  2004-11-12 23:53 ` Dave Hansen
@ 2004-11-14 20:01 ` Florian Weimer
  2004-11-14 22:19   ` Pekka Pietikainen
  2 siblings, 1 reply; 9+ messages in thread
From: Florian Weimer @ 2004-11-14 20:01 UTC (permalink / raw)
  To: John Heffner; +Cc: netdev

* John Heffner:

> Currently with many common Ethernet devices in Linux, hardware TCP
> checksumming is enabled by default.  This seems fairly dangerous to me.
> Most link layer checksums are much stronger than the TCP/UDP checksum;
> most bit errors are caught by these.  However, one of the primary purposes
> of the TCP/UDP checksum is to detect errors occurring outside the
> protection of the link layer checksums -- errors when data is reassembled
> or copied across busses inside hosts and routers.

The IP checksum is quite bad at catching those, though.  Broken memory
banks or busses tend to introduce bit errors in distances which are
multiples of 16 bits (something like 64 or 256).  Because of the way
the IP checksum works, two such errors in the same packet cancel out
and go undetected.

I was once on the receiving end of such packets, and I can tell you
it's not a fun thing to debug. 8-(
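The cancellation described above can be reproduced in a few lines. This sketch assumes the usual 16-bit one's-complement sum (RFC 1071 style) and a made-up 10-byte payload. Two flips of the same bit position, 64 bits apart, cancel when they go in opposite directions (one 1 -> 0, one 0 -> 1), because one 16-bit word loses exactly what the other gains.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def xor_byte(data: bytes, index: int, mask: int) -> bytes:
    """Return a copy of data with the masked bits of one byte inverted."""
    buf = bytearray(data)
    buf[index] ^= mask
    return bytes(buf)


original = bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0, 0x11, 0x22])

# Bit 0x02 is set in byte 0 (0x12) and clear in byte 8 (0x11).
# Flipping both changes the words by -0x0200 and +0x0200: the sum is unchanged.
cancelling = xor_byte(xor_byte(original, 0, 0x02), 8, 0x02)

# Bit 0x04 is clear in both bytes, so both flips add 0x0400: the sum changes.
same_direction = xor_byte(xor_byte(original, 0, 0x04), 8, 0x04)

undetected = internet_checksum(cancelling) == internet_checksum(original)
detected = internet_checksum(same_direction) != internet_checksum(original)
```

Here `undetected` and `detected` are both true: the payload differs in two bits yet carries the same checksum in the first case, while the same-direction pair is caught.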


* Re: A case AGAINST checksum offload
  2004-11-14 20:01 ` Florian Weimer
@ 2004-11-14 22:19   ` Pekka Pietikainen
  0 siblings, 0 replies; 9+ messages in thread
From: Pekka Pietikainen @ 2004-11-14 22:19 UTC (permalink / raw)
  To: Florian Weimer; +Cc: John Heffner, netdev

On Sun, Nov 14, 2004 at 09:01:14PM +0100, Florian Weimer wrote:
> * John Heffner:
> 
> > of the TCP/UDP checksum is to detect errors occurring outside the
> > protection of the link layer checksums -- errors when data is reassembled
> > or copied across busses inside hosts and routers.
> 
> The IP checksum is quite bad at catching those, though.  Broken memory
> banks or busses tend to introduce bit errors in distances which are
> multiples of 16 bits (something like 64 or 256).  Because of the way
> the IP checksum works, two such errors in the same packet cancel out
> and go undetected.
>
> I was once on the receiving end of such packets, and I can tell you
> it's not a fun thing to debug. 8-(

Btw., "When the CRC and TCP Checksum Disagree"
http://citeseer.ist.psu.edu/stone00when.html is well worth reading.

It doesn't go into the offload vs. host IP checksum case too heavily, though;
I'm not sure anyone really has data on that. The impression I have is
that the risk isn't that big. If you're having flipped bits in
your (non-ECC :-) ) memory, you lose. If your PCI bus flips bits,
you probably lose when the data is read off disk. If your NIC has a
bad checksum engine, well... Then the IP checksums end up bad on the remote
end, packets get dropped, people tend to notice and that chip gets host-based
checksums soon enough. 

What definitely would make sense is using user-space checksums (or just
transmit output from a PRNG + the seed and compare the streams)
in driver/hardware stress testing. And testing all those corner cases which
the driver/NIC might have gotten wrong.
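A minimal sketch of the PRNG-plus-seed idea, with the network replaced by an in-memory copy. The function names are hypothetical, and it assumes sender and receiver run the same Python implementation so that `random.Random(seed)` yields the same byte sequence on both ends.

```python
import random


def prng_stream(seed: int, length: int) -> bytes:
    """Deterministic pseudo-random test data derived from the seed."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(length))


def first_mismatch(seed: int, received: bytes):
    """Regenerate the stream from the seed and return the first bad offset,
    or None if the received data matches."""
    expected = prng_stream(seed, len(received))
    for offset, (got, want) in enumerate(zip(received, expected)):
        if got != want:
            return offset
    return None


seed = 2004
sent = prng_stream(seed, 4096)

# Stand-in for a driver or NIC bug: one flipped bit at byte offset 1000.
damaged = bytearray(sent)
damaged[1000] ^= 0x10

clean_result = first_mismatch(seed, sent)
damaged_result = first_mismatch(seed, bytes(damaged))
```

In a real stress test the sender would transmit the seed followed by the stream over the device under test, and the receiver would report the mismatch offset, which helps correlate failures with packet or DMA boundaries.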
