* SW csum errors
@ 2013-10-14 20:13 Kyle Hubert
2013-10-14 20:40 ` Eric Dumazet
2013-10-14 20:58 ` Stephen Hemminger
0 siblings, 2 replies; 6+ messages in thread
From: Kyle Hubert @ 2013-10-14 20:13 UTC
To: netdev
My problem is rather specific. I am working on an RDMA device, and we
have full end-to-end reliability. However, one of the initial spins of
our chip had some errors, since fixed, where the csum was unreliable.
So, we did exactly what Dave Miller warned not to do in the linked
message: we ran outgoing IP packets through the SKB checksum
function. Unfortunately, we occasionally saw NFS csum errors on full
MTU packets.
Here is his response:
http://marc.info/?l=linux-netdev&m=128286758300676&w=2
Relevant portion:
"
Paged SKBs can have references to page cache pages and similar. These
can be updated asynchronously to the transmit, there is no locking at
all to freeze the contents, and therefore full checksum offload is
required to support SG correctly.
So don't get the idea to do the checksum in software in the infiniband
layer, and advertize hw checksumming support, to get around this :-)
"
Now that those chips are long gone, I am left wondering about these
packets "corrupted" before the device transfers them. Can I get more
information about these paged SKBs with asynchronous modifications?
How does NFS use them?
Thanks for your time,
-Kyle
* Re: SW csum errors
From: Eric Dumazet @ 2013-10-14 20:40 UTC
To: Kyle Hubert; +Cc: netdev
On Mon, 2013-10-14 at 16:13 -0400, Kyle Hubert wrote:
> My problem is rather specific. I am working on an RDMA device, and we
> have full end to end reliability. However, one of the initial spins of
> our chip had some errors, since fixed, where the csum was unreliable.
> So, we did exactly what Dave Miller warned not to do in the linked
> message. We ran outgoing IP packets through the SKB checksum
> function.. Unfortunately, we occasionally saw NFS csum errors on full
> MTU packets.
>
> Here is his response:
>
> http://marc.info/?l=linux-netdev&m=128286758300676&w=2
>
> Relevant portion:
>
> "
> Paged SKBs can have references to page cache pages and similar. These
> can be updated asynchronously to the transmit, there is no locking at
> all to freeze the contents, and therefore full checksum offload is
> required to support SG correctly.
>
> So don't get the idea to do the checksum in software in the infiniband
> layer, and advertize hw checksumming support, to get around this :-)
> "
>
> Now that those chips have long gone, I am left pondering about these
> packets "corrupted" before the device transfers them. Can I get more
> information about these paged SKBs with asynchronous modifications?
> How does NFS use them?
For a start, try:
git grep -n SKBTX_SHARED_FRAG
git grep -n skb_has_shared_frag
git show cef401de7be8c4e
git show c9af6db4c11ccc6
* Re: SW csum errors
From: Stephen Hemminger @ 2013-10-14 20:58 UTC
To: Kyle Hubert; +Cc: netdev
On Mon, 14 Oct 2013 16:13:15 -0400
Kyle Hubert <khubert@gmail.com> wrote:
> My problem is rather specific. I am working on an RDMA device, and we
> have full end to end reliability. However, one of the initial spins of
> our chip had some errors, since fixed, where the csum was unreliable.
> So, we did exactly what Dave Miller warned not to do in the linked
> message. We ran outgoing IP packets through the SKB checksum
> function.. Unfortunately, we occasionally saw NFS csum errors on full
> MTU packets.
>
> Here is his response:
>
> http://marc.info/?l=linux-netdev&m=128286758300676&w=2
>
> Relevant portion:
>
> "
> Paged SKBs can have references to page cache pages and similar. These
> can be updated asynchronously to the transmit, there is no locking at
> all to freeze the contents, and therefore full checksum offload is
> required to support SG correctly.
>
> So don't get the idea to do the checksum in software in the infiniband
> layer, and advertize hw checksumming support, to get around this :-)
> "
>
> Now that those chips have long gone, I am left pondering about these
> packets "corrupted" before the device transfers them. Can I get more
> information about these paged SKBs with asynchronous modifications?
> How does NFS use them?
You would have to either mark the pages as copy-on-write or copy the data.
Setting COW is expensive because you have to coordinate with the other CPUs
on SMP. I am not sure exactly how that would be done.
You can demonstrate this with either sendfile() or NFS, where the underlying
file contents are modified while the packet is in the queue.
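
The race described here can be modeled in user space without a kernel or a NIC. The following is only a minimal sketch, assuming a `bytearray` as a hypothetical stand-in for a page-cache page referenced by an skb fragment; `internet_checksum` is an illustrative RFC 1071 implementation, not the kernel's checksum routine. It shows that a checksum computed at queue time no longer matches the bytes that leave the host if the page is written to in between.

```python
def internet_checksum(data) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data = bytes(data) + b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical stand-in for a page-cache page that an skb frag points at.
page = bytearray(b"A" * 4096)

# Step 1: software checksum is computed over the page by reference.
csum_at_queue_time = internet_checksum(page)

# Step 2: another writer updates the file while the packet is still queued.
page[100:104] = b"XXXX"

# Step 3: the device reads the page as it is now and transmits those bytes.
bytes_on_wire = bytes(page)
csum_of_wire_bytes = internet_checksum(bytes_on_wire)

# The receiver would see a checksum that does not match the payload.
mismatch = csum_at_queue_time != csum_of_wire_bytes
print("checksum mismatch:", mismatch)
```

With full checksum offload the hardware computes the checksum over the same bytes it transmits, so this particular window does not exist.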
* Re: SW csum errors
From: Kyle Hubert @ 2013-10-16 15:10 UTC
To: Stephen Hemminger; +Cc: netdev
On Mon, Oct 14, 2013 at 4:58 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Mon, 14 Oct 2013 16:13:15 -0400
> Kyle Hubert <khubert@gmail.com> wrote:
>
>> My problem is rather specific. I am working on an RDMA device, and we
>> have full end to end reliability. However, one of the initial spins of
>> our chip had some errors, since fixed, where the csum was unreliable.
>> So, we did exactly what Dave Miller warned not to do in the linked
>> message. We ran outgoing IP packets through the SKB checksum
>> function.. Unfortunately, we occasionally saw NFS csum errors on full
>> MTU packets.
>>
>> Here is his response:
>>
>> http://marc.info/?l=linux-netdev&m=128286758300676&w=2
>>
>> Relevant portion:
>>
>> "
>> Paged SKBs can have references to page cache pages and similar. These
>> can be updated asynchronously to the transmit, there is no locking at
>> all to freeze the contents, and therefore full checksum offload is
>> required to support SG correctly.
>>
>> So don't get the idea to do the checksum in software in the infiniband
>> layer, and advertize hw checksumming support, to get around this :-)
>> "
>>
>> Now that those chips have long gone, I am left pondering about these
>> packets "corrupted" before the device transfers them. Can I get more
>> information about these paged SKBs with asynchronous modifications?
>> How does NFS use them?
>
> You would have to either mark the pages as copy on write or copy the data.
> Setting COW is expensive because you have to coordinate with other CPU's
> on SMP. Not sure exactly how.
>
> You can demonstrate this with either sendfile() or NFS where underlying
> file contents are being modified while packet is in the queue.
Thanks, I didn't realize it was as simple as file backed pages being
changed. Yes, our device does support SG, so we do have zero-copy
sendfile() support. I'll concoct a simple test to prove this.
-Kyle
* Re: SW csum errors
From: Eric Dumazet @ 2013-10-16 15:24 UTC
To: Kyle Hubert; +Cc: Stephen Hemminger, netdev
On Wed, 2013-10-16 at 11:10 -0400, Kyle Hubert wrote:
> Thanks, I didn't realize it was as simple as file backed pages being
> changed. Yes, our device does support SG, so we do have zero-copy
> sendfile() support. I'll concoct a simple test to prove this.
You can also use vmsplice()/splice() and touch anonymous memory;
no need to play with a file ;)
* Re: SW csum errors
From: Kyle Hubert @ 2013-10-16 15:58 UTC
To: Eric Dumazet; +Cc: Stephen Hemminger, netdev
On Wed, Oct 16, 2013 at 11:24 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Wed, 2013-10-16 at 11:10 -0400, Kyle Hubert wrote:
>
>> Thanks, I didn't realize it was as simple as file backed pages being
>> changed. Yes, our device does support SG, so we do have zero-copy
>> sendfile() support. I'll concoct a simple test to prove this.
>
> You also can use vmsplice()/splice() and touch anonymous memory,
> no need to play with a file ;)
Thanks, this would be even easier to test. It also reminds me that
there are other ways of constructing SKB frags. It looks like if we
ever need to enable SW checksumming again, we should just disable the
SG feature and let the normal stack handle validity.
-Kyle
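
Why copying first avoids the problem can be illustrated in user space. This is a rough sketch, not the kernel's actual transmit path: it assumes that without SG the payload reaches the driver as a private linear copy, and it models that copy with `bytes()`. The names `snapshot_and_checksum` and `verifies` are hypothetical helpers introduced only for this illustration.

```python
from typing import Tuple


def internet_checksum(data) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data = bytes(data) + b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def snapshot_and_checksum(shared) -> Tuple[bytes, int]:
    """Copy the shared buffer, then checksum the private copy (hypothetical helper)."""
    private = bytes(shared)  # later writes to `shared` cannot change this copy
    return private, internet_checksum(private)


def verifies(payload: bytes, csum: int) -> bool:
    """Receiver-side check: payload plus checksum must sum to zero (even-length payload)."""
    return internet_checksum(payload + csum.to_bytes(2, "big")) == 0


# Hypothetical shared page, e.g. file-backed or vmsplice()d memory.
shared_page = bytearray(b"B" * 1448)

payload, csum = snapshot_and_checksum(shared_page)

# A concurrent writer touches the shared page after the checksum was taken.
shared_page[0:8] = b"modified"

copy_ok = verifies(payload, csum)                     # private copy still matches
by_reference_ok = verifies(bytes(shared_page), csum)  # shared page no longer does
print("copy verifies:", copy_ok, "| by-reference verifies:", by_reference_ok)
```

The trade-off is the extra copy per packet, which is the cost that SG and zero-copy sendfile() are meant to avoid.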