* [PATCH net] enic: fix rx skb checksum
@ 2014-12-18 10:28 Govindarajulu Varadarajan
2014-12-18 13:58 ` Sergei Shtylyov
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Govindarajulu Varadarajan @ 2014-12-18 10:28 UTC (permalink / raw)
To: davem, netdev, ssujith, benve
Cc: Govindarajulu Varadarajan, Jiri Benc, Stefan Assmann
Hardware always provides compliment of IP pseudo checksum. Stack expects
whole packet checksum without pseudo checksum if CHECKSUM_COMPLETE is set.
This causes checksum error in nf & ovs.
kernel: qg-19546f09-f2: hw csum failure
kernel: CPU: 9 PID: 0 Comm: swapper/9 Tainted: GF O-------------- 3.10.0-123.8.1.el7.x86_64 #1
kernel: Hardware name: Cisco Systems Inc UCSB-B200-M3/UCSB-B200-M3, BIOS B200M3.2.2.3.0.080820141339 08/08/2014
kernel: ffff881218f40000 df68243feb35e3a8 ffff881237a43ab8 ffffffff815e237b
kernel: ffff881237a43ad0 ffffffff814cd4ca ffff8829ec71eb00 ffff881237a43af0
kernel: ffffffff814c6232 0000000000000286 ffff8829ec71eb00 ffff881237a43b00
kernel: Call Trace:
kernel: <IRQ> [<ffffffff815e237b>] dump_stack+0x19/0x1b
kernel: [<ffffffff814cd4ca>] netdev_rx_csum_fault+0x3a/0x40
kernel: [<ffffffff814c6232>] __skb_checksum_complete_head+0x62/0x70
kernel: [<ffffffff814c6251>] __skb_checksum_complete+0x11/0x20
kernel: [<ffffffff8155a20c>] nf_ip_checksum+0xcc/0x100
kernel: [<ffffffffa049edc7>] icmp_error+0x1f7/0x35c [nf_conntrack_ipv4]
kernel: [<ffffffff814cf419>] ? netif_rx+0xb9/0x1d0
kernel: [<ffffffffa040eb7b>] ? internal_dev_recv+0xdb/0x130 [openvswitch]
kernel: [<ffffffffa04c8330>] nf_conntrack_in+0xf0/0xa80 [nf_conntrack]
kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
kernel: [<ffffffffa049e302>] ipv4_conntrack_in+0x22/0x30 [nf_conntrack_ipv4]
kernel: [<ffffffff815005ca>] nf_iterate+0xaa/0xc0
kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
kernel: [<ffffffff81500664>] nf_hook_slow+0x84/0x140
kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
kernel: [<ffffffff81509dd4>] ip_rcv+0x344/0x380
Hardware verifies IP & tcp/udp header checksum but does not provide payload
checksum, use CHECKSUM_UNNECESSARY. Set it only if its valid IP tcp/udp packet.
Cc: Jiri Benc <jbenc@redhat.com>
Cc: Stefan Assmann <sassmann@redhat.com>
Reported-by: Sunil Choudhary <schoudha@redhat.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
---
drivers/net/ethernet/cisco/enic/enic_main.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index 868d0f6..705f334 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -1060,10 +1060,14 @@ static void enic_rq_indicate_buf(struct vnic_rq *rq,
PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3);
}
- if ((netdev->features & NETIF_F_RXCSUM) && !csum_not_calc) {
- skb->csum = htons(checksum);
- skb->ip_summed = CHECKSUM_COMPLETE;
- }
+ /* Hardware does not provide whole packet checksum. It only
+ * provides pseudo checksum. Since hw validates the packet
+ * checksum but not provide us the checksum value. use
+ * CHECSUM_UNNECESSARY.
+ */
+ if ((netdev->features & NETIF_F_RXCSUM) && tcp_udp_csum_ok &&
+ ipv4_csum_ok)
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
if (vlan_stripped)
__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tci);
--
2.1.3
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH net] enic: fix rx skb checksum 2014-12-18 10:28 [PATCH net] enic: fix rx skb checksum Govindarajulu Varadarajan @ 2014-12-18 13:58 ` Sergei Shtylyov 2014-12-18 16:26 ` Eric Dumazet 2014-12-19 11:11 ` Jiri Benc 2 siblings, 0 replies; 10+ messages in thread From: Sergei Shtylyov @ 2014-12-18 13:58 UTC (permalink / raw) To: Govindarajulu Varadarajan, davem, netdev, ssujith, benve Cc: Jiri Benc, Stefan Assmann Hello. On 12/18/2014 1:28 PM, Govindarajulu Varadarajan wrote: > Hardware always provides compliment of IP pseudo checksum. Stack expects > whole packet checksum without pseudo checksum if CHECKSUM_COMPLETE is set. > This causes checksum error in nf & ovs. > kernel: qg-19546f09-f2: hw csum failure > kernel: CPU: 9 PID: 0 Comm: swapper/9 Tainted: GF O-------------- 3.10.0-123.8.1.el7.x86_64 #1 > kernel: Hardware name: Cisco Systems Inc UCSB-B200-M3/UCSB-B200-M3, BIOS B200M3.2.2.3.0.080820141339 08/08/2014 > kernel: ffff881218f40000 df68243feb35e3a8 ffff881237a43ab8 ffffffff815e237b > kernel: ffff881237a43ad0 ffffffff814cd4ca ffff8829ec71eb00 ffff881237a43af0 > kernel: ffffffff814c6232 0000000000000286 ffff8829ec71eb00 ffff881237a43b00 > kernel: Call Trace: > kernel: <IRQ> [<ffffffff815e237b>] dump_stack+0x19/0x1b > kernel: [<ffffffff814cd4ca>] netdev_rx_csum_fault+0x3a/0x40 > kernel: [<ffffffff814c6232>] __skb_checksum_complete_head+0x62/0x70 > kernel: [<ffffffff814c6251>] __skb_checksum_complete+0x11/0x20 > kernel: [<ffffffff8155a20c>] nf_ip_checksum+0xcc/0x100 > kernel: [<ffffffffa049edc7>] icmp_error+0x1f7/0x35c [nf_conntrack_ipv4] > kernel: [<ffffffff814cf419>] ? netif_rx+0xb9/0x1d0 > kernel: [<ffffffffa040eb7b>] ? internal_dev_recv+0xdb/0x130 [openvswitch] > kernel: [<ffffffffa04c8330>] nf_conntrack_in+0xf0/0xa80 [nf_conntrack] > kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40 > kernel: [<ffffffffa049e302>] ipv4_conntrack_in+0x22/0x30 [nf_conntrack_ipv4] > kernel: [<ffffffff815005ca>] nf_iterate+0xaa/0xc0 > kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40 > kernel: [<ffffffff81500664>] nf_hook_slow+0x84/0x140 > kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40 > kernel: [<ffffffff81509dd4>] ip_rcv+0x344/0x380 > Hardware verifies IP & tcp/udp header checksum but does not provide payload > checksum, use CHECKSUM_UNNECESSARY. Set it only if its valid IP tcp/udp packet. > Cc: Jiri Benc <jbenc@redhat.com> > Cc: Stefan Assmann <sassmann@redhat.com> > Reported-by: Sunil Choudhary <schoudha@redhat.com> > Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> > --- > drivers/net/ethernet/cisco/enic/enic_main.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) > diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c > index 868d0f6..705f334 100644 > --- a/drivers/net/ethernet/cisco/enic/enic_main.c > +++ b/drivers/net/ethernet/cisco/enic/enic_main.c > @@ -1060,10 +1060,14 @@ static void enic_rq_indicate_buf(struct vnic_rq *rq, > PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3); > } > > - if ((netdev->features & NETIF_F_RXCSUM) && !csum_not_calc) { > - skb->csum = htons(checksum); > - skb->ip_summed = CHECKSUM_COMPLETE; > - } > + /* Hardware does not provide whole packet checksum. It only > + * provides pseudo checksum. Since hw validates the packet > + * checksum but not provide us the checksum value. use s/not/doesn't/, s/value./value,/. [...] WBR, Sergei ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net] enic: fix rx skb checksum 2014-12-18 10:28 [PATCH net] enic: fix rx skb checksum Govindarajulu Varadarajan 2014-12-18 13:58 ` Sergei Shtylyov @ 2014-12-18 16:26 ` Eric Dumazet 2014-12-18 17:39 ` Jay Vosburgh 2014-12-19 11:11 ` Jiri Benc 2 siblings, 1 reply; 10+ messages in thread From: Eric Dumazet @ 2014-12-18 16:26 UTC (permalink / raw) To: Govindarajulu Varadarajan Cc: davem, netdev, ssujith, benve, Jiri Benc, Stefan Assmann On Thu, 2014-12-18 at 15:58 +0530, Govindarajulu Varadarajan wrote: > Hardware always provides compliment of IP pseudo checksum. Stack expects > whole packet checksum without pseudo checksum if CHECKSUM_COMPLETE is set. > > This causes checksum error in nf & ovs. > > kernel: qg-19546f09-f2: hw csum failure > kernel: CPU: 9 PID: 0 Comm: swapper/9 Tainted: GF O-------------- 3.10.0-123.8.1.el7.x86_64 #1 > kernel: Hardware name: Cisco Systems Inc UCSB-B200-M3/UCSB-B200-M3, BIOS B200M3.2.2.3.0.080820141339 08/08/2014 > kernel: ffff881218f40000 df68243feb35e3a8 ffff881237a43ab8 ffffffff815e237b > kernel: ffff881237a43ad0 ffffffff814cd4ca ffff8829ec71eb00 ffff881237a43af0 > kernel: ffffffff814c6232 0000000000000286 ffff8829ec71eb00 ffff881237a43b00 > kernel: Call Trace: > kernel: <IRQ> [<ffffffff815e237b>] dump_stack+0x19/0x1b > kernel: [<ffffffff814cd4ca>] netdev_rx_csum_fault+0x3a/0x40 > kernel: [<ffffffff814c6232>] __skb_checksum_complete_head+0x62/0x70 > kernel: [<ffffffff814c6251>] __skb_checksum_complete+0x11/0x20 > kernel: [<ffffffff8155a20c>] nf_ip_checksum+0xcc/0x100 > kernel: [<ffffffffa049edc7>] icmp_error+0x1f7/0x35c [nf_conntrack_ipv4] > kernel: [<ffffffff814cf419>] ? netif_rx+0xb9/0x1d0 > kernel: [<ffffffffa040eb7b>] ? internal_dev_recv+0xdb/0x130 [openvswitch] > kernel: [<ffffffffa04c8330>] nf_conntrack_in+0xf0/0xa80 [nf_conntrack] > kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40 > kernel: [<ffffffffa049e302>] ipv4_conntrack_in+0x22/0x30 [nf_conntrack_ipv4] > kernel: [<ffffffff815005ca>] nf_iterate+0xaa/0xc0 > kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40 > kernel: [<ffffffff81500664>] nf_hook_slow+0x84/0x140 > kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40 > kernel: [<ffffffff81509dd4>] ip_rcv+0x344/0x380 > > Hardware verifies IP & tcp/udp header checksum but does not provide payload > checksum, use CHECKSUM_UNNECESSARY. Set it only if its valid IP tcp/udp packet. > > Cc: Jiri Benc <jbenc@redhat.com> > Cc: Stefan Assmann <sassmann@redhat.com> > Reported-by: Sunil Choudhary <schoudha@redhat.com> > Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> > --- > drivers/net/ethernet/cisco/enic/enic_main.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c > index 868d0f6..705f334 100644 > --- a/drivers/net/ethernet/cisco/enic/enic_main.c > +++ b/drivers/net/ethernet/cisco/enic/enic_main.c > @@ -1060,10 +1060,14 @@ static void enic_rq_indicate_buf(struct vnic_rq *rq, > PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3); > } > > - if ((netdev->features & NETIF_F_RXCSUM) && !csum_not_calc) { > - skb->csum = htons(checksum); > - skb->ip_summed = CHECKSUM_COMPLETE; > - } > + /* Hardware does not provide whole packet checksum. It only > + * provides pseudo checksum. Since hw validates the packet > + * checksum but not provide us the checksum value. use > + * CHECSUM_UNNECESSARY. > + */ > + if ((netdev->features & NETIF_F_RXCSUM) && tcp_udp_csum_ok && > + ipv4_csum_ok) > + skb->ip_summed = CHECKSUM_UNNECESSARY; > > if (vlan_stripped) > __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tci); Hmm.. this looks like CHECKSUM_COMPLETE could be kept for tunneling cases. Check commit f8c6455bb04b944e ("net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE") for a start. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net] enic: fix rx skb checksum 2014-12-18 16:26 ` Eric Dumazet @ 2014-12-18 17:39 ` Jay Vosburgh 2014-12-18 17:49 ` David Miller 2014-12-19 10:07 ` Jiri Benc 0 siblings, 2 replies; 10+ messages in thread From: Jay Vosburgh @ 2014-12-18 17:39 UTC (permalink / raw) To: Eric Dumazet Cc: Govindarajulu Varadarajan, davem, netdev, ssujith, benve, Jiri Benc, Stefan Assmann Eric Dumazet <eric.dumazet@gmail.com> wrote: >On Thu, 2014-12-18 at 15:58 +0530, Govindarajulu Varadarajan wrote: >> Hardware always provides compliment of IP pseudo checksum. Stack expects >> whole packet checksum without pseudo checksum if CHECKSUM_COMPLETE is set. >> >> This causes checksum error in nf & ovs. >> >> kernel: qg-19546f09-f2: hw csum failure >> kernel: CPU: 9 PID: 0 Comm: swapper/9 Tainted: GF O-------------- 3.10.0-123.8.1.el7.x86_64 #1 >> kernel: Hardware name: Cisco Systems Inc UCSB-B200-M3/UCSB-B200-M3, BIOS B200M3.2.2.3.0.080820141339 08/08/2014 >> kernel: ffff881218f40000 df68243feb35e3a8 ffff881237a43ab8 ffffffff815e237b >> kernel: ffff881237a43ad0 ffffffff814cd4ca ffff8829ec71eb00 ffff881237a43af0 >> kernel: ffffffff814c6232 0000000000000286 ffff8829ec71eb00 ffff881237a43b00 >> kernel: Call Trace: >> kernel: <IRQ> [<ffffffff815e237b>] dump_stack+0x19/0x1b >> kernel: [<ffffffff814cd4ca>] netdev_rx_csum_fault+0x3a/0x40 >> kernel: [<ffffffff814c6232>] __skb_checksum_complete_head+0x62/0x70 >> kernel: [<ffffffff814c6251>] __skb_checksum_complete+0x11/0x20 >> kernel: [<ffffffff8155a20c>] nf_ip_checksum+0xcc/0x100 >> kernel: [<ffffffffa049edc7>] icmp_error+0x1f7/0x35c [nf_conntrack_ipv4] >> kernel: [<ffffffff814cf419>] ? netif_rx+0xb9/0x1d0 >> kernel: [<ffffffffa040eb7b>] ? internal_dev_recv+0xdb/0x130 [openvswitch] >> kernel: [<ffffffffa04c8330>] nf_conntrack_in+0xf0/0xa80 [nf_conntrack] >> kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40 >> kernel: [<ffffffffa049e302>] ipv4_conntrack_in+0x22/0x30 [nf_conntrack_ipv4] >> kernel: [<ffffffff815005ca>] nf_iterate+0xaa/0xc0 >> kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40 >> kernel: [<ffffffff81500664>] nf_hook_slow+0x84/0x140 >> kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40 >> kernel: [<ffffffff81509dd4>] ip_rcv+0x344/0x380 >> >> Hardware verifies IP & tcp/udp header checksum but does not provide payload >> checksum, use CHECKSUM_UNNECESSARY. Set it only if its valid IP tcp/udp packet. >> >> Cc: Jiri Benc <jbenc@redhat.com> >> Cc: Stefan Assmann <sassmann@redhat.com> >> Reported-by: Sunil Choudhary <schoudha@redhat.com> >> Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> >> --- >> drivers/net/ethernet/cisco/enic/enic_main.c | 12 ++++++++---- >> 1 file changed, 8 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c >> index 868d0f6..705f334 100644 >> --- a/drivers/net/ethernet/cisco/enic/enic_main.c >> +++ b/drivers/net/ethernet/cisco/enic/enic_main.c >> @@ -1060,10 +1060,14 @@ static void enic_rq_indicate_buf(struct vnic_rq *rq, >> PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3); >> } >> >> - if ((netdev->features & NETIF_F_RXCSUM) && !csum_not_calc) { >> - skb->csum = htons(checksum); >> - skb->ip_summed = CHECKSUM_COMPLETE; >> - } >> + /* Hardware does not provide whole packet checksum. It only >> + * provides pseudo checksum. Since hw validates the packet >> + * checksum but not provide us the checksum value. use >> + * CHECSUM_UNNECESSARY. >> + */ >> + if ((netdev->features & NETIF_F_RXCSUM) && tcp_udp_csum_ok && >> + ipv4_csum_ok) >> + skb->ip_summed = CHECKSUM_UNNECESSARY; >> >> if (vlan_stripped) >> __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tci); > >Hmm.. this looks like CHECKSUM_COMPLETE could be kept for tunneling >cases. > >Check commit f8c6455bb04b944e ("net/mlx4_en: Extend checksum offloading >by CHECKSUM COMPLETE") for a start. I've actually been looking into this "hw csum failure" (as it appears with OVS and VXLAN) the last couple of days, and, at least on the sky2 hardware I have, the problem doesn't appear to be the hardware's CHECKSUM_COMPLETE checksumming. Instead, it appears that when the encapsulated packet is decapsulated and forwarded to the guest or container, the checksum isn't updated for the encapsulated ethernet header. eth_type_trans does a pull for the ethernet header, but skb->csum isn't updated. I'm testing this patch, and it resolves the problem for me, but I'm not yet entirely sure whether breaks something else: diff --git a/net/core/dev.c b/net/core/dev.c index f411c28..df755e5 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1694,6 +1694,7 @@ int __dev_forward_skb(struct net_device *dev, struct sk_buff *skb) skb_scrub_packet(skb, true); skb->protocol = eth_type_trans(skb, dev); + skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN); return 0; } I'd very much appreciate comments from someone who knows the checksum logic better than I do as to whether this is a reasonable thing to do. I've also not tested it on enic hardware. Govindarajulu, would you be able to test this against the unmodified enic driver and see if it resolves the problem for you? -J --- -Jay Vosburgh, jay.vosburgh@canonical.com ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH net] enic: fix rx skb checksum 2014-12-18 17:39 ` Jay Vosburgh @ 2014-12-18 17:49 ` David Miller 2014-12-19 10:07 ` Jiri Benc 1 sibling, 0 replies; 10+ messages in thread From: David Miller @ 2014-12-18 17:49 UTC (permalink / raw) To: jay.vosburgh Cc: eric.dumazet, _govind, netdev, ssujith, benve, jbenc, sassmann From: Jay Vosburgh <jay.vosburgh@canonical.com> Date: Thu, 18 Dec 2014 09:39:27 -0800 > @@ -1694,6 +1694,7 @@ int __dev_forward_skb(struct net_device *dev, struct sk_buff *skb) > > skb_scrub_packet(skb, true); > skb->protocol = eth_type_trans(skb, dev); > + skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN); > > return 0; > } > > I'd very much appreciate comments from someone who knows the > checksum logic better than I do as to whether this is a reasonable thing > to do. This looks quite reasonable to me. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net] enic: fix rx skb checksum 2014-12-18 17:39 ` Jay Vosburgh 2014-12-18 17:49 ` David Miller @ 2014-12-19 10:07 ` Jiri Benc 2014-12-19 10:52 ` Govindarajulu Varadarajan 1 sibling, 1 reply; 10+ messages in thread From: Jiri Benc @ 2014-12-19 10:07 UTC (permalink / raw) To: Jay Vosburgh Cc: Eric Dumazet, Govindarajulu Varadarajan, davem, netdev, ssujith, benve, Stefan Assmann On Thu, 18 Dec 2014 09:39:27 -0800, Jay Vosburgh wrote: > I've actually been looking into this "hw csum failure" (as it > appears with OVS and VXLAN) the last couple of days, and, at least on > the sky2 hardware I have, the problem doesn't appear to be the > hardware's CHECKSUM_COMPLETE checksumming. With the enic driver, the problem _is_ the hardware checksumming. While debugging the "hw csum failure" messages, I verified the checksum returned by the hardware directly in the driver (in enic_rq_indicate_buf). It appears that for some packets (most notably ICMP ones), the hardware returns 0xffff. I did not see any other wrong value, in other words, the returned checksum was either correct, or 0xffff. I have no idea whether the hardware verified the checksum for the 0xffff case and is just not returning the checksum or whether such packets come completely unverified. As for Govindarajulu's patch, I believe we can do better. First, its description does not match what I'm seeing (I see correct checksum provided in most cases) and second, the driver should provide CHECKSUM_COMPLETE whenever possible; and it seems to be possible. > I've also not tested it on enic hardware. Govindarajulu, would > you be able to test this against the unmodified enic driver and see if > it resolves the problem for you? I'm quite sure it does not, the skb->csum field is incorrect even before the skb enters the stack. Jiri -- Jiri Benc ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net] enic: fix rx skb checksum 2014-12-19 10:07 ` Jiri Benc @ 2014-12-19 10:52 ` Govindarajulu Varadarajan 2014-12-19 11:07 ` Jiri Benc 0 siblings, 1 reply; 10+ messages in thread From: Govindarajulu Varadarajan @ 2014-12-19 10:52 UTC (permalink / raw) To: Jiri Benc Cc: Jay Vosburgh, Eric Dumazet, Govindarajulu Varadarajan, davem, netdev, ssujith, benve, Stefan Assmann On Fri, 19 Dec 2014, Jiri Benc wrote: > On Thu, 18 Dec 2014 09:39:27 -0800, Jay Vosburgh wrote: >> I've actually been looking into this "hw csum failure" (as it >> appears with OVS and VXLAN) the last couple of days, and, at least on >> the sky2 hardware I have, the problem doesn't appear to be the >> hardware's CHECKSUM_COMPLETE checksumming. > > With the enic driver, the problem _is_ the hardware checksumming. > > While debugging the "hw csum failure" messages, I verified the checksum > returned by the hardware directly in the driver (in > enic_rq_indicate_buf). It appears that for some packets (most notably > ICMP ones), the hardware returns 0xffff. I did not see any other wrong > value, in other words, the returned checksum was either correct, or > 0xffff. > Hardware returns 0xffff for non tcp/udp packets. For tcp/udp packet it returns pseudo checksum. Not the _whole_ pkt checksum. With the following changes, I see this output: diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c index e3dc629..0f2be67 100644 --- a/drivers/net/ethernet/cisco/enic/enic_main.c +++ b/drivers/net/ethernet/cisco/enic/enic_main.c @@ -1092,6 +1092,24 @@ static void enic_rq_indicate_buf(struct vnic_rq *rq, } if ((netdev->features & NETIF_F_RXCSUM) && !csum_not_calc) { + if (printk_ratelimit()) { + const struct iphdr *iph = (struct iphdr *)skb->data; + __u16 length_for_csum = 0; + __wsum pseudo_csum = 0; + + length_for_csum = (be16_to_cpu(iph->tot_len) - (iph->ihl << 2)); + pseudo_csum = csum_tcpudp_nofold(iph->saddr, + iph->daddr, + length_for_csum, + iph->protocol, 0); + + pr_info("saddr=%x, daddr=%x, length=%d, proto=%d\n", + iph->saddr, iph->daddr, length_for_csum, iph->protocol); + pr_info("hw_checksum = %x, pseudo_checksum_32=%x, pseudo_checksum_fold=%x\n", + htons(checksum), + pseudo_csum, + csum_fold(pseudo_csum)); + } skb->csum = htons(checksum); skb->ip_summed = CHECKSUM_COMPLETE; } Output: Dec 18 11:13:18 a163 kernel: enic: saddr=96d8690a, daddr=a3ba6a0a, length=40, proto=6 Dec 18 11:13:18 a163 kernel: enic: hw_checksum = c457, pseudo_checksum_32=3a930115, pseudo_checksum_fold=c457 Dec 18 11:13:18 a163 kernel: enic: saddr=a37410a, daddr=a3ba6a0a, length=32, proto=6 Dec 18 11:13:18 a163 kernel: enic: hw_checksum = 80f9, pseudo_checksum_32=adf1d114, pseudo_checksum_fold=80f9 Dec 18 11:13:18 a163 kernel: enic: saddr=a37410a, daddr=a3ba6a0a, length=32, proto=6 Dec 18 11:13:18 a163 kernel: enic: hw_checksum = 80f9, pseudo_checksum_32=adf1d114, pseudo_checksum_fold=80f9 Clearly hw is returning folded pseudo checksum. > I have no idea whether the hardware verified the checksum for the > 0xffff case and is just not returning the checksum or whether such > packets come completely unverified. > Yes, hardware verifies the checksum and sets tcp_udp_csum_ok flag to 1. If pkt verification fails or pkt is not tcp/udp, tcp_udp_csum_ok is 0. And we send the pkt to stack with CHECKSUM_NONE. > As for Govindarajulu's patch, I believe we can do better. First, its > description does not match what I'm seeing (I see correct checksum > provided in most cases) and second, the driver should provide > CHECKSUM_COMPLETE whenever possible; and it seems to be possible. > Driver should use CHECKSUM_COMPLETE only if it can produce _whole_ pkt checksum. as described in include/linux/skbuff.h:75 * CHECKSUM_COMPLETE: * * This is the most generic way. The device supplied checksum of the _whole_ * packet as seen by netif_rx() and fills out in skb->csum. Meaning, the * hardware doesn't need to parse L3/L4 headers to implement this. * * Note: Even if device supports only some protocols, but is able to produce * skb->csum, it MUST use CHECKSUM_COMPLETE, not CHECKSUM_UNNECESSARY. Since enic hw verifies checksum but does not provide us whole pkt checksum, it should use CHECKSUM_UNNECESSARY. as described in include/linux/skbuff.h:47 * CHECKSUM_UNNECESSARY: * * The hardware you're dealing with doesn't calculate the full checksum * (as in CHECKSUM_COMPLETE), but it does parse headers and verify checksums * for specific protocols. For such packets it will set CHECKSUM_UNNECESSARY * if their checksums are okay. skb->csum is still undefined in this case Am I correct? >> I've also not tested it on enic hardware. Govindarajulu, would >> you be able to test this against the unmodified enic driver and see if >> it resolves the problem for you? > I think Jay Vosburgh's patch and my patch are addressing two different issues. I am trying to fix "Do not set CHECKSUM_COMPLETE, when driver does not have checksum of whole pkt." Is my understanding correct? Thanks > I'm quite sure it does not, the skb->csum field is incorrect even > before the skb enters the stack. > > Jiri > > -- > Jiri Benc > ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH net] enic: fix rx skb checksum 2014-12-19 10:52 ` Govindarajulu Varadarajan @ 2014-12-19 11:07 ` Jiri Benc 0 siblings, 0 replies; 10+ messages in thread From: Jiri Benc @ 2014-12-19 11:07 UTC (permalink / raw) To: Govindarajulu Varadarajan Cc: Jay Vosburgh, Eric Dumazet, davem, netdev, ssujith, benve, Stefan Assmann On Fri, 19 Dec 2014 16:22:34 +0530 (IST), Govindarajulu Varadarajan wrote: > Hardware returns 0xffff for non tcp/udp packets. That explains that I saw that with ICMP packets. > For tcp/udp packet it returns > pseudo checksum. Not the _whole_ pkt checksum. I see. I didn't get this from your patch, although you wrote it there, sorry. And I didn't dig that deep while debugging, I was satisfied with seeing 0xffff for the packets that caused problems :-) > Dec 18 11:13:18 a163 kernel: enic: saddr=96d8690a, daddr=a3ba6a0a, length=40, proto=6 > Dec 18 11:13:18 a163 kernel: enic: hw_checksum = c457, pseudo_checksum_32=3a930115, pseudo_checksum_fold=c457 > > Dec 18 11:13:18 a163 kernel: enic: saddr=a37410a, daddr=a3ba6a0a, length=32, proto=6 > Dec 18 11:13:18 a163 kernel: enic: hw_checksum = 80f9, pseudo_checksum_32=adf1d114, pseudo_checksum_fold=80f9 > > Dec 18 11:13:18 a163 kernel: enic: saddr=a37410a, daddr=a3ba6a0a, length=32, proto=6 > Dec 18 11:13:18 a163 kernel: enic: hw_checksum = 80f9, pseudo_checksum_32=adf1d114, pseudo_checksum_fold=80f9 > > Clearly hw is returning folded pseudo checksum. Indeed. > > I have no idea whether the hardware verified the checksum for the > > 0xffff case and is just not returning the checksum or whether such > > packets come completely unverified. > > Yes, hardware verifies the checksum and sets tcp_udp_csum_ok flag to 1. > If pkt verification fails or pkt is not tcp/udp, tcp_udp_csum_ok is 0. And we > send the pkt to stack with CHECKSUM_NONE. Thanks for the confirmation. > Driver should use CHECKSUM_COMPLETE only if it can produce _whole_ pkt checksum. > as described in include/linux/skbuff.h:75 Sure. I just misunderstood what the hardware provides. > I am trying to fix "Do not set CHECKSUM_COMPLETE, when driver does not have > checksum of whole pkt." > > Is my understanding correct? With the information you've just written (thanks for your patience), I think your patch is correct. Thanks a lot! Jiri -- Jiri Benc ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net] enic: fix rx skb checksum 2014-12-18 10:28 [PATCH net] enic: fix rx skb checksum Govindarajulu Varadarajan 2014-12-18 13:58 ` Sergei Shtylyov 2014-12-18 16:26 ` Eric Dumazet @ 2014-12-19 11:11 ` Jiri Benc 2014-12-19 20:45 ` David Miller 2 siblings, 1 reply; 10+ messages in thread From: Jiri Benc @ 2014-12-19 11:11 UTC (permalink / raw) To: Govindarajulu Varadarajan; +Cc: davem, netdev, ssujith, benve, Stefan Assmann On Thu, 18 Dec 2014 15:58:42 +0530, Govindarajulu Varadarajan wrote: > Hardware always provides compliment of IP pseudo checksum. Stack expects > whole packet checksum without pseudo checksum if CHECKSUM_COMPLETE is set. > > This causes checksum error in nf & ovs. > > [...] > > Hardware verifies IP & tcp/udp header checksum but does not provide payload > checksum, use CHECKSUM_UNNECESSARY. Set it only if its valid IP tcp/udp packet. > > Cc: Jiri Benc <jbenc@redhat.com> > Cc: Stefan Assmann <sassmann@redhat.com> > Reported-by: Sunil Choudhary <schoudha@redhat.com> > Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Reviewed-by: Jiri Benc <jbenc@redhat.com> -- Jiri Benc ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net] enic: fix rx skb checksum 2014-12-19 11:11 ` Jiri Benc @ 2014-12-19 20:45 ` David Miller 0 siblings, 0 replies; 10+ messages in thread From: David Miller @ 2014-12-19 20:45 UTC (permalink / raw) To: jbenc; +Cc: _govind, netdev, ssujith, benve, sassmann From: Jiri Benc <jbenc@redhat.com> Date: Fri, 19 Dec 2014 12:11:44 +0100 > On Thu, 18 Dec 2014 15:58:42 +0530, Govindarajulu Varadarajan wrote: >> Hardware always provides compliment of IP pseudo checksum. Stack expects >> whole packet checksum without pseudo checksum if CHECKSUM_COMPLETE is set. >> >> This causes checksum error in nf & ovs. >> >> [...] >> >> Hardware verifies IP & tcp/udp header checksum but does not provide payload >> checksum, use CHECKSUM_UNNECESSARY. Set it only if its valid IP tcp/udp packet. >> >> Cc: Jiri Benc <jbenc@redhat.com> >> Cc: Stefan Assmann <sassmann@redhat.com> >> Reported-by: Sunil Choudhary <schoudha@redhat.com> >> Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> > > Reviewed-by: Jiri Benc <jbenc@redhat.com> Applied, thanks. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2014-12-19 20:46 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-12-18 10:28 [PATCH net] enic: fix rx skb checksum Govindarajulu Varadarajan 2014-12-18 13:58 ` Sergei Shtylyov 2014-12-18 16:26 ` Eric Dumazet 2014-12-18 17:39 ` Jay Vosburgh 2014-12-18 17:49 ` David Miller 2014-12-19 10:07 ` Jiri Benc 2014-12-19 10:52 ` Govindarajulu Varadarajan 2014-12-19 11:07 ` Jiri Benc 2014-12-19 11:11 ` Jiri Benc 2014-12-19 20:45 ` David Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).