* [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum
@ 2016-06-27 18:22 Cong Wang
2016-06-27 21:08 ` Or Gerlitz
0 siblings, 1 reply; 13+ messages in thread
From: Cong Wang @ 2016-06-27 18:22 UTC (permalink / raw)
To: netdev; +Cc: Cong Wang, Shani Michaeli, Tariq Toukan, Yishai Hadas
The stack doesn't trust the complete csum by hardware
even when it is correct. In the case we fix csum by ourself
probably it is safe to just mark it as completed by software.
This should shut up a kernel warning from netdev_rx_csum_fault()
with mlx4 driver for ICMP packets.
Fixes: f8c6455bb04b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
Cc: Shani Michaeli <shanim@mellanox.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index c1b3a9c..b44c434 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -732,6 +732,7 @@ static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va,
if (get_fixed_ipv6_csum(hw_checksum, skb, hdr))
return -1;
#endif
+ skb->csum_complete_sw = 1;
return 0;
}
--
2.1.0
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum 2016-06-27 18:22 [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum Cong Wang @ 2016-06-27 21:08 ` Or Gerlitz 2016-06-27 21:44 ` Cong Wang 0 siblings, 1 reply; 13+ messages in thread From: Or Gerlitz @ 2016-06-27 21:08 UTC (permalink / raw) To: Cong Wang, Tom Herbert Cc: Linux Netdev List, Shani Michaeli, Tariq Toukan, Yishai Hadas, Carol L Soto On Mon, Jun 27, 2016 at 9:22 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: > The stack doesn't trust the complete csum by hardware > even when it is correct. Can you explain that a little further? > In the case we fix csum by ourself > probably it is safe to just mark it as completed by software. > This should shut up a kernel warning from netdev_rx_csum_fault() > with mlx4 driver for ICMP packets. can you point/paste the exact warning and how to reproduce that? is that as simple as running ping and/or ping6? > Fixes: f8c6455bb04b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE') > Cc: Shani Michaeli <shanim@mellanox.com> > Cc: Tariq Toukan <tariqt@mellanox.com> > Cc: Yishai Hadas <yishaih@mellanox.com> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> > --- > drivers/net/ethernet/mellanox/mlx4/en_rx.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c > index c1b3a9c..b44c434 100644 > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c > @@ -732,6 +732,7 @@ static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va, > if (get_fixed_ipv6_csum(hw_checksum, skb, hdr)) > return -1; > #endif > + skb->csum_complete_sw = 1; > return 0; > } > > -- > 2.1.0 > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum 2016-06-27 21:08 ` Or Gerlitz @ 2016-06-27 21:44 ` Cong Wang 2016-06-27 21:47 ` Tom Herbert 2016-06-28 14:30 ` Or Gerlitz 0 siblings, 2 replies; 13+ messages in thread From: Cong Wang @ 2016-06-27 21:44 UTC (permalink / raw) To: Or Gerlitz Cc: Tom Herbert, Linux Netdev List, Shani Michaeli, Tariq Toukan, Yishai Hadas, Carol L Soto On Mon, Jun 27, 2016 at 2:08 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote: > On Mon, Jun 27, 2016 at 9:22 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >> The stack doesn't trust the complete csum by hardware >> even when it is correct. > > Can you explain that a little further? Sure, here is the code in __skb_checksum_complete(): /* skb->csum holds pseudo checksum */ sum = csum_fold(csum_add(skb->csum, csum)); if (likely(!sum)) { if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) && !skb->csum_complete_sw) netdev_rx_csum_fault(skb->dev); } So when sum == 0, it means the checksum is correct. And we already set ->ip_summed to CHECKSUM_COMPLETE after check_csum(), and ->csum_complete_sw is initialized to 0 when we allocate the skb. This is why we trigger netdev_rx_csum_fault(). > >> In the case we fix csum by ourself >> probably it is safe to just mark it as completed by software. > >> This should shut up a kernel warning from netdev_rx_csum_fault() >> with mlx4 driver for ICMP packets. > > can you point/paste the exact warning and how to reproduce that? is > that as simple as running ping and/or ping6? Yes, ping is enough to reproduce it every time. The warning is below: [ 8693.680997] eth0: hw csum failure [ 8693.681003] CPU: 5 PID: 34 Comm: ksoftirqd/5 Not tainted 4.1.20-t6.el5 #1 [ 8693.681005] Hardware name: SYNNEX HYVE-ZEUS/X9DRD-iF, BIOS 3.0.4 12/06/2013 [ 8693.681008] 0000000000000000 ffff88085c15fae8 ffffffff81502872 ffff881051397800 [ 8693.681011] ffff881054c08b01 ffff88085c15fb08 ffffffff814569c5 0000000000000000 [ 8693.681014] ffff8808572d7200 ffff88085c15fb38 ffffffff81450738 ffff8808572d7200 [ 8693.681017] Call Trace: [ 8693.681025] [<ffffffff81502872>] dump_stack+0x4d/0x63 [ 8693.681030] [<ffffffff814569c5>] netdev_rx_csum_fault+0x38/0x3c [ 8693.681033] [<ffffffff81450738>] __skb_checksum_complete+0x6e/0xb6 [ 8693.681036] [<ffffffff814b65d8>] icmp_rcv+0x17a/0x32f [ 8693.681040] [<ffffffff8148b7f0>] ip_local_deliver_finish+0xd1/0x153 [ 8693.681042] [<ffffffff8148b9f7>] ip_local_deliver+0x8d/0x94 [ 8693.681045] [<ffffffff8148b71f>] ? xfrm4_policy_check.constprop.6+0x55/0x55 [ 8693.681048] [<ffffffff8148b651>] ip_rcv_finish+0x289/0x2cc [ 8693.681050] [<ffffffff8148bc7b>] ip_rcv+0x27d/0x30a [ 8693.681053] [<ffffffff81457e01>] __netif_receive_skb_core+0x3f2/0x483 [ 8693.681056] [<ffffffff81457eaa>] __netif_receive_skb+0x18/0x5a [ 8693.681058] [<ffffffff81457f7c>] process_backlog+0x90/0x10c [ 8693.681061] [<ffffffff814596e0>] net_rx_action+0x101/0x2aa [ 8693.681066] [<ffffffff8106c39f>] __do_softirq+0x10c/0x26d [ 8693.681068] [<ffffffff8106c51a>] run_ksoftirqd+0x1a/0x2f [ 8693.681071] [<ffffffff81083675>] smpboot_thread_fn+0x149/0x167 [ 8693.681074] [<ffffffff8108352c>] ? sort_range+0x24/0x24 [ 8693.681076] [<ffffffff8108352c>] ? sort_range+0x24/0x24 [ 8693.681080] [<ffffffff81080d9c>] kthread+0xae/0xb6 [ 8693.681082] [<ffffffff81080303>] ? add_sysfs_param.isra.4+0xe1/0x18c [ 8693.681085] [<ffffffff81080cee>] ? __kthread_parkme+0x61/0x61 [ 8693.681088] [<ffffffff81508152>] ret_from_fork+0x42/0x70 [ 8693.681090] [<ffffffff81080cee>] ? __kthread_parkme+0x61/0x61 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum 2016-06-27 21:44 ` Cong Wang @ 2016-06-27 21:47 ` Tom Herbert 2016-06-27 21:49 ` Cong Wang 2016-06-28 14:30 ` Or Gerlitz 1 sibling, 1 reply; 13+ messages in thread From: Tom Herbert @ 2016-06-27 21:47 UTC (permalink / raw) To: Cong Wang Cc: Or Gerlitz, Linux Netdev List, Shani Michaeli, Tariq Toukan, Yishai Hadas, Carol L Soto On Mon, Jun 27, 2016 at 2:44 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: > On Mon, Jun 27, 2016 at 2:08 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote: >> On Mon, Jun 27, 2016 at 9:22 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >>> The stack doesn't trust the complete csum by hardware >>> even when it is correct. >> >> Can you explain that a little further? > > Sure, here is the code in __skb_checksum_complete(): > > /* skb->csum holds pseudo checksum */ > sum = csum_fold(csum_add(skb->csum, csum)); > if (likely(!sum)) { > if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) && > !skb->csum_complete_sw) > netdev_rx_csum_fault(skb->dev); > } > > So when sum == 0, it means the checksum is correct. And > we already set ->ip_summed to CHECKSUM_COMPLETE > after check_csum(), and ->csum_complete_sw is initialized > to 0 when we allocate the skb. This is why we trigger > netdev_rx_csum_fault(). > Yes, but this also means that the driver gave the stack a checksum complete value that was incorrect. That's an error. Tom > >> >>> In the case we fix csum by ourself >>> probably it is safe to just mark it as completed by software. >> >>> This should shut up a kernel warning from netdev_rx_csum_fault() >>> with mlx4 driver for ICMP packets. >> >> can you point/paste the exact warning and how to reproduce that? is >> that as simple as running ping and/or ping6? > > Yes, ping is enough to reproduce it every time. > > The warning is below: > > [ 8693.680997] eth0: hw csum failure > [ 8693.681003] CPU: 5 PID: 34 Comm: ksoftirqd/5 Not tainted 4.1.20-t6.el5 #1 > [ 8693.681005] Hardware name: SYNNEX HYVE-ZEUS/X9DRD-iF, BIOS 3.0.4 12/06/2013 > [ 8693.681008] 0000000000000000 ffff88085c15fae8 ffffffff81502872 > ffff881051397800 > [ 8693.681011] ffff881054c08b01 ffff88085c15fb08 ffffffff814569c5 > 0000000000000000 > [ 8693.681014] ffff8808572d7200 ffff88085c15fb38 ffffffff81450738 > ffff8808572d7200 > [ 8693.681017] Call Trace: > [ 8693.681025] [<ffffffff81502872>] dump_stack+0x4d/0x63 > [ 8693.681030] [<ffffffff814569c5>] netdev_rx_csum_fault+0x38/0x3c > [ 8693.681033] [<ffffffff81450738>] __skb_checksum_complete+0x6e/0xb6 > [ 8693.681036] [<ffffffff814b65d8>] icmp_rcv+0x17a/0x32f > [ 8693.681040] [<ffffffff8148b7f0>] ip_local_deliver_finish+0xd1/0x153 > [ 8693.681042] [<ffffffff8148b9f7>] ip_local_deliver+0x8d/0x94 > [ 8693.681045] [<ffffffff8148b71f>] ? xfrm4_policy_check.constprop.6+0x55/0x55 > [ 8693.681048] [<ffffffff8148b651>] ip_rcv_finish+0x289/0x2cc > [ 8693.681050] [<ffffffff8148bc7b>] ip_rcv+0x27d/0x30a > [ 8693.681053] [<ffffffff81457e01>] __netif_receive_skb_core+0x3f2/0x483 > [ 8693.681056] [<ffffffff81457eaa>] __netif_receive_skb+0x18/0x5a > [ 8693.681058] [<ffffffff81457f7c>] process_backlog+0x90/0x10c > [ 8693.681061] [<ffffffff814596e0>] net_rx_action+0x101/0x2aa > [ 8693.681066] [<ffffffff8106c39f>] __do_softirq+0x10c/0x26d > [ 8693.681068] [<ffffffff8106c51a>] run_ksoftirqd+0x1a/0x2f > [ 8693.681071] [<ffffffff81083675>] smpboot_thread_fn+0x149/0x167 > [ 8693.681074] [<ffffffff8108352c>] ? sort_range+0x24/0x24 > [ 8693.681076] [<ffffffff8108352c>] ? sort_range+0x24/0x24 > [ 8693.681080] [<ffffffff81080d9c>] kthread+0xae/0xb6 > [ 8693.681082] [<ffffffff81080303>] ? add_sysfs_param.isra.4+0xe1/0x18c > [ 8693.681085] [<ffffffff81080cee>] ? __kthread_parkme+0x61/0x61 > [ 8693.681088] [<ffffffff81508152>] ret_from_fork+0x42/0x70 > [ 8693.681090] [<ffffffff81080cee>] ? __kthread_parkme+0x61/0x61 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum 2016-06-27 21:47 ` Tom Herbert @ 2016-06-27 21:49 ` Cong Wang 2016-06-27 22:04 ` Tom Herbert 0 siblings, 1 reply; 13+ messages in thread From: Cong Wang @ 2016-06-27 21:49 UTC (permalink / raw) To: Tom Herbert Cc: Or Gerlitz, Linux Netdev List, Shani Michaeli, Tariq Toukan, Yishai Hadas, Carol L Soto On Mon, Jun 27, 2016 at 2:47 PM, Tom Herbert <tom@herbertland.com> wrote: > On Mon, Jun 27, 2016 at 2:44 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >> On Mon, Jun 27, 2016 at 2:08 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote: >>> On Mon, Jun 27, 2016 at 9:22 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >>>> The stack doesn't trust the complete csum by hardware >>>> even when it is correct. >>> >>> Can you explain that a little further? >> >> Sure, here is the code in __skb_checksum_complete(): >> >> /* skb->csum holds pseudo checksum */ >> sum = csum_fold(csum_add(skb->csum, csum)); >> if (likely(!sum)) { >> if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) && >> !skb->csum_complete_sw) >> netdev_rx_csum_fault(skb->dev); >> } >> >> So when sum == 0, it means the checksum is correct. And >> we already set ->ip_summed to CHECKSUM_COMPLETE >> after check_csum(), and ->csum_complete_sw is initialized >> to 0 when we allocate the skb. This is why we trigger >> netdev_rx_csum_fault(). >> > Yes, but this also means that the driver gave the stack a checksum > complete value that was incorrect. That's an error. That is the whole purpose of commit f8c6455bb04b944edb69e, isn't it? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum 2016-06-27 21:49 ` Cong Wang @ 2016-06-27 22:04 ` Tom Herbert 2016-06-27 22:53 ` Cong Wang 0 siblings, 1 reply; 13+ messages in thread From: Tom Herbert @ 2016-06-27 22:04 UTC (permalink / raw) To: Cong Wang Cc: Or Gerlitz, Linux Netdev List, Shani Michaeli, Tariq Toukan, Yishai Hadas, Carol L Soto On Mon, Jun 27, 2016 at 2:49 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: > On Mon, Jun 27, 2016 at 2:47 PM, Tom Herbert <tom@herbertland.com> wrote: >> On Mon, Jun 27, 2016 at 2:44 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >>> On Mon, Jun 27, 2016 at 2:08 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote: >>>> On Mon, Jun 27, 2016 at 9:22 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >>>>> The stack doesn't trust the complete csum by hardware >>>>> even when it is correct. >>>> >>>> Can you explain that a little further? >>> >>> Sure, here is the code in __skb_checksum_complete(): >>> >>> /* skb->csum holds pseudo checksum */ >>> sum = csum_fold(csum_add(skb->csum, csum)); >>> if (likely(!sum)) { >>> if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) && >>> !skb->csum_complete_sw) >>> netdev_rx_csum_fault(skb->dev); >>> } >>> >>> So when sum == 0, it means the checksum is correct. And >>> we already set ->ip_summed to CHECKSUM_COMPLETE >>> after check_csum(), and ->csum_complete_sw is initialized >>> to 0 when we allocate the skb. This is why we trigger >>> netdev_rx_csum_fault(). >>> >> Yes, but this also means that the driver gave the stack a checksum >> complete value that was incorrect. That's an error. > > That is the whole purpose of commit f8c6455bb04b944edb69e, > isn't it? No. Unless you've uncovered some other bug, what is probably happening is that driver receives a packet with a checksum complete value. It records the value in the skbuff and marks it as CHECKSUM_COMPLETE. Subsequently, the stack tries to validate a transport layer checksum, and the validation fails (checksum does not sum to zero). The stack will then call __skb_checksum_complete from __skb_checksum_validate_complete. In this case the stack computes that transport checksum by hand and sees that transport checksum is valid-- so that means that the original value in checksum complete was not correct, it is not set to the computed checksum of the whole packet. This is an important error because it catches issues where checksum is not correctly being pulled up. Tom ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum 2016-06-27 22:04 ` Tom Herbert @ 2016-06-27 22:53 ` Cong Wang 2016-06-27 23:16 ` Tom Herbert 0 siblings, 1 reply; 13+ messages in thread From: Cong Wang @ 2016-06-27 22:53 UTC (permalink / raw) To: Tom Herbert Cc: Or Gerlitz, Linux Netdev List, Shani Michaeli, Tariq Toukan, Yishai Hadas, Carol L Soto On Mon, Jun 27, 2016 at 3:04 PM, Tom Herbert <tom@herbertland.com> wrote: > On Mon, Jun 27, 2016 at 2:49 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >> On Mon, Jun 27, 2016 at 2:47 PM, Tom Herbert <tom@herbertland.com> wrote: >>> On Mon, Jun 27, 2016 at 2:44 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >>>> On Mon, Jun 27, 2016 at 2:08 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote: >>>>> On Mon, Jun 27, 2016 at 9:22 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >>>>>> The stack doesn't trust the complete csum by hardware >>>>>> even when it is correct. >>>>> >>>>> Can you explain that a little further? >>>> >>>> Sure, here is the code in __skb_checksum_complete(): >>>> >>>> /* skb->csum holds pseudo checksum */ >>>> sum = csum_fold(csum_add(skb->csum, csum)); >>>> if (likely(!sum)) { >>>> if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) && >>>> !skb->csum_complete_sw) >>>> netdev_rx_csum_fault(skb->dev); >>>> } >>>> >>>> So when sum == 0, it means the checksum is correct. And >>>> we already set ->ip_summed to CHECKSUM_COMPLETE >>>> after check_csum(), and ->csum_complete_sw is initialized >>>> to 0 when we allocate the skb. This is why we trigger >>>> netdev_rx_csum_fault(). >>>> >>> Yes, but this also means that the driver gave the stack a checksum >>> complete value that was incorrect. That's an error. >> >> That is the whole purpose of commit f8c6455bb04b944edb69e, >> isn't it? > > No. Unless you've uncovered some other bug, what is probably happening > is that driver receives a packet with a checksum complete value. It > records the value in the skbuff and marks it as CHECKSUM_COMPLETE. > Subsequently, the stack tries to validate a transport layer checksum, > and the validation fails (checksum does not sum to zero). The stack > will then call __skb_checksum_complete from > __skb_checksum_validate_complete. In this case the stack computes that > transport checksum by hand and sees that transport checksum is valid-- > so that means that the original value in checksum complete was not > correct, it is not set to the computed checksum of the whole packet. > This is an important error because it catches issues where checksum is > not correctly being pulled up. I see, the comments in mlx4 driver said: /* Although the stack expects checksum which doesn't include the pseudo * header, the HW adds it. To address that, we are subtracting the pseudo * header checksum from the checksum value provided by the HW. */ which seems imply it calculates a correct checksum for the whole packet here, but the stack disagrees. Therefore skb->csum is not still not what the stack expects. Given skb_checksum_simple_validate() always pass a null pseudo header, it looks like either the fix-up for pseudo header is not needed at all for ICMP case, OR we need to call skb_checksum_validate() for ICMPv4 case. Hmm... ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum 2016-06-27 22:53 ` Cong Wang @ 2016-06-27 23:16 ` Tom Herbert 0 siblings, 0 replies; 13+ messages in thread From: Tom Herbert @ 2016-06-27 23:16 UTC (permalink / raw) To: Cong Wang Cc: Or Gerlitz, Linux Netdev List, Shani Michaeli, Tariq Toukan, Yishai Hadas, Carol L Soto On Mon, Jun 27, 2016 at 3:53 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: > On Mon, Jun 27, 2016 at 3:04 PM, Tom Herbert <tom@herbertland.com> wrote: >> On Mon, Jun 27, 2016 at 2:49 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >>> On Mon, Jun 27, 2016 at 2:47 PM, Tom Herbert <tom@herbertland.com> wrote: >>>> On Mon, Jun 27, 2016 at 2:44 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >>>>> On Mon, Jun 27, 2016 at 2:08 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote: >>>>>> On Mon, Jun 27, 2016 at 9:22 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >>>>>>> The stack doesn't trust the complete csum by hardware >>>>>>> even when it is correct. >>>>>> >>>>>> Can you explain that a little further? >>>>> >>>>> Sure, here is the code in __skb_checksum_complete(): >>>>> >>>>> /* skb->csum holds pseudo checksum */ >>>>> sum = csum_fold(csum_add(skb->csum, csum)); >>>>> if (likely(!sum)) { >>>>> if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) && >>>>> !skb->csum_complete_sw) >>>>> netdev_rx_csum_fault(skb->dev); >>>>> } >>>>> >>>>> So when sum == 0, it means the checksum is correct. And >>>>> we already set ->ip_summed to CHECKSUM_COMPLETE >>>>> after check_csum(), and ->csum_complete_sw is initialized >>>>> to 0 when we allocate the skb. This is why we trigger >>>>> netdev_rx_csum_fault(). >>>>> >>>> Yes, but this also means that the driver gave the stack a checksum >>>> complete value that was incorrect. That's an error. >>> >>> That is the whole purpose of commit f8c6455bb04b944edb69e, >>> isn't it? >> >> No. Unless you've uncovered some other bug, what is probably happening >> is that driver receives a packet with a checksum complete value. It >> records the value in the skbuff and marks it as CHECKSUM_COMPLETE. >> Subsequently, the stack tries to validate a transport layer checksum, >> and the validation fails (checksum does not sum to zero). The stack >> will then call __skb_checksum_complete from >> __skb_checksum_validate_complete. In this case the stack computes that >> transport checksum by hand and sees that transport checksum is valid-- >> so that means that the original value in checksum complete was not >> correct, it is not set to the computed checksum of the whole packet. >> This is an important error because it catches issues where checksum is >> not correctly being pulled up. > > I see, the comments in mlx4 driver said: > > /* Although the stack expects checksum which doesn't include the pseudo > * header, the HW adds it. To address that, we are subtracting the pseudo > * header checksum from the checksum value provided by the HW. > */ > > which seems imply it calculates a correct checksum for the whole > packet here, but the stack disagrees. Therefore skb->csum is not > still not what the stack expects. > Right, skb->csum is not what the stack expects. When it does the computation over the same data it arrives at a different value than what the driver sets. With this error pops that means the checksum in the packet is correct, but driver or something in the stack messed up skb->csum. > Given skb_checksum_simple_validate() always pass a null pseudo > header, it looks like either the fix-up for pseudo header is not needed > at all for ICMP case, OR we need to call skb_checksum_validate() > for ICMPv4 case. Hmm... Pseudo header is not part of IPv4 checksum calculation so skb_checksum_simple_validate is correct. Seems like a good chance driver is doing fix-up wrong for ICMP. Tom ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum 2016-06-27 21:44 ` Cong Wang 2016-06-27 21:47 ` Tom Herbert @ 2016-06-28 14:30 ` Or Gerlitz 2016-06-28 16:52 ` Cong Wang 1 sibling, 1 reply; 13+ messages in thread From: Or Gerlitz @ 2016-06-28 14:30 UTC (permalink / raw) To: Cong Wang Cc: Tom Herbert, Linux Netdev List, Shani Michaeli, Tariq Toukan, Yishai Hadas, Carol L Soto On Tue, Jun 28, 2016 at 12:44 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote: > On Mon, Jun 27, 2016 at 2:08 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote: >> On Mon, Jun 27, 2016 at 9:22 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote: >> can you point/paste the exact warning and how to reproduce that? is >> that as simple as running ping and/or ping6? > Yes, ping is enough to reproduce it every time. > The warning is below: > [ 8693.680997] eth0: hw csum failure Wow, can you please report us the exact card type (lspci -nn | grep -i mellanox) and firmware version (ethtool -i $DEV) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum 2016-06-28 14:30 ` Or Gerlitz @ 2016-06-28 16:52 ` Cong Wang 2016-06-29 14:23 ` Tariq Toukan 0 siblings, 1 reply; 13+ messages in thread From: Cong Wang @ 2016-06-28 16:52 UTC (permalink / raw) To: Or Gerlitz Cc: Tom Herbert, Linux Netdev List, Shani Michaeli, Tariq Toukan, Yishai Hadas, Carol L Soto On Tue, Jun 28, 2016 at 7:30 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote: > Wow, can you please report us the exact card type (lspci -nn | grep -i > mellanox) and firmware version (ethtool -i $DEV) See below. Does commit f8c6455bb04b944edb69e rely on any firmware change to get an expected checksum? $ lspci -nn | grep -i mellanox 82:00.0 Ethernet controller [0200]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003] $ ethtool -i eth0 driver: mlx4_en version: 2.2-1 (Feb 2014) firmware-version: 2.33.5220 bus-info: 0000:82:00.0 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum 2016-06-28 16:52 ` Cong Wang @ 2016-06-29 14:23 ` Tariq Toukan 2016-06-29 14:33 ` Or Gerlitz 2016-06-30 17:08 ` Cong Wang 0 siblings, 2 replies; 13+ messages in thread From: Tariq Toukan @ 2016-06-29 14:23 UTC (permalink / raw) To: Cong Wang, Or Gerlitz Cc: Tom Herbert, Linux Netdev List, Shani Michaeli, Tariq Toukan, Yishai Hadas, Carol L Soto Hi Cong, > See below. Does commit f8c6455bb04b944edb69e rely on any firmware > change to get an expected checksum? > > $ lspci -nn | grep -i mellanox > 82:00.0 Ethernet controller [0200]: Mellanox Technologies MT27500 > Family [ConnectX-3] [15b3:1003] > > $ ethtool -i eth0 > driver: mlx4_en > version: 2.2-1 (Feb 2014) > firmware-version: 2.33.5220 > bus-info: 0000:82:00.0 I used same HW and FW, but was not able to reproduce. I have kernel HEAD of net-next: 8a79813c1401 net: ethernet: dwc_eth_qos: use phy_ethtool_{get|set}_link_ksettings I ran regular ping, and ping6. Checksum complete counter increases, but no warnings in dmesg. What kernel do you use? Does it repro on latest net-next? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum 2016-06-29 14:23 ` Tariq Toukan @ 2016-06-29 14:33 ` Or Gerlitz 2016-06-30 17:08 ` Cong Wang 1 sibling, 0 replies; 13+ messages in thread From: Or Gerlitz @ 2016-06-29 14:33 UTC (permalink / raw) To: Cong Wang Cc: Tom Herbert, Linux Netdev List, Shani Michaeli, Tariq Toukan, Yishai Hadas, Carol L Soto, Tariq Toukan On Wed, Jun 29, 2016 at 5:23 PM, Tariq Toukan <ttoukan.linux@gmail.com> wrote: > Hi Cong, > >> See below. Does commit f8c6455bb04b944edb69e rely on any firmware >> change to get an expected checksum? >> >> $ lspci -nn | grep -i mellanox >> 82:00.0 Ethernet controller [0200]: Mellanox Technologies MT27500 >> Family [ConnectX-3] [15b3:1003] >> >> $ ethtool -i eth0 >> driver: mlx4_en >> version: 2.2-1 (Feb 2014) >> firmware-version: 2.33.5220 >> bus-info: 0000:82:00.0 > > I used same HW and FW, but was not able to reproduce. > I have kernel HEAD of net-next: 8a79813c1401 net: ethernet: dwc_eth_qos: use > phy_ethtool_{get|set}_link_ksettings > > I ran regular ping, and ping6. Checksum complete counter increases, but no > warnings in dmesg. > What kernel do you use? Does it repro on latest net-next? Also, can you share us somehow (e.g send to us as attachment, your compressed .config) Or. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum 2016-06-29 14:23 ` Tariq Toukan 2016-06-29 14:33 ` Or Gerlitz @ 2016-06-30 17:08 ` Cong Wang 1 sibling, 0 replies; 13+ messages in thread From: Cong Wang @ 2016-06-30 17:08 UTC (permalink / raw) To: Tariq Toukan Cc: Or Gerlitz, Tom Herbert, Linux Netdev List, Shani Michaeli, Tariq Toukan, Yishai Hadas, Carol L Soto On Wed, Jun 29, 2016 at 7:23 AM, Tariq Toukan <ttoukan.linux@gmail.com> wrote: > Hi Cong, > >> See below. Does commit f8c6455bb04b944edb69e rely on any firmware >> change to get an expected checksum? >> >> $ lspci -nn | grep -i mellanox >> 82:00.0 Ethernet controller [0200]: Mellanox Technologies MT27500 >> Family [ConnectX-3] [15b3:1003] >> >> $ ethtool -i eth0 >> driver: mlx4_en >> version: 2.2-1 (Feb 2014) >> firmware-version: 2.33.5220 >> bus-info: 0000:82:00.0 > > I used same HW and FW, but was not able to reproduce. > I have kernel HEAD of net-next: 8a79813c1401 net: ethernet: dwc_eth_qos: use > phy_ethtool_{get|set}_link_ksettings > > I ran regular ping, and ping6. Checksum complete counter increases, but no > warnings in dmesg. Thanks for testing it! This helps me a lot to identify the real bug. Actually the bug is in mirred action instead of mlx4 driver, I have a patch and just confirmed it fixes the bug. I will send it out in a few minutes. Again, thanks a lot for your help! ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2016-06-30 17:08 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-06-27 18:22 [Patch net] mlx4: set csum_complete_sw bit when fixing complete csum Cong Wang 2016-06-27 21:08 ` Or Gerlitz 2016-06-27 21:44 ` Cong Wang 2016-06-27 21:47 ` Tom Herbert 2016-06-27 21:49 ` Cong Wang 2016-06-27 22:04 ` Tom Herbert 2016-06-27 22:53 ` Cong Wang 2016-06-27 23:16 ` Tom Herbert 2016-06-28 14:30 ` Or Gerlitz 2016-06-28 16:52 ` Cong Wang 2016-06-29 14:23 ` Tariq Toukan 2016-06-29 14:33 ` Or Gerlitz 2016-06-30 17:08 ` Cong Wang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).