netdev.vger.kernel.org archive mirror
From: Eric Dumazet <eric.dumazet@gmail.com>
To: "Cong Wang" <xiyou.wangcong@gmail.com>,
	"Paweł Staszewski" <pstaszewski@itcare.pl>
Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Dimitris Michailidis <dmichail@google.com>
Subject: Re: Latest net-next kernel 4.19.0+
Date: Mon, 29 Oct 2018 20:52:39 -0700
Message-ID: <db6848dc-cf1b-0989-570c-af5bdd1a7bd1@gmail.com>
In-Reply-To: <1e954663-ed05-4f33-4384-db880844f9d1@gmail.com>



On 10/29/2018 07:53 PM, Eric Dumazet wrote:
> 
> 
> On 10/29/2018 07:27 PM, Cong Wang wrote:
>> Hi,
>>
>> On Mon, Oct 29, 2018 at 5:19 PM Paweł Staszewski <pstaszewski@itcare.pl> wrote:
>>>
>>> Sorry not complete - followed by hw csum:
>>>
>>> [  342.190831] vlan1490: hw csum failure
>>> [  342.190835] CPU: 52 PID: 0 Comm: swapper/52 Not tainted 4.19.0+ #1
>>> [  342.190836] Call Trace:
>>> [  342.190839]  <IRQ>
>>> [  342.190849]  dump_stack+0x46/0x5b
>>> [  342.190856]  __skb_checksum_complete+0x9a/0xa0
>>> [  342.190859]  tcp_v4_rcv+0xef/0x960
>>> [  342.190864]  ip_local_deliver_finish+0x49/0xd0
>>> [  342.190866]  ip_local_deliver+0x5e/0xe0
>>> [  342.190869]  ? ip_sublist_rcv_finish+0x50/0x50
>>> [  342.190870]  ip_rcv+0x41/0xc0
>>> [  342.190874]  __netif_receive_skb_one_core+0x4b/0x70
>>> [  342.190877]  netif_receive_skb_internal+0x2f/0xd0
>>> [  342.190879]  napi_gro_receive+0xb7/0xe0
>>> [  342.190884]  mlx5e_handle_rx_cqe+0x7a/0xd0
>>> [  342.190886]  mlx5e_poll_rx_cq+0xc6/0x930
>>> [  342.190888]  mlx5e_napi_poll+0xab/0xc90
>>
>>
>> We got exactly the same backtrace in our data center. However,
>> it is not easy for us to reproduce it, do you have any clue to reproduce it?
>>
>> If you do, try to tcpdump the packets triggering this warning, it could
>> be useful for debugging.
>>
>> Also, we tried to apply commit d55bef5059dd057bd, the warning _still_
>> occurs. We tried to revert the offending commit 88078d98d1bb, it
>> disappears. So it is likely that commit 88078d98d1bb introduces
>> more troubles than the one fixed by d55bef5059dd057bd.
>>
> 
> Or this could be that mlx5 driver is buggy when dealing with VLAN tags.
> 
> It both uses vlan_tci (hardware vlan offload) in skb _and_ this piece of code in mlx5e_handle_csum() 
> 
> 		if (network_depth > ETH_HLEN)
> 			/* CQE csum is calculated from the IP header and does
> 			 * not cover VLAN headers (if present). This will add
> 			 * the checksum manually.
> 			 */
> 			skb->csum = csum_partial(skb->data + ETH_HLEN,
> 						 network_depth - ETH_HLEN,
> 						 skb->csum);
> 
> 
> That seems strange to me, because skb_vlan_untag() will not adjust skb->csum in this case.
> 

The bug might be in the mlx5 NETIF_F_RXFCS handling, by the way.

The code currently does:

if (unlikely(netdev->features & NETIF_F_RXFCS))
     skb->csum = csum_add(skb->csum,
                          (__force __wsum)mlx5e_get_fcs(skb));

But Dimitris told us that we need to take into account whether the FCS
starts at an odd or even offset, so this should probably be:

if (unlikely(netdev->features & NETIF_F_RXFCS))
     skb->csum = csum_block_add(skb->csum,
                                (__force __wsum)mlx5e_get_fcs(skb),
                                skb->len);
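
To illustrate why the offset parity matters, here is a minimal user-space sketch, not kernel code. The helpers fold16(), partial_sum(), sum_add() and sum_block_add() are hypothetical names made up for this example; they work on folded 16-bit sums rather than the kernel's __wsum, and only mimic what csum_add() and csum_block_add() do. A partial sum computed over a block is aligned to that block's own first byte, so if the block sits at an odd offset in the packet its two byte lanes are swapped relative to the running sum and have to be swapped back before adding.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a wide accumulator into a 16-bit ones-complement sum. */
static uint16_t fold16(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Sum buf as big-endian 16-bit words, aligned to buf[0].
 * A trailing odd byte is treated as the high byte of a final word.
 */
static uint16_t partial_sum(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)
		sum += (uint32_t)buf[i] << 8;
	return fold16(sum);
}

/* Plain ones-complement addition: only valid when the second
 * block starts at an even offset.
 */
static uint16_t sum_add(uint16_t a, uint16_t b)
{
	return fold16((uint64_t)a + b);
}

/* Offset-aware addition: if the second block starts at an odd
 * offset, swap its byte lanes before adding.
 */
static uint16_t sum_block_add(uint16_t a, uint16_t b, size_t offset)
{
	if (offset & 1)
		b = (uint16_t)((b << 8) | (b >> 8));
	return sum_add(a, b);
}
```

With the five bytes 01 02 03 04 05 split after the third byte, sum_add() of the two partial sums does not match partial_sum() over the whole buffer, while sum_block_add() with offset 3 does. When the split falls at an even offset both give the same result, which would be consistent with a warning that only shows up for some packets.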


Thread overview: 20+ messages
2018-10-30  0:10 Latest net-next kernel 4.19.0+ Paweł Staszewski
2018-10-30  0:11 ` Paweł Staszewski
2018-10-30  0:34   ` Paweł Staszewski
2018-10-30  2:27   ` Cong Wang
2018-10-30  2:43     ` Cong Wang
2018-10-30  2:53     ` Eric Dumazet
2018-10-30  3:52       ` Eric Dumazet [this message]
2018-10-30  6:09         ` Dimitris Michailidis
2018-10-30  7:29           ` Eric Dumazet
2018-10-30  8:09             ` Paweł Staszewski
2018-10-30 14:16               ` Eric Dumazet
2018-10-30 17:32                 ` Cong Wang
2018-10-30 17:50                   ` Eric Dumazet
2018-10-30 17:54                     ` Cong Wang
2018-10-31 21:05                   ` Saeed Mahameed
2018-10-31 21:17                     ` Cong Wang
2018-11-01 22:59                       ` Paweł Staszewski
2018-11-08 18:35                         ` Cong Wang
2018-10-31 21:22                     ` Paweł Staszewski
2018-10-31 21:24                 ` Paweł Staszewski
