From: Stephen Hemminger <stephen@networkplumber.org>
To: Eric Dumazet <edumazet@google.com>
Cc: netdev <netdev@vger.kernel.org>, rossi.f@inwind.it
Subject: Re: [Bug 201423] New: eth0: hw csum failure
Date: Mon, 15 Oct 2018 09:21:32 -0700
Message-ID: <20181015092132.514078d7@xeon-e3>
In-Reply-To: <CANn89iLA+rdFNXXdzogLHF1FqYg3CjpwXJbscWTJ8Bk8bN2Scw@mail.gmail.com>

On Mon, 15 Oct 2018 08:41:47 -0700
Eric Dumazet <edumazet@google.com> wrote:

> On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> >
> >
> > Begin forwarded message:
> >
> > Date: Sun, 14 Oct 2018 10:42:48 +0000
> > From: bugzilla-daemon@bugzilla.kernel.org
> > To: stephen@networkplumber.org
> > Subject: [Bug 201423] New: eth0: hw csum failure
> >
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=201423
> >
> >             Bug ID: 201423
> >            Summary: eth0: hw csum failure
> >            Product: Networking
> >            Version: 2.5
> >     Kernel Version: 4.19.0-rc7
> >           Hardware: Intel
> >                 OS: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: Other
> >           Assignee: stephen@networkplumber.org
> >           Reporter: rossi.f@inwind.it
> >         Regression: No
> >
> > I have a P6T DELUXE V2 motherboard and using the sky2 driver for the ethernet
> > ports. I get the following error message:
> >
> > [  433.727397] eth0: hw csum failure
> > [  433.727406] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.19.0-rc7 #19
> > [  433.727406] Hardware name: System manufacturer System Product Name/P6T
> > DELUXE V2, BIOS 1202    12/22/2010
> > [  433.727407] Call Trace:
> > [  433.727409]  <IRQ>
> > [  433.727415]  dump_stack+0x46/0x5b
> > [  433.727419]  __skb_checksum_complete+0xb0/0xc0
> > [  433.727423]  tcp_v4_rcv+0x528/0xb60
> > [  433.727426]  ? ipt_do_table+0x2d0/0x400
> > [  433.727429]  ip_local_deliver_finish+0x5a/0x110
> > [  433.727430]  ip_local_deliver+0xe1/0xf0
> > [  433.727431]  ? ip_sublist_rcv_finish+0x60/0x60
> > [  433.727432]  ip_rcv+0xca/0xe0
> > [  433.727434]  ? ip_rcv_finish_core.isra.0+0x300/0x300
> > [  433.727436]  __netif_receive_skb_one_core+0x4b/0x70
> > [  433.727438]  netif_receive_skb_internal+0x4e/0x130
> > [  433.727439]  napi_gro_receive+0x6a/0x80
> > [  433.727442]  sky2_poll+0x707/0xd20
> > [  433.727446]  ? rcu_check_callbacks+0x1b4/0x900
> > [  433.727447]  net_rx_action+0x237/0x380
> > [  433.727449]  __do_softirq+0xdc/0x1e0
> > [  433.727452]  irq_exit+0xa9/0xb0
> > [  433.727453]  do_IRQ+0x45/0xc0
> > [  433.727455]  common_interrupt+0xf/0xf
> > [  433.727456]  </IRQ>
> > [  433.727459] RIP: 0010:cpuidle_enter_state+0x124/0x200
> > [  433.727461] Code: 53 60 89 c3 e8 dd 90 ad ff 65 8b 3d 96 58 a7 7e e8 d1 8f
> > ad ff 31 ff 49 89 c4 e8 27 99 ad ff fb 48 ba cf f7 53 e3 a5 9b c4 20 <4c> 89 e1
> > 4c 29 e9 48 89 c8 48 c1 f9 3f 48 f7 ea b8 ff ff ff 7f 48
> > [  433.727462] RSP: 0000:ffffc900000a3e98 EFLAGS: 00000282 ORIG_RAX:
> > ffffffffffffffde
> > [  433.727463] RAX: ffff880237b1f280 RBX: 0000000000000004 RCX:
> > 000000000000001f
> > [  433.727464] RDX: 20c49ba5e353f7cf RSI: 000000002fe419c1 RDI:
> > 0000000000000000
> > [  433.727465] RBP: ffff880237b263a0 R08: 0000000000000714 R09:
> > 000000650512105d
> > [  433.727465] R10: 00000000ffffffff R11: 0000000000000342 R12:
> > 00000064fc2a8b1c
> > [  433.727466] R13: 00000064fc25b35f R14: 0000000000000004 R15:
> > ffffffff8204af20
> > [  433.727468]  ? cpuidle_enter_state+0x119/0x200
> > [  433.727471]  do_idle+0x1bf/0x200
> > [  433.727473]  cpu_startup_entry+0x6a/0x70
> > [  433.727475]  start_secondary+0x17f/0x1c0
> > [  433.727476]  secondary_startup_64+0xa4/0xb0
> > [  441.662954] eth0: hw csum failure
> > [  441.662959] CPU: 4 PID: 4347 Comm: radeon_cs:0 Not tainted 4.19.0-rc7 #19
> > [  441.662960] Hardware name: System manufacturer System Product Name/P6T
> > DELUXE V2, BIOS 1202    12/22/2010
> > [  441.662960] Call Trace:
> > [  441.662963]  <IRQ>
> > [  441.662968]  dump_stack+0x46/0x5b
> > [  441.662972]  __skb_checksum_complete+0xb0/0xc0
> > [  441.662975]  tcp_v4_rcv+0x528/0xb60
> > [  441.662979]  ? ipt_do_table+0x2d0/0x400
> > [  441.662981]  ip_local_deliver_finish+0x5a/0x110
> > [  441.662983]  ip_local_deliver+0xe1/0xf0
> > [  441.662985]  ? ip_sublist_rcv_finish+0x60/0x60
> > [  441.662986]  ip_rcv+0xca/0xe0
> > [  441.662988]  ? ip_rcv_finish_core.isra.0+0x300/0x300
> > [  441.662990]  __netif_receive_skb_one_core+0x4b/0x70
> > [  441.662993]  netif_receive_skb_internal+0x4e/0x130
> > [  441.662994]  napi_gro_receive+0x6a/0x80
> > [  441.662998]  sky2_poll+0x707/0xd20
> > [  441.663000]  net_rx_action+0x237/0x380
> > [  441.663002]  __do_softirq+0xdc/0x1e0
> > [  441.663005]  irq_exit+0xa9/0xb0
> > [  441.663007]  do_IRQ+0x45/0xc0
> > [  441.663009]  common_interrupt+0xf/0xf
> > [  441.663010]  </IRQ>
> > [  441.663012] RIP: 0010:merge+0x22/0xb0
> > [  441.663014] Code: c3 31 c0 c3 90 90 90 90 41 56 41 55 41 54 55 48 89 d5 53
> > 48 89 cb 48 83 ec 18 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 <48> 85 c9
> > 74 70 48 85 d2 74 6b 49 89 fd 49 89 f6 49 89 e4 eb 14 48
> > [  441.663015] RSP: 0018:ffffc9000090b988 EFLAGS: 00000246 ORIG_RAX:
> > ffffffffffffffde
> > [  441.663017] RAX: 0000000000000000 RBX: ffff88021ab2d408 RCX:
> > ffff88021ab2d408
> > [  441.663018] RDX: ffff88021ab2d388 RSI: ffffffffa021c440 RDI:
> > 0000000000000000
> > [  441.663019] RBP: ffff88021ab2d388 R08: 0000000000005ecf R09:
> > 0000000000008500
> > [  441.663020] R10: ffffea000877ec00 R11: ffff880236803500 R12:
> > ffffffffa021c440
> > [  441.663021] R13: ffff88021ab2d448 R14: 0000000000000004 R15:
> > ffffc9000090b9e0
> > [  441.663048]  ? radeon_irq_kms_set_irq_n_enabled+0x120/0x120 [radeon]
> > [  441.663063]  ? radeon_irq_kms_set_irq_n_enabled+0x120/0x120 [radeon]
> > [  441.663065]  ? merge+0x57/0xb0
> > [  441.663080]  ? radeon_irq_kms_set_irq_n_enabled+0x120/0x120 [radeon]
> > [  441.663082]  list_sort+0x8b/0x230
> > [  441.663094]  radeon_cs_parser_fini+0xdf/0x110 [radeon]
> > [  441.663110]  radeon_cs_ioctl+0x2a4/0x710 [radeon]
> > [  441.663113]  ? __switch_to_asm+0x34/0x70
> > [  441.663114]  ? __switch_to_asm+0x40/0x70
> > [  441.663130]  ? radeon_cs_parser_init+0x20/0x20 [radeon]
> > [  441.663141]  drm_ioctl_kernel+0xa3/0xe0 [drm]
> > [  441.663149]  drm_ioctl+0x2e2/0x380 [drm]
> > [  441.663164]  ? radeon_cs_parser_init+0x20/0x20 [radeon]
> > [  441.663168]  ? page_add_new_anon_rmap+0x42/0x70
> > [  441.663171]  do_vfs_ioctl+0x9a/0x600
> > [  441.663173]  ksys_ioctl+0x35/0x60
> > [  441.663175]  __x64_sys_ioctl+0x11/0x20
> > [  441.663177]  do_syscall_64+0x3d/0xf0
> > [  441.663179]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > [  441.663180] RIP: 0033:0x7f9377377f37
> > [  441.663182] Code: 00 00 00 75 0c 48 c7 c0 ff ff ff ff 48 83 c4 18 c3 e8 ad
> > db 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 10 00 00 00 0f 05 <48> 3d 01
> > f0 ff ff 73 01 c3 48 8b 0d 21 4f 2c 00 f7 d8 64 89 01 48
> > [  441.663183] RSP: 002b:00007f92c3130d28 EFLAGS: 00000246 ORIG_RAX:
> > 0000000000000010
> > [  441.663185] RAX: ffffffffffffffda RBX: 0000564498327ec0 RCX:
> > 00007f9377377f37
> > [  441.663186] RDX: 0000564498337ec8 RSI: 00000000c0206466 RDI:
> > 0000000000000010
> > [  441.663186] RBP: 0000564498337ec8 R08: 0000000000000000 R09:
> > 0000000000000000
> > [  441.663187] R10: 0000000000000000 R11: 0000000000000246 R12:
> > 00000000c0206466
> > [  441.663188] R13: 0000000000000010 R14: 0000000000000000 R15:
> > 0000564497a38120
> > [  462.833418] eth0: hw csum failure
> > [  462.833428] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.19.0-rc7 #19
> > [  462.833429] Hardware name: System manufacturer System Product Name/P6T
> > DELUXE V2, BIOS 1202    12/22/2010
> > [  462.833429] Call Trace:
> > [  462.833432]  <IRQ>
> > [  462.833438]  dump_stack+0x46/0x5b
> > [  462.833442]  __skb_checksum_complete+0xb0/0xc0
> > [  462.833446]  tcp_v4_rcv+0x528/0xb60
> > [  462.833449]  ? ipt_do_table+0x2d0/0x400
> > [  462.833452]  ip_local_deliver_finish+0x5a/0x110
> > [  462.833454]  ip_local_deliver+0xe1/0xf0
> > [  462.833455]  ? ip_sublist_rcv_finish+0x60/0x60
> > [  462.833457]  ip_rcv+0xca/0xe0
> > [  462.833459]  ? ip_rcv_finish_core.isra.0+0x300/0x300
> > [  462.833461]  __netif_receive_skb_one_core+0x4b/0x70
> > [  462.833464]  netif_receive_skb_internal+0x4e/0x130
> > [  462.833466]  napi_gro_receive+0x6a/0x80
> > [  462.833469]  sky2_poll+0x707/0xd20
> > [  462.833471]  net_rx_action+0x237/0x380
> > [  462.833474]  __do_softirq+0xdc/0x1e0
> > [  462.833477]  irq_exit+0xa9/0xb0
> > [  462.833479]  do_IRQ+0x45/0xc0
> > [  462.833481]  common_interrupt+0xf/0xf
> > [  462.833482]  </IRQ>
> > [  462.833486] RIP: 0010:cpuidle_enter_state+0x124/0x200
> > [  462.833488] Code: 53 60 89 c3 e8 dd 90 ad ff 65 8b 3d 96 58 a7 7e e8 d1 8f
> > ad ff 31 ff 49 89 c4 e8 27 99 ad ff fb 48 ba cf f7 53 e3 a5 9b c4 20 <4c> 89 e1
> > 4c 29 e9 48 89 c8 48 c1 f9 3f 48 f7 ea b8 ff ff ff 7f 48
> > [  462.833489] RSP: 0018:ffffc900000a3e98 EFLAGS: 00000282 ORIG_RAX:
> > ffffffffffffffde
> > [  462.833491] RAX: ffff880237b1f280 RBX: 0000000000000004 RCX:
> > 000000000000001f
> > [  462.833492] RDX: 20c49ba5e353f7cf RSI: 000000002fe419c1 RDI:
> > 0000000000000000
> > [  462.833493] RBP: ffff880237b263a0 R08: 0000000000000000 R09:
> > 0000000000000000
> > [  462.833494] R10: 00000000ffffffff R11: 0000000000000273 R12:
> > 0000006bc3052131
> > [  462.833495] R13: 0000006bc2f99f57 R14: 0000000000000004 R15:
> > ffffffff8204af20
> > [  462.833498]  ? cpuidle_enter_state+0x119/0x200
> > [  462.833503]  do_idle+0x1bf/0x200
> > [  462.833506]  cpu_startup_entry+0x6a/0x70
> > [  462.833510]  start_secondary+0x17f/0x1c0
> > [  462.833513]  secondary_startup_64+0xa4/0xb0
> >
> > Something is changed between 4.17.12 and 4.18, after bisecting the problem I
> > got the following first bad commit:
> >
> > commit 88078d98d1bb085d72af8437707279e203524fa5
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Wed Apr 18 11:43:15 2018 -0700
> >
> >     net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
> >
> >     After working on IP defragmentation lately, I found that some large
> >     packets defeat CHECKSUM_COMPLETE optimization because of NIC adding
> >     zero paddings on the last (small) fragment.
> >
> >     While removing the padding with pskb_trim_rcsum(), we set skb->ip_summed
> >     to CHECKSUM_NONE, forcing a full csum validation, even if all prior
> >     fragments had CHECKSUM_COMPLETE set.
> >
> >     We can instead compute the checksum of the part we are trimming,
> >     usually smaller than the part we keep.
> >
> >     Signed-off-by: Eric Dumazet <edumazet@google.com>
> >     Signed-off-by: David S. Miller <davem@davemloft.net>
> >  
> 
> Thanks for bisecting !
> 
> This commit is known to expose some NIC/driver bugs.
> 
> Look at commit 12b03558cef6d655d0d394f5e98a6fd07c1f6c0f
> ("net: sungem: fix rx checksum support")  for one driver needing a fix.
> 
> I assume SKY2_HW_NEW_LE is not set on your NIC ?

There are two variants of this chip: one computes a 1's complement checksum,
and the other computes the TCP checksum. Maybe the 1's complement version is
incorrectly including the CRC.
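
To illustrate the suspicion above, here is a minimal sketch (not sky2 driver
code) of what happens when a 1's complement sum covers a few trailing bytes
the stack does not expect, and how the sum of just those bytes can be
subtracted back out. That is the same idea the bisected commit describes for
trimmed padding. The function names, the payload, and the 4 "CRC" bytes are
all made up for the example.

```python
def fold16(total: int) -> int:
    """Fold carries back in until the value fits in 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum over big-endian words (not inverted)."""
    if len(data) % 2:
        data += b"\x00"  # pad an odd trailing byte with zero
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    return fold16(total)


def ones_complement_sub(total: int, part: int) -> int:
    """Remove `part` from `total`: add its 16-bit complement, then fold."""
    return fold16(total + (~part & 0xFFFF))


payload = bytes(range(20))   # hypothetical frame contents, even length
fcs = b"\xde\xad\xbe\xef"    # hypothetical 4 trailing bytes standing in for the CRC

hw_sum = ones_complement_sum(payload + fcs)  # sum that runs past the payload
expected = ones_complement_sum(payload)      # sum over the payload only
fixed = ones_complement_sub(hw_sum, ones_complement_sum(fcs))

print(hex(hw_sum), hex(expected), hex(fixed))  # prints 0xf801 0x5a64 0x5a64
```

The plain subtraction only works this simply because the extra bytes start at
an even offset; at an odd offset their sum would need a byte swap first. Also
note that 0x0000 and 0xffff both represent zero in this arithmetic, so a real
comparison has to treat them as equal.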

Side note: I'm not sure why, but the driver only calls GRO for checksummed
packets. Is that necessary?
