From: Sander Eikelenboom <linux@eikelenboom.it>
To: Eric Dumazet <edumazet@google.com>,
Francois Romieu <romieu@fr.zoreil.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>
Subject: Re: kernel BUG at net/core/skbuff.c:2839 RIP [<ffffffff819109a2>] skb_segment+0x6b2/0x6d0
Date: Thu, 21 Nov 2013 13:00:27 +0100
Message-ID: <1392359825.20131121130027@eikelenboom.it>
In-Reply-To: <1287049824.20131117201744@eikelenboom.it>
Hello Sander,
Sunday, November 17, 2013, 8:17:44 PM, you wrote:
> Hi Eric,
> With the linux-net changes from this merge window i get the kernel panic below (not with 3.12.0).
> It's on a machine running Xen, 2x rtl8169 nic, and using a bridge for guest networking.
> This panic in the host kernel only seems to occur when generating a lot of network traffic to and from a guest.
> I tried reverting "tcp: gso: fix truesize tracking" 0d08c42cf9a71530fef5ebcfe368f38f2dd0476f, but that didn't help.
> --
> Sander
> [ 1164.511712] ------------[ cut here ]------------
> [ 1164.518446] kernel BUG at net/core/skbuff.c:2839!
> [ 1164.525226] invalid opcode: 0000 [#2] PREEMPT SMP
> [ 1164.532024] Modules linked in:
> [ 1164.538713] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G D 3.12.0-mw-20131117+ #1
> [ 1164.545649] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
> [ 1164.552659] task: ffff880059b8a180 ti: ffff880059b96000 task.ti: ffff880059b96000
> [ 1164.559649] RIP: e030:[<ffffffff819109a2>] [<ffffffff819109a2>] skb_segment+0x6b2/0x6d0
> [ 1164.566860] RSP: e02b:ffff880059b97448 EFLAGS: 00010216
> [ 1164.574023] RAX: 0000000000000011 RBX: 0000000000006612 RCX: 0000000000006612
> [ 1164.581169] RDX: 00000000000005a8 RSI: 0000000000006612 RDI: ffff8800478ff682
> [ 1164.588115] RBP: ffff880059b97518 R08: ffff88004ca06a00 R09: 0000000000000011
> [ 1164.595169] R10: 000000000000606a R11: 0000000000000011 R12: 0000000000000000
> [ 1164.602214] R13: ffff88004ca06900 R14: ffff88004ca06a00 R15: ffff88004bb57f00
> [ 1164.609274] FS: 00007eff7dc67700(0000) GS:ffff88005f600000(0000) knlGS:0000000000000000
> [ 1164.616394] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 1164.623562] CR2: 00007faa868fff30 CR3: 000000005877e000 CR4: 0000000000000660
> [ 1164.630807] Stack:
> [ 1164.637959] ffff880059b97458 ffffffff810d90dd ffff880059b97478 ffffffff8109f69a
> [ 1164.645296] ffff88004b824628 ffff88004b824400 ffff880059b974e8 ffff88004ca06a00
> [ 1164.652626] 0000001100000001 0000000000000040 0000000000000042 ffffffffffffffbe
> [ 1164.659816] Call Trace:
> [ 1164.667034] [<ffffffff810d90dd>] ? trace_hardirqs_on+0xd/0x10
> [ 1164.674313] [<ffffffff8109f69a>] ? local_bh_enable+0xaa/0x110
> [ 1164.681543] [<ffffffff819dafc2>] tcp_gso_segment+0x102/0x3e0
> [ 1164.688691] [<ffffffff819b8224>] ? ip_queue_xmit+0x194/0x480
> [ 1164.695741] [<ffffffff819e9fe4>] inet_gso_segment+0x124/0x350
> [ 1164.702836] [<ffffffff8191cb25>] skb_mac_gso_segment+0xd5/0x1d0
> [ 1164.709735] [<ffffffff8191ca92>] ? skb_mac_gso_segment+0x42/0x1d0
> [ 1164.716739] [<ffffffff8191cc7b>] __skb_gso_segment+0x5b/0xc0
> [ 1164.723802] [<ffffffff8191ce80>] dev_hard_start_xmit+0x1a0/0x500
> [ 1164.730744] [<ffffffff81939780>] sch_direct_xmit+0x100/0x280
> [ 1164.737531] [<ffffffff8191d404>] dev_queue_xmit+0x224/0x600
> [ 1164.744403] [<ffffffff8191d1e0>] ? dev_hard_start_xmit+0x500/0x500
> [ 1164.751317] [<ffffffff8194765e>] ? nf_hook_slow+0x11e/0x160
> [ 1164.758332] [<ffffffff81a14040>] ? deliver_clone+0x60/0x60
> [ 1164.765264] [<ffffffff81a140d7>] br_dev_queue_push_xmit+0x97/0x140
> [ 1164.772082] [<ffffffff81a1419d>] br_forward_finish+0x1d/0x60
> [ 1164.778925] [<ffffffff81a12490>] ? br_dev_free+0x30/0x30
> [ 1164.785714] [<ffffffff81a142f2>] __br_deliver+0x52/0x180
> [ 1164.792355] [<ffffffff81a146cd>] br_deliver+0x3d/0x50
> [ 1164.798950] [<ffffffff81a126be>] br_dev_xmit+0x22e/0x290
> [ 1164.805576] [<ffffffff81a12490>] ? br_dev_free+0x30/0x30
> [ 1164.812106] [<ffffffff8191d18d>] dev_hard_start_xmit+0x4ad/0x500
> [ 1164.818729] [<ffffffff8191d55e>] dev_queue_xmit+0x37e/0x600
> [ 1164.825314] [<ffffffff8191d1e0>] ? dev_hard_start_xmit+0x500/0x500
> [ 1164.831898] [<ffffffff819b7043>] ip_finish_output+0x293/0x610
> [ 1164.838483] [<ffffffff819b8a44>] ? ip_output+0x54/0xf0
> [ 1164.845055] [<ffffffff819b8a44>] ip_output+0x54/0xf0
> [ 1164.851400] [<ffffffff819b3e71>] ip_forward_finish+0x71/0x1a0
> [ 1164.857725] [<ffffffff819b42d8>] ip_forward+0x338/0x420
> [ 1164.864167] [<ffffffff819b1ca0>] ip_rcv_finish+0x150/0x660
> [ 1164.870477] [<ffffffff819b275b>] ip_rcv+0x22b/0x370
> [ 1164.876707] [<ffffffff81a0de22>] ? packet_rcv_spkt+0x42/0x190
> [ 1164.883040] [<ffffffff8191a3a2>] __netif_receive_skb_core+0x6e2/0x8b0
> [ 1164.889193] [<ffffffff81919dd4>] ? __netif_receive_skb_core+0x114/0x8b0
> [ 1164.895039] [<ffffffff810f28b9>] ? getnstimeofday+0x9/0x30
> [ 1164.900704] [<ffffffff8191a58c>] __netif_receive_skb+0x1c/0x70
> [ 1164.906327] [<ffffffff8191a7af>] netif_receive_skb+0x3f/0x50
> [ 1164.911892] [<ffffffff8191a8d4>] napi_gro_complete+0x114/0x140
> [ 1164.917459] [<ffffffff8191a7e0>] ? napi_gro_complete+0x20/0x140
> [ 1164.923048] [<ffffffff810dcf3a>] ? lock_release+0x12a/0x240
> [ 1164.928595] [<ffffffff819e9cb7>] ? inet_gro_receive+0x57/0x260
> [ 1164.934042] [<ffffffff8191b552>] dev_gro_receive+0x2b2/0x3f0
> [ 1164.939384] [<ffffffff8191b48b>] ? dev_gro_receive+0x1eb/0x3f0
> [ 1164.944704] [<ffffffff8191b849>] napi_gro_receive+0x29/0xc0
> [ 1164.949906] [<ffffffff816d9253>] rtl8169_poll+0x2d3/0x680
> [ 1164.955036] [<ffffffff8191aba1>] net_rx_action+0x171/0x270
> [ 1164.960180] [<ffffffff8109f27d>] __do_softirq+0xed/0x210
> [ 1164.965285] [<ffffffff8109f3f5>] run_ksoftirqd+0x55/0x90
> [ 1164.970334] [<ffffffff810c1e29>] smpboot_thread_fn+0x199/0x2a0
> [ 1164.975402] [<ffffffff810c1c90>] ? SyS_setgroups+0x150/0x150
> [ 1164.980438] [<ffffffff810bb00f>] kthread+0xdf/0x100
> [ 1164.985309] [<ffffffff810baf30>] ? __init_kthread_worker+0x70/0x70
> [ 1164.990221] [<ffffffff81a8a9cc>] ret_from_fork+0x7c/0xb0
> [ 1164.995085] [<ffffffff810baf30>] ? __init_kthread_worker+0x70/0x70
> [ 1164.999925] Code: ff 4c 8b 85 68 ff ff ff 44 8b 8d 50 ff ff ff 44 8b 95 48 ff ff ff 44 8b 9d 40 ff ff ff 0f 85 2c fe ff ff e9 23 fe ff ff 90 0f 0b <0f> 0b 48 c7 45 b0 ea ff ff ff e9 cf fc ff ff 0f 0b 0f 0b 66 66
> [ 1165.010399] RIP [<ffffffff819109a2>] skb_segment+0x6b2/0x6d0
> [ 1165.015512] RSP <ffff880059b97448>
> [ 1165.020980] ---[ end trace 88f75f0c791ac25c ]---
> [ 1165.026033] Kernel panic - not syncing: Fatal exception in interrupt
Hi Eric and Francois,
I have tested some more:
First I tried switching off GSO and GRO on the bridge; this didn't help.
Then I switched off only GRO on eth0 (r8169) and left the bridge alone. That prevented the oops.
Below is the output of ethtool -k for eth0 and the bridge after boot (so the default situation), with which the oops occurs,
followed by the part of dmesg where the r8169 NICs get initialized on boot (there are 2, eth0 and eth1).
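For anyone who wants to repeat the two experiments described above, they can be sketched roughly as follows. The exact commands used are not given in this message, so this is an assumption based on the standard ethtool -K switches; the device names eth0 and xen_bridge are taken from the ethtool output below, while the set_offload helper, the experiment_* functions and the RUN variable are hypothetical names introduced only for this sketch. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper (not from the original message): prints the ethtool
# command by default; set RUN=1 to really execute it (needs root and ethtool).
set_offload() {
    dev=$1
    feature=$2
    state=$3
    if [ "${RUN:-0}" = "1" ]; then
        ethtool -K "$dev" "$feature" "$state"
    else
        echo "ethtool -K $dev $feature $state"
    fi
}

# Experiment 1: GSO and GRO off on the bridge (did not prevent the oops).
experiment_bridge() {
    set_offload xen_bridge gso off
    set_offload xen_bridge gro off
}

# Experiment 2: only GRO off on eth0, bridge left at its defaults
# (this prevented the oops).
experiment_eth0() {
    set_offload eth0 gro off
}

experiment_bridge
experiment_eth0
```

The two experiments are meant to be run separately, each from the default state after boot; running with RUN=1 changes the offload settings until they are switched back on or the machine is rebooted.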
--
Sander
~# ethtool -k eth0
Features for eth0:
rx-checksumming: on
tx-checksumming: off
tx-checksum-ipv4: off
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: off
tx-scatter-gather: off
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
tx-tcp-segmentation: off
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: off [fixed]
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: off [requested on]
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
~# ethtool -k xen_bridge
Features for xen_bridge:
rx-checksumming: off [fixed]
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: on
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: on
tx-tcp6-segmentation: on
udp-fragmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: on [fixed]
netns-local: on [fixed]
tx-gso-robust: on
tx-fcoe-segmentation: on
tx-gre-segmentation: on
tx-ipip-segmentation: on
tx-sit-segmentation: on
tx-udp_tnl-segmentation: on
tx-mpls-segmentation: on
fcoe-mtu: off [fixed]
tx-nocache-copy: on
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
[ 12.356379] r8169 0000:0b:00.0 eth0: RTL8168d/8111d at 0xffffc90000334000, 40:61:86:f4:67:d9, XID 081000c0 IRQ 128
[ 12.361803] r8169 0000:0b:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
[ 12.367291] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
[ 12.372748] xen: registering gsi 51 triggering 0 polarity 1
[ 12.378225] xen: --> pirq=51 -> irq=51 (gsi=51)
[ 12.383612] r8169 0000:0a:00.0: enabling Mem-Wr-Inval
[ 12.389505] r8169 0000:0a:00.0 eth1: RTL8168d/8111d at 0xffffc90000336000, 40:61:86:f4:67:d8, XID 081000c0 IRQ 129
[ 12.395056] r8169 0000:0a:00.0 eth1: jumbo features [frames: 9200 bytes, tx checksumming: ko]
Thread overview: 11+ messages
2013-11-17 19:17 kernel BUG at net/core/skbuff.c:2839 RIP [<ffffffff819109a2>] skb_segment+0x6b2/0x6d0 Sander Eikelenboom
2013-11-17 19:59 ` Cong Wang
2013-11-21 12:00 ` Sander Eikelenboom [this message]
2013-11-21 14:10 ` Eric Dumazet
2013-11-21 14:14 ` Sander Eikelenboom
2013-11-21 15:46 ` Sander Eikelenboom
2013-11-21 18:17 ` David Miller
2013-11-21 18:23 ` Eric Dumazet
2013-11-21 18:38 ` David Miller
2013-11-21 19:10 ` [PATCH v2] gso: handle new frag_list of frags GRO packets Eric Dumazet
2013-11-21 19:12 ` David Miller