* Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
@ 2015-08-12 19:19 linux
  2015-08-12 20:41 ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: linux @ 2015-08-12 19:19 UTC
To: linux-kernel; +Cc: netdev

Hi,

On my box running Xen with a 4.2-rc6 kernel i still get this splat in dom0,
which crashes the box.
(i reported a similar splat before (at rc4) here,
http://www.spinics.net/lists/netdev/msg337570.html)

Never seen this one on 4.1, so it seems a regression.

--
Sander

[81133.193439] general protection fault: 0000 [#1] SMP
[81133.204284] Modules linked in:
[81133.214934] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.2.0-rc6-20150811-linus-doflr+ #1
[81133.225632] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[81133.236237] task: ffff880059b91580 ti: ffff880059bb4000 task.ti: ffff880059bb4000
[81133.246808] RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
[81133.257354] RSP: e02b:ffff880059bb7848 EFLAGS: 00010086
[81133.267749] RAX: ffff88004eddc7f0 RBX: ffff88000e20ae08 RCX: dead000000200200
[81133.278201] RDX: 0000000000000000 RSI: ffff88005f60e600 RDI: ffff88000e20ae08
[81133.288723] RBP: ffff880059bb7848 R08: 0000000000000001 R09: 0000000000000001
[81133.298930] R10: 0000000000000003 R11: ffff88000e20ad68 R12: 0000000000000000
[81133.308875] R13: 0000000101735569 R14: 0000000000015f90 R15: ffff88005f60e600
[81133.318845] FS: 00007f28c6f7c800(0000) GS:ffff88005f600000(0000) knlGS:0000000000000000
[81133.328864] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[81133.338693] CR2: ffff8000007f6800 CR3: 000000003d55c000 CR4: 0000000000000660
[81133.348462] Stack:
[81133.358005] ffff880059bb7898 ffffffff8110fe3f ffffffff810fc261 0000000000000200
[81133.367682] 0000000000000003 ffff88000e20ad68 0000000000000000 ffff88005854d400
[81133.377064] 0000000000015f90 0000000000000000 ffff880059bb78c8 ffffffff819b5243
[81133.386374] Call Trace:
[81133.395596] [<ffffffff8110fe3f>] mod_timer_pending+0x3f/0xe0
[81133.404999] [<ffffffff810fc261>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[81133.414255] [<ffffffff819b5243>] __nf_ct_refresh_acct+0xa3/0xb0
[81133.423137] [<ffffffff819bbe8b>] tcp_packet+0xb3b/0x1290
[81133.431894] [<ffffffff810cb8ca>] ? __local_bh_enable_ip+0x2a/0x90
[81133.440622] [<ffffffff819b4939>] ? __nf_conntrack_find_get+0x129/0x2a0
[81133.449339] [<ffffffff819b682c>] nf_conntrack_in+0x29c/0x7c0
[81133.457940] [<ffffffff81a67181>] ipv4_conntrack_in+0x21/0x30
[81133.466296] [<ffffffff819aea1c>] nf_iterate+0x4c/0x80
[81133.474401] [<ffffffff819aeab4>] nf_hook_slow+0x64/0xc0
[81133.482615] [<ffffffff81a211ec>] ip_rcv+0x2ec/0x380
[81133.490781] [<ffffffff81a209f0>] ? ip_local_deliver_finish+0x130/0x130
[81133.498790] [<ffffffff8197e140>] __netif_receive_skb_core+0x2a0/0x970
[81133.506714] [<ffffffff81a56db8>] ? inet_gro_receive+0x1c8/0x200
[81133.514609] [<ffffffff81980705>] __netif_receive_skb+0x15/0x70
[81133.522333] [<ffffffff8198077e>] netif_receive_skb_internal+0x1e/0x80
[81133.529840] [<ffffffff81980f3b>] napi_gro_receive+0x6b/0x90
[81133.537173] [<ffffffff81740fb6>] rtl8169_poll+0x2e6/0x600
[81133.544444] [<ffffffff810fc261>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[81133.551566] [<ffffffff81981ad7>] net_rx_action+0x1f7/0x300
[81133.558412] [<ffffffff810cb6c3>] __do_softirq+0x103/0x210
[81133.565353] [<ffffffff810cb807>] run_ksoftirqd+0x37/0x60
[81133.572359] [<ffffffff810e4de0>] smpboot_thread_fn+0x130/0x190
[81133.579215] [<ffffffff810e4cb0>] ? sort_range+0x20/0x20
[81133.586042] [<ffffffff810e1fae>] kthread+0xee/0x110
[81133.592792] [<ffffffff810e1ec0>] ? kthread_create_on_node+0x1b0/0x1b0
[81133.599694] [<ffffffff81af92df>] ret_from_fork+0x3f/0x70
[81133.606662] [<ffffffff810e1ec0>] ? kthread_create_on_node+0x1b0/0x1b0
[81133.613445] Code: 77 28 5d c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 48 8b 47 08 55 48 89 e5 48 85 c0 74 6a 48 8b 0f 48 85 c9 48 89 08 74 04 <48> 89 41 08 84 d2 74 08 48 c7 47 08 00 00 00 00 f6 47 2a 10 48
[81133.627196] RIP [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
[81133.634036] RSP <ffff880059bb7848>
[81133.640817] ---[ end trace eaf596e1fcf6a591 ]---
[81133.647521] Kernel panic - not syncing: Fatal exception in interrupt

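A note on the register dump in the report above: RCX holds dead000000200200, which looks like a list-poison pointer rather than a real address. Below is a minimal sketch for checking that. It assumes the 4.2-era constants from include/linux/poison.h (LIST_POISON1 = 0x00100100, LIST_POISON2 = 0x00200200) and the x86_64 default POISON_POINTER_DELTA of 0xdead000000000000; other versions or configs may use different values. The helper name classify_pointer is made up for illustration.

```python
# Assumed values: 4.2-era include/linux/poison.h with the x86_64 default
# CONFIG_ILLEGAL_POINTER_VALUE. Other kernel versions/configs may differ.
POISON_POINTER_DELTA = 0xDEAD000000000000
LIST_POISONS = {
    0x00100100 + POISON_POINTER_DELTA: "LIST_POISON1",
    0x00200200 + POISON_POINTER_DELTA: "LIST_POISON2",
}


def classify_pointer(hex_value):
    """Return the poison name for a register value given as hex text, or None."""
    return LIST_POISONS.get(int(hex_value, 16))


if __name__ == "__main__":
    # RCX as printed in the oops from the report
    print("RCX:", classify_pointer("dead000000200200"))
    # RAX from the same oops, an ordinary kernel address
    print("RAX:", classify_pointer("ffff88004eddc7f0"))
```

If those assumptions hold, the faulting instruction (<48> 89 41 08, a store to 8(%rcx)) wrote through an already-poisoned next pointer. That would be consistent with the timer entry having been detached before detach_if_pending() ran on it again.
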
* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
From: Eric Dumazet @ 2015-08-12 20:41 UTC
To: linux; +Cc: linux-kernel, netdev

On Wed, 2015-08-12 at 21:19 +0200, linux@eikelenboom.it wrote:
> Hi,
>
> On my box running Xen with a 4.2-rc6 kernel i still get this splat in
> dom0,
> which crashes the box.
> (i reported a similar splat before (at rc4) here,
> http://www.spinics.net/lists/netdev/msg337570.html)
>
> Never seen this one on 4.1, so it seems a regression.
>
> --
> Sander
>
> [81133.193439] general protection fault: 0000 [#1] SMP
> [...]
> [81133.627196] RIP [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
> [81133.647521] Kernel panic - not syncing: Fatal exception in interrupt

This looks like the bug fixed in David Miller net tree :

http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=2235f2ac75fd2501c251b0b699a9632e80239a6d

* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
From: linux @ 2015-08-12 20:50 UTC
To: Eric Dumazet; +Cc: linux-kernel, netdev

On 2015-08-12 22:41, Eric Dumazet wrote:
> On Wed, 2015-08-12 at 21:19 +0200, linux@eikelenboom.it wrote:
>> Hi,
>>
>> On my box running Xen with a 4.2-rc6 kernel i still get this splat in
>> dom0,
>> which crashes the box.
>> (i reported a similar splat before (at rc4) here,
>> http://www.spinics.net/lists/netdev/msg337570.html)
>>
>> Never seen this one on 4.1, so it seems a regression.
>>
>> --
>> Sander
>>
>> [81133.193439] general protection fault: 0000 [#1] SMP
>> [...]
>> [81133.627196] RIP [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
>> [81133.647521] Kernel panic - not syncing: Fatal exception in interrupt
>
> This looks like the bug fixed in David Miller net tree :
>
> http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=2235f2ac75fd2501c251b0b699a9632e80239a6d

Will pull the net-tree in and re-test.
But since it only seems to crash after a day or two, that will take some time.

Thanks,

Sander

* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
From: David Miller @ 2015-08-12 21:40 UTC
To: linux; +Cc: eric.dumazet, linux-kernel, netdev

From: linux@eikelenboom.it
Date: Wed, 12 Aug 2015 22:50:42 +0200

> On 2015-08-12 22:41, Eric Dumazet wrote:
>> On Wed, 2015-08-12 at 21:19 +0200, linux@eikelenboom.it wrote:
>>> Hi,
>>> On my box running Xen with a 4.2-rc6 kernel i still get this splat in
>>> dom0,
>>> which crashes the box.
>>> (i reported a similar splat before (at rc4) here,
>>> http://www.spinics.net/lists/netdev/msg337570.html)
>>> Never seen this one on 4.1, so it seems a regression.
>>> --
>>> Sander
>>> [81133.193439] general protection fault: 0000 [#1] SMP
>>> [...]
>>> [81133.627196] RIP [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
>>> [81133.647521] Kernel panic - not syncing: Fatal exception in interrupt
>> This looks like the bug fixed in David Miller net tree :
>> http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=2235f2ac75fd2501c251b0b699a9632e80239a6d
>
> Will pull the net-tree in and re-test.

You should not pull the 'net-next', but rather the 'net' one.

'net' is not necessarily included in 'net-next'.

* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
From: Sander Eikelenboom @ 2015-08-12 21:46 UTC
To: David Miller; +Cc: eric.dumazet, linux-kernel, netdev

On 2015-08-12 23:40, David Miller wrote:
> From: linux@eikelenboom.it
> Date: Wed, 12 Aug 2015 22:50:42 +0200
>
>> On 2015-08-12 22:41, Eric Dumazet wrote:
>>> On Wed, 2015-08-12 at 21:19 +0200, linux@eikelenboom.it wrote:
>>>> Hi,
>>>> On my box running Xen with a 4.2-rc6 kernel i still get this splat
>>>> in
>>>> dom0,
>>>> which crashes the box.
>>>> (i reported a similar splat before (at rc4) here,
>>>> http://www.spinics.net/lists/netdev/msg337570.html)
>>>> Never seen this one on 4.1, so it seems a regression.
>>>> --
>>>> Sander
>>>> [81133.193439] general protection fault: 0000 [#1] SMP
>>>> [...]
>>>> [81133.627196] RIP [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
>>>> [81133.647521] Kernel panic - not syncing: Fatal exception in interrupt
>>> This looks like the bug fixed in David Miller net tree :
>>> http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=2235f2ac75fd2501c251b0b699a9632e80239a6d
>>
>> Will pull the net-tree in and re-test.
>
> You should not pull the 'net-next', but rather the 'net' one.
>
> 'net' is not necessarily included in 'net-next'.

Thanks for the reminder, but luckily i was aware of that,
seen enough of your replies asking for patches to be resubmitted
against "the other tree" ;)
Kernel with patch is currently running so fingers crossed.

--
Sander

* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
From: Eric Dumazet @ 2015-08-12 22:41 UTC
To: Sander Eikelenboom; +Cc: David Miller, linux-kernel, netdev

On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote:

> Thanks for the reminder, but luckily i was aware of that,
> seen enough of your replies asking for patches to be resubmitted
> against "the other tree" ;)
> Kernel with patch is currently running so fingers crossed.

Thanks for testing. I am definitely interested knowing your results.

* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
From: Sander Eikelenboom @ 2015-08-14 22:09 UTC
To: Eric Dumazet; +Cc: David Miller, linux-kernel, netdev, xen-devel, david.vrabel

On 2015-08-13 00:41, Eric Dumazet wrote:
> On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote:
>
>> Thanks for the reminder, but luckily i was aware of that,
>> seen enough of your replies asking for patches to be resubmitted
>> against "the other tree" ;)
>> Kernel with patch is currently running so fingers crossed.
>
> Thanks for testing. I am definitely interested knowing your results.

Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is breaking things
(have to test if a revert helps) i get this in some guests:

NMI watchdog: BUG: soft lockup - CPU#0 stuck for 506s! [swapper/0:0]
[ 6620.282805] Modules linked in:
[ 6620.282805] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc6-20150814-linus-doflr-apicrevert+ #1
[ 6620.282805] task: ffffffff8221a580 ti: ffffffff82200000 task.ti: ffffffff82200000
[ 6620.282805] RIP: e030:[<ffffffff8100122a>] [<ffffffff8100122a>] xen_hypercall_xen_version+0xa/0x20
[ 6620.282805] RSP: e02b:ffff88000fc03d48 EFLAGS: 00000246
[ 6620.282805] RAX: 0000000000040006 RBX: 0000000000000200 RCX: ffffffff8100122a
[ 6620.282805] RDX: 0000000000000001 RSI: 00000000deadbeef RDI: 00000000deadbeef
[ 6620.282805] RBP: ffff88000fc03d60 R08: ffff88000fc03ee0 R09: 00000000000000ee
[ 6620.282805] R10: ffffffff8220a0c0 R11: 0000000000000246 R12: 00000000ffffffff
[ 6620.282805] R13: 0000000000000001 R14: ffff880003b53054 R15: 0000000000000005
[ 6620.282805] FS: 00007fec747ad800(0000) GS:ffff88000fc00000(0000) knlGS:0000000000000000
[ 6620.282805] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 6620.282805] CR2: 00007ffcb7a7a6d8 CR3: 0000000003164000 CR4: 0000000000000660
[ 6620.282805] Stack:
[ 6620.282805] 0000000000000068 0000000000000007 ffffffff81008dbd ffff88000fc03dd8
[ 6620.282805] ffffffff81009592 0000000000000068 ffffffff8220a0c0 00000000000000ee
[ 6620.282805] ffff88000fc03ee0 0000000000000200 0000000000000200 0000000000000001
[ 6620.282805] Call Trace:
[ 6620.282805] <IRQ>
[ 6620.282805] [<ffffffff81008dbd>] ? xen_force_evtchn_callback+0xd/0x10
[ 6620.282805] [<ffffffff81009592>] check_events+0x12/0x20
[ 6620.282805] [<ffffffff8100957f>] ? xen_restore_fl_direct_reloc+0x4/0x4
[ 6620.282805] [<ffffffff81af79a5>] ? _raw_spin_unlock_irqrestore+0x25/0x30
[ 6620.282805] [<ffffffff8110ed43>] try_to_del_timer_sync+0x43/0x60
[ 6620.282805] [<ffffffff8110eda7>] del_timer_sync+0x47/0x60
[ 6620.282805] [<ffffffff81a2b698>] inet_csk_reqsk_queue_drop+0x118/0x1f0
[ 6620.282805] [<ffffffff81a2b8c6>] reqsk_timer_handler+0x156/0x260
[ 6620.282805] [<ffffffff81a2b770>] ? inet_csk_reqsk_queue_drop+0x1f0/0x1f0
[ 6620.282805] [<ffffffff8110f3c7>] call_timer_fn.isra.27+0x17/0x80
[ 6620.282805] [<ffffffff81a2b770>] ? inet_csk_reqsk_queue_drop+0x1f0/0x1f0
[ 6620.282805] [<ffffffff8110f55d>] run_timer_softirq+0x12d/0x200
[ 6620.282805] [<ffffffff810ca6c3>] __do_softirq+0x103/0x210
[ 6620.282805] [<ffffffff810ca9cb>] irq_exit+0x4b/0xa0
[ 6620.282805] [<ffffffff814f05d4>] xen_evtchn_do_upcall+0x34/0x50
[ 6620.282805] [<ffffffff81af932e>] xen_do_hypervisor_callback+0x1e/0x40
[ 6620.282805] <EOI>
[ 6620.282805] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[ 6620.282805] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[ 6620.282805] [<ffffffff81008d60>] ? xen_safe_halt+0x10/0x20
[ 6620.282805] [<ffffffff810188d3>] ? default_idle+0x13/0x20
[ 6620.282805] [<ffffffff81018e1a>] ? arch_cpu_idle+0xa/0x10
[ 6620.282805] [<ffffffff810f8e7e>] ? default_idle_call+0x2e/0x50
[ 6620.282805] [<ffffffff810f9112>] ? cpu_startup_entry+0x272/0x2e0
[ 6620.282805] [<ffffffff81ae7967>] ? rest_init+0x77/0x80
[ 6620.282805] [<ffffffff82312f58>] ? start_kernel+0x43b/0x448
[ 6620.282805] [<ffffffff823124ef>] ? x86_64_start_reservations+0x2a/0x2c
[ 6620.282805] [<ffffffff82316008>] ? xen_start_kernel+0x550/0x55c
[ 6620.282805] Code: cc 51 41 53 b8 10 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 11 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc

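A note on the soft lockup trace above: it shows del_timer_sync() being reached from inside call_timer_fn(), that is, from a timer handler. del_timer_sync() waits for a running handler to finish, so if the timer being deleted is the one whose handler is currently running, the CPU would spin forever, which would fit the lockup. Below is a rough sketch for spotting that pattern in a pasted trace. The function names are made up for illustration, and the check cannot tell whether the two frames concern the same timer.

```python
import re

# Matches "[<addr>] symbol+0xoff/0xsize"; the optional "? " is the marker
# the kernel prints in front of frames it considers unreliable.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x")


def reliable_frames(trace_text):
    """Return symbol names of frames not marked '?', innermost first."""
    return [m.group(2) for m in FRAME_RE.finditer(trace_text) if not m.group(1)]


def sync_delete_inside_timer(frames):
    """True if del_timer_sync appears deeper in the stack than call_timer_fn."""
    names = [f.split(".")[0] for f in frames]  # drop suffixes such as .isra.27
    if "del_timer_sync" not in names or "call_timer_fn" not in names:
        return False
    # Frames are innermost first, so a smaller index means a deeper call.
    return names.index("del_timer_sync") < names.index("call_timer_fn")


# Excerpt copied from the soft lockup trace in this message.
SAMPLE = """
[ 6620.282805] [<ffffffff81af79a5>] ? _raw_spin_unlock_irqrestore+0x25/0x30
[ 6620.282805] [<ffffffff8110ed43>] try_to_del_timer_sync+0x43/0x60
[ 6620.282805] [<ffffffff8110eda7>] del_timer_sync+0x47/0x60
[ 6620.282805] [<ffffffff81a2b698>] inet_csk_reqsk_queue_drop+0x118/0x1f0
[ 6620.282805] [<ffffffff81a2b8c6>] reqsk_timer_handler+0x156/0x260
[ 6620.282805] [<ffffffff8110f3c7>] call_timer_fn.isra.27+0x17/0x80
"""

if __name__ == "__main__":
    print(sync_delete_inside_timer(reliable_frames(SAMPLE)))
```

Frames marked "?" are skipped because the kernel flags them as possibly stale stack contents.
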
* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80
From: Sander Eikelenboom @ 2015-08-14 22:16 UTC
To: Eric Dumazet; +Cc: David Miller, linux-kernel, netdev, xen-devel, david.vrabel

On 2015-08-15 00:09, Sander Eikelenboom wrote:
> On 2015-08-13 00:41, Eric Dumazet wrote:
>> On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote:
>>
>>> Thanks for the reminder, but luckily i was aware of that,
>>> seen enough of your replies asking for patches to be resubmitted
>>> against "the other tree" ;)
>>> Kernel with patch is currently running so fingers crossed.
>>
>> Thanks for testing. I am definitely interested knowing your results.
>
> Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is
> breaking things
> (have to test if a revert helps) i get this in some guests:

Should have done that before, because it wasn't in yet .. and likely to fix the issue,
also pulled and compiling now.

--
Sander

> NMI watchdog: BUG: soft lockup - CPU#0 stuck for 506s! [swapper/0:0]
> [...]

* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80 2015-08-14 22:09 ` Sander Eikelenboom 2015-08-14 22:16 ` Sander Eikelenboom @ 2015-08-14 22:39 ` Eric Dumazet 2015-08-17 9:09 ` Sander Eikelenboom 1 sibling, 1 reply; 20+ messages in thread From: Eric Dumazet @ 2015-08-14 22:39 UTC (permalink / raw) To: Sander Eikelenboom Cc: David Miller, linux-kernel, netdev, xen-devel, david.vrabel On Sat, 2015-08-15 at 00:09 +0200, Sander Eikelenboom wrote: > On 2015-08-13 00:41, Eric Dumazet wrote: > > On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote: > > > >> Thanks for the reminder, but luckily i was aware of that, > >> seen enough of your replies asking for patches to be resubmitted > >> against "the other tree" ;) > >> Kernel with patch is currently running so fingers crossed. > > > > Thanks for testing. I am definitely interested knowing your results. > > Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is > breaking things > (have to test if a revert helps) i get this in some guests: Yes, this was fixed by : http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80 2015-08-14 22:39 ` Eric Dumazet @ 2015-08-17 9:09 ` Sander Eikelenboom 2015-08-17 13:37 ` Eric Dumazet 0 siblings, 1 reply; 20+ messages in thread From: Sander Eikelenboom @ 2015-08-17 9:09 UTC (permalink / raw) To: Eric Dumazet; +Cc: David Miller, linux-kernel, netdev, xen-devel, david.vrabel Saturday, August 15, 2015, 12:39:25 AM, you wrote: > On Sat, 2015-08-15 at 00:09 +0200, Sander Eikelenboom wrote: >> On 2015-08-13 00:41, Eric Dumazet wrote: >> > On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote: >> > >> >> Thanks for the reminder, but luckily i was aware of that, >> >> seen enough of your replies asking for patches to be resubmitted >> >> against "the other tree" ;) >> >> Kernel with patch is currently running so fingers crossed. >> > >> > Thanks for testing. I am definitely interested knowing your results. >> >> Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is >> breaking things >> (have to test if a revert helps) i get this in some guests: > Yes, this was fixed by : > http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af Hi Eric, With that patch i had a crash again this night, see below. 
-- Sander [177459.188808] general protection fault: 0000 [#1] SMP [177459.199746] Modules linked in: [177459.210540] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc6-20150815-linus-doflr-net+ #1 [177459.221441] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010 [177459.232247] task: ffffffff8221a580 ti: ffffffff82200000 task.ti: ffffffff82200000 [177459.242931] RIP: e030:[<ffffffff8110eb58>] [<ffffffff8110eb58>] detach_if_pending+0x18/0x80 [177459.253503] RSP: e02b:ffff88005f6039d8 EFLAGS: 00010086 [177459.264051] RAX: ffff8800584d6580 RBX: ffff880004901420 RCX: dead000000200200 [177459.274599] RDX: 0000000000000000 RSI: ffff88005f60e5c0 RDI: ffff880004901420 [177459.285122] RBP: ffff88005f6039d8 R08: 0000000000000001 R09: 0000000000000000 [177459.295286] R10: 0000000000000003 R11: ffff880004901394 R12: 0000000000000003 [177459.305388] R13: 000000010ae47040 R14: 0000000007b98a00 R15: ffff88005f60e5c0 [177459.315345] FS: 00007f51317ec700(0000) GS:ffff88005f600000(0000) knlGS:0000000000000000 [177459.325340] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b [177459.335217] CR2: 00000000010f8000 CR3: 000000002a154000 CR4: 0000000000000660 [177459.345129] Stack: [177459.354783] ffff88005f603a28 ffffffff8110ee7f ffffffff810fb261 0000000000000200 [177459.364505] 0000000000000003 ffff880004901380 0000000000000003 ffff8800567d0d00 [177459.374064] 0000000007b98a00 0000000000000000 ffff88005f603a58 ffffffff819b3eb3 [177459.383532] Call Trace: [177459.392878] <IRQ> [177459.392935] [<ffffffff8110ee7f>] mod_timer_pending+0x3f/0xe0 [177459.411058] [<ffffffff810fb261>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20 [177459.419876] [<ffffffff819b3eb3>] __nf_ct_refresh_acct+0xa3/0xb0 [177459.428642] [<ffffffff819baafb>] tcp_packet+0xb3b/0x1290 [177459.437285] [<ffffffff81a2535e>] ? ip_output+0x5e/0xc0 [177459.445845] [<ffffffff810ca8ca>] ? __local_bh_enable_ip+0x2a/0x90 [177459.454331] [<ffffffff819b35a9>] ? 
__nf_conntrack_find_get+0x129/0x2a0 [177459.462642] [<ffffffff819b549c>] nf_conntrack_in+0x29c/0x7c0 [177459.470711] [<ffffffff81a65e9c>] ipv4_conntrack_local+0x4c/0x50 [177459.478753] [<ffffffff819ad67c>] nf_iterate+0x4c/0x80 [177459.486726] [<ffffffff81102437>] ? generic_handle_irq+0x27/0x40 [177459.494634] [<ffffffff819ad714>] nf_hook_slow+0x64/0xc0 [177459.502486] [<ffffffff81a22d40>] __ip_local_out_sk+0x90/0xa0 [177459.510248] [<ffffffff81a22c40>] ? ip_forward_options+0x1a0/0x1a0 [177459.517782] [<ffffffff81a22d66>] ip_local_out_sk+0x16/0x40 [177459.525044] [<ffffffff81a2343d>] ip_queue_xmit+0x14d/0x350 [177459.532247] [<ffffffff81a3ae7e>] tcp_transmit_skb+0x48e/0x960 [177459.539413] [<ffffffff81a3cddb>] tcp_xmit_probe_skb+0xdb/0xf0 [177459.546389] [<ffffffff81a3dffb>] tcp_write_wakeup+0x5b/0x150 [177459.553061] [<ffffffff81a3e51b>] tcp_keepalive_timer+0x1fb/0x230 [177459.559761] [<ffffffff81a3e320>] ? tcp_init_xmit_timers+0x20/0x20 [177459.566447] [<ffffffff8110f3c7>] call_timer_fn.isra.27+0x17/0x80 [177459.573121] [<ffffffff81a3e320>] ? tcp_init_xmit_timers+0x20/0x20 [177459.579778] [<ffffffff8110f55d>] run_timer_softirq+0x12d/0x200 [177459.586448] [<ffffffff810ca6c3>] __do_softirq+0x103/0x210 [177459.593138] [<ffffffff810ca9cb>] irq_exit+0x4b/0xa0 [177459.599783] [<ffffffff814f05d4>] xen_evtchn_do_upcall+0x34/0x50 [177459.606300] [<ffffffff81af93ae>] xen_do_hypervisor_callback+0x1e/0x40 [177459.612583] <EOI> [177459.612637] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20 [177459.625010] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20 [177459.631157] [<ffffffff81008d60>] ? xen_safe_halt+0x10/0x20 [177459.637158] [<ffffffff810188d3>] ? default_idle+0x13/0x20 [177459.643072] [<ffffffff81018e1a>] ? arch_cpu_idle+0xa/0x10 [177459.648809] [<ffffffff810f8e7e>] ? default_idle_call+0x2e/0x50 [177459.654650] [<ffffffff810f9112>] ? cpu_startup_entry+0x272/0x2e0 [177459.660488] [<ffffffff81ae79f7>] ? 
rest_init+0x77/0x80 [177459.666297] [<ffffffff82312f58>] ? start_kernel+0x43b/0x448 [177459.672092] [<ffffffff823124ef>] ? x86_64_start_reservations+0x2a/0x2c [177459.677800] [<ffffffff82316008>] ? xen_start_kernel+0x550/0x55c [177459.683451] Code: 77 28 5d c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 48 8b 47 08 55 48 89 e5 48 85 c0 74 6a 48 8b 0f 48 85 c9 48 89 08 74 04 <48> 89 41 08 84 d2 74 08 48 c7 47 08 00 00 00 00 f6 47 2a 10 48 [177459.695332] RIP [<ffffffff8110eb58>] detach_if_pending+0x18/0x80 [177459.701154] RSP <ffff88005f6039d8> (XEN) [2015-08-17 00:11:51.426] Hardware Dom0 crashed: rebooting machine in 5 seconds. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80 2015-08-17 9:09 ` Sander Eikelenboom @ 2015-08-17 13:37 ` Eric Dumazet 2015-08-17 13:48 ` Sander Eikelenboom 0 siblings, 1 reply; 20+ messages in thread From: Eric Dumazet @ 2015-08-17 13:37 UTC (permalink / raw) To: Sander Eikelenboom Cc: David Miller, linux-kernel, netdev, xen-devel, david.vrabel On Mon, 2015-08-17 at 11:09 +0200, Sander Eikelenboom wrote: > Saturday, August 15, 2015, 12:39:25 AM, you wrote: > > > On Sat, 2015-08-15 at 00:09 +0200, Sander Eikelenboom wrote: > >> On 2015-08-13 00:41, Eric Dumazet wrote: > >> > On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote: > >> > > >> >> Thanks for the reminder, but luckily i was aware of that, > >> >> seen enough of your replies asking for patches to be resubmitted > >> >> against "the other tree" ;) > >> >> Kernel with patch is currently running so fingers crossed. > >> > > >> > Thanks for testing. I am definitely interested knowing your results. > >> > >> Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is > >> breaking things > >> (have to test if a revert helps) i get this in some guests: > > > > Yes, this was fixed by : > > http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af > > > Hi Eric, > > With that patch i had a crash again this night, see below. 
>
> --
> Sander
>
> <snip>
>

might be conntracking related then.

You might try :

1) reproduce the issue without conntracking.
2) bisect the bug

Thanks.

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80 2015-08-17 13:37 ` Eric Dumazet @ 2015-08-17 13:48 ` Sander Eikelenboom 2015-08-17 14:02 ` Jon Christopherson 0 siblings, 1 reply; 20+ messages in thread From: Sander Eikelenboom @ 2015-08-17 13:48 UTC (permalink / raw) To: Eric Dumazet; +Cc: David Miller, linux-kernel, netdev, xen-devel, david.vrabel Monday, August 17, 2015, 3:37:13 PM, you wrote: > On Mon, 2015-08-17 at 11:09 +0200, Sander Eikelenboom wrote: >> Saturday, August 15, 2015, 12:39:25 AM, you wrote: >> >> > On Sat, 2015-08-15 at 00:09 +0200, Sander Eikelenboom wrote: >> >> On 2015-08-13 00:41, Eric Dumazet wrote: >> >> > On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote: >> >> > >> >> >> Thanks for the reminder, but luckily i was aware of that, >> >> >> seen enough of your replies asking for patches to be resubmitted >> >> >> against "the other tree" ;) >> >> >> Kernel with patch is currently running so fingers crossed. >> >> > >> >> > Thanks for testing. I am definitely interested knowing your results. >> >> >> >> Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is >> >> breaking things >> >> (have to test if a revert helps) i get this in some guests: >> >> >> > Yes, this was fixed by : >> > http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af >> >> >> Hi Eric, >> >> With that patch i had a crash again this night, see below. 
>>
>> --
>> Sander
>>
>> <snip>
>>

> might be conntracking related then.

> You might try :

> 1) reproduce the issue without conntracking.

Will see if i can do that.

> 2) bisect the bug

Hmm that's going to be quite painful, since i don't have an immediate and reliable testcase (running for "about two days" doesn't qualify). Especially since there are all kinds of other known bugs in between.

> Thanks.

--
Sander

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80 2015-08-17 13:48 ` Sander Eikelenboom @ 2015-08-17 14:02 ` Jon Christopherson 2015-08-17 14:21 ` Eric Dumazet 0 siblings, 1 reply; 20+ messages in thread From: Jon Christopherson @ 2015-08-17 14:02 UTC (permalink / raw) To: Sander Eikelenboom Cc: David Miller, linux-kernel, netdev, xen-devel, david.vrabel, Eric Dumazet This is very similar to the behavior I am seeing in this bug: https://bugzilla.kernel.org/show_bug.cgi?id=102911 -Jon ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80 2015-08-17 14:02 ` Jon Christopherson @ 2015-08-17 14:21 ` Eric Dumazet 2015-08-17 14:25 ` Sander Eikelenboom 0 siblings, 1 reply; 20+ messages in thread From: Eric Dumazet @ 2015-08-17 14:21 UTC (permalink / raw) To: Jon Christopherson Cc: Sander Eikelenboom, David Miller, linux-kernel, netdev, xen-devel, david.vrabel On Mon, 2015-08-17 at 09:02 -0500, Jon Christopherson wrote: > This is very similar to the behavior I am seeing in this bug: > > https://bugzilla.kernel.org/show_bug.cgi?id=102911 OK, but have you applied the fix ? http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af It will be part of net iteration from David Miller to Linus Torvalds. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80 2015-08-17 14:21 ` Eric Dumazet @ 2015-08-17 14:25 ` Sander Eikelenboom 2015-08-17 15:16 ` Jon Christopherson 2015-08-17 17:18 ` Eric Dumazet 0 siblings, 2 replies; 20+ messages in thread From: Sander Eikelenboom @ 2015-08-17 14:25 UTC (permalink / raw) To: Eric Dumazet Cc: Jon Christopherson, David Miller, linux-kernel, netdev, xen-devel, david.vrabel Monday, August 17, 2015, 4:21:47 PM, you wrote: > On Mon, 2015-08-17 at 09:02 -0500, Jon Christopherson wrote: >> This is very similar to the behavior I am seeing in this bug: >> >> https://bugzilla.kernel.org/show_bug.cgi?id=102911 > OK, but have you applied the fix ? > http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af > It will be part of net iteration from David Miller to Linus Torvalds. I did have that patch in for my last report. But i don't think he had (looking at the second part of his oops). -- Sander ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80 2015-08-17 14:25 ` Sander Eikelenboom @ 2015-08-17 15:16 ` Jon Christopherson 2015-08-17 17:18 ` Eric Dumazet 1 sibling, 0 replies; 20+ messages in thread From: Jon Christopherson @ 2015-08-17 15:16 UTC (permalink / raw) To: Sander Eikelenboom Cc: David Miller, linux-kernel, netdev, xen-devel, david.vrabel, Eric Dumazet On 08/17/2015 09:25 AM, Sander Eikelenboom wrote: > > > OK, but have you applied the fix ? > > > http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af > > > It will be part of net iteration from David Miller to Linus Torvalds. > > I did not have that fix applied, but will apply and test. Thanks, Jon ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80 2015-08-17 14:25 ` Sander Eikelenboom 2015-08-17 15:16 ` Jon Christopherson @ 2015-08-17 17:18 ` Eric Dumazet 2015-08-17 18:27 ` Sander Eikelenboom ` (2 more replies) 1 sibling, 3 replies; 20+ messages in thread
From: Eric Dumazet @ 2015-08-17 17:18 UTC (permalink / raw)
To: Sander Eikelenboom, Thomas Gleixner
Cc: Jon Christopherson, David Miller, linux-kernel, netdev, xen-devel, david.vrabel

From: Eric Dumazet <edumazet@google.com>

On Mon, 2015-08-17 at 16:25 +0200, Sander Eikelenboom wrote:
> Monday, August 17, 2015, 4:21:47 PM, you wrote:
>
> > On Mon, 2015-08-17 at 09:02 -0500, Jon Christopherson wrote:
> >> This is very similar to the behavior I am seeing in this bug:
> >>
> >> https://bugzilla.kernel.org/show_bug.cgi?id=102911
>
> > OK, but have you applied the fix ?
>
> > http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af
>
> > It will be part of net iteration from David Miller to Linus Torvalds.
>
>
> I did have that patch in for my last report.
> But i don't think he had (looking at the second part of his oops).
>

Then can you try following fix as well ?

Thanks !

[PATCH] timer: fix a race in __mod_timer()

lock_timer_base() can not catch following :

CPU1 (in __mod_timer())
timer->flags |= TIMER_MIGRATING;
spin_unlock(&base->lock);
base = new_base;
spin_lock(&base->lock);
timer->flags &= ~TIMER_BASEMASK;
                                  CPU2 (in lock_timer_base())
                                  see timer base is cpu0 base
                                  spin_lock_irqsave(&base->lock, *flags);
                                  if (timer->flags == tf)
                                       return base; // oops, wrong base
timer->flags |= base->cpu // too late

We must write timer->flags in one go, otherwise we can fool other cpus.

Fixes: bc7a34b8b9eb ("timer: Reduce timer migration overhead if disabled")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/timer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 5e097fa9faf7..84190f02b521 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -807,8 +807,8 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 			spin_unlock(&base->lock);
 			base = new_base;
 			spin_lock(&base->lock);
-			timer->flags &= ~TIMER_BASEMASK;
-			timer->flags |= base->cpu;
+			WRITE_ONCE(timer->flags,
+				   (timer->flags & ~TIMER_BASEMASK) | base->cpu);
 		}
 	}

^ permalink raw reply related [flat|nested] 20+ messages in thread
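The race described in the patch above can be illustrated with a small userspace sketch. This is not kernel code: the `SK_*` constants and `sk_*` helpers are hypothetical names invented for this example, and the flag values are assumptions modelled on the 4.2-era layout in which the base mask covers both the CPU bits and the migrating bit. The sketch logs every value that becomes visible in memory, so the intermediate state produced by the two-step update can be compared with the single store used by the fix.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flag layout; the exact values are assumptions for this sketch. */
#define SK_CPUMASK   0x0007FFFFu
#define SK_MIGRATING 0x00080000u
#define SK_BASEMASK  (SK_CPUMASK | SK_MIGRATING)
#define SK_LOG_MAX   4

/* Records every value that another CPU could observe in memory. */
struct sk_log {
    uint32_t seen[SK_LOG_MAX];
    int count;
};

static void sk_store(uint32_t *flags, uint32_t val, struct sk_log *log)
{
    assert(log->count < SK_LOG_MAX);
    *flags = val;
    log->seen[log->count++] = val;
}

/* Old sequence: clear the base bits, then set the new CPU (two stores). */
static void sk_retarget_two_stores(uint32_t *flags, uint32_t cpu,
                                   struct sk_log *log)
{
    sk_store(flags, *flags & ~SK_BASEMASK, log);
    sk_store(flags, *flags | cpu, log);
}

/* Patched sequence: compute the final value first, publish it once. */
static void sk_retarget_one_store(uint32_t *flags, uint32_t cpu,
                                  struct sk_log *log)
{
    sk_store(flags, (*flags & ~SK_BASEMASK) | cpu, log);
}

/* A lock_timer_base()-style reader trusts any value that is not migrating. */
static int sk_reader_would_pick(uint32_t flags, uint32_t cpu)
{
    return !(flags & SK_MIGRATING) && (flags & SK_CPUMASK) == cpu;
}

/* Returns 1 if any published value would make a reader pick wrong_cpu. */
static int sk_exposes_wrong_base(const struct sk_log *log, uint32_t wrong_cpu)
{
    for (int i = 0; i < log->count; i++)
        if (sk_reader_would_pick(log->seen[i], wrong_cpu))
            return 1;
    return 0;
}
```

Starting from a timer marked as migrating away from CPU 0 and retargeted to CPU 3, the two-store variant publishes an intermediate value of 0, which a reader would take to mean "settled on CPU 0", while the one-store variant only ever publishes the final value. In the kernel the single store is additionally wrapped in `WRITE_ONCE()` so the compiler does not split it; the sketch does not model that.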
* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80 2015-08-17 17:18 ` Eric Dumazet @ 2015-08-17 18:27 ` Sander Eikelenboom 2015-08-17 19:13 ` Thomas Gleixner 2015-08-18 0:05 ` Jon Christopherson 2 siblings, 0 replies; 20+ messages in thread From: Sander Eikelenboom @ 2015-08-17 18:27 UTC (permalink / raw) To: Eric Dumazet Cc: Thomas Gleixner, Jon Christopherson, David Miller, linux-kernel, netdev, xen-devel, david.vrabel On 2015-08-17 19:18, Eric Dumazet wrote: > From: Eric Dumazet <edumazet@google.com> > > On Mon, 2015-08-17 at 16:25 +0200, Sander Eikelenboom wrote: >> Monday, August 17, 2015, 4:21:47 PM, you wrote: >> >> > On Mon, 2015-08-17 at 09:02 -0500, Jon Christopherson wrote: >> >> This is very similar to the behavior I am seeing in this bug: >> >> >> >> https://bugzilla.kernel.org/show_bug.cgi?id=102911 >> >> > OK, but have you applied the fix ? >> >> > http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af >> >> > It will be part of net iteration from David Miller to Linus Torvald. >> >> >> I did have that patch in for my last report. >> But i don't think he had (looking at the second part of his oops). >> > > Then can you try following fix as well ? > > Thanks ! Running now :) > > [PATCH] timer: fix a race in __mod_timer() > > lock_timer_base() can not catch following : > > CPU1 ( in __mod_timer() > timer->flags |= TIMER_MIGRATING; > spin_unlock(&base->lock); > base = new_base; > spin_lock(&base->lock); > timer->flags &= ~TIMER_BASEMASK; > CPU2 (in lock_timer_base()) > see timer base is cpu0 base > spin_lock_irqsave(&base->lock, > *flags); > if (timer->flags == tf) > return base; // oops, wrong base > timer->flags |= base->cpu // too late > > We must write timer->flags in one go, otherwise we can fool other cpus. 
> > Fixes: bc7a34b8b9eb ("timer: Reduce timer migration overhead if > disabled") > Signed-off-by: Eric Dumazet <edumazet@google.com> > Cc: Thomas Gleixner <tglx@linutronix.de> > --- > kernel/time/timer.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/kernel/time/timer.c b/kernel/time/timer.c > index 5e097fa9faf7..84190f02b521 100644 > --- a/kernel/time/timer.c > +++ b/kernel/time/timer.c > @@ -807,8 +807,8 @@ __mod_timer(struct timer_list *timer, unsigned long > expires, > spin_unlock(&base->lock); > base = new_base; > spin_lock(&base->lock); > - timer->flags &= ~TIMER_BASEMASK; > - timer->flags |= base->cpu; > + WRITE_ONCE(timer->flags, > + (timer->flags & ~TIMER_BASEMASK) | base->cpu); > } > } ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80 2015-08-17 17:18 ` Eric Dumazet 2015-08-17 18:27 ` Sander Eikelenboom @ 2015-08-17 19:13 ` Thomas Gleixner 2015-08-18 0:05 ` Jon Christopherson 2 siblings, 0 replies; 20+ messages in thread From: Thomas Gleixner @ 2015-08-17 19:13 UTC (permalink / raw) To: Eric Dumazet Cc: Sander Eikelenboom, Jon Christopherson, David Miller, linux-kernel, netdev, xen-devel, david.vrabel On Mon, 17 Aug 2015, Eric Dumazet wrote: > [PATCH] timer: fix a race in __mod_timer() > > lock_timer_base() can not catch following : > > CPU1 ( in __mod_timer() > timer->flags |= TIMER_MIGRATING; > spin_unlock(&base->lock); > base = new_base; > spin_lock(&base->lock); > timer->flags &= ~TIMER_BASEMASK; > CPU2 (in lock_timer_base()) > see timer base is cpu0 base > spin_lock_irqsave(&base->lock, *flags); > if (timer->flags == tf) > return base; // oops, wrong base > timer->flags |= base->cpu // too late > > We must write timer->flags in one go, otherwise we can fool other cpus. > > Fixes: bc7a34b8b9eb ("timer: Reduce timer migration overhead if disabled") > Signed-off-by: Eric Dumazet <edumazet@google.com> > Cc: Thomas Gleixner <tglx@linutronix.de> > --- > kernel/time/timer.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/kernel/time/timer.c b/kernel/time/timer.c > index 5e097fa9faf7..84190f02b521 100644 > --- a/kernel/time/timer.c > +++ b/kernel/time/timer.c > @@ -807,8 +807,8 @@ __mod_timer(struct timer_list *timer, unsigned long expires, > spin_unlock(&base->lock); > base = new_base; > spin_lock(&base->lock); > - timer->flags &= ~TIMER_BASEMASK; > - timer->flags |= base->cpu; > + WRITE_ONCE(timer->flags, > + (timer->flags & ~TIMER_BASEMASK) | base->cpu); Duh, yes. Picking it up for timers/urgent. Thanks for spotting it. tglx ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80 2015-08-17 17:18 ` Eric Dumazet 2015-08-17 18:27 ` Sander Eikelenboom 2015-08-17 19:13 ` Thomas Gleixner @ 2015-08-18 0:05 ` Jon Christopherson 2 siblings, 0 replies; 20+ messages in thread From: Jon Christopherson @ 2015-08-18 0:05 UTC (permalink / raw) To: Eric Dumazet, Sander Eikelenboom, Thomas Gleixner Cc: David Miller, linux-kernel, netdev, xen-devel, david.vrabel On 08/17/2015 12:18 PM, Eric Dumazet wrote: > From: Eric Dumazet <edumazet@google.com> <snip> > > Then can you try following fix as well ? > > Thanks ! > > [PATCH] timer: fix a race in __mod_timer() > <snip> I have been running the latest code from git with the 2 patches in this thread applied. No issues so far. -Jon ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2015-08-18 0:05 UTC | newest]

Thread overview: 20+ messages
2015-08-12 19:19 Linux 4.2-rc6 regression: RIP: e030:[<ffffffff8110fb18>] [<ffffffff8110fb18>] detach_if_pending+0x18/0x80 linux
2015-08-12 20:41 ` Eric Dumazet
2015-08-12 20:50 ` linux
2015-08-12 21:40 ` David Miller
2015-08-12 21:46 ` Sander Eikelenboom
2015-08-12 22:41 ` Eric Dumazet
2015-08-14 22:09 ` Sander Eikelenboom
2015-08-14 22:16 ` Sander Eikelenboom
2015-08-14 22:39 ` Eric Dumazet
2015-08-17 9:09 ` Sander Eikelenboom
2015-08-17 13:37 ` Eric Dumazet
2015-08-17 13:48 ` Sander Eikelenboom
2015-08-17 14:02 ` Jon Christopherson
2015-08-17 14:21 ` Eric Dumazet
2015-08-17 14:25 ` Sander Eikelenboom
2015-08-17 15:16 ` Jon Christopherson
2015-08-17 17:18 ` Eric Dumazet
2015-08-17 18:27 ` Sander Eikelenboom
2015-08-17 19:13 ` Thomas Gleixner
2015-08-18 0:05 ` Jon Christopherson