* Re: [Qemu-devel] [Bug 1379340] [NEW] qemu-kvm guest panic for smp trusty guests
From: Serge Hallyn @ 2014-10-16 9:21 UTC (permalink / raw)
To: qemu-devel
affects: ubuntu/qemu
importance: high
affects: qemu
importance: high
** Also affects: qemu (Ubuntu)
Importance: High
Status: New
** Also affects: qemu
Importance: High
Status: New
** No longer affects: qemu-kvm (Ubuntu)
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1379340
Title:
qemu-kvm guest panic for smp trusty guests
Status in QEMU:
New
Status in “qemu” package in Ubuntu:
New
Bug description:
Just upgraded OpenStack compute hosts in our public cloud (using qemu-
kvm via libvirt) from Precise to Trusty (14.04.1), now on kernel
3.13.0-36-generic with qemu-kvm 2.0.0+dfsg-2ubuntu1.5.
Following the upgrade, whenever we try to start an smp/multicore
Trusty guest (existing or new), we run into this panic [1] inside the
guest just towards the end of boot. This happens consistently for smp
guests using the Trusty kernel (i.e., it also affects earlier Ubuntu
releases using the HWE kernel from Trusty, but not their native
kernels). I didn't have any other distro images to hand with 3.13.x
kernels, but none of the others I tested were affected (in the 3.2 -
3.16 kernel range).
Similar reports are scarce, but the one we did find pointed to a CPU
feature as the trigger. We were running these hosts with libvirt cpu
mode set to "host-passthrough" (so qemu starts with "-cpu host"), on
AMD Opteron 6200 & 6300 hardware. Switching the guest domains to cpu
mode "host-model" instead works around the issue and is perfectly
acceptable for most of our users.
We also have various Intel compute hosts, and they don't seem to be
affected.
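The workaround described above amounts to changing the mode attribute of the <cpu> element in the libvirt domain definition. A minimal sketch of that change, assuming a heavily trimmed, hypothetical domain XML (the guest name and the `use_host_model` helper are made up for illustration; in a Nova deployment the XML is generated by Nova, so a hand edit like this would probably not persist):

```python
import xml.etree.ElementTree as ET


def use_host_model(domain_xml: str) -> str:
    """Return the domain XML with cpu mode host-passthrough switched to host-model.

    Hypothetical helper for illustration; it only touches the top-level
    <cpu> element and leaves every other element unchanged.
    """
    root = ET.fromstring(domain_xml)
    cpu = root.find("cpu")
    if cpu is not None and cpu.get("mode") == "host-passthrough":
        cpu.set("mode", "host-model")
    return ET.tostring(root, encoding="unicode")


# Hypothetical, trimmed domain definition (not taken from the bug report).
SAMPLE = """<domain type='kvm'>
  <name>trusty-smp-guest</name>
  <vcpu>2</vcpu>
  <cpu mode='host-passthrough'/>
</domain>"""

updated = use_host_model(SAMPLE)
print(updated)
```

With "host-model", libvirt picks a named CPU model close to the host rather than passing every host CPU feature through, which is consistent with the report that a CPU feature is the trigger.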
[1]
[ 11.256924] divide error: 0000 [#1] SMP
[ 11.258133] Modules linked in: kvm_amd kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd serio_raw lp parport psmouse floppy
[ 11.260228] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-36-generic #63-Ubuntu
[ 11.260228] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011
[ 11.260228] task: ffffffff81c15480 ti: ffffffff81c00000 task.ti: ffffffff81c00000
[ 11.260228] RIP: 0010:[<ffffffff8104ed58>] [<ffffffff8104ed58>] kvm_unlock_kick+0xa8/0x100
[ 11.260228] RSP: 0018:ffff88023fc03c98 EFLAGS: 00010046
[ 11.260228] RAX: 0000000000000005 RBX: 0000000000000000 RCX: 0000000000000001
[ 11.260228] RDX: ffffffff81eaf408 RSI: 0000000000000000 RDI: 0000000000000000
[ 11.260228] RBP: ffff88023fc03cb8 R08: ffffffff81eaf400 R09: 00000000ffffffff
[ 11.260228] R10: ffff880037612cc0 R11: ffffea0002eb0a00 R12: ffff8800374a33c0
[ 11.260228] R13: 0000000000000020 R14: 0000000000000001 R15: 0000000000000286
[ 11.260228] FS: 00007f1e8b538740(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
[ 11.260228] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 11.260228] CR2: 00007f1e8ae09d50 CR3: 0000000001c0e000 CR4: 00000000000406f0
[ 11.260228] Stack:
[ 11.260228] 0000000000000286 0000000000000001 0000000000000001 00000000000000c3
[ 11.260228] ffff88023fc03cc8 ffffffff81717ed6 ffff88023fc03ce0 ffffffff8172641a
[ 11.260228] ffff8800374a33c0 ffff88023fc03d18 ffffffff810aaeb0 ffff88023295e000
[ 11.260228] Call Trace:
[ 11.260228] <IRQ>
[ 11.260228] [<ffffffff81717ed6>] __ticket_unlock_slowpath+0x24/0x34
[ 11.260228] [<ffffffff8172641a>] _raw_spin_unlock_irqrestore+0x3a/0x40
[ 11.260228] [<ffffffff810aaeb0>] __wake_up_sync_key+0x50/0x60
[ 11.260228] [<ffffffff8160ca5a>] sock_def_readable+0x3a/0x70
[ 11.260228] [<ffffffff816fda0a>] packet_rcv+0x2fa/0x430
[ 11.260228] [<ffffffff816228b0>] __netif_receive_skb_core+0x360/0x840
[ 11.260228] [<ffffffff81622da8>] __netif_receive_skb+0x18/0x60
[ 11.260228] [<ffffffff81622e13>] netif_receive_skb+0x23/0x90
[ 11.260228] [<ffffffff815288d4>] virtnet_poll+0x4d4/0x850
[ 11.260228] [<ffffffff81623192>] net_rx_action+0x152/0x250
[ 11.260228] [<ffffffff8106cbac>] __do_softirq+0xec/0x2c0
[ 11.260228] [<ffffffff8106d0f5>] irq_exit+0x105/0x110
[ 11.260228] [<ffffffff817312d6>] do_IRQ+0x56/0xc0
[ 11.260228] [<ffffffff81726a6d>] common_interrupt+0x6d/0x6d
[ 11.260228] <EOI>
[ 11.260228] [<ffffffff8104f596>] ? native_safe_halt+0x6/0x10
[ 11.260228] [<ffffffff8101c62f>] default_idle+0x1f/0xc0
[ 11.260228] [<ffffffff8101cef6>] arch_cpu_idle+0x26/0x30
[ 11.260228] [<ffffffff810bed95>] cpu_startup_entry+0xc5/0x290
[ 11.260228] [<ffffffff8170ca77>] rest_init+0x77/0x80
[ 11.260228] [<ffffffff81d35f6b>] start_kernel+0x433/0x43e
[ 11.260228] [<ffffffff81d35941>] ? repair_env_string+0x5c/0x5c
[ 11.260228] [<ffffffff81d35120>] ? early_idt_handlers+0x120/0x120
[ 11.260228] [<ffffffff81d355ee>] x86_64_start_reservations+0x2a/0x2c
[ 11.260228] [<ffffffff81d35733>] x86_64_start_kernel+0x143/0x152
[ 11.260228] Code: 66 44 39 e8 75 bd 0f b6 35 f6 06 e6 00 40 84 f6 75 2a 83 05 06 07 e6 00 01 48 c7 c0 6a b0 00 00 31 db 0f b7 0c 01 b8 05 00 00 00 <0f> 01 c1 0f 1f 44 00 00 5b 41 5c 41 5d 41 5e 5d c3 89 f0 31 c9
[ 11.260228] RIP [<ffffffff8104ed58>] kvm_unlock_kick+0xa8/0x100
[ 11.260228] RSP <ffff88023fc03c98>
[ 11.260228] ---[ end trace f1c26ff24745b331 ]---
[ 11.260228] Kernel panic - not syncing: Fatal exception in interrupt
[ 11.260228] Shutting down cpus with NMI
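As a reading aid for the trace above, the following sketch extracts the bytes at the faulting RIP, which the oops marks with angle brackets in the "Code:" line, and looks them up in a two-entry table. The table is an assumption-sized illustration holding only the Intel and AMD hypercall encodings, not a disassembler, and the helper name `faulting_bytes` is made up here. The result is an observation about the trace, not a diagnosis.

```python
# Hypercall instruction encodings: vmcall is the Intel VT-x form,
# vmmcall is the AMD SVM form. Only these two are listed on purpose.
HYPERCALL_OPCODES = {
    (0x0F, 0x01, 0xC1): "vmcall (Intel VT-x)",
    (0x0F, 0x01, 0xD9): "vmmcall (AMD SVM)",
}


def faulting_bytes(code_line: str, count: int = 3) -> tuple:
    """Return `count` bytes starting at the byte the oops wraps in <..>."""
    tokens = code_line.split("Code:", 1)[1].split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return tuple(int(tok.strip("<>"), 16) for tok in tokens[start:start + count])


# The "Code:" line copied from the trace above.
CODE_LINE = (
    "[ 11.260228] Code: 66 44 39 e8 75 bd 0f b6 35 f6 06 e6 00 40 84 f6 "
    "75 2a 83 05 06 07 e6 00 01 48 c7 c0 6a b0 00 00 31 db 0f b7 0c 01 "
    "b8 05 00 00 00 <0f> 01 c1 0f 1f 44 00 00 5b 41 5c 41 5d 41 5e 5d "
    "c3 89 f0 31 c9"
)

opcode = faulting_bytes(CODE_LINE)
mnemonic = HYPERCALL_OPCODES.get(opcode, "not in the table")
print(" ".join(f"{b:02x}" for b in opcode), "->", mnemonic)
```

The bytes just before the marker (b8 05 00 00 00) load 5 into EAX, which agrees with RAX in the register dump and with the value of KVM_HC_KICK_CPU in the kernel's kvm_para.h, so the fault seems to occur on the kick hypercall issued from kvm_unlock_kick.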
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1379340/+subscriptions
* [Qemu-devel] [Bug 1379340] Re: qemu-kvm guest panic for AMD smp trusty guests
From: Serge Hallyn @ 2014-10-29 17:18 UTC (permalink / raw)
To: qemu-devel
** Summary changed:
- qemu-kvm guest panic for smp trusty guests
+ qemu-kvm guest panic for AMD smp trusty guests
* [Qemu-devel] [Bug 1379340] Re: qemu-kvm guest panic for AMD smp trusty guests
From: Serge Hallyn @ 2014-10-29 17:23 UTC (permalink / raw)
To: qemu-devel
Thanks for reporting this bug.
Would it be possible for you to test with the daily upstream kernel
builds, to rule out any temporary kernel regressions? Instructions for
that are here: https://wiki.ubuntu.com/Kernel/MainlineBuilds
I'll also build an up-to-date daily qemu build so we can test whether it
has been fixed there.
I don't currently have an AMD box available (but should in a few weeks)
to try to reproduce this, and then hopefully bisect.
* [Qemu-devel] [Bug 1379340] Re: qemu-kvm guest panic for AMD smp trusty guests
From: new23d @ 2014-11-16 4:47 UTC (permalink / raw)
To: qemu-devel
I have more-or-less the same underlying hardware and tested with [1].
Same results. Keen to help in any possible way.
[1] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.13.11.11-trusty/linux-image-3.13.11-03131111-generic_3.13.11-03131111.201411111336_amd64.deb
* [Qemu-devel] [Bug 1379340] Re: qemu-kvm guest panic for AMD smp trusty guests
From: new23d @ 2014-11-16 5:20 UTC (permalink / raw)
To: qemu-devel
Update: upon testing the 'daily' build as suggested, I found that [1]
appears to work stably for me. I'll update this thread if I find it
crashing in the near future.
[1] http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/linux-image-3.18.0-999-generic_3.18.0-999.201411152105_amd64.deb
* [Qemu-devel] [Bug 1379340] Re: qemu-kvm guest panic for AMD smp trusty guests
From: Paolo Bonzini @ 2014-11-16 8:08 UTC (permalink / raw)
To: qemu-devel
** No longer affects: qemu