From: Stefan Bader <stefan.bader@canonical.com>
To: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Shutdown panic in disable_nonboot_cpus after cpupool-numa-split
Date: Mon, 07 Jul 2014 13:33:14 +0200
Message-ID: <53BA857A.8070608@canonical.com>
I recently noticed that I get a panic (rebooting the system) on shutdown in some
cases. This happened only on my AMD system, and not every time. I finally
realized that it is related to the use of cpupool-numa-split
(libxl with xen-4.4; maybe 4.3 as well, but I am not 100% sure).
What happens is that on shutdown the hypervisor runs disable_nonboot_cpus, which
calls cpu_down for each online cpu. There is a BUG_ON in the code for the case of
cpu_down returning -EBUSY. This happens in my case as soon as the first cpu that
has been moved to pool-1 by cpupool-numa-split is attempted. The error is
returned by running the notifier_call_chain, and I suspect that ends up calling
cpupool_cpu_remove, which always returns -EBUSY for cpus not in pool0.
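To make the suspected path concrete, here is a minimal standalone model of it. This is not the Xen source: every model_* name, the 8-CPU layout, and the pool assignment are assumptions chosen for illustration, and the real BUG_ON is replaced by returning the first CPU that fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS  8   /* assumed CPU count, for illustration only */
#define BOOT_CPU 0

/* Hypothetical model state, not Xen data structures. */
static int  cpu_pool[NR_CPUS];    /* 0 = Pool-0, 1 = pool-1, ... */
static bool cpu_online[NR_CPUS];

/* Reset the model: every CPU online and assigned to Pool-0. */
static void model_init(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        cpu_pool[cpu] = 0;
        cpu_online[cpu] = true;
    }
}

/* Stand-in for the suspected check in cpupool_cpu_remove:
 * any CPU that is not in Pool-0 is refused with -EBUSY. */
static int model_cpupool_cpu_remove(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return cpu_pool[cpu] == 0 ? 0 : -EBUSY;
}

/* Stand-in for cpu_down: the notifier result is passed straight back. */
static int model_cpu_down(int cpu)
{
    int err = model_cpupool_cpu_remove(cpu);

    if (err)
        return err;
    cpu_online[cpu] = false;
    return 0;
}

/* Stand-in for disable_nonboot_cpus: walk all online non-boot CPUs.
 * Returns -1 if all were taken down, otherwise the first CPU that
 * failed, with its error stored in *err_out. The real code would hit
 * the BUG_ON at that point instead of returning. */
static int model_disable_nonboot_cpus(int *err_out)
{
    *err_out = 0;
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpu == BOOT_CPU || !cpu_online[cpu])
            continue;
        int err = model_cpu_down(cpu);
        if (err) {
            *err_out = err;
            return cpu;
        }
    }
    return -1;
}
```

With CPUs 4-7 assigned to pool 1 (an assumed split that mirrors CPU4 being the first to fail in the log), the loop takes CPUs 1-3 down and then stops at CPU 4 with -EBUSY. With all CPUs left in Pool-0 it completes without error.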
I am not sure which end needs to be fixed, but looping over all online cpus in
disable_nonboot_cpus sounds plausible. So maybe the check for pool-0 in
cpupool_cpu_remove is wrong...?
-Stefan
[I switched around printk and BUG_ON to actually see the offending cpu]
(XEN) mydbg: after notifier_call_chain in cpu_down
(XEN) Error taking CPU4 down: -16
(XEN) Xen BUG at cpu.c:196 [@190 normally]
(XEN) ----[ Xen-4.4.0 x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82d08010184f>] disable_nonboot_cpus+0xff/0x110
(XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
(XEN) rax: ffff82d0802f8320 rbx: 00000000fffffff0 rcx: 0000000000000000
(XEN) rdx: ffff82d0802b0000 rsi: 000000000000000a rdi: ffff82d080267638
(XEN) rbp: 0000000000000004 rsp: ffff82d0802b7e50 r8: ffff83041ff90000
(XEN) r9: 0000000000010000 r10: 0000000000000001 r11: 0000000000000002
(XEN) r12: 0000000000000005 r13: ffff82d0802e2620 r14: 0000000000000003
(XEN) r15: ffff82d0802e2620 cr0: 000000008005003b cr4: 00000000000006f0
(XEN) cr3: 00000000dfc65000 cr2: ffff88002acdb798
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff82d0802b7e50:
(XEN) ffff82d0802e2620 0000000000000000 ffff82d0802f8320 ffff82d08019eb82
(XEN) efff82d0802f8380 ffff8300dfcff000 ffff8300dfcff000 ffff83040dca40a0
(XEN) ffff8300dfcff000 ffff82d0802f8308 ffff82d0802e2620 0000000000000003
(XEN) ffff82d0802e2620 ffff82d0801054be ffff8300dfcff180 ffff82d0802f8400
(XEN) ffff82d0802f8410 ffff82d080129970 0000000000000000 ffff82d080129c69
(XEN) ffff82d0802b0000 ffff8300dfcff000 00000000ffffffff ffff82d08015bd2b
(XEN) ffff8300dfafe000 00000000fee1dead 00007fada0b3fc8c 0000000000002001
(XEN) 0000000000000005 ffff880029717d78 0000000000000000 0000000000000246
(XEN) 00000000ffff0000 0000000000000000 0000000000000005 0000000000000000
(XEN) ffffffff810010ea 0000000000002001 0000000000003401 ffff880029717ce0
(XEN) 0000010000000000 ffffffff810010ea 000000000000e033 0000000000000246
(XEN) ffff880029717cc8 000000000000e02b 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 ffff8300dfafe000
(XEN) 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN) [<ffff82d08010184f>] disable_nonboot_cpus+0xff/0x110
(XEN) [<ffff82d08019eb82>] enter_state_helper+0xc2/0x3c0
(XEN) [<ffff82d0801054be>] continue_hypercall_tasklet_handler+0xbe/0xd0
(XEN) [<ffff82d080129970>] do_tasklet_work+0x60/0xa0
(XEN) [<ffff82d080129c69>] do_tasklet+0x59/0x90
(XEN) [<ffff82d08015bd2b>] idle_loop+0x1b/0x50
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Xen BUG at cpu.c:196
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...