From: Wei Liu <Wei.Liu2@citrix.com>
To: xen-devel@lists.xen.org
Cc: wei.liu2@citrix.com, konrad.wilk@oracle.com
Subject: HVM bug: system crashes after offline online a vcpu
Date: Thu, 13 Dec 2012 15:12:17 +0000
Message-ID: <1355411537.8376.52.camel@iceland>
Hi Konrad,
I hit a bug when taking a vcpu offline and then bringing it back online
in an HVM guest. As I'm not very familiar with the HVM code, I cannot
come up with a quick fix.
The HVM DomU is configured with 4 vcpus. After booting to a command
prompt, I run the following commands:
# echo 0 > /sys/devices/system/cpu/cpu3/online
# echo 1 > /sys/devices/system/cpu/cpu3/online
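For repeated testing, the same two writes can be wrapped in a small function. This is only a sketch: the name cycle_cpu and the CPU_SYSFS_ROOT override are hypothetical and not part of the original report (the override exists only so the function can be dry-run against a scratch directory), and the real path needs root.

```shell
#!/bin/sh
# Hypothetical helper: take one CPU offline and bring it back online
# through the sysfs "online" control file used in the report.
# CPU_SYSFS_ROOT is an assumed override for dry runs; by default the
# real sysfs path is used.
cycle_cpu() {
    cpu="$1"
    root="${CPU_SYSFS_ROOT:-/sys/devices/system/cpu}"
    ctl="$root/cpu$cpu/online"

    # Bail out early if the control file is missing or not writable.
    if [ ! -w "$ctl" ]; then
        echo "cannot write $ctl (need root, or this CPU has no online control)" >&2
        return 1
    fi

    # Offline the CPU and read the state back.
    echo 0 > "$ctl" || return 1
    echo "cpu$cpu offline, state: $(cat "$ctl")"

    # Online it again; on the affected guests this is the step that crashes.
    echo 1 > "$ctl" || return 1
    echo "cpu$cpu online, state: $(cat "$ctl")"
}
```

Calling `cycle_cpu 3` as root in the guest corresponds to the two commands above.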
With Debian's default 2.6.32-5-amd64 kernel, the last log line is:
Booting processor 3 APIC 0x6 ip 0x6000
With my own kernel, which is version 3.5, I get more output:
[ 44.047358] Booting Node 0 Processor 3 APIC 0x6
[ 44.061201] ------------[ cut here ]------------
[ 44.065186] kernel BUG at kernel/hrtimer.c:1259!
[ 44.065186] invalid opcode: 0000 [#1] SMP
[ 44.065186] CPU 3
[ 44.065186] Modules linked in:
[ 44.065186]
[ 44.065186] Pid: 0, comm: swapper/3 Not tainted 3.5.0-xen-evtchn+ #50 Xen HVM domU
[ 44.065186] RIP: 0010:[<ffffffff8105682e>] [<ffffffff8105682e>] hrtimer_interrupt+0x24/0x1a5
[ 44.065186] RSP: 0000:ffff88000f463de8 EFLAGS: 00010046
[ 44.065186] RAX: ffffffff8105680a RBX: ffff88000f46e640 RCX: 00000000fffffffa
[ 44.065186] RDX: 00000000fffffffa RSI: 0000000000000000 RDI: ffff88000f46bd80
[ 44.065186] RBP: 0000000000000057 R08: ffff88000e000b40 R09: 0000000000000019
[ 44.065186] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88000e6e8e00
[ 44.065186] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
[ 44.065186] FS: 0000000000000000(0000) GS:ffff88000f460000(0000) knlGS:0000000000000000
[ 44.065186] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 44.065186] CR2: 0000000000000000 CR3: 000000000181b000 CR4: 00000000000007e0
[ 44.065186] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 44.065186] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 44.065186] Process swapper/3 (pid: 0, threadinfo ffff88000e62e000, task ffff88000e62aea0)
[ 44.065186] Stack:
[ 44.065186] 0000000000000001 ffff88000f46e680 ffffffff81013711 00000008cfba9b27
[ 44.065186] 00000000fffffffa ffff88000e6e97c0 0000000000000057 ffff88000e6e8e00
[ 44.065186] 0000000000000000 0000000000000001 0000000000000000 ffffffff81006954
[ 44.065186] Call Trace:
[ 44.065186] <IRQ>
[ 44.065186] [<ffffffff81013711>] ? paravirt_sched_clock+0x5/0x8
[ 44.065186] [<ffffffff81006954>] ? xen_timer_interrupt+0x26/0x162
[ 44.065186] [<ffffffff8109a220>] ? check_for_new_grace_period.isra.32+0x90/0x9a
[ 44.065186] [<ffffffff810956df>] ? handle_irq_event_percpu+0x32/0x1b0
[ 44.065186] [<ffffffff8128f88b>] ? irq_get_handler_data+0x7/0x16
[ 44.065186] [<ffffffff81097e39>] ? handle_percpu_irq+0x3a/0x4f
[ 44.065186] [<ffffffff8128f9ec>] ? __xen_evtchn_do_upcall_l2+0x131/0x1c0
[ 44.065186] [<ffffffff812913d3>] ? xen_evtchn_do_upcall+0x27/0x37
[ 44.065186] [<ffffffff8140081a>] ? xen_hvm_callback_vector+0x6a/0x70
[ 44.065186] <EOI>
[ 44.065186] [<ffffffff81094b8f>] ? cpumask_next+0x17/0x19
[ 44.065186] [<ffffffff813eb75b>] ? start_secondary+0x184/0x1e2
[ 44.065186] [<ffffffff813eb757>] ? start_secondary+0x180/0x1e2
[ 44.065186] [<ffffffff813eb5d7>] ? set_cpu_sibling_map+0x40e/0x40e
[ 44.065186] Code: 41 5d 41 5e 41 5f c3 41 57 41 56 41 55 41 54 55 53 48 c7 c3 40 e6 00 00 48 83 ec 28 65 48 03 1c 25 e8 db 00 00 83 7b 18 00 75 02 <0f> 0b 48 ff 43 20 48 bd ff ff ff ff ff ff ff 7f 41 be 03 00 00
[ 44.065186] RIP [<ffffffff8105682e>] hrtimer_interrupt+0x24/0x1a5
[ 44.065186] RSP <ffff88000f463de8>
[ 44.065186] ---[ end trace 9366352b116a03db ]---
[ 44.065186] Kernel panic - not syncing: Fatal exception in interrupt
And if I offline and then online cpu 2 on 2.6.32-5-amd64:
[ 27.933928] Booting processor 2 APIC 0x4 ip 0x6000
[ 25.708098] Initializing CPU#2
[ 25.708098] CPU: L1 I cache: 32K, L1 D cache: 32K
[ 25.708098] CPU: L2 cache: 6144K
[ 25.708098] CPU 2/0x4 -> Node 0
[ 25.708098] CPU: Physical Processor ID: 0
[ 25.708098] CPU: Processor Core ID: 4
[ 28.028234] CPU2: Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz stepping 07
[ 28.069320] checking TSC synchronization [CPU#0 -> CPU#2]: passed.
[ 25.708098] installing Xen timer for CPU 2
[ 28.098101] CPU0 attaching NULL sched-domain.
[ 28.098106] CPU1 attaching NULL sched-domain.
[ 28.098110] CPU3 attaching NULL sched-domain.
[ 28.098092] ------------[ cut here ]------------
[ 28.098092] WARNING: at /build/buildd-linux-2.6_2.6.32-30-amd64-d4MbNM/linux-2.6-2.6.32/debian/build/source_amd64_none/kernel/irq/chip.c:88 unbind_from_irq+0x147/0x159()
[ 28.098092] Hardware name: HVM domU
[ 28.144127] CPU0 attaching sched-domain:
[ 28.144131] domain 0: span 0-3 level CPU
[ 28.144133] groups: 0 1 2 3
[ 28.144139] CPU1 attaching sched-domain:
[ 28.144142] domain 0: span 0-3 level CPU
[ 28.144145] groups: 1 2 3 0
[ 28.144150] CPU2 attaching sched-domain:
[ 28.144152] domain 0: span 0-3 level CPU
[ 28.144155] groups: 2 3 0 1
[ 28.144160] CPU3 attaching sched-domain:
[ 28.144162] domain 0: span 0-3 level CPU
[ 28.144165] groups: 3 0 1 2
[ 28.209159] Destroying IRQ18 without calling free_irq
[ 28.215985] Modules linked in: loop parport_pc parport psmouse evdev serio_raw snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr i2c_piix4 i2c_core button processor ext3 jbd mbcache ata_generic ata_piix libata floppy thermal thermal_sys xen_blkfront scsi_mod [last unloaded: scsi_wait_scan]
[ 28.224050] Pid: 0, comm: swapper Not tainted 2.6.32-5-amd64 #1
[ 28.224050] Call Trace:
[ 28.224050] [<ffffffff811ef131>] ? unbind_from_irq+0x147/0x159
[ 28.224050] [<ffffffff811ef131>] ? unbind_from_irq+0x147/0x159
[ 28.224050] [<ffffffff8104dd7c>] ? warn_slowpath_common+0x77/0xa3
[ 28.224050] [<ffffffff8104de04>] ? warn_slowpath_fmt+0x51/0x59
[ 28.224050] [<ffffffff810e4493>] ? get_partial_node+0x15/0x85
[ 28.224050] [<ffffffff811966fd>] ? kvasprintf+0x41/0x68
[ 28.224050] [<ffffffff8109639e>] ? dynamic_irq_cleanup_x+0x4b/0xc2
[ 28.224050] [<ffffffff811ef131>] ? unbind_from_irq+0x147/0x159
[ 28.224050] [<ffffffff811ef5b7>] ? bind_virq_to_irqhandler+0x14c/0x15d
[ 28.224050] [<ffffffff8100df77>] ? xen_timer_interrupt+0x0/0x18d
[ 28.224050] [<ffffffff812f5121>] ? set_cpu_sibling_map+0x2f4/0x311
[ 28.224050] [<ffffffff8100df0d>] ? xen_setup_timer+0x55/0xa2
[ 28.224050] [<ffffffff8100df71>] ? xen_hvm_setup_cpu_clockevents+0x17/0x1d
[ 28.224050] [<ffffffff812f52fc>] ? start_secondary+0x17c/0x185
[ 28.224050] ---[ end trace db1493923b5e103d ]---
The logs for cpu 2 on my 3.5 kernel are identical to those for cpu 3.
Wei.