From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeremy Fitzhardinge
Subject: Re: dom0 pvops crash apparently due to guest migration
Date: Mon, 29 Nov 2010 10:53:48 -0800
Message-ID: <4CF3F6BC.3020906@goop.org>
References: <19699.38294.527879.943733@mariner.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <19699.38294.527879.943733@mariner.uk.xensource.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Ian Jackson
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On 11/29/2010 03:59 AM, Ian Jackson wrote:
> One of my test boxes encountered the crash whose oops you see below.
> It doesn't do it every time, or even every time on this machine (since
> the credit2 test in the same run worked). The crash seems to have
> occurred just at the end of the migration of a PV guest.

Do you have a feel for what the likelihood of failure is? Has this
started happening recently?

> The setup is 32-bit dom0 and domU on 64-bit Xen.
> The pvops kernel version was 56eabf9f2a6632d3b2ef.
>
> The complete logs are here:
> http://www.chiark.greenend.org.uk/~xensrcts/logs/2847/test-amd64-i386-xl-multivcpu/
> (The machine has since been reused so those logs are what there is.)
>
> Ian.
>
> ------------[ cut here ]------------
> kernel BUG at arch/x86/mm/fault.c:210!
> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/devices/virtual/net/lo/operstate
> Modules linked in: e1000e [last unloaded: scsi_wait_scan]
>
> Pid: 22, comm: xenwatch Not tainted (2.6.32.26 #1)
> EIP: 0061:[] EFLAGS: 00010082 CPU: 0
> EIP is at vmalloc_sync_one+0x118/0x128
> EAX: 003f8360 EBX: 1fc1b067 ECX: ffffffe0 EDX: ab273fff
> ESI: 00000000 EDI: c182adf0 EBP: dfcdbe88 ESP: dfcdbe64
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
> Process xenwatch (pid: 22, ti=dfcda000 task=dfccc510 task.ti=dfcda000)
> Stack:
> dbd7b384 00cdbe88 00000000 c568f200 dbd7b384 ab273fff f7c00000 c568f200
> <0> dbd7b384 dfcdbea8 c104ca9a c182adf0 c1780204 dbd75f40 dfd45a20 dbd75f40
> <0> dfcdbf5c dfcdbeb4 c10df14a dfcdbf1c dfcdbef8 c12313b1 0000001b 00000008
> Call Trace:
> [] ? vmalloc_sync_all+0x5c/0xbe
> [] ? alloc_vm_area+0x44/0x4b

Hm, I'm still not really sure why alloc_vm_area() does a vmalloc_sync_all
in the first place... But that BUG shouldn't happen regardless.

    J

> [] ? blkif_map+0x2d/0x204
> [] ? frontend_changed+0x194/0x209
> [] ? xenbus_otherend_changed+0x5c/0x61
> [] ? frontend_changed+0xa/0xd
> [] ? xenwatch_thread+0xf6/0x11e
> [] ? autoremove_wake_function+0x0/0x33
> [] ? xenwatch_thread+0x0/0x11e
> [] ? kthread+0x61/0x66
> [] ? kthread+0x0/0x66
> [] ? kernel_thread_helper+0x7/0x10
> Code: eb fe 89 d8 89 f2 ff 15 08 7d 68 c1 89 d6 8b 55 f0 89 c3 89 c8 0f ac d0 0c 89 c1 89 d8 0f ac f0 0c c1 e1 05 c1 e0 05 39 c1 74 06 <0f> 0b eb fe 31 ff 83 c4 18 89 f8 5b 5e 5f 5d c3 55 89 e5 56 53
> EIP: [] vmalloc_sync_one+0x118/0x128 SS:ESP 0069:dfcdbe64
> ---[ end trace 7b608ed9c5e5ed4e ]---
>