From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christophe Saout
Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1860!
Date: Tue, 04 Jan 2011 16:19:02 +0100
Message-ID: <1294154342.24719.6.camel@leto.intern.saout.de>
References: <20101227155314.GG3728@dumpdata.com> <20101228104256.GJ2754@reaktio.net> <1294153817.24719.3.camel@leto.intern.saout.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1294153817.24719.3.camel@leto.intern.saout.de>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
Cc: Teck Choon Giam , Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

Hi again,

> > > > While doing LVM snapshot for migration and get the following:
> > > >
> > > > Dec 26 15:58:29 xen01 kernel: ------------[ cut here ]------------
> > > > Dec 26 15:58:29 xen01 kernel: kernel BUG at arch/x86/xen/mmu.c:1860!
> > > > Dec 26 15:58:29 xen01 kernel: invalid opcode: 0000 [#1] SMP
> > > > Dec 26 15:58:29 xen01 kernel: last sysfs file: /sys/block/dm-26/dev
> > > > Dec 26 15:58:29 xen01 kernel: CPU 0
> > > > Dec 26 15:58:29 xen01 kernel: Modules linked in: ipt_MASQUERADE
> >
> > It would be very good to track this down and get it fixed..
> > hopefully you're able to help a bit and try some things to debug it.
> >
> > Konrad maybe has some ideas to try..
>
> I am seeing this with an lvcreate here, so I guess it's somehow related
> to device-mapper stuff in general.
>
> It doesn't look like this has been resolved yet. Somewhere I saw a
> request for the hypervisor message related to the pinning failure.
>
> Here it is:
>
> (XEN) mm.c:2364:d0 Bad type (saw 7400000000000001 != exp 1000000000000000) for mfn 41114f (pfn d514f)
> (XEN) mm.c:2733:d0 Error while pinning mfn 41114f
>
> I have a bit of experience in debugging things, so if I can help someone
> with more information...

Additional information:

This happened with a number of commands now. However, I am running a
multipath setup and every time the crash seemed to be caused in the
process context of the multipath daemon. I think the daemon listens to
events from the device-mapper subsystem to watch for changes and the
problem somehow arises from there, since on another machine with the
same XEN/Dom0 version without such a daemon I never had any troubles
with LVM.

 [] pin_pagetable_pfn+0x52/0x60
 [] xen_alloc_ptpage+0x9c/0xa0
 [] xen_alloc_pte+0xe/0x10
 [] __pte_alloc+0x7e/0xf0
 [] handle_mm_fault+0x855/0x930
 [] ? pvclock_clocksource_read+0x4e/0x100
 [] ? do_mmap_pgoff+0x33c/0x380
 [] do_page_fault+0x116/0x3e0
 [] page_fault+0x25/0x30

Cheers,
Christophe