From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeremy Fitzhardinge
Subject: Re: Xen dom0 crash in get_phys_to_machine
Date: Fri, 22 Oct 2010 15:26:32 -0700
Message-ID: <4CC20F98.70107@goop.org>
References: <19636.5260.149513.257699@wylie.me.uk> <1287752727.12843.4406.camel@qabil.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
In-Reply-To: <1287752727.12843.4406.camel@qabil.uk.xensource.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Gianni Tedesco
Cc: "Alan J. Wylie" , "xen-devel@lists.xensource.com" , Jeremy, Fitzhardinge
List-Id: xen-devel@lists.xenproject.org

On 10/22/2010 06:05 AM, Gianni Tedesco wrote:
> On Tue, 2010-10-12 at 08:55 +0100, Alan J. Wylie wrote:
>> Further to my previous report:
>>
>> http://lists.xensource.com/archives/html/xen-devel/2010-10/msg00257.html
>> Message-ID: <19629.39326.337589.71778@wylie.me.uk>
>>
>> I've added some debugging and have tracked down the crash to the
>> recently modified code in arch/x86/xen/mmu.c
>>
>> Since the last version of the code that worked for me, mmu.c has been
>> modified with a lot of P2M changes. It now crashes in
>> get_phys_to_machine().
>>
>> Having tracked down the crash and the offending value of pfn, I then
>> further modified the code only to print if ( pfn == 0x18C3 ), and also
>> to print intermediate values.
>>
>> <7>ALANW get_phys_to_machine pfn 000018C3
>> <7> topidx 00000000
>> <7> mididx 0000000C
>> <7> idx 000000C3
>> (XEN) d0:v0: unhandled page fault (ec=0000)
>>
>> If there is any more debugging that I can do, I'll be only too happy to
>> oblige.
> FWIW, when I was checking for any call where pfn > max_pfn - and I got:
>
> p2m_top[0][10][104] max_pfn=0
>
> The p2m seems to have been correctly initialised:
>
> xen_build_dynamic_phys_to_machine: topidx=0 mididx=375 max_pfn=192512
>
> But then it looks like something is trampling max_pfn and possibly other
> important data structures.
>
> I can get a working pvops dom0 by reverting to commit
> e6b9b2cbca5093e8e38d3e314e2f6415ad951c60 - with the same config.
>
> git-bisect between that commit and head turned up some nonsense about a
> ata_piix change which just added a spinlock
> 876b3a81850fc237f643a065ea78ce2ad7665767 - so I assume that is a bisect
> problem and that this commit is unrelated...

Yeah. If the problem appears as a function of kernel size, then
bisection is going to give you more or less random results, unfortunately.

    J
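
For anyone following the index values quoted above, here is a rough
sketch of how a pfn splits into the three p2m indices. It is not the
code from arch/x86/xen/mmu.c. It assumes 512 entries per level (a 4 KiB
page of 8-byte entries on a 64-bit build), which is an assumption, but
one that reproduces the numbers in the debug output in this thread. The
struct and helper names (p2m_idx, p2m_split, p2m_join, p2m_selfcheck)
are made up for illustration.

```c
#include <assert.h>

/* Assumption: 512 entries at each level of the p2m tree
 * (4096-byte page / 8-byte entry). Not taken from the kernel source. */
#define ENTRIES_PER_PAGE 512UL

/* Hypothetical container for the three indices printed in the thread. */
struct p2m_idx {
    unsigned long top;  /* index into the top-level page */
    unsigned long mid;  /* index into the mid-level page */
    unsigned long idx;  /* index into the leaf page */
};

/* Split a pfn into top/mid/leaf indices. */
static struct p2m_idx p2m_split(unsigned long pfn)
{
    struct p2m_idx r;

    r.top = pfn / (ENTRIES_PER_PAGE * ENTRIES_PER_PAGE);
    r.mid = (pfn / ENTRIES_PER_PAGE) % ENTRIES_PER_PAGE;
    r.idx = pfn % ENTRIES_PER_PAGE;
    return r;
}

/* Inverse of p2m_split: rebuild the pfn from its three indices. */
static unsigned long p2m_join(struct p2m_idx i)
{
    return (i.top * ENTRIES_PER_PAGE + i.mid) * ENTRIES_PER_PAGE + i.idx;
}

/* Check the sketch against the values reported in this thread. */
static void p2m_selfcheck(void)
{
    /* Alan's trace: pfn 0x18C3 -> topidx 0, mididx 0xC, idx 0xC3 */
    struct p2m_idx a = p2m_split(0x18C3UL);
    assert(a.top == 0 && a.mid == 0xC && a.idx == 0xC3);

    /* Gianni's init line: max_pfn=192512, last pfn lands in mididx 375 */
    struct p2m_idx g = p2m_split(192512UL - 1);
    assert(g.top == 0 && g.mid == 375);

    /* Splitting and joining round-trips. */
    assert(p2m_join(a) == 0x18C3UL);
}
```

Under the same assumption, the p2m_top[0][10][104] access quoted above
corresponds to pfn 10 * 512 + 104 = 5224, which is well below the
max_pfn=192512 reported at initialisation.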