From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeremy Fitzhardinge
Subject: Re: Xen dom0 crash: "d0:v0: unhandled page fault (ec=0000)"
Date: Mon, 01 Nov 2010 14:16:33 -0400
Message-ID: <4CCF0401.5040704@goop.org>
References: <19629.39326.337589.71778@wylie.me.uk> <1287498599.12843.2111.camel@qabil.uk.xensource.com> <4CBDB229.3030501@infinitumb.de> <1287503143.12843.2191.camel@qabil.uk.xensource.com> <4CBE2A43.70200@hfp.de> <1287564863.12843.4194.camel@qabil.uk.xensource.com> <1288367063.23619.51.camel@qabil.uk.xensource.com> <20101029161553.GA27408@dumpdata.com> <4CCEC2A8.6040103@goop.org> <20101101173940.GA6068@dumpdata.com> <20101101174602.GA6227@dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
In-Reply-To: <20101101174602.GA6227@dumpdata.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Konrad Rzeszutek Wilk
Cc: "Alan J. Wylie", "xen-devel@lists.xensource.com", Gianni Tedesco, Stefan Kuhne, sven, Andreas Kinzler
List-Id: xen-devel@lists.xenproject.org

On 11/01/2010 01:46 PM, Konrad Rzeszutek Wilk wrote:
> On Mon, Nov 01, 2010 at 01:39:40PM -0400, Konrad Rzeszutek Wilk wrote:
>>>>> http://pastebin.com/3m0DpDdW - 2.6.32.24-gd0054d6-dirty - broken
>> .. snip..
>>> The way this is supposed to work is:
>>>
>>> 1. Xen gives the domain N pages
>>> 2. There's an E820 which describes M pages (M > N)
>>> 3. The kernel traverses the existing E820 and finds holes and adds
>>>    the memory to a new E820_RAM region beyond M
>>> 4. Set up P2M for pages up to N
>>> 5. When the kernel maps all "RAM", the region from N-M is not
>>>    present, and has no valid P2M mapping; in that case, xen_make_pte
>>>    will return a non-present pte.
>> Right, and somehow his machine/kernel is not doing this. His 'N' ends up being 'M' so
>> the region N-M is added to the "RAM", and xen_make_pte I _think_ returns a non-present pte
>> (or maybe it does present a present pte?) In the previous kernel (2.6.32.18), it
>> does exactly what you described.
> Not that I am actually sure what is causing this. The interesting part is that
> he sees this twice:
>
> [ 0.000000] last_pfn = 0x2d0699 max_arch_pfn = 0x400000000
> [ 0.000000] last_pfn = 0x2f000 max_arch_pfn = 0x400000000
>
> And he mentioned on IRC to me that this was not due to any debugging patches.

That's just printed by e820_end_pfn(), which is called a few times.

Does it happen native?

    J
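
For anyone following the thread, the five steps quoted above can be sketched as a toy model. This is not kernel code and makes no claim about what the broken kernel actually does. Every name in it (INVALID_MFN, build_p2m, make_pte, map_all_ram, the mfn_base value) is hypothetical and chosen for illustration. It only shows the intended rule: pfns below N get a valid P2M entry, pfns from N up to M do not, and mapping a pfn without a valid entry yields a non-present pte.

```python
# Toy model of the quoted steps; hypothetical names, not the kernel's code.

INVALID_MFN = None  # stand-in for "no valid P2M mapping"


def build_p2m(n_pages, m_pages, mfn_base=0x1000):
    """Return a pfn -> mfn table covering M pfns, valid only below N."""
    if n_pages > m_pages:
        raise ValueError("expected N <= M")
    # Step 4: set up P2M for pages up to N.
    p2m = [mfn_base + pfn for pfn in range(n_pages)]
    # Pfns in the N..M region have no backing machine frame.
    p2m += [INVALID_MFN] * (m_pages - n_pages)
    return p2m


def make_pte(p2m, pfn):
    """Model the behaviour described for xen_make_pte in step 5."""
    mfn = p2m[pfn]
    if mfn is INVALID_MFN:
        return {"present": False, "mfn": None}
    return {"present": True, "mfn": mfn}


def map_all_ram(p2m):
    """Map every pfn the E820 calls RAM; return (present, non_present)."""
    ptes = [make_pte(p2m, pfn) for pfn in range(len(p2m))]
    present = sum(1 for pte in ptes if pte["present"])
    return present, len(ptes) - present
```

With N=4 and M=6, map_all_ram(build_p2m(4, 6)) gives four present and two non-present ptes. With N equal to M, as Konrad suspects is happening, the model has no non-present region at all, so nothing in it would stop the N-M range from being treated as ordinary RAM.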