From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andre Przywara Subject: Re: acpidump crashes on some machines Date: Thu, 23 Aug 2012 12:14:56 +0200 Message-ID: <503602A0.9040908@amd.com> References: <4FE1C423.6070001@amd.com> <20120620145127.GD12787@phenom.dumpdata.com> <4FE32DDA.10204@amd.com> <20120630014825.GA7003@phenom.dumpdata.com> <20120630021936.GA27100@phenom.dumpdata.com> <50114002.2030700@amd.com> <20120817205207.GA3002@phenom.dumpdata.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20120817205207.GA3002@phenom.dumpdata.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Konrad Rzeszutek Wilk Cc: Jeremy Fitzhardinge , xen-devel , david.vrabel@citrix.com, Jan Beulich List-Id: xen-devel@lists.xenproject.org On 08/17/2012 10:52 PM, Konrad Rzeszutek Wilk wrote: > On Thu, Jul 26, 2012 at 03:02:58PM +0200, Andre Przywara wrote: >> On 06/30/2012 04:19 AM, Konrad Rzeszutek Wilk wrote: >> >> Konrad, David, >> >> back on track for this issue. Thanks for your input, I could do some >> more debugging (see below for a refresh): >> >> It seems like it affects only the first page of the 1:1 mapping. I >> didn't have an issues with the last PFN or the page behind it (which >> failed properly). >> >> David, thanks for the hint with varying dom0_mem parameter. I >> thought I already checked this, but I did it once again and it >> turned out that it is only an issue if dom0_mem is smaller than the >> ACPI area, which generates a hole in the memory map. So we have >> (simplified) >> * 1:1 mapping to 1 MB >> * normal mapping till dom0_mem >> * unmapped area till ACPI E820 area >> * ACPI E820 1:1 mapping >> >> As far as I could chase it down the 1:1 mapping itself looks OK, I >> couldn't find any off-by-one bugs here. So maybe it is code that >> later on invalidates areas between the normal guest mapping and the >> ACPI mem? > > I think I found it. Can you try this pls [and if you can't find > early_to_phys.. just use the __set_phys_to call] Yes, that works. At least after a quick test on my test box. Both the test module and acpidump work as expected. If I replace the "<" in your patch with the original "<=", I get the warning (and due to the "continue" it also works). I also successfully tested the minimal fix (just replacing <= with <). I will feed it to the testers here to cover more machines. Do you want to keep the warnings in (which exceed 80 characters, btw)? Thanks a lot and: Tested-by: Andre Przywara Regards, Andre. > > From ab915d98f321b0fcca1932747c632b5f0f299f55 Mon Sep 17 00:00:00 2001 > From: Konrad Rzeszutek Wilk > Date: Fri, 17 Aug 2012 16:43:28 -0400 > Subject: [PATCH] xen/setup: Fix one-off error when adding for-balloon PFNs to > the P2M. > > When we are finished with return PFNs to the hypervisor, then > populate it back, and also mark the E820 MMIO and E820 gaps > as IDENTITY_FRAMEs, we then call P2M to set areas that can > be used for ballooning. We were off by one, and ended up > over-writting a P2M entry that most likely was an IDENTITY_FRAME. > For example: > > 1-1 mapping on 40000->40200 > 1-1 mapping on bc558->bc5ac > 1-1 mapping on bc5b4->bc8c5 > 1-1 mapping on bc8c6->bcb7c > 1-1 mapping on bcd00->100000 > Released 614 pages of unused memory > Set 277889 page(s) to 1-1 mapping > Populating 40200-40466 pfn range: 614 pages added > > => here we set from 40466 up to bc559 P2M tree to be > INVALID_P2M_ENTRY. We should have done it up to bc558. > > The end result is that if anybody is trying to construct > a PTE for PFN bc558 they end up with ~PAGE_PRESENT. > > CC: stable@vger.kernel.org > Reported-by: Andre Przywara > Signed-off-by: Konrad Rzeszutek Wilk > --- > arch/x86/xen/setup.c | 11 +++++++++-- > 1 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c > index ead8557..030a55a 100644 > --- a/arch/x86/xen/setup.c > +++ b/arch/x86/xen/setup.c > @@ -78,9 +78,16 @@ static void __init xen_add_extra_mem(u64 start, u64 size) > memblock_reserve(start, size); > > xen_max_p2m_pfn = PFN_DOWN(start + size); > + for (pfn = PFN_DOWN(start); pfn < xen_max_p2m_pfn; pfn++) { > + unsigned long mfn = pfn_to_mfn(pfn); > + > + if (WARN(mfn == pfn, "Trying to over-write 1-1 mapping (pfn: %lx)\n", pfn)) > + continue; > + WARN(mfn != INVALID_P2M_ENTRY, "Trying to remove %lx which has %lx mfn!\n", > + pfn, mfn); > > - for (pfn = PFN_DOWN(start); pfn <= xen_max_p2m_pfn; pfn++) > - __set_phys_to_machine(pfn, INVALID_P2M_ENTRY); > + early_set_phys_to_machine(pfn, INVALID_P2M_ENTRY); > + } > } > > static unsigned long __init xen_do_chunk(unsigned long start, > -- Andre Przywara AMD-Operating System Research Center (OSRC), Dresden, Germany