From mboxrd@z Thu Jan  1 00:00:00 1970
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: acpidump crashes on some machines
Date: Fri, 17 Aug 2012 16:52:07 -0400
Message-ID: <20120817205207.GA3002@phenom.dumpdata.com>
References: <4FE1C423.6070001@amd.com>
	<20120620145127.GD12787@phenom.dumpdata.com>
	<4FE32DDA.10204@amd.com>
	<20120630014825.GA7003@phenom.dumpdata.com>
	<20120630021936.GA27100@phenom.dumpdata.com>
	<50114002.2030700@amd.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Content-Disposition: inline
In-Reply-To: <50114002.2030700@amd.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Andre Przywara <andre.przywara@amd.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>, xen-devel <xen-devel@lists.xensource.com>, david.vrabel@citrix.com, Jan Beulich <JBeulich@suse.com>
List-Id: xen-devel@lists.xenproject.org

On Thu, Jul 26, 2012 at 03:02:58PM +0200, Andre Przywara wrote:
> On 06/30/2012 04:19 AM, Konrad Rzeszutek Wilk wrote:
> 
> Konrad, David,
> 
> back on track for this issue. Thanks for your input, I could do some
> more debugging (see below for a refresh):
> 
> It seems like it affects only the first page of the 1:1 mapping. I
> didn't have an issues with the last PFN or the page behind it (which
> failed properly).
> 
> David, thanks for the hint with varying dom0_mem parameter. I
> thought I already checked this, but I did it once again and it
> turned out that it is only an issue if dom0_mem is smaller than the
> ACPI area, which generates a hole in the memory map. So we have
> (simplified)
> * 1:1 mapping to 1 MB
> * normal mapping till dom0_mem
> * unmapped area till ACPI E820 area
> * ACPI E820 1:1 mapping
> 
> As far as I could chase it down the 1:1 mapping itself looks OK, I
> couldn't find any off-by-one bugs here. So maybe it is code that
> later on invalidates areas between the normal guest mapping and the
> ACPI mem?

I think I found it. Can you try this pls [and if you can't find
early_to_phys.. just use the __set_phys_to call]

>>From ab915d98f321b0fcca1932747c632b5f0f299f55 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Fri, 17 Aug 2012 16:43:28 -0400
Subject: [PATCH] xen/setup: Fix one-off error when adding for-balloon PFNs to
 the P2M.

When we are finished with return PFNs to the hypervisor, then
populate it back, and also mark the E820 MMIO and E820 gaps
as IDENTITY_FRAMEs, we then call P2M to set areas that can
be used for ballooning. We were off by one, and ended up
over-writting a P2M entry that most likely was an IDENTITY_FRAME.
For example:

1-1 mapping on 40000->40200
1-1 mapping on bc558->bc5ac
1-1 mapping on bc5b4->bc8c5
1-1 mapping on bc8c6->bcb7c
1-1 mapping on bcd00->100000
Released 614 pages of unused memory
Set 277889 page(s) to 1-1 mapping
Populating 40200-40466 pfn range: 614 pages added

=> here we set from 40466 up to bc559 P2M tree to be
INVALID_P2M_ENTRY. We should have done it up to bc558.

The end result is that if anybody is trying to construct
a PTE for PFN bc558 they end up with ~PAGE_PRESENT.

CC: stable@vger.kernel.org
Reported-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/xen/setup.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index ead8557..030a55a 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -78,9 +78,16 @@ static void __init xen_add_extra_mem(u64 start, u64 size)
 	memblock_reserve(start, size);
 
 	xen_max_p2m_pfn = PFN_DOWN(start + size);
+	for (pfn = PFN_DOWN(start); pfn < xen_max_p2m_pfn; pfn++) {
+		unsigned long mfn = pfn_to_mfn(pfn);
+
+		if (WARN(mfn == pfn, "Trying to over-write 1-1 mapping (pfn: %lx)\n", pfn))
+			continue;
+		WARN(mfn != INVALID_P2M_ENTRY, "Trying to remove %lx which has %lx mfn!\n",
+			pfn, mfn);
 
-	for (pfn = PFN_DOWN(start); pfn <= xen_max_p2m_pfn; pfn++)
-		__set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
+		early_set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
+	}
 }
 
 static unsigned long __init xen_do_chunk(unsigned long start,
-- 
1.7.7.6