From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Stefano Stabellini <stefano.stabellini@eu.citrix.com>, JBeulich@suse.com
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: Q:pt_base in COMPAT mode offset by two pages. Was:Re: [Xen-devel] [PATCH 02/11] xen/x86: Use memblock_reserve for sensitive areas.
Date: Tue, 21 Aug 2012 15:03:17 -0400
Message-ID: <20120821190317.GA13035@phenom.dumpdata.com>
In-Reply-To: <20120821172732.GA23715@phenom.dumpdata.com>

On Tue, Aug 21, 2012 at 01:27:32PM -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Aug 20, 2012 at 10:13:05AM -0400, Konrad Rzeszutek Wilk wrote:
> > On Fri, Aug 17, 2012 at 06:35:12PM +0100, Stefano Stabellini wrote:
> > > On Thu, 16 Aug 2012, Konrad Rzeszutek Wilk wrote:
> > > > instead of a big memblock_reserve. This way we can be more
> > > > selective in freeing regions (and it also makes it easier
> > > > to understand where is what).
> > > > 
> > > > [v1: Move the auto_translate_physmap to proper line]
> > > > [v2: Per Stefano suggestion add more comments]
> > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > 
> > > much better now!
> > 
> > Though interestingly enough it breaks 32-bit dom0s (and only dom0s).
> > Will have a revised patch posted shortly.
> 
> Jan, I found something odd. Part of this code replaces this:
> 
> 	memblock_reserve(__pa(xen_start_info->mfn_list),
> 		xen_start_info->pt_base - xen_start_info->mfn_list);
> 
> with region-by-region reservations. What I found out is that if I boot this
> as a 32-bit guest with a 64-bit hypervisor, the xen_start_info->pt_base is
> actually wrong.
> 
> Specifically this is what bootup says:
> 
> (good working case - 32bit hypervisor with 32-bit dom0):
> (XEN)  Loaded kernel: c1000000->c1a23000
> (XEN)  Init. ramdisk: c1a23000->cf730e00
> (XEN)  Phys-Mach map: cf731000->cf831000
> (XEN)  Start info:    cf831000->cf83147c
> (XEN)  Page tables:   cf832000->cf8b5000
> (XEN)  Boot stack:    cf8b5000->cf8b6000
> (XEN)  TOTAL:         c0000000->cfc00000
> 
> [    0.000000] PT: cf832000 (f832000)
> [    0.000000] Reserving PT: f832000->f8b5000
> 
> And with a 64-bit hypervisor:
> 
> (XEN) VIRTUAL MEMORY ARRANGEMENT:
> (XEN)  Loaded kernel: 00000000c1000000->00000000c1a23000
> (XEN)  Init. ramdisk: 00000000c1a23000->00000000cf730e00
> (XEN)  Phys-Mach map: 00000000cf731000->00000000cf831000
> (XEN)  Start info:    00000000cf831000->00000000cf8314b4
> (XEN)  Page tables:   00000000cf832000->00000000cf8b6000
> (XEN)  Boot stack:    00000000cf8b6000->00000000cf8b7000
> (XEN)  TOTAL:         00000000c0000000->00000000cfc00000
> (XEN)  ENTRY ADDRESS: 00000000c16bb22c
> 
> [    0.000000] PT: cf834000 (f834000)
> [    0.000000] Reserving PT: f834000->f8b8000
> 
> So the pt_base is offset by two pages. And looking at c/s 13257
> it's not clear to me why this two-page offset was added?
> 
> The toolstack works fine - so launching 32-bit guests either
> under a 32-bit hypervisor or 64-bit works fine:
> ] domainbuilder: detail: xc_dom_alloc_segment:   page tables  : 0xcf805000 -> 0xcf885000  (pfn 0xf805 + 0x80 pages)
> [    0.000000] PT: cf805000 (f805000)
> 
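
The two-page figure can be checked directly from the addresses in the logs
above. A minimal userspace sketch, assuming the default 32-bit PAGE_OFFSET of
0xc0000000 (consistent with the "TOTAL: c0000000" line) and 4 KiB pages; the
sketch_* names are made up for illustration and are not kernel functions:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE   0x1000u      /* assumption: 4 KiB pages */
#define SKETCH_PAGE_OFFSET 0xc0000000u  /* assumption: default 32-bit lowmem base */

/* Lowmem virtual address -> physical address, in the spirit of __pa(). */
static uint32_t sketch_pa(uint32_t va)
{
	assert(va >= SKETCH_PAGE_OFFSET);
	return va - SKETCH_PAGE_OFFSET;
}

/* Number of whole pages between two page-aligned addresses. */
static uint32_t sketch_pages_between(uint32_t lo, uint32_t hi)
{
	assert(hi >= lo);
	return (hi - lo) / SKETCH_PAGE_SIZE;
}
```

With the 64-bit hypervisor numbers, sketch_pages_between(0xcf832000, 0xcf834000)
gives 2, and the end of the kernel's reservation (f8b8000) is likewise 2 pages
past sketch_pa(0xcf8b6000), the end of the range Xen reports.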

And this patch on top of the others fixes this..


From 806c312e50f122c47913145cf884f53dd09d9199 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Tue, 21 Aug 2012 14:31:24 -0400
Subject: [PATCH] xen/x86: Workaround 64-bit hypervisor and 32-bit initial
 domain.

If a 64-bit hypervisor is booted with a 32-bit initial domain,
the hypervisor deals with the initial domain as "compat" and
does some extra adjustments (like pagetables are 4 bytes instead
of 8). It also adjusts the xen_start_info->pt_base incorrectly.

When booted with a 32-bit hypervisor (32-bit initial domain):
..
(XEN)  Start info:    cf831000->cf83147c
(XEN)  Page tables:   cf832000->cf8b5000
..
[    0.000000] PT: cf832000 (f832000)
[    0.000000] Reserving PT: f832000->f8b5000

And with a 64-bit hypervisor:
(XEN)  Start info:    00000000cf831000->00000000cf8314b4
(XEN)  Page tables:   00000000cf832000->00000000cf8b6000

[    0.000000] PT: cf834000 (f834000)
[    0.000000] Reserving PT: f834000->f8b8000

To deal with this, we keep track of the highest physical
address we have reserved via memblock_reserve. If that address
is below pt_base, we have a gap which we reserve.
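
As a rough userspace illustration of that bookkeeping (not the kernel code
itself): sketch_reserve() is a hypothetical stand-in for
memblock_reserve(base, size) that only records regions, and the other
sketch_* helpers are likewise invented for this example.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_region {
	uint64_t base;
	uint64_t size;
};

#define SKETCH_MAX_REGIONS 8
static struct sketch_region sketch_reserved[SKETCH_MAX_REGIONS];
static unsigned int sketch_nr_reserved;

/* Hypothetical stand-in for memblock_reserve(base, size): record the region. */
static void sketch_reserve(uint64_t base, uint64_t size)
{
	assert(sketch_nr_reserved < SKETCH_MAX_REGIONS);
	sketch_reserved[sketch_nr_reserved].base = base;
	sketch_reserved[sketch_nr_reserved].size = size;
	sketch_nr_reserved++;
}

/* Reserve [base, base + size) and return the updated high-water mark. */
static uint64_t sketch_reserve_and_track(uint64_t base, uint64_t size,
					 uint64_t last_phys)
{
	uint64_t end = base + size;

	sketch_reserve(base, size);
	return end > last_phys ? end : last_phys;
}

/* Reserve the gap below pt_base, if any; returns the gap size in bytes. */
static uint64_t sketch_reserve_gap(uint64_t last_phys, uint64_t pt_base_phys)
{
	if (last_phys >= pt_base_phys)
		return 0;

	sketch_reserve(last_phys, pt_base_phys - last_phys);
	return pt_base_phys - last_phys;
}
```

Feeding in the physical addresses from the logs (start info page at f831000,
P2M list f731000->f831000) leaves the high-water mark at f832000. With pt_base
at f832000 nothing extra is reserved; with pt_base at f834000 the two-page gap
f832000->f834000 is picked up.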

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/xen/enlighten.c |   30 +++++++++++++++++++++---------
 1 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index e532eb5..511f92d 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1002,19 +1002,24 @@ static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
  * If the MFN is not in the m2p (provided to us by the hypervisor) this
  * function won't do anything. In practice this means that the XenBus
  * MFN won't be available for the initial domain. */
-static void __init xen_reserve_mfn(unsigned long mfn)
+static unsigned long __init xen_reserve_mfn(unsigned long mfn)
 {
-	unsigned long pfn;
+	unsigned long pfn, end_pfn = 0;
 
 	if (!mfn)
-		return;
+		return end_pfn;
+
 	pfn = mfn_to_pfn(mfn);
-	if (phys_to_machine_mapping_valid(pfn))
-		memblock_reserve(PFN_PHYS(pfn), PAGE_SIZE);
+	if (phys_to_machine_mapping_valid(pfn)) {
+		end_pfn = PFN_PHYS(pfn) + PAGE_SIZE;
+		memblock_reserve(PFN_PHYS(pfn), PAGE_SIZE);
+	}
+	return end_pfn;
 }
 static void __init xen_reserve_internals(void)
 {
 	unsigned long size;
+	unsigned long last_phys = 0;
 
 	if (!xen_pv_domain())
 		return;
@@ -1022,12 +1027,13 @@ static void __init xen_reserve_internals(void)
 	/* xen_start_info does not exist in the M2P, hence can't use
 	 * xen_reserve_mfn. */
 	memblock_reserve(__pa(xen_start_info), PAGE_SIZE);
+	last_phys = __pa(xen_start_info) + PAGE_SIZE;
 
-	xen_reserve_mfn(PFN_DOWN(xen_start_info->shared_info));
-	xen_reserve_mfn(xen_start_info->store_mfn);
+	last_phys = max(xen_reserve_mfn(PFN_DOWN(xen_start_info->shared_info)), last_phys);
+	last_phys = max(xen_reserve_mfn(xen_start_info->store_mfn), last_phys);
 
 	if (!xen_initial_domain())
-		xen_reserve_mfn(xen_start_info->console.domU.mfn);
+		last_phys = max(xen_reserve_mfn(xen_start_info->console.domU.mfn), last_phys);
 
 	if (xen_feature(XENFEAT_auto_translated_physmap))
 		return;
@@ -1043,8 +1049,14 @@ static void __init xen_reserve_internals(void)
 	 * a lot (and call memblock_reserve for each PAGE), so lets just use
 	 * the easy way and reserve it wholesale. */
 	memblock_reserve(__pa(xen_start_info->mfn_list), size);
-
+	last_phys = max(__pa(xen_start_info->mfn_list) + size, last_phys);
 	/* The pagetables are reserved in mmu.c */
+
+	/* Under 64-bit hypervisor with a 32-bit domain, the hypervisor
+	 * offsets the pt_base by two pages. Hence the reservation that is done
+	 * in mmu.c misses two pages. We correct it here if we detect this. */
+	if (last_phys < __pa(xen_start_info->pt_base))
+		memblock_reserve(last_phys, __pa(xen_start_info->pt_base) - last_phys);
 }
 void xen_setup_shared_info(void)
 {
-- 
1.7.7.6
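
A note on the helper's calling convention: memblock_reserve() takes a base and
a size, not a base and an end address, so a helper that also reports where its
reservation ends has to keep the two values apart. A small userspace sketch of
that shape, with sketch_memblock_reserve() as a hypothetical stand-in that only
remembers its last arguments, and 4 KiB pages assumed:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

static uint64_t sketch_last_base;
static uint64_t sketch_last_size;

/* Hypothetical stand-in using the (base, size) argument order. */
static void sketch_memblock_reserve(uint64_t base, uint64_t size)
{
	assert(size != 0);
	sketch_last_base = base;
	sketch_last_size = size;
}

/* Reserve exactly one page for pfn; return the physical end address. */
static uint64_t sketch_reserve_pfn(uint64_t pfn)
{
	uint64_t base = pfn << SKETCH_PAGE_SHIFT;

	sketch_memblock_reserve(base, SKETCH_PAGE_SIZE); /* size, not end */
	return base + SKETCH_PAGE_SIZE;                  /* end, for tracking */
}
```

For pfn 0xf831 this reserves one page at f831000 and returns f832000, which is
the value the caller would fold into last_phys.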

