xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: David Vrabel <david.vrabel@citrix.com>
To: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
Cc: Attilio Rao <attilio.rao@citrix.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH v2 2/2] Xen: Document the semantic of the pagetable_reserve PVOPS
Date: Wed, 15 Aug 2012 20:15:47 +0100	[thread overview]
Message-ID: <502BF563.8010902@citrix.com> (raw)
In-Reply-To: <alpine.DEB.2.02.1208151418380.2278@kaball.uk.xensource.com>

On 15/08/12 14:55, Stefano Stabellini wrote:
> On Wed, 15 Aug 2012, David Vrabel wrote:
>> On 14/08/12 15:12, Attilio Rao wrote:
>>> On 14/08/12 14:57, David Vrabel wrote:
>>>> On 14/08/12 13:24, Attilio Rao wrote:
>> After looking at it some more, I think this pv-ops is unnecessary. How
>> about the following patch to just remove it completely?
>>
>> I've only smoke-tested 32-bit and 64-bit dom0 but I think the reasoning
>> is sound.
> 
> Do you have more then 4G to dom0 on those boxes?

I've tested with 6G now, both 64-bit and 32-bit with HIGHPTE.

> It certainly fixed a serious crash at the time it was introduced, see
> http://marc.info/?l=linux-kernel&m=129901609503574 and
> http://marc.info/?l=linux-kernel&m=130133909408229. Unless something big
> changed in kernel_physical_mapping_init, I think we still need it.
> Depending on the e820 of your test box, the kernel could crash (or not),
> possibly in different places.
>
>>>> Having said that, I couldn't immediately see where pages in (end, 
>>>> pgt_buf_top] was getting set RO.  Can you point me to where it's 
>>>> done?
>>>>
>>>
>>> As mentioned in the comment, please look at xen_set_pte_init().
>>
>> xen_set_pte_init() only ensures it doesn't set the PTE as writable if it
>> is already present and read-only.
> 
> look at mask_rw_pte and read the threads linked above, unfortunately it
> is not that simple.

Yes, I was remembering what 32-bit did here.

The 64-bit version is a bit confused and it often ends up /not/ clearing
RW for the direct mapping of the pages in the pgt_buf because any
existing RW mappings will be used as-is.  See phys_pte_init() which
checks for an existing mapping and only sets the PTE if it is not
already set.

pgd_populate(), pud_populate(), and pmd_populate() take care of clearing
RW for the newly allocated page table pages, so mask_rw_pte() only needs
to consider clearing RW for mappings of the the existing page table PFNs
which all lie outside the range (pt_buf_start, pt_buf_top].

This patch makes it more sensible, and makes removal of the pv-op safe
if pgt_buf lies outside the initial mapping.

index 04c1f61..2fd5e86 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1400,14 +1400,13 @@ static pte_t __init mask_rw_pte(pte_t *ptep,
pte_t pte)
 	unsigned long pfn = pte_pfn(pte);

 	/*
-	 * If the new pfn is within the range of the newly allocated
-	 * kernel pagetable, and it isn't being mapped into an
-	 * early_ioremap fixmap slot as a freshly allocated page, make sure
-	 * it is RO.
+	 * If this is a PTE of an early_ioremap fixmap slot but
+	 * outside the range (pgt_buf_start, pgt_buf_top], then this
+	 * PTE is mapping a PFN in the current page table.  Make
+	 * sure it is RO.
 	 */
-	if (((!is_early_ioremap_ptep(ptep) &&
-			pfn >= pgt_buf_start && pfn < pgt_buf_top)) ||
-			(is_early_ioremap_ptep(ptep) && pfn != (pgt_buf_end - 1)))
+	if (is_early_ioremap_ptep(ptep)
+	    && (pfn < pgt_buf_start || pfn >= pgt_buf_top))
 		pte = pte_wrprotect(pte);

 	return pte;


>> 8<----------------------
>> x86: remove x86_init.mapping.pagetable_reserve paravirt op
>>
>> The x86_init.mapping.pagetable_reserve paravirt op is used for Xen
>> guests to set the writable flag for the mapping of (pgt_buf_end,
>> pgt_buf_top].  This is not necessary as these pages are never set as
>> read-only as they have never contained page tables.
> 
> Is this actually true? It is possible when pagetable pages are
> allocated by alloc_low_page.

These newly allocated page table pages will be set read-only when they
are linked into the page tables with pgd_populate(), pud_populate() and
friends.

>> When running as a Xen guest, the initial page tables are provided by
>> Xen (these are reserved with memblock_reserve() in
>> xen_setup_kernel_pagetable()) and constructed in brk space (for 32-bit
>> guests) or in the kernel's .data section (for 64-bit guests, see
>> head_64.S).
>>
>> Since these are all marked as reserved, (pgt_buf_start, pgt_buf_top]
>> does not overlap with them and the mappings for these PFNs will be
>> read-write.
> 
> We are talking about pagetable pages built by
> kernel_physical_mapping_init.
> 
> 
>> Since Xen doesn't need to change the mapping its implementation
>> becomes the same as a native and we can simply remove this pv-op
>> completely.
> 
> I don't think so.

David

  reply	other threads:[~2012-08-15 19:15 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-08-14 12:24 [PATCH v2 0/2] XEN/X86: Document pagetable_reserve PVOPS and enforce a better semantic Attilio Rao
2012-08-14 12:24 ` [PATCH v2 1/2] XEN, X86: Improve semantic support for pagetable_reserve PVOPS Attilio Rao
2012-08-15 17:25   ` Stefano Stabellini
2012-08-15 17:15     ` Attilio Rao
2012-08-15 17:46       ` Stefano Stabellini
2012-08-15 18:43         ` Ian Campbell
2012-08-15 18:50           ` Attilio Rao
2012-08-15 18:47         ` Attilio Rao
2012-08-16  8:12           ` Ian Campbell
2012-08-16  9:53           ` Stefano Stabellini
2012-08-15 17:33     ` Stefano Stabellini
2012-08-14 12:24 ` [PATCH v2 2/2] Xen: Document the semantic of the " Attilio Rao
2012-08-14 13:57   ` David Vrabel
2012-08-14 14:12     ` Attilio Rao
2012-08-15 11:19       ` David Vrabel
2012-08-15 13:55         ` Stefano Stabellini
2012-08-15 19:15           ` David Vrabel [this message]
2012-08-16 11:07             ` Stefano Stabellini
2012-08-16 14:33               ` David Vrabel
2012-08-15 17:46   ` Stefano Stabellini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=502BF563.8010902@citrix.com \
    --to=david.vrabel@citrix.com \
    --cc=Stefano.Stabellini@eu.citrix.com \
    --cc=attilio.rao@citrix.com \
    --cc=konrad.wilk@oracle.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).