From: David Vrabel <david.vrabel@citrix.com>
To: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
Cc: Attilio Rao <attilio.rao@citrix.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH v2 2/2] Xen: Document the semantic of the pagetable_reserve PVOPS
Date: Wed, 15 Aug 2012 20:15:47 +0100 [thread overview]
Message-ID: <502BF563.8010902@citrix.com> (raw)
In-Reply-To: <alpine.DEB.2.02.1208151418380.2278@kaball.uk.xensource.com>
On 15/08/12 14:55, Stefano Stabellini wrote:
> On Wed, 15 Aug 2012, David Vrabel wrote:
>> On 14/08/12 15:12, Attilio Rao wrote:
>>> On 14/08/12 14:57, David Vrabel wrote:
>>>> On 14/08/12 13:24, Attilio Rao wrote:
>> After looking at it some more, I think this pv-ops is unnecessary. How
>> about the following patch to just remove it completely?
>>
>> I've only smoke-tested 32-bit and 64-bit dom0 but I think the reasoning
>> is sound.
>
> Do you have more then 4G to dom0 on those boxes?
I've tested with 6G now, both 64-bit and 32-bit with HIGHPTE.
> It certainly fixed a serious crash at the time it was introduced, see
> http://marc.info/?l=linux-kernel&m=129901609503574 and
> http://marc.info/?l=linux-kernel&m=130133909408229. Unless something big
> changed in kernel_physical_mapping_init, I think we still need it.
> Depending on the e820 of your test box, the kernel could crash (or not),
> possibly in different places.
>
>>>> Having said that, I couldn't immediately see where pages in (end,
>>>> pgt_buf_top] was getting set RO. Can you point me to where it's
>>>> done?
>>>>
>>>
>>> As mentioned in the comment, please look at xen_set_pte_init().
>>
>> xen_set_pte_init() only ensures it doesn't set the PTE as writable if it
>> is already present and read-only.
>
> look at mask_rw_pte and read the threads linked above, unfortunately it
> is not that simple.
Yes, I was remembering what 32-bit did here.
The 64-bit version is a bit confused and it often ends up /not/ clearing
RW for the direct mapping of the pages in the pgt_buf because any
existing RW mappings will be used as-is. See phys_pte_init() which
checks for an existing mapping and only sets the PTE if it is not
already set.
pgd_populate(), pud_populate(), and pmd_populate() take care of clearing
RW for the newly allocated page table pages, so mask_rw_pte() only needs
to consider clearing RW for mappings of the the existing page table PFNs
which all lie outside the range (pt_buf_start, pt_buf_top].
This patch makes it more sensible, and makes removal of the pv-op safe
if pgt_buf lies outside the initial mapping.
index 04c1f61..2fd5e86 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1400,14 +1400,13 @@ static pte_t __init mask_rw_pte(pte_t *ptep,
pte_t pte)
unsigned long pfn = pte_pfn(pte);
/*
- * If the new pfn is within the range of the newly allocated
- * kernel pagetable, and it isn't being mapped into an
- * early_ioremap fixmap slot as a freshly allocated page, make sure
- * it is RO.
+ * If this is a PTE of an early_ioremap fixmap slot but
+ * outside the range (pgt_buf_start, pgt_buf_top], then this
+ * PTE is mapping a PFN in the current page table. Make
+ * sure it is RO.
*/
- if (((!is_early_ioremap_ptep(ptep) &&
- pfn >= pgt_buf_start && pfn < pgt_buf_top)) ||
- (is_early_ioremap_ptep(ptep) && pfn != (pgt_buf_end - 1)))
+ if (is_early_ioremap_ptep(ptep)
+ && (pfn < pgt_buf_start || pfn >= pgt_buf_top))
pte = pte_wrprotect(pte);
return pte;
>> 8<----------------------
>> x86: remove x86_init.mapping.pagetable_reserve paravirt op
>>
>> The x86_init.mapping.pagetable_reserve paravirt op is used for Xen
>> guests to set the writable flag for the mapping of (pgt_buf_end,
>> pgt_buf_top]. This is not necessary as these pages are never set as
>> read-only as they have never contained page tables.
>
> Is this actually true? It is possible when pagetable pages are
> allocated by alloc_low_page.
These newly allocated page table pages will be set read-only when they
are linked into the page tables with pgd_populate(), pud_populate() and
friends.
>> When running as a Xen guest, the initial page tables are provided by
>> Xen (these are reserved with memblock_reserve() in
>> xen_setup_kernel_pagetable()) and constructed in brk space (for 32-bit
>> guests) or in the kernel's .data section (for 64-bit guests, see
>> head_64.S).
>>
>> Since these are all marked as reserved, (pgt_buf_start, pgt_buf_top]
>> does not overlap with them and the mappings for these PFNs will be
>> read-write.
>
> We are talking about pagetable pages built by
> kernel_physical_mapping_init.
>
>
>> Since Xen doesn't need to change the mapping its implementation
>> becomes the same as a native and we can simply remove this pv-op
>> completely.
>
> I don't think so.
David
next prev parent reply other threads:[~2012-08-15 19:15 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-14 12:24 [PATCH v2 0/2] XEN/X86: Document pagetable_reserve PVOPS and enforce a better semantic Attilio Rao
2012-08-14 12:24 ` [PATCH v2 1/2] XEN, X86: Improve semantic support for pagetable_reserve PVOPS Attilio Rao
2012-08-15 17:25 ` Stefano Stabellini
2012-08-15 17:15 ` Attilio Rao
2012-08-15 17:46 ` Stefano Stabellini
2012-08-15 18:43 ` Ian Campbell
2012-08-15 18:50 ` Attilio Rao
2012-08-15 18:47 ` Attilio Rao
2012-08-16 8:12 ` Ian Campbell
2012-08-16 9:53 ` Stefano Stabellini
2012-08-15 17:33 ` Stefano Stabellini
2012-08-14 12:24 ` [PATCH v2 2/2] Xen: Document the semantic of the " Attilio Rao
2012-08-14 13:57 ` David Vrabel
2012-08-14 14:12 ` Attilio Rao
2012-08-15 11:19 ` David Vrabel
2012-08-15 13:55 ` Stefano Stabellini
2012-08-15 19:15 ` David Vrabel [this message]
2012-08-16 11:07 ` Stefano Stabellini
2012-08-16 14:33 ` David Vrabel
2012-08-15 17:46 ` Stefano Stabellini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=502BF563.8010902@citrix.com \
--to=david.vrabel@citrix.com \
--cc=Stefano.Stabellini@eu.citrix.com \
--cc=attilio.rao@citrix.com \
--cc=konrad.wilk@oracle.com \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).