From: David Vrabel <david.vrabel@citrix.com>
To: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
Cc: Attilio Rao <attilio.rao@citrix.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH v2 2/2] Xen: Document the semantic of the pagetable_reserve PVOPS
Date: Thu, 16 Aug 2012 15:33:14 +0100 [thread overview]
Message-ID: <502D04AA.8030107@citrix.com> (raw)
In-Reply-To: <alpine.DEB.2.02.1208161200200.2278@kaball.uk.xensource.com>
On 16/08/12 12:07, Stefano Stabellini wrote:
> On Wed, 15 Aug 2012, David Vrabel wrote:
>> On 15/08/12 14:55, Stefano Stabellini wrote:
>>> On Wed, 15 Aug 2012, David Vrabel wrote:
>>>> On 14/08/12 15:12, Attilio Rao wrote:
>>>>> On 14/08/12 14:57, David Vrabel wrote:
>>>>>> On 14/08/12 13:24, Attilio Rao wrote:
>>>> After looking at it some more, I think this pv-ops is unnecessary. How
>>>> about the following patch to just remove it completely?
>>>>
>>>> I've only smoke-tested 32-bit and 64-bit dom0 but I think the reasoning
>>>> is sound.
>>>
>>> Do you have more then 4G to dom0 on those boxes?
>>
>> I've tested with 6G now, both 64-bit and 32-bit with HIGHPTE.
>>
>>> It certainly fixed a serious crash at the time it was introduced, see
>>> http://marc.info/?l=linux-kernel&m=129901609503574 and
>>> http://marc.info/?l=linux-kernel&m=130133909408229. Unless something big
>>> changed in kernel_physical_mapping_init, I think we still need it.
>>> Depending on the e820 of your test box, the kernel could crash (or not),
>>> possibly in different places.
FYI, it looks like pgt_buf is now located low down, which is why these
changes worked for me. Possibly this changed as part of a memblock
refactor.
>>>>>> Having said that, I couldn't immediately see where pages in (end,
>>>>>> pgt_buf_top] was getting set RO. Can you point me to where it's
>>>>>> done?
>>>>>>
>>>>>
>>>>> As mentioned in the comment, please look at xen_set_pte_init().
>>>>
>>>> xen_set_pte_init() only ensures it doesn't set the PTE as writable if it
>>>> is already present and read-only.
>>>
>>> look at mask_rw_pte and read the threads linked above, unfortunately it
>>> is not that simple.
>>
>> Yes, I was remembering what 32-bit did here.
>>
>> The 64-bit version is a bit confused and it often ends up /not/ clearing
>> RW for the direct mapping of the pages in the pgt_buf because any
>> existing RW mappings will be used as-is. See phys_pte_init() which
>> checks for an existing mapping and only sets the PTE if it is not
>> already set.
>
> not all the pagetable pages might be already mapped, even if they are
> already hooked into the pagetable
Yes, I think this is easy to handle though.
/*
* Make sure that active page table pages are not mapped RW.
*/
if (is_early_ioremap_ptep(ptep)) {
/*
* If we're updating an early ioremap PTE, then this PFN may
* already be in the linear mapping. If it is, use the
* existing RW bit.
*/
unsigned int level;
pte_t *linear_pte;
linear_pte = lookup_address(__va(PFN_PHYS(pfn)), &level);
if (linear_pte && !pte_write(*linear_pt))
pte = pte_wrprotect(pte);
} else if (pfn >= pgt_buf_start && pfn < pgt_buf_end) {
/*
* The PFN may not be mapped but may be hooked into the page
* tables. Make sure this new mapping is read-only.
*/
pte = pte_wrprotect(pte);
}
However, the real subtlety are page tables that are mapped as they
themselves are hooked in.
As as example, let's say pgt_buf_start (s) is on a 1 GiB boundary and
pgt_buf_top (t) is below the next 2 MiB boundary. We're mapping memory
with 4 KiB pages from s to s + 4 MiB. This requires a new PUD and two
new PMDs.
To map this region:
1. Allocate a new PUD. (@ e = pgt_buf_end)
2. Allocate a new PMD. (@ e + 1)
3. Fill in this PMD's PTEs. This covers pgt_buf so sets (s, e + 1) as RO.
4. Call pmd_populate() to hook-in this PMD. This does not set e+1 as RO
(but this is ok as it already is).
5. Allocate a new PMD (@ e + 2)
6. Fill in this PMD's PTEs.
7. Call pmd_populate() to hook in this PMD. This does not set e+2 as RO
as the region isn't mapped yet.
8. Call pud_populate() to hook in this PUD. This sets e as RO but e + 2
is still RW.
9. Boom!
It may be possible to fixup the permissions as the pages are hooked in.
i.e., if this page's entries cover (pgt_buf_start, pgt_buf_end], walk
the entries and any child tables and fixup the permissions of the leaf
entries.
This would walk the tables a few times unless we were careful to only
walk them when hooking a page into an active table.
It was fun trying to understand this, but I think I'll give up now...
David
next prev parent reply other threads:[~2012-08-16 14:33 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-14 12:24 [PATCH v2 0/2] XEN/X86: Document pagetable_reserve PVOPS and enforce a better semantic Attilio Rao
2012-08-14 12:24 ` [PATCH v2 1/2] XEN, X86: Improve semantic support for pagetable_reserve PVOPS Attilio Rao
2012-08-15 17:25 ` Stefano Stabellini
2012-08-15 17:15 ` Attilio Rao
2012-08-15 17:46 ` Stefano Stabellini
2012-08-15 18:43 ` Ian Campbell
2012-08-15 18:50 ` Attilio Rao
2012-08-15 18:47 ` Attilio Rao
2012-08-16 8:12 ` Ian Campbell
2012-08-16 9:53 ` Stefano Stabellini
2012-08-15 17:33 ` Stefano Stabellini
2012-08-14 12:24 ` [PATCH v2 2/2] Xen: Document the semantic of the " Attilio Rao
2012-08-14 13:57 ` David Vrabel
2012-08-14 14:12 ` Attilio Rao
2012-08-15 11:19 ` David Vrabel
2012-08-15 13:55 ` Stefano Stabellini
2012-08-15 19:15 ` David Vrabel
2012-08-16 11:07 ` Stefano Stabellini
2012-08-16 14:33 ` David Vrabel [this message]
2012-08-15 17:46 ` Stefano Stabellini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=502D04AA.8030107@citrix.com \
--to=david.vrabel@citrix.com \
--cc=Stefano.Stabellini@eu.citrix.com \
--cc=attilio.rao@citrix.com \
--cc=konrad.wilk@oracle.com \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).