* Writable page tables questions
From: Junji Zhi @ 2015-01-04 17:17 UTC
To: xen-devel

Hi,

I'm Junji, a newbie in Xen and hoping I can contribute to the
community one day. I have a few questions regarding the writable page
tables, while reading The Definitive Guide to the Xen Hypervisor by
David Chisnall:

1. Writable page tables is one Xen memory assist technique, applied to
paravirtualized guests ONLY. HVM does not apply. Correct?

2. According to the book, when a guest wants to modify its page table,
it triggers a trap into the hypervisor and it does a few steps:

(1) it invalidates a PTE that points to the page containing the page
table. Is my understanding correct?

Q: What does "invalidate" really mean here? Does it mean simply
flipping a bit in the PTE of the page table, or removing the PTE
completely? Does it also need to invalidate the TLB entry?

(2) then the control goes back to the guest and it can write/read the
page table now.

(3) The book's words pasted: "When an address referenced by the newly
invalidated page directory entry is referenced (read or write), a page
fault occurs."

Q: The description of step (3) is confusing. What does it mean by "an
address referenced by the newly invalidated page directory entry is
referenced"? Does it mean the case when the guest code is accessing a
virtual address that needs to search the invalidated page table for
translation?

Thanks, and I really appreciate any comments or responses.

Junji

* Re: Writable page tables questions
From: Andrew Cooper @ 2015-01-05 17:28 UTC
To: Junji Zhi, xen-devel

On 04/01/2015 17:17, Junji Zhi wrote:
> Hi,
>
> I'm Junji, a newbie in Xen and hoping I can contribute to the
> community one day. I have a few questions regarding the writable page
> tables, while reading The Definitive Guide to the Xen Hypervisor by
> David Chisnall:
>
> 1. Writable page tables is one Xen memory assist technique, applied to
> paravirtualized guests ONLY. HVM does not apply. Correct?
>
> 2. According to the book, when a guest wants to modify its page table,
> it triggers a trap into the hypervisor and it does a few steps:
>
> (1) it invalidates a PTE that points to the page containing the page
> table. Is my understanding correct?
>
> Q: What does "invalidate" really mean here? Does it mean simply
> flipping a bit in the PTE of the page table, or removing the PTE
> completely? Does it also need to invalidate the TLB entry?
>
> (2) then the control goes back to the guest and it can write/read the
> page table now.
>
> (3) The book's words pasted: "When an address referenced by the newly
> invalidated page directory entry is referenced (read or write), a page
> fault occurs."
>
> Q: The description of step (3) is confusing. What does it mean by "an
> address referenced by the newly invalidated page directory entry is
> referenced"? Does it mean the case when the guest code is accessing a
> virtual address that needs to search the invalidated page table for
> translation?

I do not have the Chisnall book to hand at the moment, so cannot comment
as to the exact text in it.

However, looking at the code as it exists today,
XENFEAT_writable_page_tables (there is a typo in the ABI) is strictly
only offered to HVM guests, and not to PV guests.

PV guests must, under all circumstances, have every pagetable reachable
from any cr3 mapped read-only. Any ability to write to an active
pagetable without an audit from Xen would be a security issue, as a
guest could give itself access to frames which belong to Xen or other
guests.

Updating an individual PTE can be done either by writing directly to it,
in which case Xen will trap, emulate and audit the attempt, or by using
an appropriate hypercall, which is more efficient as no emulation is
required. A PV guest is required to perform its own TLB management when
necessary (again, by hypercall or by trap and emulate).

Updating pagetables in general can be done either by updating each PTE
individually, or by constructing a new pagetable from scratch and
pinning it (via hypercall), which performs all the auditing at once,
then introducing it into the active set of pagetables. An example might
be:

1) Write all 512 entries into a regular page
2) Unmap the page (taking its refcount to 0, to permit a typechange)
3) Pin the page as a specific type of pagetable (each level of
   pagetable has a different type, for refcounting purposes)
4) Use a PTE write or hypercall to introduce this new pagetable into
   the active set.

The important points are that nothing can ever be changed in the active
set of pagetables without an audit by Xen, but the cost of the audit can
be amortised by constructing pagetables separately in a regular page
first.

I hope this helps to clarify the situation.

~Andrew
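
As a rough illustration of the four-step build-then-pin sequence above, here is a toy Python model. It is not Xen code: `Frame`, `ToyHypervisor`, `pin_l1`, `hook` and the `writable_maps` counter are hypothetical names made up for this sketch, and the only audit it performs (every entry must point at a frame the guest owns) is a stand-in for the much more thorough checks a real hypervisor would do.

```python
from dataclasses import dataclass, field

ENTRIES = 512  # entries in one 4 KiB x86-64 pagetable page


@dataclass
class Frame:
    owner: str                      # "xen" or a guest name (hypothetical)
    kind: str = "regular"           # "regular" or "l1_pagetable"
    writable_maps: int = 0          # writable mappings the guest still holds
    entries: list = field(default_factory=lambda: [None] * ENTRIES)


class ToyHypervisor:
    def __init__(self, frames):
        self.frames = frames        # frame number -> Frame
        self.active = set()         # pagetable frames in the active set
        self.audited = 0            # number of entries checked so far

    def _entry_ok(self, guest, target):
        self.audited += 1
        return target is None or self.frames[target].owner == guest

    def pin_l1(self, guest, mfn):
        """Audit every entry once, then change the frame's type."""
        frame = self.frames[mfn]
        if frame.owner != guest or frame.writable_maps:
            raise PermissionError("frame not owned, or still mapped writable")
        if not all(self._entry_ok(guest, t) for t in frame.entries):
            raise PermissionError("entry points at a frame the guest does not own")
        frame.kind = "l1_pagetable"

    def hook(self, mfn):
        """Introduce an already pinned pagetable into the active set."""
        if self.frames[mfn].kind != "l1_pagetable":
            raise PermissionError("only pinned pagetables may become active")
        self.active.add(mfn)


frames = {0: Frame("xen"), 1: Frame("guest"), 2: Frame("guest", writable_maps=1)}
hv = ToyHypervisor(frames)
frames[2].entries[0] = 1            # 1) fill entries while it is a regular page
frames[2].writable_maps = 0         # 2) unmap, so the type is allowed to change
hv.pin_l1("guest", 2)               # 3) a single audit covers all 512 entries
hv.hook(2)                          # 4) introduce it into the active set
print(hv.audited, sorted(hv.active))
```

The point of the model is the ordering: the guest writes freely only while the frame is an ordinary page, and the frame can join the active set only after the whole-page audit in `pin_l1` has succeeded.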

* Re: Writable page tables questions
From: Ian Campbell @ 2015-01-06 9:55 UTC
To: Andrew Cooper; +Cc: Junji Zhi, xen-devel

On Mon, 2015-01-05 at 17:28 +0000, Andrew Cooper wrote:
> On 04/01/2015 17:17, Junji Zhi wrote:
> > Hi,
> >
> > I'm Junji, a newbie in Xen and hoping I can contribute to the
> > community one day. I have a few questions regarding the writable page
> > tables, while reading The Definitive Guide to the Xen Hypervisor by
> > David Chisnall:
> >
> > 1. Writable page tables is one Xen memory assist technique, applied to
> > paravirtualized guests ONLY. HVM does not apply. Correct?
> >
> > 2. According to the book, when a guest wants to modify its page table,
> > it triggers a trap into the hypervisor and it does a few steps:
> >
> > (1) it invalidates a PTE that points to the page containing the page
> > table. Is my understanding correct?
> >
> > Q: What does "invalidate" really mean here? Does it mean simply
> > flipping a bit in the PTE of the page table, or removing the PTE
> > completely?

At least clearing the present bit; what happens to the other bits in the
PTE is up to the implementation, I think.

> > Does it also need to invalidate the TLB entry?

Yes, I think so, else the CPU might subsequently use a stale mapping.

> > (2) then the control goes back to the guest and it can write/read the
> > page table now.
> >
> > (3) The book's words pasted: "When an address referenced by the newly
> > invalidated page directory entry is referenced (read or write), a page
> > fault occurs."
> >
> > Q: The description of step (3) is confusing. What does it mean by "an
> > address referenced by the newly invalidated page directory entry is
> > referenced"? Does it mean the case when the guest code is accessing a
> > virtual address that needs to search the invalidated page table for
> > translation?

Yes, it means when something tries to access memory which would have
been mapped by the PT page which was removed in (1).

> I do not have the Chisnall book to hand at the moment, so cannot comment
> as to the exact text in it.
>
> However, looking at the code as it exists today,
> XENFEAT_writable_page_tables (there is a typo in the ABI) is strictly
> only offered to HVM guests, and not to PV guests.

XENFEAT_writable_page_tables is different from "out of sync" PT updates,
which is what Junji (and the book) seem to be referring to.

I don't know if modern Xen still does this for PV (I think it still does
for shadow mode HVM under at least some circumstances), but at one point
in time (presumably when the book was written) Xen would handle an
emulated write to a r/o page table page by:

      * unhooking it from the higher level PTs which referenced it and
        flushing TLBs
      * mapping the PT page itself r/w (contrary to Xen's usual
        invariant that it be mapped r/o)

At which point any subsequent writes to the now out-of-sync PT page can
just happen without trapping. This is safe because after the unhook the
PT is not part of any cr3 and the invariant is not violated (the guest
doesn't really know this is happening; for all it knows all writes are
still being emulated).

At some point something would try to access the memory which would be
mapped by the out of sync PT page, and Xen would, in the page fault
handler:

      * make all the mappings r/o again (+ tlb flush)
      * validate all the entries in the page
      * rehook it into the higher level PTs which should reference it

At which point the mappings are available again and Xen's invariants are
preserved.

The tlb flushes involved in the above are reasonably expensive, IIRC Xen
flip flopped a bit (years ago now) on whether it is worthwhile doing
this or not, which is why I'm not sure if it still does or not.

This is all different from the XENFEAT_writable_page_tables that you
talk about, where the guest is informed that it is not obliged to make
the regular mappings r/o in the first place, i.e. it may ignore Xen's
invariant completely.

Ian.
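
The unhook / write freely / revalidate-on-fault cycle described above can be sketched as a small state machine. This is a toy Python model under stated assumptions, not the old Xen implementation: `OutOfSyncPT` and its fields are hypothetical names, validation is reduced to "the entry must point at a frame the guest owns", and raising an exception on a failed validation is a modelling choice (the thread does not say what Xen did in that case).

```python
class OutOfSyncPT:
    """One pagetable page under the batched (out-of-sync) scheme."""

    def __init__(self, guest_frames):
        self.guest_frames = set(guest_frames)  # frames the guest may map
        self.entries = {}        # slot -> frame number
        self.hooked = True       # referenced from the higher level PTs
        self.writable = False    # guest's mapping of the PT page is r/w
        self.traps = 0           # entries into the hypervisor
        self.tlb_flushes = 0

    def guest_write(self, slot, frame):
        if not self.writable:
            # First write hits the r/o mapping: unhook, flush, map r/w.
            self.traps += 1
            self.hooked = False
            self.tlb_flushes += 1
            self.writable = True
        # Later writes land directly, with no trap and no audit yet.
        self.entries[slot] = frame

    def guest_access(self, slot):
        if not self.hooked:
            # Page fault: map r/o again, flush, validate everything, rehook.
            self.traps += 1
            self.writable = False
            self.tlb_flushes += 1
            bad = [s for s, f in self.entries.items()
                   if f not in self.guest_frames]
            if bad:
                raise PermissionError(f"invalid entries at slots {bad}")
            self.hooked = True
        return self.entries.get(slot)


pt = OutOfSyncPT(guest_frames={10, 11, 12})
for slot, frame in enumerate([10, 11, 12]):
    pt.guest_write(slot, frame)
print(pt.traps, pt.hooked)          # traps so far, and whether still hooked
print(pt.guest_access(1), pt.traps, pt.tlb_flushes)
```

The model makes the trade-off visible: a batch of writes costs two traps instead of one per write, but each batch also costs two TLB flushes, which is the expense mentioned above.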

* Re: Writable page tables questions
From: Tim Deegan @ 2015-01-08 11:19 UTC
To: Ian Campbell; +Cc: Andrew Cooper, xen-devel, Junji Zhi

At 09:55 +0000 on 06 Jan (1420534536), Ian Campbell wrote:
> The tlb flushes involved in the above are reasonably expensive, IIRC Xen
> flip flopped a bit (years ago now) on whether it is worthwhile doing
> this or not, which is why I'm not sure if it still does or not.

The current "writable pagetables" code for PV guests emulates the
write and validates the resulting PTE. If it passes validation, it
updates it, without ever making the page actually writable to the
guest itself.

The code is in xen/arch/x86/mm.c, as ptwr_*

Cheers,

Tim.
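
The emulate-and-validate behaviour described above can be sketched in a few lines of Python. This is only a toy, not the `ptwr_*` code: `make_pte` and `emulate_pte_write` are hypothetical helpers, the PTE layout is simplified to a frame number above bit 12 plus the x86 present and read/write flag bits, and the single check shown (a present entry must point at a frame the guest owns) stands in for validation that in reality checks far more.

```python
PAGE_SHIFT = 12      # frame number sits above the low 12 flag bits
PTE_PRESENT = 0x1    # x86 "present" bit
PTE_RW = 0x2         # x86 "read/write" bit


def make_pte(mfn, flags):
    """Build a simplified PTE from a machine frame number and flag bits."""
    return (mfn << PAGE_SHIFT) | flags


def emulate_pte_write(table, index, new_pte, guest_mfns):
    """Apply a trapped write only if the candidate PTE passes validation.

    The guest never gets a writable mapping of `table`; the update is
    performed on its behalf, one entry per trap.
    """
    if (new_pte & PTE_PRESENT) and ((new_pte >> PAGE_SHIFT) not in guest_mfns):
        return False                 # rejected, table left untouched
    table[index] = new_pte
    return True


table = [0] * 512
guest_mfns = {7, 8}
ok = emulate_pte_write(table, 3, make_pte(7, PTE_PRESENT | PTE_RW), guest_mfns)
denied = emulate_pte_write(table, 4, make_pte(99, PTE_PRESENT), guest_mfns)
print(ok, denied, hex(table[3]), hex(table[4]))
```

Compared with the batched scheme, every write pays for a trap, but the pagetable stays hooked and read-only throughout, so there is no unhook, no revalidation pass and no extra TLB flush.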

* Re: Writable page tables questions
From: Ian Campbell @ 2015-01-08 11:30 UTC
To: Tim Deegan; +Cc: Andrew Cooper, xen-devel, Junji Zhi

On Thu, 2015-01-08 at 12:19 +0100, Tim Deegan wrote:
> At 09:55 +0000 on 06 Jan (1420534536), Ian Campbell wrote:
> > The tlb flushes involved in the above are reasonably expensive, IIRC Xen
> > flip flopped a bit (years ago now) on whether it is worthwhile doing
> > this or not, which is why I'm not sure if it still does or not.
>
> The current "writable pagetables" code for PV guests emulates the
> write and validates the resulting PTE. If it passes validation, it
> updates it, without ever making the page actually writable to the
> guest itself.

Indeed, it seems like the mode I was on about was removed 9 years ago:

        commit 228f081e08474febb96ee694f6d1b3d6d7465052
        Author: kfraser@localhost.localdomain <kfraser@localhost.localdomain>
        Date:   Fri Aug 11 16:07:22 2006 +0100

            [XEN] Remove batched writable pagetable logic.

            Benchmarks show it provides little or no benefit (except
            on synthetic benchmarks). Also it is complicated and
            likely to hinder efforts to reduce lockign granularity.

            Signed-off-by: Keir Fraser <keir@xensource.com>

        $ git describe --contains 228f081e08474febb96ee694f6d1b3d6d7465052
        3.0.3-branched~459

So in 3.0.3 apparently.

Ian.