* x86 MMU: RMap Interface
From: contact @ 2020-07-19 22:32 UTC
To: kvm

Hi,

I'm a bit confused by the interface for interacting with the page rmap. For
context, on a TDP-enabled x86-64 host, I'm logging each time a GFN->PFN
mapping is created/modified/removed for a non-MMIO page (kernel version
5.4).

First, my understanding is that the page rmap is a mapping of non-MMIO PFNs
back to the GFNs that use them. The interface for creating an rmap entry
(and thus, a new GFN->PFN mapping) appears to be rmap_add() and is quite
straightforward. However, rmap_remove() does not appear to be the (only)
function for removing an entry from the page rmap. For instance,
kvm_zap_rmapp()---used by the mmu_notifier for invalidations---jumps
straight to pte_list_remove(), while drop_spte() uses rmap_remove().

Would it be fair to say that mmu_spte_clear_track_bits() is found on all
paths for removing an entry from the page rmap?

Second, for updates to the frame numbers in an existing SPTE, there are both
mmu_set_spte() and mmu_spte_set(). Could someone please clarify the
difference between these functions?

Finally, much of the logic between the page rmap and parent PTE rmaps
(understandably) overlaps. However, with TDP-enabled, I'm not entirely sure
what the role of the parent PTE rmaps is relative to the page rmap. Could
someone possibly clarify?

Thanks, and best wishes,
Kevin
* Re: x86 MMU: RMap Interface
From: Sean Christopherson @ 2020-07-20 15:49 UTC
To: contact; +Cc: kvm

On Sun, Jul 19, 2020 at 06:32:22PM -0400, contact@kevinloughlin.org wrote:
> Hi,
>
> I'm a bit confused by the interface for interacting with the page rmap. For
> context, on a TDP-enabled x86-64 host, I'm logging each time a GFN->PFN
> mapping is created/modified/removed for a non-MMIO page (kernel version
> 5.4).
>
> First, my understanding is that the page rmap is a mapping of non-MMIO PFNs
> back to the GFNs that use them. The interface for creating an rmap entry
> (and thus, a new GFN->PFN mapping) appears to be rmap_add() and is quite
> straightforward. However, rmap_remove() does not appear to be the (only)
> function for removing an entry from the page rmap. For instance,
> kvm_zap_rmapp()---used by the mmu_notifier for invalidations---jumps
> straight to pte_list_remove(), while drop_spte() uses rmap_remove().

The rmaps are associated with the memslot; the drop_spte() path allows KVM
to clean up SPTEs without having to guarantee the validity of the memslot
that was used to create the SPTE.

> Would it be fair to say that mmu_spte_clear_track_bits() is found on all
> paths for removing an entry from the page rmap?

Yes, that should hold true.

> Second, for updates to the frame numbers in an existing SPTE, there are both
> mmu_set_spte() and mmu_spte_set(). Could someone please clarify the
> difference between these functions?

mmu_set_spte() is the higher level helper that is used during a page fault
or prefetch to convert a host PFN and basic access permissions into a SPTE
value, handle large/huge page interactions and accounting, add the rmap,
etc., and of course eventually update the SPTE.

mmu_spte_set() is a low level helper that does nothing more than write a
SPTE. It's just a wrapper to __set_spte() that also WARNs if the old SPTE
is present.

> Finally, much of the logic between the page rmap and parent PTE rmaps
> (understandably) overlaps. However, with TDP-enabled, I'm not entirely sure
> what the role of the parent PTE rmaps is relative to the page rmap. Could
> someone possibly clarify?

KVM needs the backpointers to remove the SPTE for a shadow page, which
exists in the parent shadow page, when the child is zapped, e.g. if an L2 SP
is removed, its SPTE in an L3 SP needs to be updated.
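The parent-PTE backpointers described in the reply above can be pictured with a small userspace model. This is only a sketch under stated assumptions: struct toy_shadow_page, toy_link_child(), toy_zap_child() and the TOY_* constants are hypothetical names invented for illustration, and the fixed-size array stands in for whatever list structure KVM actually uses. It is not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PARENTS 4
#define TOY_PRESENT     0x1ull

/* Hypothetical stand-in for a shadow page; not the kernel's data structure. */
struct toy_shadow_page {
    uint64_t sptes[4];                      /* entries held by this table */
    uint64_t *parent_ptes[TOY_MAX_PARENTS]; /* backpointers to SPTEs that point at this page */
    size_t nr_parents;
};

/* Make *parent_spte point at child and record the backpointer in the child. */
static void toy_link_child(uint64_t *parent_spte, struct toy_shadow_page *child)
{
    assert(child->nr_parents < TOY_MAX_PARENTS);
    *parent_spte = (uint64_t)(uintptr_t)child | TOY_PRESENT;
    child->parent_ptes[child->nr_parents++] = parent_spte;
}

/*
 * Zap the child: walk its backpointers and clear every parent SPTE that
 * refers to it, so no parent is left pointing at a removed page.
 * Returns the number of parent SPTEs that were cleared.
 */
static size_t toy_zap_child(struct toy_shadow_page *child)
{
    size_t cleared = child->nr_parents;

    for (size_t i = 0; i < child->nr_parents; i++)
        *child->parent_ptes[i] = 0;
    child->nr_parents = 0;
    return cleared;
}
```

In this model, linking an "L2" page into slot 1 of an "L3" page with toy_link_child(&l3.sptes[1], &l2) and then calling toy_zap_child(&l2) leaves l3.sptes[1] cleared. Without the backpointer, the zap path would have no cheap way to find that slot.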
* Re: x86 MMU: RMap Interface
From: Kevin Loughlin @ 2020-08-15 21:08 UTC
To: kvm

Given this info, am I correct in saying that all non-MMIO guest pages are
(1) added to the rmap upon being marked present, and (2) removed from the
rmap upon being marked non-present?

I primarily ask because I'm observing behavior (running x86-64 guest with
TDP/EPT enabled) wherein multiple SPTEs appear to be added to the rmap for
the same GFN<->PFN mapping (sometimes later followed by multiple removals of
the same GFN<->PFN mapping). My understanding was that, for a given guest,
each GFN<->PFN mapping corresponds to exactly one rmap entry (and vice
versa). Is this incorrect?

I observe the behavior I mentioned whether I log upon rmap updates, or upon
mmu_spte_set() (for non-present->present) and mmu_spte_clear_track_bits()
(for present->non-present). Perhaps I'm missing a more obvious interface for
logging when the PFNs backing guest pages are marked as present/non-present?

Best wishes, and thanks again for the help,
Kevin
* Re: x86 MMU: RMap Interface
From: Sean Christopherson @ 2020-08-17 16:54 UTC
To: contact; +Cc: kvm

On Fri, Aug 14, 2020 at 11:44:49PM -0400, contact@kevinloughlin.org wrote:
> Thanks!
>
> Given this info, am I correct in saying that all non-MMIO guest pages are
> (1) added to the rmap upon being marked present, and (2) removed from the
> rmap upon being marked non-present?
>
> I primarily ask because I'm observing behavior (running x86-64 guest with
> TDP/EPT enabled) wherein multiple SPTEs appear to be added to the rmap for
> the same GFN<->PFN mapping (sometimes later followed by multiple removals of
> the same GFN<->PFN mapping). My understanding was that, for a given guest,
> each GFN<->PFN mapping corresponds to exactly one rmap entry (and vice
> versa). Is this incorrect?
>
> I observe the behavior I mentioned whether I log upon rmap updates, or upon
> mmu_spte_set() (for non-present->present) and mmu_spte_clear_track_bits()
> (for present->non-present). Perhaps I'm missing a more obvious interface for
> logging when the PFNs backing guest pages are marked as present/non-present?

The basic premise is correct, but there are exceptions (or rather, at least
one exception that immediately comes to mind).

With TDP and no nested VMs, a given instance of the MMU will have a 1:1
GFN:PFN mapping. But, if the MMU is recreated (reloaded with a different
EPTP), e.g. as part of a fast zap, then there may be mappings for the
GFN:PFN in both the old MMU/EPTP instance and the new MMU/EPTP instance, and
thus multiple rmaps.

KVM currently does a fast zap (and MMU reload) when deleting memslots, which
happens multiple times during boot, so the behavior you're observing is
expected.
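The situation described above, where the same GFN:PFN pair is briefly mapped by both the old and the new MMU instance, can be modelled in a few lines of userspace C. This is a rough sketch, not KVM's implementation: struct toy_rmap, toy_rmap_add(), toy_rmap_remove(), the fixed-size array and the SPTE encoding (PFN shifted by 12 plus a present bit) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RMAP_MAX 8

/* Hypothetical per-GFN rmap: every SPTE location that currently maps the GFN. */
struct toy_rmap {
    uint64_t *sptes[TOY_RMAP_MAX];
    size_t count;
};

/* Install a toy SPTE for pfn at sptep and record it in the rmap. */
static void toy_rmap_add(struct toy_rmap *rmap, uint64_t *sptep, uint64_t pfn)
{
    assert(rmap->count < TOY_RMAP_MAX);
    *sptep = (pfn << 12) | 1;            /* toy encoding: PFN plus present bit */
    rmap->sptes[rmap->count++] = sptep;
}

/*
 * Clear the SPTE at sptep and drop it from the rmap.
 * Returns 1 if the entry was found, 0 otherwise.
 */
static int toy_rmap_remove(struct toy_rmap *rmap, uint64_t *sptep)
{
    for (size_t i = 0; i < rmap->count; i++) {
        if (rmap->sptes[i] == sptep) {
            *sptep = 0;
            /* Order is irrelevant here, so fill the hole with the last entry. */
            rmap->sptes[i] = rmap->sptes[--rmap->count];
            return 1;
        }
    }
    return 0;
}
```

If one SPTE slot belongs to the page tables of the old root and another to the new root, calling toy_rmap_add() on both with the same PFN leaves the rmap with two entries for a single GFN:PFN pair, and tearing down the old root later produces a removal for a mapping that still exists in the new one. A logger hooked at this level would therefore see repeated adds and removes for what looks like one mapping.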