* Calling to kvm_mmu_load
From: Arthur Chunqi Li @ 2013-10-21  7:56 UTC
To: kvm; +Cc: Jan Kiszka

Hi there,

I noticed that kvm_mmu_reload() is called on every vcpu entry, and that
it calls kvm_mmu_load() when root_hpa is INVALID_PAGE. I am confused:
why and when can root_hpa be set to INVALID_PAGE? I found one case:
when a vcpu receives the KVM_REQ_MMU_RELOAD request, kvm_mmu_unload()
is called to invalidate root_hpa. But that case cannot cover all
occasions.

Thanks,
Arthur

--
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China
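The check under discussion looks roughly like this, paraphrased from
the arch/x86/kvm code of that era (a sketch of in-kernel code, not
compilable standalone; details approximate):

/* vcpu_enter_guest() calls kvm_mmu_reload() on every entry; it is a
 * no-op unless something has invalidated root_hpa. */
static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
{
        if (likely(vcpu->arch.mmu.root_hpa != INVALID_PAGE))
                return 0;               /* roots still valid: fast path */

        return kvm_mmu_load(vcpu);      /* rebuild the root page table */
}

/* KVM_REQ_MMU_RELOAD is serviced by kvm_mmu_unload(), which frees the
 * roots and leaves root_hpa == INVALID_PAGE for the check above. */
void kvm_mmu_unload(struct kvm_vcpu *vcpu)
{
        mmu_free_roots(vcpu);           /* sets root_hpa = INVALID_PAGE */
        WARN_ON(VALID_PAGE(vcpu->arch.mmu.root_hpa));
}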
* Re: Calling to kvm_mmu_load
From: Paolo Bonzini @ 2013-10-23  6:21 UTC
To: Arthur Chunqi Li; +Cc: kvm, Jan Kiszka

On 21/10/2013 08:56, Arthur Chunqi Li wrote:
> I noticed that kvm_mmu_reload() is called on every vcpu entry, and
> that it calls kvm_mmu_load() when root_hpa is INVALID_PAGE. I am
> confused: why and when can root_hpa be set to INVALID_PAGE? I found
> one case: when a vcpu receives the KVM_REQ_MMU_RELOAD request,
> kvm_mmu_unload() is called to invalidate root_hpa. But that case
> cannot cover all occasions.

Look also at mmu_free_roots, kvm_mmu_unload and kvm_mmu_reset_context.
In "normal" cases and without EPT, the roots are invalidated when CR3
changes or when the paging mode changes (32-bit, PAE, 64-bit, no
paging). With EPT, these changes won't reset the MMU (in fact, CR3
changes won't cause a vmexit at all).

With nested virtualization, roots are invalidated whenever
kvm->arch.mmu changes meaning from L1->L0 to L2->L0 or vice versa (in
the special case where EPT is disabled on L0, this happens trivially
because vmentry loads CR3 from the vmcs02).

Paolo
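For the non-EPT case Paolo describes, the chain from a guest CR3 write
to a new root is short. A loose paraphrase of that era's code follows
(a sketch of in-kernel code; names and call sites are approximate):

/* A guest mov-to-CR3 exits to KVM, which frees the old shadow roots;
 * root_hpa becomes INVALID_PAGE, and the next kvm_mmu_reload() on
 * vcpu entry builds roots for the new CR3. */
int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
{
        /* ... validation of the new CR3 value elided ... */
        vcpu->arch.cr3 = cr3;
        vcpu->arch.mmu.new_cr3(vcpu);   /* frees roots when shadowing */
        return 0;
}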
* Re: Calling to kvm_mmu_load
From: Arthur Chunqi Li @ 2013-10-24  7:55 UTC
To: Paolo Bonzini; +Cc: kvm, Jan Kiszka

Hi Paolo,

Thanks for your reply.

On Wed, Oct 23, 2013 at 2:21 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Look also at mmu_free_roots, kvm_mmu_unload and kvm_mmu_reset_context.
> In "normal" cases and without EPT, the roots are invalidated when CR3
> changes or when the paging mode changes (32-bit, PAE, 64-bit, no
> paging). With EPT, these changes won't reset the MMU (in fact, CR3
> changes won't cause a vmexit at all).

When EPT is enabled, why is root_hpa set to INVALID_PAGE while a VM
boots? I find that QEMU resets root_hpa via the KVM_REQ_MMU_RELOAD
request several times while booting a VM; why? And does the VM use EPT
from the very beginning of boot?

> With nested virtualization, roots are invalidated whenever
> kvm->arch.mmu changes meaning from L1->L0 to L2->L0 or vice versa (in
> the special case where EPT is disabled on L0, this happens trivially
> because vmentry loads CR3 from the vmcs02).

Besides, in tdp_page_fault() I found two execution flows that may not
reach __direct_map() (which I think is the normal path for handling a
page fault): fast_page_fault() and try_async_pf(). When are these two
paths taken while handling an EPT page fault?

Thanks,
Arthur

--
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China
* Re: Calling to kvm_mmu_load
From: Paolo Bonzini @ 2013-10-25  0:43 UTC
To: Arthur Chunqi Li; +Cc: kvm, Jan Kiszka

On 24/10/2013 08:55, Arthur Chunqi Li wrote:
> When EPT is enabled, why is root_hpa set to INVALID_PAGE while a VM
> boots?

Because EPT page tables are only built lazily. The EPT page tables
start all-invalid, and are built as the guest accesses pages at new
guest physical addresses (whereas shadow page tables are built as the
guest accesses pages at new guest virtual addresses).

> I find that QEMU resets root_hpa via the KVM_REQ_MMU_RELOAD request
> several times while booting a VM; why?

This happens when the memory map changes. A previously-valid guest
physical address might become invalid, and the EPT page tables have to
be "emptied".

> And does the VM use EPT from the very beginning of boot?

Yes. But it's not the VM; it's KVM that uses EPT.

The VM only uses EPT if you're using nested virtualization and EPT is
enabled. L1's KVM uses EPT, L2 doesn't (because it doesn't run KVM).

> Besides, in tdp_page_fault() I found two execution flows that may not
> reach __direct_map() (which I think is the normal path for handling a
> page fault): fast_page_fault() and try_async_pf(). When are these two
> paths taken while handling an EPT page fault?

fast_page_fault() is called if you're using dirty page tracking. It
checks whether we have a read-only page that is in a writeable memory
slot (SPTE_HOST_WRITEABLE) and whose PTE allows writes
(SPTE_MMU_WRITEABLE). If these conditions are satisfied, the page was
read-only only because of dirty page tracking; it is made read-write
with a single cmpxchg, and the page's bit is set in the dirty bitmap.

try_async_pf() will inject a "dummy" page fault instead of creating the
EPT page table, and create the page table in the background. The guest
can do something else (run another task) until the EPT page table has
been created; then a second "dummy" page fault is injected.
kvm_arch_async_page_not_present signals the first page fault,
kvm_arch_async_page_present signals the second. For this to happen, the
guest must have enabled the asynchronous page fault feature with a
write to a KVM-specific MSR.
Paolo
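The guest-side opt-in for asynchronous page faults that Paolo mentions
is a single MSR write. The MSR index and enable bit below match
Documentation/virtual/kvm/msr.txt; the surrounding guest-kernel code is
a loose sketch, not the actual Linux implementation:

#define MSR_KVM_ASYNC_PF_EN     0x4b564d02
#define KVM_ASYNC_PF_ENABLED    (1 << 0)

/* Per-cpu area where the host posts "page not present" / "page ready"
 * tokens; the real guest uses struct kvm_vcpu_pv_apf_data here. */
static DEFINE_PER_CPU(u64, apf_reason) __aligned(64);

static void kvm_guest_enable_async_pf(void)
{
        u64 pa = __pa(this_cpu_ptr(&apf_reason));

        /* Tell the host where to post notifications and turn the
         * feature on; without this write, try_async_pf() cannot use
         * the "dummy" page-fault mechanism described above. */
        wrmsrl(MSR_KVM_ASYNC_PF_EN, pa | KVM_ASYNC_PF_ENABLED);
}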
* Re: Calling to kvm_mmu_load
From: Arthur Chunqi Li @ 2013-10-29  5:39 UTC
To: Paolo Bonzini; +Cc: kvm, Jan Kiszka

Hi Paolo,

On Fri, Oct 25, 2013 at 8:43 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> fast_page_fault() is called if you're using dirty page tracking. It
> checks whether we have a read-only page that is in a writeable memory
> slot (SPTE_HOST_WRITEABLE) and whose PTE allows writes
> (SPTE_MMU_WRITEABLE). If these conditions are satisfied, the page was
> read-only only because of dirty page tracking; it is made read-write
> with a single cmpxchg, and the page's bit is set in the dirty bitmap.

What is the dirty page tracking code path? I found an obsolete flag,
"dirty_page_log_all", in very old code, but I cannot find where the
current code tracks dirty pages.

Besides, I noticed that memory management in KVM is built around
"struct kvm_memory_slot". How does kvm_memory_slot cooperate with Linux
memory management?
Thanks,
Arthur
* Re: Calling to kvm_mmu_load
From: Paolo Bonzini @ 2013-10-29 12:55 UTC
To: Arthur Chunqi Li; +Cc: kvm, Jan Kiszka

On 29/10/2013 06:39, Arthur Chunqi Li wrote:
> What is the dirty page tracking code path? I found an obsolete flag,
> "dirty_page_log_all", in very old code, but I cannot find where the
> current code tracks dirty pages.

Basically everything that accesses the dirty_bitmap field of struct
kvm_memory_slot is involved. It all starts when the
KVM_SET_USER_MEMORY_REGION ioctl is called with the
KVM_MEM_LOG_DIRTY_PAGES flag set.

> Besides, I noticed that memory management in KVM is built around
> "struct kvm_memory_slot". How does kvm_memory_slot cooperate with
> Linux memory management?

kvm_memory_slot just maps a host userspace address range to a guest
physical address range. Cooperation with Linux memory management is
done through the Linux MMU notifiers. MMU notifiers let KVM know that a
page has been swapped out, and KVM reacts by invalidating the shadow
page tables for the corresponding guest physical address.

Paolo
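Concretely, the entry point Paolo names is driven from userspace
roughly as below. The ioctl numbers, flags and structs are the real KVM
API; the function names, slot id and argument values are illustrative
placeholders:

#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Register (or re-register) a memslot with dirty logging enabled.
 * hva is the userspace backing memory, gpa the guest-physical start. */
int enable_dirty_logging(int vm_fd, void *hva, __u64 gpa, __u64 size)
{
        struct kvm_userspace_memory_region region;

        memset(&region, 0, sizeof(region));
        region.slot = 0;                        /* placeholder slot id */
        region.flags = KVM_MEM_LOG_DIRTY_PAGES; /* start tracking */
        region.guest_phys_addr = gpa;           /* GPA range ... */
        region.memory_size = size;
        region.userspace_addr = (__u64)hva;     /* ... mapped to HVA */

        return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}

/* Retrieve and clear the bits accumulated since the last call;
 * bitmap must hold one bit per page of the slot. */
int fetch_dirty_bitmap(int vm_fd, void *bitmap)
{
        struct kvm_dirty_log log;

        memset(&log, 0, sizeof(log));
        log.slot = 0;                           /* same placeholder slot */
        log.dirty_bitmap = bitmap;

        return ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);
}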
* Re: Calling to kvm_mmu_load
From: Arthur Chunqi Li @ 2013-10-30 11:39 UTC
To: Paolo Bonzini; +Cc: kvm, Jan Kiszka

On Tue, Oct 29, 2013 at 8:55 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> kvm_memory_slot just maps a host userspace address range to a guest
> physical address range. Cooperation with Linux memory management is
> done through the Linux MMU notifiers. MMU notifiers let KVM know that
> a page has been swapped out, and KVM reacts by invalidating the
> shadow page tables for the corresponding guest physical address.

So for each VM, QEMU registers its memory regions, KVM stores each
GPA->HVA mapping in a kvm_memory_slot, and at the first page fault KVM
uses EPT to map the GPA to an HPA. Am I right?

In that case, how is the ballooning mechanism implemented in the KVM
memory management module?

Thanks,
Arthur

--
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China
* Re: Calling to kvm_mmu_load
From: Paolo Bonzini @ 2013-10-30 11:44 UTC
To: Arthur Chunqi Li; +Cc: kvm, Jan Kiszka

On 30/10/2013 12:39, Arthur Chunqi Li wrote:
> So for each VM, QEMU registers its memory regions, KVM stores each
> GPA->HVA mapping in a kvm_memory_slot, and at the first page fault
> KVM uses EPT to map the GPA to an HPA. Am I right?

Yes.

> In that case, how is the ballooning mechanism implemented in the KVM
> memory management module?

Ballooning is done entirely in userspace with a madvise(MADV_DONTNEED)
call on the HVA. Userspace has its own GPA->HVA mapping that is
separate from the memslots (e.g. memory_region_find +
memory_region_get_ram_ptr in QEMU).

Paolo
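The madvise() call Paolo describes is an ordinary userspace operation.
A balloon "inflate" step reduces to something like the sketch below;
the function name is a placeholder, and hva stands for the page-aligned
host virtual address userspace computed for a page the guest gave up:

#include <sys/mman.h>

int balloon_release_page(void *hva, size_t size)
{
        /* The host kernel may reclaim the backing memory immediately;
         * a later guest access to the same GPA simply faults in a
         * fresh zero page through the EPT violation path. */
        return madvise(hva, size, MADV_DONTNEED);
}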
* Re: Calling to kvm_mmu_load
From: Arthur Chunqi Li @ 2013-10-31  8:05 UTC
To: Paolo Bonzini; +Cc: kvm, Jan Kiszka

Hi Paolo,

On Tue, Oct 29, 2013 at 8:55 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Basically everything that accesses the dirty_bitmap field of struct
> kvm_memory_slot is involved. It all starts when the
> KVM_SET_USER_MEMORY_REGION ioctl is called with the
> KVM_MEM_LOG_DIRTY_PAGES flag set.

I see that the mechanism here sets all pages read-only in order to
track dirty pages. But EPT provides a dirty bit in the EPT paging
structures. Why don't we use that?

Arthur

--
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China
* Re: Calling to kvm_mmu_load
From: Paolo Bonzini @ 2013-10-31  9:52 UTC
To: Arthur Chunqi Li; +Cc: kvm, Jan Kiszka

On 31/10/2013 09:05, Arthur Chunqi Li wrote:
> I see that the mechanism here sets all pages read-only in order to
> track dirty pages. But EPT provides a dirty bit in the EPT paging
> structures. Why don't we use that?

Not all processors provide it. Check eptad in
/sys/module/kvm_intel/parameters.

Paolo
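The parameter Paolo points at is a standard boolean module parameter,
so it can be checked from a program as well as from the shell; a
trivial sketch, assuming kvm_intel is loaded and exposes eptad:

#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/sys/module/kvm_intel/parameters/eptad", "r");
        int c;

        if (!f) {
                fprintf(stderr, "kvm_intel not loaded, or no eptad parameter\n");
                return 1;
        }
        c = fgetc(f);           /* boolean params read as 'Y' or 'N' */
        fclose(f);
        printf("EPT A/D bits: %s\n", c == 'Y' ? "available" : "not available");
        return 0;
}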