* Calling to kvm_mmu_load
From: Arthur Chunqi Li @ 2013-10-21  7:56 UTC
To: kvm; +Cc: Jan Kiszka

Hi there,

I noticed that kvm_mmu_reload() is called on every vcpu entry, and that
it calls kvm_mmu_load() when root_hpa is INVALID_PAGE. I am confused:
why and when can root_hpa be set to INVALID_PAGE? I found one case:
when a vcpu receives the KVM_REQ_MMU_RELOAD request, kvm_mmu_unload()
is called to invalidate root_hpa. But that case cannot cover all
occasions.

Thanks,
Arthur

--
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China
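The check under discussion looks roughly like this, paraphrased from
the arch/x86/kvm code of that era (a sketch of in-kernel code, not
compilable standalone; details approximate):

/* vcpu_enter_guest() calls kvm_mmu_reload() on every entry; it is a
 * no-op unless something has invalidated root_hpa. */
static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
{
        if (likely(vcpu->arch.mmu.root_hpa != INVALID_PAGE))
                return 0;               /* roots still valid: fast path */

        return kvm_mmu_load(vcpu);      /* rebuild the root page table */
}

/* KVM_REQ_MMU_RELOAD is serviced by kvm_mmu_unload(), which frees the
 * roots and leaves root_hpa == INVALID_PAGE for the check above. */
void kvm_mmu_unload(struct kvm_vcpu *vcpu)
{
        mmu_free_roots(vcpu);           /* sets root_hpa = INVALID_PAGE */
        WARN_ON(VALID_PAGE(vcpu->arch.mmu.root_hpa));
}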
* Re: Calling to kvm_mmu_load
From: Paolo Bonzini @ 2013-10-23  6:21 UTC
To: Arthur Chunqi Li; +Cc: kvm, Jan Kiszka

On 21/10/2013 08:56, Arthur Chunqi Li wrote:
> I noticed that kvm_mmu_reload() is called on every vcpu entry, and
> that it calls kvm_mmu_load() when root_hpa is INVALID_PAGE. I am
> confused: why and when can root_hpa be set to INVALID_PAGE? I found
> one case: when a vcpu receives the KVM_REQ_MMU_RELOAD request,
> kvm_mmu_unload() is called to invalidate root_hpa. But that case
> cannot cover all occasions.

Look also at mmu_free_roots, kvm_mmu_unload and kvm_mmu_reset_context.
In "normal" cases and without EPT, the roots are invalidated when CR3
changes or when the paging mode changes (32-bit, PAE, 64-bit, no
paging). With EPT, these changes won't reset the MMU (in fact, CR3
changes won't cause a vmexit at all).

With nested virtualization, roots are invalidated whenever
kvm->arch.mmu changes meaning from L1->L0 to L2->L0 or vice versa (in
the special case where EPT is disabled on L0, this happens trivially
because vmentry loads CR3 from the vmcs02).

Paolo
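For the non-EPT case Paolo describes, the chain from a guest CR3 write
to a new root is short. A loose paraphrase of that era's code follows
(a sketch of in-kernel code; names and call sites are approximate):

/* A guest mov-to-CR3 exits to KVM, which frees the old shadow roots;
 * root_hpa becomes INVALID_PAGE, and the next kvm_mmu_reload() on
 * vcpu entry builds roots for the new CR3. */
int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
{
        /* ... validation of the new CR3 value elided ... */
        vcpu->arch.cr3 = cr3;
        vcpu->arch.mmu.new_cr3(vcpu);   /* frees roots when shadowing */
        return 0;
}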
* Re: Calling to kvm_mmu_load
From: Arthur Chunqi Li @ 2013-10-24  7:55 UTC
To: Paolo Bonzini; +Cc: kvm, Jan Kiszka

Hi Paolo,

Thanks for your reply.

On Wed, Oct 23, 2013 at 2:21 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Look also at mmu_free_roots, kvm_mmu_unload and kvm_mmu_reset_context.
> In "normal" cases and without EPT, the roots are invalidated when CR3
> changes or when the paging mode changes (32-bit, PAE, 64-bit, no
> paging). With EPT, these changes won't reset the MMU (in fact, CR3
> changes won't cause a vmexit at all).

When EPT is enabled, why is root_hpa set to INVALID_PAGE while a VM
boots? I find that QEMU resets root_hpa via the KVM_REQ_MMU_RELOAD
request several times while booting a VM; why? And does the VM use EPT
from the very beginning of boot?

> With nested virtualization, roots are invalidated whenever
> kvm->arch.mmu changes meaning from L1->L0 to L2->L0 or vice versa (in
> the special case where EPT is disabled on L0, this happens trivially
> because vmentry loads CR3 from the vmcs02).

Besides, in tdp_page_fault() I found two execution flows that may not
reach __direct_map() (which I think is the normal path for handling a
page fault): fast_page_fault() and try_async_pf(). When are these two
paths taken while handling an EPT page fault?

Thanks,
Arthur

--
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China
* Re: Calling to kvm_mmu_load
From: Paolo Bonzini @ 2013-10-25  0:43 UTC
To: Arthur Chunqi Li; +Cc: kvm, Jan Kiszka

On 24/10/2013 08:55, Arthur Chunqi Li wrote:
> When EPT is enabled, why is root_hpa set to INVALID_PAGE while a VM
> boots?

Because EPT page tables are only built lazily. The EPT page tables
start all-invalid, and are built as the guest accesses pages at new
guest physical addresses (whereas shadow page tables are built as the
guest accesses pages at new guest virtual addresses).

> I find that QEMU resets root_hpa via the KVM_REQ_MMU_RELOAD request
> several times while booting a VM; why?

This happens when the memory map changes. A previously-valid guest
physical address might become invalid, and the EPT page tables have to
be "emptied".

> And does the VM use EPT from the very beginning of boot?

Yes. But it's not the VM; it's KVM that uses EPT.

The VM only uses EPT if you're using nested virtualization and EPT is
enabled. L1's KVM uses EPT, L2 doesn't (because it doesn't run KVM).

> Besides, in tdp_page_fault() I found two execution flows that may not
> reach __direct_map() (which I think is the normal path for handling a
> page fault): fast_page_fault() and try_async_pf(). When are these two
> paths taken while handling an EPT page fault?

fast_page_fault() is called if you're using dirty page tracking. It
checks whether we have a read-only page that is in a writeable memory
slot (SPTE_HOST_WRITEABLE) and whose PTE allows writes
(SPTE_MMU_WRITEABLE). If these conditions are satisfied, the page was
read-only only because of dirty page tracking; it is made read-write
with a single cmpxchg, and the page's bit is set in the dirty bitmap.

try_async_pf() will inject a "dummy" page fault instead of creating the
EPT page table, and create the page table in the background. The guest
can do something else (run another task) until the EPT page table has
been created; then a second "dummy" page fault is injected.
kvm_arch_async_page_not_present signals the first page fault,
kvm_arch_async_page_present signals the second. For this to happen, the
guest must have enabled the asynchronous page fault feature with a
write to a KVM-specific MSR.
Paolo
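The guest-side opt-in for asynchronous page faults that Paolo mentions
is a single MSR write. The MSR index and enable bit below match
Documentation/virtual/kvm/msr.txt; the surrounding guest-kernel code is
a loose sketch, not the actual Linux implementation:

#define MSR_KVM_ASYNC_PF_EN     0x4b564d02
#define KVM_ASYNC_PF_ENABLED    (1 << 0)

/* Per-cpu area where the host posts "page not present" / "page ready"
 * tokens; the real guest uses struct kvm_vcpu_pv_apf_data here. */
static DEFINE_PER_CPU(u64, apf_reason) __aligned(64);

static void kvm_guest_enable_async_pf(void)
{
        u64 pa = __pa(this_cpu_ptr(&apf_reason));

        /* Tell the host where to post notifications and turn the
         * feature on; without this write, try_async_pf() cannot use
         * the "dummy" page-fault mechanism described above. */
        wrmsrl(MSR_KVM_ASYNC_PF_EN, pa | KVM_ASYNC_PF_ENABLED);
}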
* Re: Calling to kvm_mmu_load
From: Arthur Chunqi Li @ 2013-10-29  5:39 UTC
To: Paolo Bonzini; +Cc: kvm, Jan Kiszka

Hi Paolo,

On Fri, Oct 25, 2013 at 8:43 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> fast_page_fault() is called if you're using dirty page tracking. It
> checks whether we have a read-only page that is in a writeable memory
> slot (SPTE_HOST_WRITEABLE) and whose PTE allows writes
> (SPTE_MMU_WRITEABLE). If these conditions are satisfied, the page was
> read-only only because of dirty page tracking; it is made read-write
> with a single cmpxchg, and the page's bit is set in the dirty bitmap.

What is the dirty page tracking code path? I found an obsolete flag,
"dirty_page_log_all", in very old code, but I cannot find where the
current code tracks dirty pages.

Besides, I noticed that memory management in KVM is built around
"struct kvm_memory_slot". How does kvm_memory_slot cooperate with Linux
memory management?
Thanks,
Arthur
* Re: Calling to kvm_mmu_load
From: Paolo Bonzini @ 2013-10-29 12:55 UTC
To: Arthur Chunqi Li; +Cc: kvm, Jan Kiszka

On 29/10/2013 06:39, Arthur Chunqi Li wrote:
> What is the dirty page tracking code path? I found an obsolete flag,
> "dirty_page_log_all", in very old code, but I cannot find where the
> current code tracks dirty pages.

Basically everything that accesses the dirty_bitmap field of struct
kvm_memory_slot is involved. It all starts when the
KVM_SET_USER_MEMORY_REGION ioctl is called with the
KVM_MEM_LOG_DIRTY_PAGES flag set.

> Besides, I noticed that memory management in KVM is built around
> "struct kvm_memory_slot". How does kvm_memory_slot cooperate with
> Linux memory management?

kvm_memory_slot just maps a host userspace address range to a guest
physical address range. Cooperation with Linux memory management is
done through the Linux MMU notifiers. MMU notifiers let KVM know that a
page has been swapped out, and KVM reacts by invalidating the shadow
page tables for the corresponding guest physical address.

Paolo
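Concretely, the entry point Paolo names is driven from userspace
roughly as below. The ioctl numbers, flags and structs are the real KVM
API; the function names, slot id and argument values are illustrative
placeholders:

#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Register (or re-register) a memslot with dirty logging enabled.
 * hva is the userspace backing memory, gpa the guest-physical start. */
int enable_dirty_logging(int vm_fd, void *hva, __u64 gpa, __u64 size)
{
        struct kvm_userspace_memory_region region;

        memset(&region, 0, sizeof(region));
        region.slot = 0;                        /* placeholder slot id */
        region.flags = KVM_MEM_LOG_DIRTY_PAGES; /* start tracking */
        region.guest_phys_addr = gpa;           /* GPA range ... */
        region.memory_size = size;
        region.userspace_addr = (__u64)hva;     /* ... mapped to HVA */

        return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}

/* Retrieve and clear the bits accumulated since the last call;
 * bitmap must hold one bit per page of the slot. */
int fetch_dirty_bitmap(int vm_fd, void *bitmap)
{
        struct kvm_dirty_log log;

        memset(&log, 0, sizeof(log));
        log.slot = 0;                           /* same placeholder slot */
        log.dirty_bitmap = bitmap;

        return ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);
}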
* Re: Calling to kvm_mmu_load
From: Arthur Chunqi Li @ 2013-10-30 11:39 UTC
To: Paolo Bonzini; +Cc: kvm, Jan Kiszka

On Tue, Oct 29, 2013 at 8:55 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> kvm_memory_slot just maps a host userspace address range to a guest
> physical address range. Cooperation with Linux memory management is
> done through the Linux MMU notifiers. MMU notifiers let KVM know that
> a page has been swapped out, and KVM reacts by invalidating the
> shadow page tables for the corresponding guest physical address.

So for each VM, QEMU registers its memory regions, KVM stores each
GPA->HVA mapping in a kvm_memory_slot, and at the first page fault KVM
uses EPT to map the GPA to an HPA. Am I right?

In that case, how is the ballooning mechanism implemented in the KVM
memory management module?

Thanks,
Arthur

--
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China
* Re: Calling to kvm_mmu_load
From: Paolo Bonzini @ 2013-10-30 11:44 UTC
To: Arthur Chunqi Li; +Cc: kvm, Jan Kiszka

On 30/10/2013 12:39, Arthur Chunqi Li wrote:
> So for each VM, QEMU registers its memory regions, KVM stores each
> GPA->HVA mapping in a kvm_memory_slot, and at the first page fault
> KVM uses EPT to map the GPA to an HPA. Am I right?

Yes.

> In that case, how is the ballooning mechanism implemented in the KVM
> memory management module?

Ballooning is done entirely in userspace with a madvise(MADV_DONTNEED)
call on the HVA. Userspace has its own GPA->HVA mapping that is
separate from the memslots (e.g. memory_region_find +
memory_region_get_ram_ptr in QEMU).

Paolo
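The madvise() call Paolo describes is an ordinary userspace operation.
A balloon "inflate" step reduces to something like the sketch below;
the function name is a placeholder, and hva stands for the page-aligned
host virtual address userspace computed for a page the guest gave up:

#include <sys/mman.h>

int balloon_release_page(void *hva, size_t size)
{
        /* The host kernel may reclaim the backing memory immediately;
         * a later guest access to the same GPA simply faults in a
         * fresh zero page through the EPT violation path. */
        return madvise(hva, size, MADV_DONTNEED);
}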
* Re: Calling to kvm_mmu_load
From: Arthur Chunqi Li @ 2013-10-31  8:05 UTC
To: Paolo Bonzini; +Cc: kvm, Jan Kiszka

Hi Paolo,

On Tue, Oct 29, 2013 at 8:55 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Basically everything that accesses the dirty_bitmap field of struct
> kvm_memory_slot is involved. It all starts when the
> KVM_SET_USER_MEMORY_REGION ioctl is called with the
> KVM_MEM_LOG_DIRTY_PAGES flag set.

I see that the mechanism here sets all pages read-only in order to
track dirty pages. But EPT provides a dirty bit in the EPT paging
structures. Why don't we use that?

Arthur

--
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China
* Re: Calling to kvm_mmu_load
From: Paolo Bonzini @ 2013-10-31  9:52 UTC
To: Arthur Chunqi Li; +Cc: kvm, Jan Kiszka

On 31/10/2013 09:05, Arthur Chunqi Li wrote:
> I see that the mechanism here sets all pages read-only in order to
> track dirty pages. But EPT provides a dirty bit in the EPT paging
> structures. Why don't we use that?

Not all processors provide it. Check eptad in
/sys/module/kvm_intel/parameters.

Paolo
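The parameter Paolo points at is a standard boolean module parameter,
so it can be checked from a program as well as from the shell; a
trivial sketch, assuming kvm_intel is loaded and exposes eptad:

#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/sys/module/kvm_intel/parameters/eptad", "r");
        int c;

        if (!f) {
                fprintf(stderr, "kvm_intel not loaded, or no eptad parameter\n");
                return 1;
        }
        c = fgetc(f);           /* boolean params read as 'Y' or 'N' */
        fclose(f);
        printf("EPT A/D bits: %s\n", c == 'Y' ? "available" : "not available");
        return 0;
}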