From: Kai Huang <kai.huang@linux.intel.com>
To: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: pbonzini@redhat.com, gleb@kernel.org, linux@arm.linux.org.uk,
kvm@vger.kernel.org
Subject: Re: [PATCH 2/6] KVM: MMU: Add mmu help functions to support PML
Date: Thu, 05 Feb 2015 13:59:37 +0800
Message-ID: <54D306C9.2060501@linux.intel.com>
In-Reply-To: <20150203173440.GG19731@potion.redhat.com>
On 02/04/2015 01:34 AM, Radim Krčmář wrote:
> 2015-01-28 10:54+0800, Kai Huang:
>> This patch adds new MMU layer functions to clear/set the D-bit for a memory
>> slot, and to write protect superpages for a memory slot.
>>
>> In case of PML, the CPU logs the dirty GPA automatically to the PML buffer
>> when it updates the D-bit from 0 to 1. Therefore we don't have to write
>> protect 4K pages; we only need to clear the D-bit in order to log that GPA.
>>
>> For superpages, we still write protect them and let the page fault code
>> handle dirty page logging, as we still need to split superpages into 4K
>> pages for PML.
>>
>> As PML is always enabled during the guest's lifetime, to eliminate
>> unnecessary PML GPA logging we set the D-bit manually for slots with dirty
>> logging disabled.
>>
>> Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
> This message contains just several stylistic tips; I wasn't able to
> learn enough about large pages to review today.
Hi Radim,
Thanks for your review, and sorry for the late reply. I have been working
on something else these days.
There are two reasons we still write protect large pages in case of PML.
One is that we still need to split large pages into 4K pages in dirty
logging mode, and PML doesn't split large pages automatically. The other
is that PML only logs the dirty GPA; it doesn't distinguish whether the
dirty GPA is in a 4K page or a large page.
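Roughly, the two cases look like this (a simplified sketch only, not the
patch code; spte/sptep/level are free variables here, and the real update
goes through mmu_spte_update() as in the patch below):

	/*
	 * Sketch: how a dirty-logged spte is prepared depending on
	 * page size, per the reasoning above.
	 */
	if (level == PT_PAGE_TABLE_LEVEL) {
		/*
		 * 4K page: leave it writable and clear the D-bit; the
		 * next write sets the D-bit again and the CPU logs the
		 * GPA into the PML buffer automatically.
		 */
		spte &= ~shadow_dirty_mask;
	} else {
		/*
		 * Large page: write protect it, so the write fault path
		 * can split it into 4K pages before logging proceeds.
		 */
		spte &= ~PT_WRITABLE_MASK;
	}
	mmu_spte_update(sptep, spte);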
For the rest of your comments: as this patch series is already in Paolo's
queue branch, I can send further patches to address them, if Paolo agrees
the enhancements are needed. The duplicated rmap walkers you pointed out,
for instance, could be collapsed as sketched below.
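Something like this could replace the three near-identical loops (a rough,
untested sketch; the names slot_level_handler and slot_handle_level are
made up here, not existing code):

	typedef bool (*slot_level_handler)(struct kvm *kvm,
					   unsigned long *rmapp);

	/* Walk the rmaps of a slot for levels [min_level, max_level],
	 * applying fn; caller holds mmu_lock and does the TLB flush. */
	static bool slot_handle_level(struct kvm *kvm,
				      struct kvm_memory_slot *memslot,
				      slot_level_handler fn,
				      int min_level, int max_level)
	{
		gfn_t last_gfn = memslot->base_gfn + memslot->npages - 1;
		bool flush = false;
		int i;

		for (i = min_level; i <= max_level; ++i) {
			unsigned long *rmapp;
			unsigned long last_index, index;

			rmapp = memslot->arch.rmap[i - PT_PAGE_TABLE_LEVEL];
			last_index = gfn_to_index(last_gfn,
						  memslot->base_gfn, i);

			for (index = 0; index <= last_index;
			     ++index, ++rmapp) {
				if (*rmapp)
					flush |= fn(kvm, rmapp);

				if (need_resched() ||
				    spin_needbreak(&kvm->mmu_lock))
					cond_resched_lock(&kvm->mmu_lock);
			}
		}
		return flush;
	}

kvm_mmu_slot_set_dirty() would then reduce to roughly
slot_handle_level(kvm, memslot, __rmap_set_dirty, PT_PAGE_TABLE_LEVEL,
PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES - 1), with the locking and the
final TLB flush kept in the caller. __rmap_write_protect() would need a
small wrapper since it takes an extra argument.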
Thanks,
-Kai
>
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index b18e65c..c438224 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -4368,6 +4448,121 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
>> +void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm,
>> + struct kvm_memory_slot *memslot)
>> +{
>> + gfn_t last_gfn;
>> + int i;
>> + bool flush = false;
>> +
>> + last_gfn = memslot->base_gfn + memslot->npages - 1;
>> +
>> + spin_lock(&kvm->mmu_lock);
>> +
>> + for (i = PT_PAGE_TABLE_LEVEL + 1; /* skip rmap for 4K page */
> ("+ 1" is the only difference from kvm_mmu_slot_remove_write_access();
> new argument, initial level, would be better.
>
> Btw, PT_PAGE_TABLE_LEVEL + 1 == PT_DIRECTORY_LEVEL)
>
>> + i < PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES; ++i) {
>> + unsigned long *rmapp;
>> + unsigned long last_index, index;
>> +
>> + rmapp = memslot->arch.rmap[i - PT_PAGE_TABLE_LEVEL];
>> + last_index = gfn_to_index(last_gfn, memslot->base_gfn, i);
>> +
>> + for (index = 0; index <= last_index; ++index, ++rmapp) {
>> + if (*rmapp)
>> + flush |= __rmap_write_protect(kvm, rmapp,
>> + false);
>> +
>> + if (need_resched() || spin_needbreak(&kvm->mmu_lock))
>> + cond_resched_lock(&kvm->mmu_lock);
>> + }
>> + }
>> + spin_unlock(&kvm->mmu_lock);
>> +
>> + /* see kvm_mmu_slot_remove_write_access */
>> + lockdep_assert_held(&kvm->slots_lock);
>> +
>> + if (flush)
>> + kvm_flush_remote_tlbs(kvm);
>> +}
>> +void kvm_mmu_slot_set_dirty(struct kvm *kvm,
>> + struct kvm_memory_slot *memslot)
>> +{
>> + gfn_t last_gfn;
>> + int i;
>> + bool flush = false;
>> +
>> + last_gfn = memslot->base_gfn + memslot->npages - 1;
>> +
>> + spin_lock(&kvm->mmu_lock);
>> +
>> + for (i = PT_PAGE_TABLE_LEVEL;
>> + i < PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES; ++i) {
>> + unsigned long *rmapp;
>> + unsigned long last_index, index;
>> +
>> + rmapp = memslot->arch.rmap[i - PT_PAGE_TABLE_LEVEL];
>> + last_index = gfn_to_index(last_gfn, memslot->base_gfn, i);
>> +
>> + for (index = 0; index <= last_index; ++index, ++rmapp) {
>> + if (*rmapp)
>> + flush |= __rmap_set_dirty(kvm, rmapp);
> (And yet another similar walker ... We can either pass a worker function
> accepting 'kvm' and 'rmapp', use a 'switch' with a new operations enum,
> or have code duplication. Paolo?)
>
>> +
>> + if (need_resched() || spin_needbreak(&kvm->mmu_lock))
>> + cond_resched_lock(&kvm->mmu_lock);
>> + }
>> + }
>> +
>> + spin_unlock(&kvm->mmu_lock);
>> +
>> + lockdep_assert_held(&kvm->slots_lock);
>> +
>> + /* see kvm_mmu_slot_leaf_clear_dirty */
>> + if (flush)
>> + kvm_flush_remote_tlbs(kvm);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_mmu_slot_set_dirty);
>> +void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
>> + struct kvm_memory_slot *memslot)
>> +{
>> + gfn_t last_gfn;
>> + unsigned long *rmapp;
>> + unsigned long last_index, index;
>> + bool flush = false;
>> +
>> + last_gfn = memslot->base_gfn + memslot->npages - 1;
>> +
>> + spin_lock(&kvm->mmu_lock);
>> +
> (If we abstracted with switch or function, we could also add max level
> argument to cover this function.)
>
>> + rmapp = memslot->arch.rmap[PT_PAGE_TABLE_LEVEL - 1];
>> + last_index = gfn_to_index(last_gfn, memslot->base_gfn,
>> + PT_PAGE_TABLE_LEVEL);
>> +
>> + for (index = 0; index <= last_index; ++index, ++rmapp) {
>> + if (*rmapp)
>> + flush |= __rmap_clear_dirty(kvm, rmapp);
>> +
>> + if (need_resched() || spin_needbreak(&kvm->mmu_lock))
>> + cond_resched_lock(&kvm->mmu_lock);
>> + }
>> +
>> + spin_unlock(&kvm->mmu_lock);
>> +
>> + lockdep_assert_held(&kvm->slots_lock);
>> +
>> + /*
>> + * It's also safe to flush TLBs out of mmu lock here as currently this
>> + * function is only used for dirty logging, in which case flushing TLB
>> + * out of mmu lock also guarantees no dirty pages will be lost in
>> + * dirty_bitmap.
>> + */
>> + if (flush)
>> + kvm_flush_remote_tlbs(kvm);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_mmu_slot_leaf_clear_dirty);
>> @@ -1215,6 +1215,60 @@ static bool __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp,
> (After looking at following two pairs of functions, you'll hopefully
> understand why I don't like C very much.)
>
>> +static bool spte_clear_dirty(struct kvm *kvm, u64 *sptep)
>> +{
>> + u64 spte = *sptep;
>> +
>> + rmap_printk("rmap_clear_dirty: spte %p %llx\n", sptep, *sptep);
>> +
>> + spte &= ~shadow_dirty_mask;
>> +
>> + return mmu_spte_update(sptep, spte);
>> +}
>> +static bool spte_set_dirty(struct kvm *kvm, u64 *sptep)
>> +{
>> + u64 spte = *sptep;
>> +
>> + rmap_printk("rmap_set_dirty: spte %p %llx\n", sptep, *sptep);
>> +
>> + spte |= shadow_dirty_mask;
>> +
>> + return mmu_spte_update(sptep, spte);
>> +}
>> +static bool __rmap_clear_dirty(struct kvm *kvm, unsigned long *rmapp)
>> +{
>> + u64 *sptep;
>> + struct rmap_iterator iter;
>> + bool flush = false;
>> +
>> + for (sptep = rmap_get_first(*rmapp, &iter); sptep;) {
>> + BUG_ON(!(*sptep & PT_PRESENT_MASK));
>> +
>> + flush |= spte_clear_dirty(kvm, sptep);
>> + sptep = rmap_get_next(&iter);
>> + }
>> +
>> + return flush;
>> +}
>> +static bool __rmap_set_dirty(struct kvm *kvm, unsigned long *rmapp)
>> +{
>> + u64 *sptep;
>> + struct rmap_iterator iter;
>> + bool flush = false;
>> +
>> + for (sptep = rmap_get_first(*rmapp, &iter); sptep;) {
>> + BUG_ON(!(*sptep & PT_PRESENT_MASK));
>> +
>> + flush |= spte_set_dirty(kvm, sptep);
>> + sptep = rmap_get_next(&iter);
>> + }
>> +
>> + return flush;
>> +}
Thread overview: 22+ messages
2015-01-28 2:54 [PATCH 0/6] KVM: VMX: Page Modification Logging (PML) support Kai Huang
2015-01-28 2:54 ` [PATCH 1/6] KVM: Rename kvm_arch_mmu_write_protect_pt_masked to be more generic for log dirty Kai Huang
2015-01-28 2:54 ` [PATCH 2/6] KVM: MMU: Add mmu help functions to support PML Kai Huang
2015-02-03 17:34 ` Radim Krčmář
2015-02-05 5:59 ` Kai Huang [this message]
2015-02-05 14:51 ` Radim Krčmář
2015-01-28 2:54 ` [PATCH 3/6] KVM: MMU: Explicitly set D-bit for writable spte Kai Huang
2015-01-28 2:54 ` [PATCH 4/6] KVM: x86: Change parameter of kvm_mmu_slot_remove_write_access Kai Huang
2015-02-03 16:28 ` Radim Krčmář
2015-01-28 2:54 ` [PATCH 5/6] KVM: x86: Add new dirty logging kvm_x86_ops for PML Kai Huang
2015-02-03 15:53 ` Radim Krčmář
2015-02-05 6:29 ` Kai Huang
2015-02-05 14:52 ` Radim Krčmář
2015-01-28 2:54 ` [PATCH 6/6] KVM: VMX: Add PML support in VMX Kai Huang
2015-02-03 15:18 ` Radim Krčmář
2015-02-03 15:39 ` Paolo Bonzini
2015-02-03 16:02 ` Radim Krčmář
2015-02-05 6:23 ` Kai Huang
2015-02-05 15:04 ` Radim Krčmář
2015-02-06 0:22 ` Kai Huang
2015-02-06 0:28 ` Kai Huang
2015-02-06 16:00 ` Paolo Bonzini