From: Marc Zyngier <marc.zyngier@arm.com>
To: Mario Smarduch <m.smarduch@samsung.com>
Cc: "kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
"christoffer.dall@linaro.org" <christoffer.dall@linaro.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
이정석 <jays.lee@samsung.com>, 정성진 <sungjinn.chung@samsung.com>
Subject: Re: [PATCH 3/5] live migration support for VM dirty log management
Date: Thu, 17 Apr 2014 09:00:18 +0100
Message-ID: <87a9bkxu5p.fsf@approximate.cambridge.arm.com>
In-Reply-To: <534F2F9B.80004@samsung.com> (Mario Smarduch's message of "Thu, 17 Apr 2014 02:34:19 +0100")
On Thu, Apr 17 2014 at 2:34:19 am BST, Mario Smarduch <m.smarduch@samsung.com> wrote:
> Add support for dirty bitmap management. Wanted to make it generic but function
> does a couple things different then the x86 version.
>
>
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
> arch/arm/include/asm/kvm_host.h | 3 ++
> arch/arm/kvm/arm.c | 71 ++++++++++++++++++++++++++++++++++++++-
> arch/arm/kvm/mmu.c | 53 +++++++++++++++++++++++++++++
> 3 files changed, 126 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 7ac1fdc..16ed4e4 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -230,5 +230,8 @@ int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);
>
> void kvm_tlb_flush_vm(struct kvm *kvm);
> void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
> +void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> + struct kvm_memory_slot *slot,
> + gfn_t gfn_offset, unsigned long mask);
>
> #endif /* __ARM_KVM_HOST_H__ */
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 7714cc6..7882343 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -785,9 +785,78 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
> }
> }
>
> +
> +/**
> + * kvm_mmu_slot_remove_access - retrieves the log of dirty pages for a memslot.
> + * It's itteratively during migration to retrieve pages written since
> + * last call. In the process write protects ptes that are dirty for next
> + * time, holds the mmu_lock while write protecting dirty pages.
> + *
> + * @kvm: The KVM pointer
> + * @log: Bitmap of dirty pages return.
> + */
So let's play the difference game with x86:
> int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
> {
> - return -EINVAL;
> + int r;
> + struct kvm_memory_slot *memslot;
> + unsigned long n, i;
> + unsigned long *dirty_bitmap;
> + unsigned long *dirty_bitmap_buffer;
> + bool is_dirty = false;
> + gfn_t offset;
You've moved offset out of the loop.
> + mutex_lock(&kvm->slots_lock);
> + r = -EINVAL;
> +
> + /* Return with error code will cause migration to abort, this happens
> + * when initial write protection of VM to manage dirty pages fails
> + */
> + if (kvm->arch.migration_in_progress == -1)
> + goto out;
You've added this test. How does x86 cope with the same case?
> + if (log->slot >= KVM_USER_MEM_SLOTS)
> + goto out;
> +
> + memslot = id_to_memslot(kvm->memslots, log->slot);
> + dirty_bitmap = memslot->dirty_bitmap;
> +
> + r = -ENOENT;
> + if (!dirty_bitmap)
> + goto out;
> +
> + n = kvm_dirty_bitmap_bytes(memslot);
> + dirty_bitmap_buffer = dirty_bitmap + n / sizeof(long);
> + memset(dirty_bitmap_buffer, 0, n);
> +
> + spin_lock(&kvm->mmu_lock);
> + for (i = 0; i < n / sizeof(long); i++) {
> + unsigned long mask;
> +
> + if (!dirty_bitmap[i])
> + continue;
> +
> + is_dirty = true;
> + offset = i * BITS_PER_LONG;
> + kvm_mmu_write_protect_pt_masked(kvm, memslot, offset,
> + dirty_bitmap[i]);
> + mask = dirty_bitmap[i];
> + dirty_bitmap_buffer[i] = mask;
> + dirty_bitmap[i] = 0;
You've expanded the xchg macro, and moved it around.
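For comparison, here is a minimal userspace sketch of the fetch-and-clear ordering that the xchg form gives you: grab the word and zero it in one atomic step, then write protect using the value you grabbed. It assumes C11 atomic_exchange() as a stand-in for the kernel's xchg(), and harvest_dirty_word() is a hypothetical name, not an existing kernel function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Userspace model only: atomic_exchange() stands in for the kernel's xchg().
 * harvest_dirty_word() is a hypothetical helper, not an existing kernel API.
 *
 * Atomically fetch and clear word 'i' of the live dirty bitmap and store the
 * fetched value in the snapshot buffer. The fetched mask is returned so the
 * caller can write protect exactly the pages that were reported.
 */
static unsigned long harvest_dirty_word(atomic_ulong *dirty_bitmap,
					unsigned long *snapshot, size_t i)
{
	unsigned long mask;

	assert(dirty_bitmap && snapshot);

	mask = atomic_exchange(&dirty_bitmap[i], 0UL);
	snapshot[i] = mask;
	return mask;
}
```

With a separate read followed by a store of 0, a bit set in between the two would be dropped if anything updates the bitmap without holding mmu_lock. The single exchange avoids having to reason about that.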
> + }
> +
> + if (is_dirty)
> + kvm_tlb_flush_vm(kvm);
This can easily be abstracted: a kvm_flush_remote_tlbs on x86, and
a HW broadcast on ARM.
> +
> + spin_unlock(&kvm->mmu_lock);
> + r = -EFAULT;
> +
> + if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n))
> + goto out;
> +
> + r = 0;
> +out:
> + mutex_unlock(&kvm->slots_lock);
> + return r;
> }
So only two small differences that can be easily factored out. I stand
by my initial analysis, and ask again that you move this function to the
generic code as a weak symbol. There isn't any value in duplicating it.
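To illustrate the shape I have in mind, a rough userspace sketch: one shared function emitted as a weak symbol, with the TLB flush as the only per-architecture piece. All names here (struct vm_model, flush_tlbs_model, get_dirty_log_model) are made up for the example. In the kernel you would use the __weak annotation, which expands to the same GCC attribute.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct kvm; it only counts flushes. */
struct vm_model {
	int flushes;
};

/*
 * Per-architecture piece (hypothetical): a remote TLB flush on x86,
 * a HW broadcast on ARM. Here it just records that it was called.
 */
static void flush_tlbs_model(struct vm_model *vm)
{
	vm->flushes++;
}

/*
 * Generic implementation, emitted as a weak symbol. An architecture that
 * really needs something different can supply a strong definition with the
 * same name in its own file, and the linker will pick that one instead.
 */
__attribute__((weak)) int get_dirty_log_model(struct vm_model *vm, bool is_dirty)
{
	assert(vm);

	/* ... the bitmap harvesting common to all architectures goes here ... */

	if (is_dirty)
		flush_tlbs_model(vm);
	return 0;
}
```

The point of the weak symbol is that architectures happy with the common code carry no copy of their own, while the override path stays available.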
> static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index b85ab56..47bec1c 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -773,6 +773,59 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
> spin_unlock(&kvm->mmu_lock);
> }
>
> +
> +/**
> + * kvm_mmu_write_protected_pt_masked - after migration thread write protects
> + * the entire VM address space itterative calls are made to get diry pages
> + * as the VM pages are being migrated. New dirty pages may be subset
> + * of initial WPed VM or new writes faulted in. Here write protect new
> + * dirty pages again in preparation of next dirty log read. This function is
> + * called as a result KVM_GET_DIRTY_LOG ioctl, to determine what pages
> + * need to be migrated.
> + * 'kvm->mmu_lock' must be held to protect against concurrent modification
> + * of page tables (2nd stage fault, mmu modifiers, device writes)
> + *
> + * @kvm: The KVM pointer
> + * @slot: The memory slot the dirty log is retrieved for
> + * @gfn_offset: The gfn offset in memory slot
> + * @mask: The mask of dirty pages at offset 'gnf_offset in this memory
> + * slot to be writ protect
> + */
> +
> +void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> + struct kvm_memory_slot *slot,
> + gfn_t gfn_offset, unsigned long mask)
> +{
> + phys_addr_t ipa;
> + pgd_t *pgdp = kvm->arch.pgd, *pgd;
> + pud_t *pud;
> + pmd_t *pmd;
> + pte_t *pte, new_pte;
> +
> + /* walk set bits in the mask and write protect corresponding pages */
> + while (mask) {
> + ipa = (slot->base_gfn + gfn_offset + __ffs(mask)) << PAGE_SHIFT;
> + pgd = pgdp + pgd_index(ipa);
> + if (!pgd_present(*pgd))
> + goto update_mask;
I think something is wrong in your logic. If there is no PGD, it means a
whole 1GB isn't present. Yet you're just clearing one bit from the mask
and going around the loop again. As you're only looking at BITS_PER_LONG
contiguous pages at a time, it is likely that the same thing will happen
for the other pages, and you're just wasting precious CPU cycles here.
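One possible way to avoid the repeated walks, as a userspace sketch: when a table level is missing, drop every mask bit that falls inside the block that level covers, instead of just the lowest one. mask_skip_block() and MODEL_PAGE_SHIFT are hypothetical names, and the sketch assumes 4KB pages and that bit 0 of the mask corresponds to base_ipa.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4KB pages */

/*
 * Hypothetical helper, not kernel code.
 *
 * 'base_ipa' is the IPA of bit 0 of 'mask', 'ipa' is the address whose
 * table entry turned out to be missing, and 'block_size' is the size
 * covered by that entry (for example 1GB for a missing PGD here).
 * Returns the mask with every bit inside the missing block cleared.
 */
static unsigned long mask_skip_block(unsigned long mask, uint64_t base_ipa,
				     uint64_t ipa, uint64_t block_size)
{
	uint64_t boundary, pages;

	/* block_size must be a power of two; ipa must not precede base_ipa */
	assert(block_size && (block_size & (block_size - 1)) == 0);
	assert(ipa >= base_ipa);

	boundary = (ipa | (block_size - 1)) + 1;	/* first IPA past the block */
	pages = (boundary - base_ipa) >> MODEL_PAGE_SHIFT;

	if (pages >= sizeof(mask) * 8)
		return 0;	/* the rest of the mask lies inside the block */

	return mask & ~((1UL << pages) - 1);
}
```

In the common case the whole BITS_PER_LONG window sits inside one block, so a single missing upper-level entry empties the mask and the loop terminates immediately.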
> + pud = pud_offset(pgd, ipa);
> + if (!pud_present(*pud))
> + goto update_mask;
> + pmd = pmd_offset(pud, ipa);
> + if (!pmd_present(*pmd))
> + goto update_mask;
> + pte = pte_offset_kernel(pmd, ipa);
> + if (!pte_present(*pte))
> + goto update_mask;
> + if ((*pte & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY)
> + goto update_mask;
> + new_pte = pfn_pte(pte_pfn(*pte), PAGE_S2);
> + *pte = new_pte;
I'd like to see these two lines in a separate function (something like
"stage2_mark_pte_ro")...
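Something along these lines, shown as a userspace model so it can be read in isolation. pte_model_t and the MODEL_S2_* values are stand-ins I am assuming for pte_t and the L_PTE_S2_* definitions. Unlike the two lines in the patch, which rebuild the PTE from PAGE_S2, this sketch only touches the permission field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model types and bits: assumptions standing in for pte_t / L_PTE_S2_*. */
typedef uint64_t pte_model_t;

#define MODEL_S2_RDONLY	((pte_model_t)1 << 6)
#define MODEL_S2_RDWR	((pte_model_t)3 << 6)

/*
 * Mark a stage-2 PTE read-only. Returns true if the entry was changed,
 * false if it was already read-only, so the caller knows whether a TLB
 * flush is needed for this entry.
 */
static bool stage2_mark_pte_ro(pte_model_t *pte)
{
	assert(pte);

	if ((*pte & MODEL_S2_RDWR) == MODEL_S2_RDONLY)
		return false;

	*pte = (*pte & ~MODEL_S2_RDWR) | MODEL_S2_RDONLY;
	return true;
}
```

Having the "already read-only" check inside the helper also lets the walker drop its own test before the call.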
> +update_mask:
> + mask &= mask - 1;
> + }
> +}
> +
> static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> struct kvm_memory_slot *memslot,
> unsigned long fault_status)
--
Jazz is not dead. It just smells funny.