linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v13 0/7] KVM/arm/x86: dirty page logging support for ARMv7 (3.17.0-rc1)
@ 2014-11-07  0:40 Mario Smarduch
  2014-11-07  0:40 ` [PATCH v13 1/7] KVM: Add architecture-defined TLB flush support Mario Smarduch
                   ` (6 more replies)
  0 siblings, 7 replies; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07  0:40 UTC (permalink / raw)
  To: linux-arm-kernel

Patch series adds support for ARMv7 and generic dirty page logging support. 
As we try to move towards generic dirty page logging additional logic is moved 
to generic code. Initially armv7 KVM_GET_DIRTY_LOG ioctl reuses generic code, 
to be followed by armv8 and x86. 

Testing:
- Generally live migration + checksumming of source/destination memory regions 
  is used to validate correctness.
- ARMv7 - qemu machvirt, VExpress - Exynos 5440, FastModels - lmbench + dirty 
  guest memory cycling. On FastModels you must set 'migrate_set_downtime=1'
  for higher dirty rate (for example 2000 pges/100mS).

See https://github.com/mjsmar/arm-dirtylog-tests for details.

In addition to ARMv7 patch series was compiled for:
- x86_64 - defconfig 
- ia64 - ia64-linux-gcc4.6.3 - defconfig, ia64 Kconfig defines BROKEN worked 
  around that to make sure new changes don't break build. Eventually build
  breaks due to other reasons.
- mips - mips64-linux-gcc4.6.3 - malta_kvm_defconfig
- ppc - powerpc64-linux-gcc4.6.3 - pseries_defconfig
- s390 - s390x-linux-gcc4.6.3 - defconfig
- armv8 - aarch64-linux-gnu-gcc4.8.1 - defconfig

ARMv7 Dirty page logging implementation overivew-
- initially write protects memory region pages 2nd stage page tables
- add support to read dirty page log and again write protect dirty pages 
  for next pass.
- second stage huge page are dissolved into small page tables to keep track of
  dirty pages at page granularity. Tracking at huge page granularity limits
  migration to an almost idle system. Small page size logging supports higher
  memory dirty rates.
- In the event migration is canceled, normal behavior is resumed huge pages
  are rebuilt over time.

Changes since v12:
- Added Paolos and James Hogan's comments to extend kvm_get_dirty_log() to
  make it further generic by adding write protection in addition to dirty bit
  map handling. This led to new generic function kvm_get_dirty_log_protect().

Changes since v11:
- Implemented Alex's comments to simplify generic layer.

Changes since v10:
- addressed wanghaibin comments 
- addressed Christoffers comments

Changes since v9:
- Split patches into generic and architecture specific variants for TLB Flushing
  and dirty log read (patches 1,2 & 3,4,5,6)
- rebased to 3.16.0-rc1
- Applied Christoffers comments.

Mario Smarduch (7):
  KVM: Add architecture-defined TLB flush support
  KVM: Add generic support for dirty page logging
  KVM: x86: flush TLBs last before returning from KVM_GET_DIRTY_LOG
  arm: KVM: Add ARMv7 API to flush TLBs
  arm: KVM: Add initial dirty page locking infrastructure
  arm: KVM: dirty log read write protect support
  arm: KVM: ARMv7 dirty page logging 2nd stage page fault

 arch/arm/include/asm/kvm_asm.h        |    1 +
 arch/arm/include/asm/kvm_host.h       |   14 +++
 arch/arm/include/asm/kvm_mmu.h        |   20 ++++
 arch/arm/include/asm/pgtable-3level.h |    1 +
 arch/arm/kvm/Kconfig                  |    1 +
 arch/arm/kvm/arm.c                    |   17 +++
 arch/arm/kvm/interrupts.S             |   11 ++
 arch/arm/kvm/mmu.c                    |  209 +++++++++++++++++++++++++++++++--
 arch/x86/kvm/x86.c                    |   22 ++--
 include/linux/kvm_host.h              |    9 ++
 virt/kvm/Kconfig                      |    3 +
 virt/kvm/kvm_main.c                   |   96 +++++++++++++++
 12 files changed, 385 insertions(+), 19 deletions(-)

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 1/7] KVM: Add architecture-defined TLB flush support
  2014-11-07  0:40 [PATCH v13 0/7] KVM/arm/x86: dirty page logging support for ARMv7 (3.17.0-rc1) Mario Smarduch
@ 2014-11-07  0:40 ` Mario Smarduch
  2014-11-07  9:39   ` Marc Zyngier
  2014-11-07  0:40 ` [PATCH v13 2/7] KVM: Add generic support for dirty page logging Mario Smarduch
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07  0:40 UTC (permalink / raw)
  To: linux-arm-kernel

Allow architectures to override the generic kvm_flush_remote_tlbs()
function via HAVE_KVM_ARCH_TLB_FLUSH_ALL. ARMv7 will need this to
provide its own TLB flush interface.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 virt/kvm/Kconfig    |    3 +++
 virt/kvm/kvm_main.c |    2 ++
 2 files changed, 5 insertions(+)

diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index fc0c5e6..3796a21 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -37,3 +37,6 @@ config HAVE_KVM_CPU_RELAX_INTERCEPT
 
 config KVM_VFIO
        bool
+
+config HAVE_KVM_ARCH_TLB_FLUSH_ALL
+       bool
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 33712fb..887df87 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -184,6 +184,7 @@ static bool make_all_cpus_request(struct kvm *kvm, unsigned int req)
 	return called;
 }
 
+#ifndef CONFIG_HAVE_KVM_ARCH_TLB_FLUSH_ALL
 void kvm_flush_remote_tlbs(struct kvm *kvm)
 {
 	long dirty_count = kvm->tlbs_dirty;
@@ -194,6 +195,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
 	cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);
 }
 EXPORT_SYMBOL_GPL(kvm_flush_remote_tlbs);
+#endif
 
 void kvm_reload_remote_mmus(struct kvm *kvm)
 {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v13 2/7] KVM: Add generic support for dirty page logging
  2014-11-07  0:40 [PATCH v13 0/7] KVM/arm/x86: dirty page logging support for ARMv7 (3.17.0-rc1) Mario Smarduch
  2014-11-07  0:40 ` [PATCH v13 1/7] KVM: Add architecture-defined TLB flush support Mario Smarduch
@ 2014-11-07  0:40 ` Mario Smarduch
  2014-11-07  9:07   ` Cornelia Huck
  2014-11-07  0:40 ` [PATCH v13 3/7] KVM: x86: flush TLBs last before returning from KVM_GET_DIRTY_LOG Mario Smarduch
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07  0:40 UTC (permalink / raw)
  To: linux-arm-kernel

kvm_get_dirty_log() provides generic handling of dirty bitmap, currently reused
by several architectures. Building on that we intrdoduce 
kvm_get_dirty_log_protect() adding write protection to mark these pages dirty
for future write access, before next KVM_GET_DIRTY_LOG ioctl call from user
space.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 include/linux/kvm_host.h |    9 +++++
 virt/kvm/kvm_main.c      |   95 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 104 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a4c33b3..c55dd75 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -590,6 +590,15 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext);
 
 int kvm_get_dirty_log(struct kvm *kvm,
 			struct kvm_dirty_log *log, int *is_dirty);
+
+int kvm_get_dirty_log_protect(struct kvm *kvm,
+			struct kvm_dirty_log *log, bool *is_dirty);
+
+void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
+					struct kvm_memory_slot *slot,
+					gfn_t gfn_offset,
+					unsigned long mask);
+
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 				struct kvm_dirty_log *log);
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 887df87..f017760 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -981,6 +981,101 @@ out:
 }
 EXPORT_SYMBOL_GPL(kvm_get_dirty_log);
 
+#if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_MIPS) || \
+    defined(CONFIG_IA64) || defined(CONFIG_X86) || defined(CONFIG_ARM) || \
+    defined(CONFIG_ARM64)
+/*
+ * For architectures that don't use kvm_get_dirty_log_protect() for dirty page
+ * logging, calling this function is illegal. Otherwise the function is defined
+ * in arch subtree and this restriction is removed.
+ */
+void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
+					struct kvm_memory_slot *slot,
+					gfn_t gfn_offset,
+					unsigned long mask)
+{
+	BUG();
+}
+#endif
+
+/**
+ * kvm_get_dirty_log_protect - get a snapshot of dirty pages, and if any pages
+ *	are dirty write protect them for next write.
+ * @kvm:	pointer to kvm instance
+ * @log:	slot id and address to which we copy the log
+ * @is_dirty:	flag set if any page is dirty
+ *
+ * We need to keep it in mind that VCPU threads can write to the bitmap
+ * concurrently. So, to avoid losing track of dirty pages we keep the
+ * following order:
+ *
+ *    1. Take a snapshot of the bit and clear it if needed.
+ *    2. Write protect the corresponding page.
+ *    3. Copy the snapshot to the userspace.
+ *    4. Upon return caller flushes TLB's if needed.
+ *
+ * Between 2 and 4, the guest may write to the page using the remaining TLB
+ * entry.  This is not a problem because the page is reported dirty using
+ * the snapshot taken before and step 4 ensures that writes done after
+ * exiting to userspace will be logged for the next call.
+ *
+ */
+int kvm_get_dirty_log_protect(struct kvm *kvm,
+			struct kvm_dirty_log *log, bool *is_dirty)
+{
+	struct kvm_memory_slot *memslot;
+	int r, i;
+	unsigned long n;
+	unsigned long *dirty_bitmap;
+	unsigned long *dirty_bitmap_buffer;
+
+	r = -EINVAL;
+	if (log->slot >= KVM_USER_MEM_SLOTS)
+		goto out;
+
+	memslot = id_to_memslot(kvm->memslots, log->slot);
+
+	dirty_bitmap = memslot->dirty_bitmap;
+	r = -ENOENT;
+	if (!dirty_bitmap)
+		goto out;
+
+	n = kvm_dirty_bitmap_bytes(memslot);
+
+	dirty_bitmap_buffer = dirty_bitmap + n / sizeof(long);
+	memset(dirty_bitmap_buffer, 0, n);
+
+	spin_lock(&kvm->mmu_lock);
+	*is_dirty = false;
+	for (i = 0; i < n / sizeof(long); i++) {
+		unsigned long mask;
+		gfn_t offset;
+
+		if (!dirty_bitmap[i])
+			continue;
+
+		*is_dirty = true;
+
+		mask = xchg(&dirty_bitmap[i], 0);
+		dirty_bitmap_buffer[i] = mask;
+
+		offset = i * BITS_PER_LONG;
+		kvm_arch_mmu_write_protect_pt_masked(kvm, memslot, offset,
+								mask);
+	}
+
+	spin_unlock(&kvm->mmu_lock);
+
+	r = -EFAULT;
+	if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n))
+		goto out;
+
+	r = 0;
+out:
+	return r;
+}
+EXPORT_SYMBOL_GPL(kvm_get_dirty_log_protect);
+
 bool kvm_largepages_enabled(void)
 {
 	return largepages_enabled;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v13 3/7] KVM: x86: flush TLBs last before returning from KVM_GET_DIRTY_LOG
  2014-11-07  0:40 [PATCH v13 0/7] KVM/arm/x86: dirty page logging support for ARMv7 (3.17.0-rc1) Mario Smarduch
  2014-11-07  0:40 ` [PATCH v13 1/7] KVM: Add architecture-defined TLB flush support Mario Smarduch
  2014-11-07  0:40 ` [PATCH v13 2/7] KVM: Add generic support for dirty page logging Mario Smarduch
@ 2014-11-07  0:40 ` Mario Smarduch
  2014-11-07  7:44   ` Paolo Bonzini
  2014-11-07  0:40 ` [PATCH v13 4/7] arm: KVM: Add ARMv7 API to flush TLBs Mario Smarduch
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07  0:40 UTC (permalink / raw)
  To: linux-arm-kernel

In the next patches, we will move parts of x86's kvm_vm_ioctl_get_dirty_log
implementation to generic code; leave the arch-specific code at the end,
similar to the existing generic function kvm_get_dirty_log.

Reviewed-by: Mario Smarduch <m.smarduch@samsung.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/x86.c |   22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8f1e22d..dc8e66b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3606,13 +3606,13 @@ static int kvm_vm_ioctl_reinject(struct kvm *kvm,
  *
  *   1. Take a snapshot of the bit and clear it if needed.
  *   2. Write protect the corresponding page.
- *   3. Flush TLB's if needed.
- *   4. Copy the snapshot to the userspace.
+ *   3. Copy the snapshot to the userspace.
+ *   4. Flush TLB's if needed.
  *
- * Between 2 and 3, the guest may write to the page using the remaining TLB
- * entry.  This is not a problem because the page will be reported dirty at
- * step 4 using the snapshot taken before and step 3 ensures that successive
- * writes will be logged for the next call.
+ * Between 2 and 4, the guest may write to the page using the remaining TLB
+ * entry.  This is not a problem because the page is reported dirty using
+ * the snapshot taken before and step 4 ensures that writes done after
+ * exiting to userspace will be logged for the next call.
  */
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 {
@@ -3661,6 +3661,10 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 
 	spin_unlock(&kvm->mmu_lock);
 
+	r = 0;
+	if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n))
+		r = -EFAULT;
+
 	/* See the comments in kvm_mmu_slot_remove_write_access(). */
 	lockdep_assert_held(&kvm->slots_lock);
 
@@ -3670,12 +3674,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 	 */
 	if (is_dirty)
 		kvm_flush_remote_tlbs(kvm);
-
-	r = -EFAULT;
-	if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n))
-		goto out;
-
-	r = 0;
 out:
 	mutex_unlock(&kvm->slots_lock);
 	return r;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v13 4/7] arm: KVM: Add ARMv7 API to flush TLBs
  2014-11-07  0:40 [PATCH v13 0/7] KVM/arm/x86: dirty page logging support for ARMv7 (3.17.0-rc1) Mario Smarduch
                   ` (2 preceding siblings ...)
  2014-11-07  0:40 ` [PATCH v13 3/7] KVM: x86: flush TLBs last before returning from KVM_GET_DIRTY_LOG Mario Smarduch
@ 2014-11-07  0:40 ` Mario Smarduch
  2014-11-07  9:44   ` Marc Zyngier
  2014-11-07 20:18   ` Christoffer Dall
  2014-11-07  0:40 ` [PATCH v13 5/7] arm: KVM: Add initial dirty page locking infrastructure Mario Smarduch
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07  0:40 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds ARMv7 architecture TLB Flush function.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_asm.h  |    1 +
 arch/arm/include/asm/kvm_host.h |   12 ++++++++++++
 arch/arm/kvm/Kconfig            |    1 +
 arch/arm/kvm/interrupts.S       |   11 +++++++++++
 4 files changed, 25 insertions(+)

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 3a67bec..25410b2 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -96,6 +96,7 @@ extern char __kvm_hyp_code_end[];
 
 extern void __kvm_flush_vm_context(void);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
+extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 #endif
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 6dfb404..fad598c 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -220,6 +220,18 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
+/**
+ * kvm_arch_flush_remote_tlbs() - flush all VM TLB entries
+ * @kvm:	pointer to kvm structure.
+ *
+ * Interface to HYP function to flush all VM TLB entries without address
+ * parameter.
+ */
+static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
+{
+	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
+}
+
 static inline int kvm_arch_dev_ioctl_check_extension(long ext)
 {
 	return 0;
diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 466bd29..a099df4 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -23,6 +23,7 @@ config KVM
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
 	select KVM_MMIO
 	select KVM_ARM_HOST
+	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
 	depends on ARM_VIRT_EXT && ARM_LPAE
 	---help---
 	  Support hosting virtualized guest machines. You will also
diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
index 01dcb0e..79caf79 100644
--- a/arch/arm/kvm/interrupts.S
+++ b/arch/arm/kvm/interrupts.S
@@ -66,6 +66,17 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
 	bx	lr
 ENDPROC(__kvm_tlb_flush_vmid_ipa)
 
+/**
+ * void __kvm_tlb_flush_vmid(struct kvm *kvm) - Flush per-VMID TLBs
+ *
+ * Reuses __kvm_tlb_flush_vmid_ipa() for ARMv7, without passing address
+ * parameter
+ */
+
+ENTRY(__kvm_tlb_flush_vmid)
+	b	__kvm_tlb_flush_vmid_ipa
+ENDPROC(__kvm_tlb_flush_vmid)
+
 /********************************************************************
  * Flush TLBs and instruction caches of all CPUs inside the inner-shareable
  * domain, for all VMIDs
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v13 5/7] arm: KVM: Add initial dirty page locking infrastructure
  2014-11-07  0:40 [PATCH v13 0/7] KVM/arm/x86: dirty page logging support for ARMv7 (3.17.0-rc1) Mario Smarduch
                   ` (3 preceding siblings ...)
  2014-11-07  0:40 ` [PATCH v13 4/7] arm: KVM: Add ARMv7 API to flush TLBs Mario Smarduch
@ 2014-11-07  0:40 ` Mario Smarduch
  2014-11-07 10:15   ` Marc Zyngier
  2014-11-07  0:40 ` [PATCH v13 6/7] arm: KVM: dirty log read write protect support Mario Smarduch
  2014-11-07  0:40 ` [PATCH v13 7/7] arm: KVM: ARMv7 dirty page logging 2nd stage page fault Mario Smarduch
  6 siblings, 1 reply; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07  0:40 UTC (permalink / raw)
  To: linux-arm-kernel

Add support for initial write protection of VM memlsot. This patch
series assumes that huge PUDs will not be used in 2nd stage tables, which is
awlays valid on ARMv7.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_host.h       |    2 +
 arch/arm/include/asm/kvm_mmu.h        |   20 +++++
 arch/arm/include/asm/pgtable-3level.h |    1 +
 arch/arm/kvm/mmu.c                    |  140 +++++++++++++++++++++++++++++++++
 4 files changed, 163 insertions(+)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index fad598c..72510ca 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -245,4 +245,6 @@ static inline void vgic_arch_setup(const struct vgic_params *vgic)
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
+
 #endif /* __ARM_KVM_HOST_H__ */
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 5cc0b0f..08ab5e8 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -114,6 +114,26 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
 	pmd_val(*pmd) |= L_PMD_S2_RDWR;
 }
 
+static inline void kvm_set_s2pte_readonly(pte_t *pte)
+{
+	pte_val(*pte) = (pte_val(*pte) & ~L_PTE_S2_RDWR) | L_PTE_S2_RDONLY;
+}
+
+static inline bool kvm_s2pte_readonly(pte_t *pte)
+{
+	return (pte_val(*pte) & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY;
+}
+
+static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
+{
+	pmd_val(*pmd) = (pmd_val(*pmd) & ~L_PMD_S2_RDWR) | L_PMD_S2_RDONLY;
+}
+
+static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
+{
+	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
+}
+
 /* Open coded p*d_addr_end that can deal with 64bit addresses */
 #define kvm_pgd_addr_end(addr, end)					\
 ({	u64 __boundary = ((addr) + PGDIR_SIZE) & PGDIR_MASK;		\
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 06e0bc0..d29c880 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -130,6 +130,7 @@
 #define L_PTE_S2_RDONLY			(_AT(pteval_t, 1) << 6)   /* HAP[1]   */
 #define L_PTE_S2_RDWR			(_AT(pteval_t, 3) << 6)   /* HAP[2:1] */
 
+#define L_PMD_S2_RDONLY			(_AT(pmdval_t, 1) << 6)   /* HAP[1]   */
 #define L_PMD_S2_RDWR			(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 
 /*
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 16e7994..3b86522 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -45,6 +45,7 @@ static phys_addr_t hyp_idmap_vector;
 #define pgd_order get_order(PTRS_PER_PGD * sizeof(pgd_t))
 
 #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
+#define kvm_pud_huge(_x)	pud_huge(_x)
 
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
@@ -746,6 +747,133 @@ static bool transparent_hugepage_adjust(pfn_t *pfnp, phys_addr_t *ipap)
 	return false;
 }
 
+#ifdef CONFIG_ARM
+/**
+ * stage2_wp_ptes - write protect PMD range
+ * @pmd:	pointer to pmd entry
+ * @addr:	range start address
+ * @end:	range end address
+ */
+static void stage2_wp_ptes(pmd_t *pmd, phys_addr_t addr, phys_addr_t end)
+{
+	pte_t *pte;
+
+	pte = pte_offset_kernel(pmd, addr);
+	do {
+		if (!pte_none(*pte)) {
+			if (!kvm_s2pte_readonly(pte))
+				kvm_set_s2pte_readonly(pte);
+			}
+	} while (pte++, addr += PAGE_SIZE, addr != end);
+}
+
+/**
+ * stage2_wp_pmds - write protect PUD range
+ * @pud:	pointer to pud entry
+ * @addr:	range start address
+ * @end:	range end address
+ */
+static void stage2_wp_pmds(pud_t *pud, phys_addr_t addr, phys_addr_t end)
+{
+	pmd_t *pmd;
+	phys_addr_t next;
+
+	pmd = pmd_offset(pud, addr);
+
+	do {
+		next = kvm_pmd_addr_end(addr, end);
+		if (!pmd_none(*pmd)) {
+			if (kvm_pmd_huge(*pmd)) {
+				if (!kvm_s2pmd_readonly(pmd))
+					kvm_set_s2pmd_readonly(pmd);
+			} else {
+				stage2_wp_ptes(pmd, addr, next);
+			}
+		}
+	} while (pmd++, addr = next, addr != end);
+}
+
+/**
+  * stage2_wp_puds - write protect PGD range
+  * @kvm:	pointer to kvm structure
+  * @pud:	pointer to pgd entry
+  * @addr:	range start address
+  * @end:	range end address
+  *
+  * While walking the PUD range huge PUD pages are ignored.
+  */
+static void  stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
+					phys_addr_t addr, phys_addr_t end)
+{
+	pud_t *pud;
+	phys_addr_t next;
+
+	pud = pud_offset(pgd, addr);
+	do {
+		next = kvm_pud_addr_end(addr, end);
+		if (!pud_none(*pud)) {
+			/* TODO:PUD not supported, revisit later if supported */
+			BUG_ON(kvm_pud_huge(*pud));
+			stage2_wp_pmds(pud, addr, next);
+		}
+	} while (pud++, addr = next, addr != end);
+}
+
+/**
+ * stage2_wp_range() - write protect stage2 memory region range
+ * @kvm:	The KVM pointer
+ * @addr:	Start address of range
+ * @end:	End address of range
+ */
+static void stage2_wp_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
+{
+	pgd_t *pgd;
+	phys_addr_t next;
+
+	pgd = kvm->arch.pgd + pgd_index(addr);
+	do {
+		/*
+		 * Release kvm_mmu_lock periodically if the memory region is
+		 * large. Otherwise, we may see kernel panics with
+		 * CONFIG_DETECT_HUNG_TASK, CONFIG_LOCK_DETECTOR,
+		 * CONFIG_LOCK_DEP. Additionally, holding the lock too long
+		 * will also starve other vCPUs.
+		 */
+		if (need_resched() || spin_needbreak(&kvm->mmu_lock))
+			cond_resched_lock(&kvm->mmu_lock);
+
+		next = kvm_pgd_addr_end(addr, end);
+		if (pgd_present(*pgd))
+			stage2_wp_puds(kvm, pgd, addr, next);
+	} while (pgd++, addr = next, addr != end);
+}
+
+/**
+ * kvm_mmu_wp_memory_region() - write protect stage 2 entries for memory slot
+ * @kvm:	The KVM pointer
+ * @slot:	The memory slot to write protect
+ *
+ * Called to start logging dirty pages after memory region
+ * KVM_MEM_LOG_DIRTY_PAGES operation is called. After this function returns
+ * all present PMD and PTEs are write protected in the memory region.
+ * Afterwards read of dirty page log can be called.
+ *
+ * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
+ * serializing operations for VM memory regions.
+ */
+void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
+{
+	struct kvm_memory_slot *memslot = id_to_memslot(kvm->memslots, slot);
+	phys_addr_t start = memslot->base_gfn << PAGE_SHIFT;
+	phys_addr_t end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT;
+
+	spin_lock(&kvm->mmu_lock);
+	stage2_wp_range(kvm, start, end);
+	spin_unlock(&kvm->mmu_lock);
+	kvm_flush_remote_tlbs(kvm);
+}
+#endif
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot,
 			  unsigned long fault_status)
@@ -1129,6 +1257,18 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 		unmap_stage2_range(kvm, gpa, size);
 		spin_unlock(&kvm->mmu_lock);
 	}
+
+#ifdef CONFIG_ARM
+	/*
+	 * At this point memslot has been committed and there is an
+	 * allocated dirty_bitmap[], dirty pages will be be tracked while the
+	 * memory slot is write protected.
+	 */
+	if ((change != KVM_MR_DELETE) && change != KVM_MR_MOVE &&
+					(mem->flags & KVM_MEM_LOG_DIRTY_PAGES))
+		kvm_mmu_wp_memory_region(kvm, mem->slot);
+#endif
+
 }
 
 int kvm_arch_prepare_memory_region(struct kvm *kvm,
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v13 6/7] arm: KVM: dirty log read write protect support
  2014-11-07  0:40 [PATCH v13 0/7] KVM/arm/x86: dirty page logging support for ARMv7 (3.17.0-rc1) Mario Smarduch
                   ` (4 preceding siblings ...)
  2014-11-07  0:40 ` [PATCH v13 5/7] arm: KVM: Add initial dirty page locking infrastructure Mario Smarduch
@ 2014-11-07  0:40 ` Mario Smarduch
  2014-11-07  7:38   ` Paolo Bonzini
  2014-11-07 10:19   ` Marc Zyngier
  2014-11-07  0:40 ` [PATCH v13 7/7] arm: KVM: ARMv7 dirty page logging 2nd stage page fault Mario Smarduch
  6 siblings, 2 replies; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07  0:40 UTC (permalink / raw)
  To: linux-arm-kernel

Add support to track dirty pages between user space KVM_GET_DIRTY_LOG ioctl
calls. We call kvm_get_dirty_log_protect() function to do most of the work.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/kvm/arm.c  |   37 +++++++++++++++++++++++++++++++++++++
 arch/arm/kvm/mmu.c  |   22 ++++++++++++++++++++++
 virt/kvm/kvm_main.c |    3 +--
 3 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index a99e0cd..212d835 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -737,9 +737,46 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	}
 }
 
+/**
+ * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
+ * @kvm:	kvm instance
+ * @log:	slot id and address to which we copy the log
+ *
+ * We need to keep it in mind that VCPU threads can write to the bitmap
+ * concurrently.  So, to avoid losing data, we keep the following order for
+ * each bit:
+ *
+ *   1. Take a snapshot of the bit and clear it if needed.
+ *   2. Write protect the corresponding page.
+ *   3. Copy the snapshot to the userspace.
+ *   4. Flush TLB's if needed.
+ *
+ * Steps 1,2,3 are handled by kvm_get_dirty_log_protect().
+ * Between 2 and 4, the guest may write to the page using the remaining TLB
+ * entry.  This is not a problem because the page is reported dirty using
+ * the snapshot taken before and step 4 ensures that writes done after
+ * exiting to userspace will be logged for the next call.
+ */
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 {
+#ifdef CONFIG_ARM
+	int r;
+	bool is_dirty = false;
+
+	mutex_lock(&kvm->slots_lock);
+
+	r = kvm_get_dirty_log_protect(kvm, log, &is_dirty);
+	if (r)
+		goto out;
+
+	if (is_dirty)
+		kvm_flush_remote_tlbs(kvm);
+out:
+	mutex_unlock(&kvm->slots_lock);
+	return r;
+#else /* ARM64 */
 	return -EINVAL;
+#endif
 }
 
 static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 3b86522..2f5131e 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -872,6 +872,28 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
 	spin_unlock(&kvm->mmu_lock);
 	kvm_flush_remote_tlbs(kvm);
 }
+
+/**
+ * kvm_arch_mmu_write_protect_pt_masked() - write protect dirty pages
+ * @kvm:	The KVM pointer
+ * @slot:	The memory slot associated with mask
+ * @gfn_offset:	The gfn offset in memory slot
+ * @mask:	The mask of dirty pages at offset 'gfn_offset' in this memory
+ *		slot to be write protected
+ *
+ * Walks bits set in mask write protects the associated pte's. Caller must
+ * acquire kvm_mmu_lock.
+ */
+void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
+		struct kvm_memory_slot *slot,
+		gfn_t gfn_offset, unsigned long mask)
+{
+	phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
+	phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
+	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
+
+	stage2_wp_range(kvm, start, end);
+}
 #endif
 
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f017760..c80dd2f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -982,8 +982,7 @@ out:
 EXPORT_SYMBOL_GPL(kvm_get_dirty_log);
 
 #if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_MIPS) || \
-    defined(CONFIG_IA64) || defined(CONFIG_X86) || defined(CONFIG_ARM) || \
-    defined(CONFIG_ARM64)
+    defined(CONFIG_IA64) || defined(CONFIG_X86) || defined(CONFIG_ARM64)
 /*
  * For architectures that don't use kvm_get_dirty_log_protect() for dirty page
  * logging, calling this function is illegal. Otherwise the function is defined
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v13 7/7] arm: KVM: ARMv7 dirty page logging 2nd stage page fault
  2014-11-07  0:40 [PATCH v13 0/7] KVM/arm/x86: dirty page logging support for ARMv7 (3.17.0-rc1) Mario Smarduch
                   ` (5 preceding siblings ...)
  2014-11-07  0:40 ` [PATCH v13 6/7] arm: KVM: dirty log read write protect support Mario Smarduch
@ 2014-11-07  0:40 ` Mario Smarduch
  2014-11-07 10:33   ` Marc Zyngier
  6 siblings, 1 reply; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07  0:40 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds support for handling 2nd stage page faults during migration,
it disables faulting in huge pages, and dissolves huge pages to page tables.
In case migration is canceled huge pages are used again.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/kvm/mmu.c |   47 ++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 40 insertions(+), 7 deletions(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 2f5131e..d511fc0 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -47,6 +47,15 @@ static phys_addr_t hyp_idmap_vector;
 #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
 #define kvm_pud_huge(_x)	pud_huge(_x)
 
+static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
+{
+#ifdef CONFIG_ARM
+	return !!memslot->dirty_bitmap;
+#else
+	return false;
+#endif
+}
+
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
 	/*
@@ -626,7 +635,8 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
+			  phys_addr_t addr, const pte_t *new_pte, bool iomap,
+			  bool logging_active)
 {
 	pmd_t *pmd;
 	pte_t *pte, old_pte;
@@ -641,6 +651,18 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 		return 0;
 	}
 
+	/*
+	 * While dirty memory logging, clear PMD entry for huge page and split
+	 * into smaller pages, to track dirty memory at page granularity.
+	 */
+	if (logging_active && kvm_pmd_huge(*pmd)) {
+		phys_addr_t ipa = pmd_pfn(*pmd) << PAGE_SHIFT;
+
+		pmd_clear(pmd);
+		kvm_tlb_flush_vmid_ipa(kvm, ipa);
+		put_page(virt_to_page(pmd));
+	}
+
 	/* Create stage-2 page mappings - Level 2 */
 	if (pmd_none(*pmd)) {
 		if (!cache)
@@ -693,7 +715,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
 		if (ret)
 			goto out;
 		spin_lock(&kvm->mmu_lock);
-		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
+		ret = stage2_set_pte(kvm, &cache, addr, &pte, true, false);
 		spin_unlock(&kvm->mmu_lock);
 		if (ret)
 			goto out;
@@ -910,6 +932,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	struct vm_area_struct *vma;
 	pfn_t pfn;
 	pgprot_t mem_type = PAGE_S2;
+	bool logging_active = kvm_get_logging_state(memslot);
 
 	write_fault = kvm_is_write_fault(kvm_vcpu_get_hsr(vcpu));
 	if (fault_status == FSC_PERM && !write_fault) {
@@ -920,7 +943,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	/* Let's check if we will get back a huge page backed by hugetlbfs */
 	down_read(&current->mm->mmap_sem);
 	vma = find_vma_intersection(current->mm, hva, hva + 1);
-	if (is_vm_hugetlb_page(vma)) {
+	if (is_vm_hugetlb_page(vma) && !logging_active) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
 	} else {
@@ -966,7 +989,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	spin_lock(&kvm->mmu_lock);
 	if (mmu_notifier_retry(kvm, mmu_seq))
 		goto out_unlock;
-	if (!hugetlb && !force_pte)
+	if (!hugetlb && !force_pte && !logging_active)
 		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
 
 	if (hugetlb) {
@@ -986,10 +1009,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		}
 		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE);
 		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
-				     mem_type == PAGE_S2_DEVICE);
+					mem_type == PAGE_S2_DEVICE,
+					logging_active);
 	}
 
-
+	if (write_fault)
+		mark_page_dirty(kvm, gfn);
 out_unlock:
 	spin_unlock(&kvm->mmu_lock);
 	kvm_release_pfn_clean(pfn);
@@ -1139,7 +1164,15 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
 {
 	pte_t *pte = (pte_t *)data;
 
-	stage2_set_pte(kvm, NULL, gpa, pte, false);
+	/*
+	 * We can always call stage2_set_pte with logging_active == false,
+	 * because MMU notifiers will have unmapped a huge PMD before calling
+	 * ->change_pte() (which in turn calls kvm_set_spte_hva()) and therefore
+	 * stage2_set_pte() never needs to clear out a huge PMD through this
+	 * calling path.
+	 */
+
+	stage2_set_pte(kvm, NULL, gpa, pte, false, false);
 }
 
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v13 6/7] arm: KVM: dirty log read write protect support
  2014-11-07  0:40 ` [PATCH v13 6/7] arm: KVM: dirty log read write protect support Mario Smarduch
@ 2014-11-07  7:38   ` Paolo Bonzini
  2014-11-07 19:47     ` Mario Smarduch
  2014-11-07 10:19   ` Marc Zyngier
  1 sibling, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2014-11-07  7:38 UTC (permalink / raw)
  To: linux-arm-kernel



On 07/11/2014 01:40, Mario Smarduch wrote:
> Add support to track dirty pages between user space KVM_GET_DIRTY_LOG ioctl
> calls. We call kvm_get_dirty_log_protect() function to do most of the work.
> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/kvm/arm.c  |   37 +++++++++++++++++++++++++++++++++++++
>  arch/arm/kvm/mmu.c  |   22 ++++++++++++++++++++++
>  virt/kvm/kvm_main.c |    3 +--
>  3 files changed, 60 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index a99e0cd..212d835 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -737,9 +737,46 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  	}
>  }
>  
> +/**
> + * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
> + * @kvm:	kvm instance
> + * @log:	slot id and address to which we copy the log
> + *
> + * We need to keep it in mind that VCPU threads can write to the bitmap
> + * concurrently.  So, to avoid losing data, we keep the following order for
> + * each bit:
> + *
> + *   1. Take a snapshot of the bit and clear it if needed.
> + *   2. Write protect the corresponding page.
> + *   3. Copy the snapshot to the userspace.
> + *   4. Flush TLB's if needed.
> + *
> + * Steps 1,2,3 are handled by kvm_get_dirty_log_protect().
> + * Between 2 and 4, the guest may write to the page using the remaining TLB
> + * entry.  This is not a problem because the page is reported dirty using
> + * the snapshot taken before and step 4 ensures that writes done after
> + * exiting to userspace will be logged for the next call.
> + */
>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  {
> +#ifdef CONFIG_ARM
> +	int r;
> +	bool is_dirty = false;
> +
> +	mutex_lock(&kvm->slots_lock);
> +
> +	r = kvm_get_dirty_log_protect(kvm, log, &is_dirty);
> +	if (r)
> +		goto out;
> +
> +	if (is_dirty)
> +		kvm_flush_remote_tlbs(kvm);

Should the caller should always flush TLBs if is_dirty is true, even if
kvm_get_dirty_log_protect reported an error?  That can happen if the
error occurred in the final copy to userspace, after page tables have
been modified.

Of course, in this case userspace cannot use the dirty log anymore since
it has been irrimediably trashed.

Paolo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 3/7] KVM: x86: flush TLBs last before returning from KVM_GET_DIRTY_LOG
  2014-11-07  0:40 ` [PATCH v13 3/7] KVM: x86: flush TLBs last before returning from KVM_GET_DIRTY_LOG Mario Smarduch
@ 2014-11-07  7:44   ` Paolo Bonzini
  2014-11-07 19:50     ` Mario Smarduch
  0 siblings, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2014-11-07  7:44 UTC (permalink / raw)
  To: linux-arm-kernel



On 07/11/2014 01:40, Mario Smarduch wrote:
> In the next patches, we will move parts of x86's kvm_vm_ioctl_get_dirty_log
> implementation to generic code; leave the arch-specific code at the end,
> similar to the existing generic function kvm_get_dirty_log.
> 
> Reviewed-by: Mario Smarduch <m.smarduch@samsung.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

You should have a "Signed-off-by: Mario Smarduch ..." here, and you 
should also add my authorship.  You can do the latter with "git rebase 
-i", adding

  x git commit --amend -C HEAD --author 'Paolo Bonzini <pbonzini@redhat.com>'

after this patch.  If you're using a patch queue instead (quilt or
similar) you can just edit the "From" line in the patch.

I guess if you reply to the patch with just

   Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>

the ARM maintainers can do the above for you, if there's no need for
a v14.

I'll send an eighth patch to actually switch x86 to the new function.
Again, the maintainers can apply it in the right place, but please
include it yourself if you have to do a v14.

Thanks,

Paolo

> ---
>  arch/x86/kvm/x86.c |   22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 8f1e22d..dc8e66b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3606,13 +3606,13 @@ static int kvm_vm_ioctl_reinject(struct kvm *kvm,
>   *
>   *   1. Take a snapshot of the bit and clear it if needed.
>   *   2. Write protect the corresponding page.
> - *   3. Flush TLB's if needed.
> - *   4. Copy the snapshot to the userspace.
> + *   3. Copy the snapshot to the userspace.
> + *   4. Flush TLB's if needed.
>   *
> - * Between 2 and 3, the guest may write to the page using the remaining TLB
> - * entry.  This is not a problem because the page will be reported dirty at
> - * step 4 using the snapshot taken before and step 3 ensures that successive
> - * writes will be logged for the next call.
> + * Between 2 and 4, the guest may write to the page using the remaining TLB
> + * entry.  This is not a problem because the page is reported dirty using
> + * the snapshot taken before and step 4 ensures that writes done after
> + * exiting to userspace will be logged for the next call.
>   */
>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  {
> @@ -3661,6 +3661,10 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  
>  	spin_unlock(&kvm->mmu_lock);
>  
> +	r = 0;
> +	if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n))
> +		r = -EFAULT;
> +
>  	/* See the comments in kvm_mmu_slot_remove_write_access(). */
>  	lockdep_assert_held(&kvm->slots_lock);
>  
> @@ -3670,12 +3674,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  	 */
>  	if (is_dirty)
>  		kvm_flush_remote_tlbs(kvm);
> -
> -	r = -EFAULT;
> -	if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n))
> -		goto out;
> -
> -	r = 0;
>  out:
>  	mutex_unlock(&kvm->slots_lock);
>  	return r;
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 2/7] KVM: Add generic support for dirty page logging
  2014-11-07  0:40 ` [PATCH v13 2/7] KVM: Add generic support for dirty page logging Mario Smarduch
@ 2014-11-07  9:07   ` Cornelia Huck
  2014-11-07  9:26     ` Paolo Bonzini
  2014-11-07 18:55     ` Mario Smarduch
  0 siblings, 2 replies; 29+ messages in thread
From: Cornelia Huck @ 2014-11-07  9:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 06 Nov 2014 16:40:43 -0800
Mario Smarduch <m.smarduch@samsung.com> wrote:

> kvm_get_dirty_log() provides generic handling of dirty bitmap, currently reused
> by several architectures. Building on that we intrdoduce 
> kvm_get_dirty_log_protect() adding write protection to mark these pages dirty
> for future write access, before next KVM_GET_DIRTY_LOG ioctl call from user
> space.
> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  include/linux/kvm_host.h |    9 +++++
>  virt/kvm/kvm_main.c      |   95 ++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 104 insertions(+)
> 

> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 887df87..f017760 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -981,6 +981,101 @@ out:
>  }
>  EXPORT_SYMBOL_GPL(kvm_get_dirty_log);
> 
> +#if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_MIPS) || \
> +    defined(CONFIG_IA64) || defined(CONFIG_X86) || defined(CONFIG_ARM) || \
> +    defined(CONFIG_ARM64)

Does this deserve a config symbol that can be selected by architectures
actually using kvm_get_dirty_log_protect()? I.e.,

#ifndef CONFIG_KVM_ARCH_DIRTY_LOG_PROTECT

or so?

> +/*
> + * For architectures that don't use kvm_get_dirty_log_protect() for dirty page
> + * logging, calling this function is illegal. Otherwise the function is defined
> + * in arch subtree and this restriction is removed.
> + */

What about

/*
 * For architectures that don't use kvm_get_dirty_log_protect() for dirty page
 * logging, we must never end up calling this function. Architectures that do
 * use it define their own version of this function.
 */

instead?

> +void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
> +					struct kvm_memory_slot *slot,
> +					gfn_t gfn_offset,
> +					unsigned long mask)
> +{
> +	BUG();
> +}
> +#endif
> +
(...)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 2/7] KVM: Add generic support for dirty page logging
  2014-11-07  9:07   ` Cornelia Huck
@ 2014-11-07  9:26     ` Paolo Bonzini
  2014-11-07 18:55     ` Mario Smarduch
  1 sibling, 0 replies; 29+ messages in thread
From: Paolo Bonzini @ 2014-11-07  9:26 UTC (permalink / raw)
  To: linux-arm-kernel



On 07/11/2014 10:07, Cornelia Huck wrote:
>> > +#if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_MIPS) || \
>> > +    defined(CONFIG_IA64) || defined(CONFIG_X86) || defined(CONFIG_ARM) || \
>> > +    defined(CONFIG_ARM64)
> Does this deserve a config symbol that can be selected by architectures
> actually using kvm_get_dirty_log_protect()? I.e.,
> 
> #ifndef CONFIG_KVM_ARCH_DIRTY_LOG_PROTECT
> 
> or so?

Yes, either that or invert the #if to negative logic.

Paolo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 1/7] KVM: Add architecture-defined TLB flush support
  2014-11-07  0:40 ` [PATCH v13 1/7] KVM: Add architecture-defined TLB flush support Mario Smarduch
@ 2014-11-07  9:39   ` Marc Zyngier
  0 siblings, 0 replies; 29+ messages in thread
From: Marc Zyngier @ 2014-11-07  9:39 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/11/14 00:40, Mario Smarduch wrote:
> Allow architectures to override the generic kvm_flush_remote_tlbs()
> function via HAVE_KVM_ARCH_TLB_FLUSH_ALL. ARMv7 will need this to
> provide its own TLB flush interface.
> 
> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

	M.

> ---
>  virt/kvm/Kconfig    |    3 +++
>  virt/kvm/kvm_main.c |    2 ++
>  2 files changed, 5 insertions(+)
> 
> diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> index fc0c5e6..3796a21 100644
> --- a/virt/kvm/Kconfig
> +++ b/virt/kvm/Kconfig
> @@ -37,3 +37,6 @@ config HAVE_KVM_CPU_RELAX_INTERCEPT
>  
>  config KVM_VFIO
>         bool
> +
> +config HAVE_KVM_ARCH_TLB_FLUSH_ALL
> +       bool
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 33712fb..887df87 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -184,6 +184,7 @@ static bool make_all_cpus_request(struct kvm *kvm, unsigned int req)
>  	return called;
>  }
>  
> +#ifndef CONFIG_HAVE_KVM_ARCH_TLB_FLUSH_ALL
>  void kvm_flush_remote_tlbs(struct kvm *kvm)
>  {
>  	long dirty_count = kvm->tlbs_dirty;
> @@ -194,6 +195,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
>  	cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);
>  }
>  EXPORT_SYMBOL_GPL(kvm_flush_remote_tlbs);
> +#endif
>  
>  void kvm_reload_remote_mmus(struct kvm *kvm)
>  {
> 


-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 4/7] arm: KVM: Add ARMv7 API to flush TLBs
  2014-11-07  0:40 ` [PATCH v13 4/7] arm: KVM: Add ARMv7 API to flush TLBs Mario Smarduch
@ 2014-11-07  9:44   ` Marc Zyngier
  2014-11-07 18:58     ` Mario Smarduch
  2014-11-07 20:18   ` Christoffer Dall
  1 sibling, 1 reply; 29+ messages in thread
From: Marc Zyngier @ 2014-11-07  9:44 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/11/14 00:40, Mario Smarduch wrote:
> This patch adds ARMv7 architecture TLB Flush function.
> 
> Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/include/asm/kvm_asm.h  |    1 +
>  arch/arm/include/asm/kvm_host.h |   12 ++++++++++++
>  arch/arm/kvm/Kconfig            |    1 +
>  arch/arm/kvm/interrupts.S       |   11 +++++++++++
>  4 files changed, 25 insertions(+)
> 
> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
> index 3a67bec..25410b2 100644
> --- a/arch/arm/include/asm/kvm_asm.h
> +++ b/arch/arm/include/asm/kvm_asm.h
> @@ -96,6 +96,7 @@ extern char __kvm_hyp_code_end[];
>  
>  extern void __kvm_flush_vm_context(void);
>  extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
> +extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>  
>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>  #endif
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 6dfb404..fad598c 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -220,6 +220,18 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>  	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
>  }
>  
> +/**
> + * kvm_arch_flush_remote_tlbs() - flush all VM TLB entries
> + * @kvm:	pointer to kvm structure.
> + *
> + * Interface to HYP function to flush all VM TLB entries without address
> + * parameter.
> + */
> +static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
> +{
> +	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
> +}
> +
>  static inline int kvm_arch_dev_ioctl_check_extension(long ext)
>  {
>  	return 0;
> diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
> index 466bd29..a099df4 100644
> --- a/arch/arm/kvm/Kconfig
> +++ b/arch/arm/kvm/Kconfig
> @@ -23,6 +23,7 @@ config KVM
>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
>  	select KVM_MMIO
>  	select KVM_ARM_HOST
> +	select HAVE_KVM_ARCH_TLB_FLUSH_ALL

[utter nit] Move this up by two lines so that the "HAVE_KVM_*" options
are together. The ARM tree has a tendency to sort the config options by
alphabetical order (though that's clearly not enforced in this file...).

>  	depends on ARM_VIRT_EXT && ARM_LPAE
>  	---help---
>  	  Support hosting virtualized guest machines. You will also
> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> index 01dcb0e..79caf79 100644
> --- a/arch/arm/kvm/interrupts.S
> +++ b/arch/arm/kvm/interrupts.S
> @@ -66,6 +66,17 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
>  	bx	lr
>  ENDPROC(__kvm_tlb_flush_vmid_ipa)
>  
> +/**
> + * void __kvm_tlb_flush_vmid(struct kvm *kvm) - Flush per-VMID TLBs
> + *
> + * Reuses __kvm_tlb_flush_vmid_ipa() for ARMv7, without passing address
> + * parameter
> + */
> +
> +ENTRY(__kvm_tlb_flush_vmid)
> +	b	__kvm_tlb_flush_vmid_ipa
> +ENDPROC(__kvm_tlb_flush_vmid)
> +
>  /********************************************************************
>   * Flush TLBs and instruction caches of all CPUs inside the inner-shareable
>   * domain, for all VMIDs
> 

Other than that:

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 5/7] arm: KVM: Add initial dirty page locking infrastructure
  2014-11-07  0:40 ` [PATCH v13 5/7] arm: KVM: Add initial dirty page locking infrastructure Mario Smarduch
@ 2014-11-07 10:15   ` Marc Zyngier
  2014-11-07 19:07     ` Mario Smarduch
  0 siblings, 1 reply; 29+ messages in thread
From: Marc Zyngier @ 2014-11-07 10:15 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/11/14 00:40, Mario Smarduch wrote:
> Add support for initial write protection of VM memlsot. This patch
                                                 memslots
> series assumes that huge PUDs will not be used in 2nd stage tables, which is
> awlays valid on ARMv7.
  always
> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/include/asm/kvm_host.h       |    2 +
>  arch/arm/include/asm/kvm_mmu.h        |   20 +++++
>  arch/arm/include/asm/pgtable-3level.h |    1 +
>  arch/arm/kvm/mmu.c                    |  140 +++++++++++++++++++++++++++++++++
>  4 files changed, 163 insertions(+)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index fad598c..72510ca 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -245,4 +245,6 @@ static inline void vgic_arch_setup(const struct vgic_params *vgic)
>  int kvm_perf_init(void);
>  int kvm_perf_teardown(void);
>  
> +void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
> +
>  #endif /* __ARM_KVM_HOST_H__ */
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 5cc0b0f..08ab5e8 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -114,6 +114,26 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
>  	pmd_val(*pmd) |= L_PMD_S2_RDWR;
>  }
>  
> +static inline void kvm_set_s2pte_readonly(pte_t *pte)
> +{
> +	pte_val(*pte) = (pte_val(*pte) & ~L_PTE_S2_RDWR) | L_PTE_S2_RDONLY;
> +}
> +
> +static inline bool kvm_s2pte_readonly(pte_t *pte)
> +{
> +	return (pte_val(*pte) & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY;
> +}
> +
> +static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
> +{
> +	pmd_val(*pmd) = (pmd_val(*pmd) & ~L_PMD_S2_RDWR) | L_PMD_S2_RDONLY;
> +}
> +
> +static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
> +{
> +	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
> +}
> +
>  /* Open coded p*d_addr_end that can deal with 64bit addresses */
>  #define kvm_pgd_addr_end(addr, end)					\
>  ({	u64 __boundary = ((addr) + PGDIR_SIZE) & PGDIR_MASK;		\
> diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
> index 06e0bc0..d29c880 100644
> --- a/arch/arm/include/asm/pgtable-3level.h
> +++ b/arch/arm/include/asm/pgtable-3level.h
> @@ -130,6 +130,7 @@
>  #define L_PTE_S2_RDONLY			(_AT(pteval_t, 1) << 6)   /* HAP[1]   */
>  #define L_PTE_S2_RDWR			(_AT(pteval_t, 3) << 6)   /* HAP[2:1] */
>  
> +#define L_PMD_S2_RDONLY			(_AT(pmdval_t, 1) << 6)   /* HAP[1]   */
>  #define L_PMD_S2_RDWR			(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
>  
>  /*
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 16e7994..3b86522 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -45,6 +45,7 @@ static phys_addr_t hyp_idmap_vector;
>  #define pgd_order get_order(PTRS_PER_PGD * sizeof(pgd_t))
>  
>  #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
> +#define kvm_pud_huge(_x)	pud_huge(_x)
>  
>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>  {
> @@ -746,6 +747,133 @@ static bool transparent_hugepage_adjust(pfn_t *pfnp, phys_addr_t *ipap)
>  	return false;
>  }
>  
> +#ifdef CONFIG_ARM
> +/**
> + * stage2_wp_ptes - write protect PMD range
> + * @pmd:	pointer to pmd entry
> + * @addr:	range start address
> + * @end:	range end address
> + */
> +static void stage2_wp_ptes(pmd_t *pmd, phys_addr_t addr, phys_addr_t end)
> +{
> +	pte_t *pte;
> +
> +	pte = pte_offset_kernel(pmd, addr);
> +	do {
> +		if (!pte_none(*pte)) {
> +			if (!kvm_s2pte_readonly(pte))
> +				kvm_set_s2pte_readonly(pte);
> +			}
> +	} while (pte++, addr += PAGE_SIZE, addr != end);
> +}
> +
> +/**
> + * stage2_wp_pmds - write protect PUD range
> + * @pud:	pointer to pud entry
> + * @addr:	range start address
> + * @end:	range end address
> + */
> +static void stage2_wp_pmds(pud_t *pud, phys_addr_t addr, phys_addr_t end)
> +{
> +	pmd_t *pmd;
> +	phys_addr_t next;
> +
> +	pmd = pmd_offset(pud, addr);
> +
> +	do {
> +		next = kvm_pmd_addr_end(addr, end);
> +		if (!pmd_none(*pmd)) {
> +			if (kvm_pmd_huge(*pmd)) {
> +				if (!kvm_s2pmd_readonly(pmd))
> +					kvm_set_s2pmd_readonly(pmd);
> +			} else {
> +				stage2_wp_ptes(pmd, addr, next);
> +			}
> +		}
> +	} while (pmd++, addr = next, addr != end);
> +}
> +
> +/**
> +  * stage2_wp_puds - write protect PGD range
> +  * @kvm:	pointer to kvm structure
> +  * @pud:	pointer to pgd entry
> +  * @addr:	range start address
> +  * @end:	range end address
> +  *
> +  * While walking the PUD range huge PUD pages are ignored.

Well, not really. Huge PUDs are actually triggering a kernel panic.

> +  */
> +static void  stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,

Why do we have a "kvm" parameter here?

> +					phys_addr_t addr, phys_addr_t end)
> +{
> +	pud_t *pud;
> +	phys_addr_t next;
> +
> +	pud = pud_offset(pgd, addr);
> +	do {
> +		next = kvm_pud_addr_end(addr, end);
> +		if (!pud_none(*pud)) {
> +			/* TODO:PUD not supported, revisit later if supported */
> +			BUG_ON(kvm_pud_huge(*pud));
> +			stage2_wp_pmds(pud, addr, next);
> +		}
> +	} while (pud++, addr = next, addr != end);
> +}
> +
> +/**
> + * stage2_wp_range() - write protect stage2 memory region range
> + * @kvm:	The KVM pointer
> + * @addr:	Start address of range
> + * @end:	End address of range
> + */
> +static void stage2_wp_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
> +{
> +	pgd_t *pgd;
> +	phys_addr_t next;
> +
> +	pgd = kvm->arch.pgd + pgd_index(addr);
> +	do {
> +		/*
> +		 * Release kvm_mmu_lock periodically if the memory region is
> +		 * large. Otherwise, we may see kernel panics with
> +		 * CONFIG_DETECT_HUNG_TASK, CONFIG_LOCK_DETECTOR,
> +		 * CONFIG_LOCK_DEP. Additionally, holding the lock too long
> +		 * will also starve other vCPUs.
> +		 */
> +		if (need_resched() || spin_needbreak(&kvm->mmu_lock))
> +			cond_resched_lock(&kvm->mmu_lock);
> +
> +		next = kvm_pgd_addr_end(addr, end);
> +		if (pgd_present(*pgd))
> +			stage2_wp_puds(kvm, pgd, addr, next);
> +	} while (pgd++, addr = next, addr != end);
> +}
> +
> +/**
> + * kvm_mmu_wp_memory_region() - write protect stage 2 entries for memory slot
> + * @kvm:	The KVM pointer
> + * @slot:	The memory slot to write protect
> + *
> + * Called to start logging dirty pages after memory region
> + * KVM_MEM_LOG_DIRTY_PAGES operation is called. After this function returns
> + * all present PMD and PTEs are write protected in the memory region.
> + * Afterwards read of dirty page log can be called.
> + *
> + * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
> + * serializing operations for VM memory regions.
> + */
> +void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
> +{
> +	struct kvm_memory_slot *memslot = id_to_memslot(kvm->memslots, slot);
> +	phys_addr_t start = memslot->base_gfn << PAGE_SHIFT;
> +	phys_addr_t end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT;
> +
> +	spin_lock(&kvm->mmu_lock);
> +	stage2_wp_range(kvm, start, end);
> +	spin_unlock(&kvm->mmu_lock);
> +	kvm_flush_remote_tlbs(kvm);
> +}
> +#endif
> +
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  			  struct kvm_memory_slot *memslot,
>  			  unsigned long fault_status)
> @@ -1129,6 +1257,18 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>  		unmap_stage2_range(kvm, gpa, size);
>  		spin_unlock(&kvm->mmu_lock);
>  	}
> +
> +#ifdef CONFIG_ARM
> +	/*
> +	 * At this point memslot has been committed and there is an
> +	 * allocated dirty_bitmap[], dirty pages will be be tracked while the
> +	 * memory slot is write protected.
> +	 */
> +	if ((change != KVM_MR_DELETE) && change != KVM_MR_MOVE &&

[nit] extraneous set of parenthesis.

> +					(mem->flags & KVM_MEM_LOG_DIRTY_PAGES))
> +		kvm_mmu_wp_memory_region(kvm, mem->slot);
> +#endif
> +
>  }
>  
>  int kvm_arch_prepare_memory_region(struct kvm *kvm,
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 6/7] arm: KVM: dirty log read write protect support
  2014-11-07  0:40 ` [PATCH v13 6/7] arm: KVM: dirty log read write protect support Mario Smarduch
  2014-11-07  7:38   ` Paolo Bonzini
@ 2014-11-07 10:19   ` Marc Zyngier
  1 sibling, 0 replies; 29+ messages in thread
From: Marc Zyngier @ 2014-11-07 10:19 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/11/14 00:40, Mario Smarduch wrote:
> Add support to track dirty pages between user space KVM_GET_DIRTY_LOG ioctl
> calls. We call kvm_get_dirty_log_protect() function to do most of the work.
> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/kvm/arm.c  |   37 +++++++++++++++++++++++++++++++++++++
>  arch/arm/kvm/mmu.c  |   22 ++++++++++++++++++++++
>  virt/kvm/kvm_main.c |    3 +--
>  3 files changed, 60 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index a99e0cd..212d835 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -737,9 +737,46 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  	}
>  }
>  
> +/**
> + * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
> + * @kvm:	kvm instance
> + * @log:	slot id and address to which we copy the log
> + *
> + * We need to keep it in mind that VCPU threads can write to the bitmap
> + * concurrently.  So, to avoid losing data, we keep the following order for
> + * each bit:
> + *
> + *   1. Take a snapshot of the bit and clear it if needed.
> + *   2. Write protect the corresponding page.
> + *   3. Copy the snapshot to the userspace.
> + *   4. Flush TLB's if needed.
> + *
> + * Steps 1,2,3 are handled by kvm_get_dirty_log_protect().
> + * Between 2 and 4, the guest may write to the page using the remaining TLB
> + * entry.  This is not a problem because the page is reported dirty using
> + * the snapshot taken before and step 4 ensures that writes done after
> + * exiting to userspace will be logged for the next call.
> + */
>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  {
> +#ifdef CONFIG_ARM
> +	int r;
> +	bool is_dirty = false;
> +
> +	mutex_lock(&kvm->slots_lock);
> +
> +	r = kvm_get_dirty_log_protect(kvm, log, &is_dirty);
> +	if (r)
> +		goto out;
> +
> +	if (is_dirty)
> +		kvm_flush_remote_tlbs(kvm);
> +out:
> +	mutex_unlock(&kvm->slots_lock);
> +	return r;
> +#else /* ARM64 */
>  	return -EINVAL;
> +#endif
>  }
>  
>  static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 3b86522..2f5131e 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -872,6 +872,28 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
>  	spin_unlock(&kvm->mmu_lock);
>  	kvm_flush_remote_tlbs(kvm);
>  }
> +
> +/**
> + * kvm_arch_mmu_write_protect_pt_masked() - write protect dirty pages
> + * @kvm:	The KVM pointer
> + * @slot:	The memory slot associated with mask
> + * @gfn_offset:	The gfn offset in memory slot
> + * @mask:	The mask of dirty pages at offset 'gfn_offset' in this memory
> + *		slot to be write protected
> + *
> + * Walks bits set in mask write protects the associated pte's. Caller must
> + * acquire kvm_mmu_lock.
> + */
> +void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
> +		struct kvm_memory_slot *slot,
> +		gfn_t gfn_offset, unsigned long mask)
> +{
> +	phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
> +	phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
> +	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
> +
> +	stage2_wp_range(kvm, start, end);
> +}
>  #endif
>  
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index f017760..c80dd2f 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -982,8 +982,7 @@ out:
>  EXPORT_SYMBOL_GPL(kvm_get_dirty_log);
>  
>  #if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_MIPS) || \
> -    defined(CONFIG_IA64) || defined(CONFIG_X86) || defined(CONFIG_ARM) || \
> -    defined(CONFIG_ARM64)
> +    defined(CONFIG_IA64) || defined(CONFIG_X86) || defined(CONFIG_ARM64)

Yeah, that's exactly why Cornelia's comment on having a proper config
symbol is pertinent.

>  /*
>   * For architectures that don't use kvm_get_dirty_log_protect() for dirty page
>   * logging, calling this function is illegal. Otherwise the function is defined
> 

Other that that:

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 7/7] arm: KVM: ARMv7 dirty page logging 2nd stage page fault
  2014-11-07  0:40 ` [PATCH v13 7/7] arm: KVM: ARMv7 dirty page logging 2nd stage page fault Mario Smarduch
@ 2014-11-07 10:33   ` Marc Zyngier
  2014-11-07 19:21     ` Mario Smarduch
  0 siblings, 1 reply; 29+ messages in thread
From: Marc Zyngier @ 2014-11-07 10:33 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/11/14 00:40, Mario Smarduch wrote:
> This patch adds support for handling 2nd stage page faults during migration,
> it disables faulting in huge pages, and dissolves huge pages to page tables.
> In case migration is canceled huge pages are used again.
> 
> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/kvm/mmu.c |   47 ++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 40 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 2f5131e..d511fc0 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -47,6 +47,15 @@ static phys_addr_t hyp_idmap_vector;
>  #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
>  #define kvm_pud_huge(_x)	pud_huge(_x)
>  
> +static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
> +{
> +#ifdef CONFIG_ARM
> +	return !!memslot->dirty_bitmap;
> +#else
> +	return false;
> +#endif
> +}
> +
>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>  {
>  	/*
> @@ -626,7 +635,8 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
>  }
>  
>  static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> -			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
> +			  phys_addr_t addr, const pte_t *new_pte, bool iomap,
> +			  bool logging_active)

Yuk. Yet another parameter. Can't we have a set of flags instead,
indicating both iomap and logging in one go? That would make things more
readable (at least for me).

>  {
>  	pmd_t *pmd;
>  	pte_t *pte, old_pte;
> @@ -641,6 +651,18 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>  		return 0;
>  	}
>  
> +	/*
> +	 * While dirty memory logging, clear PMD entry for huge page and split
> +	 * into smaller pages, to track dirty memory at page granularity.
> +	 */
> +	if (logging_active && kvm_pmd_huge(*pmd)) {
> +		phys_addr_t ipa = pmd_pfn(*pmd) << PAGE_SHIFT;
> +
> +		pmd_clear(pmd);
> +		kvm_tlb_flush_vmid_ipa(kvm, ipa);
> +		put_page(virt_to_page(pmd));
> +	}
> +

If we have huge PUDs (like on arn64 with 4k Pages and 4 levels of
translation), would we need something similar? If that's the case, a
comment would be very welcome.

>  	/* Create stage-2 page mappings - Level 2 */
>  	if (pmd_none(*pmd)) {
>  		if (!cache)
> @@ -693,7 +715,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
>  		if (ret)
>  			goto out;
>  		spin_lock(&kvm->mmu_lock);
> -		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
> +		ret = stage2_set_pte(kvm, &cache, addr, &pte, true, false);
>  		spin_unlock(&kvm->mmu_lock);
>  		if (ret)
>  			goto out;
> @@ -910,6 +932,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	struct vm_area_struct *vma;
>  	pfn_t pfn;
>  	pgprot_t mem_type = PAGE_S2;
> +	bool logging_active = kvm_get_logging_state(memslot);
>  
>  	write_fault = kvm_is_write_fault(kvm_vcpu_get_hsr(vcpu));
>  	if (fault_status == FSC_PERM && !write_fault) {
> @@ -920,7 +943,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	/* Let's check if we will get back a huge page backed by hugetlbfs */
>  	down_read(&current->mm->mmap_sem);
>  	vma = find_vma_intersection(current->mm, hva, hva + 1);
> -	if (is_vm_hugetlb_page(vma)) {
> +	if (is_vm_hugetlb_page(vma) && !logging_active) {
>  		hugetlb = true;
>  		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
>  	} else {
> @@ -966,7 +989,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	spin_lock(&kvm->mmu_lock);
>  	if (mmu_notifier_retry(kvm, mmu_seq))
>  		goto out_unlock;
> -	if (!hugetlb && !force_pte)
> +	if (!hugetlb && !force_pte && !logging_active)
>  		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
>  
>  	if (hugetlb) {
> @@ -986,10 +1009,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		}
>  		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE);
>  		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
> -				     mem_type == PAGE_S2_DEVICE);
> +					mem_type == PAGE_S2_DEVICE,
> +					logging_active);
>  	}
>  
> -
> +	if (write_fault)
> +		mark_page_dirty(kvm, gfn);
>  out_unlock:
>  	spin_unlock(&kvm->mmu_lock);
>  	kvm_release_pfn_clean(pfn);
> @@ -1139,7 +1164,15 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
>  {
>  	pte_t *pte = (pte_t *)data;
>  
> -	stage2_set_pte(kvm, NULL, gpa, pte, false);
> +	/*
> +	 * We can always call stage2_set_pte with logging_active == false,
> +	 * because MMU notifiers will have unmapped a huge PMD before calling
> +	 * ->change_pte() (which in turn calls kvm_set_spte_hva()) and therefore
> +	 * stage2_set_pte() never needs to clear out a huge PMD through this
> +	 * calling path.
> +	 */
> +
> +	stage2_set_pte(kvm, NULL, gpa, pte, false, false);
>  }
>  
>  
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 2/7] KVM: Add generic support for dirty page logging
  2014-11-07  9:07   ` Cornelia Huck
  2014-11-07  9:26     ` Paolo Bonzini
@ 2014-11-07 18:55     ` Mario Smarduch
  1 sibling, 0 replies; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07 18:55 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/07/2014 01:07 AM, Cornelia Huck wrote:
> On Thu, 06 Nov 2014 16:40:43 -0800
> Mario Smarduch <m.smarduch@samsung.com> wrote:
> 
>> kvm_get_dirty_log() provides generic handling of dirty bitmap, currently reused
>> by several architectures. Building on that we intrdoduce 
>> kvm_get_dirty_log_protect() adding write protection to mark these pages dirty
>> for future write access, before next KVM_GET_DIRTY_LOG ioctl call from user
>> space.
>>
>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>> ---
>>  include/linux/kvm_host.h |    9 +++++
>>  virt/kvm/kvm_main.c      |   95 ++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 104 insertions(+)
>>
> 
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 887df87..f017760 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -981,6 +981,101 @@ out:
>>  }
>>  EXPORT_SYMBOL_GPL(kvm_get_dirty_log);
>>
>> +#if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_MIPS) || \
>> +    defined(CONFIG_IA64) || defined(CONFIG_X86) || defined(CONFIG_ARM) || \
>> +    defined(CONFIG_ARM64)
> 
> Does this deserve a config symbol that can be selected by architectures
> actually using kvm_get_dirty_log_protect()? I.e.,
> 
> #ifndef CONFIG_KVM_ARCH_DIRTY_LOG_PROTECT
> 
> or so?

Yes definitely, not sure why I went this way after using Kconfig
defines before.

> 
>> +/*
>> + * For architectures that don't use kvm_get_dirty_log_protect() for dirty page
>> + * logging, calling this function is illegal. Otherwise the function is defined
>> + * in arch subtree and this restriction is removed.
>> + */
> 
> What about
> 
> /*
>  * For architectures that don't use kvm_get_dirty_log_protect() for dirty page
>  * logging, we must never end up calling this function. Architectures that do
>  * use it define their own version of this function.
>  */
> 
> instead

Yeah that reads better, getting the illegal out of there.
> 
>> +void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
>> +					struct kvm_memory_slot *slot,
>> +					gfn_t gfn_offset,
>> +					unsigned long mask)
>> +{
>> +	BUG();
>> +}
>> +#endif
>> +
> (...)
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 4/7] arm: KVM: Add ARMv7 API to flush TLBs
  2014-11-07  9:44   ` Marc Zyngier
@ 2014-11-07 18:58     ` Mario Smarduch
  0 siblings, 0 replies; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07 18:58 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/07/2014 01:44 AM, Marc Zyngier wrote:
> On 07/11/14 00:40, Mario Smarduch wrote:
>> This patch adds ARMv7 architecture TLB Flush function.
>>
>> Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>> ---
>>  arch/arm/include/asm/kvm_asm.h  |    1 +
>>  arch/arm/include/asm/kvm_host.h |   12 ++++++++++++
>>  arch/arm/kvm/Kconfig            |    1 +
>>  arch/arm/kvm/interrupts.S       |   11 +++++++++++
>>  4 files changed, 25 insertions(+)
>>
>> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>> index 3a67bec..25410b2 100644
>> --- a/arch/arm/include/asm/kvm_asm.h
>> +++ b/arch/arm/include/asm/kvm_asm.h
>> @@ -96,6 +96,7 @@ extern char __kvm_hyp_code_end[];
>>  
>>  extern void __kvm_flush_vm_context(void);
>>  extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>> +extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>>  
>>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>>  #endif
>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> index 6dfb404..fad598c 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -220,6 +220,18 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>>  	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
>>  }
>>  
>> +/**
>> + * kvm_arch_flush_remote_tlbs() - flush all VM TLB entries
>> + * @kvm:	pointer to kvm structure.
>> + *
>> + * Interface to HYP function to flush all VM TLB entries without address
>> + * parameter.
>> + */
>> +static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
>> +{
>> +	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>> +}
>> +
>>  static inline int kvm_arch_dev_ioctl_check_extension(long ext)
>>  {
>>  	return 0;
>> diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
>> index 466bd29..a099df4 100644
>> --- a/arch/arm/kvm/Kconfig
>> +++ b/arch/arm/kvm/Kconfig
>> @@ -23,6 +23,7 @@ config KVM
>>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
>>  	select KVM_MMIO
>>  	select KVM_ARM_HOST
>> +	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
> 
> [utter nit] Move this up by two lines so that the "HAVE_KVM_*" options
> are together. The ARM tree has a tendency to sort the config options by
> alphabetical order (though that's clearly not enforced in this file...).

Ah, ok will update.
> 
>>  	depends on ARM_VIRT_EXT && ARM_LPAE
>>  	---help---
>>  	  Support hosting virtualized guest machines. You will also
>> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
>> index 01dcb0e..79caf79 100644
>> --- a/arch/arm/kvm/interrupts.S
>> +++ b/arch/arm/kvm/interrupts.S
>> @@ -66,6 +66,17 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
>>  	bx	lr
>>  ENDPROC(__kvm_tlb_flush_vmid_ipa)
>>  
>> +/**
>> + * void __kvm_tlb_flush_vmid(struct kvm *kvm) - Flush per-VMID TLBs
>> + *
>> + * Reuses __kvm_tlb_flush_vmid_ipa() for ARMv7, without passing address
>> + * parameter
>> + */
>> +
>> +ENTRY(__kvm_tlb_flush_vmid)
>> +	b	__kvm_tlb_flush_vmid_ipa
>> +ENDPROC(__kvm_tlb_flush_vmid)
>> +
>>  /********************************************************************
>>   * Flush TLBs and instruction caches of all CPUs inside the inner-shareable
>>   * domain, for all VMIDs
>>
> 
> Other than that:
> 
> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
> 
> 	M.
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 5/7] arm: KVM: Add initial dirty page locking infrastructure
  2014-11-07 10:15   ` Marc Zyngier
@ 2014-11-07 19:07     ` Mario Smarduch
  0 siblings, 0 replies; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07 19:07 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/07/2014 02:15 AM, Marc Zyngier wrote:
> On 07/11/14 00:40, Mario Smarduch wrote:
>> Add support for initial write protection of VM memlsot. This patch
>                                                  memslots
>> series assumes that huge PUDs will not be used in 2nd stage tables, which is
>> awlays valid on ARMv7.
>   always
Spell checks still - will update.
>>
>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>> ---
>>  arch/arm/include/asm/kvm_host.h       |    2 +
>>  arch/arm/include/asm/kvm_mmu.h        |   20 +++++
>>  arch/arm/include/asm/pgtable-3level.h |    1 +
>>  arch/arm/kvm/mmu.c                    |  140 +++++++++++++++++++++++++++++++++
>>  4 files changed, 163 insertions(+)
>>
>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> index fad598c..72510ca 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -245,4 +245,6 @@ static inline void vgic_arch_setup(const struct vgic_params *vgic)
>>  int kvm_perf_init(void);
>>  int kvm_perf_teardown(void);
>>  
>> +void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
>> +
>>  #endif /* __ARM_KVM_HOST_H__ */
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index 5cc0b0f..08ab5e8 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -114,6 +114,26 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
>>  	pmd_val(*pmd) |= L_PMD_S2_RDWR;
>>  }
>>  
>> +static inline void kvm_set_s2pte_readonly(pte_t *pte)
>> +{
>> +	pte_val(*pte) = (pte_val(*pte) & ~L_PTE_S2_RDWR) | L_PTE_S2_RDONLY;
>> +}
>> +
>> +static inline bool kvm_s2pte_readonly(pte_t *pte)
>> +{
>> +	return (pte_val(*pte) & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY;
>> +}
>> +
>> +static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
>> +{
>> +	pmd_val(*pmd) = (pmd_val(*pmd) & ~L_PMD_S2_RDWR) | L_PMD_S2_RDONLY;
>> +}
>> +
>> +static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
>> +{
>> +	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
>> +}
>> +
>>  /* Open coded p*d_addr_end that can deal with 64bit addresses */
>>  #define kvm_pgd_addr_end(addr, end)					\
>>  ({	u64 __boundary = ((addr) + PGDIR_SIZE) & PGDIR_MASK;		\
>> diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
>> index 06e0bc0..d29c880 100644
>> --- a/arch/arm/include/asm/pgtable-3level.h
>> +++ b/arch/arm/include/asm/pgtable-3level.h
>> @@ -130,6 +130,7 @@
>>  #define L_PTE_S2_RDONLY			(_AT(pteval_t, 1) << 6)   /* HAP[1]   */
>>  #define L_PTE_S2_RDWR			(_AT(pteval_t, 3) << 6)   /* HAP[2:1] */
>>  
>> +#define L_PMD_S2_RDONLY			(_AT(pmdval_t, 1) << 6)   /* HAP[1]   */
>>  #define L_PMD_S2_RDWR			(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
>>  
>>  /*
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 16e7994..3b86522 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -45,6 +45,7 @@ static phys_addr_t hyp_idmap_vector;
>>  #define pgd_order get_order(PTRS_PER_PGD * sizeof(pgd_t))
>>  
>>  #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
>> +#define kvm_pud_huge(_x)	pud_huge(_x)
>>  
>>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>>  {
>> @@ -746,6 +747,133 @@ static bool transparent_hugepage_adjust(pfn_t *pfnp, phys_addr_t *ipap)
>>  	return false;
>>  }
>>  
>> +#ifdef CONFIG_ARM
>> +/**
>> + * stage2_wp_ptes - write protect PMD range
>> + * @pmd:	pointer to pmd entry
>> + * @addr:	range start address
>> + * @end:	range end address
>> + */
>> +static void stage2_wp_ptes(pmd_t *pmd, phys_addr_t addr, phys_addr_t end)
>> +{
>> +	pte_t *pte;
>> +
>> +	pte = pte_offset_kernel(pmd, addr);
>> +	do {
>> +		if (!pte_none(*pte)) {
>> +			if (!kvm_s2pte_readonly(pte))
>> +				kvm_set_s2pte_readonly(pte);
>> +			}
>> +	} while (pte++, addr += PAGE_SIZE, addr != end);
>> +}
>> +
>> +/**
>> + * stage2_wp_pmds - write protect PUD range
>> + * @pud:	pointer to pud entry
>> + * @addr:	range start address
>> + * @end:	range end address
>> + */
>> +static void stage2_wp_pmds(pud_t *pud, phys_addr_t addr, phys_addr_t end)
>> +{
>> +	pmd_t *pmd;
>> +	phys_addr_t next;
>> +
>> +	pmd = pmd_offset(pud, addr);
>> +
>> +	do {
>> +		next = kvm_pmd_addr_end(addr, end);
>> +		if (!pmd_none(*pmd)) {
>> +			if (kvm_pmd_huge(*pmd)) {
>> +				if (!kvm_s2pmd_readonly(pmd))
>> +					kvm_set_s2pmd_readonly(pmd);
>> +			} else {
>> +				stage2_wp_ptes(pmd, addr, next);
>> +			}
>> +		}
>> +	} while (pmd++, addr = next, addr != end);
>> +}
>> +
>> +/**
>> +  * stage2_wp_puds - write protect PGD range
>> +  * @kvm:	pointer to kvm structure
>> +  * @pud:	pointer to pgd entry
>> +  * @addr:	range start address
>> +  * @end:	range end address
>> +  *
>> +  * While walking the PUD range huge PUD pages are ignored.
> 
> Well, not really. Huge PUDs are actually triggering a kernel panic.

Yes will rephrase, it doesn't just dismiss the condition.
Left over from earlier version where we failed on the ioctl
during initial write protect, but even then it still failed.
> 
>> +  */
>> +static void  stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
> 
> Why do we have a "kvm" parameter here?

That's again left over from an early version where we needed it
while breaking up huge pages, will remove it.
> 
>> +					phys_addr_t addr, phys_addr_t end)
>> +{
>> +	pud_t *pud;
>> +	phys_addr_t next;
>> +
>> +	pud = pud_offset(pgd, addr);
>> +	do {
>> +		next = kvm_pud_addr_end(addr, end);
>> +		if (!pud_none(*pud)) {
>> +			/* TODO:PUD not supported, revisit later if supported */
>> +			BUG_ON(kvm_pud_huge(*pud));
>> +			stage2_wp_pmds(pud, addr, next);
>> +		}
>> +	} while (pud++, addr = next, addr != end);
>> +}
>> +
>> +/**
>> + * stage2_wp_range() - write protect stage2 memory region range
>> + * @kvm:	The KVM pointer
>> + * @addr:	Start address of range
>> + * @end:	End address of range
>> + */
>> +static void stage2_wp_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
>> +{
>> +	pgd_t *pgd;
>> +	phys_addr_t next;
>> +
>> +	pgd = kvm->arch.pgd + pgd_index(addr);
>> +	do {
>> +		/*
>> +		 * Release kvm_mmu_lock periodically if the memory region is
>> +		 * large. Otherwise, we may see kernel panics with
>> +		 * CONFIG_DETECT_HUNG_TASK, CONFIG_LOCK_DETECTOR,
>> +		 * CONFIG_LOCK_DEP. Additionally, holding the lock too long
>> +		 * will also starve other vCPUs.
>> +		 */
>> +		if (need_resched() || spin_needbreak(&kvm->mmu_lock))
>> +			cond_resched_lock(&kvm->mmu_lock);
>> +
>> +		next = kvm_pgd_addr_end(addr, end);
>> +		if (pgd_present(*pgd))
>> +			stage2_wp_puds(kvm, pgd, addr, next);
>> +	} while (pgd++, addr = next, addr != end);
>> +}
>> +
>> +/**
>> + * kvm_mmu_wp_memory_region() - write protect stage 2 entries for memory slot
>> + * @kvm:	The KVM pointer
>> + * @slot:	The memory slot to write protect
>> + *
>> + * Called to start logging dirty pages after memory region
>> + * KVM_MEM_LOG_DIRTY_PAGES operation is called. After this function returns
>> + * all present PMD and PTEs are write protected in the memory region.
>> + * Afterwards read of dirty page log can be called.
>> + *
>> + * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
>> + * serializing operations for VM memory regions.
>> + */
>> +void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
>> +{
>> +	struct kvm_memory_slot *memslot = id_to_memslot(kvm->memslots, slot);
>> +	phys_addr_t start = memslot->base_gfn << PAGE_SHIFT;
>> +	phys_addr_t end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT;
>> +
>> +	spin_lock(&kvm->mmu_lock);
>> +	stage2_wp_range(kvm, start, end);
>> +	spin_unlock(&kvm->mmu_lock);
>> +	kvm_flush_remote_tlbs(kvm);
>> +}
>> +#endif
>> +
>>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  			  struct kvm_memory_slot *memslot,
>>  			  unsigned long fault_status)
>> @@ -1129,6 +1257,18 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>>  		unmap_stage2_range(kvm, gpa, size);
>>  		spin_unlock(&kvm->mmu_lock);
>>  	}
>> +
>> +#ifdef CONFIG_ARM
>> +	/*
>> +	 * At this point memslot has been committed and there is an
>> +	 * allocated dirty_bitmap[], dirty pages will be be tracked while the
>> +	 * memory slot is write protected.
>> +	 */
>> +	if ((change != KVM_MR_DELETE) && change != KVM_MR_MOVE &&
> 
> [nit] extraneous set of parenthesis.
> 
>> +					(mem->flags & KVM_MEM_LOG_DIRTY_PAGES))
Will change.
>> +		kvm_mmu_wp_memory_region(kvm, mem->slot);
>> +#endif
>> +
>>  }
>>  
>>  int kvm_arch_prepare_memory_region(struct kvm *kvm,
>>
> 
> Thanks,
> 
> 	M.
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 7/7] arm: KVM: ARMv7 dirty page logging 2nd stage page fault
  2014-11-07 10:33   ` Marc Zyngier
@ 2014-11-07 19:21     ` Mario Smarduch
  0 siblings, 0 replies; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07 19:21 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/07/2014 02:33 AM, Marc Zyngier wrote:
> On 07/11/14 00:40, Mario Smarduch wrote:
>> This patch adds support for handling 2nd stage page faults during migration,
>> it disables faulting in huge pages, and dissolves huge pages to page tables.
>> In case migration is canceled huge pages are used again.
>>
>> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>> ---
>>  arch/arm/kvm/mmu.c |   47 ++++++++++++++++++++++++++++++++++++++++-------
>>  1 file changed, 40 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 2f5131e..d511fc0 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -47,6 +47,15 @@ static phys_addr_t hyp_idmap_vector;
>>  #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
>>  #define kvm_pud_huge(_x)	pud_huge(_x)
>>  
>> +static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
>> +{
>> +#ifdef CONFIG_ARM
>> +	return !!memslot->dirty_bitmap;
>> +#else
>> +	return false;
>> +#endif
>> +}
>> +
>>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>>  {
>>  	/*
>> @@ -626,7 +635,8 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
>>  }
>>  
>>  static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>> -			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
>> +			  phys_addr_t addr, const pte_t *new_pte, bool iomap,
>> +			  bool logging_active)
> 
> Yuk. Yet another parameter. Can't we have a set of flags instead,
> indicating both iomap and logging in one go? That would make things more
> readable (at least for me).

Sure that could be changed, I didn't like the idea adding another line
myself, needed some suggestions.

> 
>>  {
>>  	pmd_t *pmd;
>>  	pte_t *pte, old_pte;
>> @@ -641,6 +651,18 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>>  		return 0;
>>  	}
>>  
>> +	/*
>> +	 * While dirty memory logging, clear PMD entry for huge page and split
>> +	 * into smaller pages, to track dirty memory at page granularity.
>> +	 */
>> +	if (logging_active && kvm_pmd_huge(*pmd)) {
>> +		phys_addr_t ipa = pmd_pfn(*pmd) << PAGE_SHIFT;
>> +
>> +		pmd_clear(pmd);
>> +		kvm_tlb_flush_vmid_ipa(kvm, ipa);
>> +		put_page(virt_to_page(pmd));
>> +	}
>> +
> 
> If we have huge PUDs (like on arn64 with 4k Pages and 4 levels of
> translation), would we need something similar? If that's the case, a
> comment would be very welcome.

I'm not sure what we do if we encounter a huge pud, break that
up into small pages? I was thinking if huge puds are enabled
then disable page logging. I don't know what kind of comment to
put in here now other then it's not supported

> 
>>  	/* Create stage-2 page mappings - Level 2 */
>>  	if (pmd_none(*pmd)) {
>>  		if (!cache)
>> @@ -693,7 +715,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
>>  		if (ret)
>>  			goto out;
>>  		spin_lock(&kvm->mmu_lock);
>> -		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
>> +		ret = stage2_set_pte(kvm, &cache, addr, &pte, true, false);
>>  		spin_unlock(&kvm->mmu_lock);
>>  		if (ret)
>>  			goto out;
>> @@ -910,6 +932,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	struct vm_area_struct *vma;
>>  	pfn_t pfn;
>>  	pgprot_t mem_type = PAGE_S2;
>> +	bool logging_active = kvm_get_logging_state(memslot);
>>  
>>  	write_fault = kvm_is_write_fault(kvm_vcpu_get_hsr(vcpu));
>>  	if (fault_status == FSC_PERM && !write_fault) {
>> @@ -920,7 +943,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	/* Let's check if we will get back a huge page backed by hugetlbfs */
>>  	down_read(&current->mm->mmap_sem);
>>  	vma = find_vma_intersection(current->mm, hva, hva + 1);
>> -	if (is_vm_hugetlb_page(vma)) {
>> +	if (is_vm_hugetlb_page(vma) && !logging_active) {
>>  		hugetlb = true;
>>  		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
>>  	} else {
>> @@ -966,7 +989,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	spin_lock(&kvm->mmu_lock);
>>  	if (mmu_notifier_retry(kvm, mmu_seq))
>>  		goto out_unlock;
>> -	if (!hugetlb && !force_pte)
>> +	if (!hugetlb && !force_pte && !logging_active)
>>  		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
>>  
>>  	if (hugetlb) {
>> @@ -986,10 +1009,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  		}
>>  		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE);
>>  		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
>> -				     mem_type == PAGE_S2_DEVICE);
>> +					mem_type == PAGE_S2_DEVICE,
>> +					logging_active);
>>  	}
>>  
>> -
>> +	if (write_fault)
>> +		mark_page_dirty(kvm, gfn);
>>  out_unlock:
>>  	spin_unlock(&kvm->mmu_lock);
>>  	kvm_release_pfn_clean(pfn);
>> @@ -1139,7 +1164,15 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
>>  {
>>  	pte_t *pte = (pte_t *)data;
>>  
>> -	stage2_set_pte(kvm, NULL, gpa, pte, false);
>> +	/*
>> +	 * We can always call stage2_set_pte with logging_active == false,
>> +	 * because MMU notifiers will have unmapped a huge PMD before calling
>> +	 * ->change_pte() (which in turn calls kvm_set_spte_hva()) and therefore
>> +	 * stage2_set_pte() never needs to clear out a huge PMD through this
>> +	 * calling path.
>> +	 */
>> +
>> +	stage2_set_pte(kvm, NULL, gpa, pte, false, false);
>>  }
>>  
>>  
>>
> 
> Thanks,
> 
> 	M.
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 6/7] arm: KVM: dirty log read write protect support
  2014-11-07  7:38   ` Paolo Bonzini
@ 2014-11-07 19:47     ` Mario Smarduch
  2014-11-08  7:28       ` Paolo Bonzini
  0 siblings, 1 reply; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07 19:47 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/06/2014 11:38 PM, Paolo Bonzini wrote:
> 
> 
> On 07/11/2014 01:40, Mario Smarduch wrote:
>> Add support to track dirty pages between user space KVM_GET_DIRTY_LOG ioctl
>> calls. We call kvm_get_dirty_log_protect() function to do most of the work.
>>
>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>> ---
>>  arch/arm/kvm/arm.c  |   37 +++++++++++++++++++++++++++++++++++++
>>  arch/arm/kvm/mmu.c  |   22 ++++++++++++++++++++++
>>  virt/kvm/kvm_main.c |    3 +--
>>  3 files changed, 60 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index a99e0cd..212d835 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -737,9 +737,46 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>  	}
>>  }
>>  
>> +/**
>> + * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
>> + * @kvm:	kvm instance
>> + * @log:	slot id and address to which we copy the log
>> + *
>> + * We need to keep it in mind that VCPU threads can write to the bitmap
>> + * concurrently.  So, to avoid losing data, we keep the following order for
>> + * each bit:
>> + *
>> + *   1. Take a snapshot of the bit and clear it if needed.
>> + *   2. Write protect the corresponding page.
>> + *   3. Copy the snapshot to the userspace.
>> + *   4. Flush TLB's if needed.
>> + *
>> + * Steps 1,2,3 are handled by kvm_get_dirty_log_protect().
>> + * Between 2 and 4, the guest may write to the page using the remaining TLB
>> + * entry.  This is not a problem because the page is reported dirty using
>> + * the snapshot taken before and step 4 ensures that writes done after
>> + * exiting to userspace will be logged for the next call.
>> + */
>>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>>  {
>> +#ifdef CONFIG_ARM
>> +	int r;
>> +	bool is_dirty = false;
>> +
>> +	mutex_lock(&kvm->slots_lock);
>> +
>> +	r = kvm_get_dirty_log_protect(kvm, log, &is_dirty);
>> +	if (r)
>> +		goto out;
>> +
>> +	if (is_dirty)
>> +		kvm_flush_remote_tlbs(kvm);
> 
> Should the caller should always flush TLBs if is_dirty is true, even if
> kvm_get_dirty_log_protect reported an error?  That can happen if the
> error occurred in the final copy to userspace, after page tables have
> been modified.

Upon error return userspace should terminate logging, error out whether
used
for migration or other use cases, with some stale spte TLBs cached
read/write,
which doesn't appear to be harmful.

But you mention 'final copy' which makes me think I'm missing something?

> 
> Of course, in this case userspace cannot use the dirty log anymore since
> it has been irrimediably trashed.
> 
> Paolo
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 3/7] KVM: x86: flush TLBs last before returning from KVM_GET_DIRTY_LOG
  2014-11-07  7:44   ` Paolo Bonzini
@ 2014-11-07 19:50     ` Mario Smarduch
  2014-11-07 20:02       ` Christoffer Dall
  0 siblings, 1 reply; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07 19:50 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/06/2014 11:44 PM, Paolo Bonzini wrote:
> 
> 
> On 07/11/2014 01:40, Mario Smarduch wrote:
>> In the next patches, we will move parts of x86's kvm_vm_ioctl_get_dirty_log
>> implementation to generic code; leave the arch-specific code at the end,
>> similar to the existing generic function kvm_get_dirty_log.
>>
>> Reviewed-by: Mario Smarduch <m.smarduch@samsung.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> You should have a "Signed-off-by: Mario Smarduch ..." here, and you 
> should also add my authorship.  You can do the latter with "git rebase 
> -i", adding
> 
>   x git commit --amend -C HEAD --author 'Paolo Bonzini <pbonzini@redhat.com>'
> 
> after this patch.  If you're using a patch queue instead (quilt or
> similar) you can just edit the "From" line in the patch.
> 
> I guess if you reply to the patch with just
> 
>    Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> 
> the ARM maintainers can do the above for you, if there's no need for
> a v14.
> 
> I'll send an eighth patch to actually switch x86 to the new function.
> Again, the maintainers can apply it in the right place, but please
> include it yourself if you have to do a v14.
> 
> Thanks,
> 
> Paolo

I kind of thought something was wrong :) Thanks for the explanation,
until I get the hang of it I'll defer to ARM maintainers. Maybe later
update SubmittingPatches documentation.

- Mario
> 
>> ---
>>  arch/x86/kvm/x86.c |   22 ++++++++++------------
>>  1 file changed, 10 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 8f1e22d..dc8e66b 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -3606,13 +3606,13 @@ static int kvm_vm_ioctl_reinject(struct kvm *kvm,
>>   *
>>   *   1. Take a snapshot of the bit and clear it if needed.
>>   *   2. Write protect the corresponding page.
>> - *   3. Flush TLB's if needed.
>> - *   4. Copy the snapshot to the userspace.
>> + *   3. Copy the snapshot to the userspace.
>> + *   4. Flush TLB's if needed.
>>   *
>> - * Between 2 and 3, the guest may write to the page using the remaining TLB
>> - * entry.  This is not a problem because the page will be reported dirty at
>> - * step 4 using the snapshot taken before and step 3 ensures that successive
>> - * writes will be logged for the next call.
>> + * Between 2 and 4, the guest may write to the page using the remaining TLB
>> + * entry.  This is not a problem because the page is reported dirty using
>> + * the snapshot taken before and step 4 ensures that writes done after
>> + * exiting to userspace will be logged for the next call.
>>   */
>>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>>  {
>> @@ -3661,6 +3661,10 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>>  
>>  	spin_unlock(&kvm->mmu_lock);
>>  
>> +	r = 0;
>> +	if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n))
>> +		r = -EFAULT;
>> +
>>  	/* See the comments in kvm_mmu_slot_remove_write_access(). */
>>  	lockdep_assert_held(&kvm->slots_lock);
>>  
>> @@ -3670,12 +3674,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>>  	 */
>>  	if (is_dirty)
>>  		kvm_flush_remote_tlbs(kvm);
>> -
>> -	r = -EFAULT;
>> -	if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n))
>> -		goto out;
>> -
>> -	r = 0;
>>  out:
>>  	mutex_unlock(&kvm->slots_lock);
>>  	return r;
>>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 3/7] KVM: x86: flush TLBs last before returning from KVM_GET_DIRTY_LOG
  2014-11-07 19:50     ` Mario Smarduch
@ 2014-11-07 20:02       ` Christoffer Dall
  2014-11-07 20:44         ` Mario Smarduch
  0 siblings, 1 reply; 29+ messages in thread
From: Christoffer Dall @ 2014-11-07 20:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 07, 2014 at 11:50:09AM -0800, Mario Smarduch wrote:
> On 11/06/2014 11:44 PM, Paolo Bonzini wrote:
> > 
> > 
> > On 07/11/2014 01:40, Mario Smarduch wrote:
> >> In the next patches, we will move parts of x86's kvm_vm_ioctl_get_dirty_log
> >> implementation to generic code; leave the arch-specific code at the end,
> >> similar to the existing generic function kvm_get_dirty_log.
> >>
> >> Reviewed-by: Mario Smarduch <m.smarduch@samsung.com>
> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > 
> > You should have a "Signed-off-by: Mario Smarduch ..." here, and you 
> > should also add my authorship.  You can do the latter with "git rebase 
> > -i", adding
> > 
> >   x git commit --amend -C HEAD --author 'Paolo Bonzini <pbonzini@redhat.com>'
> > 
> > after this patch.  If you're using a patch queue instead (quilt or
> > similar) you can just edit the "From" line in the patch.
> > 
> > I guess if you reply to the patch with just
> > 
> >    Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> > 
> > the ARM maintainers can do the above for you, if there's no need for
> > a v14.
> > 
> > I'll send an eighth patch to actually switch x86 to the new function.
> > Again, the maintainers can apply it in the right place, but please
> > include it yourself if you have to do a v14.
> > 
> > Thanks,
> > 
> > Paolo
> 
> I kind of thought something was wrong :) Thanks for the explanation,
> until I get the hang of it I'll defer to ARM maintainers. Maybe later
> update SubmittingPatches documentation.
> 
Since you're doing a respin (right?) then please get the authorship
and signed-off-by's right for the next iteration.

If you're having trouble with the git mangling to get it right, don't be
afraid to reach out, here or on IRC.

-Christoffer

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 4/7] arm: KVM: Add ARMv7 API to flush TLBs
  2014-11-07  0:40 ` [PATCH v13 4/7] arm: KVM: Add ARMv7 API to flush TLBs Mario Smarduch
  2014-11-07  9:44   ` Marc Zyngier
@ 2014-11-07 20:18   ` Christoffer Dall
  2014-11-07 20:46     ` Mario Smarduch
  1 sibling, 1 reply; 29+ messages in thread
From: Christoffer Dall @ 2014-11-07 20:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 06, 2014 at 04:40:45PM -0800, Mario Smarduch wrote:
> This patch adds ARMv7 architecture TLB Flush function.
> 
> Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/include/asm/kvm_asm.h  |    1 +
>  arch/arm/include/asm/kvm_host.h |   12 ++++++++++++
>  arch/arm/kvm/Kconfig            |    1 +
>  arch/arm/kvm/interrupts.S       |   11 +++++++++++
>  4 files changed, 25 insertions(+)
> 
> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
> index 3a67bec..25410b2 100644
> --- a/arch/arm/include/asm/kvm_asm.h
> +++ b/arch/arm/include/asm/kvm_asm.h
> @@ -96,6 +96,7 @@ extern char __kvm_hyp_code_end[];
>  
>  extern void __kvm_flush_vm_context(void);
>  extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
> +extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>  
>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>  #endif
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 6dfb404..fad598c 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -220,6 +220,18 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>  	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
>  }
>  
> +/**
> + * kvm_arch_flush_remote_tlbs() - flush all VM TLB entries

this name is not matching the function

> + * @kvm:	pointer to kvm structure.
> + *
> + * Interface to HYP function to flush all VM TLB entries without address
> + * parameter.
> + */
> +static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
> +{
> +	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
> +}
> +
>  static inline int kvm_arch_dev_ioctl_check_extension(long ext)
>  {
>  	return 0;
> diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
> index 466bd29..a099df4 100644
> --- a/arch/arm/kvm/Kconfig
> +++ b/arch/arm/kvm/Kconfig
> @@ -23,6 +23,7 @@ config KVM
>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
>  	select KVM_MMIO
>  	select KVM_ARM_HOST
> +	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
>  	depends on ARM_VIRT_EXT && ARM_LPAE
>  	---help---
>  	  Support hosting virtualized guest machines. You will also
> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> index 01dcb0e..79caf79 100644
> --- a/arch/arm/kvm/interrupts.S
> +++ b/arch/arm/kvm/interrupts.S
> @@ -66,6 +66,17 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
>  	bx	lr
>  ENDPROC(__kvm_tlb_flush_vmid_ipa)
>  
> +/**
> + * void __kvm_tlb_flush_vmid(struct kvm *kvm) - Flush per-VMID TLBs
> + *
> + * Reuses __kvm_tlb_flush_vmid_ipa() for ARMv7, without passing address
> + * parameter
> + */
> +
> +ENTRY(__kvm_tlb_flush_vmid)
> +	b	__kvm_tlb_flush_vmid_ipa
> +ENDPROC(__kvm_tlb_flush_vmid)
> +
>  /********************************************************************
>   * Flush TLBs and instruction caches of all CPUs inside the inner-shareable
>   * domain, for all VMIDs
> -- 
> 1.7.9.5
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 3/7] KVM: x86: flush TLBs last before returning from KVM_GET_DIRTY_LOG
  2014-11-07 20:02       ` Christoffer Dall
@ 2014-11-07 20:44         ` Mario Smarduch
  2014-11-07 21:07           ` Christoffer Dall
  0 siblings, 1 reply; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07 20:44 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/07/2014 12:02 PM, Christoffer Dall wrote:
> On Fri, Nov 07, 2014 at 11:50:09AM -0800, Mario Smarduch wrote:
>> On 11/06/2014 11:44 PM, Paolo Bonzini wrote:
>>>
>>>
>>> On 07/11/2014 01:40, Mario Smarduch wrote:
>>>> In the next patches, we will move parts of x86's kvm_vm_ioctl_get_dirty_log
>>>> implementation to generic code; leave the arch-specific code at the end,
>>>> similar to the existing generic function kvm_get_dirty_log.
>>>>
>>>> Reviewed-by: Mario Smarduch <m.smarduch@samsung.com>
>>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>>
>>> You should have a "Signed-off-by: Mario Smarduch ..." here, and you 
>>> should also add my authorship.  You can do the latter with "git rebase 
>>> -i", adding
>>>
>>>   x git commit --amend -C HEAD --author 'Paolo Bonzini <pbonzini@redhat.com>'
>>>
>>> after this patch.  If you're using a patch queue instead (quilt or
>>> similar) you can just edit the "From" line in the patch.
>>>
>>> I guess if you reply to the patch with just
>>>
>>>    Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>>>
>>> the ARM maintainers can do the above for you, if there's no need for
>>> a v14.
>>>
>>> I'll send an eighth patch to actually switch x86 to the new function.
>>> Again, the maintainers can apply it in the right place, but please
>>> include it yourself if you have to do a v14.
>>>
>>> Thanks,
>>>
>>> Paolo
>>
>> I kind of thought something was wrong :) Thanks for the explanation,
>> until I get the hang of it I'll defer to ARM maintainers. Maybe later
>> update SubmittingPatches documentation.
>>
> Since you're doing a respin (right?) then please get the authorship
> and signed-off-by's right for the next iteration.

Yes
> 
> If you're having trouble with the git mangling to get it right, don't be
> afraid to reach out, here or on IRC.

Yes definitely need some coaching on this, this is not git 101 anymore.

Thanks.
> 
> -Christoffer
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 4/7] arm: KVM: Add ARMv7 API to flush TLBs
  2014-11-07 20:18   ` Christoffer Dall
@ 2014-11-07 20:46     ` Mario Smarduch
  0 siblings, 0 replies; 29+ messages in thread
From: Mario Smarduch @ 2014-11-07 20:46 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/07/2014 12:18 PM, Christoffer Dall wrote:
> On Thu, Nov 06, 2014 at 04:40:45PM -0800, Mario Smarduch wrote:
>> This patch adds ARMv7 architecture TLB Flush function.
>>
>> Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>> ---
>>  arch/arm/include/asm/kvm_asm.h  |    1 +
>>  arch/arm/include/asm/kvm_host.h |   12 ++++++++++++
>>  arch/arm/kvm/Kconfig            |    1 +
>>  arch/arm/kvm/interrupts.S       |   11 +++++++++++
>>  4 files changed, 25 insertions(+)
>>
>> diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
>> index 3a67bec..25410b2 100644
>> --- a/arch/arm/include/asm/kvm_asm.h
>> +++ b/arch/arm/include/asm/kvm_asm.h
>> @@ -96,6 +96,7 @@ extern char __kvm_hyp_code_end[];
>>  
>>  extern void __kvm_flush_vm_context(void);
>>  extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
>> +extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
>>  
>>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>>  #endif
>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> index 6dfb404..fad598c 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -220,6 +220,18 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>>  	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
>>  }
>>  
>> +/**
>> + * kvm_arch_flush_remote_tlbs() - flush all VM TLB entries
> 
> this name is not matching the function

Yes leftover from previous version, will update.
> 
>> + * @kvm:	pointer to kvm structure.
>> + *
>> + * Interface to HYP function to flush all VM TLB entries without address
>> + * parameter.
>> + */
>> +static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
>> +{
>> +	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>> +}
>> +
>>  static inline int kvm_arch_dev_ioctl_check_extension(long ext)
>>  {
>>  	return 0;
>> diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
>> index 466bd29..a099df4 100644
>> --- a/arch/arm/kvm/Kconfig
>> +++ b/arch/arm/kvm/Kconfig
>> @@ -23,6 +23,7 @@ config KVM
>>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
>>  	select KVM_MMIO
>>  	select KVM_ARM_HOST
>> +	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
>>  	depends on ARM_VIRT_EXT && ARM_LPAE
>>  	---help---
>>  	  Support hosting virtualized guest machines. You will also
>> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
>> index 01dcb0e..79caf79 100644
>> --- a/arch/arm/kvm/interrupts.S
>> +++ b/arch/arm/kvm/interrupts.S
>> @@ -66,6 +66,17 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
>>  	bx	lr
>>  ENDPROC(__kvm_tlb_flush_vmid_ipa)
>>  
>> +/**
>> + * void __kvm_tlb_flush_vmid(struct kvm *kvm) - Flush per-VMID TLBs
>> + *
>> + * Reuses __kvm_tlb_flush_vmid_ipa() for ARMv7, without passing address
>> + * parameter
>> + */
>> +
>> +ENTRY(__kvm_tlb_flush_vmid)
>> +	b	__kvm_tlb_flush_vmid_ipa
>> +ENDPROC(__kvm_tlb_flush_vmid)
>> +
>>  /********************************************************************
>>   * Flush TLBs and instruction caches of all CPUs inside the inner-shareable
>>   * domain, for all VMIDs
>> -- 
>> 1.7.9.5
>>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 3/7] KVM: x86: flush TLBs last before returning from KVM_GET_DIRTY_LOG
  2014-11-07 20:44         ` Mario Smarduch
@ 2014-11-07 21:07           ` Christoffer Dall
  0 siblings, 0 replies; 29+ messages in thread
From: Christoffer Dall @ 2014-11-07 21:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 07, 2014 at 12:44:13PM -0800, Mario Smarduch wrote:
> On 11/07/2014 12:02 PM, Christoffer Dall wrote:
> > On Fri, Nov 07, 2014 at 11:50:09AM -0800, Mario Smarduch wrote:
> >> On 11/06/2014 11:44 PM, Paolo Bonzini wrote:
> >>>
> >>>
> >>> On 07/11/2014 01:40, Mario Smarduch wrote:
> >>>> In the next patches, we will move parts of x86's kvm_vm_ioctl_get_dirty_log
> >>>> implementation to generic code; leave the arch-specific code at the end,
> >>>> similar to the existing generic function kvm_get_dirty_log.
> >>>>
> >>>> Reviewed-by: Mario Smarduch <m.smarduch@samsung.com>
> >>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >>>
> >>> You should have a "Signed-off-by: Mario Smarduch ..." here, and you 
> >>> should also add my authorship.  You can do the latter with "git rebase 
> >>> -i", adding
> >>>
> >>>   x git commit --amend -C HEAD --author 'Paolo Bonzini <pbonzini@redhat.com>'
> >>>
> >>> after this patch.  If you're using a patch queue instead (quilt or
> >>> similar) you can just edit the "From" line in the patch.
> >>>
> >>> I guess if you reply to the patch with just
> >>>
> >>>    Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> >>>
> >>> the ARM maintainers can do the above for you, if there's no need for
> >>> a v14.
> >>>
> >>> I'll send an eighth patch to actually switch x86 to the new function.
> >>> Again, the maintainers can apply it in the right place, but please
> >>> include it yourself if you have to do a v14.
> >>>
> >>> Thanks,
> >>>
> >>> Paolo
> >>
> >> I kind of thought something was wrong :) Thanks for the explanation,
> >> until I get the hang of it I'll defer to ARM maintainers. Maybe later
> >> update SubmittingPatches documentation.
> >>
> > Since you're doing a respin (right?) then please get the authorship
> > and signed-off-by's right for the next iteration.
> 
> Yes
> > 
> > If you're having trouble with the git mangling to get it right, don't be
> > afraid to reach out, here or on IRC.
> 
> Yes definitely need some coaching on this, this is not git 101 anymore.
> 
It's not that complicated; just get all the patches applied and use 'git
rebase -i', mark Paolo's commit as edit, when it stops at that commit
do:

 # git commit -s --amend -C HEAD --author 'Paolo Bonzini <pbonzini@redhat.com>'
 # git rebase --continue

and the '-s' to the commit should add your signed-off-by automatically.

-Christoffer

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v13 6/7] arm: KVM: dirty log read write protect support
  2014-11-07 19:47     ` Mario Smarduch
@ 2014-11-08  7:28       ` Paolo Bonzini
  0 siblings, 0 replies; 29+ messages in thread
From: Paolo Bonzini @ 2014-11-08  7:28 UTC (permalink / raw)
  To: linux-arm-kernel



On 07/11/2014 20:47, Mario Smarduch wrote:
>> That can happen if the error occurred in the final
>> copy to userspace, after page tables have been modified.
>
> Upon error return userspace should terminate logging, error out whether
> used for migration or other use cases, with some stale spte TLBs cached
> read/write, which doesn't appear to be harmful.

Fair enough; for x86 I elected to always do the TLB flush, but you can
do it differently.  Perhaps add a comment with the above paragraph, thouhg.

> But you mention 'final copy' which makes me think I'm missing something?

I meant final copy before leaving the function.

Paolo

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2014-11-08  7:28 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-11-07  0:40 [PATCH v13 0/7] KVM/arm/x86: dirty page logging support for ARMv7 (3.17.0-rc1) Mario Smarduch
2014-11-07  0:40 ` [PATCH v13 1/7] KVM: Add architecture-defined TLB flush support Mario Smarduch
2014-11-07  9:39   ` Marc Zyngier
2014-11-07  0:40 ` [PATCH v13 2/7] KVM: Add generic support for dirty page logging Mario Smarduch
2014-11-07  9:07   ` Cornelia Huck
2014-11-07  9:26     ` Paolo Bonzini
2014-11-07 18:55     ` Mario Smarduch
2014-11-07  0:40 ` [PATCH v13 3/7] KVM: x86: flush TLBs last before returning from KVM_GET_DIRTY_LOG Mario Smarduch
2014-11-07  7:44   ` Paolo Bonzini
2014-11-07 19:50     ` Mario Smarduch
2014-11-07 20:02       ` Christoffer Dall
2014-11-07 20:44         ` Mario Smarduch
2014-11-07 21:07           ` Christoffer Dall
2014-11-07  0:40 ` [PATCH v13 4/7] arm: KVM: Add ARMv7 API to flush TLBs Mario Smarduch
2014-11-07  9:44   ` Marc Zyngier
2014-11-07 18:58     ` Mario Smarduch
2014-11-07 20:18   ` Christoffer Dall
2014-11-07 20:46     ` Mario Smarduch
2014-11-07  0:40 ` [PATCH v13 5/7] arm: KVM: Add initial dirty page locking infrastructure Mario Smarduch
2014-11-07 10:15   ` Marc Zyngier
2014-11-07 19:07     ` Mario Smarduch
2014-11-07  0:40 ` [PATCH v13 6/7] arm: KVM: dirty log read write protect support Mario Smarduch
2014-11-07  7:38   ` Paolo Bonzini
2014-11-07 19:47     ` Mario Smarduch
2014-11-08  7:28       ` Paolo Bonzini
2014-11-07 10:19   ` Marc Zyngier
2014-11-07  0:40 ` [PATCH v13 7/7] arm: KVM: ARMv7 dirty page logging 2nd stage page fault Mario Smarduch
2014-11-07 10:33   ` Marc Zyngier
2014-11-07 19:21     ` Mario Smarduch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).