kvm.vger.kernel.org archive mirror
* [PATCH 00/10 v4] KVM: introduce readonly memslot
@ 2012-07-17 14:39 Xiao Guangrong
  2012-07-17 14:40 ` [PATCH 01/10] KVM: fix missing check for memslot flags Xiao Guangrong
                   ` (9 more replies)
  0 siblings, 10 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:39 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

[-- Attachment #1: Type: text/plain, Size: 653 bytes --]

In the current code, if we map a read-only memory region from host to guest
and the page is not currently mapped in the host, we get a fault-pfn (async
page faults are not allowed for it), and the VM crashes.

Following Avi's suggestion, we introduce a readonly memslot to map ROM/ROMD
into the guest: read access to a readonly memslot works normally, while
write access triggers a KVM_EXIT_MMIO exit.

The attached test case exercises this feature.

Changelog:
- fix the memslot flag check
- fix caching of mmio info in the spte caused by a write to a readonly
  memslot
- extend the mmio-exit info to indicate whether the exit was caused by a
  readonly fault


[-- Attachment #2: migrate-perf.tar.bz2 --]
[-- Type: application/x-bzip, Size: 305078 bytes --]
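
[ Editor's illustration, not part of the series: a minimal userspace sketch of
  how a VMM could register a readonly memslot once this series is applied.
  The ROM address, size, and error handling are simplified and hypothetical;
  KVM_CAP_READONLY_MEM and KVM_MEM_READONLY are the names introduced by these
  patches. ]

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	int vm = ioctl(kvm, KVM_CREATE_VM, 0);

	/* The capability and flag below are added by this series. */
	if (ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_READONLY_MEM) <= 0) {
		fprintf(stderr, "readonly memslots not supported\n");
		return 1;
	}

	/* 64K of anonymous memory standing in for a ROM image. */
	void *rom = mmap(NULL, 0x10000, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	struct kvm_userspace_memory_region region = {
		.slot            = 1,
		.flags           = KVM_MEM_READONLY,	/* guest writes -> KVM_EXIT_MMIO */
		.guest_phys_addr = 0xffff0000,		/* hypothetical ROM location */
		.memory_size     = 0x10000,
		.userspace_addr  = (unsigned long)rom,
	};

	if (ioctl(vm, KVM_SET_USER_MEMORY_REGION, &region) < 0)
		perror("KVM_SET_USER_MEMORY_REGION");

	return 0;
}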


* [PATCH 01/10] KVM: fix missing check for memslot flags
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
@ 2012-07-17 14:40 ` Xiao Guangrong
  2012-07-17 14:41 ` [PATCH 02/10] KVM: hide KVM_MEMSLOT_INVALID from userspace Xiao Guangrong
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:40 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

Check the flags when a memslot is registered from userspace, as Avi suggested.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 virt/kvm/kvm_main.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 78cf42f..7cb29bb 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -689,6 +689,14 @@ void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new)
 	slots->generation++;
 }

+static int check_memory_region_flags(struct kvm_userspace_memory_region *mem)
+{
+	if (mem->flags & ~KVM_MEM_LOG_DIRTY_PAGES)
+		return -EINVAL;
+
+	return 0;
+}
+
 /*
  * Allocate some memory and give it an address in the guest physical address
  * space.
@@ -709,6 +717,10 @@ int __kvm_set_memory_region(struct kvm *kvm,
 	struct kvm_memory_slot old, new;
 	struct kvm_memslots *slots, *old_memslots;

+	r = check_memory_region_flags(mem);
+	if (r)
+		goto out;
+
 	r = -EINVAL;
 	/* General sanity checks */
 	if (mem->memory_size & (PAGE_SIZE - 1))
-- 
1.7.7.6
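
[ Editor's illustration, not from this patch: a userspace check of the new
  behavior. With the flag validation in place, an undefined flag bit should
  now be rejected with EINVAL; vm_fd and the backing memory are assumed to
  come from the usual KVM setup. ]

#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static void expect_bad_flag_rejected(int vm_fd, void *backing)
{
	struct kvm_userspace_memory_region region = {
		.slot            = 2,
		.flags           = 1UL << 5,	/* no such KVM_MEM_* flag */
		.guest_phys_addr = 0x200000,
		.memory_size     = 0x1000,
		.userspace_addr  = (unsigned long)backing,
	};

	/* With this patch the unknown flag is rejected instead of ignored. */
	assert(ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region) < 0 &&
	       errno == EINVAL);
}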


* [PATCH 02/10] KVM: hide KVM_MEMSLOT_INVALID from userspace
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
  2012-07-17 14:40 ` [PATCH 01/10] KVM: fix missing check for memslot flags Xiao Guangrong
@ 2012-07-17 14:41 ` Xiao Guangrong
  2012-07-17 14:41 ` [PATCH 03/10] KVM: introduce gfn_to_pfn_memslot_atomic Xiao Guangrong
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:41 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

Quoting Avi's comment:
| KVM_MEMSLOT_INVALID is actually an internal symbol, not used by
| userspace.  Please move it to kvm_host.h.

Also, move KVM_MEMSLOT_INVALID to the highest bit

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 include/linux/kvm.h      |    1 -
 include/linux/kvm_host.h |    2 ++
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 2ce09aa..dc3aa2a 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -103,7 +103,6 @@ struct kvm_userspace_memory_region {

 /* for kvm_memory_region::flags */
 #define KVM_MEM_LOG_DIRTY_PAGES  1UL
-#define KVM_MEMSLOT_INVALID      (1UL << 1)

 /* for KVM_IRQ_LINE */
 struct kvm_irq_level {
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 14bfde4..8e6bc51 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -34,6 +34,8 @@
 #define KVM_MMIO_SIZE 8
 #endif

+#define KVM_MEMSLOT_INVALID	(1UL << 31)
+
 /*
  * If we support unaligned MMIO, at most one fragment will be split into two:
  */
-- 
1.7.7.6


* [PATCH 03/10] KVM: introduce gfn_to_pfn_memslot_atomic
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
  2012-07-17 14:40 ` [PATCH 01/10] KVM: fix missing check for memslot flags Xiao Guangrong
  2012-07-17 14:41 ` [PATCH 02/10] KVM: hide KVM_MEMSLOT_INVALID from userspace Xiao Guangrong
@ 2012-07-17 14:41 ` Xiao Guangrong
  2012-07-17 14:42 ` [PATCH 04/10] KVM: introduce gfn_to_hva_read/kvm_read_hva/kvm_read_hva_atomic Xiao Guangrong
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:41 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

It replaces hva_to_pfn_atomic.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/mmu.c       |    5 +----
 include/linux/kvm_host.h |    3 ++-
 virt/kvm/kvm_main.c      |   14 ++++++++------
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4ed543a..13d3c69 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2480,15 +2480,12 @@ static pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
 				     bool no_dirty_log)
 {
 	struct kvm_memory_slot *slot;
-	unsigned long hva;

 	slot = gfn_to_memslot_dirty_bitmap(vcpu, gfn, no_dirty_log);
 	if (!slot)
 		return get_fault_pfn();

-	hva = gfn_to_hva_memslot(slot, gfn);
-
-	return hva_to_pfn_atomic(hva);
+	return gfn_to_pfn_memslot_atomic(slot, gfn);
 }

 static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 8e6bc51..e4815e9 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -420,7 +420,6 @@ void kvm_release_page_dirty(struct page *page);
 void kvm_set_page_dirty(struct page *page);
 void kvm_set_page_accessed(struct page *page);

-pfn_t hva_to_pfn_atomic(unsigned long addr);
 pfn_t gfn_to_pfn_atomic(struct kvm *kvm, gfn_t gfn);
 pfn_t gfn_to_pfn_async(struct kvm *kvm, gfn_t gfn, bool *async,
 		       bool write_fault, bool *writable);
@@ -428,6 +427,8 @@ pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn);
 pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 		      bool *writable);
 pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
+pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn);
+
 void kvm_release_pfn_dirty(pfn_t);
 void kvm_release_pfn_clean(pfn_t pfn);
 void kvm_set_pfn_dirty(pfn_t pfn);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7cb29bb..6c1e746 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1156,12 +1156,6 @@ static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 	return pfn;
 }

-pfn_t hva_to_pfn_atomic(unsigned long addr)
-{
-	return hva_to_pfn(addr, true, NULL, true, NULL);
-}
-EXPORT_SYMBOL_GPL(hva_to_pfn_atomic);
-
 static pfn_t __gfn_to_pfn(struct kvm *kvm, gfn_t gfn, bool atomic, bool *async,
 			  bool write_fault, bool *writable)
 {
@@ -1211,6 +1205,14 @@ pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
 	return hva_to_pfn(addr, false, NULL, true, NULL);
 }

+pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+	unsigned long addr = gfn_to_hva_memslot(slot, gfn);
+
+	return hva_to_pfn(addr, true, NULL, true, NULL);
+}
+EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot_atomic);
+
 int gfn_to_page_many_atomic(struct kvm *kvm, gfn_t gfn, struct page **pages,
 								  int nr_pages)
 {
-- 
1.7.7.6


* [PATCH 04/10] KVM: introduce gfn_to_hva_read/kvm_read_hva/kvm_read_hva_atomic
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (2 preceding siblings ...)
  2012-07-17 14:41 ` [PATCH 03/10] KVM: introduce gfn_to_pfn_memslot_atomic Xiao Guangrong
@ 2012-07-17 14:42 ` Xiao Guangrong
  2012-07-17 14:43 ` [PATCH 05/10] KVM: reorganize hva_to_pfn Xiao Guangrong
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:42 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

This set of functions is only used to read data from host address space. In
a later patch, gfn_to_hva_read will only return a read-only hva, and the
function name is a hint that gfn_to_hva_read pairs with
kvm_read_hva()/kvm_read_hva_atomic().

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 virt/kvm/kvm_main.c |   29 +++++++++++++++++++++++++----
 1 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6c1e746..a89c7b8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1048,6 +1048,27 @@ unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn)
 }
 EXPORT_SYMBOL_GPL(gfn_to_hva);

+/*
+ * The hva returned by this function is only allowed to be read.
+ * It should pair with kvm_read_hva() or kvm_read_hva_atomic().
+ */
+static unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn)
+{
+	return gfn_to_hva_many(gfn_to_memslot(kvm, gfn), gfn, NULL);
+}
+
+static int kvm_read_hva(void *data, void __user *hva, int len)
+{
+	return __copy_from_user(data, hva, len);
+
+}
+
+static int kvm_read_hva_atomic(void *data, void __user *hva, int len)
+{
+	return __copy_from_user_inatomic(data, hva, len);
+
+}
+
 pfn_t get_fault_pfn(void)
 {
 	get_page(fault_page);
@@ -1316,10 +1337,10 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 	int r;
 	unsigned long addr;

-	addr = gfn_to_hva(kvm, gfn);
+	addr = gfn_to_hva_read(kvm, gfn);
 	if (kvm_is_error_hva(addr))
 		return -EFAULT;
-	r = __copy_from_user(data, (void __user *)addr + offset, len);
+	r = kvm_read_hva(data, (void __user *)addr + offset, len);
 	if (r)
 		return -EFAULT;
 	return 0;
@@ -1354,11 +1375,11 @@ int kvm_read_guest_atomic(struct kvm *kvm, gpa_t gpa, void *data,
 	gfn_t gfn = gpa >> PAGE_SHIFT;
 	int offset = offset_in_page(gpa);

-	addr = gfn_to_hva(kvm, gfn);
+	addr = gfn_to_hva_read(kvm, gfn);
 	if (kvm_is_error_hva(addr))
 		return -EFAULT;
 	pagefault_disable();
-	r = __copy_from_user_inatomic(data, (void __user *)addr + offset, len);
+	r = kvm_read_hva_atomic(data, (void __user *)addr + offset, len);
 	pagefault_enable();
 	if (r)
 		return -EFAULT;
-- 
1.7.7.6


* [PATCH 05/10] KVM: reorganize hva_to_pfn
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (3 preceding siblings ...)
  2012-07-17 14:42 ` [PATCH 04/10] KVM: introduce gfn_to_hva_read/kvm_read_hva/kvm_read_hva_atomic Xiao Guangrong
@ 2012-07-17 14:43 ` Xiao Guangrong
  2012-07-17 14:43 ` [PATCH 06/10] KVM: use 'writable' as a hint to map writable pfn Xiao Guangrong
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:43 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

hva_to_pfn does too many things; this patch reorganizes the code to make it
more readable.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 virt/kvm/kvm_main.c |  161 +++++++++++++++++++++++++++++++--------------------
 1 files changed, 99 insertions(+), 62 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index a89c7b8..54fb47b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1096,84 +1096,121 @@ static inline int check_user_page_hwpoison(unsigned long addr)
 	return rc == -EHWPOISON;
 }

-static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
-			bool write_fault, bool *writable)
+/*
+ * The atomic path to get the writable pfn which will be stored in @pfn,
+ * true indicates success, otherwise false is returned.
+ */
+static bool hva_to_pfn_fast(unsigned long addr, bool atomic, bool *async,
+			    bool write_fault, bool *writable, pfn_t *pfn)
 {
 	struct page *page[1];
-	int npages = 0;
-	pfn_t pfn;
+	int npages;

-	/* we can do it either atomically or asynchronously, not both */
-	BUG_ON(atomic && async);
+	if (!(async || atomic))
+		return false;

-	BUG_ON(!write_fault && !writable);
+	npages = __get_user_pages_fast(addr, 1, 1, page);
+	if (npages == 1) {
+		*pfn = page_to_pfn(page[0]);
+
+		if (writable)
+			*writable = true;
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * The slow path to get the pfn of the specified host virtual address,
+ * 1 indicates success, -errno is returned if error is detected.
+ */
+static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
+			   bool *writable, pfn_t *pfn)
+{
+	struct page *page[1];
+	int npages = 0;
+
+	might_sleep();

 	if (writable)
-		*writable = true;
+		*writable = write_fault;

-	if (atomic || async)
-		npages = __get_user_pages_fast(addr, 1, 1, page);
+	if (async) {
+		down_read(&current->mm->mmap_sem);
+		npages = get_user_page_nowait(current, current->mm,
+					      addr, write_fault, page);
+		up_read(&current->mm->mmap_sem);
+	} else
+		npages = get_user_pages_fast(addr, 1, write_fault,
+					     page);

-	if (unlikely(npages != 1) && !atomic) {
-		might_sleep();
+	if (npages != 1)
+		return npages;

-		if (writable)
-			*writable = write_fault;
-
-		if (async) {
-			down_read(&current->mm->mmap_sem);
-			npages = get_user_page_nowait(current, current->mm,
-						     addr, write_fault, page);
-			up_read(&current->mm->mmap_sem);
-		} else
-			npages = get_user_pages_fast(addr, 1, write_fault,
-						     page);
-
-		/* map read fault as writable if possible */
-		if (unlikely(!write_fault) && npages == 1) {
-			struct page *wpage[1];
-
-			npages = __get_user_pages_fast(addr, 1, 1, wpage);
-			if (npages == 1) {
-				*writable = true;
-				put_page(page[0]);
-				page[0] = wpage[0];
-			}
-			npages = 1;
+	/* map read fault as writable if possible */
+	if (unlikely(!write_fault)) {
+		struct page *wpage[1];
+
+		npages = __get_user_pages_fast(addr, 1, 1, wpage);
+		if (npages == 1) {
+			*writable = true;
+			put_page(page[0]);
+			page[0] = wpage[0];
 		}
+
+		npages = 1;
 	}
+	*pfn = page_to_pfn(page[0]);
+	return npages;
+}

-	if (unlikely(npages != 1)) {
-		struct vm_area_struct *vma;
+static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
+			bool write_fault, bool *writable)
+{
+	struct vm_area_struct *vma;
+	pfn_t pfn = 0;
+	int npages;

-		if (atomic)
-			return get_fault_pfn();
+	/* we can do it either atomically or asynchronously, not both */
+	BUG_ON(atomic && async);

-		down_read(&current->mm->mmap_sem);
-		if (npages == -EHWPOISON ||
-			(!async && check_user_page_hwpoison(addr))) {
-			up_read(&current->mm->mmap_sem);
-			get_page(hwpoison_page);
-			return page_to_pfn(hwpoison_page);
-		}
+	BUG_ON(!write_fault && !writable);

-		vma = find_vma_intersection(current->mm, addr, addr+1);
-
-		if (vma == NULL)
-			pfn = get_fault_pfn();
-		else if ((vma->vm_flags & VM_PFNMAP)) {
-			pfn = ((addr - vma->vm_start) >> PAGE_SHIFT) +
-				vma->vm_pgoff;
-			BUG_ON(!kvm_is_mmio_pfn(pfn));
-		} else {
-			if (async && (vma->vm_flags & VM_WRITE))
-				*async = true;
-			pfn = get_fault_pfn();
-		}
-		up_read(&current->mm->mmap_sem);
-	} else
-		pfn = page_to_pfn(page[0]);
+	if (hva_to_pfn_fast(addr, atomic, async, write_fault, writable, &pfn))
+		return pfn;

+	if (atomic)
+		return get_fault_pfn();
+
+	npages = hva_to_pfn_slow(addr, async, write_fault, writable, &pfn);
+	if (npages == 1)
+		return pfn;
+
+	down_read(&current->mm->mmap_sem);
+	if (npages == -EHWPOISON ||
+	      (!async && check_user_page_hwpoison(addr))) {
+		get_page(hwpoison_page);
+		pfn = page_to_pfn(hwpoison_page);
+		goto exit;
+	}
+
+	vma = find_vma_intersection(current->mm, addr, addr + 1);
+
+	if (vma == NULL)
+		pfn = get_fault_pfn();
+	else if ((vma->vm_flags & VM_PFNMAP)) {
+		pfn = ((addr - vma->vm_start) >> PAGE_SHIFT) +
+			vma->vm_pgoff;
+		BUG_ON(!kvm_is_mmio_pfn(pfn));
+	} else {
+		if (async && (vma->vm_flags & VM_WRITE))
+			*async = true;
+		pfn = get_fault_pfn();
+	}
+
+exit:
+	up_read(&current->mm->mmap_sem);
 	return pfn;
 }

-- 
1.7.7.6


* [PATCH 06/10] KVM: use 'writable' as a hint to map writable pfn
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (4 preceding siblings ...)
  2012-07-17 14:43 ` [PATCH 05/10] KVM: reorganize hva_to_pfn Xiao Guangrong
@ 2012-07-17 14:43 ` Xiao Guangrong
  2012-07-17 14:44 ` [PATCH 07/10] KVM: introduce readonly_fault_pfn Xiao Guangrong
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:43 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

In the current code, we always map a writable pfn for a read fault. In order
to support readonly memslots, map a writable pfn only if 'writable' is not
NULL.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 virt/kvm/kvm_main.c |   24 +++++++++++++++++++++++-
 1 files changed, 23 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 54fb47b..e9eab07 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1109,6 +1109,14 @@ static bool hva_to_pfn_fast(unsigned long addr, bool atomic, bool *async,
 	if (!(async || atomic))
 		return false;

+	/*
+	 * Fast pin a writable pfn only if it is a write fault request
+	 * or the caller allows to map a writable pfn for a read fault
+	 * request.
+	 */
+	if (!(write_fault || writable))
+		return false;
+
 	npages = __get_user_pages_fast(addr, 1, 1, page);
 	if (npages == 1) {
 		*pfn = page_to_pfn(page[0]);
@@ -1149,7 +1157,7 @@ static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
 		return npages;

 	/* map read fault as writable if possible */
-	if (unlikely(!write_fault)) {
+	if (unlikely(!write_fault) && writable) {
 		struct page *wpage[1];

 		npages = __get_user_pages_fast(addr, 1, 1, wpage);
@@ -1165,6 +1173,20 @@ static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
 	return npages;
 }

+/*
+ * Pin guest page in memory and return its pfn.
+ * @addr: host virtual address which maps memory to the guest
+ * @atomic: whether this function can sleep
+ * @async: whether this function need to wait IO complete if the
+ *         host page is not in the memory
+ * @write_fault: whether we should get a writable host page
+ * @writable: whether it allows to map a writable host page for !@write_fault
+ *
+ * The function will map a writable host page for these two cases:
+ * 1): @write_fault = true
+ * 2): @write_fault = false && @writable, @writable will tell the caller
+ *     whether the mapping is writable.
+ */
 static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 			bool write_fault, bool *writable)
 {
-- 
1.7.7.6



* [PATCH 07/10] KVM: introduce readonly_fault_pfn
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (5 preceding siblings ...)
  2012-07-17 14:43 ` [PATCH 06/10] KVM: use 'writable' as a hint to map writable pfn Xiao Guangrong
@ 2012-07-17 14:44 ` Xiao Guangrong
  2012-07-19 10:15   ` Avi Kivity
  2012-07-17 14:45 ` [PATCH 08/10] KVM: introduce readonly_bad_hva Xiao Guangrong
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:44 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

Introduce readonly_fault_pfn. In a later patch, it indicates failure when we
try to get a writable pfn from a readonly memslot.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 include/linux/kvm_host.h |    1 +
 virt/kvm/kvm_main.c      |   92 +++++++++++++++++++++++++++------------------
 2 files changed, 56 insertions(+), 37 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e4815e9..a2302e7 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -385,6 +385,7 @@ extern struct page *bad_page;
 int is_error_page(struct page *page);
 int is_error_pfn(pfn_t pfn);
 int is_hwpoison_pfn(pfn_t pfn);
+int is_readonly_fault_pfn(pfn_t pfn);
 int is_noslot_pfn(pfn_t pfn);
 int is_invalid_pfn(pfn_t pfn);
 int kvm_is_error_hva(unsigned long addr);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e9eab07..b70f1a4 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -109,6 +109,9 @@ static pfn_t hwpoison_pfn;
 static struct page *fault_page;
 static pfn_t fault_pfn;

+static struct page *readonly_fault_page;
+static pfn_t readonly_fault_pfn;
+
 inline int kvm_is_mmio_pfn(pfn_t pfn)
 {
 	if (pfn_valid(pfn)) {
@@ -949,13 +952,15 @@ EXPORT_SYMBOL_GPL(kvm_disable_largepages);

 int is_error_page(struct page *page)
 {
-	return page == bad_page || page == hwpoison_page || page == fault_page;
+	return page == bad_page || page == hwpoison_page || page == fault_page
+		|| page == readonly_fault_page;
 }
 EXPORT_SYMBOL_GPL(is_error_page);

 int is_error_pfn(pfn_t pfn)
 {
-	return pfn == bad_pfn || pfn == hwpoison_pfn || pfn == fault_pfn;
+	return pfn == bad_pfn || pfn == hwpoison_pfn || pfn == fault_pfn
+		|| pfn == readonly_fault_pfn;
 }
 EXPORT_SYMBOL_GPL(is_error_pfn);

@@ -965,6 +970,12 @@ int is_hwpoison_pfn(pfn_t pfn)
 }
 EXPORT_SYMBOL_GPL(is_hwpoison_pfn);

+int is_readonly_fault_pfn(pfn_t pfn)
+{
+	return pfn == readonly_fault_pfn;
+}
+EXPORT_SYMBOL_GPL(is_readonly_fault_pfn);
+
 int is_noslot_pfn(pfn_t pfn)
 {
 	return pfn == bad_pfn;
@@ -973,7 +984,8 @@ EXPORT_SYMBOL_GPL(is_noslot_pfn);

 int is_invalid_pfn(pfn_t pfn)
 {
-	return pfn == hwpoison_pfn || pfn == fault_pfn;
+	return pfn == hwpoison_pfn || pfn == fault_pfn ||
+			pfn == readonly_fault_pfn;
 }
 EXPORT_SYMBOL_GPL(is_invalid_pfn);

@@ -1076,6 +1088,12 @@ pfn_t get_fault_pfn(void)
 }
 EXPORT_SYMBOL_GPL(get_fault_pfn);

+static pfn_t get_readonly_fault_pfn(void)
+{
+	get_page(readonly_fault_page);
+	return readonly_fault_pfn;
+}
+
 int get_user_page_nowait(struct task_struct *tsk, struct mm_struct *mm,
 	unsigned long start, int write, struct page **page)
 {
@@ -2809,42 +2827,49 @@ static void kvm_sched_out(struct preempt_notifier *pn,
 	kvm_arch_vcpu_put(vcpu);
 }

-int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
-		  struct module *module)
+static void kvm_uninit_dummy_pages(void)
 {
-	int r;
-	int cpu;
-
-	r = kvm_arch_init(opaque);
-	if (r)
-		goto out_fail;
+	if (fault_page)
+		__free_page(fault_page);
+	if (readonly_fault_page)
+		__free_page(readonly_fault_page);
+	if (hwpoison_page)
+		__free_page(hwpoison_page);
+	if (bad_page)
+		__free_page(bad_page);
+}

+static int kvm_init_dummy_pages(void)
+{
 	bad_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	hwpoison_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	fault_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	readonly_fault_page = alloc_page(GFP_KERNEL | __GFP_ZERO);

-	if (bad_page == NULL) {
-		r = -ENOMEM;
-		goto out;
-	}
+	if (!bad_page || !hwpoison_page || !fault_page || !readonly_fault_page)
+		return -ENOMEM;

 	bad_pfn = page_to_pfn(bad_page);
+	hwpoison_pfn = page_to_pfn(hwpoison_page);
+	fault_pfn = page_to_pfn(fault_page);
+	readonly_fault_pfn = page_to_pfn(readonly_fault_page);

-	hwpoison_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
-
-	if (hwpoison_page == NULL) {
-		r = -ENOMEM;
-		goto out_free_0;
-	}
+	return 0;
+}

-	hwpoison_pfn = page_to_pfn(hwpoison_page);
+int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
+		  struct module *module)
+{
+	int r;
+	int cpu;

-	fault_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	r = kvm_arch_init(opaque);
+	if (r)
+		goto out_fail;

-	if (fault_page == NULL) {
-		r = -ENOMEM;
+	r = kvm_init_dummy_pages();
+	if (r)
 		goto out_free_0;
-	}
-
-	fault_pfn = page_to_pfn(fault_page);

 	if (!zalloc_cpumask_var(&cpus_hardware_enabled, GFP_KERNEL)) {
 		r = -ENOMEM;
@@ -2920,12 +2945,7 @@ out_free_1:
 out_free_0a:
 	free_cpumask_var(cpus_hardware_enabled);
 out_free_0:
-	if (fault_page)
-		__free_page(fault_page);
-	if (hwpoison_page)
-		__free_page(hwpoison_page);
-	__free_page(bad_page);
-out:
+	kvm_uninit_dummy_pages();
 	kvm_arch_exit();
 out_fail:
 	return r;
@@ -2945,8 +2965,6 @@ void kvm_exit(void)
 	kvm_arch_hardware_unsetup();
 	kvm_arch_exit();
 	free_cpumask_var(cpus_hardware_enabled);
-	__free_page(fault_page);
-	__free_page(hwpoison_page);
-	__free_page(bad_page);
+	kvm_uninit_dummy_pages();
 }
 EXPORT_SYMBOL_GPL(kvm_exit);
-- 
1.7.7.6


* [PATCH 08/10] KVM: introduce readonly_bad_hva
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (6 preceding siblings ...)
  2012-07-17 14:44 ` [PATCH 07/10] KVM: introduce readonly_fault_pfn Xiao Guangrong
@ 2012-07-17 14:45 ` Xiao Guangrong
  2012-07-19 10:16   ` Avi Kivity
  2012-07-17 14:45 ` [PATCH 09/10] KVM: introduce readonly memslot Xiao Guangrong
  2012-07-17 14:46 ` [PATCH 10/10] KVM: indicate readonly access fault Xiao Guangrong
  9 siblings, 1 reply; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:45 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

Introduce readonly_bad_hva. In a later patch, it indicates failure when we
try to get a writable hva from a readonly memslot.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 virt/kvm/kvm_main.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b70f1a4..c056736 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -994,9 +994,19 @@ static inline unsigned long bad_hva(void)
 	return PAGE_OFFSET;
 }

+static inline unsigned long readonly_bad_hva(void)
+{
+	return PAGE_OFFSET + PAGE_SIZE;
+}
+
+static int kvm_is_readonly_bad_hva(unsigned long addr)
+{
+	return addr == readonly_bad_hva();
+}
+
 int kvm_is_error_hva(unsigned long addr)
 {
-	return addr == bad_hva();
+	return addr == bad_hva() || kvm_is_readonly_bad_hva(addr);
 }
 EXPORT_SYMBOL_GPL(kvm_is_error_hva);

-- 
1.7.7.6


* [PATCH 09/10] KVM: introduce readonly memslot
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (7 preceding siblings ...)
  2012-07-17 14:45 ` [PATCH 08/10] KVM: introduce readonly_bad_hva Xiao Guangrong
@ 2012-07-17 14:45 ` Xiao Guangrong
  2012-07-17 14:46 ` [PATCH 10/10] KVM: indicate readonly access fault Xiao Guangrong
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:45 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

In the current code, if we map a read-only memory region from host to guest
and the page is not currently mapped in the host, we get a fault-pfn (async
page faults are not allowed for it), and the VM crashes.

We introduce a readonly memslot to map ROM/ROMD into the guest: read access
to a readonly memslot works normally, while write access triggers a
KVM_EXIT_MMIO exit.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 Documentation/virtual/kvm/api.txt |   10 +++-
 arch/x86/include/asm/kvm.h        |    1 +
 arch/x86/kvm/mmu.c                |   10 ++++
 arch/x86/kvm/x86.c                |    1 +
 include/linux/kvm.h               |    6 ++-
 virt/kvm/kvm_main.c               |   84 ++++++++++++++++++++++++++++--------
 6 files changed, 89 insertions(+), 23 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 310fe50..4b3d3f1 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -857,7 +857,8 @@ struct kvm_userspace_memory_region {
 };

 /* for kvm_memory_region::flags */
-#define KVM_MEM_LOG_DIRTY_PAGES  1UL
+#define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
+#define KVM_MEM_READONLY	(1UL << 1)

 This ioctl allows the user to create or modify a guest physical memory
 slot.  When changing an existing slot, it may be moved in the guest
@@ -873,9 +874,12 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
 be identical.  This allows large pages in the guest to be backed by large
 pages in the host.

-The flags field supports just one flag, KVM_MEM_LOG_DIRTY_PAGES, which
+The flags field supports two flag, KVM_MEM_LOG_DIRTY_PAGES, which
 instructs kvm to keep track of writes to memory within the slot.  See
-the KVM_GET_DIRTY_LOG ioctl.
+the KVM_GET_DIRTY_LOG ioctl. Another flag is KVM_MEM_READONLY when the
+KVM_CAP_READONLY_MEM capability, it indicates the guest memory is read-only,
+that means, guest is only allowed to read it. Writes will be posted to
+userspace as KVM_EXIT_MMIO exits.

 When the KVM_CAP_SYNC_MMU capability, changes in the backing of the memory
 region are automatically reflected into the guest.  For example, an mmap()
diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
index 246617e..521bf25 100644
--- a/arch/x86/include/asm/kvm.h
+++ b/arch/x86/include/asm/kvm.h
@@ -25,6 +25,7 @@
 #define __KVM_HAVE_DEBUGREGS
 #define __KVM_HAVE_XSAVE
 #define __KVM_HAVE_XCRS
+#define __KVM_HAVE_READONLY_MEM

 /* Architectural interrupt line count. */
 #define KVM_NR_INTERRUPTS 256
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 13d3c69..d4eee8e 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2618,6 +2618,16 @@ static void kvm_send_hwpoison_signal(unsigned long address, struct task_struct *
 static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, pfn_t pfn)
 {
 	kvm_release_pfn_clean(pfn);
+
+	/*
+	 * Do not cache the mmio info caused by writing the readonly gfn
+	 * into the spte otherwise read access on readonly gfn also can
+	 * caused mmio page fault and treat it as mmio access.
+	 * Return 1 to tell kvm to emulate it.
+	 */
+	if (is_readonly_fault_pfn(pfn))
+		return 1;
+
 	if (is_hwpoison_pfn(pfn)) {
 		kvm_send_hwpoison_signal(gfn_to_hva(vcpu->kvm, gfn), current);
 		return 0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8171836..46e13a1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2153,6 +2153,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_GET_TSC_KHZ:
 	case KVM_CAP_PCI_2_3:
 	case KVM_CAP_KVMCLOCK_CTRL:
+	case KVM_CAP_READONLY_MEM:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index dc3aa2a..94867d0 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -102,7 +102,8 @@ struct kvm_userspace_memory_region {
 };

 /* for kvm_memory_region::flags */
-#define KVM_MEM_LOG_DIRTY_PAGES  1UL
+#define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
+#define KVM_MEM_READONLY	(1UL << 1)

 /* for KVM_IRQ_LINE */
 struct kvm_irq_level {
@@ -617,6 +618,9 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_PPC_GET_SMMU_INFO 78
 #define KVM_CAP_S390_COW 79
 #define KVM_CAP_PPC_ALLOC_HTAB 80
+#ifdef __KVM_HAVE_READONLY_MEM
+#define KVM_CAP_READONLY_MEM 81
+#endif

 #ifdef KVM_CAP_IRQ_ROUTING

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c056736..50e18c0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -694,7 +694,13 @@ void update_memslots(struct kvm_memslots *slots, struct kvm_memory_slot *new)

 static int check_memory_region_flags(struct kvm_userspace_memory_region *mem)
 {
-	if (mem->flags & ~KVM_MEM_LOG_DIRTY_PAGES)
+	u32 valid_flags = KVM_MEM_LOG_DIRTY_PAGES;
+
+#ifdef KVM_CAP_READONLY_MEM
+	valid_flags |= KVM_MEM_READONLY;
+#endif
+
+	if (mem->flags & ~valid_flags)
 		return -EINVAL;

 	return 0;
@@ -1052,18 +1058,32 @@ out:
 	return size;
 }

-static unsigned long gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
-				     gfn_t *nr_pages)
+static bool memslot_is_readonly(struct kvm_memory_slot *slot)
+{
+	return slot->flags & KVM_MEM_READONLY;
+}
+
+static unsigned long __gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
+				     gfn_t *nr_pages, bool write)
 {
 	if (!slot || slot->flags & KVM_MEMSLOT_INVALID)
 		return bad_hva();

+	if (memslot_is_readonly(slot) && write)
+		return readonly_bad_hva();
+
 	if (nr_pages)
 		*nr_pages = slot->npages - (gfn - slot->base_gfn);

 	return gfn_to_hva_memslot(slot, gfn);
 }

+static unsigned long gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
+				     gfn_t *nr_pages)
+{
+	return __gfn_to_hva_many(slot, gfn, nr_pages, true);
+}
+
 unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn)
 {
 	return gfn_to_hva_many(gfn_to_memslot(kvm, gfn), gfn, NULL);
@@ -1076,7 +1096,7 @@ EXPORT_SYMBOL_GPL(gfn_to_hva);
  */
 static unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn)
 {
-	return gfn_to_hva_many(gfn_to_memslot(kvm, gfn), gfn, NULL);
+	return __gfn_to_hva_many(gfn_to_memslot(kvm, gfn), gfn, NULL, false);
 }

 static int kvm_read_hva(void *data, void __user *hva, int len)
@@ -1201,6 +1221,17 @@ static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
 	return npages;
 }

+static bool vma_is_valid(struct vm_area_struct *vma, bool write_fault)
+{
+	if (unlikely(!(vma->vm_flags & VM_READ)))
+		return false;
+
+	if (write_fault && (unlikely(!(vma->vm_flags & VM_WRITE))))
+		return false;
+
+	return true;
+}
+
 /*
  * Pin guest page in memory and return its pfn.
  * @addr: host virtual address which maps memory to the guest
@@ -1225,8 +1256,6 @@ static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 	/* we can do it either atomically or asynchronously, not both */
 	BUG_ON(atomic && async);

-	BUG_ON(!write_fault && !writable);
-
 	if (hva_to_pfn_fast(addr, atomic, async, write_fault, writable, &pfn))
 		return pfn;

@@ -1254,7 +1283,7 @@ static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 			vma->vm_pgoff;
 		BUG_ON(!kvm_is_mmio_pfn(pfn));
 	} else {
-		if (async && (vma->vm_flags & VM_WRITE))
+		if (async && vma_is_valid(vma, write_fault))
 			*async = true;
 		pfn = get_fault_pfn();
 	}
@@ -1264,21 +1293,41 @@ exit:
 	return pfn;
 }

-static pfn_t __gfn_to_pfn(struct kvm *kvm, gfn_t gfn, bool atomic, bool *async,
-			  bool write_fault, bool *writable)
+static pfn_t
+__gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn, bool atomic,
+		     bool *async, bool write_fault, bool *writable)
 {
-	unsigned long addr;
+	unsigned long addr = __gfn_to_hva_many(slot, gfn, NULL, write_fault);

-	if (async)
-		*async = false;
+	if (kvm_is_readonly_bad_hva(addr))
+		return get_readonly_fault_pfn();

-	addr = gfn_to_hva(kvm, gfn);
 	if (kvm_is_error_hva(addr)) {
 		get_page(bad_page);
 		return page_to_pfn(bad_page);
 	}

-	return hva_to_pfn(addr, atomic, async, write_fault, writable);
+	/* Do not map writable pfn in the readonly memslot. */
+	if (writable && memslot_is_readonly(slot))
+		writable = NULL;
+
+	return hva_to_pfn(addr, atomic, async, write_fault,
+			  writable);
+}
+
+
+static pfn_t __gfn_to_pfn(struct kvm *kvm, gfn_t gfn, bool atomic, bool *async,
+			  bool write_fault, bool *writable)
+{
+	struct kvm_memory_slot *slot;
+
+	if (async)
+		*async = false;
+
+	slot = gfn_to_memslot(kvm, gfn);
+
+	return __gfn_to_pfn_memslot(slot, gfn, atomic, async, write_fault,
+				    writable);
 }

 pfn_t gfn_to_pfn_atomic(struct kvm *kvm, gfn_t gfn)
@@ -1309,15 +1358,12 @@ EXPORT_SYMBOL_GPL(gfn_to_pfn_prot);

 pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
 {
-	unsigned long addr = gfn_to_hva_memslot(slot, gfn);
-	return hva_to_pfn(addr, false, NULL, true, NULL);
+	return __gfn_to_pfn_memslot(slot, gfn, false, NULL, true, NULL);
 }

 pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn)
 {
-	unsigned long addr = gfn_to_hva_memslot(slot, gfn);
-
-	return hva_to_pfn(addr, true, NULL, true, NULL);
+	return __gfn_to_pfn_memslot(slot, gfn, true, NULL, true, NULL);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot_atomic);

-- 
1.7.7.6
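
[ Editor's illustration, not part of the patch: a rough userspace sketch of
  how guest writes to the readonly slot surface as ordinary MMIO exits. It
  assumes the vcpu fd and mmap()ed kvm_run area were set up in the usual way,
  and that the ROM was registered at the hypothetical 0xffff0000 base. ]

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#define ROM_BASE 0xffff0000UL	/* hypothetical readonly slot base */

static void run_guest(int vcpu_fd, struct kvm_run *run, uint8_t *rom_shadow)
{
	for (;;) {
		if (ioctl(vcpu_fd, KVM_RUN, 0) < 0) {
			perror("KVM_RUN");
			return;
		}

		switch (run->exit_reason) {
		case KVM_EXIT_MMIO:
			/*
			 * A guest write to the readonly slot arrives here like
			 * any other MMIO write; userspace decides how to handle
			 * it, e.g. emulating a flash/ROMD command sequence.
			 */
			if (run->mmio.is_write)
				memcpy(rom_shadow + (run->mmio.phys_addr - ROM_BASE),
				       run->mmio.data, run->mmio.len);
			break;
		default:
			fprintf(stderr, "unhandled exit %u\n", run->exit_reason);
			return;
		}
	}
}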



* [PATCH 10/10] KVM: indicate readonly access fault
  2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
                   ` (8 preceding siblings ...)
  2012-07-17 14:45 ` [PATCH 09/10] KVM: introduce readonly memslot Xiao Guangrong
@ 2012-07-17 14:46 ` Xiao Guangrong
  9 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-17 14:46 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Avi Kivity, Marcelo Tosatti, LKML, KVM

Introduce write_readonly_mem in the mmio exit info to indicate that the exit
was caused by a write access to a readonly memslot.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/x86.c       |   12 ++++++++----
 include/linux/kvm.h      |    3 +++
 include/linux/kvm_host.h |    1 +
 virt/kvm/kvm_main.c      |    3 +++
 4 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 46e13a1..cc5f8f0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3703,9 +3703,10 @@ int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,

 	ret = kvm_write_guest(vcpu->kvm, gpa, val, bytes);
 	if (ret < 0)
-		return 0;
+		return ret;
+
 	kvm_mmu_pte_write(vcpu, gpa, val, bytes);
-	return 1;
+	return 0;
 }

 struct read_write_emulator_ops {
@@ -3735,7 +3736,7 @@ static int read_prepare(struct kvm_vcpu *vcpu, void *val, int bytes)
 static int read_emulate(struct kvm_vcpu *vcpu, gpa_t gpa,
 			void *val, int bytes)
 {
-	return !kvm_read_guest(vcpu->kvm, gpa, val, bytes);
+	return kvm_read_guest(vcpu->kvm, gpa, val, bytes);
 }

 static int write_emulate(struct kvm_vcpu *vcpu, gpa_t gpa,
@@ -3797,7 +3798,8 @@ static int emulator_read_write_onepage(unsigned long addr, void *val,
 	if (ret)
 		goto mmio;

-	if (ops->read_write_emulate(vcpu, gpa, val, bytes))
+	ret = ops->read_write_emulate(vcpu, gpa, val, bytes);
+	if (!ret)
 		return X86EMUL_CONTINUE;

 mmio:
@@ -3819,6 +3821,7 @@ mmio:
 		frag->gpa = gpa;
 		frag->data = val;
 		frag->len = now;
+		frag->write_readonly_mem = (ret == -EPERM);

 		gpa += now;
 		val += now;
@@ -3836,6 +3839,7 @@ static void set_mmio_exit_info(struct kvm_vcpu *vcpu,
 	run->mmio.phys_addr = frag->gpa;
 	run->mmio.len = frag->len;
 	run->mmio.is_write = vcpu->mmio_is_write = write;
+	run->mmio.write_readonly_mem = frag->write_readonly_mem;

 	if (write)
 		memcpy(run->mmio.data, frag->data, frag->len);
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 94867d0..9261541 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -222,6 +222,9 @@ struct kvm_run {
 			__u8  data[8];
 			__u32 len;
 			__u8  is_write;
+#ifdef __KVM_HAVE_READONLY_MEM
+			__u8 write_readonly_mem;
+#endif
 		} mmio;
 		/* KVM_EXIT_HYPERCALL */
 		struct {
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a2302e7..7e45014 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -142,6 +142,7 @@ struct kvm_mmio_fragment {
 	gpa_t gpa;
 	void *data;
 	unsigned len;
+	bool write_readonly_mem;
 };

 struct kvm_vcpu {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 50e18c0..8d4bc55 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1527,6 +1527,9 @@ int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data,
 	unsigned long addr;

 	addr = gfn_to_hva(kvm, gfn);
+	if (kvm_is_readonly_bad_hva(addr))
+		return -EPERM;
+
 	if (kvm_is_error_hva(addr))
 		return -EFAULT;
 	r = __copy_to_user((void __user *)addr + offset, data, len);
-- 
1.7.7.6
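
[ Editor's illustration, not part of the patch: a short sketch of how a VMM's
  MMIO-exit handler might consume the new field. write_readonly_mem is only
  present when __KVM_HAVE_READONLY_MEM is defined, and the device-dispatch
  part is left as a placeholder. ]

#include <stdio.h>
#include <linux/kvm.h>

static void handle_mmio_exit(struct kvm_run *run)
{
#ifdef __KVM_HAVE_READONLY_MEM
	/* Writes that hit a readonly memslot are flagged by this patch. */
	if (run->mmio.is_write && run->mmio.write_readonly_mem) {
		fprintf(stderr, "guest wrote %u bytes to ROM at 0x%llx\n",
			run->mmio.len,
			(unsigned long long)run->mmio.phys_addr);
		return;
	}
#endif
	/* ... ordinary MMIO reads/writes go to the device models ... */
}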



* Re: [PATCH 07/10] KVM: introduce readonly_fault_pfn
  2012-07-17 14:44 ` [PATCH 07/10] KVM: introduce readonly_fault_pfn Xiao Guangrong
@ 2012-07-19 10:15   ` Avi Kivity
  2012-07-20  2:56     ` Xiao Guangrong
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2012-07-19 10:15 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Marcelo Tosatti, LKML, KVM

On 07/17/2012 05:44 PM, Xiao Guangrong wrote:
> Introduce readonly_fault_pfn, in the later patch, it indicates failure
> when we try to get a writable pfn from the readonly memslot
> 
> +
>  inline int kvm_is_mmio_pfn(pfn_t pfn)
>  {
>  	if (pfn_valid(pfn)) {
> @@ -949,13 +952,15 @@ EXPORT_SYMBOL_GPL(kvm_disable_largepages);
> 
>  int is_error_page(struct page *page)
>  {
> -	return page == bad_page || page == hwpoison_page || page == fault_page;
> +	return page == bad_page || page == hwpoison_page || page == fault_page
> +		|| page == readonly_fault_page;

All those checks are slow, and get_page(fault_page) etc. isn't very
scalable.

We should move to ERR_PTR() etc.  We could use ENOENT, EHWPOISON,
EFAULT, and EROFS for the above, or maybe there are better matches.

>  }
>  EXPORT_SYMBOL_GPL(is_error_page);
> 
>  int is_error_pfn(pfn_t pfn)
>  {
> -	return pfn == bad_pfn || pfn == hwpoison_pfn || pfn == fault_pfn;
> +	return pfn == bad_pfn || pfn == hwpoison_pfn || pfn == fault_pfn
> +		|| pfn == readonly_fault_pfn;
>  }

And a similar change here.
-- 
error compiling committee.c: too many arguments to function
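
[ Editor's note for readers unfamiliar with the idiom Avi mentions: a
  userspace-compilable sketch mirroring the kernel's ERR_PTR()/IS_ERR()/
  PTR_ERR() helpers from include/linux/err.h. Illustration only, not the
  eventual KVM code. ]

#include <assert.h>
#include <errno.h>

#define MAX_ERRNO	4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

int main(void)
{
	/* e.g. a "readonly fault" result encoded as an errno value */
	void *page = ERR_PTR(-EROFS);

	assert(IS_ERR(page) && PTR_ERR(page) == -EROFS);
	return 0;
}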


* Re: [PATCH 08/10] KVM: introduce readonly_bad_hva
  2012-07-17 14:45 ` [PATCH 08/10] KVM: introduce readonly_bad_hva Xiao Guangrong
@ 2012-07-19 10:16   ` Avi Kivity
  2012-07-20  3:01     ` Xiao Guangrong
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2012-07-19 10:16 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Marcelo Tosatti, LKML, KVM

On 07/17/2012 05:45 PM, Xiao Guangrong wrote:
> In the later patch, it indicates failure when we try to get a writable
> hva from the readonly slot
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> ---
>  virt/kvm/kvm_main.c |   12 +++++++++++-
>  1 files changed, 11 insertions(+), 1 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index b70f1a4..c056736 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -994,9 +994,19 @@ static inline unsigned long bad_hva(void)
>  	return PAGE_OFFSET;
>  }
> 
> +static inline unsigned long readonly_bad_hva(void)
> +{
> +	return PAGE_OFFSET + PAGE_SIZE;
> +}
> +
> +static int kvm_is_readonly_bad_hva(unsigned long addr)
> +{
> +	return addr == readonly_bad_hva();
> +}
> +
>  int kvm_is_error_hva(unsigned long addr)
>  {
> -	return addr == bad_hva();
> +	return addr == bad_hva() || kvm_is_readonly_bad_hva(addr);
>  }

addr >= PAGE_OFFSET.  Or change it to use -E*.



-- 
error compiling committee.c: too many arguments to function


* Re: [PATCH 07/10] KVM: introduce readonly_fault_pfn
  2012-07-19 10:15   ` Avi Kivity
@ 2012-07-20  2:56     ` Xiao Guangrong
  0 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-20  2:56 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

On 07/19/2012 06:15 PM, Avi Kivity wrote:
> On 07/17/2012 05:44 PM, Xiao Guangrong wrote:
>> Introduce readonly_fault_pfn, in the later patch, it indicates failure
>> when we try to get a writable pfn from the readonly memslot
>>
>> +
>>  inline int kvm_is_mmio_pfn(pfn_t pfn)
>>  {
>>  	if (pfn_valid(pfn)) {
>> @@ -949,13 +952,15 @@ EXPORT_SYMBOL_GPL(kvm_disable_largepages);
>>
>>  int is_error_page(struct page *page)
>>  {
>> -	return page == bad_page || page == hwpoison_page || page == fault_page;
>> +	return page == bad_page || page == hwpoison_page || page == fault_page
>> +		|| page == readonly_fault_page;
> 
> All those checks are slow, and get_page(fault_page) etc. isn't very
> scalable.
> 
> We should move to ERR_PTR() etc.  We could use ENOENT, EHWPOISON,
> EFAULT, and EROFS for the above, or maybe there are better matches.
> 

Good point. I will do it in the next version, thank you, Avi!


* Re: [PATCH 08/10] KVM: introduce readonly_bad_hva
  2012-07-19 10:16   ` Avi Kivity
@ 2012-07-20  3:01     ` Xiao Guangrong
  0 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2012-07-20  3:01 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, LKML, KVM

On 07/19/2012 06:16 PM, Avi Kivity wrote:
> On 07/17/2012 05:45 PM, Xiao Guangrong wrote:
>> In the later patch, it indicates failure when we try to get a writable
>> hva from the readonly slot
>>
>> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
>> ---
>>  virt/kvm/kvm_main.c |   12 +++++++++++-
>>  1 files changed, 11 insertions(+), 1 deletions(-)
>>
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index b70f1a4..c056736 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -994,9 +994,19 @@ static inline unsigned long bad_hva(void)
>>  	return PAGE_OFFSET;
>>  }
>>
>> +static inline unsigned long readonly_bad_hva(void)
>> +{
>> +	return PAGE_OFFSET + PAGE_SIZE;
>> +}
>> +
>> +static int kvm_is_readonly_bad_hva(unsigned long addr)
>> +{
>> +	return addr == readonly_bad_hva();
>> +}
>> +
>>  int kvm_is_error_hva(unsigned long addr)
>>  {
>> -	return addr == bad_hva();
>> +	return addr == bad_hva() || kvm_is_readonly_bad_hva(addr);
>>  }
> 
> addr >= PAGE_OFFSET.  Or change it to use -E*.

I prefer the first one, addr >= PAGE_OFFSET: all virtual addresses between 0
and ~0ULL are valid, so using PAGE_OFFSET is more readable.

[ is_error_pfn can use -err because the range of physical addresses is always
  limited, for example, 0 ~ 64G on x86. ]


Thread overview: 15+ messages
2012-07-17 14:39 [PATCH 00/10 v4] KVM: introduce readonly memslot Xiao Guangrong
2012-07-17 14:40 ` [PATCH 01/10] KVM: fix missing check for memslot flags Xiao Guangrong
2012-07-17 14:41 ` [PATCH 02/10] KVM: hide KVM_MEMSLOT_INVALID from userspace Xiao Guangrong
2012-07-17 14:41 ` [PATCH 03/10] KVM: introduce gfn_to_pfn_memslot_atomic Xiao Guangrong
2012-07-17 14:42 ` [PATCH 04/10] KVM: introduce gfn_to_hva_read/kvm_read_hva/kvm_read_hva_atomic Xiao Guangrong
2012-07-17 14:43 ` [PATCH 05/10] KVM: reorganize hva_to_pfn Xiao Guangrong
2012-07-17 14:43 ` [PATCH 06/10] KVM: use 'writable' as a hint to map writable pfn Xiao Guangrong
2012-07-17 14:44 ` [PATCH 07/10] KVM: introduce readonly_fault_pfn Xiao Guangrong
2012-07-19 10:15   ` Avi Kivity
2012-07-20  2:56     ` Xiao Guangrong
2012-07-17 14:45 ` [PATCH 08/10] KVM: introduce readonly_bad_hva Xiao Guangrong
2012-07-19 10:16   ` Avi Kivity
2012-07-20  3:01     ` Xiao Guangrong
2012-07-17 14:45 ` [PATCH 09/10] KVM: introduce readonly memslot Xiao Guangrong
2012-07-17 14:46 ` [PATCH 10/10] KVM: indicate readonly access fault Xiao Guangrong
