public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
To: Anthony Liguori <anthony-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
Cc: kvm-devel <kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>
Subject: Re: [PATCH 0/2] Swapping:fixing the problem with unsleepable place of kvm and gfn_to_page
Date: Tue, 16 Oct 2007 16:31:34 +0200	[thread overview]
Message-ID: <4714CB46.8030208@qumranet.com> (raw)
In-Reply-To: <4713E7D0.7010000-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 958 bytes --]

Anthony Liguori wrote:
> Very nice!
>
> This series seems to have solved the problem for me.
>
> Izik Eidus wrote:
>> this patch remove all the ugly messages from the dmesg,
>> now i think my patch together with Anthony changes can be merged to kvm.
>
> Perhaps you can send out all of the patches together in a single patch 
> bomb for one final review?  Right now I'm using patches from a few 
> different sets.  I'll also run some tests on it.
>
> Also, did your updated 3/4 patch fix the Fedora rmap problem?
>
> Regards,
>
> Anthony Liguori
>
>> thanks to Anthony with his help in working on it.
>>
>
ok, the patch that change the injection of the interrupts was not 
applied, avi didnt like this and wrote better one.
the second patch was applied.

here are the patchs (with fix to some small issues, as well as a fix for 
the rmap problem with fedora livecd )

i hope we will be able to merge it as soon as possible with your older 
userspace patchs.

[-- Attachment #2: 0001-this-add-function-that-allow-walking-on-the-rmap.patch --]
[-- Type: text/x-patch, Size: 2239 bytes --]

>From 58eeac7fa5a25d6b1ae5da38f76cae1b20b68c0a Mon Sep 17 00:00:00 2001
From: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Date: Tue, 16 Oct 2007 14:42:30 +0200
Subject: [PATCH] this add function that allow walking on the rmap.
this is needed for making all the present shadow pages rmapped

Signed-off-by: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/mmu.c |   45 +++++++++++++++++++++++++++++++++++----------
 1 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index f52604a..14e54e3 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -456,28 +456,53 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
 	}
 }
 
-static void rmap_write_protect(struct kvm *kvm, u64 gfn)
+static u64 *rmap_next(struct kvm *kvm, unsigned long *rmapp, u64 *spte)
 {
 	struct kvm_rmap_desc *desc;
+	struct kvm_rmap_desc *prev_desc;
+	u64 *prev_spte;
+	int i;
+
+	if (!*rmapp)
+		return NULL;
+	else if (!(*rmapp & 1)) {
+		if (!spte)
+			return (u64 *)*rmapp;
+		return NULL;
+	}
+	desc = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
+	prev_desc = NULL;
+	prev_spte = NULL;
+	while (desc) {
+		for (i = 0; i < RMAP_EXT && desc->shadow_ptes[i]; ++i) {
+			if (prev_spte == spte)
+				return desc->shadow_ptes[i];
+			prev_spte = desc->shadow_ptes[i];
+		}
+		desc = desc->more;
+	}
+	return NULL;
+}
+
+static void rmap_write_protect(struct kvm *kvm, u64 gfn)
+{
 	unsigned long *rmapp;
 	u64 *spte;
+	u64 *prev_spte;
 
 	gfn = unalias_gfn(kvm, gfn);
 	rmapp = gfn_to_rmap(kvm, gfn);
 
-	while (*rmapp) {
-		if (!(*rmapp & 1))
-			spte = (u64 *)*rmapp;
-		else {
-			desc = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
-			spte = desc->shadow_ptes[0];
-		}
+	spte = rmap_next(kvm, rmapp, NULL);
+	while (spte) {
 		BUG_ON(!spte);
 		BUG_ON(!(*spte & PT_PRESENT_MASK));
 		BUG_ON(!(*spte & PT_WRITABLE_MASK));
 		rmap_printk("rmap_write_protect: spte %p %llx\n", spte, *spte);
-		rmap_remove(kvm, spte);
-		set_shadow_pte(spte, *spte & ~PT_WRITABLE_MASK);
+		prev_spte = spte;
+		spte = rmap_next(kvm, rmapp, spte);
+		rmap_remove(kvm, prev_spte);
+		set_shadow_pte(prev_spte, *prev_spte & ~PT_WRITABLE_MASK);
 		kvm_flush_remote_tlbs(kvm);
 	}
 }
-- 
1.5.2.4


[-- Attachment #3: 0002-this-make-the-rmap-keep-reverse-mapping-over-all-the.patch --]
[-- Type: text/x-patch, Size: 2652 bytes --]

>From 2c89280a656b1949fbd7943f754fdd7cce3cf2ac Mon Sep 17 00:00:00 2001
From: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Date: Tue, 16 Oct 2007 14:43:46 +0200
Subject: [PATCH] this make the rmap keep reverse mapping over all the present shadow pages

Signed-off-by: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/mmu.c |   23 +++++++++++------------
 1 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 14e54e3..bbf5eb4 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -211,8 +211,8 @@ static int is_io_pte(unsigned long pte)
 
 static int is_rmap_pte(u64 pte)
 {
-	return (pte & (PT_WRITABLE_MASK | PT_PRESENT_MASK))
-		== (PT_WRITABLE_MASK | PT_PRESENT_MASK);
+	return pte != shadow_trap_nonpresent_pte
+		&& pte != shadow_notrap_nonpresent_pte;
 }
 
 static void set_shadow_pte(u64 *sptep, u64 spte)
@@ -488,7 +488,6 @@ static void rmap_write_protect(struct kvm *kvm, u64 gfn)
 {
 	unsigned long *rmapp;
 	u64 *spte;
-	u64 *prev_spte;
 
 	gfn = unalias_gfn(kvm, gfn);
 	rmapp = gfn_to_rmap(kvm, gfn);
@@ -497,13 +496,11 @@ static void rmap_write_protect(struct kvm *kvm, u64 gfn)
 	while (spte) {
 		BUG_ON(!spte);
 		BUG_ON(!(*spte & PT_PRESENT_MASK));
-		BUG_ON(!(*spte & PT_WRITABLE_MASK));
 		rmap_printk("rmap_write_protect: spte %p %llx\n", spte, *spte);
-		prev_spte = spte;
-		spte = rmap_next(kvm, rmapp, spte);
-		rmap_remove(kvm, prev_spte);
-		set_shadow_pte(prev_spte, *prev_spte & ~PT_WRITABLE_MASK);
+		if (is_writeble_pte(*spte))
+			set_shadow_pte(spte, *spte & ~PT_WRITABLE_MASK);
 		kvm_flush_remote_tlbs(kvm);
+		spte = rmap_next(kvm, rmapp, spte);
 	}
 }
 
@@ -908,14 +905,18 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
 		table = __va(table_addr);
 
 		if (level == 1) {
+			int was_rmapped;
+
 			pte = table[index];
+			was_rmapped = is_rmap_pte(pte);
 			if (is_shadow_present_pte(pte) && is_writeble_pte(pte))
 				return 0;
 			mark_page_dirty(vcpu->kvm, v >> PAGE_SHIFT);
 			page_header_update_slot(vcpu->kvm, table, v);
 			table[index] = p | PT_PRESENT_MASK | PT_WRITABLE_MASK |
 								PT_USER_MASK;
-			rmap_add(vcpu, &table[index], v >> PAGE_SHIFT);
+			if (!was_rmapped)
+				rmap_add(vcpu, &table[index], v >> PAGE_SHIFT);
 			return 0;
 		}
 
@@ -1424,10 +1425,8 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
 		pt = page->spt;
 		for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
 			/* avoid RMW */
-			if (pt[i] & PT_WRITABLE_MASK) {
-				rmap_remove(kvm, &pt[i]);
+			if (pt[i] & PT_WRITABLE_MASK)
 				pt[i] &= ~PT_WRITABLE_MASK;
-			}
 	}
 }
 
-- 
1.5.2.4


[-- Attachment #4: 0003-change-gfn_to_page-to-be-safe-always-function.patch --]
[-- Type: text/x-patch, Size: 6622 bytes --]

>From 36e3e9054a90006b4ce707bce76becfc85a504fb Mon Sep 17 00:00:00 2001
From: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Date: Tue, 16 Oct 2007 14:46:10 +0200
Subject: [PATCH] change gfn_to_page to be safe always function.

Signed-off-by: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h         |    3 ++-
 drivers/kvm/kvm_main.c    |   26 ++++++++++++++------------
 drivers/kvm/mmu.c         |   16 +++++-----------
 drivers/kvm/paging_tmpl.h |   11 +++--------
 4 files changed, 24 insertions(+), 32 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 1edf8a5..18bc4df 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -565,8 +565,9 @@ static inline int is_error_hpa(hpa_t hpa) { return hpa >> HPA_MSB; }
 hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva);
 struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva);
 
-extern hpa_t bad_page_address;
+extern struct page *bad_page;
 
+int is_error_page(struct page *page);
 gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn);
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
 int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 86cdd83..aebc822 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1013,6 +1013,12 @@ static int kvm_vm_ioctl_set_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
 	return r;
 }
 
+int is_error_page(struct page *page)
+{
+	return page == bad_page;
+}
+EXPORT_SYMBOL_GPL(is_error_page);
+
 gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
 {
 	int i;
@@ -1054,7 +1060,7 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 	gfn = unalias_gfn(kvm, gfn);
 	slot = __gfn_to_memslot(kvm, gfn);
 	if (!slot)
-		return NULL;
+		return bad_page;
 	return slot->phys_mem[gfn - slot->base_gfn];
 }
 EXPORT_SYMBOL_GPL(gfn_to_page);
@@ -1074,7 +1080,7 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (!page)
+	if (is_error_page(page))
 		return -EFAULT;
 	page_virt = kmap_atomic(page, KM_USER0);
 
@@ -1112,7 +1118,7 @@ int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data,
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (!page)
+	if (is_error_page(page))
 		return -EFAULT;
 	page_virt = kmap_atomic(page, KM_USER0);
 
@@ -1150,7 +1156,7 @@ int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len)
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (!page)
+	if (is_error_page(page))
 		return -EFAULT;
 	page_virt = kmap_atomic(page, KM_USER0);
 
@@ -3078,7 +3084,7 @@ static struct page *kvm_vm_nopage(struct vm_area_struct *vma,
 
 	pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
 	page = gfn_to_page(kvm, pgoff);
-	if (!page)
+	if (is_error_page(page))
 		return NOPAGE_SIGBUS;
 	get_page(page);
 	if (type != NULL)
@@ -3393,7 +3399,7 @@ static struct sys_device kvm_sysdev = {
 	.cls = &kvm_sysdev_class,
 };
 
-hpa_t bad_page_address;
+struct page *bad_page;
 
 static inline
 struct kvm_vcpu *preempt_notifier_to_vcpu(struct preempt_notifier *pn)
@@ -3522,7 +3528,6 @@ EXPORT_SYMBOL_GPL(kvm_exit_x86);
 
 static __init int kvm_init(void)
 {
-	static struct page *bad_page;
 	int r;
 
 	r = kvm_mmu_module_init();
@@ -3533,16 +3538,13 @@ static __init int kvm_init(void)
 
 	kvm_arch_init();
 
-	bad_page = alloc_page(GFP_KERNEL);
+	bad_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
 
 	if (bad_page == NULL) {
 		r = -ENOMEM;
 		goto out;
 	}
 
-	bad_page_address = page_to_pfn(bad_page) << PAGE_SHIFT;
-	memset(__va(bad_page_address), 0, PAGE_SIZE);
-
 	return 0;
 
 out:
@@ -3555,7 +3557,7 @@ out4:
 static __exit void kvm_exit(void)
 {
 	kvm_exit_debug();
-	__free_page(pfn_to_page(bad_page_address >> PAGE_SHIFT));
+	__free_page(bad_page);
 	kvm_mmu_module_exit();
 }
 
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index bbf5eb4..2ad14fb 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -850,23 +850,17 @@ static void page_header_update_slot(struct kvm *kvm, void *pte, gpa_t gpa)
 	__set_bit(slot, &page_head->slot_bitmap);
 }
 
-hpa_t safe_gpa_to_hpa(struct kvm *kvm, gpa_t gpa)
-{
-	hpa_t hpa = gpa_to_hpa(kvm, gpa);
-
-	return is_error_hpa(hpa) ? bad_page_address | (gpa & ~PAGE_MASK): hpa;
-}
-
 hpa_t gpa_to_hpa(struct kvm *kvm, gpa_t gpa)
 {
 	struct page *page;
+	hpa_t hpa;
 
 	ASSERT((gpa & HPA_ERR_MASK) == 0);
 	page = gfn_to_page(kvm, gpa >> PAGE_SHIFT);
-	if (!page)
-		return gpa | HPA_ERR_MASK;
-	return ((hpa_t)page_to_pfn(page) << PAGE_SHIFT)
-		| (gpa & (PAGE_SIZE-1));
+	hpa = ((hpa_t)page_to_pfn(page) << PAGE_SHIFT) | (gpa & (PAGE_SIZE-1));
+	if (is_error_page(page))
+		return hpa | HPA_ERR_MASK;
+	return hpa;
 }
 
 hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva)
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index a9e687b..58fd35a 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -92,8 +92,6 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 			    struct kvm_vcpu *vcpu, gva_t addr,
 			    int write_fault, int user_fault, int fetch_fault)
 {
-	hpa_t hpa;
-	struct kvm_memory_slot *slot;
 	pt_element_t *ptep;
 	pt_element_t root;
 	gfn_t table_gfn;
@@ -118,9 +116,8 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 	walker->table_gfn[walker->level - 1] = table_gfn;
 	pgprintk("%s: table_gfn[%d] %lx\n", __FUNCTION__,
 		 walker->level - 1, table_gfn);
-	slot = gfn_to_memslot(vcpu->kvm, table_gfn);
-	hpa = safe_gpa_to_hpa(vcpu->kvm, root & PT64_BASE_ADDR_MASK);
-	walker->page = pfn_to_page(hpa >> PAGE_SHIFT);
+	walker->page = gfn_to_page(vcpu->kvm, (root & PT64_BASE_ADDR_MASK)
+				   >> PAGE_SHIFT);
 	walker->table = kmap_atomic(walker->page, KM_USER0);
 
 	ASSERT((!is_long_mode(vcpu) && is_pae(vcpu)) ||
@@ -130,7 +127,6 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 
 	for (;;) {
 		int index = PT_INDEX(addr, walker->level);
-		hpa_t paddr;
 
 		ptep = &walker->table[index];
 		walker->index = index;
@@ -179,8 +175,7 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 		walker->inherited_ar &= walker->table[index];
 		table_gfn = (*ptep & PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
 		kunmap_atomic(walker->table, KM_USER0);
-		paddr = safe_gpa_to_hpa(vcpu->kvm, table_gfn << PAGE_SHIFT);
-		walker->page = pfn_to_page(paddr >> PAGE_SHIFT);
+		walker->page = gfn_to_page(vcpu->kvm, table_gfn);
 		walker->table = kmap_atomic(walker->page, KM_USER0);
 		--walker->level;
 		walker->table_gfn[walker->level - 1] = table_gfn;
-- 
1.5.2.4


[-- Attachment #5: 0003-now-that-gfn_to_page-get-called-at-run-time-we-dont.patch --]
[-- Type: text/x-patch, Size: 1776 bytes --]

>From dc0164113041c2f2bf22fc066ca99b9b8531d627 Mon Sep 17 00:00:00 2001
From: Izik Eidus <izik@Home1.(none)>
Date: Sat, 13 Oct 2007 02:56:25 +0200
Subject: [PATCH] now that gfn_to_page get called at run time, we dont have to do memset on the memory.
(it is now much faster to load VM with alot of memory)

Signed-off-by: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 user/kvmctl.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/user/kvmctl.c b/user/kvmctl.c
index 0604f2f..ff2014e 100644
--- a/user/kvmctl.c
+++ b/user/kvmctl.c
@@ -391,7 +391,6 @@ int kvm_alloc_userspace_memory(kvm_context_t kvm, unsigned long memory,
 
 
 	low_memory.userspace_addr = (unsigned long)*vm_mem;
-	memset((unsigned long *)low_memory.userspace_addr, 0, low_memory.memory_size);
 	/* 640K should be enough. */
 	r = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &low_memory);
 	if (r == -1) {
@@ -406,7 +405,6 @@ int kvm_alloc_userspace_memory(kvm_context_t kvm, unsigned long memory,
 			return -1;
 		}
 		extended_memory.userspace_addr = (unsigned long)(*vm_mem + exmem);
-		memset((unsigned long *)extended_memory.userspace_addr, 0, extended_memory.memory_size);
 		r = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &extended_memory);
 		if (r == -1) {
 			fprintf(stderr, "kvm_create_memory_region: %m\n");
@@ -422,7 +420,6 @@ int kvm_alloc_userspace_memory(kvm_context_t kvm, unsigned long memory,
 			return -1;
 		}
 		above_4g_memory.userspace_addr = (unsigned long)(*vm_mem + 0x100000000);
-		memset((unsigned long *)above_4g_memory.userspace_addr, 0, above_4g_memory.memory_size);
 		r = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &above_4g_memory);
 		if (r == -1) {
 			fprintf(stderr, "kvm_create_memory_region: %m\n");
-- 
1.5.2.4


[-- Attachment #6: 0004-make-the-guest-unshadowed-memory-swappable.patch --]
[-- Type: text/x-patch, Size: 10643 bytes --]

>From bbdbc8a283e74e01deb3d718c8f97bdda64c01be Mon Sep 17 00:00:00 2001
From: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Date: Tue, 16 Oct 2007 16:14:08 +0200
Subject: [PATCH] make the guest unshadowed memory swappable

Signed-off-by: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h         |    2 +
 drivers/kvm/kvm_main.c    |   81 +++++++++++++++++++++++++--------------------
 drivers/kvm/mmu.c         |   16 ++++++++-
 drivers/kvm/paging_tmpl.h |   27 +++++++++++++--
 4 files changed, 86 insertions(+), 40 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 18bc4df..1c52c5d 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -409,6 +409,7 @@ struct kvm_memory_slot {
 	unsigned long *rmap;
 	unsigned long *dirty_bitmap;
 	int user_alloc; /* user allocated memory */
+	unsigned long userspace_addr;
 };
 
 struct kvm {
@@ -570,6 +571,7 @@ extern struct page *bad_page;
 int is_error_page(struct page *page);
 gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn);
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
+void kvm_release_page(struct page *page);
 int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 			int len);
 int kvm_read_guest(struct kvm *kvm, gpa_t gpa, void *data, unsigned long len);
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index aebc822..04a8267 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -320,19 +320,6 @@ static struct kvm *kvm_create_vm(void)
 	return kvm;
 }
 
-static void kvm_free_userspace_physmem(struct kvm_memory_slot *free)
-{
-	int i;
-
-	for (i = 0; i < free->npages; ++i) {
-		if (free->phys_mem[i]) {
-			if (!PageReserved(free->phys_mem[i]))
-				SetPageDirty(free->phys_mem[i]);
-			page_cache_release(free->phys_mem[i]);
-		}
-	}
-}
-
 static void kvm_free_kernel_physmem(struct kvm_memory_slot *free)
 {
 	int i;
@@ -350,9 +337,7 @@ static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
 {
 	if (!dont || free->phys_mem != dont->phys_mem)
 		if (free->phys_mem) {
-			if (free->user_alloc)
-				kvm_free_userspace_physmem(free);
-			else
+			if (!free->user_alloc)
 				kvm_free_kernel_physmem(free);
 			vfree(free->phys_mem);
 		}
@@ -772,19 +757,8 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 		memset(new.phys_mem, 0, npages * sizeof(struct page *));
 		memset(new.rmap, 0, npages * sizeof(*new.rmap));
 		if (user_alloc) {
-			unsigned long pages_num;
-
 			new.user_alloc = 1;
-			down_read(&current->mm->mmap_sem);
-
-			pages_num = get_user_pages(current, current->mm,
-						   mem->userspace_addr,
-						   npages, 1, 0, new.phys_mem,
-						   NULL);
-
-			up_read(&current->mm->mmap_sem);
-			if (pages_num != npages)
-				goto out_unlock;
+			new.userspace_addr = mem->userspace_addr;
 		} else {
 			for (i = 0; i < npages; ++i) {
 				new.phys_mem[i] = alloc_page(GFP_HIGHUSER
@@ -1059,12 +1033,39 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 
 	gfn = unalias_gfn(kvm, gfn);
 	slot = __gfn_to_memslot(kvm, gfn);
-	if (!slot)
+	if (!slot) {
+		get_page(bad_page);
 		return bad_page;
+	}
+	if (slot->user_alloc) {
+		struct page *page[1];
+		int npages;
+
+		down_read(&current->mm->mmap_sem);
+		npages = get_user_pages(current, current->mm,
+					slot->userspace_addr
+					+ (gfn - slot->base_gfn) * PAGE_SIZE, 1,
+					1, 0, page, NULL);
+		up_read(&current->mm->mmap_sem);
+		if (npages != 1) {
+			get_page(bad_page);
+			return bad_page;
+		}
+		return page[0];
+	}
+	get_page(slot->phys_mem[gfn - slot->base_gfn]);
 	return slot->phys_mem[gfn - slot->base_gfn];
 }
 EXPORT_SYMBOL_GPL(gfn_to_page);
 
+void kvm_release_page(struct page *page)
+{
+	if (!PageReserved(page))
+		SetPageDirty(page);
+	put_page(page);
+}
+EXPORT_SYMBOL_GPL(kvm_release_page);
+
 static int next_segment(unsigned long len, int offset)
 {
 	if (len > PAGE_SIZE - offset)
@@ -1080,13 +1081,16 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (is_error_page(page))
+	if (is_error_page(page)) {
+		kvm_release_page(page);
 		return -EFAULT;
+	}
 	page_virt = kmap_atomic(page, KM_USER0);
 
 	memcpy(data, page_virt + offset, len);
 
 	kunmap_atomic(page_virt, KM_USER0);
+	kvm_release_page(page);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(kvm_read_guest_page);
@@ -1118,14 +1122,17 @@ int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data,
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (is_error_page(page))
+	if (is_error_page(page)) {
+		kvm_release_page(page);
 		return -EFAULT;
+	}
 	page_virt = kmap_atomic(page, KM_USER0);
 
 	memcpy(page_virt + offset, data, len);
 
 	kunmap_atomic(page_virt, KM_USER0);
 	mark_page_dirty(kvm, gfn);
+	kvm_release_page(page);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(kvm_write_guest_page);
@@ -1156,13 +1163,16 @@ int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len)
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (is_error_page(page))
+	if (is_error_page(page)) {
+		kvm_release_page(page);
 		return -EFAULT;
+	}
 	page_virt = kmap_atomic(page, KM_USER0);
 
 	memset(page_virt + offset, 0, len);
 
 	kunmap_atomic(page_virt, KM_USER0);
+	kvm_release_page(page);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(kvm_clear_guest_page);
@@ -2091,8 +2101,6 @@ int kvm_emulate_pio_string(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
 	for (i = 0; i < nr_pages; ++i) {
 		mutex_lock(&vcpu->kvm->lock);
 		page = gva_to_page(vcpu, address + i * PAGE_SIZE);
-		if (page)
-			get_page(page);
 		vcpu->pio.guest_pages[i] = page;
 		mutex_unlock(&vcpu->kvm->lock);
 		if (!page) {
@@ -3084,9 +3092,10 @@ static struct page *kvm_vm_nopage(struct vm_area_struct *vma,
 
 	pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
 	page = gfn_to_page(kvm, pgoff);
-	if (is_error_page(page))
+	if (is_error_page(page)) {
+		kvm_release_page(page);
 		return NOPAGE_SIGBUS;
-	get_page(page);
+	}
 	if (type != NULL)
 		*type = VM_FAULT_MINOR;
 
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 2ad14fb..a064015 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -425,6 +425,8 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
 	if (!is_rmap_pte(*spte))
 		return;
 	page = page_header(__pa(spte));
+	kvm_release_page(pfn_to_page((*spte & PT64_BASE_ADDR_MASK) >>
+			 PAGE_SHIFT));
 	rmapp = gfn_to_rmap(kvm, page->gfns[spte - page->spt]);
 	if (!*rmapp) {
 		printk(KERN_ERR "rmap_remove: %p %llx 0->BUG\n", spte, *spte);
@@ -924,7 +926,12 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
 						     v, level - 1,
 						     1, 3, &table[index]);
 			if (!new_table) {
+				struct page *page;
+
 				pgprintk("nonpaging_map: ENOMEM\n");
+				page = pfn_to_page((p & PT64_DIR_BASE_ADDR_MASK)
+						   >> PAGE_SHIFT);
+				kvm_release_page(page);
 				return -ENOMEM;
 			}
 
@@ -1039,8 +1046,11 @@ static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
 
 	paddr = gpa_to_hpa(vcpu->kvm, addr & PT64_BASE_ADDR_MASK);
 
-	if (is_error_hpa(paddr))
+	if (is_error_hpa(paddr)) {
+		kvm_release_page(pfn_to_page((paddr & PT64_BASE_ADDR_MASK)
+				 >> PAGE_SHIFT));
 		return 1;
+	}
 
 	return nonpaging_map(vcpu, addr & PAGE_MASK, paddr);
 }
@@ -1507,6 +1517,7 @@ static void audit_mappings_page(struct kvm_vcpu *vcpu, u64 page_pte,
 		} else {
 			gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, va);
 			hpa_t hpa = gpa_to_hpa(vcpu, gpa);
+			struct page *page;
 
 			if (is_shadow_present_pte(ent)
 			    && (ent & PT64_BASE_ADDR_MASK) != hpa)
@@ -1519,6 +1530,9 @@ static void audit_mappings_page(struct kvm_vcpu *vcpu, u64 page_pte,
 				 && !is_error_hpa(hpa))
 				printk(KERN_ERR "audit: (%s) notrap shadow,"
 				       " valid guest gva %lx\n", audit_msg, va);
+			page = pfn_to_page((gpa & PT64_BASE_ADDR_MASK)
+					   >> PAGE_SHIFT);
+			kvm_release_page(page);
 
 		}
 	}
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 58fd35a..600b1cc 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -175,6 +175,7 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 		walker->inherited_ar &= walker->table[index];
 		table_gfn = (*ptep & PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
 		kunmap_atomic(walker->table, KM_USER0);
+		kvm_release_page(walker->page);
 		walker->page = gfn_to_page(vcpu->kvm, table_gfn);
 		walker->table = kmap_atomic(walker->page, KM_USER0);
 		--walker->level;
@@ -183,8 +184,10 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 			 walker->level - 1, table_gfn);
 	}
 	walker->pte = *ptep;
-	if (walker->page)
+	if (walker->page) {
 		walker->ptep = NULL;
+		kvm_release_page(walker->page);
+	}
 	if (walker->table)
 		kunmap_atomic(walker->table, KM_USER0);
 	pgprintk("%s: pte %llx\n", __FUNCTION__, (u64)*ptep);
@@ -206,6 +209,8 @@ err:
 		walker->error_code |= PFERR_FETCH_MASK;
 	if (walker->table)
 		kunmap_atomic(walker->table, KM_USER0);
+	if (walker->page)
+		kvm_release_page(walker->page);
 	return 0;
 }
 
@@ -249,6 +254,8 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 	if (is_error_hpa(paddr)) {
 		set_shadow_pte(shadow_pte,
 			       shadow_trap_nonpresent_pte | PT_SHADOW_IO_MARK);
+		kvm_release_page(pfn_to_page((paddr & PT64_BASE_ADDR_MASK)
+					     >> PAGE_SHIFT));
 		return;
 	}
 
@@ -286,9 +293,20 @@ unshadowed:
 	pgprintk("%s: setting spte %llx\n", __FUNCTION__, spte);
 	set_shadow_pte(shadow_pte, spte);
 	page_header_update_slot(vcpu->kvm, shadow_pte, gaddr);
-	if (!was_rmapped)
+	if (!was_rmapped) {
 		rmap_add(vcpu, shadow_pte, (gaddr & PT64_BASE_ADDR_MASK)
 			 >> PAGE_SHIFT);
+		if (!is_rmap_pte(*shadow_pte)) {
+			struct page *page;
+
+			page = pfn_to_page((paddr & PT64_BASE_ADDR_MASK)
+					   >> PAGE_SHIFT);
+			kvm_release_page(page);
+		}
+	}
+	else
+		kvm_release_page(pfn_to_page((paddr & PT64_BASE_ADDR_MASK)
+				 >> PAGE_SHIFT));
 	if (!ptwrite || !*ptwrite)
 		vcpu->last_pte_updated = shadow_pte;
 }
@@ -512,19 +530,22 @@ static void FNAME(prefetch_page)(struct kvm_vcpu *vcpu,
 {
 	int i;
 	pt_element_t *gpt;
+	struct page *page;
 
 	if (sp->role.metaphysical || PTTYPE == 32) {
 		nonpaging_prefetch_page(vcpu, sp);
 		return;
 	}
 
-	gpt = kmap_atomic(gfn_to_page(vcpu->kvm, sp->gfn), KM_USER0);
+	page = gfn_to_page(vcpu->kvm, sp->gfn);
+	gpt = kmap_atomic(page, KM_USER0);
 	for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
 		if (is_present_pte(gpt[i]))
 			sp->spt[i] = shadow_trap_nonpresent_pte;
 		else
 			sp->spt[i] = shadow_notrap_nonpresent_pte;
 	kunmap_atomic(gpt, KM_USER0);
+	kvm_release_page(page);
 }
 
 #undef pt_element_t
-- 
1.5.2.4


[-- Attachment #7: Type: text/plain, Size: 314 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

[-- Attachment #8: Type: text/plain, Size: 186 bytes --]

_______________________________________________
kvm-devel mailing list
kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
https://lists.sourceforge.net/lists/listinfo/kvm-devel

  parent reply	other threads:[~2007-10-16 14:31 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-10-15 21:21 [PATCH 0/2] Swapping:fixing the problem with unsleepable place of kvm and gfn_to_page Izik Eidus
     [not found] ` <4713D9E2.3020804-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-15 22:21   ` Anthony Liguori
     [not found]     ` <4713E7D0.7010000-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2007-10-16 14:31       ` Izik Eidus [this message]
     [not found]         ` <4714CB46.8030208-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-16 14:53           ` Anthony Liguori

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4714CB46.8030208@qumranet.com \
    --to=izike-atkuwr5tajbwk0htik3j/w@public.gmane.org \
    --cc=anthony-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org \
    --cc=kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox