public inbox for kvm@vger.kernel.org
* [PATCH 00/50] KVM patch queue review for 2.6.25 merge window (part I)
@ 2007-12-23 14:50 Avi Kivity
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Following is the first part of the 2.6.25 merge window submission.  Since
there are 238 patches in the queue (and a few more expected), they'll be
sent in five batches of around 50 each.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/


* [PATCH 01/50] KVM: x86 emulator: Add vmmcall/vmcall to x86_emulate (v3)
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:50   ` [PATCH 02/50] KVM: Refactor hypercall infrastructure (v3) Avi Kivity
                     ` (48 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Anthony Liguori <aliguori-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>

Add vmmcall/vmcall to x86_emulate.  For now both are decoded as no-ops; a
future patch will implement their functionality.
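
As an illustration of the decode added here, the following is a minimal
sketch (not part of the patch) of how the ModRM byte that follows the 0f 01
opcode tells vmcall and vmmcall apart from lgdt/lidt.  The struct and helper
names are hypothetical; the byte values c1 and d9 are the ones written by
vmx_patch_hypercall() and svm_patch_hypercall().

```c
#include <stdint.h>

/* Fields of an x86 ModRM byte: mod (bits 7-6), reg (bits 5-3), rm (bits 2-0). */
struct modrm_fields {
	uint8_t mod;
	uint8_t reg;
	uint8_t rm;
};

/* Split a ModRM byte the same way the emulator does. */
static struct modrm_fields split_modrm(uint8_t modrm)
{
	struct modrm_fields f;

	f.mod = (modrm & 0xc0) >> 6;
	f.reg = (modrm & 0x38) >> 3;
	f.rm  = modrm & 0x07;
	return f;
}

/* Hypothetical helper: 0f 01 c1 (vmcall) has reg == 0, mod == 3, rm == 1. */
static int is_vmcall_modrm(uint8_t modrm)
{
	struct modrm_fields f = split_modrm(modrm);

	return f.reg == 0 && f.mod == 3 && f.rm == 1;
}

/* Hypothetical helper: 0f 01 d9 (vmmcall) has reg == 3, mod == 3, rm == 1. */
static int is_vmmcall_modrm(uint8_t modrm)
{
	struct modrm_fields f = split_modrm(modrm);

	return f.reg == 3 && f.mod == 3 && f.rm == 1;
}
```

Any other mod/rm combination with reg == 3 is an ordinary lidt with a
memory operand, which is why the lidt case has to check mod and rm before
reading a descriptor.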

Signed-off-by: Anthony Liguori <aliguori-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/x86_emulate.c |   23 +++++++++++++++++------
 1 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index bd46de6..84af9cc 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -1380,6 +1380,12 @@ twobyte_insn:
 			u16 size;
 			unsigned long address;
 
+		case 0: /* vmcall */
+			if (modrm_mod != 3 || modrm_rm != 1)
+				goto cannot_emulate;
+
+			/* nop */
+			break;
 		case 2: /* lgdt */
 			rc = read_descriptor(ctxt, ops, src.ptr,
 					     &size, &address, op_bytes);
@@ -1387,12 +1393,17 @@ twobyte_insn:
 				goto done;
 			realmode_lgdt(ctxt->vcpu, size, address);
 			break;
-		case 3: /* lidt */
-			rc = read_descriptor(ctxt, ops, src.ptr,
-					     &size, &address, op_bytes);
-			if (rc)
-				goto done;
-			realmode_lidt(ctxt->vcpu, size, address);
+		case 3: /* lidt/vmmcall */
+			if (modrm_mod == 3 && modrm_rm == 1) {
+				/* nop */
+			} else {
+				rc = read_descriptor(ctxt, ops, src.ptr,
+						     &size, &address,
+						     op_bytes);
+				if (rc)
+					goto done;
+				realmode_lidt(ctxt->vcpu, size, address);
+			}
 			break;
 		case 4: /* smsw */
 			if (modrm_mod != 3)
-- 
1.5.3.7



* [PATCH 02/50] KVM: Refactor hypercall infrastructure (v3)
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  2007-12-23 14:50   ` [PATCH 01/50] KVM: x86 emulator: Add vmmcall/vmcall to x86_emulate (v3) Avi Kivity
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:50   ` [PATCH 03/50] KVM: x86 emulator: remove unused functions Avi Kivity
                     ` (47 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Anthony Liguori <aliguori-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>

This patch refactors the current hypercall infrastructure to better
support live migration and SMP.  It eliminates the hypercall page.  Instead,
KVM traps the #UD exception raised when a guest executes the wrong hypercall
instruction for the underlying architecture (vmcall on SVM, vmmcall on VMX)
and lazily replaces it with the right one.
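
The lazy replacement can be pictured with the following minimal sketch.  It
is not the code in this patch: the enum, the function names and the "already
native" check are assumptions made for illustration, and the real
kvm_fix_hypercall() additionally zaps the MMU and writes through the
emulator.  Only the three-byte encodings are taken from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical tag for the virtualization extension the host provides. */
enum host_kind { HOST_VMX, HOST_SVM };

#define HYPERCALL_INSN_LEN 3

/* Fill insn with the hypercall instruction native to the host. */
static void native_hypercall(enum host_kind host,
			     uint8_t insn[HYPERCALL_INSN_LEN])
{
	insn[0] = 0x0f;
	insn[1] = 0x01;
	insn[2] = (host == HOST_SVM) ? 0xd9 /* vmmcall */ : 0xc1 /* vmcall */;
}

/*
 * Rewrite the hypercall instruction at code in place.
 * Returns 1 if the bytes were changed, 0 if they were already native
 * or are not a hypercall instruction at all.
 */
static int fix_hypercall_sketch(enum host_kind host, uint8_t *code)
{
	uint8_t want[HYPERCALL_INSN_LEN];

	if (code[0] != 0x0f || code[1] != 0x01)
		return 0;
	if (code[2] != 0xc1 && code[2] != 0xd9)
		return 0;

	native_hypercall(host, want);
	if (memcmp(code, want, HYPERCALL_INSN_LEN) == 0)
		return 0;

	memcpy(code, want, HYPERCALL_INSN_LEN);
	return 1;
}
```

After the first rewrite the guest executes the native instruction directly,
so the #UD trap is taken at most once per call site.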

A side effect of this patch is that unhandled hypercalls no longer trap to
userspace; they now fail with -KVM_ENOSYS.  There is little reason to use a
hypercall to communicate with userspace, though, since PIO or MMIO can be
used instead.  No code in the tree uses userspace hypercalls.
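
A minimal sketch of the resulting in-kernel dispatch follows, assuming a
hypothetical guest_regs structure in place of vcpu->regs.  It mirrors the
kvm_main.c hunk of this patch: the hypercall number comes from rax, is
truncated to 32 bits outside long mode, and any number that is not handled
returns -KVM_ENOSYS in rax instead of exiting to userspace.

```c
#include <stdint.h>

#define KVM_ENOSYS 1000	/* value from the kvm_para.h hunk of this patch */

/* Hypothetical stand-in for the guest registers used by a hypercall. */
struct guest_regs {
	uint64_t rax;	/* hypercall number in, return value out */
	uint64_t rbx;	/* argument 0 */
	uint64_t rcx;	/* argument 1 */
	uint64_t rdx;	/* argument 2 */
	uint64_t rsi;	/* argument 3 */
};

static void emulate_hypercall_sketch(struct guest_regs *r, int long_mode)
{
	uint64_t nr = r->rax;
	uint64_t ret;

	/* 32-bit guests only pass the low half of each register. */
	if (!long_mode)
		nr &= 0xFFFFFFFF;

	switch (nr) {
	default:
		/* No hypercalls are defined yet: fail inside the kernel. */
		ret = (uint64_t)-KVM_ENOSYS;
		break;
	}

	r->rax = ret;
}
```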

[avi: fix #ud injection on vmx]

Signed-off-by: Anthony Liguori <aliguori-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h         |    8 +-
 drivers/kvm/kvm_main.c    |  156 +++++++++++++-------------------------------
 drivers/kvm/svm.c         |   19 +++++-
 drivers/kvm/vmx.c         |   29 +++++++-
 drivers/kvm/x86_emulate.c |   11 +++-
 include/linux/kvm_para.h  |  159 ++++++++++++++++++++++++++++-----------------
 6 files changed, 199 insertions(+), 183 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 3b0bc4b..da9c3aa 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -46,6 +46,7 @@
 #define KVM_MAX_CPUID_ENTRIES 40
 
 #define DE_VECTOR 0
+#define UD_VECTOR 6
 #define NM_VECTOR 7
 #define DF_VECTOR 8
 #define TS_VECTOR 10
@@ -317,9 +318,6 @@ struct kvm_vcpu {
 	unsigned long cr0;
 	unsigned long cr2;
 	unsigned long cr3;
-	gpa_t para_state_gpa;
-	struct page *para_state_page;
-	gpa_t hypercall_gpa;
 	unsigned long cr4;
 	unsigned long cr8;
 	u64 pdptrs[4]; /* pae */
@@ -622,7 +620,9 @@ void __kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu);
 int kvm_mmu_load(struct kvm_vcpu *vcpu);
 void kvm_mmu_unload(struct kvm_vcpu *vcpu);
 
-int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run);
+int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
+
+int kvm_fix_hypercall(struct kvm_vcpu *vcpu);
 
 static inline void kvm_guest_enter(void)
 {
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 47c10b8..9668e9c 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -39,6 +39,7 @@
 #include <linux/smp.h>
 #include <linux/anon_inodes.h>
 #include <linux/profile.h>
+#include <linux/kvm_para.h>
 
 #include <asm/processor.h>
 #include <asm/msr.h>
@@ -1362,51 +1363,61 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_emulate_halt);
 
-int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run)
+int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 {
-	unsigned long nr, a0, a1, a2, a3, a4, a5, ret;
+	unsigned long nr, a0, a1, a2, a3, ret;
 
 	kvm_x86_ops->cache_regs(vcpu);
-	ret = -KVM_EINVAL;
-#ifdef CONFIG_X86_64
-	if (is_long_mode(vcpu)) {
-		nr = vcpu->regs[VCPU_REGS_RAX];
-		a0 = vcpu->regs[VCPU_REGS_RDI];
-		a1 = vcpu->regs[VCPU_REGS_RSI];
-		a2 = vcpu->regs[VCPU_REGS_RDX];
-		a3 = vcpu->regs[VCPU_REGS_RCX];
-		a4 = vcpu->regs[VCPU_REGS_R8];
-		a5 = vcpu->regs[VCPU_REGS_R9];
-	} else
-#endif
-	{
-		nr = vcpu->regs[VCPU_REGS_RBX] & -1u;
-		a0 = vcpu->regs[VCPU_REGS_RAX] & -1u;
-		a1 = vcpu->regs[VCPU_REGS_RCX] & -1u;
-		a2 = vcpu->regs[VCPU_REGS_RDX] & -1u;
-		a3 = vcpu->regs[VCPU_REGS_RSI] & -1u;
-		a4 = vcpu->regs[VCPU_REGS_RDI] & -1u;
-		a5 = vcpu->regs[VCPU_REGS_RBP] & -1u;
+
+	nr = vcpu->regs[VCPU_REGS_RAX];
+	a0 = vcpu->regs[VCPU_REGS_RBX];
+	a1 = vcpu->regs[VCPU_REGS_RCX];
+	a2 = vcpu->regs[VCPU_REGS_RDX];
+	a3 = vcpu->regs[VCPU_REGS_RSI];
+
+	if (!is_long_mode(vcpu)) {
+		nr &= 0xFFFFFFFF;
+		a0 &= 0xFFFFFFFF;
+		a1 &= 0xFFFFFFFF;
+		a2 &= 0xFFFFFFFF;
+		a3 &= 0xFFFFFFFF;
 	}
+
 	switch (nr) {
 	default:
-		run->hypercall.nr = nr;
-		run->hypercall.args[0] = a0;
-		run->hypercall.args[1] = a1;
-		run->hypercall.args[2] = a2;
-		run->hypercall.args[3] = a3;
-		run->hypercall.args[4] = a4;
-		run->hypercall.args[5] = a5;
-		run->hypercall.ret = ret;
-		run->hypercall.longmode = is_long_mode(vcpu);
-		kvm_x86_ops->decache_regs(vcpu);
-		return 0;
+		ret = -KVM_ENOSYS;
+		break;
 	}
 	vcpu->regs[VCPU_REGS_RAX] = ret;
 	kvm_x86_ops->decache_regs(vcpu);
-	return 1;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
+
+int kvm_fix_hypercall(struct kvm_vcpu *vcpu)
+{
+	char instruction[3];
+	int ret = 0;
+
+	mutex_lock(&vcpu->kvm->lock);
+
+	/*
+	 * Blow out the MMU to ensure that no other VCPU has an active mapping
+	 * to ensure that the updated hypercall appears atomically across all
+	 * VCPUs.
+	 */
+	kvm_mmu_zap_all(vcpu->kvm);
+
+	kvm_x86_ops->cache_regs(vcpu);
+	kvm_x86_ops->patch_hypercall(vcpu, instruction);
+	if (emulator_write_emulated(vcpu->rip, instruction, 3, vcpu)
+	    != X86EMUL_CONTINUE)
+		ret = -EFAULT;
+
+	mutex_unlock(&vcpu->kvm->lock);
+
+	return ret;
 }
-EXPORT_SYMBOL_GPL(kvm_hypercall);
 
 static u64 mk_cr_64(u64 curr_cr, u32 new_val)
 {
@@ -1474,75 +1485,6 @@ void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long val,
 	}
 }
 
-/*
- * Register the para guest with the host:
- */
-static int vcpu_register_para(struct kvm_vcpu *vcpu, gpa_t para_state_gpa)
-{
-	struct kvm_vcpu_para_state *para_state;
-	hpa_t para_state_hpa, hypercall_hpa;
-	struct page *para_state_page;
-	unsigned char *hypercall;
-	gpa_t hypercall_gpa;
-
-	printk(KERN_DEBUG "kvm: guest trying to enter paravirtual mode\n");
-	printk(KERN_DEBUG ".... para_state_gpa: %08Lx\n", para_state_gpa);
-
-	/*
-	 * Needs to be page aligned:
-	 */
-	if (para_state_gpa != PAGE_ALIGN(para_state_gpa))
-		goto err_gp;
-
-	para_state_hpa = gpa_to_hpa(vcpu, para_state_gpa);
-	printk(KERN_DEBUG ".... para_state_hpa: %08Lx\n", para_state_hpa);
-	if (is_error_hpa(para_state_hpa))
-		goto err_gp;
-
-	mark_page_dirty(vcpu->kvm, para_state_gpa >> PAGE_SHIFT);
-	para_state_page = pfn_to_page(para_state_hpa >> PAGE_SHIFT);
-	para_state = kmap(para_state_page);
-
-	printk(KERN_DEBUG "....  guest version: %d\n", para_state->guest_version);
-	printk(KERN_DEBUG "....           size: %d\n", para_state->size);
-
-	para_state->host_version = KVM_PARA_API_VERSION;
-	/*
-	 * We cannot support guests that try to register themselves
-	 * with a newer API version than the host supports:
-	 */
-	if (para_state->guest_version > KVM_PARA_API_VERSION) {
-		para_state->ret = -KVM_EINVAL;
-		goto err_kunmap_skip;
-	}
-
-	hypercall_gpa = para_state->hypercall_gpa;
-	hypercall_hpa = gpa_to_hpa(vcpu, hypercall_gpa);
-	printk(KERN_DEBUG ".... hypercall_hpa: %08Lx\n", hypercall_hpa);
-	if (is_error_hpa(hypercall_hpa)) {
-		para_state->ret = -KVM_EINVAL;
-		goto err_kunmap_skip;
-	}
-
-	printk(KERN_DEBUG "kvm: para guest successfully registered.\n");
-	vcpu->para_state_page = para_state_page;
-	vcpu->para_state_gpa = para_state_gpa;
-	vcpu->hypercall_gpa = hypercall_gpa;
-
-	mark_page_dirty(vcpu->kvm, hypercall_gpa >> PAGE_SHIFT);
-	hypercall = kmap_atomic(pfn_to_page(hypercall_hpa >> PAGE_SHIFT),
-				KM_USER1) + (hypercall_hpa & ~PAGE_MASK);
-	kvm_x86_ops->patch_hypercall(vcpu, hypercall);
-	kunmap_atomic(hypercall, KM_USER1);
-
-	para_state->ret = 0;
-err_kunmap_skip:
-	kunmap(para_state_page);
-	return 0;
-err_gp:
-	return 1;
-}
-
 int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 {
 	u64 data;
@@ -1656,12 +1598,6 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 	case MSR_IA32_MISC_ENABLE:
 		vcpu->ia32_misc_enable_msr = data;
 		break;
-	/*
-	 * This is the 'probe whether the host is KVM' logic:
-	 */
-	case MSR_KVM_API_MAGIC:
-		return vcpu_register_para(vcpu, data);
-
 	default:
 		pr_unimpl(vcpu, "unhandled wrmsr: 0x%x\n", msr);
 		return 1;
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 4e04e49..5883f3e 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -476,7 +476,8 @@ static void init_vmcb(struct vmcb *vmcb)
 					INTERCEPT_DR5_MASK |
 					INTERCEPT_DR7_MASK;
 
-	control->intercept_exceptions = 1 << PF_VECTOR;
+	control->intercept_exceptions = (1 << PF_VECTOR) |
+					(1 << UD_VECTOR);
 
 
 	control->intercept = 	(1ULL << INTERCEPT_INTR) |
@@ -979,6 +980,17 @@ static int pf_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 	return 0;
 }
 
+static int ud_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
+{
+	int er;
+
+	er = emulate_instruction(&svm->vcpu, kvm_run, 0, 0);
+	if (er != EMULATE_DONE)
+		inject_ud(&svm->vcpu);
+
+	return 1;
+}
+
 static int nm_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 {
 	svm->vmcb->control.intercept_exceptions &= ~(1 << NM_VECTOR);
@@ -1045,7 +1057,8 @@ static int vmmcall_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 {
 	svm->next_rip = svm->vmcb->save.rip + 3;
 	skip_emulated_instruction(&svm->vcpu);
-	return kvm_hypercall(&svm->vcpu, kvm_run);
+	kvm_emulate_hypercall(&svm->vcpu);
+	return 1;
 }
 
 static int invalid_op_interception(struct vcpu_svm *svm,
@@ -1241,6 +1254,7 @@ static int (*svm_exit_handlers[])(struct vcpu_svm *svm,
 	[SVM_EXIT_WRITE_DR3]			= emulate_on_interception,
 	[SVM_EXIT_WRITE_DR5]			= emulate_on_interception,
 	[SVM_EXIT_WRITE_DR7]			= emulate_on_interception,
+	[SVM_EXIT_EXCP_BASE + UD_VECTOR]	= ud_interception,
 	[SVM_EXIT_EXCP_BASE + PF_VECTOR] 	= pf_interception,
 	[SVM_EXIT_EXCP_BASE + NM_VECTOR] 	= nm_interception,
 	[SVM_EXIT_INTR] 			= nop_on_interception,
@@ -1675,7 +1689,6 @@ svm_patch_hypercall(struct kvm_vcpu *vcpu, unsigned char *hypercall)
 	hypercall[0] = 0x0f;
 	hypercall[1] = 0x01;
 	hypercall[2] = 0xd9;
-	hypercall[3] = 0xc3;
 }
 
 static void svm_check_processor_compat(void *rtn)
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index bb56ae3..77d061b 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -164,6 +164,13 @@ static inline int is_no_device(u32 intr_info)
 		(INTR_TYPE_EXCEPTION | NM_VECTOR | INTR_INFO_VALID_MASK);
 }
 
+static inline int is_invalid_opcode(u32 intr_info)
+{
+	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VECTOR_MASK |
+			     INTR_INFO_VALID_MASK)) ==
+		(INTR_TYPE_EXCEPTION | UD_VECTOR | INTR_INFO_VALID_MASK);
+}
+
 static inline int is_external_interrupt(u32 intr_info)
 {
 	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
@@ -315,7 +322,7 @@ static void update_exception_bitmap(struct kvm_vcpu *vcpu)
 {
 	u32 eb;
 
-	eb = 1u << PF_VECTOR;
+	eb = (1u << PF_VECTOR) | (1u << UD_VECTOR);
 	if (!vcpu->fpu_active)
 		eb |= 1u << NM_VECTOR;
 	if (vcpu->guest_debug.enabled)
@@ -560,6 +567,14 @@ static void vmx_inject_gp(struct kvm_vcpu *vcpu, unsigned error_code)
 		     INTR_INFO_VALID_MASK);
 }
 
+static void vmx_inject_ud(struct kvm_vcpu *vcpu)
+{
+	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
+		     UD_VECTOR |
+		     INTR_TYPE_EXCEPTION |
+		     INTR_INFO_VALID_MASK);
+}
+
 /*
  * Swap MSR entry in host/guest MSR entry array.
  */
@@ -1771,6 +1786,14 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		return 1;
 	}
 
+	if (is_invalid_opcode(intr_info)) {
+		er = emulate_instruction(vcpu, kvm_run, 0, 0);
+		if (er != EMULATE_DONE)
+			vmx_inject_ud(vcpu);
+
+		return 1;
+	}
+
 	error_code = 0;
 	rip = vmcs_readl(GUEST_RIP);
 	if (intr_info & INTR_INFO_DELIEVER_CODE_MASK)
@@ -1873,7 +1896,6 @@ vmx_patch_hypercall(struct kvm_vcpu *vcpu, unsigned char *hypercall)
 	hypercall[0] = 0x0f;
 	hypercall[1] = 0x01;
 	hypercall[2] = 0xc1;
-	hypercall[3] = 0xc3;
 }
 
 static int handle_cr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
@@ -2059,7 +2081,8 @@ static int handle_halt(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 static int handle_vmcall(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	skip_emulated_instruction(vcpu);
-	return kvm_hypercall(vcpu, kvm_run);
+	kvm_emulate_hypercall(vcpu);
+	return 1;
 }
 
 /*
diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index 84af9cc..f12bc2c 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -1384,7 +1384,11 @@ twobyte_insn:
 			if (modrm_mod != 3 || modrm_rm != 1)
 				goto cannot_emulate;
 
-			/* nop */
+			rc = kvm_fix_hypercall(ctxt->vcpu);
+			if (rc)
+				goto done;
+
+			kvm_emulate_hypercall(ctxt->vcpu);
 			break;
 		case 2: /* lgdt */
 			rc = read_descriptor(ctxt, ops, src.ptr,
@@ -1395,7 +1399,10 @@ twobyte_insn:
 			break;
 		case 3: /* lidt/vmmcall */
 			if (modrm_mod == 3 && modrm_rm == 1) {
-				/* nop */
+				rc = kvm_fix_hypercall(ctxt->vcpu);
+				if (rc)
+					goto done;
+				kvm_emulate_hypercall(ctxt->vcpu);
 			} else {
 				rc = read_descriptor(ctxt, ops, src.ptr,
 						     &size, &address,
diff --git a/include/linux/kvm_para.h b/include/linux/kvm_para.h
index 3b29256..cc5dfb4 100644
--- a/include/linux/kvm_para.h
+++ b/include/linux/kvm_para.h
@@ -1,73 +1,110 @@
 #ifndef __LINUX_KVM_PARA_H
 #define __LINUX_KVM_PARA_H
 
-/*
- * Guest OS interface for KVM paravirtualization
- *
- * Note: this interface is totally experimental, and is certain to change
- *       as we make progress.
+/* This CPUID returns the signature 'KVMKVMKVM' in ebx, ecx, and edx.  It
+ * should be used to determine that a VM is running under KVM.
  */
+#define KVM_CPUID_SIGNATURE	0x40000000
 
-/*
- * Per-VCPU descriptor area shared between guest and host. Writable to
- * both guest and host. Registered with the host by the guest when
- * a guest acknowledges paravirtual mode.
- *
- * NOTE: all addresses are guest-physical addresses (gpa), to make it
- * easier for the hypervisor to map between the various addresses.
- */
-struct kvm_vcpu_para_state {
-	/*
-	 * API version information for compatibility. If there's any support
-	 * mismatch (too old host trying to execute too new guest) then
-	 * the host will deny entry into paravirtual mode. Any other
-	 * combination (new host + old guest and new host + new guest)
-	 * is supposed to work - new host versions will support all old
-	 * guest API versions.
-	 */
-	u32 guest_version;
-	u32 host_version;
-	u32 size;
-	u32 ret;
-
-	/*
-	 * The address of the vm exit instruction (VMCALL or VMMCALL),
-	 * which the host will patch according to the CPU model the
-	 * VM runs on:
-	 */
-	u64 hypercall_gpa;
-
-} __attribute__ ((aligned(PAGE_SIZE)));
-
-#define KVM_PARA_API_VERSION 1
-
-/*
- * This is used for an RDMSR's ECX parameter to probe for a KVM host.
- * Hopefully no CPU vendor will use up this number. This is placed well
- * out of way of the typical space occupied by CPU vendors' MSR indices,
- * and we think (or at least hope) it wont be occupied in the future
- * either.
+/* This CPUID returns a feature bitmap in eax.  Before enabling a particular
+ * paravirtualization, the appropriate feature bit should be checked.
  */
-#define MSR_KVM_API_MAGIC 0x87655678
+#define KVM_CPUID_FEATURES	0x40000001
 
-#define KVM_EINVAL 1
+/* Return values for hypercalls */
+#define KVM_ENOSYS		1000
 
-/*
- * Hypercall calling convention:
- *
- * Each hypercall may have 0-6 parameters.
- *
- * 64-bit hypercall index is in RAX, goes from 0 to __NR_hypercalls-1
- *
- * 64-bit parameters 1-6 are in the standard gcc x86_64 calling convention
- * order: RDI, RSI, RDX, RCX, R8, R9.
- *
- * 32-bit index is EBX, parameters are: EAX, ECX, EDX, ESI, EDI, EBP.
- * (the first 3 are according to the gcc regparm calling convention)
+#ifdef __KERNEL__
+#include <asm/processor.h>
+
+/* This instruction is vmcall.  On non-VT architectures, it will generate a
+ * trap that we will then rewrite to the appropriate instruction.
+ */
+#define KVM_HYPERCALL ".byte 0x0f,0x01,0xc1"
+
+/* For KVM hypercalls, a three-byte sequence of either the vmcall or the vmmcall
+ * instruction.  The hypervisor may replace it with something else but only the
+ * instructions are guaranteed to be supported.
  *
- * No registers are clobbered by the hypercall, except that the
- * return value is in RAX.
+ * Up to four arguments may be passed in rbx, rcx, rdx, and rsi respectively.
+ * The hypercall number should be placed in rax and the return value will be
+ * placed in rax.  No other registers will be clobbered unless explicitly
+ * noted by the particular hypercall.
  */
-#define __NR_hypercalls			0
+
+static inline long kvm_hypercall0(unsigned int nr)
+{
+	long ret;
+	asm volatile(KVM_HYPERCALL
+		     : "=a"(ret)
+		     : "a"(nr));
+	return ret;
+}
+
+static inline long kvm_hypercall1(unsigned int nr, unsigned long p1)
+{
+	long ret;
+	asm volatile(KVM_HYPERCALL
+		     : "=a"(ret)
+		     : "a"(nr), "b"(p1));
+	return ret;
+}
+
+static inline long kvm_hypercall2(unsigned int nr, unsigned long p1,
+				  unsigned long p2)
+{
+	long ret;
+	asm volatile(KVM_HYPERCALL
+		     : "=a"(ret)
+		     : "a"(nr), "b"(p1), "c"(p2));
+	return ret;
+}
+
+static inline long kvm_hypercall3(unsigned int nr, unsigned long p1,
+				  unsigned long p2, unsigned long p3)
+{
+	long ret;
+	asm volatile(KVM_HYPERCALL
+		     : "=a"(ret)
+		     : "a"(nr), "b"(p1), "c"(p2), "d"(p3));
+	return ret;
+}
+
+static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
+				  unsigned long p2, unsigned long p3,
+				  unsigned long p4)
+{
+	long ret;
+	asm volatile(KVM_HYPERCALL
+		     : "=a"(ret)
+		     : "a"(nr), "b"(p1), "c"(p2), "d"(p3), "S"(p4));
+	return ret;
+}
+
+static inline int kvm_para_available(void)
+{
+	unsigned int eax, ebx, ecx, edx;
+	char signature[13];
+
+	cpuid(KVM_CPUID_SIGNATURE, &eax, &ebx, &ecx, &edx);
+	memcpy(signature + 0, &ebx, 4);
+	memcpy(signature + 4, &ecx, 4);
+	memcpy(signature + 8, &edx, 4);
+	signature[12] = 0;
+
+	if (strcmp(signature, "KVMKVMKVM") == 0)
+		return 1;
+
+	return 0;
+}
+
+static inline int kvm_para_has_feature(unsigned int feature)
+{
+	if (cpuid_eax(KVM_CPUID_FEATURES) & (1UL << feature))
+		return 1;
+	return 0;
+}
+
+#endif
 
 #endif
-- 
1.5.3.7



* [PATCH 03/50] KVM: x86 emulator: remove unused functions
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  2007-12-23 14:50   ` [PATCH 01/50] KVM: x86 emulator: Add vmmcall/vmcall to x86_emulate (v3) Avi Kivity
  2007-12-23 14:50   ` [PATCH 02/50] KVM: Refactor hypercall infrastructure (v3) Avi Kivity
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:50   ` [PATCH 04/50] KVM: x86 emulator: move all x86_emulate_memop() to a structure Avi Kivity
                     ` (46 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Laurent Vivier

From: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>

Remove the functions under #ifdef __XEN__, which are never used.

Signed-off-by: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/x86_emulate.c |   39 ---------------------------------------
 1 files changed, 0 insertions(+), 39 deletions(-)

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index f12bc2c..9ea82f8 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -1639,42 +1639,3 @@ cannot_emulate:
 	DPRINTF("Cannot emulate %02x\n", b);
 	return -1;
 }
-
-#ifdef __XEN__
-
-#include <asm/mm.h>
-#include <asm/uaccess.h>
-
-int
-x86_emulate_read_std(unsigned long addr,
-		     unsigned long *val,
-		     unsigned int bytes, struct x86_emulate_ctxt *ctxt)
-{
-	unsigned int rc;
-
-	*val = 0;
-
-	if ((rc = copy_from_user((void *)val, (void *)addr, bytes)) != 0) {
-		propagate_page_fault(addr + bytes - rc, 0);	/* read fault */
-		return X86EMUL_PROPAGATE_FAULT;
-	}
-
-	return X86EMUL_CONTINUE;
-}
-
-int
-x86_emulate_write_std(unsigned long addr,
-		      unsigned long val,
-		      unsigned int bytes, struct x86_emulate_ctxt *ctxt)
-{
-	unsigned int rc;
-
-	if ((rc = copy_to_user((void *)addr, (void *)&val, bytes)) != 0) {
-		propagate_page_fault(addr + bytes - rc, PGERR_write_access);
-		return X86EMUL_PROPAGATE_FAULT;
-	}
-
-	return X86EMUL_CONTINUE;
-}
-
-#endif
-- 
1.5.3.7



* [PATCH 04/50] KVM: x86 emulator: move all x86_emulate_memop() to a structure
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2007-12-23 14:50   ` [PATCH 03/50] KVM: x86 emulator: remove unused functions Avi Kivity
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:50   ` [PATCH 05/50] KVM: x86 emulator: move all decoding process to function x86_decode_insn() Avi Kivity
                     ` (45 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Laurent Vivier

From: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>

Move all x86_emulate_memop() variables shared between decode and execute
into a structure, decode_cache.  This will help to separate decode and
emulate later.

            struct decode_cache {
                u8 twobyte;
                u8 b;
                u8 lock_prefix;
                u8 rep_prefix;
                u8 op_bytes;
                u8 ad_bytes;
                struct operand src;
                struct operand dst;
                unsigned long *override_base;
                unsigned int d;
                unsigned long regs[NR_VCPU_REGS];
                unsigned long eip;
                /* modrm */
                u8 modrm;
                u8 modrm_mod;
                u8 modrm_reg;
                u8 modrm_rm;
                u8 use_modrm_ea;
                unsigned long modrm_ea;
                unsigned long modrm_val;
            };
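
To show why collecting this state in one structure helps, here is a minimal
sketch (not the emulator code) in which a decode step fills a reduced cache
and a later step reads only from that cache.  The mini_ names and the
reduced field set are hypothetical, and the sketch assumes an instruction
made of an optional 0x66 prefix, one opcode byte and a ModRM byte.

```c
#include <stdint.h>
#include <string.h>

/* Reduced, hypothetical subset of the decode_cache fields listed above. */
struct mini_decode_cache {
	uint8_t b;		/* opcode byte */
	uint8_t op_bytes;	/* operand size in bytes */
	uint8_t modrm;
	uint8_t modrm_mod;
	uint8_t modrm_reg;
	uint8_t modrm_rm;
	unsigned long eip;	/* address of the next byte to fetch */
};

/* Decode phase: consume instruction bytes and record the result in c. */
static void mini_decode(struct mini_decode_cache *c, const uint8_t *insn,
			unsigned long rip)
{
	memset(c, 0, sizeof(*c));
	c->eip = rip;
	c->op_bytes = 4;		/* 32-bit protected mode default */

	if (*insn == 0x66) {		/* operand-size override */
		c->op_bytes ^= 6;	/* switch between 2/4 bytes */
		insn++;
		c->eip++;
	}

	c->b = *insn++;
	c->eip++;

	c->modrm = *insn;
	c->eip++;
	c->modrm_mod = (c->modrm & 0xc0) >> 6;
	c->modrm_reg = (c->modrm & 0x38) >> 3;
	c->modrm_rm = c->modrm & 0x07;
}

/* A later phase needs only the cache, not the raw instruction bytes. */
static unsigned long mini_insn_len(const struct mini_decode_cache *c,
				   unsigned long rip)
{
	return c->eip - rip;
}
```

Because everything the second function needs lives in the cache, the two
phases can sit in separate functions without sharing local variables.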

Signed-off-by: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/x86_emulate.c |  919 ++++++++++++++++++++++++---------------------
 drivers/kvm/x86_emulate.h |   34 ++
 2 files changed, 518 insertions(+), 435 deletions(-)

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index 9ea82f8..f946182 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -222,13 +222,6 @@ static u16 twobyte_table[256] = {
 	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
 };
 
-/* Type, address-of, and value of an instruction's operand. */
-struct operand {
-	enum { OP_REG, OP_MEM, OP_IMM } type;
-	unsigned int bytes;
-	unsigned long val, orig_val, *ptr;
-};
-
 /* EFLAGS bit definitions. */
 #define EFLG_OF (1<<11)
 #define EFLG_DF (1<<10)
@@ -431,24 +424,26 @@ struct operand {
 
 /* Access/update address held in a register, based on addressing mode. */
 #define address_mask(reg)						\
-	((ad_bytes == sizeof(unsigned long)) ? 				\
-		(reg) :	((reg) & ((1UL << (ad_bytes << 3)) - 1)))
+	((c->ad_bytes == sizeof(unsigned long)) ? 			\
+		(reg) :	((reg) & ((1UL << (c->ad_bytes << 3)) - 1)))
 #define register_address(base, reg)                                     \
 	((base) + address_mask(reg))
 #define register_address_increment(reg, inc)                            \
 	do {								\
 		/* signed type ensures sign extension to long */        \
 		int _inc = (inc);					\
-		if ( ad_bytes == sizeof(unsigned long) )		\
+		if (c->ad_bytes == sizeof(unsigned long))		\
 			(reg) += _inc;					\
 		else							\
-			(reg) = ((reg) & ~((1UL << (ad_bytes << 3)) - 1)) | \
-			   (((reg) + _inc) & ((1UL << (ad_bytes << 3)) - 1)); \
+			(reg) = ((reg) & 				\
+				 ~((1UL << (c->ad_bytes << 3)) - 1)) |	\
+				(((reg) + _inc) &			\
+				 ((1UL << (c->ad_bytes << 3)) - 1));	\
 	} while (0)
 
 #define JMP_REL(rel) 							\
 	do {								\
-		register_address_increment(_eip, rel);			\
+		register_address_increment(c->eip, rel);		\
 	} while (0)
 
 /*
@@ -524,39 +519,35 @@ static int test_cc(unsigned int condition, unsigned int flags)
 int
 x86_emulate_memop(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 {
-	unsigned d;
-	u8 b, sib, twobyte = 0, rex_prefix = 0;
-	u8 modrm, modrm_mod = 0, modrm_reg = 0, modrm_rm = 0;
-	unsigned long *override_base = NULL;
-	unsigned int op_bytes, ad_bytes, lock_prefix = 0, rep_prefix = 0, i;
+	struct decode_cache *c = &ctxt->decode;
+	u8 sib, rex_prefix = 0;
+	unsigned int i;
 	int rc = 0;
-	struct operand src, dst;
 	unsigned long cr2 = ctxt->cr2;
 	int mode = ctxt->mode;
-	unsigned long modrm_ea;
-	int use_modrm_ea, index_reg = 0, base_reg = 0, scale, rip_relative = 0;
+	int index_reg = 0, base_reg = 0, scale, rip_relative = 0;
 	int no_wb = 0;
 	u64 msr_data;
 
 	/* Shadow copy of register state. Committed on successful emulation. */
-	unsigned long _regs[NR_VCPU_REGS];
-	unsigned long _eip = ctxt->vcpu->rip, _eflags = ctxt->eflags;
-	unsigned long modrm_val = 0;
+	unsigned long _eflags = ctxt->eflags;
 
-	memcpy(_regs, ctxt->vcpu->regs, sizeof _regs);
+	memset(c, 0, sizeof(struct decode_cache));
+	c->eip = ctxt->vcpu->rip;
+	memcpy(c->regs, ctxt->vcpu->regs, sizeof c->regs);
 
 	switch (mode) {
 	case X86EMUL_MODE_REAL:
 	case X86EMUL_MODE_PROT16:
-		op_bytes = ad_bytes = 2;
+		c->op_bytes = c->ad_bytes = 2;
 		break;
 	case X86EMUL_MODE_PROT32:
-		op_bytes = ad_bytes = 4;
+		c->op_bytes = c->ad_bytes = 4;
 		break;
 #ifdef CONFIG_X86_64
 	case X86EMUL_MODE_PROT64:
-		op_bytes = 4;
-		ad_bytes = 8;
+		c->op_bytes = 4;
+		c->ad_bytes = 8;
 		break;
 #endif
 	default:
@@ -565,40 +556,42 @@ x86_emulate_memop(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 
 	/* Legacy prefixes. */
 	for (i = 0; i < 8; i++) {
-		switch (b = insn_fetch(u8, 1, _eip)) {
+		switch (c->b = insn_fetch(u8, 1, c->eip)) {
 		case 0x66:	/* operand-size override */
-			op_bytes ^= 6;	/* switch between 2/4 bytes */
+			c->op_bytes ^= 6;	/* switch between 2/4 bytes */
 			break;
 		case 0x67:	/* address-size override */
 			if (mode == X86EMUL_MODE_PROT64)
-				ad_bytes ^= 12;	/* switch between 4/8 bytes */
+				/* switch between 4/8 bytes */
+				c->ad_bytes ^= 12;
 			else
-				ad_bytes ^= 6;	/* switch between 2/4 bytes */
+				/* switch between 2/4 bytes */
+				c->ad_bytes ^= 6;
 			break;
 		case 0x2e:	/* CS override */
-			override_base = &ctxt->cs_base;
+			c->override_base = &ctxt->cs_base;
 			break;
 		case 0x3e:	/* DS override */
-			override_base = &ctxt->ds_base;
+			c->override_base = &ctxt->ds_base;
 			break;
 		case 0x26:	/* ES override */
-			override_base = &ctxt->es_base;
+			c->override_base = &ctxt->es_base;
 			break;
 		case 0x64:	/* FS override */
-			override_base = &ctxt->fs_base;
+			c->override_base = &ctxt->fs_base;
 			break;
 		case 0x65:	/* GS override */
-			override_base = &ctxt->gs_base;
+			c->override_base = &ctxt->gs_base;
 			break;
 		case 0x36:	/* SS override */
-			override_base = &ctxt->ss_base;
+			c->override_base = &ctxt->ss_base;
 			break;
 		case 0xf0:	/* LOCK */
-			lock_prefix = 1;
+			c->lock_prefix = 1;
 			break;
 		case 0xf2:	/* REPNE/REPNZ */
 		case 0xf3:	/* REP/REPE/REPZ */
-			rep_prefix = 1;
+			c->rep_prefix = 1;
 			break;
 		default:
 			goto done_prefixes;
@@ -608,177 +601,182 @@ x86_emulate_memop(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 done_prefixes:
 
 	/* REX prefix. */
-	if ((mode == X86EMUL_MODE_PROT64) && ((b & 0xf0) == 0x40)) {
-		rex_prefix = b;
-		if (b & 8)
-			op_bytes = 8;	/* REX.W */
-		modrm_reg = (b & 4) << 1;	/* REX.R */
-		index_reg = (b & 2) << 2; /* REX.X */
-		modrm_rm = base_reg = (b & 1) << 3; /* REG.B */
-		b = insn_fetch(u8, 1, _eip);
+	if ((mode == X86EMUL_MODE_PROT64) && ((c->b & 0xf0) == 0x40)) {
+		rex_prefix = c->b;
+		if (c->b & 8)
+			c->op_bytes = 8;	/* REX.W */
+		c->modrm_reg = (c->b & 4) << 1;	/* REX.R */
+		index_reg = (c->b & 2) << 2; /* REX.X */
+		c->modrm_rm = base_reg = (c->b & 1) << 3; /* REG.B */
+		c->b = insn_fetch(u8, 1, c->eip);
 	}
 
 	/* Opcode byte(s). */
-	d = opcode_table[b];
-	if (d == 0) {
+	c->d = opcode_table[c->b];
+	if (c->d == 0) {
 		/* Two-byte opcode? */
-		if (b == 0x0f) {
-			twobyte = 1;
-			b = insn_fetch(u8, 1, _eip);
-			d = twobyte_table[b];
+		if (c->b == 0x0f) {
+			c->twobyte = 1;
+			c->b = insn_fetch(u8, 1, c->eip);
+			c->d = twobyte_table[c->b];
 		}
 
 		/* Unrecognised? */
-		if (d == 0)
+		if (c->d == 0)
 			goto cannot_emulate;
 	}
 
 	/* ModRM and SIB bytes. */
-	if (d & ModRM) {
-		modrm = insn_fetch(u8, 1, _eip);
-		modrm_mod |= (modrm & 0xc0) >> 6;
-		modrm_reg |= (modrm & 0x38) >> 3;
-		modrm_rm |= (modrm & 0x07);
-		modrm_ea = 0;
-		use_modrm_ea = 1;
-
-		if (modrm_mod == 3) {
-			modrm_val = *(unsigned long *)
-				decode_register(modrm_rm, _regs, d & ByteOp);
+	if (c->d & ModRM) {
+		c->modrm = insn_fetch(u8, 1, c->eip);
+		c->modrm_mod |= (c->modrm & 0xc0) >> 6;
+		c->modrm_reg |= (c->modrm & 0x38) >> 3;
+		c->modrm_rm |= (c->modrm & 0x07);
+		c->modrm_ea = 0;
+		c->use_modrm_ea = 1;
+
+		if (c->modrm_mod == 3) {
+			c->modrm_val = *(unsigned long *)
+			  decode_register(c->modrm_rm, c->regs, c->d & ByteOp);
 			goto modrm_done;
 		}
 
-		if (ad_bytes == 2) {
-			unsigned bx = _regs[VCPU_REGS_RBX];
-			unsigned bp = _regs[VCPU_REGS_RBP];
-			unsigned si = _regs[VCPU_REGS_RSI];
-			unsigned di = _regs[VCPU_REGS_RDI];
+		if (c->ad_bytes == 2) {
+			unsigned bx = c->regs[VCPU_REGS_RBX];
+			unsigned bp = c->regs[VCPU_REGS_RBP];
+			unsigned si = c->regs[VCPU_REGS_RSI];
+			unsigned di = c->regs[VCPU_REGS_RDI];
 
 			/* 16-bit ModR/M decode. */
-			switch (modrm_mod) {
+			switch (c->modrm_mod) {
 			case 0:
-				if (modrm_rm == 6)
-					modrm_ea += insn_fetch(u16, 2, _eip);
+				if (c->modrm_rm == 6)
+					c->modrm_ea +=
+						insn_fetch(u16, 2, c->eip);
 				break;
 			case 1:
-				modrm_ea += insn_fetch(s8, 1, _eip);
+				c->modrm_ea += insn_fetch(s8, 1, c->eip);
 				break;
 			case 2:
-				modrm_ea += insn_fetch(u16, 2, _eip);
+				c->modrm_ea += insn_fetch(u16, 2, c->eip);
 				break;
 			}
-			switch (modrm_rm) {
+			switch (c->modrm_rm) {
 			case 0:
-				modrm_ea += bx + si;
+				c->modrm_ea += bx + si;
 				break;
 			case 1:
-				modrm_ea += bx + di;
+				c->modrm_ea += bx + di;
 				break;
 			case 2:
-				modrm_ea += bp + si;
+				c->modrm_ea += bp + si;
 				break;
 			case 3:
-				modrm_ea += bp + di;
+				c->modrm_ea += bp + di;
 				break;
 			case 4:
-				modrm_ea += si;
+				c->modrm_ea += si;
 				break;
 			case 5:
-				modrm_ea += di;
+				c->modrm_ea += di;
 				break;
 			case 6:
-				if (modrm_mod != 0)
-					modrm_ea += bp;
+				if (c->modrm_mod != 0)
+					c->modrm_ea += bp;
 				break;
 			case 7:
-				modrm_ea += bx;
+				c->modrm_ea += bx;
 				break;
 			}
-			if (modrm_rm == 2 || modrm_rm == 3 ||
-			    (modrm_rm == 6 && modrm_mod != 0))
-				if (!override_base)
-					override_base = &ctxt->ss_base;
-			modrm_ea = (u16)modrm_ea;
+			if (c->modrm_rm == 2 || c->modrm_rm == 3 ||
+			    (c->modrm_rm == 6 && c->modrm_mod != 0))
+				if (!c->override_base)
+					c->override_base = &ctxt->ss_base;
+			c->modrm_ea = (u16)c->modrm_ea;
 		} else {
 			/* 32/64-bit ModR/M decode. */
-			switch (modrm_rm) {
+			switch (c->modrm_rm) {
 			case 4:
 			case 12:
-				sib = insn_fetch(u8, 1, _eip);
+				sib = insn_fetch(u8, 1, c->eip);
 				index_reg |= (sib >> 3) & 7;
 				base_reg |= sib & 7;
 				scale = sib >> 6;
 
 				switch (base_reg) {
 				case 5:
-					if (modrm_mod != 0)
-						modrm_ea += _regs[base_reg];
+					if (c->modrm_mod != 0)
+						c->modrm_ea +=
+							c->regs[base_reg];
 					else
-						modrm_ea += insn_fetch(s32, 4, _eip);
+						c->modrm_ea +=
+						    insn_fetch(s32, 4, c->eip);
 					break;
 				default:
-					modrm_ea += _regs[base_reg];
+					c->modrm_ea += c->regs[base_reg];
 				}
 				switch (index_reg) {
 				case 4:
 					break;
 				default:
-					modrm_ea += _regs[index_reg] << scale;
+					c->modrm_ea +=
+						c->regs[index_reg] << scale;
 
 				}
 				break;
 			case 5:
-				if (modrm_mod != 0)
-					modrm_ea += _regs[modrm_rm];
+				if (c->modrm_mod != 0)
+					c->modrm_ea += c->regs[c->modrm_rm];
 				else if (mode == X86EMUL_MODE_PROT64)
 					rip_relative = 1;
 				break;
 			default:
-				modrm_ea += _regs[modrm_rm];
+				c->modrm_ea += c->regs[c->modrm_rm];
 				break;
 			}
-			switch (modrm_mod) {
+			switch (c->modrm_mod) {
 			case 0:
-				if (modrm_rm == 5)
-					modrm_ea += insn_fetch(s32, 4, _eip);
+				if (c->modrm_rm == 5)
+					c->modrm_ea +=
+						insn_fetch(s32, 4, c->eip);
 				break;
 			case 1:
-				modrm_ea += insn_fetch(s8, 1, _eip);
+				c->modrm_ea += insn_fetch(s8, 1, c->eip);
 				break;
 			case 2:
-				modrm_ea += insn_fetch(s32, 4, _eip);
+				c->modrm_ea += insn_fetch(s32, 4, c->eip);
 				break;
 			}
 		}
-		if (!override_base)
-			override_base = &ctxt->ds_base;
+		if (!c->override_base)
+			c->override_base = &ctxt->ds_base;
 		if (mode == X86EMUL_MODE_PROT64 &&
-		    override_base != &ctxt->fs_base &&
-		    override_base != &ctxt->gs_base)
-			override_base = NULL;
+		    c->override_base != &ctxt->fs_base &&
+		    c->override_base != &ctxt->gs_base)
+			c->override_base = NULL;
 
-		if (override_base)
-			modrm_ea += *override_base;
+		if (c->override_base)
+			c->modrm_ea += *c->override_base;
 
 		if (rip_relative) {
-			modrm_ea += _eip;
-			switch (d & SrcMask) {
+			c->modrm_ea += c->eip;
+			switch (c->d & SrcMask) {
 			case SrcImmByte:
-				modrm_ea += 1;
+				c->modrm_ea += 1;
 				break;
 			case SrcImm:
-				if (d & ByteOp)
-					modrm_ea += 1;
+				if (c->d & ByteOp)
+					c->modrm_ea += 1;
 				else
-					if (op_bytes == 8)
-						modrm_ea += 4;
+					if (c->op_bytes == 8)
+						c->modrm_ea += 4;
 					else
-						modrm_ea += op_bytes;
+						c->modrm_ea += c->op_bytes;
 			}
 		}
-		if (ad_bytes != 8)
-			modrm_ea = (u32)modrm_ea;
-		cr2 = modrm_ea;
+		if (c->ad_bytes != 8)
+			c->modrm_ea = (u32)c->modrm_ea;
+		cr2 = c->modrm_ea;
 	modrm_done:
 		;
 	}
@@ -787,200 +785,210 @@ done_prefixes:
 	 * Decode and fetch the source operand: register, memory
 	 * or immediate.
 	 */
-	switch (d & SrcMask) {
+	switch (c->d & SrcMask) {
 	case SrcNone:
 		break;
 	case SrcReg:
-		src.type = OP_REG;
-		if (d & ByteOp) {
-			src.ptr = decode_register(modrm_reg, _regs,
+		c->src.type = OP_REG;
+		if (c->d & ByteOp) {
+			c->src.ptr =
+				decode_register(c->modrm_reg, c->regs,
 						  (rex_prefix == 0));
-			src.val = src.orig_val = *(u8 *) src.ptr;
-			src.bytes = 1;
+			c->src.val = c->src.orig_val = *(u8 *)c->src.ptr;
+			c->src.bytes = 1;
 		} else {
-			src.ptr = decode_register(modrm_reg, _regs, 0);
-			switch ((src.bytes = op_bytes)) {
+			c->src.ptr =
+			    decode_register(c->modrm_reg, c->regs, 0);
+			switch ((c->src.bytes = c->op_bytes)) {
 			case 2:
-				src.val = src.orig_val = *(u16 *) src.ptr;
+				c->src.val = c->src.orig_val =
+						       *(u16 *) c->src.ptr;
 				break;
 			case 4:
-				src.val = src.orig_val = *(u32 *) src.ptr;
+				c->src.val = c->src.orig_val =
+						       *(u32 *) c->src.ptr;
 				break;
 			case 8:
-				src.val = src.orig_val = *(u64 *) src.ptr;
+				c->src.val = c->src.orig_val =
+						       *(u64 *) c->src.ptr;
 				break;
 			}
 		}
 		break;
 	case SrcMem16:
-		src.bytes = 2;
+		c->src.bytes = 2;
 		goto srcmem_common;
 	case SrcMem32:
-		src.bytes = 4;
+		c->src.bytes = 4;
 		goto srcmem_common;
 	case SrcMem:
-		src.bytes = (d & ByteOp) ? 1 : op_bytes;
+		c->src.bytes = (c->d & ByteOp) ? 1 :
+							   c->op_bytes;
 		/* Don't fetch the address for invlpg: it could be unmapped. */
-		if (twobyte && b == 0x01 && modrm_reg == 7)
+		if (c->twobyte && c->b == 0x01
+				    && c->modrm_reg == 7)
 			break;
 	      srcmem_common:
 		/*
 		 * For instructions with a ModR/M byte, switch to register
 		 * access if Mod = 3.
 		 */
-		if ((d & ModRM) && modrm_mod == 3) {
-			src.type = OP_REG;
+		if ((c->d & ModRM) && c->modrm_mod == 3) {
+			c->src.type = OP_REG;
 			break;
 		}
-		src.type = OP_MEM;
-		src.ptr = (unsigned long *)cr2;
-		src.val = 0;
-		if ((rc = ops->read_emulated((unsigned long)src.ptr,
-					     &src.val, src.bytes, ctxt->vcpu)) != 0)
+		c->src.type = OP_MEM;
+		c->src.ptr = (unsigned long *)cr2;
+		c->src.val = 0;
+		if ((rc = ops->read_emulated((unsigned long)c->src.ptr,
+					   &c->src.val,
+					   c->src.bytes, ctxt->vcpu)) != 0)
 			goto done;
-		src.orig_val = src.val;
+		c->src.orig_val = c->src.val;
 		break;
 	case SrcImm:
-		src.type = OP_IMM;
-		src.ptr = (unsigned long *)_eip;
-		src.bytes = (d & ByteOp) ? 1 : op_bytes;
-		if (src.bytes == 8)
-			src.bytes = 4;
+		c->src.type = OP_IMM;
+		c->src.ptr = (unsigned long *)c->eip;
+		c->src.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
+		if (c->src.bytes == 8)
+			c->src.bytes = 4;
 		/* NB. Immediates are sign-extended as necessary. */
-		switch (src.bytes) {
+		switch (c->src.bytes) {
 		case 1:
-			src.val = insn_fetch(s8, 1, _eip);
+			c->src.val = insn_fetch(s8, 1, c->eip);
 			break;
 		case 2:
-			src.val = insn_fetch(s16, 2, _eip);
+			c->src.val = insn_fetch(s16, 2, c->eip);
 			break;
 		case 4:
-			src.val = insn_fetch(s32, 4, _eip);
+			c->src.val = insn_fetch(s32, 4, c->eip);
 			break;
 		}
 		break;
 	case SrcImmByte:
-		src.type = OP_IMM;
-		src.ptr = (unsigned long *)_eip;
-		src.bytes = 1;
-		src.val = insn_fetch(s8, 1, _eip);
+		c->src.type = OP_IMM;
+		c->src.ptr = (unsigned long *)c->eip;
+		c->src.bytes = 1;
+		c->src.val = insn_fetch(s8, 1, c->eip);
 		break;
 	}
 
 	/* Decode and fetch the destination operand: register or memory. */
-	switch (d & DstMask) {
+	switch (c->d & DstMask) {
 	case ImplicitOps:
 		/* Special instructions do their own operand decoding. */
 		goto special_insn;
 	case DstReg:
-		dst.type = OP_REG;
-		if ((d & ByteOp)
-		    && !(twobyte && (b == 0xb6 || b == 0xb7))) {
-			dst.ptr = decode_register(modrm_reg, _regs,
+		c->dst.type = OP_REG;
+		if ((c->d & ByteOp)
+		    && !(c->twobyte &&
+			(c->b == 0xb6 || c->b == 0xb7))) {
+			c->dst.ptr =
+				decode_register(c->modrm_reg, c->regs,
 						  (rex_prefix == 0));
-			dst.val = *(u8 *) dst.ptr;
-			dst.bytes = 1;
+			c->dst.val = *(u8 *) c->dst.ptr;
+			c->dst.bytes = 1;
 		} else {
-			dst.ptr = decode_register(modrm_reg, _regs, 0);
-			switch ((dst.bytes = op_bytes)) {
+			c->dst.ptr =
+			    decode_register(c->modrm_reg, c->regs, 0);
+			switch ((c->dst.bytes = c->op_bytes)) {
 			case 2:
-				dst.val = *(u16 *)dst.ptr;
+				c->dst.val = *(u16 *)c->dst.ptr;
 				break;
 			case 4:
-				dst.val = *(u32 *)dst.ptr;
+				c->dst.val = *(u32 *)c->dst.ptr;
 				break;
 			case 8:
-				dst.val = *(u64 *)dst.ptr;
+				c->dst.val = *(u64 *)c->dst.ptr;
 				break;
 			}
 		}
 		break;
 	case DstMem:
-		dst.type = OP_MEM;
-		dst.ptr = (unsigned long *)cr2;
-		dst.bytes = (d & ByteOp) ? 1 : op_bytes;
-		dst.val = 0;
-		/*
-		 * For instructions with a ModR/M byte, switch to register
-		 * access if Mod = 3.
-		 */
-		if ((d & ModRM) && modrm_mod == 3) {
-			dst.type = OP_REG;
+		c->dst.type = OP_MEM;
+		c->dst.ptr = (unsigned long *)cr2;
+		c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
+		c->dst.val = 0;
+		if ((c->d & ModRM) && c->modrm_mod == 3) {
+			c->dst.type = OP_REG;
 			break;
 		}
-		if (d & BitOp) {
-			unsigned long mask = ~(dst.bytes * 8 - 1);
+		if (c->d & BitOp) {
+			unsigned long mask = ~(c->dst.bytes * 8 - 1);
 
-			dst.ptr = (void *)dst.ptr + (src.val & mask) / 8;
+			c->dst.ptr = (void *)c->dst.ptr +
+						   (c->src.val & mask) / 8;
 		}
-		if (!(d & Mov) && /* optimisation - avoid slow emulated read */
-		    ((rc = ops->read_emulated((unsigned long)dst.ptr,
-					      &dst.val, dst.bytes, ctxt->vcpu)) != 0))
+		if (!(c->d & Mov) &&
+				   /* optimisation - avoid slow emulated read */
+		    ((rc = ops->read_emulated((unsigned long)c->dst.ptr,
+					   &c->dst.val,
+					  c->dst.bytes, ctxt->vcpu)) != 0))
 			goto done;
 		break;
 	}
-	dst.orig_val = dst.val;
+	c->dst.orig_val = c->dst.val;
 
-	if (twobyte)
+	if (c->twobyte)
 		goto twobyte_insn;
 
-	switch (b) {
+	switch (c->b) {
 	case 0x00 ... 0x05:
 	      add:		/* add */
-		emulate_2op_SrcV("add", src, dst, _eflags);
+		emulate_2op_SrcV("add", c->src, c->dst, _eflags);
 		break;
 	case 0x08 ... 0x0d:
 	      or:		/* or */
-		emulate_2op_SrcV("or", src, dst, _eflags);
+		emulate_2op_SrcV("or", c->src, c->dst, _eflags);
 		break;
 	case 0x10 ... 0x15:
 	      adc:		/* adc */
-		emulate_2op_SrcV("adc", src, dst, _eflags);
+		emulate_2op_SrcV("adc", c->src, c->dst, _eflags);
 		break;
 	case 0x18 ... 0x1d:
 	      sbb:		/* sbb */
-		emulate_2op_SrcV("sbb", src, dst, _eflags);
+		emulate_2op_SrcV("sbb", c->src, c->dst, _eflags);
 		break;
 	case 0x20 ... 0x23:
 	      and:		/* and */
-		emulate_2op_SrcV("and", src, dst, _eflags);
+		emulate_2op_SrcV("and", c->src, c->dst, _eflags);
 		break;
 	case 0x24:              /* and al imm8 */
-		dst.type = OP_REG;
-		dst.ptr = &_regs[VCPU_REGS_RAX];
-		dst.val = *(u8 *)dst.ptr;
-		dst.bytes = 1;
-		dst.orig_val = dst.val;
+		c->dst.type = OP_REG;
+		c->dst.ptr = &c->regs[VCPU_REGS_RAX];
+		c->dst.val = *(u8 *)c->dst.ptr;
+		c->dst.bytes = 1;
+		c->dst.orig_val = c->dst.val;
 		goto and;
 	case 0x25:              /* and ax imm16, or eax imm32 */
-		dst.type = OP_REG;
-		dst.bytes = op_bytes;
-		dst.ptr = &_regs[VCPU_REGS_RAX];
-		if (op_bytes == 2)
-			dst.val = *(u16 *)dst.ptr;
+		c->dst.type = OP_REG;
+		c->dst.bytes = c->op_bytes;
+		c->dst.ptr = &c->regs[VCPU_REGS_RAX];
+		if (c->op_bytes == 2)
+			c->dst.val = *(u16 *)c->dst.ptr;
 		else
-			dst.val = *(u32 *)dst.ptr;
-		dst.orig_val = dst.val;
+			c->dst.val = *(u32 *)c->dst.ptr;
+		c->dst.orig_val = c->dst.val;
 		goto and;
 	case 0x28 ... 0x2d:
 	      sub:		/* sub */
-		emulate_2op_SrcV("sub", src, dst, _eflags);
+		emulate_2op_SrcV("sub", c->src, c->dst, _eflags);
 		break;
 	case 0x30 ... 0x35:
 	      xor:		/* xor */
-		emulate_2op_SrcV("xor", src, dst, _eflags);
+		emulate_2op_SrcV("xor", c->src, c->dst, _eflags);
 		break;
 	case 0x38 ... 0x3d:
 	      cmp:		/* cmp */
-		emulate_2op_SrcV("cmp", src, dst, _eflags);
+		emulate_2op_SrcV("cmp", c->src, c->dst, _eflags);
 		break;
 	case 0x63:		/* movsxd */
 		if (mode != X86EMUL_MODE_PROT64)
 			goto cannot_emulate;
-		dst.val = (s32) src.val;
+		c->dst.val = (s32) c->src.val;
 		break;
 	case 0x80 ... 0x83:	/* Grp1 */
-		switch (modrm_reg) {
+		switch (c->modrm_reg) {
 		case 0:
 			goto add;
 		case 1:
@@ -1001,155 +1009,164 @@ done_prefixes:
 		break;
 	case 0x84 ... 0x85:
 	      test:		/* test */
-		emulate_2op_SrcV("test", src, dst, _eflags);
+		emulate_2op_SrcV("test", c->src, c->dst, _eflags);
 		break;
 	case 0x86 ... 0x87:	/* xchg */
 		/* Write back the register source. */
-		switch (dst.bytes) {
+		switch (c->dst.bytes) {
 		case 1:
-			*(u8 *) src.ptr = (u8) dst.val;
+			*(u8 *) c->src.ptr = (u8) c->dst.val;
 			break;
 		case 2:
-			*(u16 *) src.ptr = (u16) dst.val;
+			*(u16 *) c->src.ptr = (u16) c->dst.val;
 			break;
 		case 4:
-			*src.ptr = (u32) dst.val;
+			*c->src.ptr = (u32) c->dst.val;
 			break;	/* 64b reg: zero-extend */
 		case 8:
-			*src.ptr = dst.val;
+			*c->src.ptr = c->dst.val;
 			break;
 		}
 		/*
 		 * Write back the memory destination with implicit LOCK
 		 * prefix.
 		 */
-		dst.val = src.val;
-		lock_prefix = 1;
+		c->dst.val = c->src.val;
+		c->lock_prefix = 1;
 		break;
 	case 0x88 ... 0x8b:	/* mov */
 		goto mov;
 	case 0x8d: /* lea r16/r32, m */
-		dst.val = modrm_val;
+		c->dst.val = c->modrm_val;
 		break;
 	case 0x8f:		/* pop (sole member of Grp1a) */
 		/* 64-bit mode: POP always pops a 64-bit operand. */
 		if (mode == X86EMUL_MODE_PROT64)
-			dst.bytes = 8;
-		if ((rc = ops->read_std(register_address(ctxt->ss_base,
-							 _regs[VCPU_REGS_RSP]),
-					&dst.val, dst.bytes, ctxt->vcpu)) != 0)
+			c->dst.bytes = 8;
+		if ((rc = ops->read_std(register_address(
+						   ctxt->ss_base,
+						   c->regs[VCPU_REGS_RSP]),
+						   &c->dst.val,
+						   c->dst.bytes,
+						   ctxt->vcpu)) != 0)
 			goto done;
-		register_address_increment(_regs[VCPU_REGS_RSP], dst.bytes);
+		register_address_increment(c->regs[VCPU_REGS_RSP],
+					   c->dst.bytes);
 		break;
 	case 0xa0 ... 0xa1:	/* mov */
-		dst.ptr = (unsigned long *)&_regs[VCPU_REGS_RAX];
-		dst.val = src.val;
-		_eip += ad_bytes;	/* skip src displacement */
+		c->dst.ptr = (unsigned long *)&c->regs[VCPU_REGS_RAX];
+		c->dst.val = c->src.val;
+		/* skip src displacement */
+		c->eip += c->ad_bytes;
 		break;
 	case 0xa2 ... 0xa3:	/* mov */
-		dst.val = (unsigned long)_regs[VCPU_REGS_RAX];
-		_eip += ad_bytes;	/* skip dst displacement */
+		c->dst.val = (unsigned long)c->regs[VCPU_REGS_RAX];
+		/* skip c->dst displacement */
+		c->eip += c->ad_bytes;
 		break;
 	case 0xc0 ... 0xc1:
 	      grp2:		/* Grp2 */
-		switch (modrm_reg) {
+		switch (c->modrm_reg) {
 		case 0:	/* rol */
-			emulate_2op_SrcB("rol", src, dst, _eflags);
+			emulate_2op_SrcB("rol", c->src, c->dst, _eflags);
 			break;
 		case 1:	/* ror */
-			emulate_2op_SrcB("ror", src, dst, _eflags);
+			emulate_2op_SrcB("ror", c->src, c->dst, _eflags);
 			break;
 		case 2:	/* rcl */
-			emulate_2op_SrcB("rcl", src, dst, _eflags);
+			emulate_2op_SrcB("rcl", c->src, c->dst, _eflags);
 			break;
 		case 3:	/* rcr */
-			emulate_2op_SrcB("rcr", src, dst, _eflags);
+			emulate_2op_SrcB("rcr", c->src, c->dst, _eflags);
 			break;
 		case 4:	/* sal/shl */
 		case 6:	/* sal/shl */
-			emulate_2op_SrcB("sal", src, dst, _eflags);
+			emulate_2op_SrcB("sal", c->src, c->dst, _eflags);
 			break;
 		case 5:	/* shr */
-			emulate_2op_SrcB("shr", src, dst, _eflags);
+			emulate_2op_SrcB("shr", c->src, c->dst, _eflags);
 			break;
 		case 7:	/* sar */
-			emulate_2op_SrcB("sar", src, dst, _eflags);
+			emulate_2op_SrcB("sar", c->src, c->dst, _eflags);
 			break;
 		}
 		break;
 	case 0xc6 ... 0xc7:	/* mov (sole member of Grp11) */
 	mov:
-		dst.val = src.val;
+		c->dst.val = c->src.val;
 		break;
 	case 0xd0 ... 0xd1:	/* Grp2 */
-		src.val = 1;
+		c->src.val = 1;
 		goto grp2;
 	case 0xd2 ... 0xd3:	/* Grp2 */
-		src.val = _regs[VCPU_REGS_RCX];
+		c->src.val = c->regs[VCPU_REGS_RCX];
 		goto grp2;
 	case 0xf6 ... 0xf7:	/* Grp3 */
-		switch (modrm_reg) {
+		switch (c->modrm_reg) {
 		case 0 ... 1:	/* test */
 			/*
 			 * Special case in Grp3: test has an immediate
 			 * source operand.
 			 */
-			src.type = OP_IMM;
-			src.ptr = (unsigned long *)_eip;
-			src.bytes = (d & ByteOp) ? 1 : op_bytes;
-			if (src.bytes == 8)
-				src.bytes = 4;
-			switch (src.bytes) {
+			c->src.type = OP_IMM;
+			c->src.ptr = (unsigned long *)c->eip;
+			c->src.bytes = (c->d & ByteOp) ? 1 :
+							       c->op_bytes;
+			if (c->src.bytes == 8)
+				c->src.bytes = 4;
+			switch (c->src.bytes) {
 			case 1:
-				src.val = insn_fetch(s8, 1, _eip);
+				c->src.val = insn_fetch(s8, 1, c->eip);
 				break;
 			case 2:
-				src.val = insn_fetch(s16, 2, _eip);
+				c->src.val = insn_fetch(s16, 2, c->eip);
 				break;
 			case 4:
-				src.val = insn_fetch(s32, 4, _eip);
+				c->src.val = insn_fetch(s32, 4, c->eip);
 				break;
 			}
 			goto test;
 		case 2:	/* not */
-			dst.val = ~dst.val;
+			c->dst.val = ~c->dst.val;
 			break;
 		case 3:	/* neg */
-			emulate_1op("neg", dst, _eflags);
+			emulate_1op("neg", c->dst, _eflags);
 			break;
 		default:
 			goto cannot_emulate;
 		}
 		break;
 	case 0xfe ... 0xff:	/* Grp4/Grp5 */
-		switch (modrm_reg) {
+		switch (c->modrm_reg) {
 		case 0:	/* inc */
-			emulate_1op("inc", dst, _eflags);
+			emulate_1op("inc", c->dst, _eflags);
 			break;
 		case 1:	/* dec */
-			emulate_1op("dec", dst, _eflags);
+			emulate_1op("dec", c->dst, _eflags);
 			break;
 		case 4: /* jmp abs */
-			if (b == 0xff)
-				_eip = dst.val;
+			if (c->b == 0xff)
+				c->eip = c->dst.val;
 			else
 				goto cannot_emulate;
 			break;
 		case 6:	/* push */
 			/* 64-bit mode: PUSH always pushes a 64-bit operand. */
 			if (mode == X86EMUL_MODE_PROT64) {
-				dst.bytes = 8;
-				if ((rc = ops->read_std((unsigned long)dst.ptr,
-							&dst.val, 8,
-							ctxt->vcpu)) != 0)
+				c->dst.bytes = 8;
+				if ((rc = ops->read_std(
+						 (unsigned long)c->dst.ptr,
+						 &c->dst.val, 8,
+						 ctxt->vcpu)) != 0)
 					goto done;
 			}
-			register_address_increment(_regs[VCPU_REGS_RSP],
-						   -dst.bytes);
+			register_address_increment(c->regs[VCPU_REGS_RSP],
+						   -c->dst.bytes);
 			if ((rc = ops->write_emulated(
 				     register_address(ctxt->ss_base,
-						      _regs[VCPU_REGS_RSP]),
-				     &dst.val, dst.bytes, ctxt->vcpu)) != 0)
+					  c->regs[VCPU_REGS_RSP]),
+					  &c->dst.val,
+					   c->dst.bytes, ctxt->vcpu)) != 0)
 				goto done;
 			no_wb = 1;
 			break;
@@ -1161,34 +1178,40 @@ done_prefixes:
 
 writeback:
 	if (!no_wb) {
-		switch (dst.type) {
+		switch (c->dst.type) {
 		case OP_REG:
-			/* The 4-byte case *is* correct: in 64-bit mode we zero-extend. */
-			switch (dst.bytes) {
+			/* The 4-byte case *is* correct:
+			 * in 64-bit mode we zero-extend.
+			 */
+			switch (c->dst.bytes) {
 			case 1:
-				*(u8 *)dst.ptr = (u8)dst.val;
+				*(u8 *)c->dst.ptr = (u8)c->dst.val;
 				break;
 			case 2:
-				*(u16 *)dst.ptr = (u16)dst.val;
+				*(u16 *)c->dst.ptr = (u16)c->dst.val;
 				break;
 			case 4:
-				*dst.ptr = (u32)dst.val;
+				*c->dst.ptr = (u32)c->dst.val;
 				break;	/* 64b: zero-ext */
 			case 8:
-				*dst.ptr = dst.val;
+				*c->dst.ptr = c->dst.val;
 				break;
 			}
 			break;
 		case OP_MEM:
-			if (lock_prefix)
-				rc = ops->cmpxchg_emulated((unsigned long)dst.
-							   ptr, &dst.orig_val,
-							   &dst.val, dst.bytes,
-							   ctxt->vcpu);
+			if (c->lock_prefix)
+				rc = ops->cmpxchg_emulated(
+						(unsigned long)c->dst.ptr,
+						&c->dst.orig_val,
+						&c->dst.val,
+						c->dst.bytes,
+						ctxt->vcpu);
 			else
-				rc = ops->write_emulated((unsigned long)dst.ptr,
-							 &dst.val, dst.bytes,
-							 ctxt->vcpu);
+				rc = ops->write_emulated(
+						(unsigned long)c->dst.ptr,
+						&c->dst.val,
+						c->dst.bytes,
+						ctxt->vcpu);
 			if (rc != 0)
 				goto done;
 		default:
@@ -1197,173 +1220,185 @@ writeback:
 	}
 
 	/* Commit shadow register state. */
-	memcpy(ctxt->vcpu->regs, _regs, sizeof _regs);
+	memcpy(ctxt->vcpu->regs, c->regs, sizeof c->regs);
 	ctxt->eflags = _eflags;
-	ctxt->vcpu->rip = _eip;
+	ctxt->vcpu->rip = c->eip;
 
 done:
 	return (rc == X86EMUL_UNHANDLEABLE) ? -1 : 0;
 
 special_insn:
-	if (twobyte)
+	if (c->twobyte)
 		goto twobyte_special_insn;
-	switch(b) {
+	switch (c->b) {
 	case 0x50 ... 0x57:  /* push reg */
-		if (op_bytes == 2)
-			src.val = (u16) _regs[b & 0x7];
+		if (c->op_bytes == 2)
+			c->src.val = (u16) c->regs[c->b & 0x7];
 		else
-			src.val = (u32) _regs[b & 0x7];
-		dst.type  = OP_MEM;
-		dst.bytes = op_bytes;
-		dst.val = src.val;
-		register_address_increment(_regs[VCPU_REGS_RSP], -op_bytes);
-		dst.ptr = (void *) register_address(
-			ctxt->ss_base, _regs[VCPU_REGS_RSP]);
+			c->src.val = (u32) c->regs[c->b & 0x7];
+		c->dst.type  = OP_MEM;
+		c->dst.bytes = c->op_bytes;
+		c->dst.val = c->src.val;
+		register_address_increment(c->regs[VCPU_REGS_RSP],
+					   -c->op_bytes);
+		c->dst.ptr = (void *) register_address(
+			ctxt->ss_base, c->regs[VCPU_REGS_RSP]);
 		break;
 	case 0x58 ... 0x5f: /* pop reg */
-		dst.ptr = (unsigned long *)&_regs[b & 0x7];
+		c->dst.ptr =
+				(unsigned long *)&c->regs[c->b & 0x7];
 	pop_instruction:
 		if ((rc = ops->read_std(register_address(ctxt->ss_base,
-			_regs[VCPU_REGS_RSP]), dst.ptr, op_bytes, ctxt->vcpu))
-			!= 0)
+			c->regs[VCPU_REGS_RSP]), c->dst.ptr,
+			c->op_bytes, ctxt->vcpu)) != 0)
 			goto done;
 
-		register_address_increment(_regs[VCPU_REGS_RSP], op_bytes);
+		register_address_increment(c->regs[VCPU_REGS_RSP],
+					   c->op_bytes);
 		no_wb = 1; /* Disable writeback. */
 		break;
 	case 0x6a: /* push imm8 */
-		src.val = 0L;
-		src.val = insn_fetch(s8, 1, _eip);
-	push:
-		dst.type  = OP_MEM;
-		dst.bytes = op_bytes;
-		dst.val = src.val;
-		register_address_increment(_regs[VCPU_REGS_RSP], -op_bytes);
-		dst.ptr = (void *) register_address(ctxt->ss_base,
-							_regs[VCPU_REGS_RSP]);
+		c->src.val = 0L;
+		c->src.val = insn_fetch(s8, 1, c->eip);
+push:
+		c->dst.type  = OP_MEM;
+		c->dst.bytes = c->op_bytes;
+		c->dst.val = c->src.val;
+		register_address_increment(c->regs[VCPU_REGS_RSP],
+					   -c->op_bytes);
+		c->dst.ptr = (void *) register_address(ctxt->ss_base,
+						       c->regs[VCPU_REGS_RSP]);
 		break;
 	case 0x6c:		/* insb */
 	case 0x6d:		/* insw/insd */
 		 if (kvm_emulate_pio_string(ctxt->vcpu, NULL,
-				1, 					/* in */
-				(d & ByteOp) ? 1 : op_bytes, 		/* size */
-				rep_prefix ?
-				address_mask(_regs[VCPU_REGS_RCX]) : 1,	/* count */
-				(_eflags & EFLG_DF),			/* down */
+				1,
+				(c->d & ByteOp) ? 1 : c->op_bytes,
+				c->rep_prefix ?
+				address_mask(c->regs[VCPU_REGS_RCX]) : 1,
+				(_eflags & EFLG_DF),
 				register_address(ctxt->es_base,
-						 _regs[VCPU_REGS_RDI]),	/* address */
-				rep_prefix,
-				_regs[VCPU_REGS_RDX]			/* port */
-				) == 0)
+						 c->regs[VCPU_REGS_RDI]),
+				c->rep_prefix,
+				c->regs[VCPU_REGS_RDX]) == 0)
 			return -1;
 		return 0;
 	case 0x6e:		/* outsb */
 	case 0x6f:		/* outsw/outsd */
 		if (kvm_emulate_pio_string(ctxt->vcpu, NULL,
-				0, 					/* in */
-				(d & ByteOp) ? 1 : op_bytes, 		/* size */
-				rep_prefix ?
-				address_mask(_regs[VCPU_REGS_RCX]) : 1,	/* count */
-				(_eflags & EFLG_DF),			/* down */
-				register_address(override_base ?
-						 *override_base : ctxt->ds_base,
-						 _regs[VCPU_REGS_RSI]),	/* address */
-				rep_prefix,
-				_regs[VCPU_REGS_RDX]			/* port */
-				) == 0)
+				0,
+				(c->d & ByteOp) ? 1 : c->op_bytes,
+				c->rep_prefix ?
+				address_mask(c->regs[VCPU_REGS_RCX]) : 1,
+				(_eflags & EFLG_DF),
+				register_address(c->override_base ?
+							*c->override_base :
+							ctxt->ds_base,
+						 c->regs[VCPU_REGS_RSI]),
+				c->rep_prefix,
+				c->regs[VCPU_REGS_RDX]) == 0)
 			return -1;
 		return 0;
 	case 0x70 ... 0x7f: /* jcc (short) */ {
-		int rel = insn_fetch(s8, 1, _eip);
+		int rel = insn_fetch(s8, 1, c->eip);
 
-		if (test_cc(b, _eflags))
+		if (test_cc(c->b, _eflags))
 		JMP_REL(rel);
 		break;
 	}
 	case 0x9c: /* pushf */
-		src.val =  (unsigned long) _eflags;
+		c->src.val =  (unsigned long) _eflags;
 		goto push;
 	case 0x9d: /* popf */
-		dst.ptr = (unsigned long *) &_eflags;
+		c->dst.ptr = (unsigned long *) &_eflags;
 		goto pop_instruction;
 	case 0xc3: /* ret */
-		dst.ptr = &_eip;
+		c->dst.ptr = &c->eip;
 		goto pop_instruction;
 	case 0xf4:              /* hlt */
 		ctxt->vcpu->halt_request = 1;
 		goto done;
 	}
-	if (rep_prefix) {
-		if (_regs[VCPU_REGS_RCX] == 0) {
-			ctxt->vcpu->rip = _eip;
+	if (c->rep_prefix) {
+		if (c->regs[VCPU_REGS_RCX] == 0) {
+			ctxt->vcpu->rip = c->eip;
 			goto done;
 		}
-		_regs[VCPU_REGS_RCX]--;
-		_eip = ctxt->vcpu->rip;
+		c->regs[VCPU_REGS_RCX]--;
+		c->eip = ctxt->vcpu->rip;
 	}
-	switch (b) {
+	switch (c->b) {
 	case 0xa4 ... 0xa5:	/* movs */
-		dst.type = OP_MEM;
-		dst.bytes = (d & ByteOp) ? 1 : op_bytes;
-		dst.ptr = (unsigned long *)register_address(ctxt->es_base,
-							_regs[VCPU_REGS_RDI]);
+		c->dst.type = OP_MEM;
+		c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
+		c->dst.ptr = (unsigned long *)register_address(
+						   ctxt->es_base,
+						   c->regs[VCPU_REGS_RDI]);
 		if ((rc = ops->read_emulated(register_address(
-		      override_base ? *override_base : ctxt->ds_base,
-		      _regs[VCPU_REGS_RSI]), &dst.val, dst.bytes, ctxt->vcpu)) != 0)
+		      c->override_base ? *c->override_base :
+					ctxt->ds_base,
+					c->regs[VCPU_REGS_RSI]),
+					&c->dst.val,
+					c->dst.bytes, ctxt->vcpu)) != 0)
 			goto done;
-		register_address_increment(_regs[VCPU_REGS_RSI],
-			     (_eflags & EFLG_DF) ? -dst.bytes : dst.bytes);
-		register_address_increment(_regs[VCPU_REGS_RDI],
-			     (_eflags & EFLG_DF) ? -dst.bytes : dst.bytes);
+		register_address_increment(c->regs[VCPU_REGS_RSI],
+				       (_eflags & EFLG_DF) ? -c->dst.bytes
+							   : c->dst.bytes);
+		register_address_increment(c->regs[VCPU_REGS_RDI],
+				       (_eflags & EFLG_DF) ? -c->dst.bytes
+							   : c->dst.bytes);
 		break;
 	case 0xa6 ... 0xa7:	/* cmps */
 		DPRINTF("Urk! I don't handle CMPS.\n");
 		goto cannot_emulate;
 	case 0xaa ... 0xab:	/* stos */
-		dst.type = OP_MEM;
-		dst.bytes = (d & ByteOp) ? 1 : op_bytes;
-		dst.ptr = (unsigned long *)cr2;
-		dst.val = _regs[VCPU_REGS_RAX];
-		register_address_increment(_regs[VCPU_REGS_RDI],
-			     (_eflags & EFLG_DF) ? -dst.bytes : dst.bytes);
+		c->dst.type = OP_MEM;
+		c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
+		c->dst.ptr = (unsigned long *)cr2;
+		c->dst.val = c->regs[VCPU_REGS_RAX];
+		register_address_increment(c->regs[VCPU_REGS_RDI],
+				       (_eflags & EFLG_DF) ? -c->dst.bytes
+							   : c->dst.bytes);
 		break;
 	case 0xac ... 0xad:	/* lods */
-		dst.type = OP_REG;
-		dst.bytes = (d & ByteOp) ? 1 : op_bytes;
-		dst.ptr = (unsigned long *)&_regs[VCPU_REGS_RAX];
-		if ((rc = ops->read_emulated(cr2, &dst.val, dst.bytes,
+		c->dst.type = OP_REG;
+		c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
+		c->dst.ptr = (unsigned long *)&c->regs[VCPU_REGS_RAX];
+		if ((rc = ops->read_emulated(cr2, &c->dst.val,
+					     c->dst.bytes,
 					     ctxt->vcpu)) != 0)
 			goto done;
-		register_address_increment(_regs[VCPU_REGS_RSI],
-			   (_eflags & EFLG_DF) ? -dst.bytes : dst.bytes);
+		register_address_increment(c->regs[VCPU_REGS_RSI],
+				       (_eflags & EFLG_DF) ? -c->dst.bytes
+							   : c->dst.bytes);
 		break;
 	case 0xae ... 0xaf:	/* scas */
 		DPRINTF("Urk! I don't handle SCAS.\n");
 		goto cannot_emulate;
 	case 0xe8: /* call (near) */ {
 		long int rel;
-		switch (op_bytes) {
+		switch (c->op_bytes) {
 		case 2:
-			rel = insn_fetch(s16, 2, _eip);
+			rel = insn_fetch(s16, 2, c->eip);
 			break;
 		case 4:
-			rel = insn_fetch(s32, 4, _eip);
+			rel = insn_fetch(s32, 4, c->eip);
 			break;
 		case 8:
-			rel = insn_fetch(s64, 8, _eip);
+			rel = insn_fetch(s64, 8, c->eip);
 			break;
 		default:
 			DPRINTF("Call: Invalid op_bytes\n");
 			goto cannot_emulate;
 		}
-		src.val = (unsigned long) _eip;
+		c->src.val = (unsigned long) c->eip;
 		JMP_REL(rel);
-		op_bytes = ad_bytes;
+		c->op_bytes = c->ad_bytes;
 		goto push;
 	}
 	case 0xe9: /* jmp rel */
 	case 0xeb: /* jmp rel short */
-		JMP_REL(src.val);
+		JMP_REL(c->src.val);
 		no_wb = 1; /* Disable writeback. */
 		break;
 
@@ -1372,16 +1407,16 @@ special_insn:
 	goto writeback;
 
 twobyte_insn:
-	switch (b) {
+	switch (c->b) {
 	case 0x01: /* lgdt, lidt, lmsw */
 		/* Disable writeback. */
 		no_wb = 1;
-		switch (modrm_reg) {
+		switch (c->modrm_reg) {
 			u16 size;
 			unsigned long address;
 
 		case 0: /* vmcall */
-			if (modrm_mod != 3 || modrm_rm != 1)
+			if (c->modrm_mod != 3 || c->modrm_rm != 1)
 				goto cannot_emulate;
 
 			rc = kvm_fix_hypercall(ctxt->vcpu);
@@ -1391,37 +1426,37 @@ twobyte_insn:
 			kvm_emulate_hypercall(ctxt->vcpu);
 			break;
 		case 2: /* lgdt */
-			rc = read_descriptor(ctxt, ops, src.ptr,
-					     &size, &address, op_bytes);
+			rc = read_descriptor(ctxt, ops, c->src.ptr,
+					     &size, &address, c->op_bytes);
 			if (rc)
 				goto done;
 			realmode_lgdt(ctxt->vcpu, size, address);
 			break;
 		case 3: /* lidt/vmmcall */
-			if (modrm_mod == 3 && modrm_rm == 1) {
+			if (c->modrm_mod == 3 && c->modrm_rm == 1) {
 				rc = kvm_fix_hypercall(ctxt->vcpu);
 				if (rc)
 					goto done;
 				kvm_emulate_hypercall(ctxt->vcpu);
 			} else {
-				rc = read_descriptor(ctxt, ops, src.ptr,
+				rc = read_descriptor(ctxt, ops, c->src.ptr,
 						     &size, &address,
-						     op_bytes);
+						     c->op_bytes);
 				if (rc)
 					goto done;
 				realmode_lidt(ctxt->vcpu, size, address);
 			}
 			break;
 		case 4: /* smsw */
-			if (modrm_mod != 3)
+			if (c->modrm_mod != 3)
 				goto cannot_emulate;
-			*(u16 *)&_regs[modrm_rm]
+			*(u16 *)&c->regs[c->modrm_rm]
 				= realmode_get_cr(ctxt->vcpu, 0);
 			break;
 		case 6: /* lmsw */
-			if (modrm_mod != 3)
+			if (c->modrm_mod != 3)
 				goto cannot_emulate;
-			realmode_lmsw(ctxt->vcpu, (u16)modrm_val, &_eflags);
+			realmode_lmsw(ctxt->vcpu, (u16)c->modrm_val, &_eflags);
 			break;
 		case 7: /* invlpg*/
 			emulate_invlpg(ctxt->vcpu, cr2);
@@ -1432,24 +1467,26 @@ twobyte_insn:
 		break;
 	case 0x21: /* mov from dr to reg */
 		no_wb = 1;
-		if (modrm_mod != 3)
+		if (c->modrm_mod != 3)
 			goto cannot_emulate;
-		rc = emulator_get_dr(ctxt, modrm_reg, &_regs[modrm_rm]);
+		rc = emulator_get_dr(ctxt, c->modrm_reg,
+				     &c->regs[c->modrm_rm]);
 		break;
 	case 0x23: /* mov from reg to dr */
 		no_wb = 1;
-		if (modrm_mod != 3)
+		if (c->modrm_mod != 3)
 			goto cannot_emulate;
-		rc = emulator_set_dr(ctxt, modrm_reg, _regs[modrm_rm]);
+		rc = emulator_set_dr(ctxt, c->modrm_reg,
+				     c->regs[c->modrm_rm]);
 		break;
 	case 0x40 ... 0x4f:	/* cmov */
-		dst.val = dst.orig_val = src.val;
+		c->dst.val = c->dst.orig_val = c->src.val;
 		no_wb = 1;
 		/*
 		 * First, assume we're decoding an even cmov opcode
 		 * (lsb == 0).
 		 */
-		switch ((b & 15) >> 1) {
+		switch ((c->b & 15) >> 1) {
 		case 0:	/* cmovo */
 			no_wb = (_eflags & EFLG_OF) ? 0 : 1;
 			break;
@@ -1477,46 +1514,50 @@ twobyte_insn:
 			break;
 		}
 		/* Odd cmov opcodes (lsb == 1) have inverted sense. */
-		no_wb ^= b & 1;
+		no_wb ^= c->b & 1;
 		break;
 	case 0xa3:
 	      bt:		/* bt */
-		src.val &= (dst.bytes << 3) - 1; /* only subword offset */
-		emulate_2op_SrcV_nobyte("bt", src, dst, _eflags);
+		/* only subword offset */
+		c->src.val &= (c->dst.bytes << 3) - 1;
+		emulate_2op_SrcV_nobyte("bt", c->src, c->dst, _eflags);
 		break;
 	case 0xab:
 	      bts:		/* bts */
-		src.val &= (dst.bytes << 3) - 1; /* only subword offset */
-		emulate_2op_SrcV_nobyte("bts", src, dst, _eflags);
+		/* only subword offset */
+		c->src.val &= (c->dst.bytes << 3) - 1;
+		emulate_2op_SrcV_nobyte("bts", c->src, c->dst, _eflags);
 		break;
 	case 0xb0 ... 0xb1:	/* cmpxchg */
 		/*
 		 * Save real source value, then compare EAX against
 		 * destination.
 		 */
-		src.orig_val = src.val;
-		src.val = _regs[VCPU_REGS_RAX];
-		emulate_2op_SrcV("cmp", src, dst, _eflags);
+		c->src.orig_val = c->src.val;
+		c->src.val = c->regs[VCPU_REGS_RAX];
+		emulate_2op_SrcV("cmp", c->src, c->dst, _eflags);
 		if (_eflags & EFLG_ZF) {
 			/* Success: write back to memory. */
-			dst.val = src.orig_val;
+			c->dst.val = c->src.orig_val;
 		} else {
 			/* Failure: write the value we saw to EAX. */
-			dst.type = OP_REG;
-			dst.ptr = (unsigned long *)&_regs[VCPU_REGS_RAX];
+			c->dst.type = OP_REG;
+			c->dst.ptr = (unsigned long *)&c->regs[VCPU_REGS_RAX];
 		}
 		break;
 	case 0xb3:
 	      btr:		/* btr */
-		src.val &= (dst.bytes << 3) - 1; /* only subword offset */
-		emulate_2op_SrcV_nobyte("btr", src, dst, _eflags);
+		/* only subword offset */
+		c->src.val &= (c->dst.bytes << 3) - 1;
+		emulate_2op_SrcV_nobyte("btr", c->src, c->dst, _eflags);
 		break;
 	case 0xb6 ... 0xb7:	/* movzx */
-		dst.bytes = op_bytes;
-		dst.val = (d & ByteOp) ? (u8) src.val : (u16) src.val;
+		c->dst.bytes = c->op_bytes;
+		c->dst.val = (c->d & ByteOp) ? (u8) c->src.val
+						       : (u16) c->src.val;
 		break;
 	case 0xba:		/* Grp8 */
-		switch (modrm_reg & 3) {
+		switch (c->modrm_reg & 3) {
 		case 0:
 			goto bt;
 		case 1:
@@ -1529,16 +1570,19 @@ twobyte_insn:
 		break;
 	case 0xbb:
 	      btc:		/* btc */
-		src.val &= (dst.bytes << 3) - 1; /* only subword offset */
-		emulate_2op_SrcV_nobyte("btc", src, dst, _eflags);
+		/* only subword offset */
+		c->src.val &= (c->dst.bytes << 3) - 1;
+		emulate_2op_SrcV_nobyte("btc", c->src, c->dst, _eflags);
 		break;
 	case 0xbe ... 0xbf:	/* movsx */
-		dst.bytes = op_bytes;
-		dst.val = (d & ByteOp) ? (s8) src.val : (s16) src.val;
+		c->dst.bytes = c->op_bytes;
+		c->dst.val = (c->d & ByteOp) ? (s8) c->src.val :
+							(s16) c->src.val;
 		break;
 	case 0xc3:		/* movnti */
-		dst.bytes = op_bytes;
-		dst.val = (op_bytes == 4) ? (u32) src.val : (u64) src.val;
+		c->dst.bytes = c->op_bytes;
+		c->dst.val = (c->op_bytes == 4) ? (u32) c->src.val :
+			                                (u64) c->src.val;
 		break;
 	}
 	goto writeback;
@@ -1546,7 +1590,7 @@ twobyte_insn:
 twobyte_special_insn:
 	/* Disable writeback. */
 	no_wb = 1;
-	switch (b) {
+	switch (c->b) {
 	case 0x06:
 		emulate_clts(ctxt->vcpu);
 		break;
@@ -1558,56 +1602,59 @@ twobyte_special_insn:
 	case 0x18:		/* Grp16 (prefetch/nop) */
 		break;
 	case 0x20: /* mov cr, reg */
-		if (modrm_mod != 3)
+		if (c->modrm_mod != 3)
 			goto cannot_emulate;
-		_regs[modrm_rm] = realmode_get_cr(ctxt->vcpu, modrm_reg);
+		c->regs[c->modrm_rm] =
+				realmode_get_cr(ctxt->vcpu, c->modrm_reg);
 		break;
 	case 0x22: /* mov reg, cr */
-		if (modrm_mod != 3)
+		if (c->modrm_mod != 3)
 			goto cannot_emulate;
-		realmode_set_cr(ctxt->vcpu, modrm_reg, modrm_val, &_eflags);
+		realmode_set_cr(ctxt->vcpu,
+				c->modrm_reg, c->modrm_val, &_eflags);
 		break;
 	case 0x30:
 		/* wrmsr */
-		msr_data = (u32)_regs[VCPU_REGS_RAX]
-			| ((u64)_regs[VCPU_REGS_RDX] << 32);
-		rc = kvm_set_msr(ctxt->vcpu, _regs[VCPU_REGS_RCX], msr_data);
+		msr_data = (u32)c->regs[VCPU_REGS_RAX]
+			| ((u64)c->regs[VCPU_REGS_RDX] << 32);
+		rc = kvm_set_msr(ctxt->vcpu, c->regs[VCPU_REGS_RCX], msr_data);
 		if (rc) {
 			kvm_x86_ops->inject_gp(ctxt->vcpu, 0);
-			_eip = ctxt->vcpu->rip;
+			c->eip = ctxt->vcpu->rip;
 		}
 		rc = X86EMUL_CONTINUE;
 		break;
 	case 0x32:
 		/* rdmsr */
-		rc = kvm_get_msr(ctxt->vcpu, _regs[VCPU_REGS_RCX], &msr_data);
+		rc = kvm_get_msr(ctxt->vcpu,
+				 c->regs[VCPU_REGS_RCX], &msr_data);
 		if (rc) {
 			kvm_x86_ops->inject_gp(ctxt->vcpu, 0);
-			_eip = ctxt->vcpu->rip;
+			c->eip = ctxt->vcpu->rip;
 		} else {
-			_regs[VCPU_REGS_RAX] = (u32)msr_data;
-			_regs[VCPU_REGS_RDX] = msr_data >> 32;
+			c->regs[VCPU_REGS_RAX] = (u32)msr_data;
+			c->regs[VCPU_REGS_RDX] = msr_data >> 32;
 		}
 		rc = X86EMUL_CONTINUE;
 		break;
 	case 0x80 ... 0x8f: /* jnz rel, etc*/ {
 		long int rel;
 
-		switch (op_bytes) {
+		switch (c->op_bytes) {
 		case 2:
-			rel = insn_fetch(s16, 2, _eip);
+			rel = insn_fetch(s16, 2, c->eip);
 			break;
 		case 4:
-			rel = insn_fetch(s32, 4, _eip);
+			rel = insn_fetch(s32, 4, c->eip);
 			break;
 		case 8:
-			rel = insn_fetch(s64, 8, _eip);
+			rel = insn_fetch(s64, 8, c->eip);
 			break;
 		default:
 			DPRINTF("jnz: Invalid op_bytes\n");
 			goto cannot_emulate;
 		}
-		if (test_cc(b, _eflags))
+		if (test_cc(c->b, _eflags))
 			JMP_REL(rel);
 		break;
 	}
@@ -1617,14 +1664,16 @@ twobyte_special_insn:
 			if ((rc = ops->read_emulated(cr2, &old, 8, ctxt->vcpu))
 									!= 0)
 				goto done;
-			if (((u32) (old >> 0) != (u32) _regs[VCPU_REGS_RAX]) ||
-			    ((u32) (old >> 32) != (u32) _regs[VCPU_REGS_RDX])) {
-				_regs[VCPU_REGS_RAX] = (u32) (old >> 0);
-				_regs[VCPU_REGS_RDX] = (u32) (old >> 32);
+			if (((u32) (old >> 0) !=
+					(u32) c->regs[VCPU_REGS_RAX]) ||
+			    ((u32) (old >> 32) !=
+					(u32) c->regs[VCPU_REGS_RDX])) {
+				c->regs[VCPU_REGS_RAX] = (u32) (old >> 0);
+				c->regs[VCPU_REGS_RDX] = (u32) (old >> 32);
 				_eflags &= ~EFLG_ZF;
 			} else {
-				new = ((u64)_regs[VCPU_REGS_RCX] << 32)
-					| (u32) _regs[VCPU_REGS_RBX];
+				new = ((u64)c->regs[VCPU_REGS_RCX] << 32)
+					| (u32) c->regs[VCPU_REGS_RBX];
 				if ((rc = ops->cmpxchg_emulated(cr2, &old,
 							  &new, 8, ctxt->vcpu)) != 0)
 					goto done;
@@ -1636,6 +1685,6 @@ twobyte_special_insn:
 	goto writeback;
 
 cannot_emulate:
-	DPRINTF("Cannot emulate %02x\n", b);
+	DPRINTF("Cannot emulate %02x\n", c->b);
 	return -1;
 }
diff --git a/drivers/kvm/x86_emulate.h b/drivers/kvm/x86_emulate.h
index 92c73aa..c354200 100644
--- a/drivers/kvm/x86_emulate.h
+++ b/drivers/kvm/x86_emulate.h
@@ -112,6 +112,36 @@ struct x86_emulate_ops {
 
 };
 
+/* Type, address-of, and value of an instruction's operand. */
+struct operand {
+	enum { OP_REG, OP_MEM, OP_IMM } type;
+	unsigned int bytes;
+	unsigned long val, orig_val, *ptr;
+};
+
+struct decode_cache {
+	u8 twobyte;
+	u8 b;
+	u8 lock_prefix;
+	u8 rep_prefix;
+	u8 op_bytes;
+	u8 ad_bytes;
+	struct operand src;
+	struct operand dst;
+	unsigned long *override_base;
+	unsigned int d;
+	unsigned long regs[NR_VCPU_REGS];
+	unsigned long eip;
+	/* modrm */
+	u8 modrm;
+	u8 modrm_mod;
+	u8 modrm_reg;
+	u8 modrm_rm;
+	u8 use_modrm_ea;
+	unsigned long modrm_ea;
+	unsigned long modrm_val;
+};
+
 struct x86_emulate_ctxt {
 	/* Register state before/after emulation. */
 	struct kvm_vcpu *vcpu;
@@ -129,6 +159,10 @@ struct x86_emulate_ctxt {
 	unsigned long ss_base;
 	unsigned long gs_base;
 	unsigned long fs_base;
+
+	/* decode cache */
+
+	struct decode_cache decode;
 };
 
 /* Execution mode, passed to the emulator. */
-- 
1.5.3.7




* [PATCH 05/50] KVM: x86 emulator: move all decoding process to function x86_decode_insn()
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (3 preceding siblings ...)
  2007-12-23 14:50   ` [PATCH 04/50] KVM: x86 emulator: move all x86_emulate_memop() to a structure Avi Kivity
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:50   ` [PATCH 06/50] KVM: emulate_instruction() calls now x86_decode_insn() and x86_emulate_insn() Avi Kivity
                     ` (44 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Laurent Vivier

From: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>

Split the decoding process into a new function x86_decode_insn().

Signed-off-by: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/x86_emulate.c |   77 +++++++++++++++++++++++++++++++--------------
 1 files changed, 53 insertions(+), 24 deletions(-)

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index f946182..f20534b 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -517,20 +517,16 @@ static int test_cc(unsigned int condition, unsigned int flags)
 }
 
 int
-x86_emulate_memop(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
+x86_decode_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 {
 	struct decode_cache *c = &ctxt->decode;
 	u8 sib, rex_prefix = 0;
 	unsigned int i;
 	int rc = 0;
-	unsigned long cr2 = ctxt->cr2;
 	int mode = ctxt->mode;
 	int index_reg = 0, base_reg = 0, scale, rip_relative = 0;
-	int no_wb = 0;
-	u64 msr_data;
 
 	/* Shadow copy of register state. Committed on successful emulation. */
-	unsigned long _eflags = ctxt->eflags;
 
 	memset(c, 0, sizeof(struct decode_cache));
 	c->eip = ctxt->vcpu->rip;
@@ -622,8 +618,10 @@ done_prefixes:
 		}
 
 		/* Unrecognised? */
-		if (c->d == 0)
-			goto cannot_emulate;
+		if (c->d == 0) {
+			DPRINTF("Cannot emulate %02x\n", c->b);
+			return -1;
+		}
 	}
 
 	/* ModRM and SIB bytes. */
@@ -776,7 +774,6 @@ done_prefixes:
 		}
 		if (c->ad_bytes != 8)
 			c->modrm_ea = (u32)c->modrm_ea;
-		cr2 = c->modrm_ea;
 	modrm_done:
 		;
 	}
@@ -838,13 +835,6 @@ done_prefixes:
 			break;
 		}
 		c->src.type = OP_MEM;
-		c->src.ptr = (unsigned long *)cr2;
-		c->src.val = 0;
-		if ((rc = ops->read_emulated((unsigned long)c->src.ptr,
-					   &c->src.val,
-					   c->src.bytes, ctxt->vcpu)) != 0)
-			goto done;
-		c->src.orig_val = c->src.val;
 		break;
 	case SrcImm:
 		c->src.type = OP_IMM;
@@ -877,7 +867,7 @@ done_prefixes:
 	switch (c->d & DstMask) {
 	case ImplicitOps:
 		/* Special instructions do their own operand decoding. */
-		goto special_insn;
+		return 0;
 	case DstReg:
 		c->dst.type = OP_REG;
 		if ((c->d & ByteOp)
@@ -905,14 +895,54 @@ done_prefixes:
 		}
 		break;
 	case DstMem:
-		c->dst.type = OP_MEM;
-		c->dst.ptr = (unsigned long *)cr2;
-		c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
-		c->dst.val = 0;
 		if ((c->d & ModRM) && c->modrm_mod == 3) {
 			c->dst.type = OP_REG;
 			break;
 		}
+		c->dst.type = OP_MEM;
+		break;
+	}
+
+done:
+	return (rc == X86EMUL_UNHANDLEABLE) ? -1 : 0;
+}
+
+int
+x86_emulate_memop(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
+{
+	unsigned long cr2 = ctxt->cr2;
+	int no_wb = 0;
+	u64 msr_data;
+	unsigned long _eflags = ctxt->eflags;
+	struct decode_cache *c = &ctxt->decode;
+	int rc;
+
+	rc = x86_decode_insn(ctxt, ops);
+	if (rc)
+		return rc;
+
+	if ((c->d & ModRM) && (c->modrm_mod != 3))
+		cr2 = c->modrm_ea;
+
+	if (c->src.type == OP_MEM) {
+		c->src.ptr = (unsigned long *)cr2;
+		c->src.val = 0;
+		if ((rc = ops->read_emulated((unsigned long)c->src.ptr,
+					     &c->src.val,
+					     c->src.bytes,
+					     ctxt->vcpu)) != 0)
+			goto done;
+		c->src.orig_val = c->src.val;
+	}
+
+	if ((c->d & DstMask) == ImplicitOps)
+		goto special_insn;
+
+
+	if (c->dst.type == OP_MEM) {
+		c->dst.ptr = (unsigned long *)cr2;
+		c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
+		c->dst.val = 0;
 		if (c->d & BitOp) {
 			unsigned long mask = ~(c->dst.bytes * 8 - 1);
 
@@ -925,7 +955,6 @@ done_prefixes:
 					   &c->dst.val,
 					  c->dst.bytes, ctxt->vcpu)) != 0))
 			goto done;
-		break;
 	}
 	c->dst.orig_val = c->dst.val;
 
@@ -983,7 +1012,7 @@ done_prefixes:
 		emulate_2op_SrcV("cmp", c->src, c->dst, _eflags);
 		break;
 	case 0x63:		/* movsxd */
-		if (mode != X86EMUL_MODE_PROT64)
+		if (ctxt->mode != X86EMUL_MODE_PROT64)
 			goto cannot_emulate;
 		c->dst.val = (s32) c->src.val;
 		break;
@@ -1041,7 +1070,7 @@ done_prefixes:
 		break;
 	case 0x8f:		/* pop (sole member of Grp1a) */
 		/* 64-bit mode: POP always pops a 64-bit operand. */
-		if (mode == X86EMUL_MODE_PROT64)
+		if (ctxt->mode == X86EMUL_MODE_PROT64)
 			c->dst.bytes = 8;
 		if ((rc = ops->read_std(register_address(
 						   ctxt->ss_base,
@@ -1152,7 +1181,7 @@ done_prefixes:
 			break;
 		case 6:	/* push */
 			/* 64-bit mode: PUSH always pushes a 64-bit operand. */
-			if (mode == X86EMUL_MODE_PROT64) {
+			if (ctxt->mode == X86EMUL_MODE_PROT64) {
 				c->dst.bytes = 8;
 				if ((rc = ops->read_std(
 						 (unsigned long)c->dst.ptr,
-- 
1.5.3.7




* [PATCH 06/50] KVM: emulate_instruction() calls now x86_decode_insn() and x86_emulate_insn()
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (4 preceding siblings ...)
  2007-12-23 14:50   ` [PATCH 05/50] KVM: x86 emulator: move all decoding process to function x86_decode_insn() Avi Kivity
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:50   ` [PATCH 07/50] KVM: Call x86_decode_insn() only when needed Avi Kivity
                     ` (43 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Laurent Vivier

From: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>

emulate_instruction() now calls x86_decode_insn() and x86_emulate_insn().
x86_emulate_insn() is x86_emulate_memop() without the decoding part.
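
As an illustration only, the decode-then-execute call pattern can be
sketched with a toy two-opcode machine.  Everything named toy_* below is
hypothetical and is not part of the KVM code; the sketch only assumes that
decoding fills a cache without touching register state, and that execution
commits state from that cache.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical toy types; these are NOT the KVM structures. */
struct toy_decode_cache {
	uint8_t opcode;		/* decoded opcode byte */
	uint8_t imm;		/* decoded immediate operand */
	unsigned long eip;	/* instruction pointer after the fetch */
};

struct toy_ctxt {
	const uint8_t *code;	/* guest instruction bytes */
	unsigned long rip;	/* committed instruction pointer */
	unsigned long acc;	/* a single toy register */
	struct toy_decode_cache decode;
};

/* Phase 1: decode only.  Fills the cache, never the register state. */
static int toy_decode_insn(struct toy_ctxt *ctxt)
{
	struct toy_decode_cache *c = &ctxt->decode;

	memset(c, 0, sizeof(*c));
	c->eip = ctxt->rip;
	c->opcode = ctxt->code[c->eip++];
	if (c->opcode != 0x01 && c->opcode != 0x02)
		return -1;	/* unrecognised opcode */
	c->imm = ctxt->code[c->eip++];
	return 0;
}

/* Phase 2: execute from the cache and commit the new state. */
static int toy_emulate_insn(struct toy_ctxt *ctxt)
{
	struct toy_decode_cache *c = &ctxt->decode;

	switch (c->opcode) {
	case 0x01:		/* toy "add imm8" */
		ctxt->acc += c->imm;
		break;
	case 0x02:		/* toy "sub imm8" */
		ctxt->acc -= c->imm;
		break;
	default:
		return -1;
	}
	ctxt->rip = c->eip;
	return 0;
}

/* Caller: decode first, execute only if decoding succeeded. */
static int toy_emulate_instruction(struct toy_ctxt *ctxt)
{
	int r = toy_decode_insn(ctxt);

	if (r == 0)
		r = toy_emulate_insn(ctxt);
	return r;
}
```

With this split, a decode failure leaves both the toy register and the
committed instruction pointer untouched.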

Signed-off-by: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c    |    5 ++++-
 drivers/kvm/x86_emulate.c |    8 ++------
 drivers/kvm/x86_emulate.h |   11 ++++-------
 3 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 9668e9c..39c54d5 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1287,7 +1287,10 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 
 	vcpu->mmio_is_write = 0;
 	vcpu->pio.string = 0;
-	r = x86_emulate_memop(&emulate_ctxt, &emulate_ops);
+	r = x86_decode_insn(&emulate_ctxt, &emulate_ops);
+	if (r == 0)
+		r = x86_emulate_insn(&emulate_ctxt, &emulate_ops);
+
 	if (vcpu->pio.string)
 		return EMULATE_DO_MMIO;
 
diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index f20534b..9290083 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -908,18 +908,14 @@ done:
 }
 
 int
-x86_emulate_memop(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
+x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 {
 	unsigned long cr2 = ctxt->cr2;
 	int no_wb = 0;
 	u64 msr_data;
 	unsigned long _eflags = ctxt->eflags;
 	struct decode_cache *c = &ctxt->decode;
-	int rc;
-
-	rc = x86_decode_insn(ctxt, ops);
-	if (rc)
-		return rc;
+	int rc = 0;
 
 	if ((c->d & ModRM) && (c->modrm_mod != 3))
 		cr2 = c->modrm_ea;
diff --git a/drivers/kvm/x86_emulate.h b/drivers/kvm/x86_emulate.h
index c354200..28acad4 100644
--- a/drivers/kvm/x86_emulate.h
+++ b/drivers/kvm/x86_emulate.h
@@ -178,12 +178,9 @@ struct x86_emulate_ctxt {
 #define X86EMUL_MODE_HOST X86EMUL_MODE_PROT64
 #endif
 
-/*
- * x86_emulate_memop: Emulate an instruction that faulted attempting to
- *                    read/write a 'special' memory area.
- * Returns -1 on failure, 0 on success.
- */
-int x86_emulate_memop(struct x86_emulate_ctxt *ctxt,
-		      struct x86_emulate_ops *ops);
+int x86_decode_insn(struct x86_emulate_ctxt *ctxt,
+		    struct x86_emulate_ops *ops);
+int x86_emulate_insn(struct x86_emulate_ctxt *ctxt,
+		     struct x86_emulate_ops *ops);
 
 #endif				/* __X86_EMULATE_H__ */
-- 
1.5.3.7




* [PATCH 07/50] KVM: Call x86_decode_insn() only when needed
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (5 preceding siblings ...)
  2007-12-23 14:50   ` [PATCH 06/50] KVM: emulate_instruction() calls now x86_decode_insn() and x86_emulate_insn() Avi Kivity
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:50   ` [PATCH 08/50] KVM: VMX: Further reduce efer reloads Avi Kivity
                     ` (42 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Laurent Vivier

From: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>

Move emulate_ctxt into struct kvm_vcpu so that the emulation context is
preserved when we exit from the kvm module. Call x86_decode_insn() only when
needed (the new no_decode argument skips it on re-entry). Change
x86_emulate_insn() so that it leaves the context unmodified if it must be
re-entered.
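
A minimal sketch of the re-entry idea, using hypothetical demo_* names that
do not exist in KVM: the context lives in the vcpu, the first call decodes,
and a later call with no_decode set reuses the cached decode.  The execute
step restores the cached instruction pointer when it has to bail out, so
that a second pass starts from the same state.

```c
#include <stdint.h>

enum { DEMO_DONE = 0, DEMO_DO_MMIO = 1 };

/* Hypothetical stand-in for the persistent emulation context. */
struct demo_ctxt {
	unsigned long eip;	/* instruction pointer after decoding */
	int decode_calls;	/* how many times the decoder ran */
};

struct demo_vcpu {
	unsigned long rip;		/* committed instruction pointer */
	int mmio_read_completed;	/* set once "userspace" did the read */
	struct demo_ctxt emulate_ctxt;	/* survives exits to userspace */
};

/* Decode phase: pretend the instruction is 3 bytes long. */
static void demo_decode(struct demo_vcpu *vcpu)
{
	struct demo_ctxt *c = &vcpu->emulate_ctxt;

	c->eip = vcpu->rip + 3;
	c->decode_calls++;
}

/* Execute phase: may need to be re-entered after an MMIO read. */
static int demo_execute(struct demo_vcpu *vcpu)
{
	struct demo_ctxt *c = &vcpu->emulate_ctxt;
	unsigned long saved_eip = c->eip;

	c->eip += 1;	/* execution-time fetch of one more byte */
	if (!vcpu->mmio_read_completed) {
		/* Undo our change so that re-entry sees the same context. */
		c->eip = saved_eip;
		return DEMO_DO_MMIO;
	}
	vcpu->rip = c->eip;
	return DEMO_DONE;
}

static int demo_emulate_instruction(struct demo_vcpu *vcpu, int no_decode)
{
	if (!no_decode)
		demo_decode(vcpu);
	return demo_execute(vcpu);
}
```

The first call would pass no_decode = 0; after the pending read has been
completed, the caller re-enters with no_decode = 1 and the decoder is not
run a second time.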

Signed-off-by: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h         |    8 ++++-
 drivers/kvm/kvm_main.c    |   77 +++++++++++++++++++++++++-------------------
 drivers/kvm/svm.c         |    9 +++--
 drivers/kvm/vmx.c         |    9 +++--
 drivers/kvm/x86_emulate.c |   24 ++++++++++++--
 5 files changed, 82 insertions(+), 45 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index da9c3aa..e885b19 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -207,6 +207,8 @@ enum {
 	VCPU_SREG_LDTR,
 };
 
+#include "x86_emulate.h"
+
 struct kvm_pio_request {
 	unsigned long count;
 	int cur_count;
@@ -380,6 +382,10 @@ struct kvm_vcpu {
 
 	int cpuid_nent;
 	struct kvm_cpuid_entry cpuid_entries[KVM_MAX_CPUID_ENTRIES];
+
+	/* emulate context */
+
+	struct x86_emulate_ctxt emulate_ctxt;
 };
 
 struct kvm_mem_alias {
@@ -555,7 +561,7 @@ enum emulation_result {
 };
 
 int emulate_instruction(struct kvm_vcpu *vcpu, struct kvm_run *run,
-			unsigned long cr2, u16 error_code);
+			unsigned long cr2, u16 error_code, int no_decode);
 void kvm_report_emulation_failure(struct kvm_vcpu *cvpu, const char *context);
 void realmode_lgdt(struct kvm_vcpu *vcpu, u16 size, unsigned long address);
 void realmode_lidt(struct kvm_vcpu *vcpu, u16 size, unsigned long address);
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 39c54d5..fad3a08 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1251,45 +1251,56 @@ struct x86_emulate_ops emulate_ops = {
 int emulate_instruction(struct kvm_vcpu *vcpu,
 			struct kvm_run *run,
 			unsigned long cr2,
-			u16 error_code)
+			u16 error_code,
+			int no_decode)
 {
-	struct x86_emulate_ctxt emulate_ctxt;
-	int r;
-	int cs_db, cs_l;
+	int r = 0;
 
 	vcpu->mmio_fault_cr2 = cr2;
 	kvm_x86_ops->cache_regs(vcpu);
 
-	kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
-
-	emulate_ctxt.vcpu = vcpu;
-	emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
-	emulate_ctxt.cr2 = cr2;
-	emulate_ctxt.mode = (emulate_ctxt.eflags & X86_EFLAGS_VM)
-		? X86EMUL_MODE_REAL : cs_l
-		? X86EMUL_MODE_PROT64 :	cs_db
-		? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
-
-	if (emulate_ctxt.mode == X86EMUL_MODE_PROT64) {
-		emulate_ctxt.cs_base = 0;
-		emulate_ctxt.ds_base = 0;
-		emulate_ctxt.es_base = 0;
-		emulate_ctxt.ss_base = 0;
-	} else {
-		emulate_ctxt.cs_base = get_segment_base(vcpu, VCPU_SREG_CS);
-		emulate_ctxt.ds_base = get_segment_base(vcpu, VCPU_SREG_DS);
-		emulate_ctxt.es_base = get_segment_base(vcpu, VCPU_SREG_ES);
-		emulate_ctxt.ss_base = get_segment_base(vcpu, VCPU_SREG_SS);
-	}
-
-	emulate_ctxt.gs_base = get_segment_base(vcpu, VCPU_SREG_GS);
-	emulate_ctxt.fs_base = get_segment_base(vcpu, VCPU_SREG_FS);
-
 	vcpu->mmio_is_write = 0;
 	vcpu->pio.string = 0;
-	r = x86_decode_insn(&emulate_ctxt, &emulate_ops);
+
+	if (!no_decode) {
+		int cs_db, cs_l;
+		kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
+
+		vcpu->emulate_ctxt.vcpu = vcpu;
+		vcpu->emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
+		vcpu->emulate_ctxt.cr2 = cr2;
+		vcpu->emulate_ctxt.mode =
+			(vcpu->emulate_ctxt.eflags & X86_EFLAGS_VM)
+			? X86EMUL_MODE_REAL : cs_l
+			? X86EMUL_MODE_PROT64 :	cs_db
+			? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
+
+		if (vcpu->emulate_ctxt.mode == X86EMUL_MODE_PROT64) {
+			vcpu->emulate_ctxt.cs_base = 0;
+			vcpu->emulate_ctxt.ds_base = 0;
+			vcpu->emulate_ctxt.es_base = 0;
+			vcpu->emulate_ctxt.ss_base = 0;
+		} else {
+			vcpu->emulate_ctxt.cs_base =
+					get_segment_base(vcpu, VCPU_SREG_CS);
+			vcpu->emulate_ctxt.ds_base =
+					get_segment_base(vcpu, VCPU_SREG_DS);
+			vcpu->emulate_ctxt.es_base =
+					get_segment_base(vcpu, VCPU_SREG_ES);
+			vcpu->emulate_ctxt.ss_base =
+					get_segment_base(vcpu, VCPU_SREG_SS);
+		}
+
+		vcpu->emulate_ctxt.gs_base =
+					get_segment_base(vcpu, VCPU_SREG_GS);
+		vcpu->emulate_ctxt.fs_base =
+					get_segment_base(vcpu, VCPU_SREG_FS);
+
+		r = x86_decode_insn(&vcpu->emulate_ctxt, &emulate_ops);
+	}
+
 	if (r == 0)
-		r = x86_emulate_insn(&emulate_ctxt, &emulate_ops);
+		r = x86_emulate_insn(&vcpu->emulate_ctxt, &emulate_ops);
 
 	if (vcpu->pio.string)
 		return EMULATE_DO_MMIO;
@@ -1313,7 +1324,7 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 	}
 
 	kvm_x86_ops->decache_regs(vcpu);
-	kvm_x86_ops->set_rflags(vcpu, emulate_ctxt.eflags);
+	kvm_x86_ops->set_rflags(vcpu, vcpu->emulate_ctxt.eflags);
 
 	if (vcpu->mmio_is_write) {
 		vcpu->mmio_needed = 0;
@@ -2055,7 +2066,7 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		vcpu->mmio_read_completed = 1;
 		vcpu->mmio_needed = 0;
 		r = emulate_instruction(vcpu, kvm_run,
-					vcpu->mmio_fault_cr2, 0);
+					vcpu->mmio_fault_cr2, 0, 1);
 		if (r == EMULATE_DO_MMIO) {
 			/*
 			 * Read-modify-write.  Back to userspace.
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 5883f3e..a0eef78 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -960,7 +960,7 @@ static int pf_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 		return 1;
 	}
 	er = emulate_instruction(&svm->vcpu, kvm_run, fault_address,
-				 error_code);
+				 error_code, 0);
 	mutex_unlock(&kvm->lock);
 
 	switch (er) {
@@ -984,7 +984,7 @@ static int ud_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 {
 	int er;
 
-	er = emulate_instruction(&svm->vcpu, kvm_run, 0, 0);
+	er = emulate_instruction(&svm->vcpu, kvm_run, 0, 0, 0);
 	if (er != EMULATE_DONE)
 		inject_ud(&svm->vcpu);
 
@@ -1027,7 +1027,8 @@ static int io_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 	string = (io_info & SVM_IOIO_STR_MASK) != 0;
 
 	if (string) {
-		if (emulate_instruction(&svm->vcpu, kvm_run, 0, 0) == EMULATE_DO_MMIO)
+		if (emulate_instruction(&svm->vcpu,
+					kvm_run, 0, 0, 0) == EMULATE_DO_MMIO)
 			return 0;
 		return 1;
 	}
@@ -1086,7 +1087,7 @@ static int cpuid_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 static int emulate_on_interception(struct vcpu_svm *svm,
 				   struct kvm_run *kvm_run)
 {
-	if (emulate_instruction(&svm->vcpu, NULL, 0, 0) != EMULATE_DONE)
+	if (emulate_instruction(&svm->vcpu, NULL, 0, 0, 0) != EMULATE_DONE)
 		pr_unimpl(&svm->vcpu, "%s: failed\n", __FUNCTION__);
 	return 1;
 }
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 77d061b..dcc0a84 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1750,7 +1750,7 @@ static int handle_rmode_exception(struct kvm_vcpu *vcpu,
 	 * Cause the #SS fault with 0 error code in VM86 mode.
 	 */
 	if (((vec == GP_VECTOR) || (vec == SS_VECTOR)) && err_code == 0)
-		if (emulate_instruction(vcpu, NULL, 0, 0) == EMULATE_DONE)
+		if (emulate_instruction(vcpu, NULL, 0, 0, 0) == EMULATE_DONE)
 			return 1;
 	return 0;
 }
@@ -1787,7 +1787,7 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	}
 
 	if (is_invalid_opcode(intr_info)) {
-		er = emulate_instruction(vcpu, kvm_run, 0, 0);
+		er = emulate_instruction(vcpu, kvm_run, 0, 0, 0);
 		if (er != EMULATE_DONE)
 			vmx_inject_ud(vcpu);
 
@@ -1812,7 +1812,7 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 			return 1;
 		}
 
-		er = emulate_instruction(vcpu, kvm_run, cr2, error_code);
+		er = emulate_instruction(vcpu, kvm_run, cr2, error_code, 0);
 		mutex_unlock(&vcpu->kvm->lock);
 
 		switch (er) {
@@ -1873,7 +1873,8 @@ static int handle_io(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	string = (exit_qualification & 16) != 0;
 
 	if (string) {
-		if (emulate_instruction(vcpu, kvm_run, 0, 0) == EMULATE_DO_MMIO)
+		if (emulate_instruction(vcpu,
+					kvm_run, 0, 0, 0) == EMULATE_DO_MMIO)
 			return 0;
 		return 1;
 	}
diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index 9290083..cab1719 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -913,10 +913,19 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 	unsigned long cr2 = ctxt->cr2;
 	int no_wb = 0;
 	u64 msr_data;
+	unsigned long saved_eip = 0;
 	unsigned long _eflags = ctxt->eflags;
 	struct decode_cache *c = &ctxt->decode;
 	int rc = 0;
 
+	/* Shadow copy of register state. Committed on successful emulation.
+	 * NOTE: we can copy them from vcpu as x86_decode_insn() doesn't
+	 * modify them.
+	 */
+
+	memcpy(c->regs, ctxt->vcpu->regs, sizeof c->regs);
+	saved_eip = c->eip;
+
 	if ((c->d & ModRM) && (c->modrm_mod != 3))
 		cr2 = c->modrm_ea;
 
@@ -1250,7 +1259,11 @@ writeback:
 	ctxt->vcpu->rip = c->eip;
 
 done:
-	return (rc == X86EMUL_UNHANDLEABLE) ? -1 : 0;
+	if (rc == X86EMUL_UNHANDLEABLE) {
+		c->eip = saved_eip;
+		return -1;
+	}
+	return 0;
 
 special_insn:
 	if (c->twobyte)
@@ -1305,8 +1318,10 @@ push:
 				register_address(ctxt->es_base,
 						 c->regs[VCPU_REGS_RDI]),
 				c->rep_prefix,
-				c->regs[VCPU_REGS_RDX]) == 0)
+				c->regs[VCPU_REGS_RDX]) == 0) {
+			c->eip = saved_eip;
 			return -1;
+		}
 		return 0;
 	case 0x6e:		/* outsb */
 	case 0x6f:		/* outsw/outsd */
@@ -1321,8 +1336,10 @@ push:
 							ctxt->ds_base,
 						 c->regs[VCPU_REGS_RSI]),
 				c->rep_prefix,
-				c->regs[VCPU_REGS_RDX]) == 0)
+				c->regs[VCPU_REGS_RDX]) == 0) {
+			c->eip = saved_eip;
 			return -1;
+		}
 		return 0;
 	case 0x70 ... 0x7f: /* jcc (short) */ {
 		int rel = insn_fetch(s8, 1, c->eip);
@@ -1711,5 +1728,6 @@ twobyte_special_insn:
 
 cannot_emulate:
 	DPRINTF("Cannot emulate %02x\n", c->b);
+	c->eip = saved_eip;
 	return -1;
 }
-- 
1.5.3.7




* [PATCH 08/50] KVM: VMX: Further reduce efer reloads
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (6 preceding siblings ...)
  2007-12-23 14:50   ` [PATCH 07/50] KVM: Call x86_decode_insn() only when needed Avi Kivity
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:50   ` [PATCH 09/50] KVM: Allow not-present guest page faults to bypass kvm Avi Kivity
                     ` (41 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

KVM avoids reloading the efer msr when the difference between the guest
and host values consists only of the long mode bits (which are switched by
hardware) and the NX bit (which is emulated by the KVM MMU).

This patch also allows KVM to ignore SCE (syscall enable) when the guest
is running in 32-bit mode.  This is because the syscall instruction is
not available in 32-bit mode on Intel processors, so the SCE bit is
effectively meaningless.
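
The comparison can be sketched in isolation as follows.  This is a
standalone illustration, not the patch itself: the helper names are
hypothetical, the bit positions are the architectural EFER bits, and a
64-bit host is assumed (so LMA and LME are always in the ignore set).

```c
#include <stdint.h>

/* Architectural EFER bit positions. */
#define EFER_SCE (1ULL << 0)	/* syscall enable */
#define EFER_LME (1ULL << 8)	/* long mode enable */
#define EFER_LMA (1ULL << 10)	/* long mode active */
#define EFER_NX  (1ULL << 11)	/* no-execute enable */

/* Bits whose guest/host difference does not force a reload. */
static uint64_t efer_ignore_bits(uint64_t guest_efer)
{
	uint64_t ignore = EFER_NX | EFER_SCE | EFER_LMA | EFER_LME;

	/* SCE matters again once the guest is in long mode. */
	if (guest_efer & EFER_LMA)
		ignore &= ~EFER_SCE;
	return ignore;
}

/* Nonzero if the hardware EFER must be rewritten for this guest. */
static int efer_reload_needed(uint64_t guest_efer, uint64_t host_efer)
{
	uint64_t ignore = efer_ignore_bits(guest_efer);

	return (guest_efer & ~ignore) != (host_efer & ~ignore);
}

/* Value to load: guest bits that matter, host bits for the rest. */
static uint64_t efer_transition_value(uint64_t guest_efer, uint64_t host_efer)
{
	uint64_t ignore = efer_ignore_bits(guest_efer);

	return (guest_efer & ~ignore) | (host_efer & ignore);
}
```

Under these assumptions a 32-bit guest with SCE clear running on a host
with SCE set needs no reload, while a long mode guest with SCE clear on
the same host does.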

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/vmx.c |   61 ++++++++++++++++++++++++++++++++--------------------
 1 files changed, 37 insertions(+), 24 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index dcc0a84..f0f27a7 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -57,6 +57,7 @@ struct vcpu_vmx {
 		u16           fs_sel, gs_sel, ldt_sel;
 		int           gs_ldt_reload_needed;
 		int           fs_reload_needed;
+		int           guest_efer_loaded;
 	}host_state;
 
 };
@@ -74,8 +75,6 @@ static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
 static struct page *vmx_io_bitmap_a;
 static struct page *vmx_io_bitmap_b;
 
-#define EFER_SAVE_RESTORE_BITS ((u64)EFER_SCE)
-
 static struct vmcs_config {
 	int size;
 	int order;
@@ -138,18 +137,6 @@ static void save_msrs(struct kvm_msr_entry *e, int n)
 		rdmsrl(e[i].index, e[i].data);
 }
 
-static inline u64 msr_efer_save_restore_bits(struct kvm_msr_entry msr)
-{
-	return (u64)msr.data & EFER_SAVE_RESTORE_BITS;
-}
-
-static inline int msr_efer_need_save_restore(struct vcpu_vmx *vmx)
-{
-	int efer_offset = vmx->msr_offset_efer;
-	return msr_efer_save_restore_bits(vmx->host_msrs[efer_offset]) !=
-		msr_efer_save_restore_bits(vmx->guest_msrs[efer_offset]);
-}
-
 static inline int is_page_fault(u32 intr_info)
 {
 	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VECTOR_MASK |
@@ -351,16 +338,42 @@ static void reload_tss(void)
 
 static void load_transition_efer(struct vcpu_vmx *vmx)
 {
-	u64 trans_efer;
 	int efer_offset = vmx->msr_offset_efer;
+	u64 host_efer = vmx->host_msrs[efer_offset].data;
+	u64 guest_efer = vmx->guest_msrs[efer_offset].data;
+	u64 ignore_bits;
+
+	if (efer_offset < 0)
+		return;
+	/*
+	 * NX is emulated; LMA and LME handled by hardware; SCE meaningless
+	 * outside long mode
+	 */
+	ignore_bits = EFER_NX | EFER_SCE;
+#ifdef CONFIG_X86_64
+	ignore_bits |= EFER_LMA | EFER_LME;
+	/* SCE is meaningful only in long mode on Intel */
+	if (guest_efer & EFER_LMA)
+		ignore_bits &= ~(u64)EFER_SCE;
+#endif
+	if ((guest_efer & ~ignore_bits) == (host_efer & ~ignore_bits))
+		return;
 
-	trans_efer = vmx->host_msrs[efer_offset].data;
-	trans_efer &= ~EFER_SAVE_RESTORE_BITS;
-	trans_efer |= msr_efer_save_restore_bits(vmx->guest_msrs[efer_offset]);
-	wrmsrl(MSR_EFER, trans_efer);
+	vmx->host_state.guest_efer_loaded = 1;
+	guest_efer &= ~ignore_bits;
+	guest_efer |= host_efer & ignore_bits;
+	wrmsrl(MSR_EFER, guest_efer);
 	vmx->vcpu.stat.efer_reload++;
 }
 
+static void reload_host_efer(struct vcpu_vmx *vmx)
+{
+	if (vmx->host_state.guest_efer_loaded) {
+		vmx->host_state.guest_efer_loaded = 0;
+		load_msrs(vmx->host_msrs + vmx->msr_offset_efer, 1);
+	}
+}
+
 static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -406,8 +419,7 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 	}
 #endif
 	load_msrs(vmx->guest_msrs, vmx->save_nmsrs);
-	if (msr_efer_need_save_restore(vmx))
-		load_transition_efer(vmx);
+	load_transition_efer(vmx);
 }
 
 static void vmx_load_host_state(struct vcpu_vmx *vmx)
@@ -436,8 +448,7 @@ static void vmx_load_host_state(struct vcpu_vmx *vmx)
 	reload_tss();
 	save_msrs(vmx->guest_msrs, vmx->save_nmsrs);
 	load_msrs(vmx->host_msrs, vmx->save_nmsrs);
-	if (msr_efer_need_save_restore(vmx))
-		load_msrs(vmx->host_msrs + vmx->msr_offset_efer, 1);
+	reload_host_efer(vmx);
 }
 
 /*
@@ -727,8 +738,10 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 #ifdef CONFIG_X86_64
 	case MSR_EFER:
 		ret = kvm_set_msr_common(vcpu, msr_index, data);
-		if (vmx->host_state.loaded)
+		if (vmx->host_state.loaded) {
+			reload_host_efer(vmx);
 			load_transition_efer(vmx);
+		}
 		break;
 	case MSR_FS_BASE:
 		vmcs_writel(GUEST_FS_BASE, data);
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 09/50] KVM: Allow not-present guest page faults to bypass kvm
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (7 preceding siblings ...)
  2007-12-23 14:50   ` [PATCH 08/50] KVM: VMX: Further reduce efer reloads Avi Kivity
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:50   ` [PATCH 10/50] KVM: MMU: Make flooding detection work when guest page faults are bypassed Avi Kivity
                     ` (40 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

There are two classes of page faults trapped by kvm:
 - host page faults, where the fault is needed to allow kvm to install
   the shadow pte or update the guest accessed and dirty bits
 - guest page faults, where the guest has faulted and kvm simply injects
   the fault back into the guest to handle

The second class, guest page faults, is pure overhead.  We can eliminate
some of it on vmx using the following evil trick:
 - when we set up a shadow page table entry, if the corresponding guest pte
   is not present, set up the shadow pte as not present
 - if the guest pte _is_ present, mark the shadow pte as present but also
   set one of the reserved bits in the shadow pte
 - tell the vmx hardware not to trap faults which have the present bit clear

With this, normal page-not-present faults go directly to the guest,
bypassing kvm entirely.
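
As a rough sketch of the trick above: the two helpers below are
hypothetical and only model the behaviour.  The first picks the initial
shadow pte the way the 64-bit prefetch_page path in this patch does, using
the values vmx_init() installs when bypass_guest_pf is set.  The second
models the VMX page-fault error-code filter with mask = match = 1, assuming
the #PF bit is set in the exception bitmap, in which case a fault exits to
kvm only when (error_code & mask) == match.

```c
#include <stdbool.h>
#include <stdint.h>

#define PT_PRESENT_MASK (1ULL << 0)	/* pte bit 0: present */

/* Values installed by vmx_init() in this patch when bypass_guest_pf is set. */
static const uint64_t shadow_trap_nonpresent_pte = ~0xffeULL;	/* P=1 plus reserved bits */
static const uint64_t shadow_notrap_nonpresent_pte = 0;	/* P=0 */

/* Hypothetical helper: initial shadow pte for a given guest pte. */
uint64_t initial_shadow_pte(uint64_t guest_pte)
{
	if (guest_pte & PT_PRESENT_MASK)
		return shadow_trap_nonpresent_pte;	/* kvm must see the fault */
	return shadow_notrap_nonpresent_pte;		/* fault goes to the guest */
}

/*
 * Hypothetical model of the hardware filter with
 * PAGE_FAULT_ERROR_CODE_MASK = PAGE_FAULT_ERROR_CODE_MATCH = 1.
 * Bit 0 of the page-fault error code is the present (P) bit.
 */
bool fault_exits_to_kvm(uint32_t error_code)
{
	const uint32_t mask = 1, match = 1;

	return (error_code & mask) == match;
}
```

A walk that hits shadow_notrap_nonpresent_pte faults with P clear, so the
filter leaves it to the guest.  A walk that hits shadow_trap_nonpresent_pte
faults with P set (a reserved-bit violation), so kvm still gets the exit.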

Unfortunately, this trick only works on Intel hardware, as AMD lacks a
way to discriminate among page faults based on error code.  It is also
a little risky since it uses reserved bits which might become unreserved
in the future, so a module parameter is provided to disable it.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h         |    3 ++
 drivers/kvm/kvm_main.c    |    4 ++-
 drivers/kvm/mmu.c         |   89 ++++++++++++++++++++++++++++++++++-----------
 drivers/kvm/paging_tmpl.h |   52 ++++++++++++++++++++-------
 drivers/kvm/vmx.c         |   11 +++++-
 5 files changed, 122 insertions(+), 37 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index e885b19..7de948e 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -150,6 +150,8 @@ struct kvm_mmu {
 	int (*page_fault)(struct kvm_vcpu *vcpu, gva_t gva, u32 err);
 	void (*free)(struct kvm_vcpu *vcpu);
 	gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t gva);
+	void (*prefetch_page)(struct kvm_vcpu *vcpu,
+			      struct kvm_mmu_page *page);
 	hpa_t root_hpa;
 	int root_level;
 	int shadow_root_level;
@@ -536,6 +538,7 @@ void kvm_mmu_module_exit(void);
 void kvm_mmu_destroy(struct kvm_vcpu *vcpu);
 int kvm_mmu_create(struct kvm_vcpu *vcpu);
 int kvm_mmu_setup(struct kvm_vcpu *vcpu);
+void kvm_mmu_set_nonpresent_ptes(u64 trap_pte, u64 notrap_pte);
 
 int kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
 void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index fad3a08..da057cf 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -3501,7 +3501,9 @@ int kvm_init_x86(struct kvm_x86_ops *ops, unsigned int vcpu_size,
 	kvm_preempt_ops.sched_in = kvm_sched_in;
 	kvm_preempt_ops.sched_out = kvm_sched_out;
 
-	return r;
+	kvm_mmu_set_nonpresent_ptes(0ull, 0ull);
+
+	return 0;
 
 out_free:
 	kmem_cache_destroy(kvm_vcpu_cache);
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index feb5ac9..069ce83 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -156,6 +156,16 @@ static struct kmem_cache *pte_chain_cache;
 static struct kmem_cache *rmap_desc_cache;
 static struct kmem_cache *mmu_page_header_cache;
 
+static u64 __read_mostly shadow_trap_nonpresent_pte;
+static u64 __read_mostly shadow_notrap_nonpresent_pte;
+
+void kvm_mmu_set_nonpresent_ptes(u64 trap_pte, u64 notrap_pte)
+{
+	shadow_trap_nonpresent_pte = trap_pte;
+	shadow_notrap_nonpresent_pte = notrap_pte;
+}
+EXPORT_SYMBOL_GPL(kvm_mmu_set_nonpresent_ptes);
+
 static int is_write_protection(struct kvm_vcpu *vcpu)
 {
 	return vcpu->cr0 & X86_CR0_WP;
@@ -176,6 +186,13 @@ static int is_present_pte(unsigned long pte)
 	return pte & PT_PRESENT_MASK;
 }
 
+static int is_shadow_present_pte(u64 pte)
+{
+	pte &= ~PT_SHADOW_IO_MARK;
+	return pte != shadow_trap_nonpresent_pte
+		&& pte != shadow_notrap_nonpresent_pte;
+}
+
 static int is_writeble_pte(unsigned long pte)
 {
 	return pte & PT_WRITABLE_MASK;
@@ -450,7 +467,7 @@ static int is_empty_shadow_page(u64 *spt)
 	u64 *end;
 
 	for (pos = spt, end = pos + PAGE_SIZE / sizeof(u64); pos != end; pos++)
-		if (*pos != 0) {
+		if ((*pos & ~PT_SHADOW_IO_MARK) != shadow_trap_nonpresent_pte) {
 			printk(KERN_ERR "%s: %p %llx\n", __FUNCTION__,
 			       pos, *pos);
 			return 0;
@@ -632,6 +649,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 	page->gfn = gfn;
 	page->role = role;
 	hlist_add_head(&page->hash_link, bucket);
+	vcpu->mmu.prefetch_page(vcpu, page);
 	if (!metaphysical)
 		rmap_write_protect(vcpu, gfn);
 	return page;
@@ -648,9 +666,9 @@ static void kvm_mmu_page_unlink_children(struct kvm *kvm,
 
 	if (page->role.level == PT_PAGE_TABLE_LEVEL) {
 		for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
-			if (pt[i] & PT_PRESENT_MASK)
+			if (is_shadow_present_pte(pt[i]))
 				rmap_remove(&pt[i]);
-			pt[i] = 0;
+			pt[i] = shadow_trap_nonpresent_pte;
 		}
 		kvm_flush_remote_tlbs(kvm);
 		return;
@@ -659,8 +677,8 @@ static void kvm_mmu_page_unlink_children(struct kvm *kvm,
 	for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
 		ent = pt[i];
 
-		pt[i] = 0;
-		if (!(ent & PT_PRESENT_MASK))
+		pt[i] = shadow_trap_nonpresent_pte;
+		if (!is_shadow_present_pte(ent))
 			continue;
 		ent &= PT64_BASE_ADDR_MASK;
 		mmu_page_remove_parent_pte(page_header(ent), &pt[i]);
@@ -691,7 +709,7 @@ static void kvm_mmu_zap_page(struct kvm *kvm,
 		}
 		BUG_ON(!parent_pte);
 		kvm_mmu_put_page(page, parent_pte);
-		set_shadow_pte(parent_pte, 0);
+		set_shadow_pte(parent_pte, shadow_trap_nonpresent_pte);
 	}
 	kvm_mmu_page_unlink_children(kvm, page);
 	if (!page->root_count) {
@@ -798,7 +816,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
 
 		if (level == 1) {
 			pte = table[index];
-			if (is_present_pte(pte) && is_writeble_pte(pte))
+			if (is_shadow_present_pte(pte) && is_writeble_pte(pte))
 				return 0;
 			mark_page_dirty(vcpu->kvm, v >> PAGE_SHIFT);
 			page_header_update_slot(vcpu->kvm, table, v);
@@ -808,7 +826,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
 			return 0;
 		}
 
-		if (table[index] == 0) {
+		if (table[index] == shadow_trap_nonpresent_pte) {
 			struct kvm_mmu_page *new_table;
 			gfn_t pseudo_gfn;
 
@@ -829,6 +847,15 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
 	}
 }
 
+static void nonpaging_prefetch_page(struct kvm_vcpu *vcpu,
+				    struct kvm_mmu_page *sp)
+{
+	int i;
+
+	for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
+		sp->spt[i] = shadow_trap_nonpresent_pte;
+}
+
 static void mmu_free_roots(struct kvm_vcpu *vcpu)
 {
 	int i;
@@ -943,6 +970,7 @@ static int nonpaging_init_context(struct kvm_vcpu *vcpu)
 	context->page_fault = nonpaging_page_fault;
 	context->gva_to_gpa = nonpaging_gva_to_gpa;
 	context->free = nonpaging_free;
+	context->prefetch_page = nonpaging_prefetch_page;
 	context->root_level = 0;
 	context->shadow_root_level = PT32E_ROOT_LEVEL;
 	context->root_hpa = INVALID_PAGE;
@@ -989,6 +1017,7 @@ static int paging64_init_context_common(struct kvm_vcpu *vcpu, int level)
 	context->new_cr3 = paging_new_cr3;
 	context->page_fault = paging64_page_fault;
 	context->gva_to_gpa = paging64_gva_to_gpa;
+	context->prefetch_page = paging64_prefetch_page;
 	context->free = paging_free;
 	context->root_level = level;
 	context->shadow_root_level = level;
@@ -1009,6 +1038,7 @@ static int paging32_init_context(struct kvm_vcpu *vcpu)
 	context->page_fault = paging32_page_fault;
 	context->gva_to_gpa = paging32_gva_to_gpa;
 	context->free = paging_free;
+	context->prefetch_page = paging32_prefetch_page;
 	context->root_level = PT32_ROOT_LEVEL;
 	context->shadow_root_level = PT32E_ROOT_LEVEL;
 	context->root_hpa = INVALID_PAGE;
@@ -1081,7 +1111,7 @@ static void mmu_pte_write_zap_pte(struct kvm_vcpu *vcpu,
 	struct kvm_mmu_page *child;
 
 	pte = *spte;
-	if (is_present_pte(pte)) {
+	if (is_shadow_present_pte(pte)) {
 		if (page->role.level == PT_PAGE_TABLE_LEVEL)
 			rmap_remove(spte);
 		else {
@@ -1089,22 +1119,25 @@ static void mmu_pte_write_zap_pte(struct kvm_vcpu *vcpu,
 			mmu_page_remove_parent_pte(child, spte);
 		}
 	}
-	set_shadow_pte(spte, 0);
+	set_shadow_pte(spte, shadow_trap_nonpresent_pte);
 	kvm_flush_remote_tlbs(vcpu->kvm);
 }
 
 static void mmu_pte_write_new_pte(struct kvm_vcpu *vcpu,
 				  struct kvm_mmu_page *page,
 				  u64 *spte,
-				  const void *new, int bytes)
+				  const void *new, int bytes,
+				  int offset_in_pte)
 {
 	if (page->role.level != PT_PAGE_TABLE_LEVEL)
 		return;
 
 	if (page->role.glevels == PT32_ROOT_LEVEL)
-		paging32_update_pte(vcpu, page, spte, new, bytes);
+		paging32_update_pte(vcpu, page, spte, new, bytes,
+				    offset_in_pte);
 	else
-		paging64_update_pte(vcpu, page, spte, new, bytes);
+		paging64_update_pte(vcpu, page, spte, new, bytes,
+				    offset_in_pte);
 }
 
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
@@ -1126,6 +1159,7 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	int npte;
 
 	pgprintk("%s: gpa %llx bytes %d\n", __FUNCTION__, gpa, bytes);
+	kvm_mmu_audit(vcpu, "pre pte write");
 	if (gfn == vcpu->last_pt_write_gfn) {
 		++vcpu->last_pt_write_count;
 		if (vcpu->last_pt_write_count >= 3)
@@ -1181,10 +1215,12 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 		spte = &page->spt[page_offset / sizeof(*spte)];
 		while (npte--) {
 			mmu_pte_write_zap_pte(vcpu, page, spte);
-			mmu_pte_write_new_pte(vcpu, page, spte, new, bytes);
+			mmu_pte_write_new_pte(vcpu, page, spte, new, bytes,
+					      page_offset & (pte_size - 1));
 			++spte;
 		}
 	}
+	kvm_mmu_audit(vcpu, "post pte write");
 }
 
 int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva)
@@ -1359,22 +1395,33 @@ static void audit_mappings_page(struct kvm_vcpu *vcpu, u64 page_pte,
 	for (i = 0; i < PT64_ENT_PER_PAGE; ++i, va += va_delta) {
 		u64 ent = pt[i];
 
-		if (!(ent & PT_PRESENT_MASK))
+		if (ent == shadow_trap_nonpresent_pte)
 			continue;
 
 		va = canonicalize(va);
-		if (level > 1)
+		if (level > 1) {
+			if (ent == shadow_notrap_nonpresent_pte)
+				printk(KERN_ERR "audit: (%s) nontrapping pte"
+				       " in nonleaf level: levels %d gva %lx"
+				       " level %d pte %llx\n", audit_msg,
+				       vcpu->mmu.root_level, va, level, ent);
+
 			audit_mappings_page(vcpu, ent, va, level - 1);
-		else {
+		} else {
 			gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, va);
 			hpa_t hpa = gpa_to_hpa(vcpu, gpa);
 
-			if ((ent & PT_PRESENT_MASK)
+			if (is_shadow_present_pte(ent)
 			    && (ent & PT64_BASE_ADDR_MASK) != hpa)
-				printk(KERN_ERR "audit error: (%s) levels %d"
-				       " gva %lx gpa %llx hpa %llx ent %llx\n",
+				printk(KERN_ERR "xx audit error: (%s) levels %d"
+				       " gva %lx gpa %llx hpa %llx ent %llx %d\n",
 				       audit_msg, vcpu->mmu.root_level,
-				       va, gpa, hpa, ent);
+				       va, gpa, hpa, ent, is_shadow_present_pte(ent));
+			else if (ent == shadow_notrap_nonpresent_pte
+				 && !is_error_hpa(hpa))
+				printk(KERN_ERR "audit: (%s) notrap shadow,"
+				       " valid guest gva %lx\n", audit_msg, va);
+
 		}
 	}
 }
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 6b094b4..99ac9b1 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -31,6 +31,7 @@
 	#define PT_INDEX(addr, level) PT64_INDEX(addr, level)
 	#define SHADOW_PT_INDEX(addr, level) PT64_INDEX(addr, level)
 	#define PT_LEVEL_MASK(level) PT64_LEVEL_MASK(level)
+	#define PT_LEVEL_BITS PT64_LEVEL_BITS
 	#ifdef CONFIG_X86_64
 	#define PT_MAX_FULL_LEVELS 4
 	#else
@@ -45,6 +46,7 @@
 	#define PT_INDEX(addr, level) PT32_INDEX(addr, level)
 	#define SHADOW_PT_INDEX(addr, level) PT64_INDEX(addr, level)
 	#define PT_LEVEL_MASK(level) PT32_LEVEL_MASK(level)
+	#define PT_LEVEL_BITS PT32_LEVEL_BITS
 	#define PT_MAX_FULL_LEVELS 2
 #else
 	#error Invalid PTTYPE value
@@ -211,12 +213,12 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 {
 	hpa_t paddr;
 	int dirty = gpte & PT_DIRTY_MASK;
-	u64 spte = *shadow_pte;
-	int was_rmapped = is_rmap_pte(spte);
+	u64 spte;
+	int was_rmapped = is_rmap_pte(*shadow_pte);
 
 	pgprintk("%s: spte %llx gpte %llx access %llx write_fault %d"
 		 " user_fault %d gfn %lx\n",
-		 __FUNCTION__, spte, (u64)gpte, access_bits,
+		 __FUNCTION__, *shadow_pte, (u64)gpte, access_bits,
 		 write_fault, user_fault, gfn);
 
 	if (write_fault && !dirty) {
@@ -236,7 +238,7 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 		FNAME(mark_pagetable_dirty)(vcpu->kvm, walker);
 	}
 
-	spte |= PT_PRESENT_MASK | PT_ACCESSED_MASK | PT_DIRTY_MASK;
+	spte = PT_PRESENT_MASK | PT_ACCESSED_MASK | PT_DIRTY_MASK;
 	spte |= gpte & PT64_NX_MASK;
 	if (!dirty)
 		access_bits &= ~PT_WRITABLE_MASK;
@@ -248,10 +250,8 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 		spte |= PT_USER_MASK;
 
 	if (is_error_hpa(paddr)) {
-		spte |= gaddr;
-		spte |= PT_SHADOW_IO_MARK;
-		spte &= ~PT_PRESENT_MASK;
-		set_shadow_pte(shadow_pte, spte);
+		set_shadow_pte(shadow_pte,
+			       shadow_trap_nonpresent_pte | PT_SHADOW_IO_MARK);
 		return;
 	}
 
@@ -286,6 +286,7 @@ unshadowed:
 	if (access_bits & PT_WRITABLE_MASK)
 		mark_page_dirty(vcpu->kvm, gaddr >> PAGE_SHIFT);
 
+	pgprintk("%s: setting spte %llx\n", __FUNCTION__, spte);
 	set_shadow_pte(shadow_pte, spte);
 	page_header_update_slot(vcpu->kvm, shadow_pte, gaddr);
 	if (!was_rmapped)
@@ -304,14 +305,18 @@ static void FNAME(set_pte)(struct kvm_vcpu *vcpu, pt_element_t gpte,
 }
 
 static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
-			      u64 *spte, const void *pte, int bytes)
+			      u64 *spte, const void *pte, int bytes,
+			      int offset_in_pte)
 {
 	pt_element_t gpte;
 
-	if (bytes < sizeof(pt_element_t))
-		return;
 	gpte = *(const pt_element_t *)pte;
-	if (~gpte & (PT_PRESENT_MASK | PT_ACCESSED_MASK))
+	if (~gpte & (PT_PRESENT_MASK | PT_ACCESSED_MASK)) {
+		if (!offset_in_pte && !is_present_pte(gpte))
+			set_shadow_pte(spte, shadow_notrap_nonpresent_pte);
+		return;
+	}
+	if (bytes < sizeof(pt_element_t))
 		return;
 	pgprintk("%s: gpte %llx spte %p\n", __FUNCTION__, (u64)gpte, spte);
 	FNAME(set_pte)(vcpu, gpte, spte, PT_USER_MASK | PT_WRITABLE_MASK, 0,
@@ -368,7 +373,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 		unsigned hugepage_access = 0;
 
 		shadow_ent = ((u64 *)__va(shadow_addr)) + index;
-		if (is_present_pte(*shadow_ent) || is_io_pte(*shadow_ent)) {
+		if (is_shadow_present_pte(*shadow_ent)) {
 			if (level == PT_PAGE_TABLE_LEVEL)
 				break;
 			shadow_addr = *shadow_ent & PT64_BASE_ADDR_MASK;
@@ -500,6 +505,26 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t vaddr)
 	return gpa;
 }
 
+static void FNAME(prefetch_page)(struct kvm_vcpu *vcpu,
+				 struct kvm_mmu_page *sp)
+{
+	int i;
+	pt_element_t *gpt;
+
+	if (sp->role.metaphysical || PTTYPE == 32) {
+		nonpaging_prefetch_page(vcpu, sp);
+		return;
+	}
+
+	gpt = kmap_atomic(gfn_to_page(vcpu->kvm, sp->gfn), KM_USER0);
+	for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
+		if (is_present_pte(gpt[i]))
+			sp->spt[i] = shadow_trap_nonpresent_pte;
+		else
+			sp->spt[i] = shadow_notrap_nonpresent_pte;
+	kunmap_atomic(gpt, KM_USER0);
+}
+
 #undef pt_element_t
 #undef guest_walker
 #undef FNAME
@@ -508,4 +533,5 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t vaddr)
 #undef SHADOW_PT_INDEX
 #undef PT_LEVEL_MASK
 #undef PT_DIR_BASE_ADDR_MASK
+#undef PT_LEVEL_BITS
 #undef PT_MAX_FULL_LEVELS
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index f0f27a7..d32e63d 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -26,6 +26,7 @@
 #include <linux/mm.h>
 #include <linux/highmem.h>
 #include <linux/sched.h>
+#include <linux/moduleparam.h>
 
 #include <asm/io.h>
 #include <asm/desc.h>
@@ -33,6 +34,9 @@
 MODULE_AUTHOR("Qumranet");
 MODULE_LICENSE("GPL");
 
+static int bypass_guest_pf = 1;
+module_param(bypass_guest_pf, bool, 0);
+
 struct vmcs {
 	u32 revision_id;
 	u32 abort;
@@ -1535,8 +1539,8 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 	}
 	vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, exec_control);
 
-	vmcs_write32(PAGE_FAULT_ERROR_CODE_MASK, 0);
-	vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, 0);
+	vmcs_write32(PAGE_FAULT_ERROR_CODE_MASK, !!bypass_guest_pf);
+	vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, !!bypass_guest_pf);
 	vmcs_write32(CR3_TARGET_COUNT, 0);           /* 22.2.1 */
 
 	vmcs_writel(HOST_CR0, read_cr0());  /* 22.2.3 */
@@ -2582,6 +2586,9 @@ static int __init vmx_init(void)
 	if (r)
 		goto out1;
 
+	if (bypass_guest_pf)
+		kvm_mmu_set_nonpresent_ptes(~0xffeull, 0ull);
+
 	return 0;
 
 out1:
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 10/50] KVM: MMU: Make flooding detection work when guest page faults are bypassed
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (8 preceding siblings ...)
  2007-12-23 14:50   ` [PATCH 09/50] KVM: Allow not-present guest page faults to bypass kvm Avi Kivity
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:50   ` [PATCH 11/50] KVM: MMU: Ignore reserved bits in cr3 in non-pae mode Avi Kivity
                     ` (39 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

When we allow guest page faults to reach the guest directly, we lose
the fault tracking which allows us to detect demand paging.  So we provide
an alternate mechanism: we clear the accessed bit when we set a pte, and
check it later to see whether the guest actually used it.
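
A minimal standalone sketch of the resulting heuristic, for illustration
only.  The struct and function names are hypothetical; the threshold of 3
and the reset behaviour follow the kvm_mmu_pte_write() hunk below, and
PT_ACCESSED_MASK is the architectural accessed bit (bit 5) written out so
the sketch compiles outside the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PT_ACCESSED_MASK (1ULL << 5)	/* pte bit 5: accessed */

/* Hypothetical stand-in for the per-vcpu tracking fields. */
struct pt_write_tracker {
	uint64_t last_gfn;			/* last_pt_write_gfn */
	int write_count;			/* last_pt_write_count */
	const uint64_t *last_pte_updated;	/* spte set by the last update */
};

/* Records a write to a guest page table; returns true when flooded. */
bool note_pt_write(struct pt_write_tracker *t, uint64_t gfn)
{
	bool accessed = t->last_pte_updated &&
			(*t->last_pte_updated & PT_ACCESSED_MASK);

	/* Repeated writes to the same page with no use in between. */
	if (gfn == t->last_gfn && !accessed)
		return ++t->write_count >= 3;

	/* New page, or the guest used the pte: start counting again. */
	t->last_gfn = gfn;
	t->write_count = 1;
	t->last_pte_updated = NULL;
	return false;
}
```

If the hardware set the accessed bit in the last updated spte, the guest
really touched the mapping, so the write looks like demand paging and is
not counted towards flooding.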

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h         |    1 +
 drivers/kvm/mmu.c         |   21 ++++++++++++++++++++-
 drivers/kvm/paging_tmpl.h |    9 ++++++++-
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 7de948e..08ffc82 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -346,6 +346,7 @@ struct kvm_vcpu {
 
 	gfn_t last_pt_write_gfn;
 	int   last_pt_write_count;
+	u64  *last_pte_updated;
 
 	struct kvm_guest_debug guest_debug;
 
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 069ce83..d347e89 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -692,6 +692,15 @@ static void kvm_mmu_put_page(struct kvm_mmu_page *page,
 	mmu_page_remove_parent_pte(page, parent_pte);
 }
 
+static void kvm_mmu_reset_last_pte_updated(struct kvm *kvm)
+{
+	int i;
+
+	for (i = 0; i < KVM_MAX_VCPUS; ++i)
+		if (kvm->vcpus[i])
+			kvm->vcpus[i]->last_pte_updated = NULL;
+}
+
 static void kvm_mmu_zap_page(struct kvm *kvm,
 			     struct kvm_mmu_page *page)
 {
@@ -717,6 +726,7 @@ static void kvm_mmu_zap_page(struct kvm *kvm,
 		kvm_mmu_free_page(kvm, page);
 	} else
 		list_move(&page->link, &kvm->active_mmu_pages);
+	kvm_mmu_reset_last_pte_updated(kvm);
 }
 
 static int kvm_mmu_unprotect_page(struct kvm_vcpu *vcpu, gfn_t gfn)
@@ -1140,6 +1150,13 @@ static void mmu_pte_write_new_pte(struct kvm_vcpu *vcpu,
 				    offset_in_pte);
 }
 
+static bool last_updated_pte_accessed(struct kvm_vcpu *vcpu)
+{
+	u64 *spte = vcpu->last_pte_updated;
+
+	return !!(spte && (*spte & PT_ACCESSED_MASK));
+}
+
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 		       const u8 *new, int bytes)
 {
@@ -1160,13 +1177,15 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 
 	pgprintk("%s: gpa %llx bytes %d\n", __FUNCTION__, gpa, bytes);
 	kvm_mmu_audit(vcpu, "pre pte write");
-	if (gfn == vcpu->last_pt_write_gfn) {
+	if (gfn == vcpu->last_pt_write_gfn
+	    && !last_updated_pte_accessed(vcpu)) {
 		++vcpu->last_pt_write_count;
 		if (vcpu->last_pt_write_count >= 3)
 			flooded = 1;
 	} else {
 		vcpu->last_pt_write_gfn = gfn;
 		vcpu->last_pt_write_count = 1;
+		vcpu->last_pte_updated = NULL;
 	}
 	index = kvm_page_table_hashfn(gfn) % KVM_NUM_MMU_PAGES;
 	bucket = &vcpu->kvm->mmu_page_hash[index];
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 99ac9b1..be0f852 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -238,7 +238,12 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 		FNAME(mark_pagetable_dirty)(vcpu->kvm, walker);
 	}
 
-	spte = PT_PRESENT_MASK | PT_ACCESSED_MASK | PT_DIRTY_MASK;
+	/*
+	 * We don't set the accessed bit, since we sometimes want to see
+	 * whether the guest actually used the pte (in order to detect
+	 * demand paging).
+	 */
+	spte = PT_PRESENT_MASK | PT_DIRTY_MASK;
 	spte |= gpte & PT64_NX_MASK;
 	if (!dirty)
 		access_bits &= ~PT_WRITABLE_MASK;
@@ -291,6 +296,8 @@ unshadowed:
 	page_header_update_slot(vcpu->kvm, shadow_pte, gaddr);
 	if (!was_rmapped)
 		rmap_add(vcpu, shadow_pte);
+	if (!ptwrite || !*ptwrite)
+		vcpu->last_pte_updated = shadow_pte;
 }
 
 static void FNAME(set_pte)(struct kvm_vcpu *vcpu, pt_element_t gpte,
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 11/50] KVM: MMU: Ignore reserved bits in cr3 in non-pae mode
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (9 preceding siblings ...)
  2007-12-23 14:50   ` [PATCH 10/50] KVM: MMU: Make flooding detection work when guest page faults are bypassed Avi Kivity
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:50   ` [PATCH 12/50] KVM: x86 emulator: split some decoding into functions for readability Avi Kivity
                     ` (38 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Ryan Harper <ryanh-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>

This patch removes the fault injected when the guest attempts to set reserved
bits in cr3 in non-PAE mode.  x86 hardware doesn't generate a fault when these
bits are set.  With this patch, vmware-server, running within a kvm guest,
boots and runs memtest from an iso.

Signed-off-by: Ryan Harper <ryanh-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |   11 ++++-------
 1 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index da057cf..b10fd7e 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -554,14 +554,11 @@ void set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 				inject_gp(vcpu);
 				return;
 			}
-		} else {
-			if (cr3 & CR3_NONPAE_RESERVED_BITS) {
-				printk(KERN_DEBUG
-				       "set_cr3: #GP, reserved bits\n");
-				inject_gp(vcpu);
-				return;
-			}
 		}
+		/*
+		 * We don't check reserved bits in nonpae mode, because
+		 * this isn't enforced, and VMware depends on this.
+		 */
 	}
 
 	mutex_lock(&vcpu->kvm->lock);
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 12/50] KVM: x86 emulator: split some decoding into functions for readability
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (10 preceding siblings ...)
  2007-12-23 14:50   ` [PATCH 11/50] KVM: MMU: Ignore reserved bits in cr3 in non-pae mode Avi Kivity
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:50   ` [PATCH 13/50] KVM: x86 emulator: remove _eflags and use directly ctxt->eflags Avi Kivity
                     ` (37 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Laurent Vivier

From: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>

To improve readability, move push, writeback, and grp 1a/2/3/4/5/9 emulation
parts into functions.

Signed-off-by: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/x86_emulate.c |  451 ++++++++++++++++++++++++++------------------
 1 files changed, 266 insertions(+), 185 deletions(-)

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index cab1719..a108736 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -907,6 +907,244 @@ done:
 	return (rc == X86EMUL_UNHANDLEABLE) ? -1 : 0;
 }
 
+static inline void emulate_push(struct x86_emulate_ctxt *ctxt)
+{
+	struct decode_cache *c = &ctxt->decode;
+
+	c->dst.type  = OP_MEM;
+	c->dst.bytes = c->op_bytes;
+	c->dst.val = c->src.val;
+	register_address_increment(c->regs[VCPU_REGS_RSP], -c->op_bytes);
+	c->dst.ptr = (void *) register_address(ctxt->ss_base,
+					       c->regs[VCPU_REGS_RSP]);
+}
+
+static inline int emulate_grp1a(struct x86_emulate_ctxt *ctxt,
+				struct x86_emulate_ops *ops)
+{
+	struct decode_cache *c = &ctxt->decode;
+	int rc;
+
+	/* 64-bit mode: POP always pops a 64-bit operand. */
+
+	if (ctxt->mode == X86EMUL_MODE_PROT64)
+		c->dst.bytes = 8;
+
+	rc = ops->read_std(register_address(ctxt->ss_base,
+					    c->regs[VCPU_REGS_RSP]),
+			   &c->dst.val, c->dst.bytes, ctxt->vcpu);
+	if (rc != 0)
+		return rc;
+
+	register_address_increment(c->regs[VCPU_REGS_RSP], c->dst.bytes);
+
+	return 0;
+}
+
+static inline void emulate_grp2(struct decode_cache *c, unsigned long *_eflags)
+{
+	switch (c->modrm_reg) {
+	case 0:	/* rol */
+		emulate_2op_SrcB("rol", c->src, c->dst, *_eflags);
+		break;
+	case 1:	/* ror */
+		emulate_2op_SrcB("ror", c->src, c->dst, *_eflags);
+		break;
+	case 2:	/* rcl */
+		emulate_2op_SrcB("rcl", c->src, c->dst, *_eflags);
+		break;
+	case 3:	/* rcr */
+		emulate_2op_SrcB("rcr", c->src, c->dst, *_eflags);
+		break;
+	case 4:	/* sal/shl */
+	case 6:	/* sal/shl */
+		emulate_2op_SrcB("sal", c->src, c->dst, *_eflags);
+		break;
+	case 5:	/* shr */
+		emulate_2op_SrcB("shr", c->src, c->dst, *_eflags);
+		break;
+	case 7:	/* sar */
+		emulate_2op_SrcB("sar", c->src, c->dst, *_eflags);
+		break;
+	}
+}
+
+static inline int emulate_grp3(struct x86_emulate_ctxt *ctxt,
+			       struct x86_emulate_ops *ops,
+			       unsigned long *_eflags)
+{
+	struct decode_cache *c = &ctxt->decode;
+	int rc = 0;
+
+	switch (c->modrm_reg) {
+	case 0 ... 1:	/* test */
+		/*
+		 * Special case in Grp3: test has an immediate
+		 * source operand.
+		 */
+		c->src.type = OP_IMM;
+		c->src.ptr = (unsigned long *)c->eip;
+		c->src.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
+		if (c->src.bytes == 8)
+			c->src.bytes = 4;
+		switch (c->src.bytes) {
+		case 1:
+			c->src.val = insn_fetch(s8, 1, c->eip);
+			break;
+		case 2:
+			c->src.val = insn_fetch(s16, 2, c->eip);
+			break;
+		case 4:
+			c->src.val = insn_fetch(s32, 4, c->eip);
+			break;
+		}
+		emulate_2op_SrcV("test", c->src, c->dst, *_eflags);
+		break;
+	case 2:	/* not */
+		c->dst.val = ~c->dst.val;
+		break;
+	case 3:	/* neg */
+		emulate_1op("neg", c->dst, *_eflags);
+		break;
+	default:
+		DPRINTF("Cannot emulate %02x\n", c->b);
+		rc = X86EMUL_UNHANDLEABLE;
+		break;
+	}
+done:
+	return rc;
+}
+
+static inline int emulate_grp45(struct x86_emulate_ctxt *ctxt,
+			       struct x86_emulate_ops *ops,
+			       unsigned long *_eflags,
+			       int *no_wb)
+{
+	struct decode_cache *c = &ctxt->decode;
+	int rc;
+
+	switch (c->modrm_reg) {
+	case 0:	/* inc */
+		emulate_1op("inc", c->dst, *_eflags);
+		break;
+	case 1:	/* dec */
+		emulate_1op("dec", c->dst, *_eflags);
+		break;
+	case 4: /* jmp abs */
+		if (c->b == 0xff)
+			c->eip = c->dst.val;
+		else {
+			DPRINTF("Cannot emulate %02x\n", c->b);
+			return X86EMUL_UNHANDLEABLE;
+		}
+		break;
+	case 6:	/* push */
+
+		/* 64-bit mode: PUSH always pushes a 64-bit operand. */
+
+		if (ctxt->mode == X86EMUL_MODE_PROT64) {
+			c->dst.bytes = 8;
+			rc = ops->read_std((unsigned long)c->dst.ptr,
+					   &c->dst.val, 8, ctxt->vcpu);
+			if (rc != 0)
+				return rc;
+		}
+		register_address_increment(c->regs[VCPU_REGS_RSP],
+					   -c->dst.bytes);
+		rc = ops->write_emulated(register_address(ctxt->ss_base,
+				    c->regs[VCPU_REGS_RSP]), &c->dst.val,
+				    c->dst.bytes, ctxt->vcpu);
+		if (rc != 0)
+			return rc;
+		*no_wb = 1;
+		break;
+	default:
+		DPRINTF("Cannot emulate %02x\n", c->b);
+		return X86EMUL_UNHANDLEABLE;
+	}
+	return 0;
+}
+
+static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt,
+			       struct x86_emulate_ops *ops,
+			       unsigned long *_eflags,
+			       unsigned long cr2)
+{
+	struct decode_cache *c = &ctxt->decode;
+	u64 old, new;
+	int rc;
+
+	rc = ops->read_emulated(cr2, &old, 8, ctxt->vcpu);
+	if (rc != 0)
+		return rc;
+
+	if (((u32) (old >> 0) != (u32) c->regs[VCPU_REGS_RAX]) ||
+	    ((u32) (old >> 32) != (u32) c->regs[VCPU_REGS_RDX])) {
+
+		c->regs[VCPU_REGS_RAX] = (u32) (old >> 0);
+		c->regs[VCPU_REGS_RDX] = (u32) (old >> 32);
+		*_eflags &= ~EFLG_ZF;
+
+	} else {
+		new = ((u64)c->regs[VCPU_REGS_RCX] << 32) |
+		       (u32) c->regs[VCPU_REGS_RBX];
+
+		rc = ops->cmpxchg_emulated(cr2, &old, &new, 8, ctxt->vcpu);
+		if (rc != 0)
+			return rc;
+		*_eflags |= EFLG_ZF;
+	}
+	return 0;
+}
+
+static inline int writeback(struct x86_emulate_ctxt *ctxt,
+			    struct x86_emulate_ops *ops)
+{
+	int rc;
+	struct decode_cache *c = &ctxt->decode;
+
+	switch (c->dst.type) {
+	case OP_REG:
+		/* The 4-byte case *is* correct:
+		 * in 64-bit mode we zero-extend.
+		 */
+		switch (c->dst.bytes) {
+		case 1:
+			*(u8 *)c->dst.ptr = (u8)c->dst.val;
+			break;
+		case 2:
+			*(u16 *)c->dst.ptr = (u16)c->dst.val;
+			break;
+		case 4:
+			*c->dst.ptr = (u32)c->dst.val;
+			break;	/* 64b: zero-ext */
+		case 8:
+			*c->dst.ptr = c->dst.val;
+			break;
+		}
+		break;
+	case OP_MEM:
+		if (c->lock_prefix)
+			rc = ops->cmpxchg_emulated(
+					(unsigned long)c->dst.ptr,
+					&c->dst.orig_val,
+					&c->dst.val,
+					c->dst.bytes,
+					ctxt->vcpu);
+		else
+			rc = ops->write_emulated(
+					(unsigned long)c->dst.ptr,
+					&c->dst.val,
+					c->dst.bytes,
+					ctxt->vcpu);
+		if (rc != 0)
+			return rc;
+	default:
+		break;
+	}
+	return 0;
+}
+
 int
 x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 {
@@ -1042,7 +1280,6 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 		}
 		break;
 	case 0x84 ... 0x85:
-	      test:		/* test */
 		emulate_2op_SrcV("test", c->src, c->dst, _eflags);
 		break;
 	case 0x86 ... 0x87:	/* xchg */
@@ -1074,18 +1311,9 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 		c->dst.val = c->modrm_val;
 		break;
 	case 0x8f:		/* pop (sole member of Grp1a) */
-		/* 64-bit mode: POP always pops a 64-bit operand. */
-		if (ctxt->mode == X86EMUL_MODE_PROT64)
-			c->dst.bytes = 8;
-		if ((rc = ops->read_std(register_address(
-						   ctxt->ss_base,
-						   c->regs[VCPU_REGS_RSP]),
-						   &c->dst.val,
-						   c->dst.bytes,
-						   ctxt->vcpu)) != 0)
+		rc = emulate_grp1a(ctxt, ops);
+		if (rc != 0)
 			goto done;
-		register_address_increment(c->regs[VCPU_REGS_RSP],
-					   c->dst.bytes);
 		break;
 	case 0xa0 ... 0xa1:	/* mov */
 		c->dst.ptr = (unsigned long *)&c->regs[VCPU_REGS_RAX];
@@ -1099,31 +1327,7 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 		c->eip += c->ad_bytes;
 		break;
 	case 0xc0 ... 0xc1:
-	      grp2:		/* Grp2 */
-		switch (c->modrm_reg) {
-		case 0:	/* rol */
-			emulate_2op_SrcB("rol", c->src, c->dst, _eflags);
-			break;
-		case 1:	/* ror */
-			emulate_2op_SrcB("ror", c->src, c->dst, _eflags);
-			break;
-		case 2:	/* rcl */
-			emulate_2op_SrcB("rcl", c->src, c->dst, _eflags);
-			break;
-		case 3:	/* rcr */
-			emulate_2op_SrcB("rcr", c->src, c->dst, _eflags);
-			break;
-		case 4:	/* sal/shl */
-		case 6:	/* sal/shl */
-			emulate_2op_SrcB("sal", c->src, c->dst, _eflags);
-			break;
-		case 5:	/* shr */
-			emulate_2op_SrcB("shr", c->src, c->dst, _eflags);
-			break;
-		case 7:	/* sar */
-			emulate_2op_SrcB("sar", c->src, c->dst, _eflags);
-			break;
-		}
+		emulate_grp2(c, &_eflags);
 		break;
 	case 0xc6 ... 0xc7:	/* mov (sole member of Grp11) */
 	mov:
@@ -1131,126 +1335,29 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 		break;
 	case 0xd0 ... 0xd1:	/* Grp2 */
 		c->src.val = 1;
-		goto grp2;
+		emulate_grp2(c, &_eflags);
+		break;
 	case 0xd2 ... 0xd3:	/* Grp2 */
 		c->src.val = c->regs[VCPU_REGS_RCX];
-		goto grp2;
+		emulate_grp2(c, &_eflags);
+		break;
 	case 0xf6 ... 0xf7:	/* Grp3 */
-		switch (c->modrm_reg) {
-		case 0 ... 1:	/* test */
-			/*
-			 * Special case in Grp3: test has an immediate
-			 * source operand.
-			 */
-			c->src.type = OP_IMM;
-			c->src.ptr = (unsigned long *)c->eip;
-			c->src.bytes = (c->d & ByteOp) ? 1 :
-							       c->op_bytes;
-			if (c->src.bytes == 8)
-				c->src.bytes = 4;
-			switch (c->src.bytes) {
-			case 1:
-				c->src.val = insn_fetch(s8, 1, c->eip);
-				break;
-			case 2:
-				c->src.val = insn_fetch(s16, 2, c->eip);
-				break;
-			case 4:
-				c->src.val = insn_fetch(s32, 4, c->eip);
-				break;
-			}
-			goto test;
-		case 2:	/* not */
-			c->dst.val = ~c->dst.val;
-			break;
-		case 3:	/* neg */
-			emulate_1op("neg", c->dst, _eflags);
-			break;
-		default:
-			goto cannot_emulate;
-		}
+		rc = emulate_grp3(ctxt, ops, &_eflags);
+		if (rc != 0)
+			goto done;
 		break;
 	case 0xfe ... 0xff:	/* Grp4/Grp5 */
-		switch (c->modrm_reg) {
-		case 0:	/* inc */
-			emulate_1op("inc", c->dst, _eflags);
-			break;
-		case 1:	/* dec */
-			emulate_1op("dec", c->dst, _eflags);
-			break;
-		case 4: /* jmp abs */
-			if (c->b == 0xff)
-				c->eip = c->dst.val;
-			else
-				goto cannot_emulate;
-			break;
-		case 6:	/* push */
-			/* 64-bit mode: PUSH always pushes a 64-bit operand. */
-			if (ctxt->mode == X86EMUL_MODE_PROT64) {
-				c->dst.bytes = 8;
-				if ((rc = ops->read_std(
-						 (unsigned long)c->dst.ptr,
-						 &c->dst.val, 8,
-						 ctxt->vcpu)) != 0)
-					goto done;
-			}
-			register_address_increment(c->regs[VCPU_REGS_RSP],
-						   -c->dst.bytes);
-			if ((rc = ops->write_emulated(
-				     register_address(ctxt->ss_base,
-					  c->regs[VCPU_REGS_RSP]),
-					  &c->dst.val,
-					   c->dst.bytes, ctxt->vcpu)) != 0)
-				goto done;
-			no_wb = 1;
-			break;
-		default:
-			goto cannot_emulate;
-		}
+		rc = emulate_grp45(ctxt, ops, &_eflags, &no_wb);
+		if (rc != 0)
+			goto done;
 		break;
 	}
 
 writeback:
 	if (!no_wb) {
-		switch (c->dst.type) {
-		case OP_REG:
-			/* The 4-byte case *is* correct:
-			 * in 64-bit mode we zero-extend.
-			 */
-			switch (c->dst.bytes) {
-			case 1:
-				*(u8 *)c->dst.ptr = (u8)c->dst.val;
-				break;
-			case 2:
-				*(u16 *)c->dst.ptr = (u16)c->dst.val;
-				break;
-			case 4:
-				*c->dst.ptr = (u32)c->dst.val;
-				break;	/* 64b: zero-ext */
-			case 8:
-				*c->dst.ptr = c->dst.val;
-				break;
-			}
-			break;
-		case OP_MEM:
-			if (c->lock_prefix)
-				rc = ops->cmpxchg_emulated(
-						(unsigned long)c->dst.ptr,
-						&c->dst.orig_val,
-						&c->dst.val,
-						c->dst.bytes,
-						ctxt->vcpu);
-			else
-				rc = ops->write_emulated(
-						(unsigned long)c->dst.ptr,
-						&c->dst.val,
-						c->dst.bytes,
-						ctxt->vcpu);
-			if (rc != 0)
-				goto done;
-		default:
-			break;
-		}
+		rc = writeback(ctxt, ops);
+		if (rc != 0)
+			goto done;
 	}
 
 	/* Commit shadow register state. */
@@ -1283,8 +1390,7 @@ special_insn:
 			ctxt->ss_base, c->regs[VCPU_REGS_RSP]);
 		break;
 	case 0x58 ... 0x5f: /* pop reg */
-		c->dst.ptr =
-				(unsigned long *)&c->regs[c->b & 0x7];
+		c->dst.ptr = (unsigned long *)&c->regs[c->b & 0x7];
 	pop_instruction:
 		if ((rc = ops->read_std(register_address(ctxt->ss_base,
 			c->regs[VCPU_REGS_RSP]), c->dst.ptr,
@@ -1298,14 +1404,7 @@ special_insn:
 	case 0x6a: /* push imm8 */
 		c->src.val = 0L;
 		c->src.val = insn_fetch(s8, 1, c->eip);
-push:
-		c->dst.type  = OP_MEM;
-		c->dst.bytes = c->op_bytes;
-		c->dst.val = c->src.val;
-		register_address_increment(c->regs[VCPU_REGS_RSP],
-					   -c->op_bytes);
-		c->dst.ptr = (void *) register_address(ctxt->ss_base,
-						       c->regs[VCPU_REGS_RSP]);
+		emulate_push(ctxt);
 		break;
 	case 0x6c:		/* insb */
 	case 0x6d:		/* insw/insd */
@@ -1350,7 +1449,8 @@ push:
 	}
 	case 0x9c: /* pushf */
 		c->src.val =  (unsigned long) _eflags;
-		goto push;
+		emulate_push(ctxt);
+		break;
 	case 0x9d: /* popf */
 		c->dst.ptr = (unsigned long *) &_eflags;
 		goto pop_instruction;
@@ -1436,7 +1536,8 @@ push:
 		c->src.val = (unsigned long) c->eip;
 		JMP_REL(rel);
 		c->op_bytes = c->ad_bytes;
-		goto push;
+		emulate_push(ctxt);
+		break;
 	}
 	case 0xe9: /* jmp rel */
 	case 0xeb: /* jmp rel short */
@@ -1511,8 +1612,7 @@ twobyte_insn:
 		no_wb = 1;
 		if (c->modrm_mod != 3)
 			goto cannot_emulate;
-		rc = emulator_get_dr(ctxt, c->modrm_reg,
-				     &c->regs[c->modrm_rm]);
+		rc = emulator_get_dr(ctxt, c->modrm_reg, &c->regs[c->modrm_rm]);
 		break;
 	case 0x23: /* mov from reg to dr */
 		no_wb = 1;
@@ -1668,8 +1768,7 @@ twobyte_special_insn:
 		break;
 	case 0x32:
 		/* rdmsr */
-		rc = kvm_get_msr(ctxt->vcpu,
-				 c->regs[VCPU_REGS_RCX], &msr_data);
+		rc = kvm_get_msr(ctxt->vcpu, c->regs[VCPU_REGS_RCX], &msr_data);
 		if (rc) {
 			kvm_x86_ops->inject_gp(ctxt->vcpu, 0);
 			c->eip = ctxt->vcpu->rip;
@@ -1701,28 +1800,10 @@ twobyte_special_insn:
 		break;
 	}
 	case 0xc7:		/* Grp9 (cmpxchg8b) */
-		{
-			u64 old, new;
-			if ((rc = ops->read_emulated(cr2, &old, 8, ctxt->vcpu))
-									!= 0)
-				goto done;
-			if (((u32) (old >> 0) !=
-					(u32) c->regs[VCPU_REGS_RAX]) ||
-			    ((u32) (old >> 32) !=
-					(u32) c->regs[VCPU_REGS_RDX])) {
-				c->regs[VCPU_REGS_RAX] = (u32) (old >> 0);
-				c->regs[VCPU_REGS_RDX] = (u32) (old >> 32);
-				_eflags &= ~EFLG_ZF;
-			} else {
-				new = ((u64)c->regs[VCPU_REGS_RCX] << 32)
-					| (u32) c->regs[VCPU_REGS_RBX];
-				if ((rc = ops->cmpxchg_emulated(cr2, &old,
-							  &new, 8, ctxt->vcpu)) != 0)
-					goto done;
-				_eflags |= EFLG_ZF;
-			}
-			break;
-		}
+		rc = emulate_grp9(ctxt, ops, &_eflags, cr2);
+		if (rc != 0)
+			goto done;
+		break;
 	}
 	goto writeback;
 
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 13/50] KVM: x86 emulator: remove _eflags and use directly ctxt->eflags.
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (11 preceding siblings ...)
  2007-12-23 14:50   ` [PATCH 12/50] KVM: x86 emulator: split some decoding into functions for readability Avi Kivity
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:50   ` [PATCH 14/50] KVM: x86 emulator: Remove no_wb, use dst.type = OP_NONE instead Avi Kivity
                     ` (36 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Laurent Vivier

From: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>

Remove _eflags and use ctxt->eflags directly. Caching eflags is not needed,
because kvm_main.c:emulate_instruction() restores it to the vcpu from
ctxt->eflags only if emulation doesn't fail.

Signed-off-by: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
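A note for reviewers, not part of the commit: the argument above can be
illustrated with a minimal standalone sketch. All names here (toy_vcpu,
toy_ctxt, toy_emulate, toy_emulate_instruction) are hypothetical stand-ins,
not the real KVM structures or functions. It only shows that when the caller
copies eflags back on success alone, the emulator can modify the context's
copy in place without a second local cache.

```c
#include <stdbool.h>

#define EFLG_ZF (1UL << 6)	/* zero flag, bit 6 of EFLAGS */

/* Hypothetical stand-ins for the vcpu and the emulation context. */
struct toy_vcpu { unsigned long rflags; };
struct toy_ctxt { unsigned long eflags; };

/* Emulation step: updates ctxt->eflags in place, then may fail. */
static int toy_emulate(struct toy_ctxt *ctxt, bool fail)
{
	ctxt->eflags |= EFLG_ZF;
	return fail ? -1 : 0;
}

/* Caller: seed ctxt from the vcpu, commit back only on success. */
static int toy_emulate_instruction(struct toy_vcpu *vcpu, bool fail)
{
	struct toy_ctxt ctxt = { .eflags = vcpu->rflags };
	int rc = toy_emulate(&ctxt, fail);

	if (rc == 0)
		vcpu->rflags = ctxt.eflags;
	return rc;
}
```

In this sketch a failed emulation leaves the vcpu flags untouched even though
the context's copy was already modified, which is the property the patch
relies on.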
 drivers/kvm/x86_emulate.c |  121 ++++++++++++++++++++++-----------------------
 1 files changed, 59 insertions(+), 62 deletions(-)

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index a108736..45beeb9 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -941,37 +941,37 @@ static inline int emulate_grp1a(struct x86_emulate_ctxt *ctxt,
 	return 0;
 }
 
-static inline void emulate_grp2(struct decode_cache *c, unsigned long *_eflags)
+static inline void emulate_grp2(struct x86_emulate_ctxt *ctxt)
 {
+	struct decode_cache *c = &ctxt->decode;
 	switch (c->modrm_reg) {
 	case 0:	/* rol */
-		emulate_2op_SrcB("rol", c->src, c->dst, *_eflags);
+		emulate_2op_SrcB("rol", c->src, c->dst, ctxt->eflags);
 		break;
 	case 1:	/* ror */
-		emulate_2op_SrcB("ror", c->src, c->dst, *_eflags);
+		emulate_2op_SrcB("ror", c->src, c->dst, ctxt->eflags);
 		break;
 	case 2:	/* rcl */
-		emulate_2op_SrcB("rcl", c->src, c->dst, *_eflags);
+		emulate_2op_SrcB("rcl", c->src, c->dst, ctxt->eflags);
 		break;
 	case 3:	/* rcr */
-		emulate_2op_SrcB("rcr", c->src, c->dst, *_eflags);
+		emulate_2op_SrcB("rcr", c->src, c->dst, ctxt->eflags);
 		break;
 	case 4:	/* sal/shl */
 	case 6:	/* sal/shl */
-		emulate_2op_SrcB("sal", c->src, c->dst, *_eflags);
+		emulate_2op_SrcB("sal", c->src, c->dst, ctxt->eflags);
 		break;
 	case 5:	/* shr */
-		emulate_2op_SrcB("shr", c->src, c->dst, *_eflags);
+		emulate_2op_SrcB("shr", c->src, c->dst, ctxt->eflags);
 		break;
 	case 7:	/* sar */
-		emulate_2op_SrcB("sar", c->src, c->dst, *_eflags);
+		emulate_2op_SrcB("sar", c->src, c->dst, ctxt->eflags);
 		break;
 	}
 }
 
 static inline int emulate_grp3(struct x86_emulate_ctxt *ctxt,
-			       struct x86_emulate_ops *ops,
-			       unsigned long *_eflags)
+			       struct x86_emulate_ops *ops)
 {
 	struct decode_cache *c = &ctxt->decode;
 	int rc = 0;
@@ -998,13 +998,13 @@ static inline int emulate_grp3(struct x86_emulate_ctxt *ctxt,
 			c->src.val = insn_fetch(s32, 4, c->eip);
 			break;
 		}
-		emulate_2op_SrcV("test", c->src, c->dst, *_eflags);
+		emulate_2op_SrcV("test", c->src, c->dst, ctxt->eflags);
 		break;
 	case 2:	/* not */
 		c->dst.val = ~c->dst.val;
 		break;
 	case 3:	/* neg */
-		emulate_1op("neg", c->dst, *_eflags);
+		emulate_1op("neg", c->dst, ctxt->eflags);
 		break;
 	default:
 		DPRINTF("Cannot emulate %02x\n", c->b);
@@ -1017,7 +1017,6 @@ done:
 
 static inline int emulate_grp45(struct x86_emulate_ctxt *ctxt,
 			       struct x86_emulate_ops *ops,
-			       unsigned long *_eflags,
 			       int *no_wb)
 {
 	struct decode_cache *c = &ctxt->decode;
@@ -1025,10 +1024,10 @@ static inline int emulate_grp45(struct x86_emulate_ctxt *ctxt,
 
 	switch (c->modrm_reg) {
 	case 0:	/* inc */
-		emulate_1op("inc", c->dst, *_eflags);
+		emulate_1op("inc", c->dst, ctxt->eflags);
 		break;
 	case 1:	/* dec */
-		emulate_1op("dec", c->dst, *_eflags);
+		emulate_1op("dec", c->dst, ctxt->eflags);
 		break;
 	case 4: /* jmp abs */
 		if (c->b == 0xff)
@@ -1067,7 +1066,6 @@ static inline int emulate_grp45(struct x86_emulate_ctxt *ctxt,
 
 static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt,
 			       struct x86_emulate_ops *ops,
-			       unsigned long *_eflags,
 			       unsigned long cr2)
 {
 	struct decode_cache *c = &ctxt->decode;
@@ -1083,7 +1081,7 @@ static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt,
 
 		c->regs[VCPU_REGS_RAX] = (u32) (old >> 0);
 		c->regs[VCPU_REGS_RDX] = (u32) (old >> 32);
-		*_eflags &= ~EFLG_ZF;
+		ctxt->eflags &= ~EFLG_ZF;
 
 	} else {
 		new = ((u64)c->regs[VCPU_REGS_RCX] << 32) |
@@ -1092,7 +1090,7 @@ static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt,
 		rc = ops->cmpxchg_emulated(cr2, &old, &new, 8, ctxt->vcpu);
 		if (rc != 0)
 			return rc;
-		*_eflags |= EFLG_ZF;
+		ctxt->eflags |= EFLG_ZF;
 	}
 	return 0;
 }
@@ -1152,7 +1150,6 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 	int no_wb = 0;
 	u64 msr_data;
 	unsigned long saved_eip = 0;
-	unsigned long _eflags = ctxt->eflags;
 	struct decode_cache *c = &ctxt->decode;
 	int rc = 0;
 
@@ -1207,23 +1204,23 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 	switch (c->b) {
 	case 0x00 ... 0x05:
 	      add:		/* add */
-		emulate_2op_SrcV("add", c->src, c->dst, _eflags);
+		emulate_2op_SrcV("add", c->src, c->dst, ctxt->eflags);
 		break;
 	case 0x08 ... 0x0d:
 	      or:		/* or */
-		emulate_2op_SrcV("or", c->src, c->dst, _eflags);
+		emulate_2op_SrcV("or", c->src, c->dst, ctxt->eflags);
 		break;
 	case 0x10 ... 0x15:
 	      adc:		/* adc */
-		emulate_2op_SrcV("adc", c->src, c->dst, _eflags);
+		emulate_2op_SrcV("adc", c->src, c->dst, ctxt->eflags);
 		break;
 	case 0x18 ... 0x1d:
 	      sbb:		/* sbb */
-		emulate_2op_SrcV("sbb", c->src, c->dst, _eflags);
+		emulate_2op_SrcV("sbb", c->src, c->dst, ctxt->eflags);
 		break;
 	case 0x20 ... 0x23:
 	      and:		/* and */
-		emulate_2op_SrcV("and", c->src, c->dst, _eflags);
+		emulate_2op_SrcV("and", c->src, c->dst, ctxt->eflags);
 		break;
 	case 0x24:              /* and al imm8 */
 		c->dst.type = OP_REG;
@@ -1244,15 +1241,15 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 		goto and;
 	case 0x28 ... 0x2d:
 	      sub:		/* sub */
-		emulate_2op_SrcV("sub", c->src, c->dst, _eflags);
+		emulate_2op_SrcV("sub", c->src, c->dst, ctxt->eflags);
 		break;
 	case 0x30 ... 0x35:
 	      xor:		/* xor */
-		emulate_2op_SrcV("xor", c->src, c->dst, _eflags);
+		emulate_2op_SrcV("xor", c->src, c->dst, ctxt->eflags);
 		break;
 	case 0x38 ... 0x3d:
 	      cmp:		/* cmp */
-		emulate_2op_SrcV("cmp", c->src, c->dst, _eflags);
+		emulate_2op_SrcV("cmp", c->src, c->dst, ctxt->eflags);
 		break;
 	case 0x63:		/* movsxd */
 		if (ctxt->mode != X86EMUL_MODE_PROT64)
@@ -1280,7 +1277,7 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 		}
 		break;
 	case 0x84 ... 0x85:
-		emulate_2op_SrcV("test", c->src, c->dst, _eflags);
+		emulate_2op_SrcV("test", c->src, c->dst, ctxt->eflags);
 		break;
 	case 0x86 ... 0x87:	/* xchg */
 		/* Write back the register source. */
@@ -1327,7 +1324,7 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 		c->eip += c->ad_bytes;
 		break;
 	case 0xc0 ... 0xc1:
-		emulate_grp2(c, &_eflags);
+		emulate_grp2(ctxt);
 		break;
 	case 0xc6 ... 0xc7:	/* mov (sole member of Grp11) */
 	mov:
@@ -1335,19 +1332,19 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 		break;
 	case 0xd0 ... 0xd1:	/* Grp2 */
 		c->src.val = 1;
-		emulate_grp2(c, &_eflags);
+		emulate_grp2(ctxt);
 		break;
 	case 0xd2 ... 0xd3:	/* Grp2 */
 		c->src.val = c->regs[VCPU_REGS_RCX];
-		emulate_grp2(c, &_eflags);
+		emulate_grp2(ctxt);
 		break;
 	case 0xf6 ... 0xf7:	/* Grp3 */
-		rc = emulate_grp3(ctxt, ops, &_eflags);
+		rc = emulate_grp3(ctxt, ops);
 		if (rc != 0)
 			goto done;
 		break;
 	case 0xfe ... 0xff:	/* Grp4/Grp5 */
-		rc = emulate_grp45(ctxt, ops, &_eflags, &no_wb);
+		rc = emulate_grp45(ctxt, ops, &no_wb);
 		if (rc != 0)
 			goto done;
 		break;
@@ -1362,7 +1359,6 @@ writeback:
 
 	/* Commit shadow register state. */
 	memcpy(ctxt->vcpu->regs, c->regs, sizeof c->regs);
-	ctxt->eflags = _eflags;
 	ctxt->vcpu->rip = c->eip;
 
 done:
@@ -1413,7 +1409,7 @@ special_insn:
 				(c->d & ByteOp) ? 1 : c->op_bytes,
 				c->rep_prefix ?
 				address_mask(c->regs[VCPU_REGS_RCX]) : 1,
-				(_eflags & EFLG_DF),
+				(ctxt->eflags & EFLG_DF),
 				register_address(ctxt->es_base,
 						 c->regs[VCPU_REGS_RDI]),
 				c->rep_prefix,
@@ -1429,7 +1425,7 @@ special_insn:
 				(c->d & ByteOp) ? 1 : c->op_bytes,
 				c->rep_prefix ?
 				address_mask(c->regs[VCPU_REGS_RCX]) : 1,
-				(_eflags & EFLG_DF),
+				(ctxt->eflags & EFLG_DF),
 				register_address(c->override_base ?
 							*c->override_base :
 							ctxt->ds_base,
@@ -1443,16 +1439,16 @@ special_insn:
 	case 0x70 ... 0x7f: /* jcc (short) */ {
 		int rel = insn_fetch(s8, 1, c->eip);
 
-		if (test_cc(c->b, _eflags))
+		if (test_cc(c->b, ctxt->eflags))
 		JMP_REL(rel);
 		break;
 	}
 	case 0x9c: /* pushf */
-		c->src.val =  (unsigned long) _eflags;
+		c->src.val =  (unsigned long) ctxt->eflags;
 		emulate_push(ctxt);
 		break;
 	case 0x9d: /* popf */
-		c->dst.ptr = (unsigned long *) &_eflags;
+		c->dst.ptr = (unsigned long *) &ctxt->eflags;
 		goto pop_instruction;
 	case 0xc3: /* ret */
 		c->dst.ptr = &c->eip;
@@ -1484,10 +1480,10 @@ special_insn:
 					c->dst.bytes, ctxt->vcpu)) != 0)
 			goto done;
 		register_address_increment(c->regs[VCPU_REGS_RSI],
-				       (_eflags & EFLG_DF) ? -c->dst.bytes
+				       (ctxt->eflags & EFLG_DF) ? -c->dst.bytes
 							   : c->dst.bytes);
 		register_address_increment(c->regs[VCPU_REGS_RDI],
-				       (_eflags & EFLG_DF) ? -c->dst.bytes
+				       (ctxt->eflags & EFLG_DF) ? -c->dst.bytes
 							   : c->dst.bytes);
 		break;
 	case 0xa6 ... 0xa7:	/* cmps */
@@ -1499,7 +1495,7 @@ special_insn:
 		c->dst.ptr = (unsigned long *)cr2;
 		c->dst.val = c->regs[VCPU_REGS_RAX];
 		register_address_increment(c->regs[VCPU_REGS_RDI],
-				       (_eflags & EFLG_DF) ? -c->dst.bytes
+				       (ctxt->eflags & EFLG_DF) ? -c->dst.bytes
 							   : c->dst.bytes);
 		break;
 	case 0xac ... 0xad:	/* lods */
@@ -1511,7 +1507,7 @@ special_insn:
 					     ctxt->vcpu)) != 0)
 			goto done;
 		register_address_increment(c->regs[VCPU_REGS_RSI],
-				       (_eflags & EFLG_DF) ? -c->dst.bytes
+				       (ctxt->eflags & EFLG_DF) ? -c->dst.bytes
 							   : c->dst.bytes);
 		break;
 	case 0xae ... 0xaf:	/* scas */
@@ -1599,7 +1595,8 @@ twobyte_insn:
 		case 6: /* lmsw */
 			if (c->modrm_mod != 3)
 				goto cannot_emulate;
-			realmode_lmsw(ctxt->vcpu, (u16)c->modrm_val, &_eflags);
+			realmode_lmsw(ctxt->vcpu, (u16)c->modrm_val,
+						  &ctxt->eflags);
 			break;
 		case 7: /* invlpg*/
 			emulate_invlpg(ctxt->vcpu, cr2);
@@ -1630,29 +1627,29 @@ twobyte_insn:
 		 */
 		switch ((c->b & 15) >> 1) {
 		case 0:	/* cmovo */
-			no_wb = (_eflags & EFLG_OF) ? 0 : 1;
+			no_wb = (ctxt->eflags & EFLG_OF) ? 0 : 1;
 			break;
 		case 1:	/* cmovb/cmovc/cmovnae */
-			no_wb = (_eflags & EFLG_CF) ? 0 : 1;
+			no_wb = (ctxt->eflags & EFLG_CF) ? 0 : 1;
 			break;
 		case 2:	/* cmovz/cmove */
-			no_wb = (_eflags & EFLG_ZF) ? 0 : 1;
+			no_wb = (ctxt->eflags & EFLG_ZF) ? 0 : 1;
 			break;
 		case 3:	/* cmovbe/cmovna */
-			no_wb = (_eflags & (EFLG_CF | EFLG_ZF)) ? 0 : 1;
+			no_wb = (ctxt->eflags & (EFLG_CF | EFLG_ZF)) ? 0 : 1;
 			break;
 		case 4:	/* cmovs */
-			no_wb = (_eflags & EFLG_SF) ? 0 : 1;
+			no_wb = (ctxt->eflags & EFLG_SF) ? 0 : 1;
 			break;
 		case 5:	/* cmovp/cmovpe */
-			no_wb = (_eflags & EFLG_PF) ? 0 : 1;
+			no_wb = (ctxt->eflags & EFLG_PF) ? 0 : 1;
 			break;
 		case 7:	/* cmovle/cmovng */
-			no_wb = (_eflags & EFLG_ZF) ? 0 : 1;
+			no_wb = (ctxt->eflags & EFLG_ZF) ? 0 : 1;
 			/* fall through */
 		case 6:	/* cmovl/cmovnge */
-			no_wb &= (!(_eflags & EFLG_SF) !=
-			      !(_eflags & EFLG_OF)) ? 0 : 1;
+			no_wb &= (!(ctxt->eflags & EFLG_SF) !=
+			      !(ctxt->eflags & EFLG_OF)) ? 0 : 1;
 			break;
 		}
 		/* Odd cmov opcodes (lsb == 1) have inverted sense. */
@@ -1662,13 +1659,13 @@ twobyte_insn:
 	      bt:		/* bt */
 		/* only subword offset */
 		c->src.val &= (c->dst.bytes << 3) - 1;
-		emulate_2op_SrcV_nobyte("bt", c->src, c->dst, _eflags);
+		emulate_2op_SrcV_nobyte("bt", c->src, c->dst, ctxt->eflags);
 		break;
 	case 0xab:
 	      bts:		/* bts */
 		/* only subword offset */
 		c->src.val &= (c->dst.bytes << 3) - 1;
-		emulate_2op_SrcV_nobyte("bts", c->src, c->dst, _eflags);
+		emulate_2op_SrcV_nobyte("bts", c->src, c->dst, ctxt->eflags);
 		break;
 	case 0xb0 ... 0xb1:	/* cmpxchg */
 		/*
@@ -1677,8 +1674,8 @@ twobyte_insn:
 		 */
 		c->src.orig_val = c->src.val;
 		c->src.val = c->regs[VCPU_REGS_RAX];
-		emulate_2op_SrcV("cmp", c->src, c->dst, _eflags);
-		if (_eflags & EFLG_ZF) {
+		emulate_2op_SrcV("cmp", c->src, c->dst, ctxt->eflags);
+		if (ctxt->eflags & EFLG_ZF) {
 			/* Success: write back to memory. */
 			c->dst.val = c->src.orig_val;
 		} else {
@@ -1691,7 +1688,7 @@ twobyte_insn:
 	      btr:		/* btr */
 		/* only subword offset */
 		c->src.val &= (c->dst.bytes << 3) - 1;
-		emulate_2op_SrcV_nobyte("btr", c->src, c->dst, _eflags);
+		emulate_2op_SrcV_nobyte("btr", c->src, c->dst, ctxt->eflags);
 		break;
 	case 0xb6 ... 0xb7:	/* movzx */
 		c->dst.bytes = c->op_bytes;
@@ -1714,7 +1711,7 @@ twobyte_insn:
 	      btc:		/* btc */
 		/* only subword offset */
 		c->src.val &= (c->dst.bytes << 3) - 1;
-		emulate_2op_SrcV_nobyte("btc", c->src, c->dst, _eflags);
+		emulate_2op_SrcV_nobyte("btc", c->src, c->dst, ctxt->eflags);
 		break;
 	case 0xbe ... 0xbf:	/* movsx */
 		c->dst.bytes = c->op_bytes;
@@ -1753,7 +1750,7 @@ twobyte_special_insn:
 		if (c->modrm_mod != 3)
 			goto cannot_emulate;
 		realmode_set_cr(ctxt->vcpu,
-				c->modrm_reg, c->modrm_val, &_eflags);
+				c->modrm_reg, c->modrm_val, &ctxt->eflags);
 		break;
 	case 0x30:
 		/* wrmsr */
@@ -1795,12 +1792,12 @@ twobyte_special_insn:
 			DPRINTF("jnz: Invalid op_bytes\n");
 			goto cannot_emulate;
 		}
-		if (test_cc(c->b, _eflags))
+		if (test_cc(c->b, ctxt->eflags))
 			JMP_REL(rel);
 		break;
 	}
 	case 0xc7:		/* Grp9 (cmpxchg8b) */
-		rc = emulate_grp9(ctxt, ops, &_eflags, cr2);
+		rc = emulate_grp9(ctxt, ops, cr2);
 		if (rc != 0)
 			goto done;
 		break;
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 14/50] KVM: x86 emulator: Remove no_wb, use dst.type = OP_NONE instead
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (12 preceding siblings ...)
  2007-12-23 14:50   ` [PATCH 13/50] KVM: x86 emulator: remove _eflags and use directly ctxt->eflags Avi Kivity
@ 2007-12-23 14:50   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 15/50] KVM: x86_emulator: no writeback for bt Avi Kivity
                     ` (35 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:50 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Laurent Vivier

From: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>

Remove no_wb and use dst.type = OP_NONE instead; idea stolen from xen-3.1.

Signed-off-by: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
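A note for reviewers, not part of the commit: a minimal standalone sketch of
the idea, assuming a simplified operand with only a type, a value and a
target pointer. The names toy_operand, toy_writeback and toy_cmovz are
hypothetical, and the condition test is hard-coded to ZF rather than going
through test_cc(). The point is that the writeback step is keyed on the
operand type, so an instruction suppresses writeback by retagging its
destination instead of setting a separate flag.

```c
#define EFLG_ZF (1UL << 6)	/* zero flag, bit 6 of EFLAGS */

enum toy_op_type { OP_REG, OP_MEM, OP_IMM, OP_NONE };

/* Hypothetical, simplified version of the emulator's operand. */
struct toy_operand {
	enum toy_op_type type;
	unsigned long val;
	unsigned long *ptr;
};

/* Writeback keyed on the operand type; OP_NONE means "do nothing". */
static void toy_writeback(struct toy_operand *dst)
{
	switch (dst->type) {
	case OP_REG:
		*dst->ptr = dst->val;
		break;
	case OP_NONE:
		/* no writeback */
		break;
	default:
		break;
	}
}

/* cmovz-like helper: the move happens only when ZF is set. */
static void toy_cmovz(unsigned long *reg, unsigned long src,
		      unsigned long eflags)
{
	struct toy_operand dst = { .type = OP_REG, .val = src, .ptr = reg };

	if (!(eflags & EFLG_ZF))
		dst.type = OP_NONE;	/* condition false: skip writeback */
	toy_writeback(&dst);
}
```

With ZF clear the register keeps its old value; with ZF set it receives the
source value. No no_wb variable is needed in either case.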
 drivers/kvm/x86_emulate.c |   76 ++++++++++++++------------------------------
 drivers/kvm/x86_emulate.h |    2 +-
 2 files changed, 25 insertions(+), 53 deletions(-)

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index 45beeb9..8b0186f 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -1016,8 +1016,7 @@ done:
 }
 
 static inline int emulate_grp45(struct x86_emulate_ctxt *ctxt,
-			       struct x86_emulate_ops *ops,
-			       int *no_wb)
+			       struct x86_emulate_ops *ops)
 {
 	struct decode_cache *c = &ctxt->decode;
 	int rc;
@@ -1055,7 +1054,7 @@ static inline int emulate_grp45(struct x86_emulate_ctxt *ctxt,
 				    c->dst.bytes, ctxt->vcpu);
 		if (rc != 0)
 			return rc;
-		*no_wb = 1;
+		c->dst.type = OP_NONE;
 		break;
 	default:
 		DPRINTF("Cannot emulate %02x\n", c->b);
@@ -1137,6 +1136,10 @@ static inline int writeback(struct x86_emulate_ctxt *ctxt,
 					ctxt->vcpu);
 		if (rc != 0)
 			return rc;
+		break;
+	case OP_NONE:
+		/* no writeback */
+		break;
 	default:
 		break;
 	}
@@ -1147,7 +1150,6 @@ int
 x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 {
 	unsigned long cr2 = ctxt->cr2;
-	int no_wb = 0;
 	u64 msr_data;
 	unsigned long saved_eip = 0;
 	struct decode_cache *c = &ctxt->decode;
@@ -1344,18 +1346,16 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 			goto done;
 		break;
 	case 0xfe ... 0xff:	/* Grp4/Grp5 */
-		rc = emulate_grp45(ctxt, ops, &no_wb);
+		rc = emulate_grp45(ctxt, ops);
 		if (rc != 0)
 			goto done;
 		break;
 	}
 
 writeback:
-	if (!no_wb) {
-		rc = writeback(ctxt, ops);
-		if (rc != 0)
-			goto done;
-	}
+	rc = writeback(ctxt, ops);
+	if (rc != 0)
+		goto done;
 
 	/* Commit shadow register state. */
 	memcpy(ctxt->vcpu->regs, c->regs, sizeof c->regs);
@@ -1395,7 +1395,7 @@ special_insn:
 
 		register_address_increment(c->regs[VCPU_REGS_RSP],
 					   c->op_bytes);
-		no_wb = 1; /* Disable writeback. */
+		c->dst.type = OP_NONE;	/* Disable writeback. */
 		break;
 	case 0x6a: /* push imm8 */
 		c->src.val = 0L;
@@ -1538,7 +1538,7 @@ special_insn:
 	case 0xe9: /* jmp rel */
 	case 0xeb: /* jmp rel short */
 		JMP_REL(c->src.val);
-		no_wb = 1; /* Disable writeback. */
+		c->dst.type = OP_NONE; /* Disable writeback. */
 		break;
 
 
@@ -1548,8 +1548,6 @@ special_insn:
 twobyte_insn:
 	switch (c->b) {
 	case 0x01: /* lgdt, lidt, lmsw */
-		/* Disable writeback. */
-		no_wb = 1;
 		switch (c->modrm_reg) {
 			u16 size;
 			unsigned long address;
@@ -1604,56 +1602,30 @@ twobyte_insn:
 		default:
 			goto cannot_emulate;
 		}
+		/* Disable writeback. */
+		c->dst.type = OP_NONE;
 		break;
 	case 0x21: /* mov from dr to reg */
-		no_wb = 1;
 		if (c->modrm_mod != 3)
 			goto cannot_emulate;
 		rc = emulator_get_dr(ctxt, c->modrm_reg, &c->regs[c->modrm_rm]);
+		if (rc)
+			goto cannot_emulate;
+		c->dst.type = OP_NONE;	/* no writeback */
 		break;
 	case 0x23: /* mov from reg to dr */
-		no_wb = 1;
 		if (c->modrm_mod != 3)
 			goto cannot_emulate;
 		rc = emulator_set_dr(ctxt, c->modrm_reg,
 				     c->regs[c->modrm_rm]);
+		if (rc)
+			goto cannot_emulate;
+		c->dst.type = OP_NONE;	/* no writeback */
 		break;
 	case 0x40 ... 0x4f:	/* cmov */
 		c->dst.val = c->dst.orig_val = c->src.val;
-		no_wb = 1;
-		/*
-		 * First, assume we're decoding an even cmov opcode
-		 * (lsb == 0).
-		 */
-		switch ((c->b & 15) >> 1) {
-		case 0:	/* cmovo */
-			no_wb = (ctxt->eflags & EFLG_OF) ? 0 : 1;
-			break;
-		case 1:	/* cmovb/cmovc/cmovnae */
-			no_wb = (ctxt->eflags & EFLG_CF) ? 0 : 1;
-			break;
-		case 2:	/* cmovz/cmove */
-			no_wb = (ctxt->eflags & EFLG_ZF) ? 0 : 1;
-			break;
-		case 3:	/* cmovbe/cmovna */
-			no_wb = (ctxt->eflags & (EFLG_CF | EFLG_ZF)) ? 0 : 1;
-			break;
-		case 4:	/* cmovs */
-			no_wb = (ctxt->eflags & EFLG_SF) ? 0 : 1;
-			break;
-		case 5:	/* cmovp/cmovpe */
-			no_wb = (ctxt->eflags & EFLG_PF) ? 0 : 1;
-			break;
-		case 7:	/* cmovle/cmovng */
-			no_wb = (ctxt->eflags & EFLG_ZF) ? 0 : 1;
-			/* fall through */
-		case 6:	/* cmovl/cmovnge */
-			no_wb &= (!(ctxt->eflags & EFLG_SF) !=
-			      !(ctxt->eflags & EFLG_OF)) ? 0 : 1;
-			break;
-		}
-		/* Odd cmov opcodes (lsb == 1) have inverted sense. */
-		no_wb ^= c->b & 1;
+		if (!test_cc(c->b, ctxt->eflags))
+			c->dst.type = OP_NONE; /* no writeback */
 		break;
 	case 0xa3:
 	      bt:		/* bt */
@@ -1727,8 +1699,6 @@ twobyte_insn:
 	goto writeback;
 
 twobyte_special_insn:
-	/* Disable writeback. */
-	no_wb = 1;
 	switch (c->b) {
 	case 0x06:
 		emulate_clts(ctxt->vcpu);
@@ -1802,6 +1772,8 @@ twobyte_special_insn:
 			goto done;
 		break;
 	}
+	/* Disable writeback. */
+	c->dst.type = OP_NONE;
 	goto writeback;
 
 cannot_emulate:
diff --git a/drivers/kvm/x86_emulate.h b/drivers/kvm/x86_emulate.h
index 28acad4..f03b128 100644
--- a/drivers/kvm/x86_emulate.h
+++ b/drivers/kvm/x86_emulate.h
@@ -114,7 +114,7 @@ struct x86_emulate_ops {
 
 /* Type, address-of, and value of an instruction's operand. */
 struct operand {
-	enum { OP_REG, OP_MEM, OP_IMM } type;
+	enum { OP_REG, OP_MEM, OP_IMM, OP_NONE } type;
 	unsigned int bytes;
 	unsigned long val, orig_val, *ptr;
 };
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 15/50] KVM: x86_emulator: no writeback for bt
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (13 preceding siblings ...)
  2007-12-23 14:50   ` [PATCH 14/50] KVM: x86 emulator: Remove no_wb, use dst.type = OP_NONE instead Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 16/50] KVM: Purify x86_decode_insn() error case management Avi Kivity
                     ` (34 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Qing He

From: Qing He <qing.he-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Signed-off-by: Qing He <qing.he-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/x86_emulate.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index 8b0186f..fe50317 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -1629,6 +1629,7 @@ twobyte_insn:
 		break;
 	case 0xa3:
 	      bt:		/* bt */
+		c->dst.type = OP_NONE;
 		/* only subword offset */
 		c->src.val &= (c->dst.bytes << 3) - 1;
 		emulate_2op_SrcV_nobyte("bt", c->src, c->dst, ctxt->eflags);
-- 
1.5.3.7
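
As a rough illustration of why bt needs no writeback: the instruction only reads the selected bit into the carry flag and never modifies its destination. The sketch below also shows the "only subword offset" masking from the hunk above. `bt_carry()` is a hypothetical helper written for this note, not a function in the emulator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of bt on a register operand: returns the value
 * the carry flag would take. dst_val is read, never written. */
static int bt_carry(uint64_t dst_val, unsigned int dst_bytes,
		    uint64_t bit_offset)
{
	assert(dst_bytes == 2 || dst_bytes == 4 || dst_bytes == 8);
	/* only subword offset: keep the offset inside the operand width */
	bit_offset &= (dst_bytes << 3) - 1;
	return (int)((dst_val >> bit_offset) & 1);
}
```

For a 4-byte operand, an offset of 35 is masked to 3, so it tests the same bit as an offset of 3.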



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 16/50] KVM: Purify x86_decode_insn() error case management
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (14 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 15/50] KVM: x86_emulator: no writeback for bt Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 17/50] KVM: x86 emulator: Any legacy prefix after a REX prefix nullifies its effect Avi Kivity
                     ` (33 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Laurent Vivier

From: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>

The only valid failure case is an access to a protected page; any other
failure is an error.

Signed-off-by: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index b10fd7e..f7566b9 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1251,7 +1251,7 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 			u16 error_code,
 			int no_decode)
 {
-	int r = 0;
+	int r;
 
 	vcpu->mmio_fault_cr2 = cr2;
 	kvm_x86_ops->cache_regs(vcpu);
@@ -1294,10 +1294,14 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 					get_segment_base(vcpu, VCPU_SREG_FS);
 
 		r = x86_decode_insn(&vcpu->emulate_ctxt, &emulate_ops);
+		if (r)  {
+			if (kvm_mmu_unprotect_page_virt(vcpu, cr2))
+				return EMULATE_DONE;
+			return EMULATE_FAIL;
+		}
 	}
 
-	if (r == 0)
-		r = x86_emulate_insn(&vcpu->emulate_ctxt, &emulate_ops);
+	r = x86_emulate_insn(&vcpu->emulate_ctxt, &emulate_ops);
 
 	if (vcpu->pio.string)
 		return EMULATE_DO_MMIO;
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 17/50] KVM: x86 emulator: Any legacy prefix after a REX prefix nullifies its effect
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (15 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 16/50] KVM: Purify x86_decode_insn() error case management Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 18/50] KVM: VMX: Don't clear the vmcs if the vcpu is not loaded on any processor Avi Kivity
                     ` (32 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Laurent Vivier

From: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>

This patch changes the handling of the REX prefix to match the behavior
I saw in Xen 3.1.  In Xen, this change was introduced by Jan Beulich.

http://lists.xensource.com/archives/html/xen-changelog/2007-01/msg00081.html

Signed-off-by: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/x86_emulate.c |   24 +++++++++++++++---------
 1 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index fe50317..e6b213b 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -521,7 +521,6 @@ x86_decode_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 {
 	struct decode_cache *c = &ctxt->decode;
 	u8 sib, rex_prefix = 0;
-	unsigned int i;
 	int rc = 0;
 	int mode = ctxt->mode;
 	int index_reg = 0, base_reg = 0, scale, rip_relative = 0;
@@ -551,7 +550,7 @@ x86_decode_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 	}
 
 	/* Legacy prefixes. */
-	for (i = 0; i < 8; i++) {
+	for (;;) {
 		switch (c->b = insn_fetch(u8, 1, c->eip)) {
 		case 0x66:	/* operand-size override */
 			c->op_bytes ^= 6;	/* switch between 2/4 bytes */
@@ -582,6 +581,11 @@ x86_decode_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 		case 0x36:	/* SS override */
 			c->override_base = &ctxt->ss_base;
 			break;
+		case 0x40 ... 0x4f: /* REX */
+			if (mode != X86EMUL_MODE_PROT64)
+				goto done_prefixes;
+			rex_prefix = c->b;
+			continue;
 		case 0xf0:	/* LOCK */
 			c->lock_prefix = 1;
 			break;
@@ -592,19 +596,21 @@ x86_decode_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 		default:
 			goto done_prefixes;
 		}
+
+		/* Any legacy prefix after a REX prefix nullifies its effect. */
+
+		rex_prefix = 0;
 	}
 
 done_prefixes:
 
 	/* REX prefix. */
-	if ((mode == X86EMUL_MODE_PROT64) && ((c->b & 0xf0) == 0x40)) {
-		rex_prefix = c->b;
-		if (c->b & 8)
+	if (rex_prefix) {
+		if (rex_prefix & 8)
 			c->op_bytes = 8;	/* REX.W */
-		c->modrm_reg = (c->b & 4) << 1;	/* REX.R */
-		index_reg = (c->b & 2) << 2; /* REX.X */
-		c->modrm_rm = base_reg = (c->b & 1) << 3; /* REG.B */
-		c->b = insn_fetch(u8, 1, c->eip);
+		c->modrm_reg = (rex_prefix & 4) << 1;	/* REX.R */
+		index_reg = (rex_prefix & 2) << 2; /* REX.X */
+		c->modrm_rm = base_reg = (rex_prefix & 1) << 3; /* REG.B */
 	}
 
 	/* Opcode byte(s). */
-- 
1.5.3.7
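
The rule implemented by this patch can be sketched as a standalone prefix scan. This is an illustration under assumptions, not the decoder itself: `effective_rex()` is a hypothetical helper, it assumes 64-bit mode throughout, and it only tracks which REX byte (if any) is still in effect when the opcode is reached.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scan the prefix bytes of one instruction in
 * 64-bit mode. Returns the REX byte still in effect (0 if none) and
 * stores the index of the first non-prefix byte in *opcode_pos. */
static uint8_t effective_rex(const uint8_t *insn, size_t len,
			     size_t *opcode_pos)
{
	uint8_t rex = 0;
	size_t i;

	assert(insn != NULL && opcode_pos != NULL);
	for (i = 0; i < len; i++) {
		uint8_t b = insn[i];

		if (b >= 0x40 && b <= 0x4f) {	/* REX: remember, keep going */
			rex = b;
			continue;
		}
		switch (b) {
		case 0x66: case 0x67:		/* operand/address size */
		case 0x26: case 0x2e: case 0x36:
		case 0x3e: case 0x64: case 0x65:	/* segment overrides */
		case 0xf0: case 0xf2: case 0xf3:	/* LOCK, REPNE, REP */
			rex = 0;	/* legacy prefix after REX cancels it */
			break;
		default:
			*opcode_pos = i;	/* first opcode byte */
			return rex;
		}
	}
	*opcode_pos = len;
	return rex;
}
```

For the byte sequence 48 66 89 d8 the scan returns 0, because the 0x66 prefix follows the REX byte; for 66 48 89 d8 it returns 0x48.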



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 18/50] KVM: VMX: Don't clear the vmcs if the vcpu is not loaded on any processor
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (16 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 17/50] KVM: x86 emulator: Any legacy prefix after a REX prefix nullifies its effect Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 19/50] KVM: VMX: Simplify vcpu_clear() Avi Kivity
                     ` (31 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Noted by Eddie Dong.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/vmx.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index d32e63d..8929575 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -225,7 +225,9 @@ static void __vcpu_clear(void *arg)
 
 static void vcpu_clear(struct vcpu_vmx *vmx)
 {
-	if (vmx->vcpu.cpu != raw_smp_processor_id() && vmx->vcpu.cpu != -1)
+	if (vmx->vcpu.cpu == -1)
+		return;
+	if (vmx->vcpu.cpu != raw_smp_processor_id())
 		smp_call_function_single(vmx->vcpu.cpu, __vcpu_clear,
 					 vmx, 0, 1);
 	else
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 19/50] KVM: VMX: Simplify vcpu_clear()
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (17 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 18/50] KVM: VMX: Don't clear the vmcs if the vcpu is not loaded on any processor Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 20/50] KVM: Remove the usage of page->private field by rmap Avi Kivity
                     ` (30 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Now that smp_call_function_single() knows how to call a function on the
current cpu, there's no need to check explicitly.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/vmx.c |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 8929575..c87f52b 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -227,11 +227,7 @@ static void vcpu_clear(struct vcpu_vmx *vmx)
 {
 	if (vmx->vcpu.cpu == -1)
 		return;
-	if (vmx->vcpu.cpu != raw_smp_processor_id())
-		smp_call_function_single(vmx->vcpu.cpu, __vcpu_clear,
-					 vmx, 0, 1);
-	else
-		__vcpu_clear(vmx);
+	smp_call_function_single(vmx->vcpu.cpu, __vcpu_clear, vmx, 0, 1);
 	vmx->launched = 0;
 }
 
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 20/50] KVM: Remove the usage of page->private field by rmap
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (18 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 19/50] KVM: VMX: Simplify vcpu_clear() Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 21/50] KVM: Add general accessors to read and write guest memory Avi Kivity
                     ` (29 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>

When kvm uses user-allocated pages in the future for the guest, we won't
be able to use page->private for rmap, since page->private is reserved for
the filesystem.  So we move the rmap base pointers to the memory slot.

A side effect of this is that we need to store the gfn of each gpte in
the shadow pages, since the memory slot is addressed by gfn, instead of
hfn like struct page.

Signed-off-by: Izik Eidus <izik-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h         |    6 ++-
 drivers/kvm/kvm_main.c    |   11 +++-
 drivers/kvm/mmu.c         |  122 ++++++++++++++++++++++++++-------------------
 drivers/kvm/paging_tmpl.h |    3 +-
 4 files changed, 86 insertions(+), 56 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 08ffc82..80cfb99 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -126,6 +126,8 @@ struct kvm_mmu_page {
 	union kvm_mmu_page_role role;
 
 	u64 *spt;
+	/* hold the gfn of each spte inside spt */
+	gfn_t *gfns;
 	unsigned long slot_bitmap; /* One bit set per slot which has memory
 				    * in this shadow page.
 				    */
@@ -159,7 +161,7 @@ struct kvm_mmu {
 	u64 *pae_root;
 };
 
-#define KVM_NR_MEM_OBJS 20
+#define KVM_NR_MEM_OBJS 40
 
 struct kvm_mmu_memory_cache {
 	int nobjs;
@@ -402,6 +404,7 @@ struct kvm_memory_slot {
 	unsigned long npages;
 	unsigned long flags;
 	struct page **phys_mem;
+	unsigned long *rmap;
 	unsigned long *dirty_bitmap;
 };
 
@@ -554,6 +557,7 @@ struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva);
 
 extern hpa_t bad_page_address;
 
+gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn);
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
 struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn);
 void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index f7566b9..ac563fc 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -309,6 +309,8 @@ static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
 					__free_page(free->phys_mem[i]);
 			vfree(free->phys_mem);
 		}
+	if (!dont || free->rmap != dont->rmap)
+		vfree(free->rmap);
 
 	if (!dont || free->dirty_bitmap != dont->dirty_bitmap)
 		vfree(free->dirty_bitmap);
@@ -719,13 +721,18 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 		if (!new.phys_mem)
 			goto out_unlock;
 
+		new.rmap = vmalloc(npages * sizeof(struct page*));
+
+		if (!new.rmap)
+			goto out_unlock;
+
 		memset(new.phys_mem, 0, npages * sizeof(struct page *));
+		memset(new.rmap, 0, npages * sizeof(*new.rmap));
 		for (i = 0; i < npages; ++i) {
 			new.phys_mem[i] = alloc_page(GFP_HIGHUSER
 						     | __GFP_ZERO);
 			if (!new.phys_mem[i])
 				goto out_unlock;
-			set_page_private(new.phys_mem[i],0);
 		}
 	}
 
@@ -909,7 +916,7 @@ static int kvm_vm_ioctl_set_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
 	return r;
 }
 
-static gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
+gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
 {
 	int i;
 	struct kvm_mem_alias *alias;
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index d347e89..72757db 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -276,7 +276,7 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu)
 				   rmap_desc_cache, 1);
 	if (r)
 		goto out;
-	r = mmu_topup_memory_cache_page(&vcpu->mmu_page_cache, 4);
+	r = mmu_topup_memory_cache_page(&vcpu->mmu_page_cache, 8);
 	if (r)
 		goto out;
 	r = mmu_topup_memory_cache(&vcpu->mmu_page_header_cache,
@@ -327,35 +327,52 @@ static void mmu_free_rmap_desc(struct kvm_rmap_desc *rd)
 }
 
 /*
+ * Take gfn and return the reverse mapping to it.
+ * Note: gfn must be unaliased before this function is called
+ */
+
+static unsigned long *gfn_to_rmap(struct kvm *kvm, gfn_t gfn)
+{
+	struct kvm_memory_slot *slot;
+
+	slot = gfn_to_memslot(kvm, gfn);
+	return &slot->rmap[gfn - slot->base_gfn];
+}
+
+/*
  * Reverse mapping data structures:
  *
- * If page->private bit zero is zero, then page->private points to the
- * shadow page table entry that points to page_address(page).
+ * If rmapp bit zero is zero, then rmapp points to the shadow page table entry
+ * that points to page_address(page).
  *
- * If page->private bit zero is one, (then page->private & ~1) points
- * to a struct kvm_rmap_desc containing more mappings.
+ * If rmapp bit zero is one, (then rmapp & ~1) points to a struct kvm_rmap_desc
+ * containing more mappings.
  */
-static void rmap_add(struct kvm_vcpu *vcpu, u64 *spte)
+static void rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
 {
-	struct page *page;
+	struct kvm_mmu_page *page;
 	struct kvm_rmap_desc *desc;
+	unsigned long *rmapp;
 	int i;
 
 	if (!is_rmap_pte(*spte))
 		return;
-	page = pfn_to_page((*spte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT);
-	if (!page_private(page)) {
+	gfn = unalias_gfn(vcpu->kvm, gfn);
+	page = page_header(__pa(spte));
+	page->gfns[spte - page->spt] = gfn;
+	rmapp = gfn_to_rmap(vcpu->kvm, gfn);
+	if (!*rmapp) {
 		rmap_printk("rmap_add: %p %llx 0->1\n", spte, *spte);
-		set_page_private(page,(unsigned long)spte);
-	} else if (!(page_private(page) & 1)) {
+		*rmapp = (unsigned long)spte;
+	} else if (!(*rmapp & 1)) {
 		rmap_printk("rmap_add: %p %llx 1->many\n", spte, *spte);
 		desc = mmu_alloc_rmap_desc(vcpu);
-		desc->shadow_ptes[0] = (u64 *)page_private(page);
+		desc->shadow_ptes[0] = (u64 *)*rmapp;
 		desc->shadow_ptes[1] = spte;
-		set_page_private(page,(unsigned long)desc | 1);
+		*rmapp = (unsigned long)desc | 1;
 	} else {
 		rmap_printk("rmap_add: %p %llx many->many\n", spte, *spte);
-		desc = (struct kvm_rmap_desc *)(page_private(page) & ~1ul);
+		desc = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
 		while (desc->shadow_ptes[RMAP_EXT-1] && desc->more)
 			desc = desc->more;
 		if (desc->shadow_ptes[RMAP_EXT-1]) {
@@ -368,7 +385,7 @@ static void rmap_add(struct kvm_vcpu *vcpu, u64 *spte)
 	}
 }
 
-static void rmap_desc_remove_entry(struct page *page,
+static void rmap_desc_remove_entry(unsigned long *rmapp,
 				   struct kvm_rmap_desc *desc,
 				   int i,
 				   struct kvm_rmap_desc *prev_desc)
@@ -382,44 +399,46 @@ static void rmap_desc_remove_entry(struct page *page,
 	if (j != 0)
 		return;
 	if (!prev_desc && !desc->more)
-		set_page_private(page,(unsigned long)desc->shadow_ptes[0]);
+		*rmapp = (unsigned long)desc->shadow_ptes[0];
 	else
 		if (prev_desc)
 			prev_desc->more = desc->more;
 		else
-			set_page_private(page,(unsigned long)desc->more | 1);
+			*rmapp = (unsigned long)desc->more | 1;
 	mmu_free_rmap_desc(desc);
 }
 
-static void rmap_remove(u64 *spte)
+static void rmap_remove(struct kvm *kvm, u64 *spte)
 {
-	struct page *page;
 	struct kvm_rmap_desc *desc;
 	struct kvm_rmap_desc *prev_desc;
+	struct kvm_mmu_page *page;
+	unsigned long *rmapp;
 	int i;
 
 	if (!is_rmap_pte(*spte))
 		return;
-	page = pfn_to_page((*spte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT);
-	if (!page_private(page)) {
+	page = page_header(__pa(spte));
+	rmapp = gfn_to_rmap(kvm, page->gfns[spte - page->spt]);
+	if (!*rmapp) {
 		printk(KERN_ERR "rmap_remove: %p %llx 0->BUG\n", spte, *spte);
 		BUG();
-	} else if (!(page_private(page) & 1)) {
+	} else if (!(*rmapp & 1)) {
 		rmap_printk("rmap_remove:  %p %llx 1->0\n", spte, *spte);
-		if ((u64 *)page_private(page) != spte) {
+		if ((u64 *)*rmapp != spte) {
 			printk(KERN_ERR "rmap_remove:  %p %llx 1->BUG\n",
 			       spte, *spte);
 			BUG();
 		}
-		set_page_private(page,0);
+		*rmapp = 0;
 	} else {
 		rmap_printk("rmap_remove:  %p %llx many->many\n", spte, *spte);
-		desc = (struct kvm_rmap_desc *)(page_private(page) & ~1ul);
+		desc = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
 		prev_desc = NULL;
 		while (desc) {
 			for (i = 0; i < RMAP_EXT && desc->shadow_ptes[i]; ++i)
 				if (desc->shadow_ptes[i] == spte) {
-					rmap_desc_remove_entry(page,
+					rmap_desc_remove_entry(rmapp,
 							       desc, i,
 							       prev_desc);
 					return;
@@ -433,28 +452,25 @@ static void rmap_remove(u64 *spte)
 
 static void rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)
 {
-	struct kvm *kvm = vcpu->kvm;
-	struct page *page;
 	struct kvm_rmap_desc *desc;
+	unsigned long *rmapp;
 	u64 *spte;
 
-	page = gfn_to_page(kvm, gfn);
-	BUG_ON(!page);
+	gfn = unalias_gfn(vcpu->kvm, gfn);
+	rmapp = gfn_to_rmap(vcpu->kvm, gfn);
 
-	while (page_private(page)) {
-		if (!(page_private(page) & 1))
-			spte = (u64 *)page_private(page);
+	while (*rmapp) {
+		if (!(*rmapp & 1))
+			spte = (u64 *)*rmapp;
 		else {
-			desc = (struct kvm_rmap_desc *)(page_private(page) & ~1ul);
+			desc = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
 			spte = desc->shadow_ptes[0];
 		}
 		BUG_ON(!spte);
-		BUG_ON((*spte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT
-		       != page_to_pfn(page));
 		BUG_ON(!(*spte & PT_PRESENT_MASK));
 		BUG_ON(!(*spte & PT_WRITABLE_MASK));
 		rmap_printk("rmap_write_protect: spte %p %llx\n", spte, *spte);
-		rmap_remove(spte);
+		rmap_remove(vcpu->kvm, spte);
 		set_shadow_pte(spte, *spte & ~PT_WRITABLE_MASK);
 		kvm_flush_remote_tlbs(vcpu->kvm);
 	}
@@ -482,6 +498,7 @@ static void kvm_mmu_free_page(struct kvm *kvm,
 	ASSERT(is_empty_shadow_page(page_head->spt));
 	list_del(&page_head->link);
 	__free_page(virt_to_page(page_head->spt));
+	__free_page(virt_to_page(page_head->gfns));
 	kfree(page_head);
 	++kvm->n_free_mmu_pages;
 }
@@ -502,6 +519,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 	page = mmu_memory_cache_alloc(&vcpu->mmu_page_header_cache,
 				      sizeof *page);
 	page->spt = mmu_memory_cache_alloc(&vcpu->mmu_page_cache, PAGE_SIZE);
+	page->gfns = mmu_memory_cache_alloc(&vcpu->mmu_page_cache, PAGE_SIZE);
 	set_page_private(virt_to_page(page->spt), (unsigned long)page);
 	list_add(&page->link, &vcpu->kvm->active_mmu_pages);
 	ASSERT(is_empty_shadow_page(page->spt));
@@ -667,7 +685,7 @@ static void kvm_mmu_page_unlink_children(struct kvm *kvm,
 	if (page->role.level == PT_PAGE_TABLE_LEVEL) {
 		for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
 			if (is_shadow_present_pte(pt[i]))
-				rmap_remove(&pt[i]);
+				rmap_remove(kvm, &pt[i]);
 			pt[i] = shadow_trap_nonpresent_pte;
 		}
 		kvm_flush_remote_tlbs(kvm);
@@ -832,7 +850,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
 			page_header_update_slot(vcpu->kvm, table, v);
 			table[index] = p | PT_PRESENT_MASK | PT_WRITABLE_MASK |
 								PT_USER_MASK;
-			rmap_add(vcpu, &table[index]);
+			rmap_add(vcpu, &table[index], v >> PAGE_SHIFT);
 			return 0;
 		}
 
@@ -1123,7 +1141,7 @@ static void mmu_pte_write_zap_pte(struct kvm_vcpu *vcpu,
 	pte = *spte;
 	if (is_shadow_present_pte(pte)) {
 		if (page->role.level == PT_PAGE_TABLE_LEVEL)
-			rmap_remove(spte);
+			rmap_remove(vcpu->kvm, spte);
 		else {
 			child = page_header(pte & PT64_BASE_ADDR_MASK);
 			mmu_page_remove_parent_pte(child, spte);
@@ -1340,7 +1358,7 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
 		for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
 			/* avoid RMW */
 			if (pt[i] & PT_WRITABLE_MASK) {
-				rmap_remove(&pt[i]);
+				rmap_remove(kvm, &pt[i]);
 				pt[i] &= ~PT_WRITABLE_MASK;
 			}
 	}
@@ -1470,15 +1488,15 @@ static int count_rmaps(struct kvm_vcpu *vcpu)
 		struct kvm_rmap_desc *d;
 
 		for (j = 0; j < m->npages; ++j) {
-			struct page *page = m->phys_mem[j];
+			unsigned long *rmapp = &m->rmap[j];
 
-			if (!page->private)
+			if (!*rmapp)
 				continue;
-			if (!(page->private & 1)) {
+			if (!(*rmapp & 1)) {
 				++nmaps;
 				continue;
 			}
-			d = (struct kvm_rmap_desc *)(page->private & ~1ul);
+			d = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
 			while (d) {
 				for (k = 0; k < RMAP_EXT; ++k)
 					if (d->shadow_ptes[k])
@@ -1530,18 +1548,18 @@ static void audit_rmap(struct kvm_vcpu *vcpu)
 static void audit_write_protection(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu_page *page;
+	struct kvm_memory_slot *slot;
+	unsigned long *rmapp;
+	gfn_t gfn;
 
 	list_for_each_entry(page, &vcpu->kvm->active_mmu_pages, link) {
-		hfn_t hfn;
-		struct page *pg;
-
 		if (page->role.metaphysical)
 			continue;
 
-		hfn = gpa_to_hpa(vcpu, (gpa_t)page->gfn << PAGE_SHIFT)
-			>> PAGE_SHIFT;
-		pg = pfn_to_page(hfn);
-		if (pg->private)
+		slot = gfn_to_memslot(vcpu->kvm, page->gfn);
+		gfn = unalias_gfn(vcpu->kvm, page->gfn);
+		rmapp = &slot->rmap[gfn - slot->base_gfn];
+		if (*rmapp)
 			printk(KERN_ERR "%s: (%s) shadow page has writable"
 			       " mappings: gfn %lx role %x\n",
 			       __FUNCTION__, audit_msg, page->gfn,
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index be0f852..fbe595f 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -295,7 +295,8 @@ unshadowed:
 	set_shadow_pte(shadow_pte, spte);
 	page_header_update_slot(vcpu->kvm, shadow_pte, gaddr);
 	if (!was_rmapped)
-		rmap_add(vcpu, shadow_pte);
+		rmap_add(vcpu, shadow_pte, (gaddr & PT64_BASE_ADDR_MASK)
+			 >> PAGE_SHIFT);
 	if (!ptwrite || !*ptwrite)
 		vcpu->last_pte_updated = shadow_pte;
 }
-- 
1.5.3.7
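
The per-slot rmap word described in this patch can be sketched in user space as follows. This is a minimal model, assuming RMAP_EXT is 4 and using simplified stand-in types (`struct memslot`, `struct rmap_desc`, `rmap_count()` are hypothetical names for this note); allocation goes through calloc() rather than the mmu memory caches.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define RMAP_EXT 4	/* entries per descriptor, assumed for this sketch */

struct rmap_desc {
	uint64_t *shadow_ptes[RMAP_EXT];
	struct rmap_desc *more;
};

struct memslot {	/* simplified stand-in for kvm_memory_slot */
	uint64_t base_gfn;
	unsigned long npages;
	uintptr_t *rmap;	/* one tagged word per guest page */
};

/* The slot is indexed by gfn relative to its base. */
static uintptr_t *gfn_to_rmap(struct memslot *slot, uint64_t gfn)
{
	assert(gfn >= slot->base_gfn && gfn - slot->base_gfn < slot->npages);
	return &slot->rmap[gfn - slot->base_gfn];
}

static void rmap_add(uintptr_t *rmapp, uint64_t *spte)
{
	struct rmap_desc *desc;
	int i;

	assert(((uintptr_t)spte & 1) == 0);	/* bit 0 is the tag bit */
	if (!*rmapp) {			/* 0 -> 1: store the spte itself */
		*rmapp = (uintptr_t)spte;
	} else if (!(*rmapp & 1)) {	/* 1 -> many: switch to a descriptor */
		desc = calloc(1, sizeof(*desc));
		if (!desc)
			abort();
		desc->shadow_ptes[0] = (uint64_t *)*rmapp;
		desc->shadow_ptes[1] = spte;
		*rmapp = (uintptr_t)desc | 1;
	} else {			/* many -> many: append to the chain */
		desc = (struct rmap_desc *)(*rmapp & ~(uintptr_t)1);
		while (desc->shadow_ptes[RMAP_EXT - 1] && desc->more)
			desc = desc->more;
		if (desc->shadow_ptes[RMAP_EXT - 1]) {
			desc->more = calloc(1, sizeof(*desc));
			if (!desc->more)
				abort();
			desc = desc->more;
		}
		for (i = 0; desc->shadow_ptes[i]; i++)
			;
		desc->shadow_ptes[i] = spte;
	}
}

/* Number of sptes recorded in one rmap word. */
static int rmap_count(uintptr_t rmap)
{
	struct rmap_desc *desc;
	int i, n = 0;

	if (!rmap)
		return 0;
	if (!(rmap & 1))
		return 1;
	desc = (struct rmap_desc *)(rmap & ~(uintptr_t)1);
	for (; desc; desc = desc->more)
		for (i = 0; i < RMAP_EXT && desc->shadow_ptes[i]; i++)
			n++;
	return n;
}
```

The tag bit works because both sptes and descriptors are at least 2-byte aligned, so bit zero of a valid pointer is always clear.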



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 21/50] KVM: Add general accessors to read and write guest memory
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (19 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 20/50] KVM: Remove the usage of page->private field by rmap Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 22/50] KVM: Allow dynamic allocation of the mmu shadow cache size Avi Kivity
                     ` (28 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Izik Eidus (none)

From: Izik Eidus <izik@Home1.(none)>

Signed-off-by: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |    9 +++
 drivers/kvm/kvm_main.c |  160 +++++++++++++++++++++++++++++++++++++++---------
 drivers/kvm/vmx.c      |   43 ++++++-------
 3 files changed, 158 insertions(+), 54 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 80cfb99..1965438 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -559,6 +559,15 @@ extern hpa_t bad_page_address;
 
 gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn);
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
+int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
+			int len);
+int kvm_read_guest(struct kvm *kvm, gpa_t gpa, void *data, unsigned long len);
+int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data,
+			 int offset, int len);
+int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
+		    unsigned long len);
+int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len);
+int kvm_clear_guest(struct kvm *kvm, gpa_t gpa, unsigned long len);
 struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn);
 void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
 
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index ac563fc..3d1972e 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -400,22 +400,16 @@ static int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
 	gfn_t pdpt_gfn = cr3 >> PAGE_SHIFT;
 	unsigned offset = ((cr3 & (PAGE_SIZE-1)) >> 5) << 2;
 	int i;
-	u64 *pdpt;
 	int ret;
-	struct page *page;
 	u64 pdpte[ARRAY_SIZE(vcpu->pdptrs)];
 
 	mutex_lock(&vcpu->kvm->lock);
-	page = gfn_to_page(vcpu->kvm, pdpt_gfn);
-	if (!page) {
+	ret = kvm_read_guest_page(vcpu->kvm, pdpt_gfn, pdpte,
+				  offset * sizeof(u64), sizeof(pdpte));
+	if (ret < 0) {
 		ret = 0;
 		goto out;
 	}
-
-	pdpt = kmap_atomic(page, KM_USER0);
-	memcpy(pdpte, pdpt+offset, sizeof(pdpte));
-	kunmap_atomic(pdpt, KM_USER0);
-
 	for (i = 0; i < ARRAY_SIZE(pdpte); ++i) {
 		if ((pdpte[i] & 1) && (pdpte[i] & 0xfffffff0000001e6ull)) {
 			ret = 0;
@@ -962,6 +956,127 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 }
 EXPORT_SYMBOL_GPL(gfn_to_page);
 
+static int next_segment(unsigned long len, int offset)
+{
+	if (len > PAGE_SIZE - offset)
+		return PAGE_SIZE - offset;
+	else
+		return len;
+}
+
+int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
+			int len)
+{
+	void *page_virt;
+	struct page *page;
+
+	page = gfn_to_page(kvm, gfn);
+	if (!page)
+		return -EFAULT;
+	page_virt = kmap_atomic(page, KM_USER0);
+
+	memcpy(data, page_virt + offset, len);
+
+	kunmap_atomic(page_virt, KM_USER0);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_read_guest_page);
+
+int kvm_read_guest(struct kvm *kvm, gpa_t gpa, void *data, unsigned long len)
+{
+	gfn_t gfn = gpa >> PAGE_SHIFT;
+	int seg;
+	int offset = offset_in_page(gpa);
+	int ret;
+
+	while ((seg = next_segment(len, offset)) != 0) {
+		ret = kvm_read_guest_page(kvm, gfn, data, offset, seg);
+		if (ret < 0)
+			return ret;
+		offset = 0;
+		len -= seg;
+		data += seg;
+		++gfn;
+	}
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_read_guest);
+
+int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data,
+			 int offset, int len)
+{
+	void *page_virt;
+	struct page *page;
+
+	page = gfn_to_page(kvm, gfn);
+	if (!page)
+		return -EFAULT;
+	page_virt = kmap_atomic(page, KM_USER0);
+
+	memcpy(page_virt + offset, data, len);
+
+	kunmap_atomic(page_virt, KM_USER0);
+	mark_page_dirty(kvm, gfn);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_write_guest_page);
+
+int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
+		    unsigned long len)
+{
+	gfn_t gfn = gpa >> PAGE_SHIFT;
+	int seg;
+	int offset = offset_in_page(gpa);
+	int ret;
+
+	while ((seg = next_segment(len, offset)) != 0) {
+		ret = kvm_write_guest_page(kvm, gfn, data, offset, seg);
+		if (ret < 0)
+			return ret;
+		offset = 0;
+		len -= seg;
+		data += seg;
+		++gfn;
+	}
+	return 0;
+}
+
+int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len)
+{
+	void *page_virt;
+	struct page *page;
+
+	page = gfn_to_page(kvm, gfn);
+	if (!page)
+		return -EFAULT;
+	page_virt = kmap_atomic(page, KM_USER0);
+
+	memset(page_virt + offset, 0, len);
+
+	kunmap_atomic(page_virt, KM_USER0);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_clear_guest_page);
+
+int kvm_clear_guest(struct kvm *kvm, gpa_t gpa, unsigned long len)
+{
+	gfn_t gfn = gpa >> PAGE_SHIFT;
+	int seg;
+	int offset = offset_in_page(gpa);
+	int ret;
+
+	while ((seg = next_segment(len, offset)) != 0) {
+		ret = kvm_clear_guest_page(kvm, gfn, offset, seg);
+		if (ret < 0)
+			return ret;
+		offset = 0;
+		len -= seg;
+		++gfn;
+	}
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_clear_guest);
+
 /* WARNING: Does not work on aliased pages. */
 void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
 {
@@ -988,21 +1103,13 @@ int emulator_read_std(unsigned long addr,
 		gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
 		unsigned offset = addr & (PAGE_SIZE-1);
 		unsigned tocopy = min(bytes, (unsigned)PAGE_SIZE - offset);
-		unsigned long pfn;
-		struct page *page;
-		void *page_virt;
+		int ret;
 
 		if (gpa == UNMAPPED_GVA)
 			return X86EMUL_PROPAGATE_FAULT;
-		pfn = gpa >> PAGE_SHIFT;
-		page = gfn_to_page(vcpu->kvm, pfn);
-		if (!page)
+		ret = kvm_read_guest(vcpu->kvm, gpa, data, tocopy);
+		if (ret < 0)
 			return X86EMUL_UNHANDLEABLE;
-		page_virt = kmap_atomic(page, KM_USER0);
-
-		memcpy(data, page_virt + offset, tocopy);
-
-		kunmap_atomic(page_virt, KM_USER0);
 
 		bytes -= tocopy;
 		data += tocopy;
@@ -1095,19 +1202,12 @@ static int emulator_read_emulated(unsigned long addr,
 static int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
 			       const void *val, int bytes)
 {
-	struct page *page;
-	void *virt;
+	int ret;
 
-	if (((gpa + bytes - 1) >> PAGE_SHIFT) != (gpa >> PAGE_SHIFT))
-		return 0;
-	page = gfn_to_page(vcpu->kvm, gpa >> PAGE_SHIFT);
-	if (!page)
+	ret = kvm_write_guest(vcpu->kvm, gpa, val, bytes);
+	if (ret < 0)
 		return 0;
-	mark_page_dirty(vcpu->kvm, gpa >> PAGE_SHIFT);
-	virt = kmap_atomic(page, KM_USER0);
 	kvm_mmu_pte_write(vcpu, gpa, val, bytes);
-	memcpy(virt + offset_in_page(gpa), val, bytes);
-	kunmap_atomic(virt, KM_USER0);
 	return 1;
 }
 
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index c87f52b..ce15e51 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -1387,33 +1387,28 @@ static void vmx_set_gdt(struct kvm_vcpu *vcpu, struct descriptor_table *dt)
 
 static int init_rmode_tss(struct kvm* kvm)
 {
-	struct page *p1, *p2, *p3;
 	gfn_t fn = rmode_tss_base(kvm) >> PAGE_SHIFT;
-	char *page;
-
-	p1 = gfn_to_page(kvm, fn++);
-	p2 = gfn_to_page(kvm, fn++);
-	p3 = gfn_to_page(kvm, fn);
+	u16 data = 0;
+	int r;
 
-	if (!p1 || !p2 || !p3) {
-		kvm_printf(kvm,"%s: gfn_to_page failed\n", __FUNCTION__);
+	r = kvm_clear_guest_page(kvm, fn, 0, PAGE_SIZE);
+	if (r < 0)
+		return 0;
+	data = TSS_BASE_SIZE + TSS_REDIRECTION_SIZE;
+	r = kvm_write_guest_page(kvm, fn++, &data, 0x66, sizeof(u16));
+	if (r < 0)
+		return 0;
+	r = kvm_clear_guest_page(kvm, fn++, 0, PAGE_SIZE);
+	if (r < 0)
+		return 0;
+	r = kvm_clear_guest_page(kvm, fn, 0, PAGE_SIZE);
+	if (r < 0)
+		return 0;
+	data = ~0;
+	r = kvm_write_guest_page(kvm, fn, &data, RMODE_TSS_SIZE - 2 * PAGE_SIZE - 1,
+			sizeof(u8));
+	if (r < 0)
 		return 0;
-	}
-
-	page = kmap_atomic(p1, KM_USER0);
-	clear_page(page);
-	*(u16*)(page + 0x66) = TSS_BASE_SIZE + TSS_REDIRECTION_SIZE;
-	kunmap_atomic(page, KM_USER0);
-
-	page = kmap_atomic(p2, KM_USER0);
-	clear_page(page);
-	kunmap_atomic(page, KM_USER0);
-
-	page = kmap_atomic(p3, KM_USER0);
-	clear_page(page);
-	*(page + RMODE_TSS_SIZE - 2 * PAGE_SIZE - 1) = ~0;
-	kunmap_atomic(page, KM_USER0);
-
 	return 1;
 }
 
-- 
1.5.3.7




* [PATCH 22/50] KVM: Allow dynamic allocation of the mmu shadow cache size
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (20 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 21/50] KVM: Add general accessors to read and write guest memory Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 23/50] KVM: Add kvm_free_lapic() to pair with kvm_create_lapic() Avi Kivity
                     ` (27 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>

The user is now able to set how many mmu pages will be allocated to the guest.
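
For readers following the arithmetic in the kvm_main.c hunk below, here is a
minimal sketch of the default sizing that applies while no explicit value has
been requested (n_requested_mmu_pages is zero). The two constants are copied
from the kvm.h hunk of this patch; the helper names slot_mmu_pages,
budget_after_add and budget_after_remove are hypothetical and exist only for
illustration. It is a userspace model of the calculation, not the kernel code.

```c
/* Constants copied from the kvm.h hunk of this patch. */
#define KVM_PERMILLE_MMU_PAGES 20
#define KVM_MIN_ALLOC_MMU_PAGES 64

/* Hypothetical helper: shadow pages credited for a slot of npages guest pages. */
static unsigned int slot_mmu_pages(unsigned long npages)
{
	return npages * KVM_PERMILLE_MMU_PAGES / 1000;
}

/* Hypothetical helper: budget after a memory slot of npages is added. */
static unsigned int budget_after_add(unsigned int n_alloc, unsigned long npages)
{
	return n_alloc + slot_mmu_pages(npages);
}

/*
 * Hypothetical helper: budget after a slot of old_npages is removed.
 * As in the patch, the result is clamped to KVM_MIN_ALLOC_MMU_PAGES.
 */
static unsigned int budget_after_remove(unsigned int n_alloc,
					unsigned long old_npages)
{
	unsigned int n = n_alloc - slot_mmu_pages(old_npages);

	return n > KVM_MIN_ALLOC_MMU_PAGES ? n : KVM_MIN_ALLOC_MMU_PAGES;
}
```

Under these assumptions a 1 GiB slot (262144 pages of 4 KiB) is credited
5242 shadow pages, i.e. 20 per thousand guest pages, rounded down.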

Signed-off-by: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |    7 ++++++-
 drivers/kvm/kvm_main.c |   47 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/kvm/mmu.c      |   40 ++++++++++++++++++++++++++++++++++++++--
 include/linux/kvm.h    |    3 +++
 4 files changed, 94 insertions(+), 3 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 1965438..9f10c37 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -40,6 +40,8 @@
 #define KVM_MAX_VCPUS 4
 #define KVM_ALIAS_SLOTS 4
 #define KVM_MEMORY_SLOTS 8
+#define KVM_PERMILLE_MMU_PAGES 20
+#define KVM_MIN_ALLOC_MMU_PAGES 64
 #define KVM_NUM_MMU_PAGES 1024
 #define KVM_MIN_FREE_MMU_PAGES 5
 #define KVM_REFILL_PAGES 25
@@ -418,7 +420,9 @@ struct kvm {
 	 * Hash table of struct kvm_mmu_page.
 	 */
 	struct list_head active_mmu_pages;
-	int n_free_mmu_pages;
+	unsigned int n_free_mmu_pages;
+	unsigned int n_requested_mmu_pages;
+	unsigned int n_alloc_mmu_pages;
 	struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
 	struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
 	unsigned long rmap_overflow;
@@ -547,6 +551,7 @@ void kvm_mmu_set_nonpresent_ptes(u64 trap_pte, u64 notrap_pte);
 int kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
 void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
 void kvm_mmu_zap_all(struct kvm *kvm);
+void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
 
 hpa_t gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa);
 #define HPA_MSB ((sizeof(hpa_t) * 8) - 1)
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 3d1972e..d220e63 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -743,6 +743,24 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 	if (mem->slot >= kvm->nmemslots)
 		kvm->nmemslots = mem->slot + 1;
 
+	if (!kvm->n_requested_mmu_pages) {
+		unsigned int n_pages;
+
+		if (npages) {
+			n_pages = npages * KVM_PERMILLE_MMU_PAGES / 1000;
+			kvm_mmu_change_mmu_pages(kvm, kvm->n_alloc_mmu_pages +
+						 n_pages);
+		} else {
+			unsigned int nr_mmu_pages;
+
+			n_pages = old.npages * KVM_PERMILLE_MMU_PAGES / 1000;
+			nr_mmu_pages = kvm->n_alloc_mmu_pages - n_pages;
+			nr_mmu_pages = max(nr_mmu_pages,
+				        (unsigned int) KVM_MIN_ALLOC_MMU_PAGES);
+			kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages);
+		}
+	}
+
 	*memslot = new;
 
 	kvm_mmu_slot_remove_write_access(kvm, mem->slot);
@@ -760,6 +778,26 @@ out:
 	return r;
 }
 
+static int kvm_vm_ioctl_set_nr_mmu_pages(struct kvm *kvm,
+					  u32 kvm_nr_mmu_pages)
+{
+	if (kvm_nr_mmu_pages < KVM_MIN_ALLOC_MMU_PAGES)
+		return -EINVAL;
+
+	mutex_lock(&kvm->lock);
+
+	kvm_mmu_change_mmu_pages(kvm, kvm_nr_mmu_pages);
+	kvm->n_requested_mmu_pages = kvm_nr_mmu_pages;
+
+	mutex_unlock(&kvm->lock);
+	return 0;
+}
+
+static int kvm_vm_ioctl_get_nr_mmu_pages(struct kvm *kvm)
+{
+	return kvm->n_alloc_mmu_pages;
+}
+
 /*
  * Get (and clear) the dirty memory log for a memory slot.
  */
@@ -3071,6 +3109,14 @@ static long kvm_vm_ioctl(struct file *filp,
 			goto out;
 		break;
 	}
+	case KVM_SET_NR_MMU_PAGES:
+		r = kvm_vm_ioctl_set_nr_mmu_pages(kvm, arg);
+		if (r)
+			goto out;
+		break;
+	case KVM_GET_NR_MMU_PAGES:
+		r = kvm_vm_ioctl_get_nr_mmu_pages(kvm);
+		break;
 	case KVM_GET_DIRTY_LOG: {
 		struct kvm_dirty_log log;
 
@@ -3278,6 +3324,7 @@ static long kvm_dev_ioctl(struct file *filp,
 		switch (ext) {
 		case KVM_CAP_IRQCHIP:
 		case KVM_CAP_HLT:
+		case KVM_CAP_MMU_SHADOW_CACHE_CONTROL:
 			r = 1;
 			break;
 		default:
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 72757db..6cda1fe 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -747,6 +747,40 @@ static void kvm_mmu_zap_page(struct kvm *kvm,
 	kvm_mmu_reset_last_pte_updated(kvm);
 }
 
+/*
+ * Change the number of mmu pages allocated to the vm.
+ * Note: if kvm_nr_mmu_pages is too small, you will get a deadlock.
+ */
+void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages)
+{
+	/*
+	 * If the new number of mmu pages is smaller than the number of
+	 * active pages, we must free some mmu pages before we change
+	 * the value.
+	 */
+
+	if ((kvm->n_alloc_mmu_pages - kvm->n_free_mmu_pages) >
+	    kvm_nr_mmu_pages) {
+		int n_used_mmu_pages = kvm->n_alloc_mmu_pages
+				       - kvm->n_free_mmu_pages;
+
+		while (n_used_mmu_pages > kvm_nr_mmu_pages) {
+			struct kvm_mmu_page *page;
+
+			page = container_of(kvm->active_mmu_pages.prev,
+					    struct kvm_mmu_page, link);
+			kvm_mmu_zap_page(kvm, page);
+			n_used_mmu_pages--;
+		}
+		kvm->n_free_mmu_pages = 0;
+	}
+	else
+		kvm->n_free_mmu_pages += kvm_nr_mmu_pages
+					 - kvm->n_alloc_mmu_pages;
+
+	kvm->n_alloc_mmu_pages = kvm_nr_mmu_pages;
+}
+
 static int kvm_mmu_unprotect_page(struct kvm_vcpu *vcpu, gfn_t gfn)
 {
 	unsigned index;
@@ -1297,8 +1331,10 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
 
 	ASSERT(vcpu);
 
-	vcpu->kvm->n_free_mmu_pages = KVM_NUM_MMU_PAGES;
-
+	if (vcpu->kvm->n_requested_mmu_pages)
+		vcpu->kvm->n_free_mmu_pages = vcpu->kvm->n_requested_mmu_pages;
+	else
+		vcpu->kvm->n_free_mmu_pages = vcpu->kvm->n_alloc_mmu_pages;
 	/*
 	 * When emulating 32-bit mode, cr3 is only 32 bits even on x86_64.
 	 * Therefore we need to allocate shadow page tables in the first
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 057a7f3..d2fd973 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -347,11 +347,14 @@ struct kvm_signal_mask {
  */
 #define KVM_CAP_IRQCHIP	  0
 #define KVM_CAP_HLT	  1
+#define KVM_CAP_MMU_SHADOW_CACHE_CONTROL 2
 
 /*
  * ioctls for VM fds
  */
 #define KVM_SET_MEMORY_REGION     _IOW(KVMIO, 0x40, struct kvm_memory_region)
+#define KVM_SET_NR_MMU_PAGES      _IO(KVMIO, 0x44)
+#define KVM_GET_NR_MMU_PAGES      _IO(KVMIO, 0x45)
 /*
  * KVM_CREATE_VCPU receives as a parameter the vcpu slot, and returns
  * a vcpu fd.
-- 
1.5.3.7




* [PATCH 23/50] KVM: Add kvm_free_lapic() to pair with kvm_create_lapic()
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (21 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 22/50] KVM: Allow dynamic allocation of the mmu shadow cache size Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 24/50] KVM: Hoist kvm_create_lapic() into kvm_vcpu_init() Avi Kivity
                     ` (26 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>

Instead of the asymmetry of kvm_free_apic, implement kvm_free_lapic().
And guess what?  I found a minor bug: we don't need to hrtimer_cancel()
from kvm_main.c, because we do that in kvm_free_apic().

Also:
1) kvm_vcpu_uninit should be the reverse order from kvm_vcpu_init.
2) Don't set apic->regs_page to zero before freeing apic.
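
The ordering rule in point 1 can be illustrated with a small userspace model.
This is only a sketch: struct toy_vcpu and the toy_* functions are
hypothetical stand-ins, each resource is reduced to a letter appended to a
log, and the lapic is placed last in the init sequence purely for
illustration (at this point in the series it is still created by svm.c and
vmx.c).

```c
#include <string.h>

/* Hypothetical stand-in for struct kvm_vcpu; resources are only logged. */
struct toy_vcpu {
	int has_apic;
	char log[16];
};

/* Append one character to the log, keeping it NUL-terminated. */
static void toy_note(struct toy_vcpu *v, char c)
{
	size_t n = strlen(v->log);

	if (n + 1 < sizeof(v->log)) {
		v->log[n] = c;
		v->log[n + 1] = '\0';
	}
}

/* Acquire in order: run page, pio_data page, mmu, then (optionally) lapic. */
static void toy_vcpu_init(struct toy_vcpu *v, int with_apic)
{
	memset(v, 0, sizeof(*v));
	toy_note(v, 'r');
	toy_note(v, 'p');
	toy_note(v, 'm');
	if (with_apic) {
		v->has_apic = 1;
		toy_note(v, 'a');
	}
}

/* Takes the vcpu, like kvm_free_lapic(), and tolerates a missing lapic. */
static void toy_free_lapic(struct toy_vcpu *v)
{
	if (!v->has_apic)
		return;
	toy_note(v, 'A');
}

/* Release in exactly the reverse order of toy_vcpu_init(). */
static void toy_vcpu_uninit(struct toy_vcpu *v)
{
	toy_free_lapic(v);
	toy_note(v, 'M');
	toy_note(v, 'P');
	toy_note(v, 'R');
}
```

Because the free routine checks for a missing lapic itself, the caller does
not need its own NULL test, which is what lets the kvm_main.c hunk shrink.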

Signed-off-by: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/irq.h      |    2 +-
 drivers/kvm/kvm_main.c |    4 +---
 drivers/kvm/lapic.c    |   19 +++++++++----------
 3 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/kvm/irq.h b/drivers/kvm/irq.h
index 11fc014..508280e 100644
--- a/drivers/kvm/irq.h
+++ b/drivers/kvm/irq.h
@@ -139,7 +139,7 @@ int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
 int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
 int kvm_create_lapic(struct kvm_vcpu *vcpu);
 void kvm_lapic_reset(struct kvm_vcpu *vcpu);
-void kvm_free_apic(struct kvm_lapic *apic);
+void kvm_free_lapic(struct kvm_vcpu *vcpu);
 u64 kvm_lapic_get_cr8(struct kvm_vcpu *vcpu);
 void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8);
 void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value);
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index d220e63..760753d 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -268,10 +268,8 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_init);
 
 void kvm_vcpu_uninit(struct kvm_vcpu *vcpu)
 {
+	kvm_free_lapic(vcpu);
 	kvm_mmu_destroy(vcpu);
-	if (vcpu->apic)
-		hrtimer_cancel(&vcpu->apic->timer.dev);
-	kvm_free_apic(vcpu->apic);
 	free_page((unsigned long)vcpu->pio_data);
 	free_page((unsigned long)vcpu->run);
 }
diff --git a/drivers/kvm/lapic.c b/drivers/kvm/lapic.c
index 238fcad..8e8dab0 100644
--- a/drivers/kvm/lapic.c
+++ b/drivers/kvm/lapic.c
@@ -762,19 +762,17 @@ static int apic_mmio_range(struct kvm_io_device *this, gpa_t addr)
 	return ret;
 }
 
-void kvm_free_apic(struct kvm_lapic *apic)
+void kvm_free_lapic(struct kvm_vcpu *vcpu)
 {
-	if (!apic)
+	if (!vcpu->apic)
 		return;
 
-	hrtimer_cancel(&apic->timer.dev);
+	hrtimer_cancel(&vcpu->apic->timer.dev);
 
-	if (apic->regs_page) {
-		__free_page(apic->regs_page);
-		apic->regs_page = 0;
-	}
+	if (vcpu->apic->regs_page)
+		__free_page(vcpu->apic->regs_page);
 
-	kfree(apic);
+	kfree(vcpu->apic);
 }
 
 /*
@@ -962,7 +960,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu)
 	if (apic->regs_page == NULL) {
 		printk(KERN_ERR "malloc apic regs error for vcpu %x\n",
 		       vcpu->vcpu_id);
-		goto nomem;
+		goto nomem_free_apic;
 	}
 	apic->regs = page_address(apic->regs_page);
 	memset(apic->regs, 0, PAGE_SIZE);
@@ -980,8 +978,9 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu)
 	apic->dev.private = apic;
 
 	return 0;
+nomem_free_apic:
+	kfree(apic);
 nomem:
-	kvm_free_apic(apic);
 	return -ENOMEM;
 }
 EXPORT_SYMBOL_GPL(kvm_create_lapic);
-- 
1.5.3.7




* [PATCH 24/50] KVM: Hoist kvm_create_lapic() into kvm_vcpu_init()
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (22 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 23/50] KVM: Add kvm_free_lapic() to pair with kvm_create_lapic() Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 25/50] KVM: Remove gratuitous casts from lapic.c Avi Kivity
                     ` (25 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>

Move kvm_create_lapic() into kvm_vcpu_init(), rather than having svm
and vmx do it.  And make it return the error rather than a fairly
random -ENOMEM.

This also solves the problem that neither svm.c nor vmx.c actually
handles the error path properly.
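
As a rough illustration of the error path described above, the sketch below
models a goto-unwind ladder that returns whatever error the failing step
produced instead of a fixed -ENOMEM. The names toy_step and toy_vcpu_init,
the fail_step/err parameters and the live bitmask are all hypothetical; only
the label names mirror the kvm_main.c hunk.

```c
#include <errno.h>

/* Hypothetical step: succeeds unless it is the step chosen to fail. */
static int toy_step(int step, int fail_step, int err)
{
	return step == fail_step ? err : 0;
}

/*
 * Model of the init sequence. Each bit in *live marks a resource that is
 * currently held; on failure every bit acquired so far is released again.
 */
static int toy_vcpu_init(int fail_step, int err, int *live)
{
	int r;

	*live = 0;

	r = toy_step(1, fail_step, err);	/* run page */
	if (r < 0)
		goto fail;
	*live |= 1;

	r = toy_step(2, fail_step, err);	/* pio_data page */
	if (r < 0)
		goto fail_free_run;
	*live |= 2;

	r = toy_step(3, fail_step, err);	/* mmu */
	if (r < 0)
		goto fail_free_pio_data;
	*live |= 4;

	r = toy_step(4, fail_step, err);	/* lapic */
	if (r < 0)
		goto fail_mmu_destroy;
	*live |= 8;

	return 0;

fail_mmu_destroy:
	*live &= ~4;
fail_free_pio_data:
	*live &= ~2;
fail_free_run:
	*live &= ~1;
fail:
	return r;	/* propagate the real error code */
}
```

With the lapic step failing with -EINVAL, the model returns -EINVAL and
leaves nothing held, which is the behaviour the callers in svm.c and vmx.c
could not provide on their own.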

Signed-off-by: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |   10 +++++++++-
 drivers/kvm/svm.c      |    6 ------
 drivers/kvm/vmx.c      |    6 ------
 3 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 760753d..0a04b75 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -255,14 +255,22 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 	if (r < 0)
 		goto fail_free_pio_data;
 
+	if (irqchip_in_kernel(kvm)) {
+		r = kvm_create_lapic(vcpu);
+		if (r < 0)
+			goto fail_mmu_destroy;
+	}
+
 	return 0;
 
+fail_mmu_destroy:
+	kvm_mmu_destroy(vcpu);
 fail_free_pio_data:
 	free_page((unsigned long)vcpu->pio_data);
 fail_free_run:
 	free_page((unsigned long)vcpu->run);
 fail:
-	return -ENOMEM;
+	return r;
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_init);
 
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index a0eef78..f2278d0 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -588,12 +588,6 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
 	if (err)
 		goto free_svm;
 
-	if (irqchip_in_kernel(kvm)) {
-		err = kvm_create_lapic(&svm->vcpu);
-		if (err < 0)
-			goto free_svm;
-	}
-
 	page = alloc_page(GFP_KERNEL);
 	if (!page) {
 		err = -ENOMEM;
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index ce15e51..718d1f4 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -2431,12 +2431,6 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 	if (err)
 		goto free_vcpu;
 
-	if (irqchip_in_kernel(kvm)) {
-		err = kvm_create_lapic(&vmx->vcpu);
-		if (err < 0)
-			goto free_vcpu;
-	}
-
 	vmx->guest_msrs = kmalloc(PAGE_SIZE, GFP_KERNEL);
 	if (!vmx->guest_msrs) {
 		err = -ENOMEM;
-- 
1.5.3.7




* [PATCH 25/50] KVM: Remove gratuitous casts from lapic.c
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (23 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 24/50] KVM: Hoist kvm_create_lapic() into kvm_vcpu_init() Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 26/50] KVM: CodingStyle cleanup Avi Kivity
                     ` (24 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>

Since vcpu->apic is of the correct type, there's no need to cast.

Signed-off-by: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/lapic.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/kvm/lapic.c b/drivers/kvm/lapic.c
index 8e8dab0..554e73a 100644
--- a/drivers/kvm/lapic.c
+++ b/drivers/kvm/lapic.c
@@ -172,7 +172,7 @@ static inline int apic_find_highest_irr(struct kvm_lapic *apic)
 
 int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu)
 {
-	struct kvm_lapic *apic = (struct kvm_lapic *)vcpu->apic;
+	struct kvm_lapic *apic = vcpu->apic;
 	int highest_irr;
 
 	if (!apic)
@@ -783,7 +783,7 @@ void kvm_free_lapic(struct kvm_vcpu *vcpu)
 
 void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8)
 {
-	struct kvm_lapic *apic = (struct kvm_lapic *)vcpu->apic;
+	struct kvm_lapic *apic = vcpu->apic;
 
 	if (!apic)
 		return;
@@ -792,7 +792,7 @@ void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8)
 
 u64 kvm_lapic_get_cr8(struct kvm_vcpu *vcpu)
 {
-	struct kvm_lapic *apic = (struct kvm_lapic *)vcpu->apic;
+	struct kvm_lapic *apic = vcpu->apic;
 	u64 tpr;
 
 	if (!apic)
@@ -805,7 +805,7 @@ EXPORT_SYMBOL_GPL(kvm_lapic_get_cr8);
 
 void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
 {
-	struct kvm_lapic *apic = (struct kvm_lapic *)vcpu->apic;
+	struct kvm_lapic *apic = vcpu->apic;
 
 	if (!apic) {
 		value |= MSR_IA32_APICBASE_BSP;
@@ -882,7 +882,7 @@ EXPORT_SYMBOL_GPL(kvm_lapic_reset);
 
 int kvm_lapic_enabled(struct kvm_vcpu *vcpu)
 {
-	struct kvm_lapic *apic = (struct kvm_lapic *)vcpu->apic;
+	struct kvm_lapic *apic = vcpu->apic;
 	int ret = 0;
 
 	if (!apic)
-- 
1.5.3.7




* [PATCH 26/50] KVM: CodingStyle cleanup
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (24 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 25/50] KVM: Remove gratuitous casts from lapic.c Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 27/50] KVM: Support assigning userspace memory to the guest Avi Kivity
                     ` (23 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Mike Day <ncmike-eyUYGtrA2zhAfugRpC6u6w@public.gmane.org>

Signed-off-by: Mike D. Day <ncmike-eyUYGtrA2zhAfugRpC6u6w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h         |   32 +++++++++---------
 drivers/kvm/kvm_main.c    |   58 ++++++++++++++++++----------------
 drivers/kvm/lapic.c       |    3 +-
 drivers/kvm/mmu.c         |   10 +++--
 drivers/kvm/paging_tmpl.h |    2 +-
 drivers/kvm/svm.c         |   48 +++++++++++++---------------
 drivers/kvm/svm.h         |    2 +-
 drivers/kvm/vmx.c         |   60 +++++++++++++++++++----------------
 drivers/kvm/vmx.h         |    8 ++--
 drivers/kvm/x86_emulate.c |   76 ++++++++++++++++++++++-----------------------
 10 files changed, 151 insertions(+), 148 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 9f10c37..ec5b498 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -528,7 +528,7 @@ extern struct kvm_x86_ops *kvm_x86_ops;
 	if (printk_ratelimit())						\
 		printk(KERN_ERR "kvm: %i: cpu%i " fmt,			\
 		       current->tgid, (vcpu)->vcpu_id , ## __VA_ARGS__); \
- } while(0)
+ } while (0)
 
 #define kvm_printf(kvm, fmt ...) printk(KERN_DEBUG fmt)
 #define vcpu_printf(vcpu, fmt...) kvm_printf(vcpu->kvm, fmt)
@@ -598,7 +598,7 @@ int kvm_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
 
 struct x86_emulate_ctxt;
 
-int kvm_emulate_pio (struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
+int kvm_emulate_pio(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
 		     int size, unsigned port);
 int kvm_emulate_pio_string(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
 			   int size, unsigned long count, int down,
@@ -607,7 +607,7 @@ void kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
 int kvm_emulate_halt(struct kvm_vcpu *vcpu);
 int emulate_invlpg(struct kvm_vcpu *vcpu, gva_t address);
 int emulate_clts(struct kvm_vcpu *vcpu);
-int emulator_get_dr(struct x86_emulate_ctxt* ctxt, int dr,
+int emulator_get_dr(struct x86_emulate_ctxt *ctxt, int dr,
 		    unsigned long *dest);
 int emulator_set_dr(struct x86_emulate_ctxt *ctxt, int dr,
 		    unsigned long value);
@@ -631,7 +631,7 @@ void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
 void kvm_flush_remote_tlbs(struct kvm *kvm);
 
 int emulator_read_std(unsigned long addr,
-                      void *val,
+		      void *val,
 		      unsigned int bytes,
 		      struct kvm_vcpu *vcpu);
 int emulator_write_emulated(unsigned long addr,
@@ -721,55 +721,55 @@ static inline struct kvm_mmu_page *page_header(hpa_t shadow_page)
 static inline u16 read_fs(void)
 {
 	u16 seg;
-	asm ("mov %%fs, %0" : "=g"(seg));
+	asm("mov %%fs, %0" : "=g"(seg));
 	return seg;
 }
 
 static inline u16 read_gs(void)
 {
 	u16 seg;
-	asm ("mov %%gs, %0" : "=g"(seg));
+	asm("mov %%gs, %0" : "=g"(seg));
 	return seg;
 }
 
 static inline u16 read_ldt(void)
 {
 	u16 ldt;
-	asm ("sldt %0" : "=g"(ldt));
+	asm("sldt %0" : "=g"(ldt));
 	return ldt;
 }
 
 static inline void load_fs(u16 sel)
 {
-	asm ("mov %0, %%fs" : : "rm"(sel));
+	asm("mov %0, %%fs" : : "rm"(sel));
 }
 
 static inline void load_gs(u16 sel)
 {
-	asm ("mov %0, %%gs" : : "rm"(sel));
+	asm("mov %0, %%gs" : : "rm"(sel));
 }
 
 #ifndef load_ldt
 static inline void load_ldt(u16 sel)
 {
-	asm ("lldt %0" : : "rm"(sel));
+	asm("lldt %0" : : "rm"(sel));
 }
 #endif
 
 static inline void get_idt(struct descriptor_table *table)
 {
-	asm ("sidt %0" : "=m"(*table));
+	asm("sidt %0" : "=m"(*table));
 }
 
 static inline void get_gdt(struct descriptor_table *table)
 {
-	asm ("sgdt %0" : "=m"(*table));
+	asm("sgdt %0" : "=m"(*table));
 }
 
 static inline unsigned long read_tr_base(void)
 {
 	u16 tr;
-	asm ("str %0" : "=g"(tr));
+	asm("str %0" : "=g"(tr));
 	return segment_base(tr);
 }
 
@@ -785,17 +785,17 @@ static inline unsigned long read_msr(unsigned long msr)
 
 static inline void fx_save(struct i387_fxsave_struct *image)
 {
-	asm ("fxsave (%0)":: "r" (image));
+	asm("fxsave (%0)":: "r" (image));
 }
 
 static inline void fx_restore(struct i387_fxsave_struct *image)
 {
-	asm ("fxrstor (%0)":: "r" (image));
+	asm("fxrstor (%0)":: "r" (image));
 }
 
 static inline void fpu_init(void)
 {
-	asm ("finit");
+	asm("finit");
 }
 
 static inline u32 get_rdx_init_val(void)
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 0a04b75..47ffefb 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -104,7 +104,7 @@ static struct dentry *debugfs_dir;
 #define EFER_RESERVED_BITS 0xfffffffffffff2fe
 
 #ifdef CONFIG_X86_64
-// LDT or TSS descriptor in the GDT. 16 bytes.
+/* LDT or TSS descriptor in the GDT. 16 bytes. */
 struct segment_descriptor_64 {
 	struct segment_descriptor s;
 	u32 base_higher;
@@ -121,27 +121,27 @@ unsigned long segment_base(u16 selector)
 	struct descriptor_table gdt;
 	struct segment_descriptor *d;
 	unsigned long table_base;
-	typedef unsigned long ul;
 	unsigned long v;
 
 	if (selector == 0)
 		return 0;
 
-	asm ("sgdt %0" : "=m"(gdt));
+	asm("sgdt %0" : "=m"(gdt));
 	table_base = gdt.base;
 
 	if (selector & 4) {           /* from ldt */
 		u16 ldt_selector;
 
-		asm ("sldt %0" : "=g"(ldt_selector));
+		asm("sldt %0" : "=g"(ldt_selector));
 		table_base = segment_base(ldt_selector);
 	}
 	d = (struct segment_descriptor *)(table_base + (selector & ~7));
-	v = d->base_low | ((ul)d->base_mid << 16) | ((ul)d->base_high << 24);
+	v = d->base_low | ((unsigned long)d->base_mid << 16) |
+		((unsigned long)d->base_high << 24);
 #ifdef CONFIG_X86_64
-	if (d->system == 0
-	    && (d->type == 2 || d->type == 9 || d->type == 11))
-		v |= ((ul)((struct segment_descriptor_64 *)d)->base_higher) << 32;
+	if (d->system == 0 && (d->type == 2 || d->type == 9 || d->type == 11))
+		v |= ((unsigned long) \
+		      ((struct segment_descriptor_64 *)d)->base_higher) << 32;
 #endif
 	return v;
 }
@@ -721,7 +721,7 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 		if (!new.phys_mem)
 			goto out_unlock;
 
-		new.rmap = vmalloc(npages * sizeof(struct page*));
+		new.rmap = vmalloc(npages * sizeof(struct page *));
 
 		if (!new.rmap)
 			goto out_unlock;
@@ -904,17 +904,17 @@ static int kvm_vm_ioctl_get_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
 	r = 0;
 	switch (chip->chip_id) {
 	case KVM_IRQCHIP_PIC_MASTER:
-		memcpy (&chip->chip.pic,
+		memcpy(&chip->chip.pic,
 			&pic_irqchip(kvm)->pics[0],
 			sizeof(struct kvm_pic_state));
 		break;
 	case KVM_IRQCHIP_PIC_SLAVE:
-		memcpy (&chip->chip.pic,
+		memcpy(&chip->chip.pic,
 			&pic_irqchip(kvm)->pics[1],
 			sizeof(struct kvm_pic_state));
 		break;
 	case KVM_IRQCHIP_IOAPIC:
-		memcpy (&chip->chip.ioapic,
+		memcpy(&chip->chip.ioapic,
 			ioapic_irqchip(kvm),
 			sizeof(struct kvm_ioapic_state));
 		break;
@@ -932,17 +932,17 @@ static int kvm_vm_ioctl_set_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
 	r = 0;
 	switch (chip->chip_id) {
 	case KVM_IRQCHIP_PIC_MASTER:
-		memcpy (&pic_irqchip(kvm)->pics[0],
+		memcpy(&pic_irqchip(kvm)->pics[0],
 			&chip->chip.pic,
 			sizeof(struct kvm_pic_state));
 		break;
 	case KVM_IRQCHIP_PIC_SLAVE:
-		memcpy (&pic_irqchip(kvm)->pics[1],
+		memcpy(&pic_irqchip(kvm)->pics[1],
 			&chip->chip.pic,
 			sizeof(struct kvm_pic_state));
 		break;
 	case KVM_IRQCHIP_IOAPIC:
-		memcpy (ioapic_irqchip(kvm),
+		memcpy(ioapic_irqchip(kvm),
 			&chip->chip.ioapic,
 			sizeof(struct kvm_ioapic_state));
 		break;
@@ -1341,7 +1341,7 @@ int emulate_clts(struct kvm_vcpu *vcpu)
 	return X86EMUL_CONTINUE;
 }
 
-int emulator_get_dr(struct x86_emulate_ctxt* ctxt, int dr, unsigned long *dest)
+int emulator_get_dr(struct x86_emulate_ctxt *ctxt, int dr, unsigned long *dest)
 {
 	struct kvm_vcpu *vcpu = ctxt->vcpu;
 
@@ -1934,7 +1934,7 @@ static void pio_string_write(struct kvm_io_device *pio_dev,
 	mutex_unlock(&vcpu->kvm->lock);
 }
 
-int kvm_emulate_pio (struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
+int kvm_emulate_pio(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
 		  int size, unsigned port)
 {
 	struct kvm_io_device *pio_dev;
@@ -2089,7 +2089,7 @@ static int __vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	int r;
 
 	if (unlikely(vcpu->mp_state == VCPU_MP_STATE_SIPI_RECEIVED)) {
-		printk("vcpu %d received sipi with vector # %x\n",
+		pr_debug("vcpu %d received sipi with vector # %x\n",
 		       vcpu->vcpu_id, vcpu->sipi_vector);
 		kvm_lapic_reset(vcpu);
 		kvm_x86_ops->vcpu_reset(vcpu);
@@ -2363,7 +2363,8 @@ static int kvm_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
 		       sizeof sregs->interrupt_bitmap);
 		pending_vec = kvm_x86_ops->get_irq(vcpu);
 		if (pending_vec >= 0)
-			set_bit(pending_vec, (unsigned long *)sregs->interrupt_bitmap);
+			set_bit(pending_vec,
+				(unsigned long *)sregs->interrupt_bitmap);
 	} else
 		memcpy(sregs->interrupt_bitmap, vcpu->irq_pending,
 		       sizeof sregs->interrupt_bitmap);
@@ -2436,7 +2437,8 @@ static int kvm_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 		/* Only pending external irq is handled here */
 		if (pending_vec < max_bits) {
 			kvm_x86_ops->set_irq(vcpu, pending_vec);
-			printk("Set back pending irq %d\n", pending_vec);
+			pr_debug("Set back pending irq %d\n",
+				 pending_vec);
 		}
 	}
 
@@ -3155,8 +3157,7 @@ static long kvm_vm_ioctl(struct file *filp,
 				kvm->vpic = NULL;
 				goto out;
 			}
-		}
-		else
+		} else
 			goto out;
 		break;
 	case KVM_IRQ_LINE: {
@@ -3448,7 +3449,7 @@ static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
 }
 
 static int kvm_reboot(struct notifier_block *notifier, unsigned long val,
-                       void *v)
+		      void *v)
 {
 	if (val == SYS_RESTART) {
 		/*
@@ -3655,7 +3656,7 @@ int kvm_init_x86(struct kvm_x86_ops *ops, unsigned int vcpu_size,
 
 	r = misc_register(&kvm_dev);
 	if (r) {
-		printk (KERN_ERR "kvm: misc device register failed\n");
+		printk(KERN_ERR "kvm: misc device register failed\n");
 		goto out_free;
 	}
 
@@ -3683,6 +3684,7 @@ out:
 	kvm_x86_ops = NULL;
 	return r;
 }
+EXPORT_SYMBOL_GPL(kvm_init_x86);
 
 void kvm_exit_x86(void)
 {
@@ -3696,6 +3698,7 @@ void kvm_exit_x86(void)
 	kvm_x86_ops->hardware_unsetup();
 	kvm_x86_ops = NULL;
 }
+EXPORT_SYMBOL_GPL(kvm_exit_x86);
 
 static __init int kvm_init(void)
 {
@@ -3710,7 +3713,9 @@ static __init int kvm_init(void)
 
 	kvm_init_msr_list();
 
-	if ((bad_page = alloc_page(GFP_KERNEL)) == NULL) {
+	bad_page = alloc_page(GFP_KERNEL);
+
+	if (bad_page == NULL) {
 		r = -ENOMEM;
 		goto out;
 	}
@@ -3736,6 +3741,3 @@ static __exit void kvm_exit(void)
 
 module_init(kvm_init)
 module_exit(kvm_exit)
-
-EXPORT_SYMBOL_GPL(kvm_init_x86);
-EXPORT_SYMBOL_GPL(kvm_exit_x86);
diff --git a/drivers/kvm/lapic.c b/drivers/kvm/lapic.c
index 554e73a..e15b42e 100644
--- a/drivers/kvm/lapic.c
+++ b/drivers/kvm/lapic.c
@@ -906,8 +906,7 @@ static int __apic_timer_fn(struct kvm_lapic *apic)
 	wait_queue_head_t *q = &apic->vcpu->wq;
 
 	atomic_inc(&apic->timer.pending);
-	if (waitqueue_active(q))
-	{
+	if (waitqueue_active(q)) {
 		apic->vcpu->mp_state = VCPU_MP_STATE_RUNNABLE;
 		wake_up_interruptible(q);
 	}
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 6cda1fe..ece0aa4 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -90,7 +90,8 @@ static int dbg = 1;
 
 #define PT32_DIR_PSE36_SIZE 4
 #define PT32_DIR_PSE36_SHIFT 13
-#define PT32_DIR_PSE36_MASK (((1ULL << PT32_DIR_PSE36_SIZE) - 1) << PT32_DIR_PSE36_SHIFT)
+#define PT32_DIR_PSE36_MASK \
+	(((1ULL << PT32_DIR_PSE36_SIZE) - 1) << PT32_DIR_PSE36_SHIFT)
 
 
 #define PT_FIRST_AVAIL_BITS_SHIFT 9
@@ -103,7 +104,7 @@ static int dbg = 1;
 #define PT64_LEVEL_BITS 9
 
 #define PT64_LEVEL_SHIFT(level) \
-		( PAGE_SHIFT + (level - 1) * PT64_LEVEL_BITS )
+		(PAGE_SHIFT + (level - 1) * PT64_LEVEL_BITS)
 
 #define PT64_LEVEL_MASK(level) \
 		(((1ULL << PT64_LEVEL_BITS) - 1) << PT64_LEVEL_SHIFT(level))
@@ -115,7 +116,7 @@ static int dbg = 1;
 #define PT32_LEVEL_BITS 10
 
 #define PT32_LEVEL_SHIFT(level) \
-		( PAGE_SHIFT + (level - 1) * PT32_LEVEL_BITS )
+		(PAGE_SHIFT + (level - 1) * PT32_LEVEL_BITS)
 
 #define PT32_LEVEL_MASK(level) \
 		(((1ULL << PT32_LEVEL_BITS) - 1) << PT32_LEVEL_SHIFT(level))
@@ -1489,7 +1490,8 @@ static void audit_mappings_page(struct kvm_vcpu *vcpu, u64 page_pte,
 				printk(KERN_ERR "xx audit error: (%s) levels %d"
 				       " gva %lx gpa %llx hpa %llx ent %llx %d\n",
 				       audit_msg, vcpu->mmu.root_level,
-				       va, gpa, hpa, ent, is_shadow_present_pte(ent));
+				       va, gpa, hpa, ent,
+				       is_shadow_present_pte(ent));
 			else if (ent == shadow_notrap_nonpresent_pte
 				 && !is_error_hpa(hpa))
 				printk(KERN_ERR "audit: (%s) notrap shadow,"
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index fbe595f..447d2c3 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -163,7 +163,7 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 		walker->page = pfn_to_page(paddr >> PAGE_SHIFT);
 		walker->table = kmap_atomic(walker->page, KM_USER0);
 		--walker->level;
-		walker->table_gfn[walker->level - 1 ] = table_gfn;
+		walker->table_gfn[walker->level - 1] = table_gfn;
 		pgprintk("%s: table_gfn[%d] %lx\n", __FUNCTION__,
 			 walker->level - 1, table_gfn);
 	}
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index f2278d0..746a377 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -229,12 +229,11 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
 		printk(KERN_DEBUG "%s: NOP\n", __FUNCTION__);
 		return;
 	}
-	if (svm->next_rip - svm->vmcb->save.rip > MAX_INST_SIZE) {
+	if (svm->next_rip - svm->vmcb->save.rip > MAX_INST_SIZE)
 		printk(KERN_ERR "%s: ip 0x%llx next 0x%llx\n",
 		       __FUNCTION__,
 		       svm->vmcb->save.rip,
 		       svm->next_rip);
-	}
 
 	vcpu->rip = svm->vmcb->save.rip = svm->next_rip;
 	svm->vmcb->control.int_state &= ~SVM_INTERRUPT_SHADOW_MASK;
@@ -312,7 +311,7 @@ static void svm_hardware_enable(void *garbage)
 	svm_data->next_asid = svm_data->max_asid + 1;
 	svm_features = cpuid_edx(SVM_CPUID_FUNC);
 
-	asm volatile ( "sgdt %0" : "=m"(gdt_descr) );
+	asm volatile ("sgdt %0" : "=m"(gdt_descr));
 	gdt = (struct desc_struct *)gdt_descr.address;
 	svm_data->tss_desc = (struct kvm_ldttss_desc *)(gdt + GDT_ENTRY_TSS);
 
@@ -544,8 +543,7 @@ static void init_vmcb(struct vmcb *vmcb)
 	init_sys_seg(&save->tr, SEG_TYPE_BUSY_TSS16);
 
 	save->efer = MSR_EFER_SVME_MASK;
-
-        save->dr6 = 0xffff0ff0;
+	save->dr6 = 0xffff0ff0;
 	save->dr7 = 0x400;
 	save->rflags = 2;
 	save->rip = 0x0000fff0;
@@ -783,7 +781,7 @@ static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 			svm->vmcb->save.efer |= KVM_EFER_LMA | KVM_EFER_LME;
 		}
 
-		if (is_paging(vcpu) && !(cr0 & X86_CR0_PG) ) {
+		if (is_paging(vcpu) && !(cr0 & X86_CR0_PG)) {
 			vcpu->shadow_efer &= ~KVM_EFER_LMA;
 			svm->vmcb->save.efer &= ~(KVM_EFER_LMA | KVM_EFER_LME);
 		}
@@ -1010,7 +1008,7 @@ static int shutdown_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 
 static int io_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 {
-	u32 io_info = svm->vmcb->control.exit_info_1; //address size bug?
+	u32 io_info = svm->vmcb->control.exit_info_1; /* address size bug? */
 	int size, down, in, string, rep;
 	unsigned port;
 
@@ -1316,7 +1314,7 @@ static void reload_tss(struct kvm_vcpu *vcpu)
 	int cpu = raw_smp_processor_id();
 
 	struct svm_cpu_data *svm_data = per_cpu(svm_data, cpu);
-	svm_data->tss_desc->type = 9; //available 32/64-bit TSS
+	svm_data->tss_desc->type = 9; /* available 32/64-bit TSS */
 	load_TR_desc();
 }
 
@@ -1434,9 +1432,9 @@ static void do_interrupt_requests(struct kvm_vcpu *vcpu,
 	 * Interrupts blocked.  Wait for unblock.
 	 */
 	if (!svm->vcpu.interrupt_window_open &&
-	    (svm->vcpu.irq_summary || kvm_run->request_interrupt_window)) {
+	    (svm->vcpu.irq_summary || kvm_run->request_interrupt_window))
 		control->intercept |= 1ULL << INTERCEPT_VINTR;
-	} else
+	else
 		control->intercept &= ~(1ULL << INTERCEPT_VINTR);
 }
 
@@ -1581,23 +1579,23 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		:
 		: [svm]"a"(svm),
 		  [vmcb]"i"(offsetof(struct vcpu_svm, vmcb_pa)),
-		  [rbx]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_RBX])),
-		  [rcx]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_RCX])),
-		  [rdx]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_RDX])),
-		  [rsi]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_RSI])),
-		  [rdi]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_RDI])),
-		  [rbp]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_RBP]))
+		  [rbx]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_RBX])),
+		  [rcx]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_RCX])),
+		  [rdx]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_RDX])),
+		  [rsi]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_RSI])),
+		  [rdi]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_RDI])),
+		  [rbp]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_RBP]))
 #ifdef CONFIG_X86_64
-		  ,[r8 ]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R8])),
-		  [r9 ]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R9 ])),
-		  [r10]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R10])),
-		  [r11]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R11])),
-		  [r12]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R12])),
-		  [r13]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R13])),
-		  [r14]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R14])),
-		  [r15]"i"(offsetof(struct vcpu_svm,vcpu.regs[VCPU_REGS_R15]))
+		  , [r8]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R8])),
+		  [r9]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R9])),
+		  [r10]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R10])),
+		  [r11]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R11])),
+		  [r12]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R12])),
+		  [r13]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R13])),
+		  [r14]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R14])),
+		  [r15]"i"(offsetof(struct vcpu_svm, vcpu.regs[VCPU_REGS_R15]))
 #endif
-		: "cc", "memory" );
+		: "cc", "memory");
 
 	if ((svm->vmcb->save.dr7 & 0xff))
 		load_db_regs(svm->host_db_regs);
diff --git a/drivers/kvm/svm.h b/drivers/kvm/svm.h
index 3b1b0f3..5fa277c 100644
--- a/drivers/kvm/svm.h
+++ b/drivers/kvm/svm.h
@@ -311,7 +311,7 @@ struct __attribute__ ((__packed__)) vmcb {
 
 #define SVM_EXIT_ERR		-1
 
-#define SVM_CR0_SELECTIVE_MASK (1 << 3 | 1) // TS and MP
+#define SVM_CR0_SELECTIVE_MASK (1 << 3 | 1) /* TS and MP */
 
 #define SVM_VMLOAD ".byte 0x0f, 0x01, 0xda"
 #define SVM_VMRUN  ".byte 0x0f, 0x01, 0xd8"
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 718d1f4..1336174 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -62,7 +62,7 @@ struct vcpu_vmx {
 		int           gs_ldt_reload_needed;
 		int           fs_reload_needed;
 		int           guest_efer_loaded;
-	}host_state;
+	} host_state;
 
 };
 
@@ -271,7 +271,7 @@ static void vmcs_writel(unsigned long field, unsigned long value)
 	u8 error;
 
 	asm volatile (ASM_VMX_VMWRITE_RAX_RDX "; setna %0"
-		       : "=q"(error) : "a"(value), "d"(field) : "cc" );
+		       : "=q"(error) : "a"(value), "d"(field) : "cc");
 	if (unlikely(error))
 		vmwrite_error(field, value);
 }
@@ -415,10 +415,10 @@ static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 #endif
 
 #ifdef CONFIG_X86_64
-	if (is_long_mode(&vmx->vcpu)) {
+	if (is_long_mode(&vmx->vcpu))
 		save_msrs(vmx->host_msrs +
 			  vmx->msr_offset_kernel_gs_base, 1);
-	}
+
 #endif
 	load_msrs(vmx->guest_msrs, vmx->save_nmsrs);
 	load_transition_efer(vmx);
@@ -845,7 +845,7 @@ static int vmx_get_irq(struct kvm_vcpu *vcpu)
 		if (is_external_interrupt(idtv_info_field))
 			return idtv_info_field & VECTORING_INFO_VECTOR_MASK;
 		else
-			printk("pending exception: not handled yet\n");
+			printk(KERN_DEBUG "pending exception: not handled yet\n");
 	}
 	return -1;
 }
@@ -893,7 +893,7 @@ static void hardware_disable(void *garbage)
 }
 
 static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
-				      u32 msr, u32* result)
+				      u32 msr, u32 *result)
 {
 	u32 vmx_msr_low, vmx_msr_high;
 	u32 ctl = ctl_min | ctl_opt;
@@ -1102,7 +1102,7 @@ static void enter_pmode(struct kvm_vcpu *vcpu)
 	vmcs_write32(GUEST_CS_AR_BYTES, 0x9b);
 }
 
-static gva_t rmode_tss_base(struct kvm* kvm)
+static gva_t rmode_tss_base(struct kvm *kvm)
 {
 	gfn_t base_gfn = kvm->memslots[0].base_gfn + kvm->memslots[0].npages - 3;
 	return base_gfn << PAGE_SHIFT;
@@ -1385,7 +1385,7 @@ static void vmx_set_gdt(struct kvm_vcpu *vcpu, struct descriptor_table *dt)
 	vmcs_writel(GUEST_GDTR_BASE, dt->base);
 }
 
-static int init_rmode_tss(struct kvm* kvm)
+static int init_rmode_tss(struct kvm *kvm)
 {
 	gfn_t fn = rmode_tss_base(kvm) >> PAGE_SHIFT;
 	u16 data = 0;
@@ -1494,7 +1494,7 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 		vmcs_writel(GUEST_RIP, 0);
 	vmcs_writel(GUEST_RSP, 0);
 
-	//todo: dr0 = dr1 = dr2 = dr3 = 0; dr6 = 0xffff0ff0
+	/* todo: dr0 = dr1 = dr2 = dr3 = 0; dr6 = 0xffff0ff0 */
 	vmcs_writel(GUEST_DR7, 0x400);
 
 	vmcs_writel(GUEST_GDTR_BASE, 0);
@@ -1561,7 +1561,7 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 	get_idt(&dt);
 	vmcs_writel(HOST_IDTR_BASE, dt.base);   /* 22.2.4 */
 
-	asm ("mov $.Lkvm_vmx_return, %0" : "=r"(kvm_vmx_return));
+	asm("mov $.Lkvm_vmx_return, %0" : "=r"(kvm_vmx_return));
 	vmcs_writel(HOST_RIP, kvm_vmx_return); /* 22.2.5 */
 	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
 	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, 0);
@@ -1613,7 +1613,7 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 	vmcs_writel(CR4_GUEST_HOST_MASK, KVM_GUEST_CR4_MASK);
 
 	vmx->vcpu.cr0 = 0x60000010;
-	vmx_set_cr0(&vmx->vcpu, vmx->vcpu.cr0); // enter rmode
+	vmx_set_cr0(&vmx->vcpu, vmx->vcpu.cr0); /* enter rmode */
 	vmx_set_cr4(&vmx->vcpu, 0);
 #ifdef CONFIG_X86_64
 	vmx_set_efer(&vmx->vcpu, 0);
@@ -1644,7 +1644,7 @@ static void inject_rmode_irq(struct kvm_vcpu *vcpu, int irq)
 	u16 sp =  vmcs_readl(GUEST_RSP);
 	u32 ss_limit = vmcs_read32(GUEST_SS_LIMIT);
 
-	if (sp > ss_limit || sp < 6 ) {
+	if (sp > ss_limit || sp < 6) {
 		vcpu_printf(vcpu, "%s: #SS, rsp 0x%lx ss 0x%lx limit 0x%x\n",
 			    __FUNCTION__,
 			    vmcs_readl(GUEST_RSP),
@@ -1664,15 +1664,18 @@ static void inject_rmode_irq(struct kvm_vcpu *vcpu, int irq)
 	ip =  vmcs_readl(GUEST_RIP);
 
 
-	if (emulator_write_emulated(ss_base + sp - 2, &flags, 2, vcpu) != X86EMUL_CONTINUE ||
-	    emulator_write_emulated(ss_base + sp - 4, &cs, 2, vcpu) != X86EMUL_CONTINUE ||
-	    emulator_write_emulated(ss_base + sp - 6, &ip, 2, vcpu) != X86EMUL_CONTINUE) {
+	if (emulator_write_emulated(
+		    ss_base + sp - 2, &flags, 2, vcpu) != X86EMUL_CONTINUE ||
+	    emulator_write_emulated(
+		    ss_base + sp - 4, &cs, 2, vcpu) != X86EMUL_CONTINUE ||
+	    emulator_write_emulated(
+		    ss_base + sp - 6, &ip, 2, vcpu) != X86EMUL_CONTINUE) {
 		vcpu_printf(vcpu, "%s: write guest err\n", __FUNCTION__);
 		return;
 	}
 
 	vmcs_writel(GUEST_RFLAGS, flags &
-		    ~( X86_EFLAGS_IF | X86_EFLAGS_AC | X86_EFLAGS_TF));
+		    ~(X86_EFLAGS_IF | X86_EFLAGS_AC | X86_EFLAGS_TF));
 	vmcs_write16(GUEST_CS_SELECTOR, ent[1]) ;
 	vmcs_writel(GUEST_CS_BASE, ent[1] << 4);
 	vmcs_writel(GUEST_RIP, ent[0]);
@@ -1777,10 +1780,9 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
 
 	if ((vect_info & VECTORING_INFO_VALID_MASK) &&
-						!is_page_fault(intr_info)) {
+						!is_page_fault(intr_info))
 		printk(KERN_ERR "%s: unexpected, vectoring info 0x%x "
 		       "intr info 0x%x\n", __FUNCTION__, vect_info, intr_info);
-	}
 
 	if (!irqchip_in_kernel(vcpu->kvm) && is_external_interrupt(vect_info)) {
 		int irq = vect_info & VECTORING_INFO_VECTOR_MASK;
@@ -1831,7 +1833,7 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		case EMULATE_DO_MMIO:
 			++vcpu->stat.mmio_exits;
 			return 0;
-		 case EMULATE_FAIL:
+		case EMULATE_FAIL:
 			kvm_report_emulation_failure(vcpu, "pagetable");
 			break;
 		default:
@@ -1849,7 +1851,8 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		return 1;
 	}
 
-	if ((intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VECTOR_MASK)) == (INTR_TYPE_EXCEPTION | 1)) {
+	if ((intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VECTOR_MASK)) ==
+	    (INTR_TYPE_EXCEPTION | 1)) {
 		kvm_run->exit_reason = KVM_EXIT_DEBUG;
 		return 0;
 	}
@@ -2138,8 +2141,8 @@ static int kvm_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 		return 0;
 	}
 
-	if ( (vectoring_info & VECTORING_INFO_VALID_MASK) &&
-				exit_reason != EXIT_REASON_EXCEPTION_NMI )
+	if ((vectoring_info & VECTORING_INFO_VALID_MASK) &&
+				exit_reason != EXIT_REASON_EXCEPTION_NMI)
 		printk(KERN_WARNING "%s: unexpected, valid vectoring info and "
 		       "exit reason is 0x%x\n", __FUNCTION__, exit_reason);
 	if (exit_reason < kvm_vmx_max_exit_handlers
@@ -2238,7 +2241,7 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	 */
 	vmcs_writel(HOST_CR0, read_cr0());
 
-	asm (
+	asm(
 		/* Store host registers */
 #ifdef CONFIG_X86_64
 		"push %%rax; push %%rbx; push %%rdx;"
@@ -2342,8 +2345,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		[rdi]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_RDI])),
 		[rbp]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_RBP])),
 #ifdef CONFIG_X86_64
-		[r8 ]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R8 ])),
-		[r9 ]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R9 ])),
+		[r8]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R8])),
+		[r9]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R9])),
 		[r10]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R10])),
 		[r11]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R11])),
 		[r12]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R12])),
@@ -2352,11 +2355,12 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		[r15]"i"(offsetof(struct kvm_vcpu, regs[VCPU_REGS_R15])),
 #endif
 		[cr2]"i"(offsetof(struct kvm_vcpu, cr2))
-	      : "cc", "memory" );
+	      : "cc", "memory");
 
-	vcpu->interrupt_window_open = (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & 3) == 0;
+	vcpu->interrupt_window_open =
+		(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & 3) == 0;
 
-	asm ("mov %0, %%ds; mov %0, %%es" : : "r"(__USER_DS));
+	asm("mov %0, %%ds; mov %0, %%es" : : "r"(__USER_DS));
 	vmx->launched = 1;
 
 	intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
diff --git a/drivers/kvm/vmx.h b/drivers/kvm/vmx.h
index fd4e146..270d477 100644
--- a/drivers/kvm/vmx.h
+++ b/drivers/kvm/vmx.h
@@ -234,9 +234,9 @@ enum vmcs_field {
 /*
  * Exit Qualifications for MOV for Control Register Access
  */
-#define CONTROL_REG_ACCESS_NUM          0x7     /* 2:0, number of control register */
+#define CONTROL_REG_ACCESS_NUM          0x7     /* 2:0, number of control reg.*/
 #define CONTROL_REG_ACCESS_TYPE         0x30    /* 5:4, access type */
-#define CONTROL_REG_ACCESS_REG          0xf00   /* 10:8, general purpose register */
+#define CONTROL_REG_ACCESS_REG          0xf00   /* 10:8, general purpose reg. */
 #define LMSW_SOURCE_DATA_SHIFT 16
 #define LMSW_SOURCE_DATA  (0xFFFF << LMSW_SOURCE_DATA_SHIFT) /* 16:31 lmsw source */
 #define REG_EAX                         (0 << 8)
@@ -259,11 +259,11 @@ enum vmcs_field {
 /*
  * Exit Qualifications for MOV for Debug Register Access
  */
-#define DEBUG_REG_ACCESS_NUM            0x7     /* 2:0, number of debug register */
+#define DEBUG_REG_ACCESS_NUM            0x7     /* 2:0, number of debug reg. */
 #define DEBUG_REG_ACCESS_TYPE           0x10    /* 4, direction of access */
 #define TYPE_MOV_TO_DR                  (0 << 4)
 #define TYPE_MOV_FROM_DR                (1 << 4)
-#define DEBUG_REG_ACCESS_REG            0xf00   /* 11:8, general purpose register */
+#define DEBUG_REG_ACCESS_REG            0xf00   /* 11:8, general purpose reg. */
 
 
 /* segment AR */
diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index e6b213b..b03029e 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -23,7 +23,7 @@
 #include <stdio.h>
 #include <stdint.h>
 #include <public/xen.h>
-#define DPRINTF(_f, _a ...) printf( _f , ## _a )
+#define DPRINTF(_f, _a ...) printf(_f , ## _a)
 #else
 #include "kvm.h"
 #define DPRINTF(x...) do {} while (0)
@@ -285,21 +285,21 @@ static u16 twobyte_table[256] = {
 		switch ((_dst).bytes) {					    \
 		case 2:							    \
 			__asm__ __volatile__ (				    \
-				_PRE_EFLAGS("0","4","2")		    \
+				_PRE_EFLAGS("0", "4", "2")		    \
 				_op"w %"_wx"3,%1; "			    \
-				_POST_EFLAGS("0","4","2")		    \
+				_POST_EFLAGS("0", "4", "2")		    \
 				: "=m" (_eflags), "=m" ((_dst).val),        \
 				  "=&r" (_tmp)				    \
-				: _wy ((_src).val), "i" (EFLAGS_MASK) );    \
+				: _wy ((_src).val), "i" (EFLAGS_MASK));     \
 			break;						    \
 		case 4:							    \
 			__asm__ __volatile__ (				    \
-				_PRE_EFLAGS("0","4","2")		    \
+				_PRE_EFLAGS("0", "4", "2")		    \
 				_op"l %"_lx"3,%1; "			    \
-				_POST_EFLAGS("0","4","2")		    \
+				_POST_EFLAGS("0", "4", "2")		    \
 				: "=m" (_eflags), "=m" ((_dst).val),	    \
 				  "=&r" (_tmp)				    \
-				: _ly ((_src).val), "i" (EFLAGS_MASK) );    \
+				: _ly ((_src).val), "i" (EFLAGS_MASK));     \
 			break;						    \
 		case 8:							    \
 			__emulate_2op_8byte(_op, _src, _dst,		    \
@@ -311,16 +311,15 @@ static u16 twobyte_table[256] = {
 #define __emulate_2op(_op,_src,_dst,_eflags,_bx,_by,_wx,_wy,_lx,_ly,_qx,_qy) \
 	do {								     \
 		unsigned long _tmp;					     \
-		switch ( (_dst).bytes )					     \
-		{							     \
+		switch ((_dst).bytes) {				             \
 		case 1:							     \
 			__asm__ __volatile__ (				     \
-				_PRE_EFLAGS("0","4","2")		     \
+				_PRE_EFLAGS("0", "4", "2")		     \
 				_op"b %"_bx"3,%1; "			     \
-				_POST_EFLAGS("0","4","2")		     \
+				_POST_EFLAGS("0", "4", "2")		     \
 				: "=m" (_eflags), "=m" ((_dst).val),	     \
 				  "=&r" (_tmp)				     \
-				: _by ((_src).val), "i" (EFLAGS_MASK) );     \
+				: _by ((_src).val), "i" (EFLAGS_MASK));      \
 			break;						     \
 		default:						     \
 			__emulate_2op_nobyte(_op, _src, _dst, _eflags,	     \
@@ -349,34 +348,33 @@ static u16 twobyte_table[256] = {
 	do {								\
 		unsigned long _tmp;					\
 									\
-		switch ( (_dst).bytes )					\
-		{							\
+		switch ((_dst).bytes) {				        \
 		case 1:							\
 			__asm__ __volatile__ (				\
-				_PRE_EFLAGS("0","3","2")		\
+				_PRE_EFLAGS("0", "3", "2")		\
 				_op"b %1; "				\
-				_POST_EFLAGS("0","3","2")		\
+				_POST_EFLAGS("0", "3", "2")		\
 				: "=m" (_eflags), "=m" ((_dst).val),	\
 				  "=&r" (_tmp)				\
-				: "i" (EFLAGS_MASK) );			\
+				: "i" (EFLAGS_MASK));			\
 			break;						\
 		case 2:							\
 			__asm__ __volatile__ (				\
-				_PRE_EFLAGS("0","3","2")		\
+				_PRE_EFLAGS("0", "3", "2")		\
 				_op"w %1; "				\
-				_POST_EFLAGS("0","3","2")		\
+				_POST_EFLAGS("0", "3", "2")		\
 				: "=m" (_eflags), "=m" ((_dst).val),	\
 				  "=&r" (_tmp)				\
-				: "i" (EFLAGS_MASK) );			\
+				: "i" (EFLAGS_MASK));			\
 			break;						\
 		case 4:							\
 			__asm__ __volatile__ (				\
-				_PRE_EFLAGS("0","3","2")		\
+				_PRE_EFLAGS("0", "3", "2")		\
 				_op"l %1; "				\
-				_POST_EFLAGS("0","3","2")		\
+				_POST_EFLAGS("0", "3", "2")		\
 				: "=m" (_eflags), "=m" ((_dst).val),	\
 				  "=&r" (_tmp)				\
-				: "i" (EFLAGS_MASK) );			\
+				: "i" (EFLAGS_MASK));			\
 			break;						\
 		case 8:							\
 			__emulate_1op_8byte(_op, _dst, _eflags);	\
@@ -389,21 +387,21 @@ static u16 twobyte_table[256] = {
 #define __emulate_2op_8byte(_op, _src, _dst, _eflags, _qx, _qy)           \
 	do {								  \
 		__asm__ __volatile__ (					  \
-			_PRE_EFLAGS("0","4","2")			  \
+			_PRE_EFLAGS("0", "4", "2")			  \
 			_op"q %"_qx"3,%1; "				  \
-			_POST_EFLAGS("0","4","2")			  \
+			_POST_EFLAGS("0", "4", "2")			  \
 			: "=m" (_eflags), "=m" ((_dst).val), "=&r" (_tmp) \
-			: _qy ((_src).val), "i" (EFLAGS_MASK) );	  \
+			: _qy ((_src).val), "i" (EFLAGS_MASK));		\
 	} while (0)
 
 #define __emulate_1op_8byte(_op, _dst, _eflags)                           \
 	do {								  \
 		__asm__ __volatile__ (					  \
-			_PRE_EFLAGS("0","3","2")			  \
+			_PRE_EFLAGS("0", "3", "2")			  \
 			_op"q %1; "					  \
-			_POST_EFLAGS("0","3","2")			  \
+			_POST_EFLAGS("0", "3", "2")			  \
 			: "=m" (_eflags), "=m" ((_dst).val), "=&r" (_tmp) \
-			: "i" (EFLAGS_MASK) );				  \
+			: "i" (EFLAGS_MASK));				  \
 	} while (0)
 
 #elif defined(__i386__)
@@ -415,8 +413,8 @@ static u16 twobyte_table[256] = {
 #define insn_fetch(_type, _size, _eip)                                  \
 ({	unsigned long _x;						\
 	rc = ops->read_std((unsigned long)(_eip) + ctxt->cs_base, &_x,	\
-                                                  (_size), ctxt->vcpu); \
-	if ( rc != 0 )							\
+			   (_size), ctxt->vcpu);			\
+	if (rc != 0)							\
 		goto done;						\
 	(_eip) += (_size);						\
 	(_type)_x;							\
@@ -780,7 +778,7 @@ done_prefixes:
 		}
 		if (c->ad_bytes != 8)
 			c->modrm_ea = (u32)c->modrm_ea;
-	modrm_done:
+modrm_done:
 		;
 	}
 
@@ -828,10 +826,9 @@ done_prefixes:
 		c->src.bytes = (c->d & ByteOp) ? 1 :
 							   c->op_bytes;
 		/* Don't fetch the address for invlpg: it could be unmapped. */
-		if (c->twobyte && c->b == 0x01
-				    && c->modrm_reg == 7)
+		if (c->twobyte && c->b == 0x01 && c->modrm_reg == 7)
 			break;
-	      srcmem_common:
+	srcmem_common:
 		/*
 		 * For instructions with a ModR/M byte, switch to register
 		 * access if Mod = 3.
@@ -1175,10 +1172,11 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 	if (c->src.type == OP_MEM) {
 		c->src.ptr = (unsigned long *)cr2;
 		c->src.val = 0;
-		if ((rc = ops->read_emulated((unsigned long)c->src.ptr,
-					     &c->src.val,
-					     c->src.bytes,
-					     ctxt->vcpu)) != 0)
+		rc = ops->read_emulated((unsigned long)c->src.ptr,
+					&c->src.val,
+					c->src.bytes,
+					ctxt->vcpu);
+		if (rc != 0)
 			goto done;
 		c->src.orig_val = c->src.val;
 	}
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 27/50] KVM: Support assigning userspace memory to the guest
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (25 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 26/50] KVM: CodingStyle cleanup Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
       [not found]     ` <1198421495-31481-28-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  2007-12-23 14:51   ` [PATCH 28/50] KVM: Move x86 msr handling to new files x86.[ch] Avi Kivity
                     ` (22 subsequent siblings)
  49 siblings, 1 reply; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>

Instead of having the kernel allocate memory to the guest, let userspace
allocate it and pass the address to the kernel.

This is required for s390 support, but also enables features like memory
sharing and using hugetlbfs backed memory.
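
For illustration only, here is a minimal sketch of the userspace side:
allocate anonymous memory and describe it in a structure laid out like
the kvm_userspace_memory_region added below.  The names user_mem_region
and prepare_user_region are hypothetical and not part of this patch; the
sketch assumes a size that is a multiple of the page size and stops short
of issuing the ioctl, so it can be built without a KVM-capable host.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS on strict compilers */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical local mirror of struct kvm_userspace_memory_region. */
struct user_mem_region {
	uint32_t slot;
	uint32_t flags;
	uint64_t guest_phys_addr;
	uint64_t memory_size;    /* bytes */
	uint64_t userspace_addr; /* start of the userspace allocated memory */
};

/*
 * Allocate 'size' bytes of anonymous memory and fill in *r so that it
 * describes that memory as guest physical memory at 'guest_phys_addr'.
 * Returns 0 on success, -1 if the mapping fails.
 */
static int prepare_user_region(struct user_mem_region *r, uint32_t slot,
			       uint64_t guest_phys_addr, size_t size)
{
	void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (mem == MAP_FAILED)
		return -1;

	r->slot = slot;
	r->flags = 0;
	r->guest_phys_addr = guest_phys_addr;
	r->memory_size = size;
	r->userspace_addr = (uint64_t)(uintptr_t)mem;
	return 0;
}
```

A caller would then pass the filled-in structure to the VM fd with
KVM_SET_USER_MEMORY_REGION, after checking KVM_CAP_USER_MEMORY through
KVM_CHECK_EXTENSION.  Backing the region with hugetlbfs or a shared
mapping would only change how the memory is obtained, not how it is
described.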

Signed-off-by: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |    1 +
 drivers/kvm/kvm_main.c |   81 +++++++++++++++++++++++++++++++++++++++++------
 include/linux/kvm.h    |   12 +++++++
 3 files changed, 83 insertions(+), 11 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index ec5b498..3eaed4d 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -408,6 +408,7 @@ struct kvm_memory_slot {
 	struct page **phys_mem;
 	unsigned long *rmap;
 	unsigned long *dirty_bitmap;
+	int user_alloc; /* user allocated memory */
 };
 
 struct kvm {
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 47ffefb..9633fd3 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -40,6 +40,7 @@
 #include <linux/anon_inodes.h>
 #include <linux/profile.h>
 #include <linux/kvm_para.h>
+#include <linux/pagemap.h>
 
 #include <asm/processor.h>
 #include <asm/msr.h>
@@ -300,19 +301,40 @@ static struct kvm *kvm_create_vm(void)
 	return kvm;
 }
 
+static void kvm_free_userspace_physmem(struct kvm_memory_slot *free)
+{
+	int i;
+
+	for (i = 0; i < free->npages; ++i) {
+		if (free->phys_mem[i]) {
+			if (!PageReserved(free->phys_mem[i]))
+				SetPageDirty(free->phys_mem[i]);
+			page_cache_release(free->phys_mem[i]);
+		}
+	}
+}
+
+static void kvm_free_kernel_physmem(struct kvm_memory_slot *free)
+{
+	int i;
+
+	for (i = 0; i < free->npages; ++i)
+		if (free->phys_mem[i])
+			__free_page(free->phys_mem[i]);
+}
+
 /*
  * Free any memory in @free but not in @dont.
  */
 static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
 				  struct kvm_memory_slot *dont)
 {
-	int i;
-
 	if (!dont || free->phys_mem != dont->phys_mem)
 		if (free->phys_mem) {
-			for (i = 0; i < free->npages; ++i)
-				if (free->phys_mem[i])
-					__free_page(free->phys_mem[i]);
+			if (free->user_alloc)
+				kvm_free_userspace_physmem(free);
+			else
+				kvm_free_kernel_physmem(free);
 			vfree(free->phys_mem);
 		}
 	if (!dont || free->rmap != dont->rmap)
@@ -652,7 +674,9 @@ EXPORT_SYMBOL_GPL(fx_init);
  * Discontiguous memory is allowed, mostly for framebuffers.
  */
 static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
-					  struct kvm_memory_region *mem)
+					  struct
+					  kvm_userspace_memory_region *mem,
+					  int user_alloc)
 {
 	int r;
 	gfn_t base_gfn;
@@ -728,11 +752,27 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 
 		memset(new.phys_mem, 0, npages * sizeof(struct page *));
 		memset(new.rmap, 0, npages * sizeof(*new.rmap));
-		for (i = 0; i < npages; ++i) {
-			new.phys_mem[i] = alloc_page(GFP_HIGHUSER
-						     | __GFP_ZERO);
-			if (!new.phys_mem[i])
+		if (user_alloc) {
+			unsigned long pages_num;
+
+			new.user_alloc = 1;
+			down_read(&current->mm->mmap_sem);
+
+			pages_num = get_user_pages(current, current->mm,
+						   mem->userspace_addr,
+						   npages, 1, 0, new.phys_mem,
+						   NULL);
+
+			up_read(&current->mm->mmap_sem);
+			if (pages_num != npages)
 				goto out_unlock;
+		} else {
+			for (i = 0; i < npages; ++i) {
+				new.phys_mem[i] = alloc_page(GFP_HIGHUSER
+							     | __GFP_ZERO);
+				if (!new.phys_mem[i])
+					goto out_unlock;
+			}
 		}
 	}
 
@@ -3108,11 +3148,29 @@ static long kvm_vm_ioctl(struct file *filp,
 		break;
 	case KVM_SET_MEMORY_REGION: {
 		struct kvm_memory_region kvm_mem;
+		struct kvm_userspace_memory_region kvm_userspace_mem;
 
 		r = -EFAULT;
 		if (copy_from_user(&kvm_mem, argp, sizeof kvm_mem))
 			goto out;
-		r = kvm_vm_ioctl_set_memory_region(kvm, &kvm_mem);
+		kvm_userspace_mem.slot = kvm_mem.slot;
+		kvm_userspace_mem.flags = kvm_mem.flags;
+		kvm_userspace_mem.guest_phys_addr = kvm_mem.guest_phys_addr;
+		kvm_userspace_mem.memory_size = kvm_mem.memory_size;
+		r = kvm_vm_ioctl_set_memory_region(kvm, &kvm_userspace_mem, 0);
+		if (r)
+			goto out;
+		break;
+	}
+	case KVM_SET_USER_MEMORY_REGION: {
+		struct kvm_userspace_memory_region kvm_userspace_mem;
+
+		r = -EFAULT;
+		if (copy_from_user(&kvm_userspace_mem, argp,
+						sizeof kvm_userspace_mem))
+			goto out;
+
+		r = kvm_vm_ioctl_set_memory_region(kvm, &kvm_userspace_mem, 1);
 		if (r)
 			goto out;
 		break;
@@ -3332,6 +3390,7 @@ static long kvm_dev_ioctl(struct file *filp,
 		case KVM_CAP_IRQCHIP:
 		case KVM_CAP_HLT:
 		case KVM_CAP_MMU_SHADOW_CACHE_CONTROL:
+		case KVM_CAP_USER_MEMORY:
 			r = 1;
 			break;
 		default:
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index d2fd973..971f465 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -23,6 +23,15 @@ struct kvm_memory_region {
 	__u64 memory_size; /* bytes */
 };
 
+/* for KVM_SET_USER_MEMORY_REGION */
+struct kvm_userspace_memory_region {
+	__u32 slot;
+	__u32 flags;
+	__u64 guest_phys_addr;
+	__u64 memory_size; /* bytes */
+	__u64 userspace_addr; /* start of the userspace allocated memory */
+};
+
 /* for kvm_memory_region::flags */
 #define KVM_MEM_LOG_DIRTY_PAGES  1UL
 
@@ -348,6 +357,7 @@ struct kvm_signal_mask {
 #define KVM_CAP_IRQCHIP	  0
 #define KVM_CAP_HLT	  1
 #define KVM_CAP_MMU_SHADOW_CACHE_CONTROL 2
+#define KVM_CAP_USER_MEMORY 3
 
 /*
  * ioctls for VM fds
@@ -355,6 +365,8 @@ struct kvm_signal_mask {
 #define KVM_SET_MEMORY_REGION     _IOW(KVMIO, 0x40, struct kvm_memory_region)
 #define KVM_SET_NR_MMU_PAGES      _IO(KVMIO, 0x44)
 #define KVM_GET_NR_MMU_PAGES      _IO(KVMIO, 0x45)
+#define KVM_SET_USER_MEMORY_REGION _IOW(KVMIO, 0x46,\
+					struct kvm_userspace_memory_region)
 /*
  * KVM_CREATE_VCPU receives as a parameter the vcpu slot, and returns
  * a vcpu fd.
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 28/50] KVM: Move x86 msr handling to new files x86.[ch]
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (26 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 27/50] KVM: Support assigning userspace memory to the guest Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 29/50] KVM: MMU: Clean up MMU functions to take struct kvm when appropriate Avi Kivity
                     ` (21 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Carsten Otte

From: Carsten Otte <cotte-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>

Signed-off-by: Carsten Otte <cotte-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/Makefile   |    2 +-
 drivers/kvm/kvm.h      |    4 ++
 drivers/kvm/kvm_main.c |   69 +-------------------------------
 drivers/kvm/x86.c      |  102 ++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/kvm/x86.h      |   16 +++++++
 5 files changed, 126 insertions(+), 67 deletions(-)
 create mode 100644 drivers/kvm/x86.c
 create mode 100644 drivers/kvm/x86.h

diff --git a/drivers/kvm/Makefile b/drivers/kvm/Makefile
index e5a8f4d..cf18ad4 100644
--- a/drivers/kvm/Makefile
+++ b/drivers/kvm/Makefile
@@ -2,7 +2,7 @@
 # Makefile for Kernel-based Virtual Machine module
 #
 
-kvm-objs := kvm_main.o mmu.o x86_emulate.o i8259.o irq.o lapic.o ioapic.o
+kvm-objs := kvm_main.o x86.o mmu.o x86_emulate.o i8259.o irq.o lapic.o ioapic.o
 obj-$(CONFIG_KVM) += kvm.o
 kvm-intel-objs = vmx.o
 obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 3eaed4d..9c9c1d7 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -653,6 +653,10 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
 
 int kvm_fix_hypercall(struct kvm_vcpu *vcpu);
 
+long kvm_arch_dev_ioctl(struct file *filp,
+			unsigned int ioctl, unsigned long arg);
+__init void kvm_arch_init(void);
+
 static inline void kvm_guest_enter(void)
 {
 	current->flags |= PF_VCPU;
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 9633fd3..9f7370f 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -16,6 +16,7 @@
  */
 
 #include "kvm.h"
+#include "x86.h"
 #include "x86_emulate.h"
 #include "segment_descriptor.h"
 #include "irq.h"
@@ -2508,43 +2509,6 @@ void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
 EXPORT_SYMBOL_GPL(kvm_get_cs_db_l_bits);
 
 /*
- * List of msr numbers which we expose to userspace through KVM_GET_MSRS
- * and KVM_SET_MSRS, and KVM_GET_MSR_INDEX_LIST.
- *
- * This list is modified at module load time to reflect the
- * capabilities of the host cpu.
- */
-static u32 msrs_to_save[] = {
-	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
-	MSR_K6_STAR,
-#ifdef CONFIG_X86_64
-	MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
-#endif
-	MSR_IA32_TIME_STAMP_COUNTER,
-};
-
-static unsigned num_msrs_to_save;
-
-static u32 emulated_msrs[] = {
-	MSR_IA32_MISC_ENABLE,
-};
-
-static __init void kvm_init_msr_list(void)
-{
-	u32 dummy[2];
-	unsigned i, j;
-
-	for (i = j = 0; i < ARRAY_SIZE(msrs_to_save); i++) {
-		if (rdmsr_safe(msrs_to_save[i], &dummy[0], &dummy[1]) < 0)
-			continue;
-		if (j < i)
-			msrs_to_save[j] = msrs_to_save[i];
-		j++;
-	}
-	num_msrs_to_save = j;
-}
-
-/*
  * Adapt set_msr() to msr_io()'s calling convention
  */
 static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
@@ -3356,33 +3320,6 @@ static long kvm_dev_ioctl(struct file *filp,
 			goto out;
 		r = kvm_dev_ioctl_create_vm();
 		break;
-	case KVM_GET_MSR_INDEX_LIST: {
-		struct kvm_msr_list __user *user_msr_list = argp;
-		struct kvm_msr_list msr_list;
-		unsigned n;
-
-		r = -EFAULT;
-		if (copy_from_user(&msr_list, user_msr_list, sizeof msr_list))
-			goto out;
-		n = msr_list.nmsrs;
-		msr_list.nmsrs = num_msrs_to_save + ARRAY_SIZE(emulated_msrs);
-		if (copy_to_user(user_msr_list, &msr_list, sizeof msr_list))
-			goto out;
-		r = -E2BIG;
-		if (n < num_msrs_to_save)
-			goto out;
-		r = -EFAULT;
-		if (copy_to_user(user_msr_list->indices, &msrs_to_save,
-				 num_msrs_to_save * sizeof(u32)))
-			goto out;
-		if (copy_to_user(user_msr_list->indices
-				 + num_msrs_to_save * sizeof(u32),
-				 &emulated_msrs,
-				 ARRAY_SIZE(emulated_msrs) * sizeof(u32)))
-			goto out;
-		r = 0;
-		break;
-	}
 	case KVM_CHECK_EXTENSION: {
 		int ext = (long)argp;
 
@@ -3406,7 +3343,7 @@ static long kvm_dev_ioctl(struct file *filp,
 		r = 2 * PAGE_SIZE;
 		break;
 	default:
-		;
+		return kvm_arch_dev_ioctl(filp, ioctl, arg);
 	}
 out:
 	return r;
@@ -3770,7 +3707,7 @@ static __init int kvm_init(void)
 
 	kvm_init_debug();
 
-	kvm_init_msr_list();
+	kvm_arch_init();
 
 	bad_page = alloc_page(GFP_KERNEL);
 
diff --git a/drivers/kvm/x86.c b/drivers/kvm/x86.c
new file mode 100644
index 0000000..437902c
--- /dev/null
+++ b/drivers/kvm/x86.c
@@ -0,0 +1,102 @@
+/*
+ * Kernel-based Virtual Machine driver for Linux
+ *
+ * derived from drivers/kvm/kvm_main.c
+ *
+ * Copyright (C) 2006 Qumranet, Inc.
+ *
+ * Authors:
+ *   Avi Kivity   <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
+ *   Yaniv Kamay  <yaniv-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "x86.h"
+
+#include <asm/uaccess.h>
+
+/*
+ * List of msr numbers which we expose to userspace through KVM_GET_MSRS
+ * and KVM_SET_MSRS, and KVM_GET_MSR_INDEX_LIST.
+ *
+ * This list is modified at module load time to reflect the
+ * capabilities of the host cpu.
+ */
+static u32 msrs_to_save[] = {
+	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
+	MSR_K6_STAR,
+#ifdef CONFIG_X86_64
+	MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
+#endif
+	MSR_IA32_TIME_STAMP_COUNTER,
+};
+
+static unsigned num_msrs_to_save;
+
+static u32 emulated_msrs[] = {
+	MSR_IA32_MISC_ENABLE,
+};
+
+long kvm_arch_dev_ioctl(struct file *filp,
+			unsigned int ioctl, unsigned long arg)
+{
+	void __user *argp = (void __user *)arg;
+	long r;
+
+	switch (ioctl) {
+	case KVM_GET_MSR_INDEX_LIST: {
+		struct kvm_msr_list __user *user_msr_list = argp;
+		struct kvm_msr_list msr_list;
+		unsigned n;
+
+		r = -EFAULT;
+		if (copy_from_user(&msr_list, user_msr_list, sizeof msr_list))
+			goto out;
+		n = msr_list.nmsrs;
+		msr_list.nmsrs = num_msrs_to_save + ARRAY_SIZE(emulated_msrs);
+		if (copy_to_user(user_msr_list, &msr_list, sizeof msr_list))
+			goto out;
+		r = -E2BIG;
+		if (n < num_msrs_to_save)
+			goto out;
+		r = -EFAULT;
+		if (copy_to_user(user_msr_list->indices, &msrs_to_save,
+				 num_msrs_to_save * sizeof(u32)))
+			goto out;
+		if (copy_to_user(user_msr_list->indices
+				 + num_msrs_to_save * sizeof(u32),
+				 &emulated_msrs,
+				 ARRAY_SIZE(emulated_msrs) * sizeof(u32)))
+			goto out;
+		r = 0;
+		break;
+	}
+	default:
+		r = -EINVAL;
+	}
+out:
+	return r;
+}
+
+static __init void kvm_init_msr_list(void)
+{
+	u32 dummy[2];
+	unsigned i, j;
+
+	for (i = j = 0; i < ARRAY_SIZE(msrs_to_save); i++) {
+		if (rdmsr_safe(msrs_to_save[i], &dummy[0], &dummy[1]) < 0)
+			continue;
+		if (j < i)
+			msrs_to_save[j] = msrs_to_save[i];
+		j++;
+	}
+	num_msrs_to_save = j;
+}
+
+__init void kvm_arch_init(void)
+{
+	kvm_init_msr_list();
+}
diff --git a/drivers/kvm/x86.h b/drivers/kvm/x86.h
new file mode 100644
index 0000000..1e2f71b
--- /dev/null
+++ b/drivers/kvm/x86.h
@@ -0,0 +1,16 @@
+#/*
+ * Kernel-based Virtual Machine driver for Linux
+ *
+ * This header defines architecture specific interfaces, x86 version
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef KVM_X86_H
+#define KVM_X86_H
+
+#include "kvm.h"
+
+#endif
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 29/50] KVM: MMU: Clean up MMU functions to take struct kvm when appropriate
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (27 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 28/50] KVM: Move x86 msr handling to new files x86.[ch] Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 30/50] KVM: MMU: More struct kvm_vcpu -> struct kvm cleanups Avi Kivity
                     ` (20 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Anthony Liguori <aliguori-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>

Some of the MMU functions take a struct kvm_vcpu even though they affect all
VCPUs.  This patch changes some of them to take a struct kvm instead, which
makes their scope clearer.

The main thing that was confusing me was whether certain functions need to be
called on all VCPUs.

Signed-off-by: Anthony Liguori <aliguori-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
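Illustrative note, not part of the patch: a minimal userspace sketch of why a
VM-wide argument fits these helpers. The shadow page hash lives in the VM, so
a lookup through any VCPU of that VM reaches the same table. All names below
(struct ex_vm, struct ex_vcpu, ex_add_page, ex_lookup_page, EX_NUM_MMU_PAGES)
are hypothetical stand-ins, and the plain modulo hash is an assumption; this
is not the kernel code.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_NUM_MMU_PAGES 8	/* hypothetical bucket count */

typedef uint64_t gfn_t;

struct ex_mmu_page {
	gfn_t gfn;
	struct ex_mmu_page *next;	/* hash chain */
};

/* Plays the role of struct kvm: owns the per-VM page hash. */
struct ex_vm {
	struct ex_mmu_page *mmu_page_hash[EX_NUM_MMU_PAGES];
};

/* Plays the role of struct kvm_vcpu: only points back at its VM. */
struct ex_vcpu {
	struct ex_vm *vm;
};

/* Insert a page at the head of its hash bucket. */
static void ex_add_page(struct ex_vm *vm, struct ex_mmu_page *page)
{
	size_t index = (size_t)(page->gfn % EX_NUM_MMU_PAGES);

	page->next = vm->mmu_page_hash[index];
	vm->mmu_page_hash[index] = page;
}

/* The lookup needs only the VM; no per-VCPU state is consulted. */
static struct ex_mmu_page *ex_lookup_page(const struct ex_vm *vm, gfn_t gfn)
{
	struct ex_mmu_page *page;

	for (page = vm->mmu_page_hash[gfn % EX_NUM_MMU_PAGES]; page;
	     page = page->next)
		if (page->gfn == gfn)
			return page;
	return NULL;
}
```

A caller that only has a VCPU passes its VM pointer, which mirrors the
vcpu->kvm call sites in the diff below.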
 drivers/kvm/mmu.c         |   18 +++++++++---------
 drivers/kvm/paging_tmpl.h |    4 ++--
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index ece0aa4..a5ca945 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -606,7 +606,7 @@ static void mmu_page_remove_parent_pte(struct kvm_mmu_page *page,
 	BUG();
 }
 
-static struct kvm_mmu_page *kvm_mmu_lookup_page(struct kvm_vcpu *vcpu,
+static struct kvm_mmu_page *kvm_mmu_lookup_page(struct kvm *kvm,
 						gfn_t gfn)
 {
 	unsigned index;
@@ -616,7 +616,7 @@ static struct kvm_mmu_page *kvm_mmu_lookup_page(struct kvm_vcpu *vcpu,
 
 	pgprintk("%s: looking for gfn %lx\n", __FUNCTION__, gfn);
 	index = kvm_page_table_hashfn(gfn) % KVM_NUM_MMU_PAGES;
-	bucket = &vcpu->kvm->mmu_page_hash[index];
+	bucket = &kvm->mmu_page_hash[index];
 	hlist_for_each_entry(page, node, bucket, hash_link)
 		if (page->gfn == gfn && !page->role.metaphysical) {
 			pgprintk("%s: found role %x\n",
@@ -782,7 +782,7 @@ void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages)
 	kvm->n_alloc_mmu_pages = kvm_nr_mmu_pages;
 }
 
-static int kvm_mmu_unprotect_page(struct kvm_vcpu *vcpu, gfn_t gfn)
+static int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
 {
 	unsigned index;
 	struct hlist_head *bucket;
@@ -793,25 +793,25 @@ static int kvm_mmu_unprotect_page(struct kvm_vcpu *vcpu, gfn_t gfn)
 	pgprintk("%s: looking for gfn %lx\n", __FUNCTION__, gfn);
 	r = 0;
 	index = kvm_page_table_hashfn(gfn) % KVM_NUM_MMU_PAGES;
-	bucket = &vcpu->kvm->mmu_page_hash[index];
+	bucket = &kvm->mmu_page_hash[index];
 	hlist_for_each_entry_safe(page, node, n, bucket, hash_link)
 		if (page->gfn == gfn && !page->role.metaphysical) {
 			pgprintk("%s: gfn %lx role %x\n", __FUNCTION__, gfn,
 				 page->role.word);
-			kvm_mmu_zap_page(vcpu->kvm, page);
+			kvm_mmu_zap_page(kvm, page);
 			r = 1;
 		}
 	return r;
 }
 
-static void mmu_unshadow(struct kvm_vcpu *vcpu, gfn_t gfn)
+static void mmu_unshadow(struct kvm *kvm, gfn_t gfn)
 {
 	struct kvm_mmu_page *page;
 
-	while ((page = kvm_mmu_lookup_page(vcpu, gfn)) != NULL) {
+	while ((page = kvm_mmu_lookup_page(kvm, gfn)) != NULL) {
 		pgprintk("%s: zap %lx %x\n",
 			 __FUNCTION__, gfn, page->role.word);
-		kvm_mmu_zap_page(vcpu->kvm, page);
+		kvm_mmu_zap_page(kvm, page);
 	}
 }
 
@@ -1299,7 +1299,7 @@ int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva)
 {
 	gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, gva);
 
-	return kvm_mmu_unprotect_page(vcpu, gpa >> PAGE_SHIFT);
+	return kvm_mmu_unprotect_page(vcpu->kvm, gpa >> PAGE_SHIFT);
 }
 
 void __kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 447d2c3..4f6edf8 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -268,11 +268,11 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 
 		spte |= PT_WRITABLE_MASK;
 		if (user_fault) {
-			mmu_unshadow(vcpu, gfn);
+			mmu_unshadow(vcpu->kvm, gfn);
 			goto unshadowed;
 		}
 
-		shadow = kvm_mmu_lookup_page(vcpu, gfn);
+		shadow = kvm_mmu_lookup_page(vcpu->kvm, gfn);
 		if (shadow) {
 			pgprintk("%s: found shadow page for %lx, marking ro\n",
 				 __FUNCTION__, gfn);
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 30/50] KVM: MMU: More struct kvm_vcpu -> struct kvm cleanups
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (28 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 29/50] KVM: MMU: Clean up MMU functions to take struct kvm when appropriate Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 31/50] KVM: Move guest pte dirty bit management to the guest pagetable walker Avi Kivity
                     ` (19 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Anthony Liguori <aliguori-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>

This time, the biggest change is gpa_to_hpa. Unlike GVA to GPA, the translation
of GPA to HPA does not depend on the VCPU state, so there's no need to pass in
the kvm_vcpu.

Signed-off-by: Anthony Liguori <aliguori-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
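Illustrative note, not part of the patch: a userspace sketch of a GPA to HPA
translation that reads only per-VM data, with failure reported in the top bit
of the result as HPA_ERR_MASK does in kvm.h. The single memory slot, the
struct layouts and every ex_/EX_ name are assumptions made for the example;
the real function goes through gfn_to_page().

```c
#include <stdint.h>

typedef uint64_t gpa_t;
typedef uint64_t hpa_t;

#define EX_PAGE_SHIFT	12
#define EX_PAGE_SIZE	((uint64_t)1 << EX_PAGE_SHIFT)
#define EX_HPA_MSB	((sizeof(hpa_t) * 8) - 1)
#define EX_HPA_ERR_MASK	((hpa_t)1 << EX_HPA_MSB)

/* Hypothetical memory slot: a run of guest frames backed by host frames. */
struct ex_memslot {
	uint64_t base_gfn;
	uint64_t npages;
	uint64_t host_base_pfn;
};

/* Plays the role of struct kvm; one slot is enough for the sketch. */
struct ex_vm {
	struct ex_memslot slot;
};

static int ex_is_error_hpa(hpa_t hpa)
{
	return (int)(hpa >> EX_HPA_MSB);
}

/* Only the VM's memory map is consulted, never any VCPU register state. */
static hpa_t ex_gpa_to_hpa(const struct ex_vm *vm, gpa_t gpa)
{
	const struct ex_memslot *s = &vm->slot;
	uint64_t gfn = gpa >> EX_PAGE_SHIFT;

	if (gfn < s->base_gfn || gfn >= s->base_gfn + s->npages)
		return gpa | EX_HPA_ERR_MASK;	/* unmapped: flag the error */
	return ((s->host_base_pfn + (gfn - s->base_gfn)) << EX_PAGE_SHIFT)
		| (gpa & (EX_PAGE_SIZE - 1));
}
```

With a slot mapping guest frames 0x100..0x103 to host frames starting at
0x8000, guest address 0x101234 lands in host frame 0x8001 at offset 0x234,
while an address outside the slot comes back with the error bit set.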
 drivers/kvm/kvm.h         |    2 +-
 drivers/kvm/mmu.c         |   26 +++++++++++++-------------
 drivers/kvm/paging_tmpl.h |    6 +++---
 3 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 9c9c1d7..d56962d 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -554,7 +554,7 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
 void kvm_mmu_zap_all(struct kvm *kvm);
 void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
 
-hpa_t gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa);
+hpa_t gpa_to_hpa(struct kvm *kvm, gpa_t gpa);
 #define HPA_MSB ((sizeof(hpa_t) * 8) - 1)
 #define HPA_ERR_MASK ((hpa_t)1 << HPA_MSB)
 static inline int is_error_hpa(hpa_t hpa) { return hpa >> HPA_MSB; }
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index a5ca945..d046ba8 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -451,14 +451,14 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
 	}
 }
 
-static void rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)
+static void rmap_write_protect(struct kvm *kvm, u64 gfn)
 {
 	struct kvm_rmap_desc *desc;
 	unsigned long *rmapp;
 	u64 *spte;
 
-	gfn = unalias_gfn(vcpu->kvm, gfn);
-	rmapp = gfn_to_rmap(vcpu->kvm, gfn);
+	gfn = unalias_gfn(kvm, gfn);
+	rmapp = gfn_to_rmap(kvm, gfn);
 
 	while (*rmapp) {
 		if (!(*rmapp & 1))
@@ -471,9 +471,9 @@ static void rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)
 		BUG_ON(!(*spte & PT_PRESENT_MASK));
 		BUG_ON(!(*spte & PT_WRITABLE_MASK));
 		rmap_printk("rmap_write_protect: spte %p %llx\n", spte, *spte);
-		rmap_remove(vcpu->kvm, spte);
+		rmap_remove(kvm, spte);
 		set_shadow_pte(spte, *spte & ~PT_WRITABLE_MASK);
-		kvm_flush_remote_tlbs(vcpu->kvm);
+		kvm_flush_remote_tlbs(kvm);
 	}
 }
 
@@ -670,7 +670,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 	hlist_add_head(&page->hash_link, bucket);
 	vcpu->mmu.prefetch_page(vcpu, page);
 	if (!metaphysical)
-		rmap_write_protect(vcpu, gfn);
+		rmap_write_protect(vcpu->kvm, gfn);
 	return page;
 }
 
@@ -823,19 +823,19 @@ static void page_header_update_slot(struct kvm *kvm, void *pte, gpa_t gpa)
 	__set_bit(slot, &page_head->slot_bitmap);
 }
 
-hpa_t safe_gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa)
+hpa_t safe_gpa_to_hpa(struct kvm *kvm, gpa_t gpa)
 {
-	hpa_t hpa = gpa_to_hpa(vcpu, gpa);
+	hpa_t hpa = gpa_to_hpa(kvm, gpa);
 
 	return is_error_hpa(hpa) ? bad_page_address | (gpa & ~PAGE_MASK): hpa;
 }
 
-hpa_t gpa_to_hpa(struct kvm_vcpu *vcpu, gpa_t gpa)
+hpa_t gpa_to_hpa(struct kvm *kvm, gpa_t gpa)
 {
 	struct page *page;
 
 	ASSERT((gpa & HPA_ERR_MASK) == 0);
-	page = gfn_to_page(vcpu->kvm, gpa >> PAGE_SHIFT);
+	page = gfn_to_page(kvm, gpa >> PAGE_SHIFT);
 	if (!page)
 		return gpa | HPA_ERR_MASK;
 	return ((hpa_t)page_to_pfn(page) << PAGE_SHIFT)
@@ -848,7 +848,7 @@ hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva)
 
 	if (gpa == UNMAPPED_GVA)
 		return UNMAPPED_GVA;
-	return gpa_to_hpa(vcpu, gpa);
+	return gpa_to_hpa(vcpu->kvm, gpa);
 }
 
 struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva)
@@ -857,7 +857,7 @@ struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva)
 
 	if (gpa == UNMAPPED_GVA)
 		return NULL;
-	return pfn_to_page(gpa_to_hpa(vcpu, gpa) >> PAGE_SHIFT);
+	return pfn_to_page(gpa_to_hpa(vcpu->kvm, gpa) >> PAGE_SHIFT);
 }
 
 static void nonpaging_new_cr3(struct kvm_vcpu *vcpu)
@@ -1012,7 +1012,7 @@ static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
 	ASSERT(VALID_PAGE(vcpu->mmu.root_hpa));
 
 
-	paddr = gpa_to_hpa(vcpu , addr & PT64_BASE_ADDR_MASK);
+	paddr = gpa_to_hpa(vcpu->kvm, addr & PT64_BASE_ADDR_MASK);
 
 	if (is_error_hpa(paddr))
 		return 1;
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 4f6edf8..8e1e4ca 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -103,7 +103,7 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 	pgprintk("%s: table_gfn[%d] %lx\n", __FUNCTION__,
 		 walker->level - 1, table_gfn);
 	slot = gfn_to_memslot(vcpu->kvm, table_gfn);
-	hpa = safe_gpa_to_hpa(vcpu, root & PT64_BASE_ADDR_MASK);
+	hpa = safe_gpa_to_hpa(vcpu->kvm, root & PT64_BASE_ADDR_MASK);
 	walker->page = pfn_to_page(hpa >> PAGE_SHIFT);
 	walker->table = kmap_atomic(walker->page, KM_USER0);
 
@@ -159,7 +159,7 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 		walker->inherited_ar &= walker->table[index];
 		table_gfn = (*ptep & PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
 		kunmap_atomic(walker->table, KM_USER0);
-		paddr = safe_gpa_to_hpa(vcpu, table_gfn << PAGE_SHIFT);
+		paddr = safe_gpa_to_hpa(vcpu->kvm, table_gfn << PAGE_SHIFT);
 		walker->page = pfn_to_page(paddr >> PAGE_SHIFT);
 		walker->table = kmap_atomic(walker->page, KM_USER0);
 		--walker->level;
@@ -248,7 +248,7 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 	if (!dirty)
 		access_bits &= ~PT_WRITABLE_MASK;
 
-	paddr = gpa_to_hpa(vcpu, gaddr & PT64_BASE_ADDR_MASK);
+	paddr = gpa_to_hpa(vcpu->kvm, gaddr & PT64_BASE_ADDR_MASK);
 
 	spte |= PT_PRESENT_MASK;
 	if (access_bits & PT_USER_MASK)
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 31/50] KVM: Move guest pte dirty bit management to the guest pagetable walker
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (29 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 30/50] KVM: MMU: More struct kvm_vcpu -> struct kvm cleanups Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 32/50] KVM: MMU: Fix nx access bit for huge pages Avi Kivity
                     ` (18 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

This is more consistent with the accessed bit management, and makes the dirty
bit available earlier for other purposes.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
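Illustrative note, not part of the patch: a userspace sketch of the rule the
walker now applies, namely that a guest pte is marked dirty only on a write
fault and only if it is not already dirty. The dirty bit position (bit 6)
follows the x86 page table entry layout; the function name
ex_walker_update_dirty and its return value are hypothetical, and the
mark_page_dirty() bookkeeping is left out.

```c
#include <stdint.h>

typedef uint64_t pt_element_t;

#define EX_PT_DIRTY_MASK ((pt_element_t)1 << 6)	/* x86 dirty bit */

static int ex_is_dirty_pte(pt_element_t pte)
{
	return (pte & EX_PT_DIRTY_MASK) != 0;
}

/*
 * Set the dirty bit while walking the guest page table.
 * Returns 1 if the pte was modified, 0 if it was left alone.
 */
static int ex_walker_update_dirty(pt_element_t *ptep, int write_fault)
{
	if (write_fault && !ex_is_dirty_pte(*ptep)) {
		*ptep |= EX_PT_DIRTY_MASK;
		return 1;
	}
	return 0;
}
```

A read fault leaves the pte untouched, the first write fault sets bit 6, and
later write faults on the same pte change nothing.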
 drivers/kvm/mmu.c         |    5 +++++
 drivers/kvm/paging_tmpl.h |   31 ++++++++-----------------------
 2 files changed, 13 insertions(+), 23 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index d046ba8..e6616a6 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -199,6 +199,11 @@ static int is_writeble_pte(unsigned long pte)
 	return pte & PT_WRITABLE_MASK;
 }
 
+static int is_dirty_pte(unsigned long pte)
+{
+	return pte & PT_DIRTY_MASK;
+}
+
 static int is_io_pte(unsigned long pte)
 {
 	return pte & PT_SHADOW_IO_MARK;
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 8e1e4ca..da36e48 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -144,6 +144,10 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 		if (walker->level == PT_PAGE_TABLE_LEVEL) {
 			walker->gfn = (*ptep & PT_BASE_ADDR_MASK)
 				>> PAGE_SHIFT;
+			if (write_fault && !is_dirty_pte(*ptep)) {
+				mark_page_dirty(vcpu->kvm, table_gfn);
+				*ptep |= PT_DIRTY_MASK;
+			}
 			break;
 		}
 
@@ -153,6 +157,10 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 			walker->gfn = (*ptep & PT_DIR_BASE_ADDR_MASK)
 				>> PAGE_SHIFT;
 			walker->gfn += PT_INDEX(addr, PT_PAGE_TABLE_LEVEL);
+			if (write_fault && !is_dirty_pte(*ptep)) {
+				mark_page_dirty(vcpu->kvm, table_gfn);
+				*ptep |= PT_DIRTY_MASK;
+			}
 			break;
 		}
 
@@ -194,12 +202,6 @@ err:
 	return 0;
 }
 
-static void FNAME(mark_pagetable_dirty)(struct kvm *kvm,
-					struct guest_walker *walker)
-{
-	mark_page_dirty(kvm, walker->table_gfn[walker->level - 1]);
-}
-
 static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 				  u64 *shadow_pte,
 				  gpa_t gaddr,
@@ -221,23 +223,6 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 		 __FUNCTION__, *shadow_pte, (u64)gpte, access_bits,
 		 write_fault, user_fault, gfn);
 
-	if (write_fault && !dirty) {
-		pt_element_t *guest_ent, *tmp = NULL;
-
-		if (walker->ptep)
-			guest_ent = walker->ptep;
-		else {
-			tmp = kmap_atomic(walker->page, KM_USER0);
-			guest_ent = &tmp[walker->index];
-		}
-
-		*guest_ent |= PT_DIRTY_MASK;
-		if (!walker->ptep)
-			kunmap_atomic(tmp, KM_USER0);
-		dirty = 1;
-		FNAME(mark_pagetable_dirty)(vcpu->kvm, walker);
-	}
-
 	/*
 	 * We don't set the accessed bit, since we sometimes want to see
 	 * whether the guest actually used the pte (in order to detect
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 32/50] KVM: MMU: Fix nx access bit for huge pages
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (30 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 31/50] KVM: Move guest pte dirty bit management to the guest pagetable walker Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 33/50] KVM: MMU: Disable write access on clean large pages Avi Kivity
                     ` (17 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

We must set the bit before the shift, otherwise the wrong bit gets set.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
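Illustrative note, not part of the patch: a small userspace sketch that
computes the access value in both orders so the misplaced bit can be seen.
The mask values are assumptions taken from the x86 pte layout (writable is
bit 1, user is bit 2, NX is bit 63); the two function names are hypothetical.

```c
#include <stdint.h>

#define EX_PT_WRITABLE_SHIFT	1
#define EX_PT_WRITABLE_MASK	((uint64_t)1 << 1)
#define EX_PT_USER_MASK		((uint64_t)1 << 2)
#define EX_PT64_NX_MASK		((uint64_t)1 << 63)

/* Old order: the nx marker is ORed in first and then shifted down with the rest. */
static unsigned ex_access_nx_then_shift(uint64_t pte)
{
	uint64_t access = pte & (EX_PT_USER_MASK | EX_PT_WRITABLE_MASK);

	if (pte & EX_PT64_NX_MASK)
		access |= (1 << 2);
	access >>= EX_PT_WRITABLE_SHIFT;
	return (unsigned)access;
}

/* New order: shift first, so bit 2 of the result is free to carry nx. */
static unsigned ex_access_shift_then_nx(uint64_t pte)
{
	uint64_t access = pte & (EX_PT_USER_MASK | EX_PT_WRITABLE_MASK);

	access >>= EX_PT_WRITABLE_SHIFT;
	if (pte & EX_PT64_NX_MASK)
		access |= (1 << 2);
	return (unsigned)access;
}
```

For a supervisor-only, writable, no-execute pte the old order yields 3, which
reads as "writable and user" with no nx marker; the new order yields 5, which
is "writable" plus nx in bit 2.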
 drivers/kvm/paging_tmpl.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index da36e48..e07cb2e 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -382,9 +382,9 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 			metaphysical = 1;
 			hugepage_access = walker->pte;
 			hugepage_access &= PT_USER_MASK | PT_WRITABLE_MASK;
+			hugepage_access >>= PT_WRITABLE_SHIFT;
 			if (walker->pte & PT64_NX_MASK)
 				hugepage_access |= (1 << 2);
-			hugepage_access >>= PT_WRITABLE_SHIFT;
 			table_gfn = (walker->pte & PT_BASE_ADDR_MASK)
 				>> PAGE_SHIFT;
 		} else {
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 33/50] KVM: MMU: Disable write access on clean large pages
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (31 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 32/50] KVM: MMU: Fix nx access bit for huge pages Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 34/50] KVM: MMU: Instantiate real-mode shadows as user writable shadows Avi Kivity
                     ` (16 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

By forcing clean huge pages to be read-only, we have separate roles
for the shadow of a clean large page and the shadow of a dirty large
page.  This is necessary because different ptes will be instantiated
for the two cases, even for read faults.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
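Illustrative note, not part of the patch: a userspace sketch showing that a
clean and a dirty large page with otherwise identical permissions now produce
different access values, and therefore different shadow page roles. Bit
positions are assumptions from the x86 pte layout (writable bit 1, user bit 2,
dirty bit 6, NX bit 63); ex_hugepage_access is a hypothetical name.

```c
#include <stdint.h>

#define EX_PT_WRITABLE_SHIFT	1
#define EX_PT_WRITABLE_MASK	((uint64_t)1 << 1)
#define EX_PT_USER_MASK		((uint64_t)1 << 2)
#define EX_PT_DIRTY_MASK	((uint64_t)1 << 6)
#define EX_PT64_NX_MASK		((uint64_t)1 << 63)

/* Access bits for the shadow of a large page: bit 0 writable, bit 1 user, bit 2 nx. */
static unsigned ex_hugepage_access(uint64_t pte)
{
	uint64_t access = pte & (EX_PT_USER_MASK | EX_PT_WRITABLE_MASK);

	/* A clean large page is shadowed read-only until the guest dirties it. */
	if (!(pte & EX_PT_DIRTY_MASK))
		access &= ~EX_PT_WRITABLE_MASK;
	access >>= EX_PT_WRITABLE_SHIFT;
	if (pte & EX_PT64_NX_MASK)
		access |= (1 << 2);
	return (unsigned)access;
}
```

A user-writable large pte gives access 2 while clean and 3 once its dirty bit
is set, so the two states no longer share a shadow page.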
 drivers/kvm/paging_tmpl.h |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index e07cb2e..4538b15 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -382,6 +382,8 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 			metaphysical = 1;
 			hugepage_access = walker->pte;
 			hugepage_access &= PT_USER_MASK | PT_WRITABLE_MASK;
+			if (!is_dirty_pte(walker->pte))
+				hugepage_access &= ~PT_WRITABLE_MASK;
 			hugepage_access >>= PT_WRITABLE_SHIFT;
 			if (walker->pte & PT64_NX_MASK)
 				hugepage_access |= (1 << 2);
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 34/50] KVM: MMU: Instantiate real-mode shadows as user writable shadows
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (32 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 33/50] KVM: MMU: Disable write access on clean large pages Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 35/50] KVM: MMU: Move dirty bit updates to a separate function Avi Kivity
                     ` (15 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

This is consistent with real-mode permissions.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/mmu.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index e6616a6..f52604a 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -902,7 +902,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
 				>> PAGE_SHIFT;
 			new_table = kvm_mmu_get_page(vcpu, pseudo_gfn,
 						     v, level - 1,
-						     1, 0, &table[index]);
+						     1, 3, &table[index]);
 			if (!new_table) {
 				pgprintk("nonpaging_map: ENOMEM\n");
 				return -ENOMEM;
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 35/50] KVM: MMU: Move dirty bit updates to a separate function
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (33 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 34/50] KVM: MMU: Instantiate real-mode shadows as user writable shadows Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 36/50] KVM: MMU: When updating the dirty bit, inform the mmu about it Avi Kivity
                     ` (14 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/paging_tmpl.h |   23 +++++++++++++++--------
 1 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 4538b15..a0f84a5 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -69,6 +69,17 @@ struct guest_walker {
 	u32 error_code;
 };
 
+static void FNAME(update_dirty_bit)(struct kvm_vcpu *vcpu,
+				    int write_fault,
+				    pt_element_t *ptep,
+				    gfn_t table_gfn)
+{
+	if (write_fault && !is_dirty_pte(*ptep)) {
+		mark_page_dirty(vcpu->kvm, table_gfn);
+		*ptep |= PT_DIRTY_MASK;
+	}
+}
+
 /*
  * Fetch a guest pte for a guest virtual address
  */
@@ -144,10 +155,8 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 		if (walker->level == PT_PAGE_TABLE_LEVEL) {
 			walker->gfn = (*ptep & PT_BASE_ADDR_MASK)
 				>> PAGE_SHIFT;
-			if (write_fault && !is_dirty_pte(*ptep)) {
-				mark_page_dirty(vcpu->kvm, table_gfn);
-				*ptep |= PT_DIRTY_MASK;
-			}
+			FNAME(update_dirty_bit)(vcpu, write_fault, ptep,
+						table_gfn);
 			break;
 		}
 
@@ -157,10 +166,8 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 			walker->gfn = (*ptep & PT_DIR_BASE_ADDR_MASK)
 				>> PAGE_SHIFT;
 			walker->gfn += PT_INDEX(addr, PT_PAGE_TABLE_LEVEL);
-			if (write_fault && !is_dirty_pte(*ptep)) {
-				mark_page_dirty(vcpu->kvm, table_gfn);
-				*ptep |= PT_DIRTY_MASK;
-			}
+			FNAME(update_dirty_bit)(vcpu, write_fault, ptep,
+						table_gfn);
 			break;
 		}
 
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 36/50] KVM: MMU: When updating the dirty bit, inform the mmu about it
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (34 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 35/50] KVM: MMU: Move dirty bit updates to a separate function Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 37/50] KVM: Portability: split kvm_vcpu_ioctl Avi Kivity
                     ` (13 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Since the mmu uses different shadow pages for dirty large pages and clean
large pages, this allows the mmu to drop ptes that are now invalid.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
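Illustrative note, not part of the patch: a userspace sketch of how the guest
physical address of the modified pte is derived. The page frame of the table
supplies the upper bits and the position of the pte within its mapped page
supplies the lower bits. A 4 KiB page size is assumed, and the ex_ names are
hypothetical stand-ins for offset_in_page() and the arithmetic in the hunk
below.

```c
#include <stdint.h>

typedef uint64_t gpa_t;
typedef uint64_t gfn_t;
typedef uint64_t pt_element_t;

#define EX_PAGE_SHIFT	12
#define EX_PAGE_SIZE	((uintptr_t)1 << EX_PAGE_SHIFT)

/* Byte offset of a pointer within the page that contains it. */
static uint64_t ex_offset_in_page(const void *p)
{
	return (uint64_t)((uintptr_t)p & (EX_PAGE_SIZE - 1));
}

/*
 * Guest physical address of a pte, given the guest frame number of the
 * page table and a pointer to the pte inside a page-aligned mapping.
 */
static gpa_t ex_pte_gpa(gfn_t table_gfn, const pt_element_t *ptep)
{
	gpa_t pte_gpa = (gpa_t)table_gfn << EX_PAGE_SHIFT;

	pte_gpa += ex_offset_in_page(ptep);
	return pte_gpa;
}
```

For a table at guest frame 0x1234 and the pte at index 5 of an 8-byte-entry
table, the offset is 0x28 and the resulting address is 0x1234028. The host
virtual address of the mapping does not matter, only its low 12 bits.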
 drivers/kvm/paging_tmpl.h |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index a0f84a5..a9e687b 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -74,9 +74,14 @@ static void FNAME(update_dirty_bit)(struct kvm_vcpu *vcpu,
 				    pt_element_t *ptep,
 				    gfn_t table_gfn)
 {
+	gpa_t pte_gpa;
+
 	if (write_fault && !is_dirty_pte(*ptep)) {
 		mark_page_dirty(vcpu->kvm, table_gfn);
 		*ptep |= PT_DIRTY_MASK;
+		pte_gpa = ((gpa_t)table_gfn << PAGE_SHIFT);
+		pte_gpa += offset_in_page(ptep);
+		kvm_mmu_pte_write(vcpu, pte_gpa, (u8 *)ptep, sizeof(*ptep));
 	}
 }
 
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 37/50] KVM: Portability: split kvm_vcpu_ioctl
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (35 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 36/50] KVM: MMU: When updating the dirty bit, inform the mmu about it Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 38/50] KVM: apic round robin cleanup Avi Kivity
                     ` (12 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Carsten Otte

From: Carsten Otte <cotte-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>

This patch splits kvm_vcpu_ioctl into architecture-independent parts and
x86-specific parts, which go to kvm_arch_vcpu_ioctl in x86.c.

Common ioctls for all architectures are:
KVM_RUN, KVM_GET/SET_(S-)REGS, KVM_TRANSLATE, KVM_INTERRUPT,
KVM_DEBUG_GUEST, KVM_SET_SIGNAL_MASK, KVM_GET/SET_FPU
Note that some PPC chips don't have an FPU, so we might need an #ifdef
around KVM_GET/SET_FPU one day.

x86-specific ioctls are:
KVM_GET/SET_LAPIC, KVM_SET_CPUID, KVM_GET/SET_MSRS

An interesting aspect is vcpu_load/vcpu_put. We now have a common
vcpu_load/put which handles preemption, and an architecture-specific
kvm_arch_vcpu_load/put. In the x86 case, the latter calls the
vmx/svm function defined in kvm_x86_ops.

Signed-off-by: Carsten Otte <cotte-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Christian Borntraeger <borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Christian Ehrhardt <ehrhardt-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |    9 ++
 drivers/kvm/kvm_main.c |  200 ++------------------------------------------
 drivers/kvm/x86.c      |  219 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 234 insertions(+), 194 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index d56962d..1edf8a5 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -537,6 +537,10 @@ extern struct kvm_x86_ops *kvm_x86_ops;
 int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id);
 void kvm_vcpu_uninit(struct kvm_vcpu *vcpu);
 
+void vcpu_load(struct kvm_vcpu *vcpu);
+void vcpu_put(struct kvm_vcpu *vcpu);
+
+
 int kvm_init_x86(struct kvm_x86_ops *ops, unsigned int vcpu_size,
 		  struct module *module);
 void kvm_exit_x86(void);
@@ -655,6 +659,11 @@ int kvm_fix_hypercall(struct kvm_vcpu *vcpu);
 
 long kvm_arch_dev_ioctl(struct file *filp,
 			unsigned int ioctl, unsigned long arg);
+long kvm_arch_vcpu_ioctl(struct file *filp,
+			 unsigned int ioctl, unsigned long arg);
+void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
+void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu);
+
 __init void kvm_arch_init(void);
 
 static inline void kvm_guest_enter(void)
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 9f7370f..03d6069 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -90,8 +90,6 @@ static struct kvm_stats_debugfs_item {
 
 static struct dentry *debugfs_dir;
 
-#define MAX_IO_MSRS 256
-
 #define CR0_RESERVED_BITS						\
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
 			  | X86_CR0_ET | X86_CR0_NE | X86_CR0_WP | X86_CR0_AM \
@@ -179,21 +177,21 @@ EXPORT_SYMBOL_GPL(kvm_put_guest_fpu);
 /*
  * Switches to specified vcpu, until a matching vcpu_put()
  */
-static void vcpu_load(struct kvm_vcpu *vcpu)
+void vcpu_load(struct kvm_vcpu *vcpu)
 {
 	int cpu;
 
 	mutex_lock(&vcpu->mutex);
 	cpu = get_cpu();
 	preempt_notifier_register(&vcpu->preempt_notifier);
-	kvm_x86_ops->vcpu_load(vcpu, cpu);
+	kvm_arch_vcpu_load(vcpu, cpu);
 	put_cpu();
 }
 
-static void vcpu_put(struct kvm_vcpu *vcpu)
+void vcpu_put(struct kvm_vcpu *vcpu)
 {
 	preempt_disable();
-	kvm_x86_ops->vcpu_put(vcpu);
+	kvm_arch_vcpu_put(vcpu);
 	preempt_notifier_unregister(&vcpu->preempt_notifier);
 	preempt_enable();
 	mutex_unlock(&vcpu->mutex);
@@ -2509,86 +2507,6 @@ void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
 EXPORT_SYMBOL_GPL(kvm_get_cs_db_l_bits);
 
 /*
- * Adapt set_msr() to msr_io()'s calling convention
- */
-static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
-{
-	return kvm_set_msr(vcpu, index, *data);
-}
-
-/*
- * Read or write a bunch of msrs. All parameters are kernel addresses.
- *
- * @return number of msrs set successfully.
- */
-static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
-		    struct kvm_msr_entry *entries,
-		    int (*do_msr)(struct kvm_vcpu *vcpu,
-				  unsigned index, u64 *data))
-{
-	int i;
-
-	vcpu_load(vcpu);
-
-	for (i = 0; i < msrs->nmsrs; ++i)
-		if (do_msr(vcpu, entries[i].index, &entries[i].data))
-			break;
-
-	vcpu_put(vcpu);
-
-	return i;
-}
-
-/*
- * Read or write a bunch of msrs. Parameters are user addresses.
- *
- * @return number of msrs set successfully.
- */
-static int msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs __user *user_msrs,
-		  int (*do_msr)(struct kvm_vcpu *vcpu,
-				unsigned index, u64 *data),
-		  int writeback)
-{
-	struct kvm_msrs msrs;
-	struct kvm_msr_entry *entries;
-	int r, n;
-	unsigned size;
-
-	r = -EFAULT;
-	if (copy_from_user(&msrs, user_msrs, sizeof msrs))
-		goto out;
-
-	r = -E2BIG;
-	if (msrs.nmsrs >= MAX_IO_MSRS)
-		goto out;
-
-	r = -ENOMEM;
-	size = sizeof(struct kvm_msr_entry) * msrs.nmsrs;
-	entries = vmalloc(size);
-	if (!entries)
-		goto out;
-
-	r = -EFAULT;
-	if (copy_from_user(entries, user_msrs->entries, size))
-		goto out_free;
-
-	r = n = __msr_io(vcpu, &msrs, entries, do_msr);
-	if (r < 0)
-		goto out_free;
-
-	r = -EFAULT;
-	if (writeback && copy_to_user(user_msrs->entries, entries, size))
-		goto out_free;
-
-	r = n;
-
-out_free:
-	vfree(entries);
-out:
-	return r;
-}
-
-/*
  * Translate a guest virtual address to a guest physical address.
  */
 static int kvm_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
@@ -2761,48 +2679,6 @@ free_vcpu:
 	return r;
 }
 
-static void cpuid_fix_nx_cap(struct kvm_vcpu *vcpu)
-{
-	u64 efer;
-	int i;
-	struct kvm_cpuid_entry *e, *entry;
-
-	rdmsrl(MSR_EFER, efer);
-	entry = NULL;
-	for (i = 0; i < vcpu->cpuid_nent; ++i) {
-		e = &vcpu->cpuid_entries[i];
-		if (e->function == 0x80000001) {
-			entry = e;
-			break;
-		}
-	}
-	if (entry && (entry->edx & (1 << 20)) && !(efer & EFER_NX)) {
-		entry->edx &= ~(1 << 20);
-		printk(KERN_INFO "kvm: guest NX capability removed\n");
-	}
-}
-
-static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
-				    struct kvm_cpuid *cpuid,
-				    struct kvm_cpuid_entry __user *entries)
-{
-	int r;
-
-	r = -E2BIG;
-	if (cpuid->nent > KVM_MAX_CPUID_ENTRIES)
-		goto out;
-	r = -EFAULT;
-	if (copy_from_user(&vcpu->cpuid_entries, entries,
-			   cpuid->nent * sizeof(struct kvm_cpuid_entry)))
-		goto out;
-	vcpu->cpuid_nent = cpuid->nent;
-	cpuid_fix_nx_cap(vcpu);
-	return 0;
-
-out:
-	return r;
-}
-
 static int kvm_vcpu_ioctl_set_sigmask(struct kvm_vcpu *vcpu, sigset_t *sigset)
 {
 	if (sigset) {
@@ -2875,33 +2751,12 @@ static int kvm_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
 	return 0;
 }
 
-static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
-				    struct kvm_lapic_state *s)
-{
-	vcpu_load(vcpu);
-	memcpy(s->regs, vcpu->apic->regs, sizeof *s);
-	vcpu_put(vcpu);
-
-	return 0;
-}
-
-static int kvm_vcpu_ioctl_set_lapic(struct kvm_vcpu *vcpu,
-				    struct kvm_lapic_state *s)
-{
-	vcpu_load(vcpu);
-	memcpy(vcpu->apic->regs, s->regs, sizeof *s);
-	kvm_apic_post_state_restore(vcpu);
-	vcpu_put(vcpu);
-
-	return 0;
-}
-
 static long kvm_vcpu_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
 	struct kvm_vcpu *vcpu = filp->private_data;
 	void __user *argp = (void __user *)arg;
-	int r = -EINVAL;
+	int r;
 
 	switch (ioctl) {
 	case KVM_RUN:
@@ -2999,24 +2854,6 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		r = 0;
 		break;
 	}
-	case KVM_GET_MSRS:
-		r = msr_io(vcpu, argp, kvm_get_msr, 1);
-		break;
-	case KVM_SET_MSRS:
-		r = msr_io(vcpu, argp, do_set_msr, 0);
-		break;
-	case KVM_SET_CPUID: {
-		struct kvm_cpuid __user *cpuid_arg = argp;
-		struct kvm_cpuid cpuid;
-
-		r = -EFAULT;
-		if (copy_from_user(&cpuid, cpuid_arg, sizeof cpuid))
-			goto out;
-		r = kvm_vcpu_ioctl_set_cpuid(vcpu, &cpuid, cpuid_arg->entries);
-		if (r)
-			goto out;
-		break;
-	}
 	case KVM_SET_SIGNAL_MASK: {
 		struct kvm_signal_mask __user *sigmask_arg = argp;
 		struct kvm_signal_mask kvm_sigmask;
@@ -3065,33 +2902,8 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		r = 0;
 		break;
 	}
-	case KVM_GET_LAPIC: {
-		struct kvm_lapic_state lapic;
-
-		memset(&lapic, 0, sizeof lapic);
-		r = kvm_vcpu_ioctl_get_lapic(vcpu, &lapic);
-		if (r)
-			goto out;
-		r = -EFAULT;
-		if (copy_to_user(argp, &lapic, sizeof lapic))
-			goto out;
-		r = 0;
-		break;
-	}
-	case KVM_SET_LAPIC: {
-		struct kvm_lapic_state lapic;
-
-		r = -EFAULT;
-		if (copy_from_user(&lapic, argp, sizeof lapic))
-			goto out;
-		r = kvm_vcpu_ioctl_set_lapic(vcpu, &lapic);;
-		if (r)
-			goto out;
-		r = 0;
-		break;
-	}
 	default:
-		;
+		r = kvm_arch_vcpu_ioctl(filp, ioctl, arg);
 	}
 out:
 	return r;
diff --git a/drivers/kvm/x86.c b/drivers/kvm/x86.c
index 437902c..1fe209d 100644
--- a/drivers/kvm/x86.c
+++ b/drivers/kvm/x86.c
@@ -14,10 +14,18 @@
  *
  */
 
+#include "kvm.h"
 #include "x86.h"
+#include "irq.h"
+
+#include <linux/kvm.h>
+#include <linux/fs.h>
+#include <linux/vmalloc.h>
 
 #include <asm/uaccess.h>
 
+#define MAX_IO_MSRS 256
+
 /*
  * List of msr numbers which we expose to userspace through KVM_GET_MSRS
  * and KVM_SET_MSRS, and KVM_GET_MSR_INDEX_LIST.
@@ -40,6 +48,86 @@ static u32 emulated_msrs[] = {
 	MSR_IA32_MISC_ENABLE,
 };
 
+/*
+ * Adapt set_msr() to msr_io()'s calling convention
+ */
+static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
+{
+	return kvm_set_msr(vcpu, index, *data);
+}
+
+/*
+ * Read or write a bunch of msrs. All parameters are kernel addresses.
+ *
+ * @return number of msrs set successfully.
+ */
+static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
+		    struct kvm_msr_entry *entries,
+		    int (*do_msr)(struct kvm_vcpu *vcpu,
+				  unsigned index, u64 *data))
+{
+	int i;
+
+	vcpu_load(vcpu);
+
+	for (i = 0; i < msrs->nmsrs; ++i)
+		if (do_msr(vcpu, entries[i].index, &entries[i].data))
+			break;
+
+	vcpu_put(vcpu);
+
+	return i;
+}
+
+/*
+ * Read or write a bunch of msrs. Parameters are user addresses.
+ *
+ * @return number of msrs set successfully.
+ */
+static int msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs __user *user_msrs,
+		  int (*do_msr)(struct kvm_vcpu *vcpu,
+				unsigned index, u64 *data),
+		  int writeback)
+{
+	struct kvm_msrs msrs;
+	struct kvm_msr_entry *entries;
+	int r, n;
+	unsigned size;
+
+	r = -EFAULT;
+	if (copy_from_user(&msrs, user_msrs, sizeof msrs))
+		goto out;
+
+	r = -E2BIG;
+	if (msrs.nmsrs >= MAX_IO_MSRS)
+		goto out;
+
+	r = -ENOMEM;
+	size = sizeof(struct kvm_msr_entry) * msrs.nmsrs;
+	entries = vmalloc(size);
+	if (!entries)
+		goto out;
+
+	r = -EFAULT;
+	if (copy_from_user(entries, user_msrs->entries, size))
+		goto out_free;
+
+	r = n = __msr_io(vcpu, &msrs, entries, do_msr);
+	if (r < 0)
+		goto out_free;
+
+	r = -EFAULT;
+	if (writeback && copy_to_user(user_msrs->entries, entries, size))
+		goto out_free;
+
+	r = n;
+
+out_free:
+	vfree(entries);
+out:
+	return r;
+}
+
 long kvm_arch_dev_ioctl(struct file *filp,
 			unsigned int ioctl, unsigned long arg)
 {
@@ -81,6 +169,137 @@ out:
 	return r;
 }
 
+void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
+{
+	kvm_x86_ops->vcpu_load(vcpu, cpu);
+}
+
+void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
+{
+	kvm_x86_ops->vcpu_put(vcpu);
+}
+
+static void cpuid_fix_nx_cap(struct kvm_vcpu *vcpu)
+{
+	u64 efer;
+	int i;
+	struct kvm_cpuid_entry *e, *entry;
+
+	rdmsrl(MSR_EFER, efer);
+	entry = NULL;
+	for (i = 0; i < vcpu->cpuid_nent; ++i) {
+		e = &vcpu->cpuid_entries[i];
+		if (e->function == 0x80000001) {
+			entry = e;
+			break;
+		}
+	}
+	if (entry && (entry->edx & (1 << 20)) && !(efer & EFER_NX)) {
+		entry->edx &= ~(1 << 20);
+		printk(KERN_INFO "kvm: guest NX capability removed\n");
+	}
+}
+
+static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
+				    struct kvm_cpuid *cpuid,
+				    struct kvm_cpuid_entry __user *entries)
+{
+	int r;
+
+	r = -E2BIG;
+	if (cpuid->nent > KVM_MAX_CPUID_ENTRIES)
+		goto out;
+	r = -EFAULT;
+	if (copy_from_user(&vcpu->cpuid_entries, entries,
+			   cpuid->nent * sizeof(struct kvm_cpuid_entry)))
+		goto out;
+	vcpu->cpuid_nent = cpuid->nent;
+	cpuid_fix_nx_cap(vcpu);
+	return 0;
+
+out:
+	return r;
+}
+
+static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
+				    struct kvm_lapic_state *s)
+{
+	vcpu_load(vcpu);
+	memcpy(s->regs, vcpu->apic->regs, sizeof *s);
+	vcpu_put(vcpu);
+
+	return 0;
+}
+
+static int kvm_vcpu_ioctl_set_lapic(struct kvm_vcpu *vcpu,
+				    struct kvm_lapic_state *s)
+{
+	vcpu_load(vcpu);
+	memcpy(vcpu->apic->regs, s->regs, sizeof *s);
+	kvm_apic_post_state_restore(vcpu);
+	vcpu_put(vcpu);
+
+	return 0;
+}
+
+long kvm_arch_vcpu_ioctl(struct file *filp,
+			 unsigned int ioctl, unsigned long arg)
+{
+	struct kvm_vcpu *vcpu = filp->private_data;
+	void __user *argp = (void __user *)arg;
+	int r;
+
+	switch (ioctl) {
+	case KVM_GET_LAPIC: {
+		struct kvm_lapic_state lapic;
+
+		memset(&lapic, 0, sizeof lapic);
+		r = kvm_vcpu_ioctl_get_lapic(vcpu, &lapic);
+		if (r)
+			goto out;
+		r = -EFAULT;
+		if (copy_to_user(argp, &lapic, sizeof lapic))
+			goto out;
+		r = 0;
+		break;
+	}
+	case KVM_SET_LAPIC: {
+		struct kvm_lapic_state lapic;
+
+		r = -EFAULT;
+		if (copy_from_user(&lapic, argp, sizeof lapic))
+			goto out;
+		r = kvm_vcpu_ioctl_set_lapic(vcpu, &lapic);;
+		if (r)
+			goto out;
+		r = 0;
+		break;
+	}
+	case KVM_SET_CPUID: {
+		struct kvm_cpuid __user *cpuid_arg = argp;
+		struct kvm_cpuid cpuid;
+
+		r = -EFAULT;
+		if (copy_from_user(&cpuid, cpuid_arg, sizeof cpuid))
+			goto out;
+		r = kvm_vcpu_ioctl_set_cpuid(vcpu, &cpuid, cpuid_arg->entries);
+		if (r)
+			goto out;
+		break;
+	}
+	case KVM_GET_MSRS:
+		r = msr_io(vcpu, argp, kvm_get_msr, 1);
+		break;
+	case KVM_SET_MSRS:
+		r = msr_io(vcpu, argp, do_set_msr, 0);
+		break;
+	default:
+		r = -EINVAL;
+	}
+out:
+	return r;
+}
+
 static __init void kvm_init_msr_list(void)
 {
 	u32 dummy[2];
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 38/50] KVM: apic round robin cleanup
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (36 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 37/50] KVM: Portability: split kvm_vcpu_ioctl Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 39/50] KVM: Add some \n in ioapic_debug() Avi Kivity
                     ` (11 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Qing He

From: Qing He <qing.he-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

If no apic is enabled in the bitmap of an interrupt delivery with
lowest-priority delivery mode, report a warning rather than selecting
a fallback vcpu.
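
A minimal user-space sketch of the selection behaviour described above,
assuming a hypothetical demo_round_robin() helper and a four-entry vcpu
limit (neither is taken from the kernel source). It returns the next index
after "last" whose bit is set in the bitmap, and -1 when nothing matches,
instead of falling back to some arbitrary index.

```c
#define DEMO_MAX_VCPUS 4	/* hypothetical limit for the sketch */

/*
 * Walk the candidates starting after 'last', wrapping around once.
 * Returns the chosen index, or -1 if no bit in 'bitmap' is set.
 */
static int demo_round_robin(unsigned long bitmap, int last)
{
	int next = last;

	do {
		if (++next == DEMO_MAX_VCPUS)
			next = 0;
		if (bitmap & (1UL << next))
			return next;
	} while (next != last);

	return -1;	/* caller is expected to warn and skip delivery */
}
```

The caller then has to cope with the "no target" result itself, which is
what the NULL return in the patch asks of its callers.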

Signed-off-by: Qing He <qing.he-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Eddie (Yaozu) Dong <eddie.dong-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/lapic.c |   13 +++----------
 1 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/kvm/lapic.c b/drivers/kvm/lapic.c
index e15b42e..8840f9d 100644
--- a/drivers/kvm/lapic.c
+++ b/drivers/kvm/lapic.c
@@ -395,10 +395,9 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
 struct kvm_lapic *kvm_apic_round_robin(struct kvm *kvm, u8 vector,
 				       unsigned long bitmap)
 {
-	int vcpu_id;
 	int last;
 	int next;
-	struct kvm_lapic *apic;
+	struct kvm_lapic *apic = NULL;
 
 	last = kvm->round_robin_prev_vcpu;
 	next = last;
@@ -415,14 +414,8 @@ struct kvm_lapic *kvm_apic_round_robin(struct kvm *kvm, u8 vector,
 	} while (next != last);
 	kvm->round_robin_prev_vcpu = next;
 
-	if (!apic) {
-		vcpu_id = ffs(bitmap) - 1;
-		if (vcpu_id < 0) {
-			vcpu_id = 0;
-			printk(KERN_DEBUG "vcpu not ready for apic_round_robin\n");
-		}
-		apic = kvm->vcpus[vcpu_id]->apic;
-	}
+	if (!apic)
+		printk(KERN_DEBUG "vcpu not ready for apic_round_robin\n");
 
 	return apic;
 }
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 39/50] KVM: Add some \n in ioapic_debug()
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (37 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 38/50] KVM: apic round robin cleanup Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 40/50] KVM: Move apic timer interrupt backlog processing to common code Avi Kivity
                     ` (10 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Laurent Vivier

From: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>

Add new-line at end of debug strings.

Signed-off-by: Laurent Vivier <Laurent.Vivier-6ktuUTfB/bM@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/ioapic.c |   25 ++++++++++++++-----------
 1 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/kvm/ioapic.c b/drivers/kvm/ioapic.c
index c7992e6..8503d99 100644
--- a/drivers/kvm/ioapic.c
+++ b/drivers/kvm/ioapic.c
@@ -40,8 +40,11 @@
 #include <asm/apicdef.h>
 #include <asm/io_apic.h>
 #include "irq.h"
-/* #define ioapic_debug(fmt,arg...) printk(KERN_WARNING fmt,##arg) */
+#if 0
+#define ioapic_debug(fmt,arg...) printk(KERN_WARNING fmt,##arg)
+#else
 #define ioapic_debug(fmt, arg...)
+#endif
 static void ioapic_deliver(struct kvm_ioapic *vioapic, int irq);
 
 static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
@@ -113,7 +116,7 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
 	default:
 		index = (ioapic->ioregsel - 0x10) >> 1;
 
-		ioapic_debug("change redir index %x val %x", index, val);
+		ioapic_debug("change redir index %x val %x\n", index, val);
 		if (index >= IOAPIC_NUM_PINS)
 			return;
 		if (ioapic->ioregsel & 1) {
@@ -134,7 +137,7 @@ static void ioapic_inj_irq(struct kvm_ioapic *ioapic,
 			   struct kvm_lapic *target,
 			   u8 vector, u8 trig_mode, u8 delivery_mode)
 {
-	ioapic_debug("irq %d trig %d deliv %d", vector, trig_mode,
+	ioapic_debug("irq %d trig %d deliv %d\n", vector, trig_mode,
 		     delivery_mode);
 
 	ASSERT((delivery_mode == dest_Fixed) ||
@@ -151,7 +154,7 @@ static u32 ioapic_get_delivery_bitmask(struct kvm_ioapic *ioapic, u8 dest,
 	struct kvm *kvm = ioapic->kvm;
 	struct kvm_vcpu *vcpu;
 
-	ioapic_debug("dest %d dest_mode %d", dest, dest_mode);
+	ioapic_debug("dest %d dest_mode %d\n", dest, dest_mode);
 
 	if (dest_mode == 0) {	/* Physical mode. */
 		if (dest == 0xFF) {	/* Broadcast. */
@@ -179,7 +182,7 @@ static u32 ioapic_get_delivery_bitmask(struct kvm_ioapic *ioapic, u8 dest,
 			    kvm_apic_match_logical_addr(vcpu->apic, dest))
 				mask |= 1 << vcpu->vcpu_id;
 		}
-	ioapic_debug("mask %x", mask);
+	ioapic_debug("mask %x\n", mask);
 	return mask;
 }
 
@@ -196,12 +199,12 @@ static void ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
 	int vcpu_id;
 
 	ioapic_debug("dest=%x dest_mode=%x delivery_mode=%x "
-		     "vector=%x trig_mode=%x",
+		     "vector=%x trig_mode=%x\n",
 		     dest, dest_mode, delivery_mode, vector, trig_mode);
 
 	deliver_bitmask = ioapic_get_delivery_bitmask(ioapic, dest, dest_mode);
 	if (!deliver_bitmask) {
-		ioapic_debug("no target on destination");
+		ioapic_debug("no target on destination\n");
 		return;
 	}
 
@@ -214,7 +217,7 @@ static void ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
 				       trig_mode, delivery_mode);
 		else
 			ioapic_debug("null round robin: "
-				     "mask=%x vector=%x delivery_mode=%x",
+				     "mask=%x vector=%x delivery_mode=%x\n",
 				     deliver_bitmask, vector, dest_LowestPrio);
 		break;
 	case dest_Fixed:
@@ -304,7 +307,7 @@ static void ioapic_mmio_read(struct kvm_io_device *this, gpa_t addr, int len,
 	struct kvm_ioapic *ioapic = (struct kvm_ioapic *)this->private;
 	u32 result;
 
-	ioapic_debug("addr %lx", (unsigned long)addr);
+	ioapic_debug("addr %lx\n", (unsigned long)addr);
 	ASSERT(!(addr & 0xf));	/* check alignment */
 
 	addr &= 0xff;
@@ -341,8 +344,8 @@ static void ioapic_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
 	struct kvm_ioapic *ioapic = (struct kvm_ioapic *)this->private;
 	u32 data;
 
-	ioapic_debug("ioapic_mmio_write addr=%lx len=%d val=%p\n",
-		     addr, len, val);
+	ioapic_debug("ioapic_mmio_write addr=%p len=%d val=%p\n",
+		     (void*)addr, len, val);
 	ASSERT(!(addr & 0xf));	/* check alignment */
 	if (len == 4 || len == 8)
 		data = *(u32 *) val;
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 40/50] KVM: Move apic timer interrupt backlog processing to common code
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (38 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 39/50] KVM: Add some \n in ioapic_debug() Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 41/50] KVM: Rename KVM_TLB_FLUSH to KVM_REQ_TLB_FLUSH Avi Kivity
                     ` (9 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Besides the obvious benefit of making more code common, this prevents
a livelock with the next patch, which moves interrupt injection out of the
critical section.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm_main.c |    2 ++
 drivers/kvm/svm.c      |    1 -
 drivers/kvm/vmx.c      |    1 -
 3 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 03d6069..c94d4df 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -2144,6 +2144,8 @@ again:
 	if (unlikely(r))
 		goto out;
 
+	kvm_inject_pending_timer_irqs(vcpu);
+
 	preempt_disable();
 
 	kvm_x86_ops->prepare_guest_switch(vcpu);
diff --git a/drivers/kvm/svm.c b/drivers/kvm/svm.c
index 746a377..4ff2922 100644
--- a/drivers/kvm/svm.c
+++ b/drivers/kvm/svm.c
@@ -1355,7 +1355,6 @@ static void svm_intr_assist(struct kvm_vcpu *vcpu)
 	struct vmcb *vmcb = svm->vmcb;
 	int intr_vector = -1;
 
-	kvm_inject_pending_timer_irqs(vcpu);
 	if ((vmcb->control.exit_int_info & SVM_EVTINJ_VALID) &&
 	    ((vmcb->control.exit_int_info & SVM_EVTINJ_TYPE_MASK) == 0)) {
 		intr_vector = vmcb->control.exit_int_info &
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 1336174..be6846d 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -2191,7 +2191,6 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu)
 	int has_ext_irq, interrupt_window_open;
 	int vector;
 
-	kvm_inject_pending_timer_irqs(vcpu);
 	update_tpr_threshold(vcpu);
 
 	has_ext_irq = kvm_cpu_has_interrupt(vcpu);
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 41/50] KVM: Rename KVM_TLB_FLUSH to KVM_REQ_TLB_FLUSH
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (39 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 40/50] KVM: Move apic timer interrupt backlog processing to common code Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 42/50] KVM: x86 emulator: Implement emulation of instruction: inc & dec Avi Kivity
                     ` (8 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

We now have a new namespace, KVM_REQ_*, for bits in vcpu->requests.

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |    2 +-
 drivers/kvm/kvm_main.c |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 1edf8a5..6ae7b63 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -67,7 +67,7 @@
 /*
  * vcpu->requests bit members
  */
-#define KVM_TLB_FLUSH 0
+#define KVM_REQ_TLB_FLUSH          0
 
 /*
  * Address types:
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index c94d4df..a1a3be9 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -212,7 +212,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
 		vcpu = kvm->vcpus[i];
 		if (!vcpu)
 			continue;
-		if (test_and_set_bit(KVM_TLB_FLUSH, &vcpu->requests))
+		if (test_and_set_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests))
 			continue;
 		cpu = vcpu->cpu;
 		if (cpu != -1 && cpu != raw_smp_processor_id())
@@ -2171,7 +2171,7 @@ again:
 	kvm_guest_enter();
 
 	if (vcpu->requests)
-		if (test_and_clear_bit(KVM_TLB_FLUSH, &vcpu->requests))
+		if (test_and_clear_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests))
 			kvm_x86_ops->tlb_flush(vcpu);
 
 	kvm_x86_ops->run(vcpu, kvm_run);
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 42/50] KVM: x86 emulator: Implement emulation of instruction: inc & dec
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (40 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 41/50] KVM: Rename KVM_TLB_FLUSH to KVM_REQ_TLB_FLUSH Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 43/50] KVM: MMU: Simplify page table walker Avi Kivity
                     ` (7 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Nitin A Kamble <nitin.a.kamble-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Instructions:
	inc r16/r32 (opcode 0x40-0x47)
	dec r16/r32 (opcode 0x48-0x4f)
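
The opcode layout above can be sketched as follows. This is a simplified
user-space illustration, not the emulator's code: demo_emulate_inc_dec()
and demo_regs are hypothetical names, operands are fixed at 32 bits, and
flag updates are ignored. The low three bits of the opcode select the
register and bit 3 distinguishes dec from inc.

```c
#include <stdint.h>

/* Stand-in for the emulator's register file. */
static uint32_t demo_regs[8];

/*
 * Apply a one-byte inc/dec opcode to 'regs'.
 * Returns 0 on success, -1 if the opcode is outside 0x40-0x4f.
 */
static int demo_emulate_inc_dec(uint8_t opcode, uint32_t regs[8])
{
	if (opcode < 0x40 || opcode > 0x4f)
		return -1;

	if (opcode & 0x08)
		regs[opcode & 0x7]--;	/* 0x48-0x4f: dec */
	else
		regs[opcode & 0x7]++;	/* 0x40-0x47: inc */

	return 0;
}
```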

Signed-off-by: Nitin A Kamble <nitin.a.kamble-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/x86_emulate.c |   20 ++++++++++++++++++--
 1 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index b03029e..72621c9 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -96,8 +96,12 @@ static u8 opcode_table[256] = {
 	ByteOp | DstMem | SrcReg | ModRM, DstMem | SrcReg | ModRM,
 	ByteOp | DstReg | SrcMem | ModRM, DstReg | SrcMem | ModRM,
 	0, 0, 0, 0,
-	/* 0x40 - 0x4F */
-	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+	/* 0x40 - 0x47 */
+	ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+	ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+	/* 0x48 - 0x4F */
+	ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
+	ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
 	/* 0x50 - 0x57 */
 	ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
 	ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
@@ -1376,6 +1380,18 @@ special_insn:
 	if (c->twobyte)
 		goto twobyte_special_insn;
 	switch (c->b) {
+	case 0x40 ... 0x47: /* inc r16/r32 */
+		c->dst.bytes = c->op_bytes;
+		c->dst.ptr = (unsigned long *)&c->regs[c->b & 0x7];
+		c->dst.val = *c->dst.ptr;
+		emulate_1op("inc", c->dst, ctxt->eflags);
+		break;
+	case 0x48 ... 0x4f: /* dec r16/r32 */
+		c->dst.bytes = c->op_bytes;
+		c->dst.ptr = (unsigned long *)&c->regs[c->b & 0x7];
+		c->dst.val = *c->dst.ptr;
+		emulate_1op("dec", c->dst, ctxt->eflags);
+		break;
 	case 0x50 ... 0x57:  /* push reg */
 		if (c->op_bytes == 2)
 			c->src.val = (u16) c->regs[c->b & 0x7];
-- 
1.5.3.7


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 43/50] KVM: MMU: Simplify page table walker
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (41 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 42/50] KVM: x86 emulator: Implement emulation of instruction: inc & dec Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 44/50] KVM: x86 emulator: cmc, clc, cli, sti Avi Kivity
                     ` (6 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Simplify the walker level loop so that it does not carry so much
information from one iteration to the next.  In addition to being complex,
this made kmap_atomic() critical sections difficult to manage.

As a result of this change, kmap_atomic() sections are limited to actually
touching the guest pte, which allows the other functions called from the
walker to perform operations that may sleep.  This will be needed when we
enable swapping.
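
The pattern can be illustrated in user-space C, under loud assumptions:
demo_map()/demo_unmap() are hypothetical stand-ins for
kmap_atomic()/kunmap_atomic(), and demo_atomic_depth merely models "we are
inside a section that must not sleep". The entry is copied out inside a
short mapped section, and all further work is done on the local copy.

```c
#include <stdint.h>

/* Greater than zero means "inside an atomic mapping, must not sleep". */
static int demo_atomic_depth;

/* Stand-in for one guest page-table page; entry 3 is pre-populated. */
static uint64_t demo_table[512] = { [3] = 0x1234 };

static uint64_t *demo_map(void)
{
	demo_atomic_depth++;
	return demo_table;
}

static void demo_unmap(void)
{
	demo_atomic_depth--;
}

/* Copy one entry out; the mapping is held only for the single access. */
static uint64_t demo_read_pte(unsigned int index)
{
	uint64_t *table = demo_map();
	uint64_t pte = table[index];

	demo_unmap();
	return pte;	/* caller may now do work that sleeps */
}
```

Because nothing mapped escapes demo_read_pte(), the checks that follow can
run with the depth counter back at zero.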

Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/paging_tmpl.h |  124 +++++++++++++++++---------------------------
 1 files changed, 48 insertions(+), 76 deletions(-)

diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index a9e687b..bab1b7f 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -59,32 +59,12 @@
 struct guest_walker {
 	int level;
 	gfn_t table_gfn[PT_MAX_FULL_LEVELS];
-	pt_element_t *table;
 	pt_element_t pte;
-	pt_element_t *ptep;
-	struct page *page;
-	int index;
 	pt_element_t inherited_ar;
 	gfn_t gfn;
 	u32 error_code;
 };
 
-static void FNAME(update_dirty_bit)(struct kvm_vcpu *vcpu,
-				    int write_fault,
-				    pt_element_t *ptep,
-				    gfn_t table_gfn)
-{
-	gpa_t pte_gpa;
-
-	if (write_fault && !is_dirty_pte(*ptep)) {
-		mark_page_dirty(vcpu->kvm, table_gfn);
-		*ptep |= PT_DIRTY_MASK;
-		pte_gpa = ((gpa_t)table_gfn << PAGE_SHIFT);
-		pte_gpa += offset_in_page(ptep);
-		kvm_mmu_pte_write(vcpu, pte_gpa, (u8 *)ptep, sizeof(*ptep));
-	}
-}
-
 /*
  * Fetch a guest pte for a guest virtual address
  */
@@ -94,105 +74,99 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 {
 	hpa_t hpa;
 	struct kvm_memory_slot *slot;
-	pt_element_t *ptep;
-	pt_element_t root;
+	struct page *page;
+	pt_element_t *table;
+	pt_element_t pte;
 	gfn_t table_gfn;
+	unsigned index;
+	gpa_t pte_gpa;
 
 	pgprintk("%s: addr %lx\n", __FUNCTION__, addr);
 	walker->level = vcpu->mmu.root_level;
-	walker->table = NULL;
-	walker->page = NULL;
-	walker->ptep = NULL;
-	root = vcpu->cr3;
+	pte = vcpu->cr3;
 #if PTTYPE == 64
 	if (!is_long_mode(vcpu)) {
-		walker->ptep = &vcpu->pdptrs[(addr >> 30) & 3];
-		root = *walker->ptep;
-		walker->pte = root;
-		if (!(root & PT_PRESENT_MASK))
+		pte = vcpu->pdptrs[(addr >> 30) & 3];
+		if (!is_present_pte(pte))
 			goto not_present;
 		--walker->level;
 	}
 #endif
-	table_gfn = (root & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
-	walker->table_gfn[walker->level - 1] = table_gfn;
-	pgprintk("%s: table_gfn[%d] %lx\n", __FUNCTION__,
-		 walker->level - 1, table_gfn);
-	slot = gfn_to_memslot(vcpu->kvm, table_gfn);
-	hpa = safe_gpa_to_hpa(vcpu->kvm, root & PT64_BASE_ADDR_MASK);
-	walker->page = pfn_to_page(hpa >> PAGE_SHIFT);
-	walker->table = kmap_atomic(walker->page, KM_USER0);
-
 	ASSERT((!is_long_mode(vcpu) && is_pae(vcpu)) ||
 	       (vcpu->cr3 & CR3_NONPAE_RESERVED_BITS) == 0);
 
 	walker->inherited_ar = PT_USER_MASK | PT_WRITABLE_MASK;
 
 	for (;;) {
-		int index = PT_INDEX(addr, walker->level);
-		hpa_t paddr;
+		index = PT_INDEX(addr, walker->level);
 
-		ptep = &walker->table[index];
-		walker->index = index;
-		ASSERT(((unsigned long)walker->table & PAGE_MASK) ==
-		       ((unsigned long)ptep & PAGE_MASK));
+		table_gfn = (pte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
+		walker->table_gfn[walker->level - 1] = table_gfn;
+		pgprintk("%s: table_gfn[%d] %lx\n", __FUNCTION__,
+			 walker->level - 1, table_gfn);
+
+		slot = gfn_to_memslot(vcpu->kvm, table_gfn);
+		hpa = safe_gpa_to_hpa(vcpu->kvm, pte & PT64_BASE_ADDR_MASK);
+		page = pfn_to_page(hpa >> PAGE_SHIFT);
 
-		if (!is_present_pte(*ptep))
+		table = kmap_atomic(page, KM_USER0);
+		pte = table[index];
+		kunmap_atomic(table, KM_USER0);
+
+		if (!is_present_pte(pte))
 			goto not_present;
 
-		if (write_fault && !is_writeble_pte(*ptep))
+		if (write_fault && !is_writeble_pte(pte))
 			if (user_fault || is_write_protection(vcpu))
 				goto access_error;
 
-		if (user_fault && !(*ptep & PT_USER_MASK))
+		if (user_fault && !(pte & PT_USER_MASK))
 			goto access_error;
 
 #if PTTYPE == 64
-		if (fetch_fault && is_nx(vcpu) && (*ptep & PT64_NX_MASK))
+		if (fetch_fault && is_nx(vcpu) && (pte & PT64_NX_MASK))
 			goto access_error;
 #endif
 
-		if (!(*ptep & PT_ACCESSED_MASK)) {
+		if (!(pte & PT_ACCESSED_MASK)) {
 			mark_page_dirty(vcpu->kvm, table_gfn);
-			*ptep |= PT_ACCESSED_MASK;
+			pte |= PT_ACCESSED_MASK;
+			table = kmap_atomic(page, KM_USER0);
+			table[index] = pte;
+			kunmap_atomic(table, KM_USER0);
 		}
 
 		if (walker->level == PT_PAGE_TABLE_LEVEL) {
-			walker->gfn = (*ptep & PT_BASE_ADDR_MASK)
-				>> PAGE_SHIFT;
-			FNAME(update_dirty_bit)(vcpu, write_fault, ptep,
-						table_gfn);
+			walker->gfn = (pte & PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
 			break;
 		}
 
 		if (walker->level == PT_DIRECTORY_LEVEL
-		    && (*ptep & PT_PAGE_SIZE_MASK)
+		    && (pte & PT_PAGE_SIZE_MASK)
 		    && (PTTYPE == 64 || is_pse(vcpu))) {
-			walker->gfn = (*ptep & PT_DIR_BASE_ADDR_MASK)
+			walker->gfn = (pte & PT_DIR_BASE_ADDR_MASK)
 				>> PAGE_SHIFT;
 			walker->gfn += PT_INDEX(addr, PT_PAGE_TABLE_LEVEL);
-			FNAME(update_dirty_bit)(vcpu, write_fault, ptep,
-						table_gfn);
 			break;
 		}
 
-		walker->inherited_ar &= walker->table[index];
-		table_gfn = (*ptep & PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
-		kunmap_atomic(walker->table, KM_USER0);
-		paddr = safe_gpa_to_hpa(vcpu->kvm, table_gfn << PAGE_SHIFT);
-		walker->page = pfn_to_page(paddr >> PAGE_SHIFT);
-		walker->table = kmap_atomic(walker->page, KM_USER0);
+		walker->inherited_ar &= pte;
 		--walker->level;
-		walker->table_gfn[walker->level - 1] = table_gfn;
-		pgprintk("%s: table_gfn[%d] %lx\n", __FUNCTION__,
-			 walker->level - 1, table_gfn);
 	}
-	walker->pte = *ptep;
-	if (walker->page)
-		walker->ptep = NULL;
-	if (walker->table)
-		kunmap_atomic(walker->table, KM_USER0);
-	pgprintk("%s: pte %llx\n", __FUNCTION__, (u64)*ptep);
+
+	if (write_fault && !is_dirty_pte(pte)) {
+		mark_page_dirty(vcpu->kvm, table_gfn);
+		pte |= PT_DIRTY_MASK;
+		table = kmap_atomic(page, KM_USER0);
+		table[index] = pte;
+		kunmap_atomic(table, KM_USER0);
+		pte_gpa = table_gfn << PAGE_SHIFT;
+		pte_gpa += index * sizeof(pt_element_t);
+		kvm_mmu_pte_write(vcpu, pte_gpa, (u8 *)&pte, sizeof(pte));
+	}
+
+	walker->pte = pte;
+	pgprintk("%s: pte %llx\n", __FUNCTION__, (u64)pte);
 	return 1;
 
 not_present:
@@ -209,8 +183,6 @@ err:
 		walker->error_code |= PFERR_USER_MASK;
 	if (fetch_fault)
 		walker->error_code |= PFERR_FETCH_MASK;
-	if (walker->table)
-		kunmap_atomic(walker->table, KM_USER0);
 	return 0;
 }
 
-- 
1.5.3.7



* [PATCH 44/50] KVM: x86 emulator: cmc, clc, cli, sti
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (42 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 43/50] KVM: MMU: Simplify page table walker Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 45/50] KVM: MMU: Add rmap_next(), a helper for walking kvm rmaps Avi Kivity
                     ` (5 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Nitin A Kamble <nitin.a.kamble-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Emulate the cmc, clc, cli and sti instructions (opcodes 0xf5, 0xf8, 0xfa
and 0xfb respectively).

[avi: fix reference to EFLG_IF which is not defined anywhere]

Signed-off-by: Nitin A Kamble <nitin.a.kamble-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/x86_emulate.c |   21 +++++++++++++++++++--
 1 files changed, 19 insertions(+), 2 deletions(-)
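
For readers skimming the diff: each of the four new cases is a single bit
operation on the saved flags word.  A minimal user-space sketch, assuming
the architectural bit positions (CF is bit 0 and IF is bit 9 of EFLAGS).
The SKETCH_* names and emulate_flag_insn() are hypothetical and are not
part of x86_emulate.c:

```c
#include <stdint.h>

/* Architectural EFLAGS bit positions; the SKETCH_ names are illustrative. */
#define SKETCH_EFLG_CF (1ul << 0)	/* carry flag */
#define SKETCH_EFLG_IF (1ul << 9)	/* interrupt enable flag */

/*
 * Hypothetical helper: return the flags word after executing one of the
 * four one-byte opcodes handled by this patch.  Any other opcode leaves
 * the flags untouched.
 */
static unsigned long emulate_flag_insn(unsigned long eflags, uint8_t opcode)
{
	switch (opcode) {
	case 0xf5:	/* cmc: complement carry */
		return eflags ^ SKETCH_EFLG_CF;
	case 0xf8:	/* clc: clear carry */
		return eflags & ~SKETCH_EFLG_CF;
	case 0xfa:	/* cli: clear interrupt enable */
		return eflags & ~SKETCH_EFLG_IF;
	case 0xfb:	/* sti: set interrupt enable */
		return eflags | SKETCH_EFLG_IF;
	default:
		return eflags;
	}
}
```

None of these instructions has a destination operand, which is why every
new case in the patch also sets c->dst.type to OP_NONE to skip writeback.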

diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index 72621c9..af98ea1 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -161,10 +161,10 @@ static u8 opcode_table[256] = {
 	ImplicitOps, SrcImm|ImplicitOps, 0, SrcImmByte|ImplicitOps, 0, 0, 0, 0,
 	/* 0xF0 - 0xF7 */
 	0, 0, 0, 0,
-	ImplicitOps, 0,
+	ImplicitOps, ImplicitOps,
 	ByteOp | DstMem | SrcNone | ModRM, DstMem | SrcNone | ModRM,
 	/* 0xF8 - 0xFF */
-	0, 0, 0, 0,
+	ImplicitOps, 0, ImplicitOps, ImplicitOps,
 	0, 0, ByteOp | DstMem | SrcNone | ModRM, DstMem | SrcNone | ModRM
 };
 
@@ -1476,6 +1476,23 @@ special_insn:
 	case 0xf4:              /* hlt */
 		ctxt->vcpu->halt_request = 1;
 		goto done;
+	case 0xf5:	/* cmc */
+		/* complement carry flag from eflags reg */
+		ctxt->eflags ^= EFLG_CF;
+		c->dst.type = OP_NONE;	/* Disable writeback. */
+		break;
+	case 0xf8: /* clc */
+		ctxt->eflags &= ~EFLG_CF;
+		c->dst.type = OP_NONE;	/* Disable writeback. */
+		break;
+	case 0xfa: /* cli */
+		ctxt->eflags &= ~X86_EFLAGS_IF;
+		c->dst.type = OP_NONE;	/* Disable writeback. */
+		break;
+	case 0xfb: /* sti */
+		ctxt->eflags |= X86_EFLAGS_IF;
+		c->dst.type = OP_NONE;	/* Disable writeback. */
+		break;
 	}
 	if (c->rep_prefix) {
 		if (c->regs[VCPU_REGS_RCX] == 0) {
-- 
1.5.3.7



* [PATCH 45/50] KVM: MMU: Add rmap_next(), a helper for walking kvm rmaps
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (43 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 44/50] KVM: x86 emulator: cmc, clc, cli, sti Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 46/50] KVM: MMU: Keep a reverse mapping of non-writable translations Avi Kivity
                     ` (4 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>

Signed-off-by: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/mmu.c |   45 +++++++++++++++++++++++++++++++++++----------
 1 files changed, 35 insertions(+), 10 deletions(-)
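
Since this patch carries no description, here is a hedged user-space model
of the iteration it introduces.  It assumes the rmap encoding visible in
the diff: the rmap word is 0 when empty, a plain pointer to a single spte
when the low bit is clear, and a pointer to a descriptor chain when the
low bit is set.  All sketch_* names, and the fan-out of 4, are
illustrative assumptions; the sketch uses uintptr_t where the kernel code
uses unsigned long:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RMAP_EXT 4	/* assumed number of sptes per descriptor */

struct sketch_rmap_desc {
	uint64_t *shadow_ptes[SKETCH_RMAP_EXT];	/* packed, NULL-terminated */
	struct sketch_rmap_desc *more;		/* next descriptor or NULL */
};

/*
 * Return the spte that follows 'spte' in the rmap, or the first one when
 * 'spte' is NULL.  Returns NULL when the chain is exhausted.
 */
static uint64_t *sketch_rmap_next(uintptr_t *rmapp, uint64_t *spte)
{
	struct sketch_rmap_desc *desc;
	uint64_t *prev_spte = NULL;
	int i;

	if (!*rmapp)
		return NULL;			/* empty rmap */
	if (!(*rmapp & 1))			/* single spte, no descriptor */
		return spte ? NULL : (uint64_t *)*rmapp;

	desc = (struct sketch_rmap_desc *)(*rmapp & ~(uintptr_t)1);
	while (desc) {
		for (i = 0; i < SKETCH_RMAP_EXT && desc->shadow_ptes[i]; ++i) {
			if (prev_spte == spte)
				return desc->shadow_ptes[i];
			prev_spte = desc->shadow_ptes[i];
		}
		desc = desc->more;
	}
	return NULL;
}

/* Typical caller: visit every spte mapped for one guest frame. */
static int sketch_rmap_count(uintptr_t *rmapp)
{
	uint64_t *spte;
	int n = 0;

	for (spte = sketch_rmap_next(rmapp, NULL); spte;
	     spte = sketch_rmap_next(rmapp, spte))
		++n;
	return n;
}
```

Each call rescans the chain from the head, so a full walk is quadratic in
the chain length.  Note also that rmap_write_protect() in this patch
fetches the next spte before calling rmap_remove() on the current one,
because removal changes the chain the iterator walks.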

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index f52604a..14e54e3 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -456,28 +456,53 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
 	}
 }
 
-static void rmap_write_protect(struct kvm *kvm, u64 gfn)
+static u64 *rmap_next(struct kvm *kvm, unsigned long *rmapp, u64 *spte)
 {
 	struct kvm_rmap_desc *desc;
+	struct kvm_rmap_desc *prev_desc;
+	u64 *prev_spte;
+	int i;
+
+	if (!*rmapp)
+		return NULL;
+	else if (!(*rmapp & 1)) {
+		if (!spte)
+			return (u64 *)*rmapp;
+		return NULL;
+	}
+	desc = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
+	prev_desc = NULL;
+	prev_spte = NULL;
+	while (desc) {
+		for (i = 0; i < RMAP_EXT && desc->shadow_ptes[i]; ++i) {
+			if (prev_spte == spte)
+				return desc->shadow_ptes[i];
+			prev_spte = desc->shadow_ptes[i];
+		}
+		desc = desc->more;
+	}
+	return NULL;
+}
+
+static void rmap_write_protect(struct kvm *kvm, u64 gfn)
+{
 	unsigned long *rmapp;
 	u64 *spte;
+	u64 *prev_spte;
 
 	gfn = unalias_gfn(kvm, gfn);
 	rmapp = gfn_to_rmap(kvm, gfn);
 
-	while (*rmapp) {
-		if (!(*rmapp & 1))
-			spte = (u64 *)*rmapp;
-		else {
-			desc = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
-			spte = desc->shadow_ptes[0];
-		}
+	spte = rmap_next(kvm, rmapp, NULL);
+	while (spte) {
 		BUG_ON(!spte);
 		BUG_ON(!(*spte & PT_PRESENT_MASK));
 		BUG_ON(!(*spte & PT_WRITABLE_MASK));
 		rmap_printk("rmap_write_protect: spte %p %llx\n", spte, *spte);
-		rmap_remove(kvm, spte);
-		set_shadow_pte(spte, *spte & ~PT_WRITABLE_MASK);
+		prev_spte = spte;
+		spte = rmap_next(kvm, rmapp, spte);
+		rmap_remove(kvm, prev_spte);
+		set_shadow_pte(prev_spte, *prev_spte & ~PT_WRITABLE_MASK);
 		kvm_flush_remote_tlbs(kvm);
 	}
 }
-- 
1.5.3.7



* [PATCH 46/50] KVM: MMU: Keep a reverse mapping of non-writable translations
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (44 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 45/50] KVM: MMU: Add rmap_next(), a helper for walking kvm rmaps Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 47/50] KVM: MMU: Make gfn_to_page() always safe Avi Kivity
                     ` (3 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>

The current kvm mmu reverse maps only writable translations.  This is used
to write-protect a page in case it becomes a pagetable.

But with swapping support, we need a reverse mapping of read-only pages as
well:  when we evict a page, we need to remove any mapping to it, whether
writable or not.

Signed-off-by: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/mmu.c |   23 +++++++++++------------
 1 files changed, 11 insertions(+), 12 deletions(-)
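
The change to is_rmap_pte() is the core of this patch.  A small sketch of
the old and new predicates side by side, assuming the x86 bit positions
for present (bit 0) and writable (bit 1).  The two sentinel values below
are placeholders chosen for illustration only; the real
shadow_trap_nonpresent_pte and shadow_notrap_nonpresent_pte are set up
elsewhere in mmu.c:

```c
#include <stdint.h>

#define SKETCH_PT_PRESENT_MASK  (1ull << 0)
#define SKETCH_PT_WRITABLE_MASK (1ull << 1)

/* Placeholder sentinels, NOT the values used by the kvm mmu. */
#define SKETCH_TRAP_NONPRESENT_PTE   0ull
#define SKETCH_NOTRAP_NONPRESENT_PTE (1ull << 62)

/* Before the patch: only present and writable sptes were reverse mapped. */
static int sketch_old_is_rmap_pte(uint64_t pte)
{
	const uint64_t mask = SKETCH_PT_WRITABLE_MASK | SKETCH_PT_PRESENT_MASK;

	return (pte & mask) == mask;
}

/* After the patch: anything that is not a nonpresent sentinel is mapped. */
static int sketch_new_is_rmap_pte(uint64_t pte)
{
	return pte != SKETCH_TRAP_NONPRESENT_PTE
		&& pte != SKETCH_NOTRAP_NONPRESENT_PTE;
}
```

A present, read-only spte such as 0x1001 is rejected by the old predicate
and accepted by the new one.  That is why rmap_write_protect() no longer
removes the rmap entry and only clears the writable bit.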

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 14e54e3..bbf5eb4 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -211,8 +211,8 @@ static int is_io_pte(unsigned long pte)
 
 static int is_rmap_pte(u64 pte)
 {
-	return (pte & (PT_WRITABLE_MASK | PT_PRESENT_MASK))
-		== (PT_WRITABLE_MASK | PT_PRESENT_MASK);
+	return pte != shadow_trap_nonpresent_pte
+		&& pte != shadow_notrap_nonpresent_pte;
 }
 
 static void set_shadow_pte(u64 *sptep, u64 spte)
@@ -488,7 +488,6 @@ static void rmap_write_protect(struct kvm *kvm, u64 gfn)
 {
 	unsigned long *rmapp;
 	u64 *spte;
-	u64 *prev_spte;
 
 	gfn = unalias_gfn(kvm, gfn);
 	rmapp = gfn_to_rmap(kvm, gfn);
@@ -497,13 +496,11 @@ static void rmap_write_protect(struct kvm *kvm, u64 gfn)
 	while (spte) {
 		BUG_ON(!spte);
 		BUG_ON(!(*spte & PT_PRESENT_MASK));
-		BUG_ON(!(*spte & PT_WRITABLE_MASK));
 		rmap_printk("rmap_write_protect: spte %p %llx\n", spte, *spte);
-		prev_spte = spte;
-		spte = rmap_next(kvm, rmapp, spte);
-		rmap_remove(kvm, prev_spte);
-		set_shadow_pte(prev_spte, *prev_spte & ~PT_WRITABLE_MASK);
+		if (is_writeble_pte(*spte))
+			set_shadow_pte(spte, *spte & ~PT_WRITABLE_MASK);
 		kvm_flush_remote_tlbs(kvm);
+		spte = rmap_next(kvm, rmapp, spte);
 	}
 }
 
@@ -908,14 +905,18 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
 		table = __va(table_addr);
 
 		if (level == 1) {
+			int was_rmapped;
+
 			pte = table[index];
+			was_rmapped = is_rmap_pte(pte);
 			if (is_shadow_present_pte(pte) && is_writeble_pte(pte))
 				return 0;
 			mark_page_dirty(vcpu->kvm, v >> PAGE_SHIFT);
 			page_header_update_slot(vcpu->kvm, table, v);
 			table[index] = p | PT_PRESENT_MASK | PT_WRITABLE_MASK |
 								PT_USER_MASK;
-			rmap_add(vcpu, &table[index], v >> PAGE_SHIFT);
+			if (!was_rmapped)
+				rmap_add(vcpu, &table[index], v >> PAGE_SHIFT);
 			return 0;
 		}
 
@@ -1424,10 +1425,8 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
 		pt = page->spt;
 		for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
 			/* avoid RMW */
-			if (pt[i] & PT_WRITABLE_MASK) {
-				rmap_remove(kvm, &pt[i]);
+			if (pt[i] & PT_WRITABLE_MASK)
 				pt[i] &= ~PT_WRITABLE_MASK;
-			}
 	}
 }
 
-- 
1.5.3.7



* [PATCH 47/50] KVM: MMU: Make gfn_to_page() always safe
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (45 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 46/50] KVM: MMU: Keep a reverse mapping of non-writable translations Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 48/50] KVM: MMU: Partial swapping of guest memory Avi Kivity
                     ` (2 subsequent siblings)
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Izik Eidus

From: Izik Eidus <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>

In case the page is not present in the guest memory map, return a dummy
page the guest can scribble on.

This simplifies error checking in its users.

Signed-off-by: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h         |    3 ++-
 drivers/kvm/kvm_main.c    |   26 ++++++++++++++------------
 drivers/kvm/mmu.c         |   16 +++++-----------
 drivers/kvm/paging_tmpl.h |    7 ++-----
 4 files changed, 23 insertions(+), 29 deletions(-)
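
The pattern here is a sentinel object instead of a NULL return.  A toy
user-space model, assuming a single four-page memory slot; every sketch_*
name is hypothetical and only mirrors the shape of gfn_to_page(),
is_error_page() and a caller such as kvm_read_guest_page():

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096
#define SKETCH_NPAGES    4

struct sketch_page {
	unsigned char data[SKETCH_PAGE_SIZE];
};

static struct sketch_page sketch_slot[SKETCH_NPAGES];	/* gfns 0..3 */
static struct sketch_page sketch_bad_page;		/* zero-filled dummy */

static int sketch_is_error_page(const struct sketch_page *page)
{
	return page == &sketch_bad_page;
}

/* Never returns NULL: unmapped frames resolve to the dummy page. */
static struct sketch_page *sketch_gfn_to_page(unsigned long gfn)
{
	if (gfn >= SKETCH_NPAGES)
		return &sketch_bad_page;
	return &sketch_slot[gfn];
}

/* A caller that must report the error checks for the sentinel. */
static int sketch_read_guest_byte(unsigned long gfn, size_t offset,
				  unsigned char *out)
{
	struct sketch_page *page = sketch_gfn_to_page(gfn);

	if (sketch_is_error_page(page) || offset >= SKETCH_PAGE_SIZE)
		return -1;
	*out = page->data[offset];
	return 0;
}
```

Callers that do not care, such as the page table walker in this patch, can
map and dereference the result unconditionally, because the dummy page is
ordinary memory.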

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 6ae7b63..0c17c76 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -565,8 +565,9 @@ static inline int is_error_hpa(hpa_t hpa) { return hpa >> HPA_MSB; }
 hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva);
 struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva);
 
-extern hpa_t bad_page_address;
+extern struct page *bad_page;
 
+int is_error_page(struct page *page);
 gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn);
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
 int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index a1a3be9..ebfb967 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -993,6 +993,12 @@ static int kvm_vm_ioctl_set_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
 	return r;
 }
 
+int is_error_page(struct page *page)
+{
+	return page == bad_page;
+}
+EXPORT_SYMBOL_GPL(is_error_page);
+
 gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
 {
 	int i;
@@ -1034,7 +1040,7 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 	gfn = unalias_gfn(kvm, gfn);
 	slot = __gfn_to_memslot(kvm, gfn);
 	if (!slot)
-		return NULL;
+		return bad_page;
 	return slot->phys_mem[gfn - slot->base_gfn];
 }
 EXPORT_SYMBOL_GPL(gfn_to_page);
@@ -1054,7 +1060,7 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (!page)
+	if (is_error_page(page))
 		return -EFAULT;
 	page_virt = kmap_atomic(page, KM_USER0);
 
@@ -1092,7 +1098,7 @@ int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data,
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (!page)
+	if (is_error_page(page))
 		return -EFAULT;
 	page_virt = kmap_atomic(page, KM_USER0);
 
@@ -1130,7 +1136,7 @@ int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len)
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (!page)
+	if (is_error_page(page))
 		return -EFAULT;
 	page_virt = kmap_atomic(page, KM_USER0);
 
@@ -3068,7 +3074,7 @@ static struct page *kvm_vm_nopage(struct vm_area_struct *vma,
 
 	pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
 	page = gfn_to_page(kvm, pgoff);
-	if (!page)
+	if (is_error_page(page))
 		return NOPAGE_SIGBUS;
 	get_page(page);
 	if (type != NULL)
@@ -3383,7 +3389,7 @@ static struct sys_device kvm_sysdev = {
 	.cls = &kvm_sysdev_class,
 };
 
-hpa_t bad_page_address;
+struct page *bad_page;
 
 static inline
 struct kvm_vcpu *preempt_notifier_to_vcpu(struct preempt_notifier *pn)
@@ -3512,7 +3518,6 @@ EXPORT_SYMBOL_GPL(kvm_exit_x86);
 
 static __init int kvm_init(void)
 {
-	static struct page *bad_page;
 	int r;
 
 	r = kvm_mmu_module_init();
@@ -3523,16 +3528,13 @@ static __init int kvm_init(void)
 
 	kvm_arch_init();
 
-	bad_page = alloc_page(GFP_KERNEL);
+	bad_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
 
 	if (bad_page == NULL) {
 		r = -ENOMEM;
 		goto out;
 	}
 
-	bad_page_address = page_to_pfn(bad_page) << PAGE_SHIFT;
-	memset(__va(bad_page_address), 0, PAGE_SIZE);
-
 	return 0;
 
 out:
@@ -3545,7 +3547,7 @@ out4:
 static __exit void kvm_exit(void)
 {
 	kvm_exit_debug();
-	__free_page(pfn_to_page(bad_page_address >> PAGE_SHIFT));
+	__free_page(bad_page);
 	kvm_mmu_module_exit();
 }
 
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index bbf5eb4..2ad14fb 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -850,23 +850,17 @@ static void page_header_update_slot(struct kvm *kvm, void *pte, gpa_t gpa)
 	__set_bit(slot, &page_head->slot_bitmap);
 }
 
-hpa_t safe_gpa_to_hpa(struct kvm *kvm, gpa_t gpa)
-{
-	hpa_t hpa = gpa_to_hpa(kvm, gpa);
-
-	return is_error_hpa(hpa) ? bad_page_address | (gpa & ~PAGE_MASK): hpa;
-}
-
 hpa_t gpa_to_hpa(struct kvm *kvm, gpa_t gpa)
 {
 	struct page *page;
+	hpa_t hpa;
 
 	ASSERT((gpa & HPA_ERR_MASK) == 0);
 	page = gfn_to_page(kvm, gpa >> PAGE_SHIFT);
-	if (!page)
-		return gpa | HPA_ERR_MASK;
-	return ((hpa_t)page_to_pfn(page) << PAGE_SHIFT)
-		| (gpa & (PAGE_SIZE-1));
+	hpa = ((hpa_t)page_to_pfn(page) << PAGE_SHIFT) | (gpa & (PAGE_SIZE-1));
+	if (is_error_page(page))
+		return hpa | HPA_ERR_MASK;
+	return hpa;
 }
 
 hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva)
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index bab1b7f..572e5b6 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -72,8 +72,6 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 			    struct kvm_vcpu *vcpu, gva_t addr,
 			    int write_fault, int user_fault, int fetch_fault)
 {
-	hpa_t hpa;
-	struct kvm_memory_slot *slot;
 	struct page *page;
 	pt_element_t *table;
 	pt_element_t pte;
@@ -105,9 +103,8 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 		pgprintk("%s: table_gfn[%d] %lx\n", __FUNCTION__,
 			 walker->level - 1, table_gfn);
 
-		slot = gfn_to_memslot(vcpu->kvm, table_gfn);
-		hpa = safe_gpa_to_hpa(vcpu->kvm, pte & PT64_BASE_ADDR_MASK);
-		page = pfn_to_page(hpa >> PAGE_SHIFT);
+		page = gfn_to_page(vcpu->kvm, (pte & PT64_BASE_ADDR_MASK)
+				   >> PAGE_SHIFT);
 
 		table = kmap_atomic(page, KM_USER0);
 		pte = table[index];
-- 
1.5.3.7



* [PATCH 48/50] KVM: MMU: Partial swapping of guest memory
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (46 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 47/50] KVM: MMU: Make gfn_to_page() always safe Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 49/50] KVM: Use virtual cpu accounting if available for guest times Avi Kivity
  2007-12-23 14:51   ` [PATCH 50/50] KVM: Allocate userspace memory for older userspace Avi Kivity
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Izik Eidus

From: Izik Eidus <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>

This allows guest memory to be swapped.  Pages which are currently mapped
via shadow page tables are pinned into memory, but all other pages can
be freely swapped.

The patch makes gfn_to_page() elevate the page's reference count, and
introduces kvm_release_page() that pairs with it.

Signed-off-by: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h         |    2 +
 drivers/kvm/kvm_main.c    |   83 +++++++++++++++++++++++++--------------------
 drivers/kvm/mmu.c         |   14 +++++++-
 drivers/kvm/paging_tmpl.h |   26 ++++++++++++--
 4 files changed, 84 insertions(+), 41 deletions(-)
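
The rule callers have to follow after this patch: every gfn_to_page()
takes a reference, including when it returns bad_page, and every path out
of the caller must drop it with kvm_release_page().  A toy model with a
plain integer refcount; the sketch_* names are hypothetical and the dirty
flag stands in for the SetPageDirty() call in the patch:

```c
#define SKETCH_NPAGES 2

struct sketch_page {
	int refcount;
	int dirty;
};

static struct sketch_page sketch_bad_page = { 1, 0 };
static struct sketch_page sketch_mem[SKETCH_NPAGES] = { { 1, 0 }, { 1, 0 } };

/* Every return path hands the caller one reference. */
static struct sketch_page *sketch_gfn_to_page(unsigned long gfn)
{
	struct sketch_page *page;

	page = gfn < SKETCH_NPAGES ? &sketch_mem[gfn] : &sketch_bad_page;
	++page->refcount;
	return page;
}

/* Pairs with sketch_gfn_to_page(): mark dirty, then drop the reference. */
static void sketch_release_page(struct sketch_page *page)
{
	page->dirty = 1;
	--page->refcount;
}

/* Caller shape used throughout the patch: release on error and success. */
static int sketch_touch_guest_page(unsigned long gfn)
{
	struct sketch_page *page = sketch_gfn_to_page(gfn);

	if (page == &sketch_bad_page) {
		sketch_release_page(page);
		return -1;
	}
	/* ... map the page and access guest memory here ... */
	sketch_release_page(page);
	return 0;
}
```

The long-lived exception is a page installed in a shadow pte: there the
reference is kept while the mapping exists and is dropped in
rmap_remove(), which is what pins mapped pages in memory.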

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 0c17c76..df0711c 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -409,6 +409,7 @@ struct kvm_memory_slot {
 	unsigned long *rmap;
 	unsigned long *dirty_bitmap;
 	int user_alloc; /* user allocated memory */
+	unsigned long userspace_addr;
 };
 
 struct kvm {
@@ -570,6 +571,7 @@ extern struct page *bad_page;
 int is_error_page(struct page *page);
 gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn);
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
+void kvm_release_page(struct page *page);
 int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 			int len);
 int kvm_read_guest(struct kvm *kvm, gpa_t gpa, void *data, unsigned long len);
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index ebfb967..1c64047 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -300,19 +300,6 @@ static struct kvm *kvm_create_vm(void)
 	return kvm;
 }
 
-static void kvm_free_userspace_physmem(struct kvm_memory_slot *free)
-{
-	int i;
-
-	for (i = 0; i < free->npages; ++i) {
-		if (free->phys_mem[i]) {
-			if (!PageReserved(free->phys_mem[i]))
-				SetPageDirty(free->phys_mem[i]);
-			page_cache_release(free->phys_mem[i]);
-		}
-	}
-}
-
 static void kvm_free_kernel_physmem(struct kvm_memory_slot *free)
 {
 	int i;
@@ -330,9 +317,7 @@ static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
 {
 	if (!dont || free->phys_mem != dont->phys_mem)
 		if (free->phys_mem) {
-			if (free->user_alloc)
-				kvm_free_userspace_physmem(free);
-			else
+			if (!free->user_alloc)
 				kvm_free_kernel_physmem(free);
 			vfree(free->phys_mem);
 		}
@@ -361,7 +346,7 @@ static void free_pio_guest_pages(struct kvm_vcpu *vcpu)
 
 	for (i = 0; i < ARRAY_SIZE(vcpu->pio.guest_pages); ++i)
 		if (vcpu->pio.guest_pages[i]) {
-			__free_page(vcpu->pio.guest_pages[i]);
+			kvm_release_page(vcpu->pio.guest_pages[i]);
 			vcpu->pio.guest_pages[i] = NULL;
 		}
 }
@@ -752,19 +737,8 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 		memset(new.phys_mem, 0, npages * sizeof(struct page *));
 		memset(new.rmap, 0, npages * sizeof(*new.rmap));
 		if (user_alloc) {
-			unsigned long pages_num;
-
 			new.user_alloc = 1;
-			down_read(&current->mm->mmap_sem);
-
-			pages_num = get_user_pages(current, current->mm,
-						   mem->userspace_addr,
-						   npages, 1, 0, new.phys_mem,
-						   NULL);
-
-			up_read(&current->mm->mmap_sem);
-			if (pages_num != npages)
-				goto out_unlock;
+			new.userspace_addr = mem->userspace_addr;
 		} else {
 			for (i = 0; i < npages; ++i) {
 				new.phys_mem[i] = alloc_page(GFP_HIGHUSER
@@ -1039,12 +1013,39 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 
 	gfn = unalias_gfn(kvm, gfn);
 	slot = __gfn_to_memslot(kvm, gfn);
-	if (!slot)
+	if (!slot) {
+		get_page(bad_page);
 		return bad_page;
+	}
+	if (slot->user_alloc) {
+		struct page *page[1];
+		int npages;
+
+		down_read(&current->mm->mmap_sem);
+		npages = get_user_pages(current, current->mm,
+					slot->userspace_addr
+					+ (gfn - slot->base_gfn) * PAGE_SIZE, 1,
+					1, 0, page, NULL);
+		up_read(&current->mm->mmap_sem);
+		if (npages != 1) {
+			get_page(bad_page);
+			return bad_page;
+		}
+		return page[0];
+	}
+	get_page(slot->phys_mem[gfn - slot->base_gfn]);
 	return slot->phys_mem[gfn - slot->base_gfn];
 }
 EXPORT_SYMBOL_GPL(gfn_to_page);
 
+void kvm_release_page(struct page *page)
+{
+	if (!PageReserved(page))
+		SetPageDirty(page);
+	put_page(page);
+}
+EXPORT_SYMBOL_GPL(kvm_release_page);
+
 static int next_segment(unsigned long len, int offset)
 {
 	if (len > PAGE_SIZE - offset)
@@ -1060,13 +1061,16 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (is_error_page(page))
+	if (is_error_page(page)) {
+		kvm_release_page(page);
 		return -EFAULT;
+	}
 	page_virt = kmap_atomic(page, KM_USER0);
 
 	memcpy(data, page_virt + offset, len);
 
 	kunmap_atomic(page_virt, KM_USER0);
+	kvm_release_page(page);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(kvm_read_guest_page);
@@ -1098,14 +1102,17 @@ int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data,
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (is_error_page(page))
+	if (is_error_page(page)) {
+		kvm_release_page(page);
 		return -EFAULT;
+	}
 	page_virt = kmap_atomic(page, KM_USER0);
 
 	memcpy(page_virt + offset, data, len);
 
 	kunmap_atomic(page_virt, KM_USER0);
 	mark_page_dirty(kvm, gfn);
+	kvm_release_page(page);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(kvm_write_guest_page);
@@ -1136,13 +1143,16 @@ int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len)
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (is_error_page(page))
+	if (is_error_page(page)) {
+		kvm_release_page(page);
 		return -EFAULT;
+	}
 	page_virt = kmap_atomic(page, KM_USER0);
 
 	memset(page_virt + offset, 0, len);
 
 	kunmap_atomic(page_virt, KM_USER0);
+	kvm_release_page(page);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(kvm_clear_guest_page);
@@ -2070,8 +2080,6 @@ int kvm_emulate_pio_string(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
 	for (i = 0; i < nr_pages; ++i) {
 		mutex_lock(&vcpu->kvm->lock);
 		page = gva_to_page(vcpu, address + i * PAGE_SIZE);
-		if (page)
-			get_page(page);
 		vcpu->pio.guest_pages[i] = page;
 		mutex_unlock(&vcpu->kvm->lock);
 		if (!page) {
@@ -3074,9 +3082,10 @@ static struct page *kvm_vm_nopage(struct vm_area_struct *vma,
 
 	pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
 	page = gfn_to_page(kvm, pgoff);
-	if (is_error_page(page))
+	if (is_error_page(page)) {
+		kvm_release_page(page);
 		return NOPAGE_SIGBUS;
-	get_page(page);
+	}
 	if (type != NULL)
 		*type = VM_FAULT_MINOR;
 
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index 2ad14fb..5d7af4b 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -425,6 +425,8 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
 	if (!is_rmap_pte(*spte))
 		return;
 	page = page_header(__pa(spte));
+	kvm_release_page(pfn_to_page((*spte & PT64_BASE_ADDR_MASK) >>
+			 PAGE_SHIFT));
 	rmapp = gfn_to_rmap(kvm, page->gfns[spte - page->spt]);
 	if (!*rmapp) {
 		printk(KERN_ERR "rmap_remove: %p %llx 0->BUG\n", spte, *spte);
@@ -911,6 +913,8 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
 								PT_USER_MASK;
 			if (!was_rmapped)
 				rmap_add(vcpu, &table[index], v >> PAGE_SHIFT);
+			else
+				kvm_release_page(pfn_to_page(p >> PAGE_SHIFT));
 			return 0;
 		}
 
@@ -925,6 +929,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, hpa_t p)
 						     1, 3, &table[index]);
 			if (!new_table) {
 				pgprintk("nonpaging_map: ENOMEM\n");
+				kvm_release_page(pfn_to_page(p >> PAGE_SHIFT));
 				return -ENOMEM;
 			}
 
@@ -1039,8 +1044,11 @@ static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
 
 	paddr = gpa_to_hpa(vcpu->kvm, addr & PT64_BASE_ADDR_MASK);
 
-	if (is_error_hpa(paddr))
+	if (is_error_hpa(paddr)) {
+		kvm_release_page(pfn_to_page((paddr & PT64_BASE_ADDR_MASK)
+				 >> PAGE_SHIFT));
 		return 1;
+	}
 
 	return nonpaging_map(vcpu, addr & PAGE_MASK, paddr);
 }
@@ -1507,6 +1515,7 @@ static void audit_mappings_page(struct kvm_vcpu *vcpu, u64 page_pte,
 		} else {
 			gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, va);
 			hpa_t hpa = gpa_to_hpa(vcpu, gpa);
+			struct page *page;
 
 			if (is_shadow_present_pte(ent)
 			    && (ent & PT64_BASE_ADDR_MASK) != hpa)
@@ -1519,6 +1528,9 @@ static void audit_mappings_page(struct kvm_vcpu *vcpu, u64 page_pte,
 				 && !is_error_hpa(hpa))
 				printk(KERN_ERR "audit: (%s) notrap shadow,"
 				       " valid guest gva %lx\n", audit_msg, va);
+			page = pfn_to_page((gpa & PT64_BASE_ADDR_MASK)
+					   >> PAGE_SHIFT);
+			kvm_release_page(page);
 
 		}
 	}
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index 572e5b6..0f0266a 100644
--- a/drivers/kvm/paging_tmpl.h
+++ b/drivers/kvm/paging_tmpl.h
@@ -72,7 +72,7 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 			    struct kvm_vcpu *vcpu, gva_t addr,
 			    int write_fault, int user_fault, int fetch_fault)
 {
-	struct page *page;
+	struct page *page = NULL;
 	pt_element_t *table;
 	pt_element_t pte;
 	gfn_t table_gfn;
@@ -149,6 +149,7 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 
 		walker->inherited_ar &= pte;
 		--walker->level;
+		kvm_release_page(page);
 	}
 
 	if (write_fault && !is_dirty_pte(pte)) {
@@ -162,6 +163,7 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 		kvm_mmu_pte_write(vcpu, pte_gpa, (u8 *)&pte, sizeof(pte));
 	}
 
+	kvm_release_page(page);
 	walker->pte = pte;
 	pgprintk("%s: pte %llx\n", __FUNCTION__, (u64)pte);
 	return 1;
@@ -180,6 +182,8 @@ err:
 		walker->error_code |= PFERR_USER_MASK;
 	if (fetch_fault)
 		walker->error_code |= PFERR_FETCH_MASK;
+	if (page)
+		kvm_release_page(page);
 	return 0;
 }
 
@@ -223,6 +227,8 @@ static void FNAME(set_pte_common)(struct kvm_vcpu *vcpu,
 	if (is_error_hpa(paddr)) {
 		set_shadow_pte(shadow_pte,
 			       shadow_trap_nonpresent_pte | PT_SHADOW_IO_MARK);
+		kvm_release_page(pfn_to_page((paddr & PT64_BASE_ADDR_MASK)
+					     >> PAGE_SHIFT));
 		return;
 	}
 
@@ -260,9 +266,20 @@ unshadowed:
 	pgprintk("%s: setting spte %llx\n", __FUNCTION__, spte);
 	set_shadow_pte(shadow_pte, spte);
 	page_header_update_slot(vcpu->kvm, shadow_pte, gaddr);
-	if (!was_rmapped)
+	if (!was_rmapped) {
 		rmap_add(vcpu, shadow_pte, (gaddr & PT64_BASE_ADDR_MASK)
 			 >> PAGE_SHIFT);
+		if (!is_rmap_pte(*shadow_pte)) {
+			struct page *page;
+
+			page = pfn_to_page((paddr & PT64_BASE_ADDR_MASK)
+					   >> PAGE_SHIFT);
+			kvm_release_page(page);
+		}
+	}
+	else
+		kvm_release_page(pfn_to_page((paddr & PT64_BASE_ADDR_MASK)
+				 >> PAGE_SHIFT));
 	if (!ptwrite || !*ptwrite)
 		vcpu->last_pte_updated = shadow_pte;
 }
@@ -486,19 +503,22 @@ static void FNAME(prefetch_page)(struct kvm_vcpu *vcpu,
 {
 	int i;
 	pt_element_t *gpt;
+	struct page *page;
 
 	if (sp->role.metaphysical || PTTYPE == 32) {
 		nonpaging_prefetch_page(vcpu, sp);
 		return;
 	}
 
-	gpt = kmap_atomic(gfn_to_page(vcpu->kvm, sp->gfn), KM_USER0);
+	page = gfn_to_page(vcpu->kvm, sp->gfn);
+	gpt = kmap_atomic(page, KM_USER0);
 	for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
 		if (is_present_pte(gpt[i]))
 			sp->spt[i] = shadow_trap_nonpresent_pte;
 		else
 			sp->spt[i] = shadow_notrap_nonpresent_pte;
 	kunmap_atomic(gpt, KM_USER0);
+	kvm_release_page(page);
 }
 
 #undef pt_element_t
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 49/50] KVM: Use virtual cpu accounting if available for guest times.
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (47 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 48/50] KVM: MMU: Partial swapping of guest memory Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  2007-12-23 14:51   ` [PATCH 50/50] KVM: Allocate userspace memory for older userspace Avi Kivity
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: Christian Borntraeger

From: Christian Borntraeger <borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>

ppc and s390 can track process times precisely by reading the cpu
timer on every context switch, irq, softirq, etc.  We can use that
infrastructure for guest time accounting as well.  The time used so
far must be accounted before we change the state, so this patch adds
a call to account_system_vtime() to kvm_guest_enter() and
kvm_guest_exit().  If CONFIG_VIRT_CPU_ACCOUNTING is not set,
account_system_vtime() is defined in hardirq.h as an empty function,
which means this patch does not change the behaviour on other
platforms.

I compile tested this patch on x86 and function tested the patch on
s390.
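
The ordering requirement (account first, then flip the state) can be
illustrated with a toy userspace model.  This is only a sketch: the
struct, the fake clock, the account_vtime() helper and the PF_VCPU value
are all made up for illustration and are not the kernel's
account_system_vtime() implementation.

```c
/* Toy model: time elapsed since the last stamp is charged to whichever
 * state the task was in *before* the transition. */
#define PF_VCPU 0x10u	/* flag value assumed for this sketch */

struct task {
	unsigned int flags;
	unsigned long last_stamp;	/* clock value at last accounting */
	unsigned long system_time;	/* time charged outside the guest */
	unsigned long guest_time;	/* time charged inside the guest */
};

static unsigned long fake_clock;	/* stands in for the cpu timer */

/* Hypothetical stand-in for account_system_vtime(). */
static void account_vtime(struct task *t)
{
	unsigned long delta = fake_clock - t->last_stamp;

	if (t->flags & PF_VCPU)
		t->guest_time += delta;
	else
		t->system_time += delta;
	t->last_stamp = fake_clock;
}

static void guest_enter(struct task *t)
{
	account_vtime(t);	/* charge the time spent before entry */
	t->flags |= PF_VCPU;
}

static void guest_exit(struct task *t)
{
	account_vtime(t);	/* charge the time spent in the guest */
	t->flags &= ~PF_VCPU;
}
```

If the flag were changed before the accounting call, the interval that
just ended would be charged to the wrong bucket.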

Signed-off-by: Christian Borntraeger <borntraeger-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index df0711c..e8a21e8 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -7,6 +7,7 @@
  */
 
 #include <linux/types.h>
+#include <linux/hardirq.h>
 #include <linux/list.h>
 #include <linux/mutex.h>
 #include <linux/spinlock.h>
@@ -671,11 +672,13 @@ __init void kvm_arch_init(void);
 
 static inline void kvm_guest_enter(void)
 {
+	account_system_vtime(current);
 	current->flags |= PF_VCPU;
 }
 
 static inline void kvm_guest_exit(void)
 {
+	account_system_vtime(current);
 	current->flags &= ~PF_VCPU;
 }
 
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 50/50] KVM: Allocate userspace memory for older userspace
       [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
                     ` (48 preceding siblings ...)
  2007-12-23 14:51   ` [PATCH 49/50] KVM: Use virtual cpu accounting if available for guest times Avi Kivity
@ 2007-12-23 14:51   ` Avi Kivity
  49 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 14:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

From: Anthony Liguori <aliguori-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>

Allocate a userspace buffer for older userspaces, and eliminate the
phys_mem buffer.  The memset() in kvmctl really kills initial memory
usage, but swapping works even with old userspaces.

A side effect is that the maximum guest size is reduced for older userspace on
i386.
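
With phys_mem gone, every guest frame is resolved through the slot's
userspace address.  A minimal sketch of the address arithmetic used in
gfn_to_page() in the diff below; the struct is a trimmed stand-in for
struct kvm_memory_slot, and the helper name, the range check and the
4 KiB page size are assumptions made for illustration.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed 4 KiB pages, as on x86 */

typedef uint64_t gfn_t;

/* Trimmed stand-in for struct kvm_memory_slot after this patch. */
struct memslot {
	gfn_t base_gfn;
	unsigned long npages;
	unsigned long userspace_addr;
};

/*
 * Hypothetical helper: map a guest frame number to the userspace
 * virtual address backing it.  Returns 0 when the gfn lies outside
 * the slot (a convention chosen for this sketch only).
 */
static unsigned long gfn_to_uaddr(const struct memslot *slot, gfn_t gfn)
{
	if (gfn < slot->base_gfn || gfn >= slot->base_gfn + slot->npages)
		return 0;

	return slot->userspace_addr
		+ (unsigned long)(gfn - slot->base_gfn) * SKETCH_PAGE_SIZE;
}
```

In the patch itself the resulting address is handed to get_user_pages()
to obtain the struct page.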

Signed-off-by: Anthony Liguori <aliguori-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
---
 drivers/kvm/kvm.h      |    2 -
 drivers/kvm/kvm_main.c |   83 +++++++++++++++++------------------------------
 2 files changed, 30 insertions(+), 55 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index e8a21e8..eb006ed 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -406,10 +406,8 @@ struct kvm_memory_slot {
 	gfn_t base_gfn;
 	unsigned long npages;
 	unsigned long flags;
-	struct page **phys_mem;
 	unsigned long *rmap;
 	unsigned long *dirty_bitmap;
-	int user_alloc; /* user allocated memory */
 	unsigned long userspace_addr;
 };
 
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 1c64047..3aec716 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -42,6 +42,7 @@
 #include <linux/profile.h>
 #include <linux/kvm_para.h>
 #include <linux/pagemap.h>
+#include <linux/mman.h>
 
 #include <asm/processor.h>
 #include <asm/msr.h>
@@ -300,36 +301,21 @@ static struct kvm *kvm_create_vm(void)
 	return kvm;
 }
 
-static void kvm_free_kernel_physmem(struct kvm_memory_slot *free)
-{
-	int i;
-
-	for (i = 0; i < free->npages; ++i)
-		if (free->phys_mem[i])
-			__free_page(free->phys_mem[i]);
-}
-
 /*
  * Free any memory in @free but not in @dont.
  */
 static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
 				  struct kvm_memory_slot *dont)
 {
-	if (!dont || free->phys_mem != dont->phys_mem)
-		if (free->phys_mem) {
-			if (!free->user_alloc)
-				kvm_free_kernel_physmem(free);
-			vfree(free->phys_mem);
-		}
 	if (!dont || free->rmap != dont->rmap)
 		vfree(free->rmap);
 
 	if (!dont || free->dirty_bitmap != dont->dirty_bitmap)
 		vfree(free->dirty_bitmap);
 
-	free->phys_mem = NULL;
 	free->npages = 0;
 	free->dirty_bitmap = NULL;
+	free->rmap = NULL;
 }
 
 static void kvm_free_physmem(struct kvm *kvm)
@@ -712,10 +698,6 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 			goto out_unlock;
 	}
 
-	/* Deallocate if slot is being removed */
-	if (!npages)
-		new.phys_mem = NULL;
-
 	/* Free page dirty bitmap if unneeded */
 	if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES))
 		new.dirty_bitmap = NULL;
@@ -723,29 +705,27 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 	r = -ENOMEM;
 
 	/* Allocate if a slot is being created */
-	if (npages && !new.phys_mem) {
-		new.phys_mem = vmalloc(npages * sizeof(struct page *));
-
-		if (!new.phys_mem)
-			goto out_unlock;
-
+	if (npages && !new.rmap) {
 		new.rmap = vmalloc(npages * sizeof(struct page *));
 
 		if (!new.rmap)
 			goto out_unlock;
 
-		memset(new.phys_mem, 0, npages * sizeof(struct page *));
 		memset(new.rmap, 0, npages * sizeof(*new.rmap));
-		if (user_alloc) {
-			new.user_alloc = 1;
+
+		if (user_alloc)
 			new.userspace_addr = mem->userspace_addr;
-		} else {
-			for (i = 0; i < npages; ++i) {
-				new.phys_mem[i] = alloc_page(GFP_HIGHUSER
-							     | __GFP_ZERO);
-				if (!new.phys_mem[i])
-					goto out_unlock;
-			}
+		else {
+			down_write(&current->mm->mmap_sem);
+			new.userspace_addr = do_mmap(NULL, 0,
+						     npages * PAGE_SIZE,
+						     PROT_READ | PROT_WRITE,
+						     MAP_SHARED | MAP_ANONYMOUS,
+						     0);
+			up_write(&current->mm->mmap_sem);
+
+			if (IS_ERR((void *)new.userspace_addr))
+				goto out_unlock;
 		}
 	}
 
@@ -1010,6 +990,8 @@ struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 {
 	struct kvm_memory_slot *slot;
+	struct page *page[1];
+	int npages;
 
 	gfn = unalias_gfn(kvm, gfn);
 	slot = __gfn_to_memslot(kvm, gfn);
@@ -1017,24 +999,19 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 		get_page(bad_page);
 		return bad_page;
 	}
-	if (slot->user_alloc) {
-		struct page *page[1];
-		int npages;
-
-		down_read(&current->mm->mmap_sem);
-		npages = get_user_pages(current, current->mm,
-					slot->userspace_addr
-					+ (gfn - slot->base_gfn) * PAGE_SIZE, 1,
-					1, 0, page, NULL);
-		up_read(&current->mm->mmap_sem);
-		if (npages != 1) {
-			get_page(bad_page);
-			return bad_page;
-		}
-		return page[0];
+
+	down_read(&current->mm->mmap_sem);
+	npages = get_user_pages(current, current->mm,
+				slot->userspace_addr
+				+ (gfn - slot->base_gfn) * PAGE_SIZE, 1,
+				1, 0, page, NULL);
+	up_read(&current->mm->mmap_sem);
+	if (npages != 1) {
+		get_page(bad_page);
+		return bad_page;
 	}
-	get_page(slot->phys_mem[gfn - slot->base_gfn]);
-	return slot->phys_mem[gfn - slot->base_gfn];
+
+	return page[0];
 }
 EXPORT_SYMBOL_GPL(gfn_to_page);
 
-- 
1.5.3.7



^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 27/50] KVM: Support assigning userspace memory to the guest
       [not found]     ` <1198421495-31481-28-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-12-23 18:16       ` Avi Kivity
  0 siblings, 0 replies; 52+ messages in thread
From: Avi Kivity @ 2007-12-23 18:16 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Avi Kivity wrote:
> From: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
>
> Instead of having the kernel allocate memory to the guest, let userspace
> allocate it and pass the address to the kernel.
>
> This is required for s390 support, but also enables features like memory
> sharing and using hugetlbfs backed memory.
>
>   

[...]

> @@ -728,11 +752,27 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
>  
>  		memset(new.phys_mem, 0, npages * sizeof(struct page *));
>  		memset(new.rmap, 0, npages * sizeof(*new.rmap));
> -		for (i = 0; i < npages; ++i) {
> -			new.phys_mem[i] = alloc_page(GFP_HIGHUSER
> -						     | __GFP_ZERO);
> -			if (!new.phys_mem[i])
> +		if (user_alloc) {
> +			unsigned long pages_num;
> +
> +			new.user_alloc = 1;
> +			down_read(&current->mm->mmap_sem);
> +
> +			pages_num = get_user_pages(current, current->mm,
> +						   mem->userspace_addr,
> +						   npages, 1, 0, new.phys_mem,
> +						   NULL);
> +
>   

I just combined a patch that changes the 'force' parameter of
get_user_pages from 0 to 1 into this patch, to avoid introducing a bug
and its fix in the same patchset.  I won't be resending this patch since
the change is too trivial.  The same change applies to patch 48, "KVM: MMU:
Partial swapping of guest memory".

-- 
error compiling committee.c: too many arguments to function



^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2007-12-23 18:16 UTC | newest]

Thread overview: 52+ messages
2007-12-23 14:50 [PATCH 00/50] KVM patch queue review for 2.6.25 merge window (part I) Avi Kivity
     [not found] ` <1198421495-31481-1-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-12-23 14:50   ` [PATCH 01/50] KVM: x86 emulator: Add vmmcall/vmcall to x86_emulate (v3) Avi Kivity
2007-12-23 14:50   ` [PATCH 02/50] KVM: Refactor hypercall infrastructure (v3) Avi Kivity
2007-12-23 14:50   ` [PATCH 03/50] KVM: x86 emulator: remove unused functions Avi Kivity
2007-12-23 14:50   ` [PATCH 04/50] KVM: x86 emulator: move all x86_emulate_memop() to a structure Avi Kivity
2007-12-23 14:50   ` [PATCH 05/50] KVM: x86 emulator: move all decoding process to function x86_decode_insn() Avi Kivity
2007-12-23 14:50   ` [PATCH 06/50] KVM: emulate_instruction() calls now x86_decode_insn() and x86_emulate_insn() Avi Kivity
2007-12-23 14:50   ` [PATCH 07/50] KVM: Call x86_decode_insn() only when needed Avi Kivity
2007-12-23 14:50   ` [PATCH 08/50] KVM: VMX: Further reduce efer reloads Avi Kivity
2007-12-23 14:50   ` [PATCH 09/50] KVM: Allow not-present guest page faults to bypass kvm Avi Kivity
2007-12-23 14:50   ` [PATCH 10/50] KVM: MMU: Make flooding detection work when guest page faults are bypassed Avi Kivity
2007-12-23 14:50   ` [PATCH 11/50] KVM: MMU: Ignore reserved bits in cr3 in non-pae mode Avi Kivity
2007-12-23 14:50   ` [PATCH 12/50] KVM: x86 emulator: split some decoding into functions for readability Avi Kivity
2007-12-23 14:50   ` [PATCH 13/50] KVM: x86 emulator: remove _eflags and use directly ctxt->eflags Avi Kivity
2007-12-23 14:50   ` [PATCH 14/50] KVM: x86 emulator: Remove no_wb, use dst.type = OP_NONE instead Avi Kivity
2007-12-23 14:51   ` [PATCH 15/50] KVM: x86_emulator: no writeback for bt Avi Kivity
2007-12-23 14:51   ` [PATCH 16/50] KVM: Purify x86_decode_insn() error case management Avi Kivity
2007-12-23 14:51   ` [PATCH 17/50] KVM: x86 emulator: Any legacy prefix after a REX prefix nullifies its effect Avi Kivity
2007-12-23 14:51   ` [PATCH 18/50] KVM: VMX: Don't clear the vmcs if the vcpu is not loaded on any processor Avi Kivity
2007-12-23 14:51   ` [PATCH 19/50] KVM: VMX: Simplify vcpu_clear() Avi Kivity
2007-12-23 14:51   ` [PATCH 20/50] KVM: Remove the usage of page->private field by rmap Avi Kivity
2007-12-23 14:51   ` [PATCH 21/50] KVM: Add general accessors to read and write guest memory Avi Kivity
2007-12-23 14:51   ` [PATCH 22/50] KVM: Allow dynamic allocation of the mmu shadow cache size Avi Kivity
2007-12-23 14:51   ` [PATCH 23/50] KVM: Add kvm_free_lapic() to pair with kvm_create_lapic() Avi Kivity
2007-12-23 14:51   ` [PATCH 24/50] KVM: Hoist kvm_create_lapic() into kvm_vcpu_init() Avi Kivity
2007-12-23 14:51   ` [PATCH 25/50] KVM: Remove gratuitous casts from lapic.c Avi Kivity
2007-12-23 14:51   ` [PATCH 26/50] KVM: CodingStyle cleanup Avi Kivity
2007-12-23 14:51   ` [PATCH 27/50] KVM: Support assigning userspace memory to the guest Avi Kivity
     [not found]     ` <1198421495-31481-28-git-send-email-avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-12-23 18:16       ` Avi Kivity
2007-12-23 14:51   ` [PATCH 28/50] KVM: Move x86 msr handling to new files x86.[ch] Avi Kivity
2007-12-23 14:51   ` [PATCH 29/50] KVM: MMU: Clean up MMU functions to take struct kvm when appropriate Avi Kivity
2007-12-23 14:51   ` [PATCH 30/50] KVM: MMU: More struct kvm_vcpu -> struct kvm cleanups Avi Kivity
2007-12-23 14:51   ` [PATCH 31/50] KVM: Move guest pte dirty bit management to the guest pagetable walker Avi Kivity
2007-12-23 14:51   ` [PATCH 32/50] KVM: MMU: Fix nx access bit for huge pages Avi Kivity
2007-12-23 14:51   ` [PATCH 33/50] KVM: MMU: Disable write access on clean large pages Avi Kivity
2007-12-23 14:51   ` [PATCH 34/50] KVM: MMU: Instantiate real-mode shadows as user writable shadows Avi Kivity
2007-12-23 14:51   ` [PATCH 35/50] KVM: MMU: Move dirty bit updates to a separate function Avi Kivity
2007-12-23 14:51   ` [PATCH 36/50] KVM: MMU: When updating the dirty bit, inform the mmu about it Avi Kivity
2007-12-23 14:51   ` [PATCH 37/50] KVM: Portability: split kvm_vcpu_ioctl Avi Kivity
2007-12-23 14:51   ` [PATCH 38/50] KVM: apic round robin cleanup Avi Kivity
2007-12-23 14:51   ` [PATCH 39/50] KVM: Add some \n in ioapic_debug() Avi Kivity
2007-12-23 14:51   ` [PATCH 40/50] KVM: Move apic timer interrupt backlog processing to common code Avi Kivity
2007-12-23 14:51   ` [PATCH 41/50] KVM: Rename KVM_TLB_FLUSH to KVM_REQ_TLB_FLUSH Avi Kivity
2007-12-23 14:51   ` [PATCH 42/50] KVM: x86 emulator: Implement emulation of instruction: inc & dec Avi Kivity
2007-12-23 14:51   ` [PATCH 43/50] KVM: MMU: Simplify page table walker Avi Kivity
2007-12-23 14:51   ` [PATCH 44/50] KVM: x86 emulator: cmc, clc, cli, sti Avi Kivity
2007-12-23 14:51   ` [PATCH 45/50] KVM: MMU: Add rmap_next(), a helper for walking kvm rmaps Avi Kivity
2007-12-23 14:51   ` [PATCH 46/50] KVM: MMU: Keep a reverse mapping of non-writable translations Avi Kivity
2007-12-23 14:51   ` [PATCH 47/50] KVM: MMU: Make gfn_to_page() always safe Avi Kivity
2007-12-23 14:51   ` [PATCH 48/50] KVM: MMU: Partial swapping of guest memory Avi Kivity
2007-12-23 14:51   ` [PATCH 49/50] KVM: Use virtual cpu accounting if available for guest times Avi Kivity
2007-12-23 14:51   ` [PATCH 50/50] KVM: Allocate userspace memory for older userspace Avi Kivity
