* [RFC][PATCH]Memory mapped TPR shadow feature enabling
@ 2007-09-25 5:52 Yang, Sheng
From: Yang, Sheng @ 2007-09-25 5:52 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
[-- Attachment #1: Type: text/plain, Size: 15021 bytes --]
These patches enable the memory-mapped TPR shadow (FlexPriority).
Because 32-bit Windows guests, SMP guests especially, access the TPR
very frequently, we saw a significant performance gain with
FlexPriority enabled.
The issue is that FlexPriority needs an extra memory slot in the VM so
that shadow paging works with the APIC access page.
We don't like the idea of adding a memory slot, but there is no better
choice for now. Our proposal is to add a p2m table to KVM, but that
still seems a long way off.
BTW: I didn't use the offset (or other info) provided by the CPU when
handling the APIC access vmexit. Instead, I used a bit in cmd_type
(which now also carries no_decode) to tell the emulator to decode the
memory operand by itself when necessary. That's because I only get the
guest physical address when handling the APIC access vmexit, but the
emulator needs a guest virtual address to fit its flow. I have tried
several approaches, and the current solution seems the most
appropriate one.
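The cmd_type change can be read as turning the old boolean no_decode argument into a small bit mask. A minimal sketch, assuming only the two flag values from the x86_emulate.h hunk of the patch; the helper names are hypothetical and exist purely for illustration:

```c
/* Flag values as defined in the patch's x86_emulate.h hunk. */
#define EMULCMD_NO_DECODE   (1 << 0)  /* skip instruction decode (old no_decode = 1) */
#define EMULCMD_DECODE_ADDR (1 << 1)  /* emulator works out the memory operand address itself */

/* Hypothetical helper: mirrors the test the patch uses in emulate_instruction(). */
static int emulcmd_wants_decode(int cmd_type)
{
	return (cmd_type & EMULCMD_NO_DECODE) == 0;
}

/* Hypothetical helper: mirrors the test the patch adds to the SrcMem/DstMem paths. */
static int emulcmd_wants_addr_decode(int cmd_type)
{
	return (cmd_type & EMULCMD_DECODE_ADDR) != 0;
}
```

With this encoding, a caller that used to pass no_decode = 1 passes EMULCMD_NO_DECODE, and the APIC access handler passes EMULCMD_DECODE_ADDR, which still requests a normal decode because bit 0 is clear.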
Thanks
Yang, Sheng
[-- Attachment #2: 0001-Add-a-slot-for-apic-access-usage-not-elegant-but-no.patch --]
[-- Type: application/octet-stream, Size: 1756 bytes --]
From 3e83b579d0e9368f0f8223c24eac9898b9623aa2 Mon Sep 17 00:00:00 2001
From: Sheng Yang <sheng.yang@intel.com>
Date: Fri, 14 Sep 2007 09:51:54 +0800
Subject: [PATCH] Add a slot for apic access usage, not elegant but no choice
---
user/kvmctl.c | 13 +++++++++++++
1 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/user/kvmctl.c b/user/kvmctl.c
index f358dc1..7e75945 100644
--- a/user/kvmctl.c
+++ b/user/kvmctl.c
@@ -248,6 +248,7 @@ int kvm_create(kvm_context_t kvm, unsigned long phys_mem_bytes, void **vm_mem)
unsigned long dosmem = 0xa0000;
unsigned long exmem = 0xc0000;
unsigned long pcimem = 0xf0000000;
+ unsigned long apicmem= 0xfee00000;
unsigned long memory = (phys_mem_bytes + PAGE_SIZE - 1) & PAGE_MASK;
int fd = kvm->fd;
int zfd;
@@ -267,6 +268,11 @@ int kvm_create(kvm_context_t kvm, unsigned long phys_mem_bytes, void **vm_mem)
.memory_size = memory < pcimem ? 0 : memory - pcimem,
.guest_phys_addr = 0x100000000,
};
+ struct kvm_memory_region apic_memory = {
+ .slot = 5,
+ .memory_size = PAGE_SIZE,
+ .guest_phys_addr = apicmem,
+ };
if (memory >= pcimem)
extended_memory.memory_size = pcimem - exmem;
@@ -302,9 +308,16 @@ int kvm_create(kvm_context_t kvm, unsigned long phys_mem_bytes, void **vm_mem)
}
}
+ r = ioctl(fd, KVM_SET_MEMORY_REGION, &apic_memory);
+ if (r == -1) {
+ fprintf(stderr, "kvm_create_memory_region: %m\n");
+ return -1;
+ }
+
kvm_memory_region_save_params(kvm, &low_memory);
kvm_memory_region_save_params(kvm, &extended_memory);
kvm_memory_region_save_params(kvm, &above_4g_memory);
+ kvm_memory_region_save_params(kvm, &apic_memory);
*vm_mem = mmap(NULL, memory, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
if (*vm_mem == MAP_FAILED) {
--
1.5.2
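The slot registration in this patch treats any failure as fatal. A more tolerant variant could be sketched as follows, assuming (as the review later in this thread points out) that older kernels support only four memory slots and therefore reject slot 5. The callback type and the two stub functions are hypothetical stand-ins for the real ioctl(fd, KVM_SET_MEMORY_REGION, &apic_memory) call:

```c
#include <stdio.h>

/*
 * Hypothetical callback standing in for
 * ioctl(fd, KVM_SET_MEMORY_REGION, &apic_memory); returns 0 or -1.
 */
typedef int (*set_region_fn)(int slot);

/* Example stubs: a kernel that accepts slot 5 and one limited to slots 0..3. */
static int new_kernel_set_region(int slot) { (void)slot; return 0; }
static int old_kernel_set_region(int slot) { return slot < 4 ? 0 : -1; }

/*
 * Try to register the optional APIC access slot. A failure is reported
 * but not treated as fatal. Returns 1 if the slot was created and 0 if
 * the caller has to run without the feature.
 */
static int try_apic_access_slot(set_region_fn set_region)
{
	if (set_region(5) == -1) {
		fprintf(stderr, "APIC access slot unavailable, continuing without it\n");
		return 0;
	}
	return 1;
}
```

The caller would then save the slot parameters only when the function returns 1, so new userspace keeps working on a kernel that lacks the feature.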
[-- Attachment #3: 0001-Enable-memory-mapped-TPR-shadow-feature.patch --]
[-- Type: application/octet-stream, Size: 11805 bytes --]
From 5b814299e3fb0912b1337749d42e3ef33b2615e7 Mon Sep 17 00:00:00 2001
From: Sheng Yang <sheng.yang@intel.com>
Date: Mon, 24 Sep 2007 16:10:40 +0800
Subject: [PATCH] Enable memory-mapped TPR shadow feature
---
drivers/kvm/irq.h | 2 +
drivers/kvm/kvm.h | 2 +-
drivers/kvm/kvm_main.c | 23 ++++++++++---
drivers/kvm/lapic.c | 3 ++
drivers/kvm/vmx.c | 76 +++++++++++++++++++++++++++++++++++++++++---
drivers/kvm/vmx.h | 3 ++
drivers/kvm/x86_emulate.c | 14 +++++++-
drivers/kvm/x86_emulate.h | 4 ++
8 files changed, 112 insertions(+), 15 deletions(-)
diff --git a/drivers/kvm/irq.h b/drivers/kvm/irq.h
index 11fc014..afbfa0c 100644
--- a/drivers/kvm/irq.h
+++ b/drivers/kvm/irq.h
@@ -118,6 +118,8 @@ struct kvm_lapic {
struct kvm_vcpu *vcpu;
struct page *regs_page;
void *regs;
+ struct page *apic_access_page;
+ hpa_t apic_access_hpa;
};
#ifdef DEBUG
diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 051cdbe..bb8534a 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -565,7 +565,7 @@ enum emulation_result {
};
int emulate_instruction(struct kvm_vcpu *vcpu, struct kvm_run *run,
- unsigned long cr2, u16 error_code, int no_decode);
+ unsigned long cr2, u16 error_code, int cmd_type);
void kvm_report_emulation_failure(struct kvm_vcpu *cvpu, const char *context);
void realmode_lgdt(struct kvm_vcpu *vcpu, u16 size, unsigned long address);
void realmode_lidt(struct kvm_vcpu *vcpu, u16 size, unsigned long address);
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index cecdb1b..0ebae4c 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1080,14 +1080,19 @@ static int emulator_read_emulated(unsigned long addr,
memcpy(val, vcpu->mmio_data, bytes);
vcpu->mmio_read_completed = 0;
return X86EMUL_CONTINUE;
- } else if (emulator_read_std(addr, val, bytes, vcpu)
- == X86EMUL_CONTINUE)
- return X86EMUL_CONTINUE;
+ }
gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
+ if ((gpa & PAGE_MASK) == 0xfee00000)
+ goto mmio;
+
+ if (emulator_read_std(addr, val, bytes, vcpu)
+ == X86EMUL_CONTINUE)
+ return X86EMUL_CONTINUE;
if (gpa == UNMAPPED_GVA)
return X86EMUL_PROPAGATE_FAULT;
+mmio:
/*
* Is this MMIO handled locally?
*/
@@ -1132,6 +1137,9 @@ static int emulator_write_emulated_onepage(unsigned long addr,
struct kvm_io_device *mmio_dev;
gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
+ if ((gpa & PAGE_MASK) == 0xfee00000)
+ goto mmio;
+
if (gpa == UNMAPPED_GVA) {
kvm_x86_ops->inject_page_fault(vcpu, addr, 2);
return X86EMUL_PROPAGATE_FAULT;
@@ -1140,6 +1148,7 @@ static int emulator_write_emulated_onepage(unsigned long addr,
if (emulator_write_phys(vcpu, gpa, val, bytes))
return X86EMUL_CONTINUE;
+mmio:
/*
* Is this MMIO handled locally?
*/
@@ -1270,7 +1279,7 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
struct kvm_run *run,
unsigned long cr2,
u16 error_code,
- int no_decode)
+ int cmd_type)
{
int r = 0;
@@ -1279,8 +1288,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
vcpu->mmio_is_write = 0;
vcpu->pio.string = 0;
+ vcpu->emulate_ctxt.cmd_type = cmd_type;
- if (!no_decode) {
+ if ((cmd_type & EMULCMD_NO_DECODE) == 0) {
int cs_db, cs_l;
kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
@@ -2073,7 +2083,8 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
vcpu->mmio_read_completed = 1;
vcpu->mmio_needed = 0;
r = emulate_instruction(vcpu, kvm_run,
- vcpu->mmio_fault_cr2, 0, 1);
+ vcpu->mmio_fault_cr2, 0,
+ EMULCMD_NO_DECODE);
if (r == EMULATE_DO_MMIO) {
/*
* Read-modify-write. Back to userspace.
diff --git a/drivers/kvm/lapic.c b/drivers/kvm/lapic.c
index ddf9f20..b59dcda 100644
--- a/drivers/kvm/lapic.c
+++ b/drivers/kvm/lapic.c
@@ -952,6 +952,9 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu)
memset(apic->regs, 0, PAGE_SIZE);
apic->vcpu = vcpu;
+ apic->apic_access_page = vcpu->kvm->memslots[5].phys_mem[0];
+ apic->apic_access_hpa = page_to_phys(apic->apic_access_page);
+
hrtimer_init(&apic->timer.dev, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
apic->timer.dev.function = apic_timer_fn;
apic->base_address = APIC_DEFAULT_PHYS_BASE;
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index 6f1ad90..a7fe87c 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -85,6 +85,7 @@ static struct vmcs_config {
u32 revision_id;
u32 pin_based_exec_ctrl;
u32 cpu_based_exec_ctrl;
+ u32 cpu_based_2nd_exec_ctrl;
u32 vmexit_ctrl;
u32 vmentry_ctrl;
} vmcs_config;
@@ -178,6 +179,29 @@ static inline int vm_need_tpr_shadow(struct kvm *kvm)
return ((cpu_has_vmx_tpr_shadow()) && (irqchip_in_kernel(kvm)));
}
+static inline int cpu_has_secondary_exec_ctrls(void)
+{
+ return (vmcs_config.cpu_based_exec_ctrl & \
+ CPU_BASED_ACTIVATE_SECONDARY_CONTROLS);
+}
+
+static inline int vm_need_secondary_exec_ctrls(struct kvm *kvm)
+{
+ return ((cpu_has_secondary_exec_ctrls()) && (irqchip_in_kernel(kvm)));
+}
+
+static inline int cpu_has_vmx_virtualize_apic_accesses(void)
+{
+ return (vmcs_config.cpu_based_2nd_exec_ctrl & \
+ SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES);
+}
+
+static inline int vm_need_virtualize_apic_accesses(struct kvm *kvm)
+{
+ return ((cpu_has_vmx_virtualize_apic_accesses()) && \
+ (irqchip_in_kernel(kvm)));
+}
+
static int __find_msr_index(struct vcpu_vmx *vmx, u32 msr)
{
int i;
@@ -915,6 +939,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
u32 min, opt;
u32 _pin_based_exec_control = 0;
u32 _cpu_based_exec_control = 0;
+ u32 _cpu_based_2nd_exec_control = 0;
u32 _vmexit_control = 0;
u32 _vmentry_control = 0;
@@ -932,11 +957,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
CPU_BASED_USE_IO_BITMAPS |
CPU_BASED_MOV_DR_EXITING |
CPU_BASED_USE_TSC_OFFSETING;
-#ifdef CONFIG_X86_64
- opt = CPU_BASED_TPR_SHADOW;
-#else
- opt = 0;
-#endif
+ opt = CPU_BASED_TPR_SHADOW |
+ CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
&_cpu_based_exec_control) < 0)
return -EIO;
@@ -945,6 +967,18 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
_cpu_based_exec_control &= ~CPU_BASED_CR8_LOAD_EXITING &
~CPU_BASED_CR8_STORE_EXITING;
#endif
+ if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
+ min = 0;
+ opt = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
+ if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS2,
+ &_cpu_based_2nd_exec_control) < 0)
+ return -EIO;
+ }
+#ifndef CONFIG_X86_64
+ if (!(_cpu_based_2nd_exec_control &
+ SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES))
+ _cpu_based_exec_control &= ~CPU_BASED_TPR_SHADOW;
+#endif
min = 0;
#ifdef CONFIG_X86_64
@@ -982,6 +1016,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
vmcs_conf->pin_based_exec_ctrl = _pin_based_exec_control;
vmcs_conf->cpu_based_exec_ctrl = _cpu_based_exec_control;
+ vmcs_conf->cpu_based_2nd_exec_ctrl = _cpu_based_2nd_exec_control;
vmcs_conf->vmexit_ctrl = _vmexit_control;
vmcs_conf->vmentry_ctrl = _vmentry_control;
@@ -1532,8 +1567,14 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
CPU_BASED_CR8_LOAD_EXITING;
#endif
}
+ if (!vm_need_secondary_exec_ctrls(vmx->vcpu.kvm))
+ exec_control &= ~CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, exec_control);
+ if (vm_need_secondary_exec_ctrls(vmx->vcpu.kvm))
+ vmcs_write32(SECONDARY_VM_EXEC_CONTROL,
+ vmcs_config.cpu_based_2nd_exec_ctrl);
+
vmcs_write32(PAGE_FAULT_ERROR_CODE_MASK, !!bypass_guest_pf);
vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, !!bypass_guest_pf);
vmcs_write32(CR3_TARGET_COUNT, 0); /* 22.2.1 */
@@ -1610,6 +1651,9 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
page_to_phys(vmx->vcpu.apic->regs_page));
vmcs_write32(TPR_THRESHOLD, 0);
#endif
+ if (vm_need_virtualize_apic_accesses(vmx->vcpu.kvm))
+ vmcs_write64(APIC_ACCESS_ADDR,
+ vmx->vcpu.apic->apic_access_hpa);
vmcs_writel(CR0_GUEST_HOST_MASK, ~0UL);
vmcs_writel(CR4_GUEST_HOST_MASK, KVM_GUEST_CR4_MASK);
@@ -2100,6 +2144,25 @@ static int handle_vmcall(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
return 1;
}
+static int handle_apic_access(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+{
+ u64 exit_qualification;
+ enum emulation_result er;
+ unsigned long offset;
+
+ exit_qualification = vmcs_read64(EXIT_QUALIFICATION);
+ offset = exit_qualification & 0xffful;
+
+ er = emulate_instruction(vcpu, kvm_run, 0, 0, EMULCMD_DECODE_ADDR);
+
+ if (er != EMULATE_DONE) {
+ BUG();
+ return 0;
+ }
+ return 1;
+}
+
+
/*
* The exit handlers return 1 if the exit was handled fully and guest execution
* may resume. Otherwise they set the kvm_run parameter to indicate what needs
@@ -2119,7 +2182,8 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu,
[EXIT_REASON_PENDING_INTERRUPT] = handle_interrupt_window,
[EXIT_REASON_HLT] = handle_halt,
[EXIT_REASON_VMCALL] = handle_vmcall,
- [EXIT_REASON_TPR_BELOW_THRESHOLD] = handle_tpr_below_threshold
+ [EXIT_REASON_TPR_BELOW_THRESHOLD] = handle_tpr_below_threshold,
+ [EXIT_REASON_APIC_ACCESS] = handle_apic_access,
};
static const int kvm_vmx_max_exit_handlers =
diff --git a/drivers/kvm/vmx.h b/drivers/kvm/vmx.h
index fd4e146..07cf1b5 100644
--- a/drivers/kvm/vmx.h
+++ b/drivers/kvm/vmx.h
@@ -89,6 +89,8 @@ enum vmcs_field {
TSC_OFFSET_HIGH = 0x00002011,
VIRTUAL_APIC_PAGE_ADDR = 0x00002012,
VIRTUAL_APIC_PAGE_ADDR_HIGH = 0x00002013,
+ APIC_ACCESS_ADDR = 0x00002014,
+ APIC_ACCESS_ADDR_HIGH = 0x00002015,
VMCS_LINK_POINTER = 0x00002800,
VMCS_LINK_POINTER_HIGH = 0x00002801,
GUEST_IA32_DEBUGCTL = 0x00002802,
@@ -214,6 +216,7 @@ enum vmcs_field {
#define EXIT_REASON_MSR_WRITE 32
#define EXIT_REASON_MWAIT_INSTRUCTION 36
#define EXIT_REASON_TPR_BELOW_THRESHOLD 43
+#define EXIT_REASON_APIC_ACCESS 44
/*
* Interruption-information format
diff --git a/drivers/kvm/x86_emulate.c b/drivers/kvm/x86_emulate.c
index f294a49..d04e4c6 100644
--- a/drivers/kvm/x86_emulate.c
+++ b/drivers/kvm/x86_emulate.c
@@ -820,13 +820,17 @@ done_prefixes:
c->src.bytes = 4;
goto srcmem_common;
case SrcMem:
- c->src.bytes = (c->d & ByteOp) ? 1 :
- c->op_bytes;
+ c->src.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
/* Don't fetch the address for invlpg: it could be unmapped. */
if (c->twobyte && c->b == 0x01
&& c->modrm_reg == 7)
break;
srcmem_common:
+ if (((ctxt->cmd_type & EMULCMD_DECODE_ADDR) != 0) &&
+ (c->modrm_ea == 0)) {
+ ctxt->cr2 = insn_fetch(u32, c->src.bytes, c->eip);
+ c->eip -= c->src.bytes;
+ }
c->src.type = OP_MEM;
break;
case SrcImm:
@@ -888,6 +892,12 @@ done_prefixes:
}
break;
case DstMem:
+ c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
+ if (((ctxt->cmd_type & EMULCMD_DECODE_ADDR) != 0) &&
+ (c->modrm_ea == 0)) {
+ ctxt->cr2 = insn_fetch(u32, c->dst.bytes, c->eip);
+ c->eip -= c->dst.bytes;
+ }
c->dst.type = OP_MEM;
break;
}
diff --git a/drivers/kvm/x86_emulate.h b/drivers/kvm/x86_emulate.h
index 28acad4..26dc6b0 100644
--- a/drivers/kvm/x86_emulate.h
+++ b/drivers/kvm/x86_emulate.h
@@ -153,6 +153,10 @@ struct x86_emulate_ctxt {
/* Emulated execution mode, represented by an X86EMUL_MODE value. */
int mode;
+#define EMULCMD_NO_DECODE (1 << 0)
+#define EMULCMD_DECODE_ADDR (1 << 1)
+ int cmd_type;
+
unsigned long cs_base;
unsigned long ds_base;
unsigned long es_base;
--
1.5.2
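handle_apic_access() in this patch reads the exit qualification and extracts the page offset but does not use it afterwards. For reference, a small sketch of how that field could be decoded, assuming the layout the Intel SDM gives for APIC-access exits (bits 11:0 hold the offset into the APIC page for linear accesses, bits 15:12 hold the access type). The struct and function names are hypothetical:

```c
#include <stdint.h>

#define APIC_TASKPRI 0x80  /* offset of the TPR register within the APIC page */

/* Hypothetical container for the decoded fields. */
struct apic_access_info {
	unsigned int offset;       /* bits 11:0 of the exit qualification */
	unsigned int access_type;  /* bits 15:12; 0 = linear data read, 1 = linear data write */
};

/* Split an APIC-access exit qualification into offset and access type. */
static struct apic_access_info decode_apic_access(uint64_t exit_qualification)
{
	struct apic_access_info info;

	info.offset = (unsigned int)(exit_qualification & 0xfff);
	info.access_type = (unsigned int)((exit_qualification >> 12) & 0xf);
	return info;
}
```

A write to the TPR would show up as offset APIC_TASKPRI with access type 1. The patch deliberately ignores this information and lets the emulator re-decode the instruction instead.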
[-- Attachment #4: Type: text/plain, Size: 228 bytes --]
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
[-- Attachment #5: Type: text/plain, Size: 186 bytes --]
_______________________________________________
kvm-devel mailing list
kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
https://lists.sourceforge.net/lists/listinfo/kvm-devel
* Re: [RFC][PATCH]Memory mapped TPR shadow feature enabling
@ 2007-09-25 8:35 Avi Kivity
From: Avi Kivity @ 2007-09-25 8:35 UTC (permalink / raw)
To: Yang, Sheng; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Yang, Sheng wrote:
> These patches enable memory mapped TPR shadow (FlexPriority).
>
> Since TPR is accessed very frequently by 32bit Windows, especially SMP
> guest, with FlexPriority enabled, we saw significant performance gain.
>
> The issue is: FlexPriority needs to add a memory slot to the vm to make
> shadow work with APIC access page.
>
> We don't like the idea to add a memory slot, but no better choice now.
> Our propose is to add p2m table to KVM, while seems this is still a long
> way to go.
>
> BTW: I didn't use the offset(or other info) provide by CPU when handling
> APIC access vmexit. Instead, I used a bit in cmd_type(including
> no_decode) to tell emulator decode memory operand by itself when
> necessary. That's because I only got the guest physical address when
> handling APIC access vmexit, but emulator need a guest virtual address
> to fit its flow. I have tried some ways, and current solution seems the
> most proper one.
>
> + struct kvm_memory_region apic_memory = {
> + .slot = 5,
> + .memory_size = PAGE_SIZE,
> + .guest_phys_addr = apicmem,
> + };
>
> if (memory >= pcimem)
> extended_memory.memory_size = pcimem - exmem;
> @@ -302,9 +308,16 @@ int kvm_create(kvm_context_t kvm, unsigned long phys_mem_bytes, void **vm_mem)
> }
> }
>
> + r = ioctl(fd, KVM_SET_MEMORY_REGION, &apic_memory);
> + if (r == -1) {
> + fprintf(stderr, "kvm_create_memory_region: %m\n");
> + return -1;
> + }
>

Older kernels support only 4 memory slots, so you need to tolerate
failures here (this isn't an issue for large memory, because there's no
way the older kernel can run with large memory, so you can't continue
anyway). But new userspace should be able to run with an older kernel if
it is not using newer features.

I don't like userspace involvement in this. Perhaps we can have a memory
slot controlled by the kernel for this? It would be activated by the
feature, so we won't have it on AMD or when the feature isn't
available. It can also be just a special case in gfn_to_page.

> ns(-)
> diff --git a/drivers/kvm/irq.h b/drivers/kvm/irq.h
> index 11fc014..afbfa0c 100644
> --- a/drivers/kvm/irq.h
> +++ b/drivers/kvm/irq.h
> @@ -118,6 +118,8 @@ struct kvm_lapic {
> struct kvm_vcpu *vcpu;
> struct page *regs_page;
> void *regs;
> + struct page *apic_access_page;
> + hpa_t apic_access_hpa;
> };

The second variable is redundant; just use page_address().

> diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
> index cecdb1b..0ebae4c 100644
> --- a/drivers/kvm/kvm_main.c
> +++ b/drivers/kvm/kvm_main.c
> @@ -1080,14 +1080,19 @@ static int emulator_read_emulated(unsigned long addr,
> memcpy(val, vcpu->mmio_data, bytes);
> vcpu->mmio_read_completed = 0;
> return X86EMUL_CONTINUE;
> - } else if (emulator_read_std(addr, val, bytes, vcpu)
> - == X86EMUL_CONTINUE)
> - return X86EMUL_CONTINUE;
> + }
> gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
> + if ((gpa & PAGE_MASK) == 0xfee00000)
> + goto mmio;
> +

The guest can change the apic base address. Different vcpus can have
different addresses.

> + if ((gpa & PAGE_MASK) == 0xfee00000)
> + goto mmio;
> +

Same here.

> +static inline int cpu_has_secondary_exec_ctrls(void)
> +{
> + return (vmcs_config.cpu_based_exec_ctrl & \
> + CPU_BASED_ACTIVATE_SECONDARY_CONTROLS);
> +}

We aren't in a macro; there's no need for a backslash.

> +
> +static inline int cpu_has_vmx_virtualize_apic_accesses(void)
> +{
> + return (vmcs_config.cpu_based_2nd_exec_ctrl & \
> + SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES);
> +}

ditto

> +
> +static inline int vm_need_virtualize_apic_accesses(struct kvm *kvm)
> +{
> + return ((cpu_has_vmx_virtualize_apic_accesses()) && \
> + (irqchip_in_kernel(kvm)));
> +}
> +

ditto

> +static int handle_apic_access(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
> +{
> + u64 exit_qualification;
> + enum emulation_result er;
> + unsigned long offset;
> +
> + exit_qualification = vmcs_read64(EXIT_QUALIFICATION);
> + offset = exit_qualification & 0xffful;
> +
> + er = emulate_instruction(vcpu, kvm_run, 0, 0, EMULCMD_DECODE_ADDR);
> +
> + if (er != EMULATE_DONE) {
> + BUG();

BUG() is a bug here. You allow a guest to crash the host. We can return
-EOPNOTSUPP here (that goes for the other places we emulate and can't
recover; we can trap that in kvmctl.c and give better error messages to
users).

> @@ -820,13 +820,17 @@ done_prefixes:
> c->src.bytes = 4;
> goto srcmem_common;
> case SrcMem:
> - c->src.bytes = (c->d & ByteOp) ? 1 :
> - c->op_bytes;
> + c->src.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;

Don't add unrelated changes please.

> ->b == 0x01 && c->modrm_reg == 7)
> break;
> srcmem_common:
> + if (((ctxt->cmd_type & EMULCMD_DECODE_ADDR) != 0) &&
> + (c->modrm_ea == 0)) {
> + ctxt->cr2 = insn_fetch(u32, c->src.bytes, c->eip);
> + c->eip -= c->src.bytes;
> + }

Confused. What is this? Why is eip going backwards?

--
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.
[parent not found: <46F8C84F.7090605-atKUWr5tajBWk0Htik3J/w@public.gmane.org>]
* Re: [RFC][PATCH]Memory mapped TPR shadow featureenabling
[not found] ` <46F8C84F.7090605-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-09-28 4:49 ` Dong, Eddie
[not found] ` <10EA09EFD8728347A513008B6B0DA77A022AE787-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-10-22 8:49 ` [RFC][PATCH]Memory mapped TPR shadow feature enabling Yang, Sheng
1 sibling, 1 reply; 18+ messages in thread
From: Dong, Eddie @ 2007-09-28 4:49 UTC (permalink / raw)
To: Avi Kivity, Yang, Sheng; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

>> diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
>> index cecdb1b..0ebae4c 100644
>> --- a/drivers/kvm/kvm_main.c
>> +++ b/drivers/kvm/kvm_main.c
>> @@ -1080,14 +1080,19 @@ static int emulator_read_emulated(unsigned long addr,
>> memcpy(val, vcpu->mmio_data, bytes);
>> vcpu->mmio_read_completed = 0;
>> return X86EMUL_CONTINUE;
>> - } else if (emulator_read_std(addr, val, bytes, vcpu)
>> - == X86EMUL_CONTINUE)
>> - return X86EMUL_CONTINUE;
>> + }
>> gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
>> + if ((gpa & PAGE_MASK) == 0xfee00000)
>> + goto mmio;
>> +
>
> The guest can change the apic base address. Different vcpus can have
> different addresses.
>

In theory yes. But we haven't observed this so far. Xen has worked with
this feature, under the same assumption, for quite a long time.

Also, given that we are using a global shadow page table, we probably
have to take this assumption :-)

thx,eddie
[parent not found: <10EA09EFD8728347A513008B6B0DA77A022AE787-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* Re: [RFC][PATCH]Memory mapped TPR shadow featureenabling
[not found] ` <10EA09EFD8728347A513008B6B0DA77A022AE787-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2007-09-28 14:41 ` Avi Kivity
[not found] ` <46FD1288.9030507-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2007-09-28 14:41 UTC (permalink / raw)
To: Dong, Eddie; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Dong, Eddie wrote:
>>> diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
>>> index cecdb1b..0ebae4c 100644
>>> --- a/drivers/kvm/kvm_main.c
>>> +++ b/drivers/kvm/kvm_main.c
>>> @@ -1080,14 +1080,19 @@ static int emulator_read_emulated(unsigned long addr,
>>> memcpy(val, vcpu->mmio_data, bytes);
>>> vcpu->mmio_read_completed = 0;
>>> return X86EMUL_CONTINUE;
>>> - } else if (emulator_read_std(addr, val, bytes, vcpu)
>>> - == X86EMUL_CONTINUE)
>>> - return X86EMUL_CONTINUE;
>>> + }
>>> gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
>>> + if ((gpa & PAGE_MASK) == 0xfee00000)
>>> + goto mmio;
>>> +
>>>
>> The guest can change the apic base address. Different vcpus can have
>> different addresses.
>>
>
> In theory yes. But we didn't observe this so far. Xen with this feature
> with same assumption works for quit a long time.
> Also given that we are using global shadow page table, so probably we
> have to take this assumption :-)
>

We can work around this by disabling the optimization when a guest has
different addresses for the lapic.

But I agree there's no need to do that now.

--
Any sufficiently difficult bug is indistinguishable from a feature.
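The workaround discussed above (only use the optimization while every
vcpu keeps its local APIC at the same address) could be checked roughly
as follows. This is a minimal sketch, not KVM code: struct sketch_vcpu,
its apic_base_msr field and lapic_bases_match() are hypothetical names
introduced only for illustration, and the only hardware fact assumed is
that the low 12 bits of the APIC base MSR hold flags rather than
address bits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_MASK (~(uint64_t)0xfff) /* strips the low flag bits */

/* Hypothetical per-vcpu state; not the real struct kvm_vcpu. */
struct sketch_vcpu {
    uint64_t apic_base_msr; /* raw APIC base MSR value, flags included */
};

/*
 * Returns true when all n vcpus map their local APIC at the same
 * page-aligned guest physical address, and stores that address in
 * *common. Returns false for n == 0 or on the first mismatch, in
 * which case the caller would fall back to plain MMIO emulation.
 */
static bool lapic_bases_match(const struct sketch_vcpu *vcpus, size_t n,
                              uint64_t *common)
{
    if (n == 0)
        return false;

    uint64_t first = vcpus[0].apic_base_msr & SKETCH_PAGE_MASK;

    for (size_t i = 1; i < n; i++) {
        if ((vcpus[i].apic_base_msr & SKETCH_PAGE_MASK) != first)
            return false;
    }
    *common = first;
    return true;
}
```

A caller would re-run such a check whenever a guest writes the APIC base
MSR, so the decision follows the guest instead of a hard-coded
0xfee00000.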
[parent not found: <46FD1288.9030507-atKUWr5tajBWk0Htik3J/w@public.gmane.org>]
* Re: [RFC][PATCH]Memory mapped TPR shadow featureenabling
[not found] ` <46FD1288.9030507-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-09-28 15:48 ` Dong, Eddie
[not found] ` <10EA09EFD8728347A513008B6B0DA77A022AEA2F-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Dong, Eddie @ 2007-09-28 15:48 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Avi Kivity wrote:
>
>> In theory yes. But we didn't observe this so far. Xen with this
>> feature with same assumption works for quit a long time.
>> Also given that we are using global shadow page table, so probably
>> we have to take this assumption :-)
>>
>
> We can workaround this by disabling the optimization when a guest has
> different addresses for the lapic.

Yes, this is a perfect solution, not only a workaround, though a little
bit complex :-)

>
> But I agree there's no need to do that now.
>

I'd like to take this chance to brainstorm the direction of the slot
structure and VT-d support. In the latter case, we have to generate a
P2M (gpa to hpa or mpa) table eventually. This table won't replace the
slot data structure, since we still need struct page * someplace, but it
will help the gpa_to_hpa search. Do you have any solid idea right now?
My suggestion is to add an additional P2M table and restructure the
shadow code to use the P2M table for the most frequent APIs, which will
help performance.

Increasing the slot number will simply reduce the performance of the
slot table :-(

thx,eddie
[parent not found: <10EA09EFD8728347A513008B6B0DA77A022AEA2F-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* Windows 2003 Server SMP Guest crashes.
[not found] ` <10EA09EFD8728347A513008B6B0DA77A022AEA2F-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2007-09-28 16:22 ` Fabian Deutsch
[not found] ` <20070928162210.98710-hi6Y0CQ0nG0@public.gmane.org>
2007-09-28 16:56 ` kvm guest memory management (was: Re: [RFC][PATCH]Memory mapped TPR shadow featureenabling) Avi Kivity
1 sibling, 1 reply; 18+ messages in thread
From: Fabian Deutsch @ 2007-09-28 16:22 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Hey,

I just noticed that it is not possible to install Windows 2003 Server
SMP guests (using: -smp 2 -boot d -m 2048 -net nic,model=rtl8139
-no-acpi, or without -no-acpi).

The crash can be reproduced when installing Windows using -smp 2, or
when installing a Windows guest with -smp 1 and pressing F5 (just when
you are prompted to press F6 or F2) and selecting one of the
multiprocessor options.
The guest should crash a couple of seconds later.

kvm-44

2.6.22.7-85.fc7 #1 SMP Fri Sep 21 19:59:51 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

model name : Intel(R) Core(TM)2 CPU 6320 @ 1.86GHz

Greetings
- fabiand
[parent not found: <20070928162210.98710-hi6Y0CQ0nG0@public.gmane.org>]
* Re: Windows 2003 Server SMP Guest crashes.
[not found] ` <20070928162210.98710-hi6Y0CQ0nG0@public.gmane.org>
@ 2007-09-28 16:57 ` Avi Kivity
[not found] ` <46FD3295.5020700-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2007-09-28 16:57 UTC (permalink / raw)
To: Fabian Deutsch; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Fabian Deutsch wrote:
> Hey,
>
> I just noticed that it is not possible to install Windows 2003 Server
> SMP guests (using: -smp 2 -boot d -m 2048 -net nic,model=rtl8139
> -no-acpi, or without -no-acpi).
> The crash can be reproduced when installing Windows using -smp 2, or
> when installing a Windows guest with -smp 1 and pressing F5 (just when
> you are prompted to press F6 or F2) and selecting one of the
> multiprocessor options.
> The guest should crash a couple of seconds later.
>
> kvm-44
>
> 2.6.22.7-85.fc7 #1 SMP Fri Sep 21 19:59:51 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
>
> model name : Intel(R) Core(TM)2 CPU 6320 @ 1.86GHz
>

Does -no-kvm-irqchip help?

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
[parent not found: <46FD3295.5020700-atKUWr5tajBWk0Htik3J/w@public.gmane.org>]
* Re: Windows 2003 Server SMP Guest crashes.
[not found] ` <46FD3295.5020700-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-09-28 16:52 ` Fabian Deutsch
0 siblings, 0 replies; 18+ messages in thread
From: Fabian Deutsch @ 2007-09-28 16:52 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

> Fabian Deutsch wrote:
> > Hey,
> >
> > I just noticed that it is not possible to install Windows 2003 Server
> > SMP guests (using: -smp 2 -boot d -m 2048 -net nic,model=rtl8139
> > -no-acpi, or without -no-acpi).
> > The crash can be reproduced when installing Windows using -smp 2, or
> > when installing a Windows guest with -smp 1 and pressing F5 (just when
> > you are prompted to press F6 or F2) and selecting one of the
> > multiprocessor options.
> > The guest should crash a couple of seconds later.
> >
> > kvm-44
> >
> > 2.6.22.7-85.fc7 #1 SMP Fri Sep 21 19:59:51 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
> >
> > model name : Intel(R) Core(TM)2 CPU 6320 @ 1.86GHz
> >
>
> Does -no-kvm-irqchip help?
>

Well, yes: it doesn't crash with exception 14(0), but the guest hangs
when trying to format the hd / read the partition table or so.

http://sourceforge.net/tracker/index.php?func=detail&aid=1804408&group_id=180599&atid=893831
* kvm guest memory management (was: Re: [RFC][PATCH]Memory mapped TPR shadow featureenabling)
[not found] ` <10EA09EFD8728347A513008B6B0DA77A022AEA2F-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-09-28 16:22 ` Windows 2003 Server SMP Guest crashes Fabian Deutsch
@ 2007-09-28 16:56 ` Avi Kivity
[not found] ` <46FD323C.3090905-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
1 sibling, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2007-09-28 16:56 UTC (permalink / raw)
To: Dong, Eddie; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Dong, Eddie wrote:
> Taking this chance to brainstorming what is the direction of slot
> structure and VT-d support. In later case, we have to generate a P2M
> (gpa to hpa or mpa) table eventually. This table won't replace slot
> data structure since we still need struct page * some place, but it
> will help gpa_to_hpa search. Do u have any solid idea right now? My
> suggestion is to add additional P2M table and restructure shadow code
> to use P2M table for most frequent APIs which will help performance.
>
> Increasing slot number will simply reduce performance of slot table :-(
>

First, I doubt that memory slot count will really affect performance.
The memory slot structure is small and it is likely that all slots will
be kept in cache at all times. Walking all slots should take a lot less
time than a vmexit, even if there are 16 slots. Placing the most
heavily used slots first reduces that time further.

But to the bigger picture. We're quite close to using user-allocated
memory for the guest, instead of kernel allocated memory. This means
that userspace will allocate a memory region and assign it to kvm as a
memory slot. On x86-64, where we have a large address space, this means
that all of memory can be in just one slot (well, slots also allow us to
do tracking of dirty pages on a subset of memory, so maybe three slots
are needed). In effect, the Linux process page tables become the g2h
(or p2m) tables, and access to guest memory is a simple
copy_to_user()/copy_from_user().

User-allocated memory will enable the following features:
- s390 support
- guest swapping
- page migration (where a guest is migrated from one NUMA node to
another)
- in conjunction with a de-duplicating file system, page sharing among
guests
- inter-guest shared memory (mmap() one file in two or more guests)
- easier use of huge pages
- more?

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
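To make the slot-walk cost argument above concrete, here is a minimal
sketch of a linear lookup from a guest frame number to its memory slot.
It is an illustration under stated assumptions, not the KVM
implementation: struct sketch_memslot and sketch_gfn_to_slot() are
hypothetical names, and the real slot structure carries more fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified slot descriptor (not struct kvm_memory_slot). */
struct sketch_memslot {
    uint64_t base_gfn; /* first guest frame number covered by the slot */
    uint64_t npages;   /* number of pages in the slot */
};

/*
 * Linear walk over the slot array. With a handful of small entries the
 * whole array stays in cache, and putting the busiest slot first makes
 * the common case a single comparison.
 * Returns the matching slot, or NULL if gfn is not backed by any slot.
 */
static const struct sketch_memslot *
sketch_gfn_to_slot(const struct sketch_memslot *slots, size_t n, uint64_t gfn)
{
    for (size_t i = 0; i < n; i++) {
        if (gfn >= slots[i].base_gfn &&
            gfn - slots[i].base_gfn < slots[i].npages)
            return &slots[i];
    }
    return NULL;
}
```

The subtraction form of the range test avoids overflow when base_gfn
plus npages would wrap, which is why it is used instead of comparing
against base_gfn + npages.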
[parent not found: <46FD323C.3090905-atKUWr5tajBWk0Htik3J/w@public.gmane.org>]
* Re: kvm guest memory management
[not found] ` <46FD323C.3090905-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-09-29 3:34 ` Dong, Eddie
[not found] ` <10EA09EFD8728347A513008B6B0DA77A022AEAFE-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Dong, Eddie @ 2007-09-29 3:34 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Avi Kivity wrote:
> Dong, Eddie wrote:
>
> First, I doubt that memory slot count will really affect performance.
> The memory slot structure is small and it is likely that all slots
> will be kept in cache at all times. Walking all slots should take
> a lot less time than a vmexit, even if there are 16 slots. Placing
> the most heavily used slots first reduces that time further.

Definitely it is small compared with a vm exit. We will try to measure.

>
> But to the bigger picture. We're quite close to using user-allocated
> memory for the guest, instead of kernel allocated memory. This means
> that userspace will allocate a memory region and assign it to kvm as a
> memory slot. On x86-64, where we have a large address space, this
> means that all of memory can be in just one slot (well, slots also
> allow us to do tracking of dirty pages on a subset of memory, so maybe
> three slots are needed). In effect, the Linux process page tables
> become the g2h (or p2m) tables, and access to guest memory is a simple
> copy_to_user()/copy_from_user().

There are a couple of reasons that g2h can't serve this.
A VT-d device or an EPT/NPT table needs to translate from guest physical
to machine physical address, while g2h uses the host mode va as index.
The other reason is that g2h also includes host user space memory and
kernel memory which the guest should never touch. (A badly programmed
VT-d device may modify the memory listed by the VT-d table.)

Dirty tracking can certainly be serviced even using p2m solely.

>
> User-allocated memory will enable the following features:
> - s390 support
> - guest swapping
> - page migration (where a guest is migrated from one NUMA node
> to another)
> - in conjunction with a de-duplicating file system, page sharing
> among guests
> - inter-guest shared memory (mmap() one file in two or more guests)
> - easier use of huge pages
> - more?
>

This doesn't conflict with my suggestion, though the p2m table then
needs to be dynamically modified in case swapping happens.

thx,eddie
[parent not found: <10EA09EFD8728347A513008B6B0DA77A022AEAFE-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* Re: kvm guest memory management
[not found] ` <10EA09EFD8728347A513008B6B0DA77A022AEAFE-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2007-09-30 8:55 ` Avi Kivity
[not found] ` <46FF647E.6080506-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2007-09-30 8:55 UTC (permalink / raw)
To: Dong, Eddie; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Dong, Eddie wrote:
>
>> But to the bigger picture. We're quite close to using user-allocated
>> memory for the guest, instead of kernel allocated memory. This means
>> that userspace will allocate a memory region and assign it to kvm as a
>> memory slot. On x86-64, where we have a large address space, this
>> means that all of memory can be in just one slot (well, slots also
>> allow us to do tracking of dirty pages on a subset of memory, so maybe
>> three slots are needed). In effect, the Linux process page tables
>> become the g2h (or p2m) tables, and access to guest memory is a simple
>> copy_to_user()/copy_from_user().
>>
>
> There are couple reasons that g2h can't server this.
> A VT-d device or EPT/NPT table need to translate from guest physical
> to machine physical address, while g2h uses host mode va as index.
> Other reason is that g2h also include host user space memory & kernel
> memory which guest should never touch. (A bad programmed VT-d device
> may modify the memory listed by VT-d table).
>

For EPT, we can't use the host page tables because EPT does not support
the dirty bit. So EPT requires duplication of the page tables anyway.

> Dirty tracking can certainly be serviced even using p2m solely.
>
>> User-allocated memory will enable the following features:
>> - s390 support
>> - guest swapping
>> - page migration (where a guest is migrated from one NUMA node
>> to another)
>> - in conjunction with a de-duplicating file system, page sharing
>> among guests
>> - inter-guest shared memory (mmap() one file in two or more guests)
>> - easier use of huge pages
>> - more?
>>
>
> This doesn't conflict with my suggestion though the p2m table then
> need to be dynamically modifed in case swapping happens.
>

The nice thing about using the host page tables is that they are
automatically updated to reflect changes in mapping. Translating a page
(gfn_to_page) becomes a call to get_user_pages().

--
error compiling committee.c: too many arguments to function
[parent not found: <46FF647E.6080506-atKUWr5tajBWk0Htik3J/w@public.gmane.org>]
* Re: kvm guest memory management
[not found] ` <46FF647E.6080506-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-10-03 15:03 ` Anthony Liguori
0 siblings, 0 replies; 18+ messages in thread
From: Anthony Liguori @ 2007-10-03 15:03 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Avi Kivity wrote:
> Dong, Eddie wrote:
>
>> There are couple reasons that g2h can't server this.
>> A VT-d device or EPT/NPT table need to translate from guest physical
>> to machine physical address, while g2h uses host mode va as index.
>> Other reason is that g2h also include host user space memory & kernel
>> memory which guest should never touch. (A bad programmed VT-d device
>> may modify the memory listed by VT-d table).
>>
>
> For EPT, we can't use the host page tables because EPT does not support
> the dirty bit. So EPT requires duplication of the page tables anyway.
>

EPT doesn't fill in the dirty bit? To track dirty memory, you would have
to read protect portions of the EPT? That seems quite unfortunate...

Regards,

Anthony Liguori

>> Dirty tracking can certainly be serviced even using p2m solely.
>>
>>> User-allocated memory will enable the following features:
>>> - s390 support
>>> - guest swapping
>>> - page migration (where a guest is migrated from one NUMA node
>>> to another)
>>> - in conjunction with a de-duplicating file system, page sharing
>>> among guests
>>> - inter-guest shared memory (mmap() one file in two or more guests)
>>> - easier use of huge pages
>>> - more?
>>>
>>
>> This doesn't conflict with my suggestion though the p2m table then
>> need to be dynamically modifed in case swapping happens.
>>
>
> The nice thing about using the host page tables is that it's
> automatically updated to reflect changes in mapping. Translating a page
> (gfn_to_page) becomes a call to get_user_pages().
>
* Re: [RFC][PATCH]Memory mapped TPR shadow feature enabling
[not found] ` <46F8C84F.7090605-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-09-28 4:49 ` [RFC][PATCH]Memory mapped TPR shadow featureenabling Dong, Eddie
@ 2007-10-22 8:49 ` Yang, Sheng
[not found] ` <DB3BD37E3533EE46BED2FBA80995557F9BE17A-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
1 sibling, 1 reply; 18+ messages in thread
From: Yang, Sheng @ 2007-10-22 8:49 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Hi Avi,

I've been working on other things for a long time, so this patch was
delayed for a while. I was planning to post it soon, but found that,
because of the userspace-allocated memory patch, rmap was moved to
memslot. So your suggestion of handling it in gfn_to_page seems to have
run into a little trouble. And I would prefer not to use a memslot just
for this kind of thing.

Any suggestions?

Avi Kivity wrote:
> Yang, Sheng wrote:
>> These patches enable memory mapped TPR shadow (FlexPriority).
>>
>> Since TPR is accessed very frequently by 32bit Windows, especially SMP
>> guest, with FlexPriority enabled, we saw significant performance gain.
>>
>> The issue is: FlexPriority needs to add a memory slot to the vm to make
>> shadow work with APIC access page.
>>
>> We don't like the idea to add a memory slot, but no better choice now.
>> Our propose is to add p2m table to KVM, while seems this is still a long
>> way to go.
>>
>> BTW: I didn't use the offset(or other info) provide by CPU when handling
>> APIC access vmexit. Instead, I used a bit in cmd_type(including
>> no_decode) to tell emulator decode memory operand by itself when
>> necessary. That's because I only got the guest physical address when
>> handling APIC access vmexit, but emulator need a guest virtual address
>> to fit its flow. I have tried some ways, and current solution seems the
>> most proper one.
>>
>> + struct kvm_memory_region apic_memory = {
>> + .slot = 5,
>> + .memory_size = PAGE_SIZE,
>> + .guest_phys_addr = apicmem,
>> + };
>>
>> if (memory >= pcimem)
>> extended_memory.memory_size = pcimem - exmem;
>> @@ -302,9 +308,16 @@ int kvm_create(kvm_context_t kvm, unsigned long
>> phys_mem_bytes, void **vm_mem)
>> }
>> }
>>
>> + r = ioctl(fd, KVM_SET_MEMORY_REGION, &apic_memory);
>> + if (r == -1) {
>> + fprintf(stderr, "kvm_create_memory_region: %m\n");
>> + return -1;
>> + }
>>
>
> Older kernels support only 4 memory slots, so you need to tolerate
> failures here (this isn't an issue for large memory, because there's no
> way the older kernel can run with large memory, so you can't continue.
> but new userspace should be able to run with an older kernel if is not
> using newer features.
>
> I don't like userspace involvement in this. Perhaps we can have a
> memory slot controlled by the kernel for this? It would be activated by
> the feature, so it we won't have it on AMD or when the feature isn't
> available.
>
> It can also be just a special case in gfn_to_page.

Thanks
Yang, Sheng
[parent not found: <DB3BD37E3533EE46BED2FBA80995557F9BE17A-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* Re: [RFC][PATCH]Memory mapped TPR shadow feature enabling
[not found] ` <DB3BD37E3533EE46BED2FBA80995557F9BE17A-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2007-10-22 9:00 ` Avi Kivity
[not found] ` <471C66C1.6000508-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2007-10-22 9:00 UTC (permalink / raw)
To: Yang, Sheng; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Yang, Sheng wrote:
> Hi Avi,
>
> I've been work on other things for long time, so this patch delayed for
> a while. I am planning to post it out recently, but found that, because
> of userspace allocate memory patch, rmap was moved to memslot. So your
> suggestion on deal it with gfn_to_page seems got a little trouble. And I
> don't perfer to use one memslot to do this kind of thing alone.
>

I think that adding a memory slot is a good first step. Later if it
shows up in profiles we can replace it with something else, but right
now let's keep things simple.

It should probably be an internal memory slot, so we don't need to
change userspace to use this feature, and we can react to changes in the
apic base address.

(An internal memory slot would also be useful for storing the real-mode
tss on Intel)

Question: how does this interact with SMP? Does each vcpu require its
own page?

--
error compiling committee.c: too many arguments to function
[parent not found: <471C66C1.6000508-atKUWr5tajBWk0Htik3J/w@public.gmane.org>]
* Re: [RFC][PATCH]Memory mapped TPR shadow feature enabling
[not found] ` <471C66C1.6000508-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-10-22 9:11 ` Yang, Sheng
[not found] ` <DB3BD37E3533EE46BED2FBA80995557F9BE19D-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Yang, Sheng @ 2007-10-22 9:11 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Thanks for your advice. :)

It has nothing to do with SMP, though each domain shares one
apic_access_page now. In fact, currently this page won't be accessed; it
is just an address marker for the cpu.

Avi Kivity wrote:
> Yang, Sheng wrote:
>> Hi Avi,
>>
>> I've been work on other things for long time, so this patch delayed for
>> a while. I am planning to post it out recently, but found that, because
>> of userspace allocate memory patch, rmap was moved to memslot. So your
>> suggestion on deal it with gfn_to_page seems got a little trouble. And I
>> don't perfer to use one memslot to do this kind of thing alone.
>>
>
> I think that adding a memory slot is a good first step. Later if it
> shows up in profiles we can replace it with something else, but right
> now let's keep things simple.
>
> It should probably be an internal memory slot, so we don't need to
> change userspace to use this feature, and we can react to changes in the
> apic base address.
>
> (An internal memory slot would also be useful for storing the real-mode
> tss on Intel)
>
> Question: how does this interact with SMP? Does each vcpu require its
> own page?

Thanks
Yang, Sheng
[parent not found: <DB3BD37E3533EE46BED2FBA80995557F9BE19D-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* Re: [RFC][PATCH]Memory mapped TPR shadow feature enabling
[not found] ` <DB3BD37E3533EE46BED2FBA80995557F9BE19D-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2007-10-22 9:32 ` Avi Kivity
[not found] ` <471C6E14.9030502-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2007-10-22 9:32 UTC (permalink / raw)
To: Yang, Sheng; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Yang, Sheng wrote:
> Thanks for your advice. :)
>
> It have nothing to do with smp, though each domain shares one
> apic_access_page now. In fact, currently this page won't be accessed,
> just a address mark for cpu.
>

You mean, all that's necessary is that the virtual address be mapped
somewhere?

--
error compiling committee.c: too many arguments to function
[parent not found: <471C6E14.9030502-atKUWr5tajBWk0Htik3J/w@public.gmane.org>]
* Re: [RFC][PATCH]Memory mapped TPR shadow feature enabling
[not found] ` <471C6E14.9030502-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-10-22 9:45 ` Yang, Sheng
[not found] ` <DB3BD37E3533EE46BED2FBA80995557F9BE1B0-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Yang, Sheng @ 2007-10-22 9:45 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Yeah. We must map the guest virtual address of the apic page to some
determined hpa (4k aligned), then we write this hpa to the vmcs so the
cpu can raise the apic access vmexit.

Avi Kivity wrote:
> Yang, Sheng wrote:
>> Thanks for your advice. :)
>>
>> It have nothing to do with smp, though each domain shares one
>> apic_access_page now. In fact, currently this page won't be accessed,
>> just a address mark for cpu.
>>
>
> You mean, all that's necessary is that the virtual address be mapped
> somewhere?

Thanks
Yang, Sheng
[parent not found: <DB3BD37E3533EE46BED2FBA80995557F9BE1B0-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* Re: [RFC][PATCH]Memory mapped TPR shadow feature enabling
[not found] ` <DB3BD37E3533EE46BED2FBA80995557F9BE1B0-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2007-10-22 9:48 ` Avi Kivity
0 siblings, 0 replies; 18+ messages in thread
From: Avi Kivity @ 2007-10-22 9:48 UTC (permalink / raw)
To: Yang, Sheng; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Yang, Sheng wrote:
> Yeah. We must map guest virtual address of apic page to some determined
> hpa(4k aligned), then we write this hpa to vmcs for cpu handling apic
> access vmexit.
>
>

Ah of course. The processor traps on the physical address, not the
virtual address.

--
error compiling committee.c: too many arguments to function

^ permalink raw reply [flat|nested] 18+ messages in thread
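[Editorial sketch, not part of the thread.] The point that the processor
traps on the physical address, not the virtual address, can be illustrated
with a toy translation table: two different virtual pages that resolve to the
same physical page both hit the APIC access page, because the check looks
only at the translated address. Everything here is a simplified model under
stated assumptions; `toy_mapping`, `toy_translate` and
`toy_hits_apic_access_page` are hypothetical names, and a real CPU walks page
tables rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_MASK (~0xfffULL)

/* Hypothetical one-page mapping: virtual page base -> physical page base. */
struct toy_mapping {
    uint64_t va_page;
    uint64_t pa_page;
};

/* Translate va through the toy table.
 * Returns 0 and fills *pa on a hit, -1 if no entry covers va. */
int toy_translate(const struct toy_mapping *map, size_t n,
                  uint64_t va, uint64_t *pa)
{
    assert(pa != NULL);

    for (size_t i = 0; i < n; i++) {
        if ((va & TOY_PAGE_MASK) == map[i].va_page) {
            /* Keep the offset within the page, swap the page base. */
            *pa = map[i].pa_page | (va & ~TOY_PAGE_MASK);
            return 0;
        }
    }
    return -1;
}

/* The trap decision compares physical pages only; the virtual address
 * that produced pa plays no part in it. Returns 1 on a match, else 0. */
int toy_hits_apic_access_page(uint64_t pa, uint64_t apic_access_hpa)
{
    return (pa & TOY_PAGE_MASK) == (apic_access_hpa & TOY_PAGE_MASK);
}
```

With a table that maps both `0xfee00000` and `0xc0001000` to physical page
`0x12345000`, an access at offset `0x80` (the TPR register offset) through
either virtual page translates to `0x12345080` and matches the APIC access
page, while a virtual page mapped elsewhere does not.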
end of thread, other threads:[~2007-10-22 9:48 UTC | newest]
Thread overview: 18+ messages
2007-09-25 5:52 [RFC][PATCH]Memory mapped TPR shadow feature enabling Yang, Sheng
[not found] ` <DB3BD37E3533EE46BED2FBA80995557F87DA24-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-09-25 8:35 ` Avi Kivity
[not found] ` <46F8C84F.7090605-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-09-28 4:49 ` [RFC][PATCH]Memory mapped TPR shadow featureenabling Dong, Eddie
[not found] ` <10EA09EFD8728347A513008B6B0DA77A022AE787-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-09-28 14:41 ` Avi Kivity
[not found] ` <46FD1288.9030507-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-09-28 15:48 ` Dong, Eddie
[not found] ` <10EA09EFD8728347A513008B6B0DA77A022AEA2F-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-09-28 16:22 ` Windows 2003 Server SMP Guest crashes Fabian Deutsch
[not found] ` <20070928162210.98710-hi6Y0CQ0nG0@public.gmane.org>
2007-09-28 16:57 ` Avi Kivity
[not found] ` <46FD3295.5020700-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-09-28 16:52 ` Fabian Deutsch
2007-09-28 16:56 ` kvm guest memory management (was: Re: [RFC][PATCH]Memory mapped TPR shadow featureenabling) Avi Kivity
[not found] ` <46FD323C.3090905-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-09-29 3:34 ` kvm guest memory management Dong, Eddie
[not found] ` <10EA09EFD8728347A513008B6B0DA77A022AEAFE-wq7ZOvIWXbNpB2pF5aRoyrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-09-30 8:55 ` Avi Kivity
[not found] ` <46FF647E.6080506-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-03 15:03 ` Anthony Liguori
2007-10-22 8:49 ` [RFC][PATCH]Memory mapped TPR shadow feature enabling Yang, Sheng
[not found] ` <DB3BD37E3533EE46BED2FBA80995557F9BE17A-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-10-22 9:00 ` Avi Kivity
[not found] ` <471C66C1.6000508-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-22 9:11 ` Yang, Sheng
[not found] ` <DB3BD37E3533EE46BED2FBA80995557F9BE19D-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-10-22 9:32 ` Avi Kivity
[not found] ` <471C6E14.9030502-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-22 9:45 ` Yang, Sheng
[not found] ` <DB3BD37E3533EE46BED2FBA80995557F9BE1B0-wq7ZOvIWXbM/UvCtAeCM4rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2007-10-22 9:48 ` Avi Kivity