* [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4)
@ 2009-08-23 11:55 Avi Kivity
2009-08-23 11:56 ` [PATCH 01/46] KVM: Trace irq level and source id Avi Kivity
` (45 more replies)
0 siblings, 46 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:55 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
Third batch of the KVM patch queue.
Akinobu Mita (2):
KVM: x86: use get_desc_base() and get_desc_limit()
KVM: x86: use kvm_get_gdt() and kvm_read_ldt()
Andre Przywara (4):
KVM: Ignore PCI ECS I/O enablement
KVM: handle AMD microcode MSR
KVM: fix MMIO_CONF_BASE MSR access
KVM: add module parameters documentation
Avi Kivity (6):
KVM: Trace irq level and source id
KVM: Trace mmio
KVM: Trace apic registers using their symbolic names
KVM: MMU: Trace guest pagetable walker
KVM: Document basic API
KVM: Trace shadow page lifecycle
Beth Kon (1):
KVM: PIT support for HPET legacy mode
Gleb Natapov (12):
KVM: Add Directed EOI support to APIC emulation
KVM: x2apic interface to lapic
KVM: Use temporary variable to shorten lines.
KVM: Add trace points in irqchip code
KVM: No need to kick cpu if not in a guest mode
KVM: Always report x2apic as supported feature
KVM: Move exception handling to the same place as other events
KVM: Move kvm_cpu_get_interrupt() declaration to x86 code
KVM: Reduce runnability interface with arch support code
KVM: silence lapic kernel messages that can be triggered by a guest
KVM: s390: remove unused structs
KVM: PIT: Unregister ack notifier callback when freeing
Gregory Haskins (2):
KVM: make io_bus interface more robust
KVM: add ioeventfd support
Jan Kiszka (3):
Revert "KVM: x86: check for cr3 validity in ioctl_set_sregs"
KVM: Drop obsolete cpu_get/put in make_all_cpus_request
KVM: VMX: Avoid to return ENOTSUPP to userland
Joerg Roedel (8):
KVM: MMU: Fix MMU_DEBUG compile breakage
KVM: MMU: make rmap code aware of mapping levels
KVM: MMU: rename is_largepage_backed to mapping_level
KVM: MMU: make direct mapping paths aware of mapping levels
KVM: MMU: make page walker aware of mapping levels
KVM: MMU: shadow support for 1gb pages
KVM: MMU: enable gbpages by increasing nr of pagesizes
KVM: report 1GB page support to userspace
Marcelo Tosatti (2):
KVM: MMU: fix missing locking in alloc_mmu_pages
KVM: limit lapic periodic timer frequency
Michael S. Tsirkin (1):
KVM: ignore msi request if !level
Mikhail Ershov (1):
KVM: Align cr8 threshold when userspace changes cr8
Sheng Yang (3):
KVM: Fix apic_mmio_write return for unaligned write
KVM: Discard unnecessary kvm_mmu_flush_tlb() in kvm_mmu_load()
KVM: VMX: Introduce KVM_SET_IDENTITY_MAP_ADDR ioctl
Xiao Guangrong (1):
KVM: fix kvm_init() error handling
Documentation/kernel-parameters.txt | 39 ++
Documentation/kvm/api.txt | 683 +++++++++++++++++++++++++++++++++++
arch/ia64/kvm/kvm-ia64.c | 16 +-
arch/powerpc/kvm/powerpc.c | 13 +-
arch/s390/include/asm/kvm.h | 9 -
arch/s390/kvm/interrupt.c | 8 +-
arch/x86/include/asm/apicdef.h | 2 +
arch/x86/include/asm/kvm.h | 9 +
arch/x86/include/asm/kvm_host.h | 9 +-
arch/x86/kvm/i8254.c | 44 ++-
arch/x86/kvm/i8254.h | 3 +-
arch/x86/kvm/i8259.c | 12 +-
arch/x86/kvm/lapic.c | 268 +++++++++++----
arch/x86/kvm/lapic.h | 3 +
arch/x86/kvm/mmu.c | 272 +++++++++------
arch/x86/kvm/mmutrace.h | 220 +++++++++++
arch/x86/kvm/paging_tmpl.h | 108 +++---
arch/x86/kvm/svm.c | 12 +-
arch/x86/kvm/trace.h | 99 +++++-
arch/x86/kvm/vmx.c | 25 +-
arch/x86/kvm/x86.c | 176 +++++++---
arch/x86/kvm/x86.h | 4 +
include/linux/kvm.h | 32 ++
include/linux/kvm_host.h | 23 +-
include/trace/events/kvm.h | 100 +++++-
virt/kvm/coalesced_mmio.c | 8 +-
virt/kvm/eventfd.c | 251 +++++++++++++-
virt/kvm/ioapic.c | 28 +-
virt/kvm/irq_comm.c | 8 +-
virt/kvm/kvm_main.c | 59 +++-
30 files changed, 2177 insertions(+), 366 deletions(-)
create mode 100644 Documentation/kvm/api.txt
create mode 100644 arch/x86/kvm/mmutrace.h
^ permalink raw reply [flat|nested] 47+ messages in thread
* [PATCH 01/46] KVM: Trace irq level and source id
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 02/46] KVM: Ignore PCI ECS I/O enablement Avi Kivity
` (44 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
Signed-off-by: Avi Kivity <avi@redhat.com>
---
include/trace/events/kvm.h | 11 ++++++++---
virt/kvm/irq_comm.c | 2 +-
2 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index d74b23d..035232d 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -9,18 +9,23 @@
#if defined(__KVM_HAVE_IOAPIC)
TRACE_EVENT(kvm_set_irq,
- TP_PROTO(unsigned int gsi),
- TP_ARGS(gsi),
+ TP_PROTO(unsigned int gsi, int level, int irq_source_id),
+ TP_ARGS(gsi, level, irq_source_id),
TP_STRUCT__entry(
__field( unsigned int, gsi )
+ __field( int, level )
+ __field( int, irq_source_id )
),
TP_fast_assign(
__entry->gsi = gsi;
+ __entry->level = level;
+ __entry->irq_source_id = irq_source_id;
),
- TP_printk("gsi %u", __entry->gsi)
+ TP_printk("gsi %u level %d source %d",
+ __entry->gsi, __entry->level, __entry->irq_source_id)
);
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 94759ed..56e6961 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -126,7 +126,7 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, int irq, int level)
unsigned long *irq_state, sig_level;
int ret = -1;
- trace_kvm_set_irq(irq);
+ trace_kvm_set_irq(irq, level, irq_source_id);
WARN_ON(!mutex_is_locked(&kvm->irq_lock));
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 02/46] KVM: Ignore PCI ECS I/O enablement
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
2009-08-23 11:56 ` [PATCH 01/46] KVM: Trace irq level and source id Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 03/46] KVM: Trace mmio Avi Kivity
` (43 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Andre Przywara <andre.przywara@amd.com>
Linux guests will try to enable access to the extended PCI config space
via the I/O ports 0xCF8/0xCFC on AMD Fam10h CPU. Since we (currently?)
don't use ECS, simply ignore write and read attempts.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/x86.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 96f0ae7..af40e23 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -844,6 +844,8 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
return 1;
}
break;
+ case MSR_AMD64_NB_CFG:
+ break;
case MSR_IA32_DEBUGCTLMSR:
if (!data) {
/* We support the non-activated case already */
@@ -1049,6 +1051,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
case MSR_P6_EVNTSEL1:
case MSR_K7_EVNTSEL0:
case MSR_K8_INT_PENDING_MSG:
+ case MSR_AMD64_NB_CFG:
data = 0;
break;
case MSR_MTRRcap:
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 03/46] KVM: Trace mmio
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
2009-08-23 11:56 ` [PATCH 01/46] KVM: Trace irq level and source id Avi Kivity
2009-08-23 11:56 ` [PATCH 02/46] KVM: Ignore PCI ECS I/O enablement Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 04/46] KVM: Trace apic registers using their symbolic names Avi Kivity
` (42 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/x86.c | 11 ++++++++++-
include/trace/events/kvm.h | 33 +++++++++++++++++++++++++++++++++
2 files changed, 43 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index af40e23..d32e3c6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -37,6 +37,8 @@
#include <linux/iommu.h>
#include <linux/intel-iommu.h>
#include <linux/cpufreq.h>
+#include <trace/events/kvm.h>
+#undef TRACE_INCLUDE_FILE
#define CREATE_TRACE_POINTS
#include "trace.h"
@@ -2425,6 +2427,8 @@ static int emulator_read_emulated(unsigned long addr,
if (vcpu->mmio_read_completed) {
memcpy(val, vcpu->mmio_data, bytes);
+ trace_kvm_mmio(KVM_TRACE_MMIO_READ, bytes,
+ vcpu->mmio_phys_addr, *(u64 *)val);
vcpu->mmio_read_completed = 0;
return X86EMUL_CONTINUE;
}
@@ -2445,8 +2449,12 @@ mmio:
/*
* Is this MMIO handled locally?
*/
- if (!vcpu_mmio_read(vcpu, gpa, bytes, val))
+ if (!vcpu_mmio_read(vcpu, gpa, bytes, val)) {
+ trace_kvm_mmio(KVM_TRACE_MMIO_READ, bytes, gpa, *(u64 *)val);
return X86EMUL_CONTINUE;
+ }
+
+ trace_kvm_mmio(KVM_TRACE_MMIO_READ_UNSATISFIED, bytes, gpa, 0);
vcpu->mmio_needed = 1;
vcpu->mmio_phys_addr = gpa;
@@ -2490,6 +2498,7 @@ static int emulator_write_emulated_onepage(unsigned long addr,
return X86EMUL_CONTINUE;
mmio:
+ trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, bytes, gpa, *(u64 *)val);
/*
* Is this MMIO handled locally?
*/
diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index 035232d..77022af 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -56,6 +56,39 @@ TRACE_EVENT(kvm_ack_irq,
#endif /* defined(__KVM_HAVE_IOAPIC) */
+
+#define KVM_TRACE_MMIO_READ_UNSATISFIED 0
+#define KVM_TRACE_MMIO_READ 1
+#define KVM_TRACE_MMIO_WRITE 2
+
+#define kvm_trace_symbol_mmio \
+ { KVM_TRACE_MMIO_READ_UNSATISFIED, "unsatisfied-read" }, \
+ { KVM_TRACE_MMIO_READ, "read" }, \
+ { KVM_TRACE_MMIO_WRITE, "write" }
+
+TRACE_EVENT(kvm_mmio,
+ TP_PROTO(int type, int len, u64 gpa, u64 val),
+ TP_ARGS(type, len, gpa, val),
+
+ TP_STRUCT__entry(
+ __field( u32, type )
+ __field( u32, len )
+ __field( u64, gpa )
+ __field( u64, val )
+ ),
+
+ TP_fast_assign(
+ __entry->type = type;
+ __entry->len = len;
+ __entry->gpa = gpa;
+ __entry->val = val;
+ ),
+
+ TP_printk("mmio %s len %u gpa 0x%llx val 0x%llx",
+ __print_symbolic(__entry->type, kvm_trace_symbol_mmio),
+ __entry->len, __entry->gpa, __entry->val)
+);
+
#endif /* _TRACE_KVM_MAIN_H */
/* This part must be outside protection */
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 04/46] KVM: Trace apic registers using their symbolic names
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (2 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 03/46] KVM: Trace mmio Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 05/46] KVM: Add Directed EOI support to APIC emulation Avi Kivity
` (41 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/trace.h | 14 ++++++++++++--
1 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index cd8c90d..6c2c87f 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -111,6 +111,15 @@ TRACE_EVENT(kvm_cpuid,
__entry->rbx, __entry->rcx, __entry->rdx)
);
+#define AREG(x) { APIC_##x, "APIC_" #x }
+
+#define kvm_trace_symbol_apic \
+ AREG(ID), AREG(LVR), AREG(TASKPRI), AREG(ARBPRI), AREG(PROCPRI), \
+ AREG(EOI), AREG(RRR), AREG(LDR), AREG(DFR), AREG(SPIV), AREG(ISR), \
+ AREG(TMR), AREG(IRR), AREG(ESR), AREG(ICR), AREG(ICR2), AREG(LVTT), \
+ AREG(LVTTHMR), AREG(LVTPC), AREG(LVT0), AREG(LVT1), AREG(LVTERR), \
+ AREG(TMICT), AREG(TMCCT), AREG(TDCR), AREG(SELF_IPI), AREG(EFEAT), \
+ AREG(ECTRL)
/*
* Tracepoint for apic access.
*/
@@ -130,9 +139,10 @@ TRACE_EVENT(kvm_apic,
__entry->val = val;
),
- TP_printk("apic_%s 0x%x = 0x%x",
+ TP_printk("apic_%s %s = 0x%x",
__entry->rw ? "write" : "read",
- __entry->reg, __entry->val)
+ __print_symbolic(__entry->reg, kvm_trace_symbol_apic),
+ __entry->val)
);
#define trace_kvm_apic_read(reg, val) trace_kvm_apic(0, reg, val)
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 05/46] KVM: Add Directed EOI support to APIC emulation
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (3 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 04/46] KVM: Trace apic registers using their symbolic names Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 06/46] KVM: x2apic interface to lapic Avi Kivity
` (40 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gleb Natapov <gleb@redhat.com>
Directed EOI is specified by x2APIC, but is available even when lapic is
in xAPIC mode.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/include/asm/apicdef.h | 2 ++
arch/x86/kvm/lapic.c | 38 ++++++++++++++++++++++++++++++--------
arch/x86/kvm/lapic.h | 1 +
arch/x86/kvm/x86.c | 4 ++--
arch/x86/kvm/x86.h | 4 ++++
5 files changed, 39 insertions(+), 10 deletions(-)
diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h
index 7ddb36a..74ca38f 100644
--- a/arch/x86/include/asm/apicdef.h
+++ b/arch/x86/include/asm/apicdef.h
@@ -14,6 +14,7 @@
#define APIC_LVR 0x30
#define APIC_LVR_MASK 0xFF00FF
+#define APIC_LVR_DIRECTED_EOI (1 << 24)
#define GET_APIC_VERSION(x) ((x) & 0xFFu)
#define GET_APIC_MAXLVT(x) (((x) >> 16) & 0xFFu)
#ifdef CONFIG_X86_32
@@ -40,6 +41,7 @@
#define APIC_DFR_CLUSTER 0x0FFFFFFFul
#define APIC_DFR_FLAT 0xFFFFFFFFul
#define APIC_SPIV 0xF0
+#define APIC_SPIV_DIRECTED_EOI (1 << 12)
#define APIC_SPIV_FOCUS_DISABLED (1 << 9)
#define APIC_SPIV_APIC_ENABLED (1 << 8)
#define APIC_ISR 0x100
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 265a765..62ea2ab 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -35,6 +35,7 @@
#include "kvm_cache_regs.h"
#include "irq.h"
#include "trace.h"
+#include "x86.h"
#ifndef CONFIG_X86_64
#define mod_64(x, y) ((x) - (y) * div64_u64(x, y))
@@ -142,6 +143,21 @@ static inline int apic_lvt_nmi_mode(u32 lvt_val)
return (lvt_val & (APIC_MODE_MASK | APIC_LVT_MASKED)) == APIC_DM_NMI;
}
+void kvm_apic_set_version(struct kvm_vcpu *vcpu)
+{
+ struct kvm_lapic *apic = vcpu->arch.apic;
+ struct kvm_cpuid_entry2 *feat;
+ u32 v = APIC_VERSION;
+
+ if (!irqchip_in_kernel(vcpu->kvm))
+ return;
+
+ feat = kvm_find_cpuid_entry(apic->vcpu, 0x1, 0);
+ if (feat && (feat->ecx & (1 << (X86_FEATURE_X2APIC & 31))))
+ v |= APIC_LVR_DIRECTED_EOI;
+ apic_set_reg(apic, APIC_LVR, v);
+}
+
static unsigned int apic_lvt_mask[APIC_LVT_NUM] = {
LVT_MASK | APIC_LVT_TIMER_PERIODIC, /* LVTT */
LVT_MASK | APIC_MODE_MASK, /* LVTTHMR */
@@ -442,9 +458,11 @@ static void apic_set_eoi(struct kvm_lapic *apic)
trigger_mode = IOAPIC_LEVEL_TRIG;
else
trigger_mode = IOAPIC_EDGE_TRIG;
- mutex_lock(&apic->vcpu->kvm->irq_lock);
- kvm_ioapic_update_eoi(apic->vcpu->kvm, vector, trigger_mode);
- mutex_unlock(&apic->vcpu->kvm->irq_lock);
+ if (!(apic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI)) {
+ mutex_lock(&apic->vcpu->kvm->irq_lock);
+ kvm_ioapic_update_eoi(apic->vcpu->kvm, vector, trigger_mode);
+ mutex_unlock(&apic->vcpu->kvm->irq_lock);
+ }
}
static void apic_send_ipi(struct kvm_lapic *apic)
@@ -694,8 +712,11 @@ static int apic_mmio_write(struct kvm_io_device *this,
apic_set_reg(apic, APIC_DFR, val | 0x0FFFFFFF);
break;
- case APIC_SPIV:
- apic_set_reg(apic, APIC_SPIV, val & 0x3ff);
+ case APIC_SPIV: {
+ u32 mask = 0x3ff;
+ if (apic_get_reg(apic, APIC_LVR) & APIC_LVR_DIRECTED_EOI)
+ mask |= APIC_SPIV_DIRECTED_EOI;
+ apic_set_reg(apic, APIC_SPIV, val & mask);
if (!(val & APIC_SPIV_APIC_ENABLED)) {
int i;
u32 lvt_val;
@@ -710,7 +731,7 @@ static int apic_mmio_write(struct kvm_io_device *this,
}
break;
-
+ }
case APIC_ICR:
/* No delay here, so we always clear the pending bit */
apic_set_reg(apic, APIC_ICR, val & ~(1 << 12));
@@ -837,7 +858,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu)
hrtimer_cancel(&apic->lapic_timer.timer);
apic_set_reg(apic, APIC_ID, vcpu->vcpu_id << 24);
- apic_set_reg(apic, APIC_LVR, APIC_VERSION);
+ kvm_apic_set_version(apic->vcpu);
for (i = 0; i < APIC_LVT_NUM; i++)
apic_set_reg(apic, APIC_LVTT + 0x10 * i, APIC_LVT_MASKED);
@@ -1041,7 +1062,8 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu)
apic->base_address = vcpu->arch.apic_base &
MSR_IA32_APICBASE_BASE;
- apic_set_reg(apic, APIC_LVR, APIC_VERSION);
+ kvm_apic_set_version(vcpu);
+
apic_update_ppr(apic);
hrtimer_cancel(&apic->lapic_timer.timer);
update_divide_count(apic);
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 3f3ecc6..bc1c524 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -29,6 +29,7 @@ u64 kvm_lapic_get_cr8(struct kvm_vcpu *vcpu);
void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8);
void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value);
u64 kvm_lapic_get_base(struct kvm_vcpu *vcpu);
+void kvm_apic_set_version(struct kvm_vcpu *vcpu);
int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest);
int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u8 mda);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d32e3c6..086f931 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -79,8 +79,6 @@ static u64 __read_mostly efer_reserved_bits = 0xfffffffffffffffeULL;
static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
struct kvm_cpuid_entry2 __user *entries);
-struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu,
- u32 function, u32 index);
struct kvm_x86_ops *kvm_x86_ops;
EXPORT_SYMBOL_GPL(kvm_x86_ops);
@@ -1373,6 +1371,7 @@ static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
vcpu->arch.cpuid_nent = cpuid->nent;
cpuid_fix_nx_cap(vcpu);
r = 0;
+ kvm_apic_set_version(vcpu);
out_free:
vfree(cpuid_entries);
@@ -1394,6 +1393,7 @@ static int kvm_vcpu_ioctl_set_cpuid2(struct kvm_vcpu *vcpu,
cpuid->nent * sizeof(struct kvm_cpuid_entry2)))
goto out;
vcpu->arch.cpuid_nent = cpuid->nent;
+ kvm_apic_set_version(vcpu);
return 0;
out:
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 4c8e10a..5eadea5 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -31,4 +31,8 @@ static inline bool kvm_exception_is_soft(unsigned int nr)
{
return (nr == BP_VECTOR) || (nr == OF_VECTOR);
}
+
+struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu,
+ u32 function, u32 index);
+
#endif
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 06/46] KVM: x2apic interface to lapic
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (4 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 05/46] KVM: Add Directed EOI support to APIC emulation Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 07/46] KVM: Use temporary variable to shorten lines Avi Kivity
` (39 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gleb Natapov <gleb@redhat.com>
This patch implements MSR interface to local apic as defines by x2apic
Intel specification.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/lapic.c | 217 ++++++++++++++++++++++++++++++++++++-------------
arch/x86/kvm/lapic.h | 2 +
arch/x86/kvm/x86.c | 7 ++-
3 files changed, 167 insertions(+), 59 deletions(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 62ea2ab..dc6bec2 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -32,6 +32,7 @@
#include <asm/current.h>
#include <asm/apicdef.h>
#include <asm/atomic.h>
+#include <asm/apicdef.h>
#include "kvm_cache_regs.h"
#include "irq.h"
#include "trace.h"
@@ -158,6 +159,11 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu)
apic_set_reg(apic, APIC_LVR, v);
}
+static inline int apic_x2apic_mode(struct kvm_lapic *apic)
+{
+ return apic->vcpu->arch.apic_base & X2APIC_ENABLE;
+}
+
static unsigned int apic_lvt_mask[APIC_LVT_NUM] = {
LVT_MASK | APIC_LVT_TIMER_PERIODIC, /* LVTT */
LVT_MASK | APIC_MODE_MASK, /* LVTTHMR */
@@ -284,7 +290,12 @@ int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest)
int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u8 mda)
{
int result = 0;
- u8 logical_id;
+ u32 logical_id;
+
+ if (apic_x2apic_mode(apic)) {
+ logical_id = apic_get_reg(apic, APIC_LDR);
+ return logical_id & mda;
+ }
logical_id = GET_APIC_LOGICAL_ID(apic_get_reg(apic, APIC_LDR));
@@ -477,7 +488,10 @@ static void apic_send_ipi(struct kvm_lapic *apic)
irq.level = icr_low & APIC_INT_ASSERT;
irq.trig_mode = icr_low & APIC_INT_LEVELTRIG;
irq.shorthand = icr_low & APIC_SHORT_MASK;
- irq.dest_id = GET_APIC_DEST_FIELD(icr_high);
+ if (apic_x2apic_mode(apic))
+ irq.dest_id = icr_high;
+ else
+ irq.dest_id = GET_APIC_DEST_FIELD(icr_high);
apic_debug("icr_high 0x%x, icr_low 0x%x, "
"short_hand 0x%x, dest 0x%x, trig_mode 0x%x, level 0x%x, "
@@ -564,28 +578,26 @@ static inline struct kvm_lapic *to_lapic(struct kvm_io_device *dev)
return container_of(dev, struct kvm_lapic, dev);
}
-static int apic_mmio_in_range(struct kvm_lapic *apic, gpa_t addr)
-{
- return apic_hw_enabled(apic) &&
- addr >= apic->base_address &&
- addr < apic->base_address + LAPIC_MMIO_LENGTH;
-}
-
-static int apic_mmio_read(struct kvm_io_device *this,
- gpa_t address, int len, void *data)
+static int apic_reg_read(struct kvm_lapic *apic, u32 offset, int len,
+ void *data)
{
- struct kvm_lapic *apic = to_lapic(this);
- unsigned int offset = address - apic->base_address;
unsigned char alignment = offset & 0xf;
u32 result;
- if (!apic_mmio_in_range(apic, address))
- return -EOPNOTSUPP;
+ /* this bitmask has a bit cleared for each reserver register */
+ static const u64 rmask = 0x43ff01ffffffe70cULL;
if ((alignment + len) > 4) {
- printk(KERN_ERR "KVM_APIC_READ: alignment error %lx %d",
- (unsigned long)address, len);
- return 0;
+ printk(KERN_ERR "KVM_APIC_READ: alignment error %x %d\n",
+ offset, len);
+ return 1;
+ }
+
+ if (offset > 0x3f0 || !(rmask & (1ULL << (offset >> 4)))) {
+ printk(KERN_ERR "KVM_APIC_READ: read reserved register %x\n",
+ offset);
+ return 1;
}
+
result = __apic_read(apic, offset & ~0xf);
trace_kvm_apic_read(offset, result);
@@ -604,6 +616,27 @@ static int apic_mmio_read(struct kvm_io_device *this,
return 0;
}
+static int apic_mmio_in_range(struct kvm_lapic *apic, gpa_t addr)
+{
+ return apic_hw_enabled(apic) &&
+ addr >= apic->base_address &&
+ addr < apic->base_address + LAPIC_MMIO_LENGTH;
+}
+
+static int apic_mmio_read(struct kvm_io_device *this,
+ gpa_t address, int len, void *data)
+{
+ struct kvm_lapic *apic = to_lapic(this);
+ u32 offset = address - apic->base_address;
+
+ if (!apic_mmio_in_range(apic, address))
+ return -EOPNOTSUPP;
+
+ apic_reg_read(apic, offset, len, data);
+
+ return 0;
+}
+
static void update_divide_count(struct kvm_lapic *apic)
{
u32 tmp1, tmp2, tdcr;
@@ -657,42 +690,18 @@ static void apic_manage_nmi_watchdog(struct kvm_lapic *apic, u32 lvt0_val)
apic->vcpu->kvm->arch.vapics_in_nmi_mode--;
}
-static int apic_mmio_write(struct kvm_io_device *this,
- gpa_t address, int len, const void *data)
+static int apic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
{
- struct kvm_lapic *apic = to_lapic(this);
- unsigned int offset = address - apic->base_address;
- unsigned char alignment = offset & 0xf;
- u32 val;
- if (!apic_mmio_in_range(apic, address))
- return -EOPNOTSUPP;
-
- /*
- * APIC register must be aligned on 128-bits boundary.
- * 32/64/128 bits registers must be accessed thru 32 bits.
- * Refer SDM 8.4.1
- */
- if (len != 4 || alignment) {
- /* Don't shout loud, $infamous_os would cause only noise. */
- apic_debug("apic write: bad size=%d %lx\n",
- len, (long)address);
- return 0;
- }
-
- val = *(u32 *) data;
-
- /* too common printing */
- if (offset != APIC_EOI)
- apic_debug("%s: offset 0x%x with length 0x%x, and value is "
- "0x%x\n", __func__, offset, len, val);
+ int ret = 0;
- offset &= 0xff0;
+ trace_kvm_apic_write(reg, val);
- trace_kvm_apic_write(offset, val);
-
- switch (offset) {
+ switch (reg) {
case APIC_ID: /* Local APIC ID */
- apic_set_reg(apic, APIC_ID, val);
+ if (!apic_x2apic_mode(apic))
+ apic_set_reg(apic, APIC_ID, val);
+ else
+ ret = 1;
break;
case APIC_TASKPRI:
@@ -705,11 +714,17 @@ static int apic_mmio_write(struct kvm_io_device *this,
break;
case APIC_LDR:
- apic_set_reg(apic, APIC_LDR, val & APIC_LDR_MASK);
+ if (!apic_x2apic_mode(apic))
+ apic_set_reg(apic, APIC_LDR, val & APIC_LDR_MASK);
+ else
+ ret = 1;
break;
case APIC_DFR:
- apic_set_reg(apic, APIC_DFR, val | 0x0FFFFFFF);
+ if (!apic_x2apic_mode(apic))
+ apic_set_reg(apic, APIC_DFR, val | 0x0FFFFFFF);
+ else
+ ret = 1;
break;
case APIC_SPIV: {
@@ -739,7 +754,9 @@ static int apic_mmio_write(struct kvm_io_device *this,
break;
case APIC_ICR2:
- apic_set_reg(apic, APIC_ICR2, val & 0xff000000);
+ if (!apic_x2apic_mode(apic))
+ val &= 0xff000000;
+ apic_set_reg(apic, APIC_ICR2, val);
break;
case APIC_LVT0:
@@ -753,8 +770,8 @@ static int apic_mmio_write(struct kvm_io_device *this,
if (!apic_sw_enabled(apic))
val |= APIC_LVT_MASKED;
- val &= apic_lvt_mask[(offset - APIC_LVTT) >> 4];
- apic_set_reg(apic, offset, val);
+ val &= apic_lvt_mask[(reg - APIC_LVTT) >> 4];
+ apic_set_reg(apic, reg, val);
break;
@@ -762,7 +779,7 @@ static int apic_mmio_write(struct kvm_io_device *this,
hrtimer_cancel(&apic->lapic_timer.timer);
apic_set_reg(apic, APIC_TMICT, val);
start_apic_timer(apic);
- return 0;
+ break;
case APIC_TDCR:
if (val & 4)
@@ -771,11 +788,58 @@ static int apic_mmio_write(struct kvm_io_device *this,
update_divide_count(apic);
break;
+ case APIC_ESR:
+ if (apic_x2apic_mode(apic) && val != 0) {
+ printk(KERN_ERR "KVM_WRITE:ESR not zero %x\n", val);
+ ret = 1;
+ }
+ break;
+
+ case APIC_SELF_IPI:
+ if (apic_x2apic_mode(apic)) {
+ apic_reg_write(apic, APIC_ICR, 0x40000 | (val & 0xff));
+ } else
+ ret = 1;
+ break;
default:
- apic_debug("Local APIC Write to read-only register %x\n",
- offset);
+ ret = 1;
break;
}
+ if (ret)
+ apic_debug("Local APIC Write to read-only register %x\n", reg);
+ return ret;
+}
+
+static int apic_mmio_write(struct kvm_io_device *this,
+ gpa_t address, int len, const void *data)
+{
+ struct kvm_lapic *apic = to_lapic(this);
+ unsigned int offset = address - apic->base_address;
+ u32 val;
+
+ if (!apic_mmio_in_range(apic, address))
+ return -EOPNOTSUPP;
+
+ /*
+ * APIC register must be aligned on 128-bits boundary.
+ * 32/64/128 bits registers must be accessed thru 32 bits.
+ * Refer SDM 8.4.1
+ */
+ if (len != 4 || (offset & 0xf)) {
+ /* Don't shout loud, $infamous_os would cause only noise. */
+ apic_debug("apic write: bad size=%d %lx\n", len, (long)address);
+ return;
+ }
+
+ val = *(u32*)data;
+
+ /* too common printing */
+ if (offset != APIC_EOI)
+ apic_debug("%s: offset 0x%x with length 0x%x, and value is "
+ "0x%x\n", __func__, offset, len, val);
+
+ apic_reg_write(apic, offset & 0xff0, val);
+
return 0;
}
@@ -834,6 +898,11 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
value &= ~MSR_IA32_APICBASE_BSP;
vcpu->arch.apic_base = value;
+ if (apic_x2apic_mode(apic)) {
+ u32 id = kvm_apic_id(apic);
+ u32 ldr = ((id & ~0xf) << 16) | (1 << (id & 0xf));
+ apic_set_reg(apic, APIC_LDR, ldr);
+ }
apic->base_address = apic->vcpu->arch.apic_base &
MSR_IA32_APICBASE_BASE;
@@ -1130,3 +1199,35 @@ void kvm_lapic_set_vapic_addr(struct kvm_vcpu *vcpu, gpa_t vapic_addr)
vcpu->arch.apic->vapic_addr = vapic_addr;
}
+
+int kvm_x2apic_msr_write(struct kvm_vcpu *vcpu, u32 msr, u64 data)
+{
+ struct kvm_lapic *apic = vcpu->arch.apic;
+ u32 reg = (msr - APIC_BASE_MSR) << 4;
+
+ if (!irqchip_in_kernel(vcpu->kvm) || !apic_x2apic_mode(apic))
+ return 1;
+
+ /* if this is ICR write vector before command */
+ if (msr == 0x830)
+ apic_reg_write(apic, APIC_ICR2, (u32)(data >> 32));
+ return apic_reg_write(apic, reg, (u32)data);
+}
+
+int kvm_x2apic_msr_read(struct kvm_vcpu *vcpu, u32 msr, u64 *data)
+{
+ struct kvm_lapic *apic = vcpu->arch.apic;
+ u32 reg = (msr - APIC_BASE_MSR) << 4, low, high = 0;
+
+ if (!irqchip_in_kernel(vcpu->kvm) || !apic_x2apic_mode(apic))
+ return 1;
+
+ if (apic_reg_read(apic, reg, 4, &low))
+ return 1;
+ if (msr == 0x830)
+ apic_reg_read(apic, APIC_ICR2, 4, &high);
+
+ *data = (((u64)high) << 32) | low;
+
+ return 0;
+}
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index bc1c524..40010b0 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -46,4 +46,6 @@ void kvm_lapic_set_vapic_addr(struct kvm_vcpu *vcpu, gpa_t vapic_addr);
void kvm_lapic_sync_from_vapic(struct kvm_vcpu *vcpu);
void kvm_lapic_sync_to_vapic(struct kvm_vcpu *vcpu);
+int kvm_x2apic_msr_write(struct kvm_vcpu *vcpu, u32 msr, u64 data);
+int kvm_x2apic_msr_read(struct kvm_vcpu *vcpu, u32 msr, u64 *data);
#endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 086f931..a50c832 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -867,6 +867,8 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
case MSR_IA32_APICBASE:
kvm_set_apic_base(vcpu, data);
break;
+ case APIC_BASE_MSR ... APIC_BASE_MSR + 0x3ff:
+ return kvm_x2apic_msr_write(vcpu, msr, data);
case MSR_IA32_MISC_ENABLE:
vcpu->arch.ia32_misc_enable_msr = data;
break;
@@ -1065,6 +1067,9 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
case MSR_IA32_APICBASE:
data = kvm_get_apic_base(vcpu);
break;
+ case APIC_BASE_MSR ... APIC_BASE_MSR + 0x3ff:
+ return kvm_x2apic_msr_read(vcpu, msr, pdata);
+ break;
case MSR_IA32_MISC_ENABLE:
data = vcpu->arch.ia32_misc_enable_msr;
break;
@@ -1469,7 +1474,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
0 /* TM2 */ | F(SSSE3) | 0 /* CNXT-ID */ | 0 /* Reserved */ |
0 /* Reserved */ | F(CX16) | 0 /* xTPR Update, PDCM */ |
0 /* Reserved, DCA */ | F(XMM4_1) |
- F(XMM4_2) | 0 /* x2APIC */ | F(MOVBE) | F(POPCNT) |
+ F(XMM4_2) | F(X2APIC) | F(MOVBE) | F(POPCNT) |
0 /* Reserved, XSAVE, OSXSAVE */;
/* cpuid 0x80000001.ecx */
const u32 kvm_supported_word6_x86_features =
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 07/46] KVM: Use temporary variable to shorten lines.
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (5 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 06/46] KVM: x2apic interface to lapic Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 08/46] KVM: Fix apic_mmio_write return for unaligned write Avi Kivity
` (38 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gleb Natapov <gleb@redhat.com>
Cosmetic only. No logic is changed by this patch.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
virt/kvm/ioapic.c | 18 ++++++++++--------
1 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 8a9c6cc..92496ff 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -103,6 +103,7 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
{
unsigned index;
bool mask_before, mask_after;
+ union kvm_ioapic_redirect_entry *e;
switch (ioapic->ioregsel) {
case IOAPIC_REG_VERSION:
@@ -122,19 +123,20 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
ioapic_debug("change redir index %x val %x\n", index, val);
if (index >= IOAPIC_NUM_PINS)
return;
- mask_before = ioapic->redirtbl[index].fields.mask;
+ e = &ioapic->redirtbl[index];
+ mask_before = e->fields.mask;
if (ioapic->ioregsel & 1) {
- ioapic->redirtbl[index].bits &= 0xffffffff;
- ioapic->redirtbl[index].bits |= (u64) val << 32;
+ e->bits &= 0xffffffff;
+ e->bits |= (u64) val << 32;
} else {
- ioapic->redirtbl[index].bits &= ~0xffffffffULL;
- ioapic->redirtbl[index].bits |= (u32) val;
- ioapic->redirtbl[index].fields.remote_irr = 0;
+ e->bits &= ~0xffffffffULL;
+ e->bits |= (u32) val;
+ e->fields.remote_irr = 0;
}
- mask_after = ioapic->redirtbl[index].fields.mask;
+ mask_after = e->fields.mask;
if (mask_before != mask_after)
kvm_fire_mask_notifiers(ioapic->kvm, index, mask_after);
- if (ioapic->redirtbl[index].fields.trig_mode == IOAPIC_LEVEL_TRIG
+ if (e->fields.trig_mode == IOAPIC_LEVEL_TRIG
&& ioapic->irr & (1 << index))
ioapic_service(ioapic, index);
break;
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 08/46] KVM: Fix apic_mmio_write return for unaligned write
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (6 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 07/46] KVM: Use temporary variable to shorten lines Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 09/46] KVM: handle AMD microcode MSR Avi Kivity
` (37 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Sheng Yang <sheng@linux.intel.com>
Some in-famous OS do unaligned writing for APIC MMIO, and the return value
has been missed in recent change, then the OS hangs.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/lapic.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index dc6bec2..31f1d1c 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -828,7 +828,7 @@ static int apic_mmio_write(struct kvm_io_device *this,
if (len != 4 || (offset & 0xf)) {
/* Don't shout loud, $infamous_os would cause only noise. */
apic_debug("apic write: bad size=%d %lx\n", len, (long)address);
- return;
+ return 0;
}
val = *(u32*)data;
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 09/46] KVM: handle AMD microcode MSR
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (7 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 08/46] KVM: Fix apic_mmio_write return for unaligned write Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 10/46] Revert "KVM: x86: check for cr3 validity in ioctl_set_sregs" Avi Kivity
` (36 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Andre Przywara <andre.przywara@amd.com>
Windows 7 tries to update the CPU's microcode on some processors,
so we ignore the MSR write here. The patchlevel register is already handled
(returning 0), because the MSR number is the same as Intel's.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/x86.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a50c832..6dde99c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -861,6 +861,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
case MSR_IA32_UCODE_REV:
case MSR_IA32_UCODE_WRITE:
case MSR_VM_HSAVE_PA:
+ case MSR_AMD64_PATCH_LOADER:
break;
case 0x200 ... 0x2ff:
return set_msr_mtrr(vcpu, msr, data);
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 10/46] Revert "KVM: x86: check for cr3 validity in ioctl_set_sregs"
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (8 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 09/46] KVM: handle AMD microcode MSR Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 11/46] KVM: MMU: Trace guest pagetable walker Avi Kivity
` (35 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Jan Kiszka <jan.kiszka@web.de>
This reverts commit 6c20e1442bb1c62914bb85b7f4a38973d2a423ba.
To my understanding, it became obsolete with the advent of the more
robust check in mmu_alloc_roots (89da4ff17f). Moreover, it prevents
the conceptually safe pattern
1. set sregs
2. register mem-slots
3. run vcpu
by setting a sticky triple fault during step 1.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/x86.c | 8 +-------
1 files changed, 1 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6dde99c..0e74d98 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4326,13 +4326,7 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
vcpu->arch.cr2 = sregs->cr2;
mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3;
-
- down_read(&vcpu->kvm->slots_lock);
- if (gfn_to_memslot(vcpu->kvm, sregs->cr3 >> PAGE_SHIFT))
- vcpu->arch.cr3 = sregs->cr3;
- else
- set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests);
- up_read(&vcpu->kvm->slots_lock);
+ vcpu->arch.cr3 = sregs->cr3;
kvm_set_cr8(vcpu, sregs->cr8);
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 11/46] KVM: MMU: Trace guest pagetable walker
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (9 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 10/46] Revert "KVM: x86: check for cr3 validity in ioctl_set_sregs" Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 12/46] KVM: Document basic API Avi Kivity
` (34 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/mmu.c | 3 +
arch/x86/kvm/mmutrace.h | 117 ++++++++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/paging_tmpl.h | 11 +++-
3 files changed, 128 insertions(+), 3 deletions(-)
create mode 100644 arch/x86/kvm/mmutrace.h
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index b67585c..c0dda64 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -140,6 +140,9 @@ module_param(oos_shadow, bool, 0644);
#define ACC_USER_MASK PT_USER_MASK
#define ACC_ALL (ACC_EXEC_MASK | ACC_WRITE_MASK | ACC_USER_MASK)
+#define CREATE_TRACE_POINTS
+#include "mmutrace.h"
+
#define SHADOW_PT_INDEX(addr, level) PT64_INDEX(addr, level)
struct kvm_rmap_desc {
diff --git a/arch/x86/kvm/mmutrace.h b/arch/x86/kvm/mmutrace.h
new file mode 100644
index 0000000..1367f82
--- /dev/null
+++ b/arch/x86/kvm/mmutrace.h
@@ -0,0 +1,117 @@
+#if !defined(_TRACE_KVMMMU_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_KVMMMU_H
+
+#include <linux/tracepoint.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM kvmmmu
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE mmutrace
+
+#define kvm_mmu_trace_pferr_flags \
+ { PFERR_PRESENT_MASK, "P" }, \
+ { PFERR_WRITE_MASK, "W" }, \
+ { PFERR_USER_MASK, "U" }, \
+ { PFERR_RSVD_MASK, "RSVD" }, \
+ { PFERR_FETCH_MASK, "F" }
+
+/*
+ * A pagetable walk has started
+ */
+TRACE_EVENT(
+ kvm_mmu_pagetable_walk,
+ TP_PROTO(u64 addr, int write_fault, int user_fault, int fetch_fault),
+ TP_ARGS(addr, write_fault, user_fault, fetch_fault),
+
+ TP_STRUCT__entry(
+ __field(__u64, addr)
+ __field(__u32, pferr)
+ ),
+
+ TP_fast_assign(
+ __entry->addr = addr;
+ __entry->pferr = (!!write_fault << 1) | (!!user_fault << 2)
+ | (!!fetch_fault << 4);
+ ),
+
+ TP_printk("addr %llx pferr %x %s", __entry->addr, __entry->pferr,
+ __print_flags(__entry->pferr, "|", kvm_mmu_trace_pferr_flags))
+);
+
+
+/* We just walked a paging element */
+TRACE_EVENT(
+ kvm_mmu_paging_element,
+ TP_PROTO(u64 pte, int level),
+ TP_ARGS(pte, level),
+
+ TP_STRUCT__entry(
+ __field(__u64, pte)
+ __field(__u32, level)
+ ),
+
+ TP_fast_assign(
+ __entry->pte = pte;
+ __entry->level = level;
+ ),
+
+ TP_printk("pte %llx level %u", __entry->pte, __entry->level)
+);
+
+/* We set a pte accessed bit */
+TRACE_EVENT(
+ kvm_mmu_set_accessed_bit,
+ TP_PROTO(unsigned long table_gfn, unsigned index, unsigned size),
+ TP_ARGS(table_gfn, index, size),
+
+ TP_STRUCT__entry(
+ __field(__u64, gpa)
+ ),
+
+ TP_fast_assign(
+ __entry->gpa = ((u64)table_gfn << PAGE_SHIFT)
+ + index * size;
+ ),
+
+ TP_printk("gpa %llx", __entry->gpa)
+);
+
+/* We set a pte dirty bit */
+TRACE_EVENT(
+ kvm_mmu_set_dirty_bit,
+ TP_PROTO(unsigned long table_gfn, unsigned index, unsigned size),
+ TP_ARGS(table_gfn, index, size),
+
+ TP_STRUCT__entry(
+ __field(__u64, gpa)
+ ),
+
+ TP_fast_assign(
+ __entry->gpa = ((u64)table_gfn << PAGE_SHIFT)
+ + index * size;
+ ),
+
+ TP_printk("gpa %llx", __entry->gpa)
+);
+
+TRACE_EVENT(
+ kvm_mmu_walker_error,
+ TP_PROTO(u32 pferr),
+ TP_ARGS(pferr),
+
+ TP_STRUCT__entry(
+ __field(__u32, pferr)
+ ),
+
+ TP_fast_assign(
+ __entry->pferr = pferr;
+ ),
+
+ TP_printk("pferr %x %s", __entry->pferr,
+ __print_flags(__entry->pferr, "|", kvm_mmu_trace_pferr_flags))
+);
+
+#endif /* _TRACE_KVMMMU_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 53e129c..36ac6d7 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -125,13 +125,15 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
gpa_t pte_gpa;
int rsvd_fault = 0;
- pgprintk("%s: addr %lx\n", __func__, addr);
+ trace_kvm_mmu_pagetable_walk(addr, write_fault, user_fault,
+ fetch_fault);
walk:
walker->level = vcpu->arch.mmu.root_level;
pte = vcpu->arch.cr3;
#if PTTYPE == 64
if (!is_long_mode(vcpu)) {
pte = kvm_pdptr_read(vcpu, (addr >> 30) & 3);
+ trace_kvm_mmu_paging_element(pte, walker->level);
if (!is_present_gpte(pte))
goto not_present;
--walker->level;
@@ -150,10 +152,9 @@ walk:
pte_gpa += index * sizeof(pt_element_t);
walker->table_gfn[walker->level - 1] = table_gfn;
walker->pte_gpa[walker->level - 1] = pte_gpa;
- pgprintk("%s: table_gfn[%d] %lx\n", __func__,
- walker->level - 1, table_gfn);
kvm_read_guest(vcpu->kvm, pte_gpa, &pte, sizeof(pte));
+ trace_kvm_mmu_paging_element(pte, walker->level);
if (!is_present_gpte(pte))
goto not_present;
@@ -175,6 +176,8 @@ walk:
#endif
if (!(pte & PT_ACCESSED_MASK)) {
+ trace_kvm_mmu_set_accessed_bit(table_gfn, index,
+ sizeof(pte));
mark_page_dirty(vcpu->kvm, table_gfn);
if (FNAME(cmpxchg_gpte)(vcpu->kvm, table_gfn,
index, pte, pte|PT_ACCESSED_MASK))
@@ -208,6 +211,7 @@ walk:
if (write_fault && !is_dirty_gpte(pte)) {
bool ret;
+ trace_kvm_mmu_set_dirty_bit(table_gfn, index, sizeof(pte));
mark_page_dirty(vcpu->kvm, table_gfn);
ret = FNAME(cmpxchg_gpte)(vcpu->kvm, table_gfn, index, pte,
pte|PT_DIRTY_MASK);
@@ -239,6 +243,7 @@ err:
walker->error_code |= PFERR_FETCH_MASK;
if (rsvd_fault)
walker->error_code |= PFERR_RSVD_MASK;
+ trace_kvm_mmu_walker_error(walker->error_code);
return 0;
}
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 12/46] KVM: Document basic API
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (10 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 11/46] KVM: MMU: Trace guest pagetable walker Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 13/46] KVM: Trace shadow page lifecycle Avi Kivity
` (33 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
Document the basic API corresponding to the 2.6.22 release.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
Documentation/kvm/api.txt | 683 +++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 683 insertions(+), 0 deletions(-)
create mode 100644 Documentation/kvm/api.txt
diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt
new file mode 100644
index 0000000..1b1c22d
--- /dev/null
+++ b/Documentation/kvm/api.txt
@@ -0,0 +1,683 @@
+The Definitive KVM (Kernel-based Virtual Machine) API Documentation
+===================================================================
+
+1. General description
+
+The kvm API is a set of ioctls that are issued to control various aspects
+of a virtual machine. The ioctls belong to three classes
+
+ - System ioctls: These query and set global attributes which affect the
+ whole kvm subsystem. In addition a system ioctl is used to create
+ virtual machines
+
+ - VM ioctls: These query and set attributes that affect an entire virtual
+ machine, for example memory layout. In addition a VM ioctl is used to
+ create virtual cpus (vcpus).
+
+ Only run VM ioctls from the same process (address space) that was used
+ to create the VM.
+
+ - vcpu ioctls: These query and set attributes that control the operation
+ of a single virtual cpu.
+
+ Only run vcpu ioctls from the same thread that was used to create the
+ vcpu.
+
+2. File descritpors
+
+The kvm API is centered around file descriptors. An initial
+open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
+can be used to issue system ioctls. A KVM_CREATE_VM ioctl on this
+handle will create a VM file descripror which can be used to issue VM
+ioctls. A KVM_CREATE_VCPU ioctl on a VM fd will create a virtual cpu
+and return a file descriptor pointing to it. Finally, ioctls on a vcpu
+fd can be used to control the vcpu, including the important task of
+actually running guest code.
+
+In general file descriptors can be migrated among processes by means
+of fork() and the SCM_RIGHTS facility of unix domain socket. These
+kinds of tricks are explicitly not supported by kvm. While they will
+not cause harm to the host, their actual behavior is not guaranteed by
+the API. The only supported use is one virtual machine per process,
+and one vcpu per thread.
+
+3. Extensions
+
+As of Linux 2.6.22, the KVM ABI has been stabilized: no backward
+incompatible change are allowed. However, there is an extension
+facility that allows backward-compatible extensions to the API to be
+queried and used.
+
+The extension mechanism is not based on on the Linux version number.
+Instead, kvm defines extension identifiers and a facility to query
+whether a particular extension identifier is available. If it is, a
+set of ioctls is available for application use.
+
+4. API description
+
+This section describes ioctls that can be used to control kvm guests.
+For each ioctl, the following information is provided along with a
+description:
+
+ Capability: which KVM extension provides this ioctl. Can be 'basic',
+ which means that is will be provided by any kernel that supports
+ API version 12 (see section 4.1), or a KVM_CAP_xyz constant, which
+ means availability needs to be checked with KVM_CHECK_EXTENSION
+ (see section 4.4).
+
+ Architectures: which instruction set architectures provide this ioctl.
+ x86 includes both i386 and x86_64.
+
+ Type: system, vm, or vcpu.
+
+ Parameters: what parameters are accepted by the ioctl.
+
+ Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
+ are not detailed, but errors with specific meanings are.
+
+4.1 KVM_GET_API_VERSION
+
+Capability: basic
+Architectures: all
+Type: system ioctl
+Parameters: none
+Returns: the constant KVM_API_VERSION (=12)
+
+This identifies the API version as the stable kvm API. It is not
+expected that this number will change. However, Linux 2.6.20 and
+2.6.21 report earlier versions; these are not documented and not
+supported. Applications should refuse to run if KVM_GET_API_VERSION
+returns a value other than 12. If this check passes, all ioctls
+described as 'basic' will be available.
+
+4.2 KVM_CREATE_VM
+
+Capability: basic
+Architectures: all
+Type: system ioctl
+Parameters: none
+Returns: a VM fd that can be used to control the new virtual machine.
+
+The new VM has no virtual cpus and no memory. An mmap() of a VM fd
+will access the virtual machine's physical address space; offset zero
+corresponds to guest physical address zero. Use of mmap() on a VM fd
+is discouraged if userspace memory allocation (KVM_CAP_USER_MEMORY) is
+available.
+
+4.3 KVM_GET_MSR_INDEX_LIST
+
+Capability: basic
+Architectures: x86
+Type: system
+Parameters: struct kvm_msr_list (in/out)
+Returns: 0 on success; -1 on error
+Errors:
+ E2BIG: the msr index list is to be to fit in the array specified by
+ the user.
+
+struct kvm_msr_list {
+ __u32 nmsrs; /* number of msrs in entries */
+ __u32 indices[0];
+};
+
+This ioctl returns the guest msrs that are supported. The list varies
+by kvm version and host processor, but does not change otherwise. The
+user fills in the size of the indices array in nmsrs, and in return
+kvm adjusts nmsrs to reflect the actual number of msrs and fills in
+the indices array with their numbers.
+
+4.4 KVM_CHECK_EXTENSION
+
+Capability: basic
+Architectures: all
+Type: system ioctl
+Parameters: extension identifier (KVM_CAP_*)
+Returns: 0 if unsupported; 1 (or some other positive integer) if supported
+
+The API allows the application to query about extensions to the core
+kvm API. Userspace passes an extension identifier (an integer) and
+receives an integer that describes the extension availability.
+Generally 0 means no and 1 means yes, but some extensions may report
+additional information in the integer return value.
+
+4.5 KVM_GET_VCPU_MMAP_SIZE
+
+Capability: basic
+Architectures: all
+Type: system ioctl
+Parameters: none
+Returns: size of vcpu mmap area, in bytes
+
+The KVM_RUN ioctl (cf.) communicates with userspace via a shared
+memory region. This ioctl returns the size of that region. See the
+KVM_RUN documentation for details.
+
+4.6 KVM_SET_MEMORY_REGION
+
+Capability: basic
+Architectures: all
+Type: vm ioctl
+Parameters: struct kvm_memory_region (in)
+Returns: 0 on success, -1 on error
+
+struct kvm_memory_region {
+ __u32 slot;
+ __u32 flags;
+ __u64 guest_phys_addr;
+ __u64 memory_size; /* bytes */
+};
+
+/* for kvm_memory_region::flags */
+#define KVM_MEM_LOG_DIRTY_PAGES 1UL
+
+This ioctl allows the user to create or modify a guest physical memory
+slot. When changing an existing slot, it may be moved in the guest
+physical memory space, or its flags may be modified. It may not be
+resized. Slots may not overlap.
+
+The flags field supports just one flag, KVM_MEM_LOG_DIRTY_PAGES, which
+instructs kvm to keep track of writes to memory within the slot. See
+the KVM_GET_DIRTY_LOG ioctl.
+
+It is recommended to use the KVM_SET_USER_MEMORY_REGION ioctl instead
+of this API, if available. This newer API allows placing guest memory
+at specified locations in the host address space, yielding better
+control and easy access.
+
+4.6 KVM_CREATE_VCPU
+
+Capability: basic
+Architectures: all
+Type: vm ioctl
+Parameters: vcpu id (apic id on x86)
+Returns: vcpu fd on success, -1 on error
+
+This API adds a vcpu to a virtual machine. The vcpu id is a small integer
+in the range [0, max_vcpus).
+
+4.7 KVM_GET_DIRTY_LOG (vm ioctl)
+
+Capability: basic
+Architectures: x86
+Type: vm ioctl
+Parameters: struct kvm_dirty_log (in/out)
+Returns: 0 on success, -1 on error
+
+/* for KVM_GET_DIRTY_LOG */
+struct kvm_dirty_log {
+ __u32 slot;
+ __u32 padding;
+ union {
+ void __user *dirty_bitmap; /* one bit per page */
+ __u64 padding;
+ };
+};
+
+Given a memory slot, return a bitmap containing any pages dirtied
+since the last call to this ioctl. Bit 0 is the first page in the
+memory slot. Ensure the entire structure is cleared to avoid padding
+issues.
+
+4.8 KVM_SET_MEMORY_ALIAS
+
+Capability: basic
+Architectures: x86
+Type: vm ioctl
+Parameters: struct kvm_memory_alias (in)
+Returns: 0 (success), -1 (error)
+
+struct kvm_memory_alias {
+ __u32 slot; /* this has a different namespace than memory slots */
+ __u32 flags;
+ __u64 guest_phys_addr;
+ __u64 memory_size;
+ __u64 target_phys_addr;
+};
+
+Defines a guest physical address space region as an alias to another
+region. Useful for aliased address, for example the VGA low memory
+window. Should not be used with userspace memory.
+
+4.9 KVM_RUN
+
+Capability: basic
+Architectures: all
+Type: vcpu ioctl
+Parameters: none
+Returns: 0 on success, -1 on error
+Errors:
+ EINTR: an unmasked signal is pending
+
+This ioctl is used to run a guest virtual cpu. While there are no
+explicit parameters, there is an implicit parameter block that can be
+obtained by mmap()ing the vcpu fd at offset 0, with the size given by
+KVM_GET_VCPU_MMAP_SIZE. The parameter block is formatted as a 'struct
+kvm_run' (see below).
+
+4.10 KVM_GET_REGS
+
+Capability: basic
+Architectures: all
+Type: vcpu ioctl
+Parameters: struct kvm_regs (out)
+Returns: 0 on success, -1 on error
+
+Reads the general purpose registers from the vcpu.
+
+/* x86 */
+struct kvm_regs {
+ /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
+ __u64 rax, rbx, rcx, rdx;
+ __u64 rsi, rdi, rsp, rbp;
+ __u64 r8, r9, r10, r11;
+ __u64 r12, r13, r14, r15;
+ __u64 rip, rflags;
+};
+
+4.11 KVM_SET_REGS
+
+Capability: basic
+Architectures: all
+Type: vcpu ioctl
+Parameters: struct kvm_regs (in)
+Returns: 0 on success, -1 on error
+
+Writes the general purpose registers into the vcpu.
+
+See KVM_GET_REGS for the data structure.
+
+4.12 KVM_GET_SREGS
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_sregs (out)
+Returns: 0 on success, -1 on error
+
+Reads special registers from the vcpu.
+
+/* x86 */
+struct kvm_sregs {
+ struct kvm_segment cs, ds, es, fs, gs, ss;
+ struct kvm_segment tr, ldt;
+ struct kvm_dtable gdt, idt;
+ __u64 cr0, cr2, cr3, cr4, cr8;
+ __u64 efer;
+ __u64 apic_base;
+ __u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
+};
+
+interrupt_bitmap is a bitmap of pending external interrupts. At most
+one bit may be set. This interrupt has been acknowledged by the APIC
+but not yet injected into the cpu core.
+
+4.13 KVM_SET_SREGS
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_sregs (in)
+Returns: 0 on success, -1 on error
+
+Writes special registers into the vcpu. See KVM_GET_SREGS for the
+data structures.
+
+4.14 KVM_TRANSLATE
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_translation (in/out)
+Returns: 0 on success, -1 on error
+
+Translates a virtual address according to the vcpu's current address
+translation mode.
+
+struct kvm_translation {
+ /* in */
+ __u64 linear_address;
+
+ /* out */
+ __u64 physical_address;
+ __u8 valid;
+ __u8 writeable;
+ __u8 usermode;
+ __u8 pad[5];
+};
+
+4.15 KVM_INTERRUPT
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_interrupt (in)
+Returns: 0 on success, -1 on error
+
+Queues a hardware interrupt vector to be injected. This is only
+useful if in-kernel local APIC is not used.
+
+/* for KVM_INTERRUPT */
+struct kvm_interrupt {
+ /* in */
+ __u32 irq;
+};
+
+Note 'irq' is an interrupt vector, not an interrupt pin or line.
+
+4.16 KVM_DEBUG_GUEST
+
+Capability: basic
+Architectures: none
+Type: vcpu ioctl
+Parameters: none)
+Returns: -1 on error
+
+Support for this has been removed. Use KVM_SET_GUEST_DEBUG instead.
+
+4.17 KVM_GET_MSRS
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_msrs (in/out)
+Returns: 0 on success, -1 on error
+
+Reads model-specific registers from the vcpu. Supported msr indices can
+be obtained using KVM_GET_MSR_INDEX_LIST.
+
+struct kvm_msrs {
+ __u32 nmsrs; /* number of msrs in entries */
+ __u32 pad;
+
+ struct kvm_msr_entry entries[0];
+};
+
+struct kvm_msr_entry {
+ __u32 index;
+ __u32 reserved;
+ __u64 data;
+};
+
+Application code should set the 'nmsrs' member (which indicates the
+size of the entries array) and the 'index' member of each array entry.
+kvm will fill in the 'data' member.
+
+4.18 KVM_SET_MSRS
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_msrs (in)
+Returns: 0 on success, -1 on error
+
+Writes model-specific registers to the vcpu. See KVM_GET_MSRS for the
+data structures.
+
+Application code should set the 'nmsrs' member (which indicates the
+size of the entries array), and the 'index' and 'data' members of each
+array entry.
+
+4.19 KVM_SET_CPUID
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_cpuid (in)
+Returns: 0 on success, -1 on error
+
+Defines the vcpu responses to the cpuid instruction. Applications
+should use the KVM_SET_CPUID2 ioctl if available.
+
+
+struct kvm_cpuid_entry {
+ __u32 function;
+ __u32 eax;
+ __u32 ebx;
+ __u32 ecx;
+ __u32 edx;
+ __u32 padding;
+};
+
+/* for KVM_SET_CPUID */
+struct kvm_cpuid {
+ __u32 nent;
+ __u32 padding;
+ struct kvm_cpuid_entry entries[0];
+};
+
+4.20 KVM_SET_SIGNAL_MASK
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_signal_mask (in)
+Returns: 0 on success, -1 on error
+
+Defines which signals are blocked during execution of KVM_RUN. This
+signal mask temporarily overrides the threads signal mask. Any
+unblocked signal received (except SIGKILL and SIGSTOP, which retain
+their traditional behaviour) will cause KVM_RUN to return with -EINTR.
+
+Note the signal will only be delivered if not blocked by the original
+signal mask.
+
+/* for KVM_SET_SIGNAL_MASK */
+struct kvm_signal_mask {
+ __u32 len;
+ __u8 sigset[0];
+};
+
+4.21 KVM_GET_FPU
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_fpu (out)
+Returns: 0 on success, -1 on error
+
+Reads the floating point state from the vcpu.
+
+/* for KVM_GET_FPU and KVM_SET_FPU */
+struct kvm_fpu {
+ __u8 fpr[8][16];
+ __u16 fcw;
+ __u16 fsw;
+ __u8 ftwx; /* in fxsave format */
+ __u8 pad1;
+ __u16 last_opcode;
+ __u64 last_ip;
+ __u64 last_dp;
+ __u8 xmm[16][16];
+ __u32 mxcsr;
+ __u32 pad2;
+};
+
+4.22 KVM_SET_FPU
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_fpu (in)
+Returns: 0 on success, -1 on error
+
+Writes the floating point state to the vcpu.
+
+/* for KVM_GET_FPU and KVM_SET_FPU */
+struct kvm_fpu {
+ __u8 fpr[8][16];
+ __u16 fcw;
+ __u16 fsw;
+ __u8 ftwx; /* in fxsave format */
+ __u8 pad1;
+ __u16 last_opcode;
+ __u64 last_ip;
+ __u64 last_dp;
+ __u8 xmm[16][16];
+ __u32 mxcsr;
+ __u32 pad2;
+};
+
+5. The kvm_run structure
+
+Application code obtains a pointer to the kvm_run structure by
+mmap()ing a vcpu fd. From that point, application code can control
+execution by changing fields in kvm_run prior to calling the KVM_RUN
+ioctl, and obtain information about the reason KVM_RUN returned by
+looking up structure members.
+
+struct kvm_run {
+ /* in */
+ __u8 request_interrupt_window;
+
+Request that KVM_RUN return when it becomes possible to inject external
+interrupts into the guest. Useful in conjunction with KVM_INTERRUPT.
+
+ __u8 padding1[7];
+
+ /* out */
+ __u32 exit_reason;
+
+When KVM_RUN has returned successfully (return value 0), this informs
+application code why KVM_RUN has returned. Allowable values for this
+field are detailed below.
+
+ __u8 ready_for_interrupt_injection;
+
+If request_interrupt_window has been specified, this field indicates
+an interrupt can be injected now with KVM_INTERRUPT.
+
+ __u8 if_flag;
+
+The value of the current interrupt flag. Only valid if in-kernel
+local APIC is not used.
+
+ __u8 padding2[2];
+
+ /* in (pre_kvm_run), out (post_kvm_run) */
+ __u64 cr8;
+
+The value of the cr8 register. Only valid if in-kernel local APIC is
+not used. Both input and output.
+
+ __u64 apic_base;
+
+The value of the APIC BASE msr. Only valid if in-kernel local
+APIC is not used. Both input and output.
+
+ union {
+ /* KVM_EXIT_UNKNOWN */
+ struct {
+ __u64 hardware_exit_reason;
+ } hw;
+
+If exit_reason is KVM_EXIT_UNKNOWN, the vcpu has exited due to unknown
+reasons. Further architecture-specific information is available in
+hardware_exit_reason.
+
+ /* KVM_EXIT_FAIL_ENTRY */
+ struct {
+ __u64 hardware_entry_failure_reason;
+ } fail_entry;
+
+If exit_reason is KVM_EXIT_FAIL_ENTRY, the vcpu could not be run due
+to unknown reasons. Further architecture-specific information is
+available in hardware_entry_failure_reason.
+
+ /* KVM_EXIT_EXCEPTION */
+ struct {
+ __u32 exception;
+ __u32 error_code;
+ } ex;
+
+Unused.
+
+ /* KVM_EXIT_IO */
+ struct {
+#define KVM_EXIT_IO_IN 0
+#define KVM_EXIT_IO_OUT 1
+ __u8 direction;
+ __u8 size; /* bytes */
+ __u16 port;
+ __u32 count;
+ __u64 data_offset; /* relative to kvm_run start */
+ } io;
+
+If exit_reason is KVM_EXIT_IO_IN or KVM_EXIT_IO_OUT, then the vcpu has
+executed a port I/O instruction which could not be satisfied by kvm.
+data_offset describes where the data is located (KVM_EXIT_IO_OUT) or
+where kvm expects application code to place the data for the next
+KVM_RUN invocation (KVM_EXIT_IO_IN). Data format is a patcked array.
+
+ struct {
+ struct kvm_debug_exit_arch arch;
+ } debug;
+
+Unused.
+
+ /* KVM_EXIT_MMIO */
+ struct {
+ __u64 phys_addr;
+ __u8 data[8];
+ __u32 len;
+ __u8 is_write;
+ } mmio;
+
+If exit_reason is KVM_EXIT_MMIO or KVM_EXIT_IO_OUT, then the vcpu has
+executed a memory-mapped I/O instruction which could not be satisfied
+by kvm. The 'data' member contains the written data if 'is_write' is
+true, and should be filled by application code otherwise.
+
+ /* KVM_EXIT_HYPERCALL */
+ struct {
+ __u64 nr;
+ __u64 args[6];
+ __u64 ret;
+ __u32 longmode;
+ __u32 pad;
+ } hypercall;
+
+Unused.
+
+ /* KVM_EXIT_TPR_ACCESS */
+ struct {
+ __u64 rip;
+ __u32 is_write;
+ __u32 pad;
+ } tpr_access;
+
+To be documented (KVM_TPR_ACCESS_REPORTING).
+
+ /* KVM_EXIT_S390_SIEIC */
+ struct {
+ __u8 icptcode;
+ __u64 mask; /* psw upper half */
+ __u64 addr; /* psw lower half */
+ __u16 ipa;
+ __u32 ipb;
+ } s390_sieic;
+
+s390 specific.
+
+ /* KVM_EXIT_S390_RESET */
+#define KVM_S390_RESET_POR 1
+#define KVM_S390_RESET_CLEAR 2
+#define KVM_S390_RESET_SUBSYSTEM 4
+#define KVM_S390_RESET_CPU_INIT 8
+#define KVM_S390_RESET_IPL 16
+ __u64 s390_reset_flags;
+
+s390 specific.
+
+ /* KVM_EXIT_DCR */
+ struct {
+ __u32 dcrn;
+ __u32 data;
+ __u8 is_write;
+ } dcr;
+
+powerpc specific.
+
+ /* Fix the size of the union. */
+ char padding[256];
+ };
+};
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 13/46] KVM: Trace shadow page lifecycle
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (11 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 12/46] KVM: Document basic API Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 14/46] KVM: fix MMIO_CONF_BASE MSR access Avi Kivity
` (32 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
Create, sync, unsync, zap.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/mmu.c | 10 +++--
arch/x86/kvm/mmutrace.h | 103 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 109 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c0dda64..ac121b3 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1122,6 +1122,7 @@ static int kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
return 1;
}
+ trace_kvm_mmu_sync_page(sp);
if (rmap_write_protect(vcpu->kvm, sp->gfn))
kvm_flush_remote_tlbs(vcpu->kvm);
kvm_unlink_unsync_page(vcpu->kvm, sp);
@@ -1244,8 +1245,6 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
quadrant &= (1 << ((PT32_PT_BITS - PT64_PT_BITS) * level)) - 1;
role.quadrant = quadrant;
}
- pgprintk("%s: looking gfn %lx role %x\n", __func__,
- gfn, role.word);
index = kvm_page_table_hashfn(gfn);
bucket = &vcpu->kvm->arch.mmu_page_hash[index];
hlist_for_each_entry_safe(sp, node, tmp, bucket, hash_link)
@@ -1262,14 +1261,13 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
set_bit(KVM_REQ_MMU_SYNC, &vcpu->requests);
kvm_mmu_mark_parents_unsync(vcpu, sp);
}
- pgprintk("%s: found\n", __func__);
+ trace_kvm_mmu_get_page(sp, false);
return sp;
}
++vcpu->kvm->stat.mmu_cache_miss;
sp = kvm_mmu_alloc_page(vcpu, parent_pte);
if (!sp)
return sp;
- pgprintk("%s: adding gfn %lx role %x\n", __func__, gfn, role.word);
sp->gfn = gfn;
sp->role = role;
hlist_add_head(&sp->hash_link, bucket);
@@ -1282,6 +1280,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
vcpu->arch.mmu.prefetch_page(vcpu, sp);
else
nonpaging_prefetch_page(vcpu, sp);
+ trace_kvm_mmu_get_page(sp, true);
return sp;
}
@@ -1410,6 +1409,8 @@ static int mmu_zap_unsync_children(struct kvm *kvm,
static int kvm_mmu_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp)
{
int ret;
+
+ trace_kvm_mmu_zap_page(sp);
++kvm->stat.mmu_shadow_zapped;
ret = mmu_zap_unsync_children(kvm, sp);
kvm_mmu_page_unlink_children(kvm, sp);
@@ -1656,6 +1657,7 @@ static int kvm_unsync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
struct kvm_mmu_page *s;
struct hlist_node *node, *n;
+ trace_kvm_mmu_unsync_page(sp);
index = kvm_page_table_hashfn(sp->gfn);
bucket = &vcpu->kvm->arch.mmu_page_hash[index];
/* don't unsync if pagetable is shadowed with multiple roles */
diff --git a/arch/x86/kvm/mmutrace.h b/arch/x86/kvm/mmutrace.h
index 1367f82..3e4a5c6 100644
--- a/arch/x86/kvm/mmutrace.h
+++ b/arch/x86/kvm/mmutrace.h
@@ -2,12 +2,48 @@
#define _TRACE_KVMMMU_H
#include <linux/tracepoint.h>
+#include <linux/ftrace_event.h>
#undef TRACE_SYSTEM
#define TRACE_SYSTEM kvmmmu
#define TRACE_INCLUDE_PATH .
#define TRACE_INCLUDE_FILE mmutrace
+#define KVM_MMU_PAGE_FIELDS \
+ __field(__u64, gfn) \
+ __field(__u32, role) \
+ __field(__u32, root_count) \
+ __field(__u32, unsync)
+
+#define KVM_MMU_PAGE_ASSIGN(sp) \
+ __entry->gfn = sp->gfn; \
+ __entry->role = sp->role.word; \
+ __entry->root_count = sp->root_count; \
+ __entry->unsync = sp->unsync;
+
+#define KVM_MMU_PAGE_PRINTK() ({ \
+ const char *ret = p->buffer + p->len; \
+ static const char *access_str[] = { \
+ "---", "--x", "w--", "w-x", "-u-", "-ux", "wu-", "wux" \
+ }; \
+ union kvm_mmu_page_role role; \
+ \
+ role.word = __entry->role; \
+ \
+ trace_seq_printf(p, "sp gfn %llx %u/%u q%u%s %s%s %spge" \
+ " %snxe root %u %s%c", \
+ __entry->gfn, role.level, role.glevels, \
+ role.quadrant, \
+ role.direct ? " direct" : "", \
+ access_str[role.access], \
+ role.invalid ? " invalid" : "", \
+ role.cr4_pge ? "" : "!", \
+ role.nxe ? "" : "!", \
+ __entry->root_count, \
+ __entry->unsync ? "unsync" : "sync", 0); \
+ ret; \
+ })
+
#define kvm_mmu_trace_pferr_flags \
{ PFERR_PRESENT_MASK, "P" }, \
{ PFERR_WRITE_MASK, "W" }, \
@@ -111,6 +147,73 @@ TRACE_EVENT(
__print_flags(__entry->pferr, "|", kvm_mmu_trace_pferr_flags))
);
+TRACE_EVENT(
+ kvm_mmu_get_page,
+ TP_PROTO(struct kvm_mmu_page *sp, bool created),
+ TP_ARGS(sp, created),
+
+ TP_STRUCT__entry(
+ KVM_MMU_PAGE_FIELDS
+ __field(bool, created)
+ ),
+
+ TP_fast_assign(
+ KVM_MMU_PAGE_ASSIGN(sp)
+ __entry->created = created;
+ ),
+
+ TP_printk("%s %s", KVM_MMU_PAGE_PRINTK(),
+ __entry->created ? "new" : "existing")
+);
+
+TRACE_EVENT(
+ kvm_mmu_sync_page,
+ TP_PROTO(struct kvm_mmu_page *sp),
+ TP_ARGS(sp),
+
+ TP_STRUCT__entry(
+ KVM_MMU_PAGE_FIELDS
+ ),
+
+ TP_fast_assign(
+ KVM_MMU_PAGE_ASSIGN(sp)
+ ),
+
+ TP_printk("%s", KVM_MMU_PAGE_PRINTK())
+);
+
+TRACE_EVENT(
+ kvm_mmu_unsync_page,
+ TP_PROTO(struct kvm_mmu_page *sp),
+ TP_ARGS(sp),
+
+ TP_STRUCT__entry(
+ KVM_MMU_PAGE_FIELDS
+ ),
+
+ TP_fast_assign(
+ KVM_MMU_PAGE_ASSIGN(sp)
+ ),
+
+ TP_printk("%s", KVM_MMU_PAGE_PRINTK())
+);
+
+TRACE_EVENT(
+ kvm_mmu_zap_page,
+ TP_PROTO(struct kvm_mmu_page *sp),
+ TP_ARGS(sp),
+
+ TP_STRUCT__entry(
+ KVM_MMU_PAGE_FIELDS
+ ),
+
+ TP_fast_assign(
+ KVM_MMU_PAGE_ASSIGN(sp)
+ ),
+
+ TP_printk("%s", KVM_MMU_PAGE_PRINTK())
+);
+
#endif /* _TRACE_KVMMMU_H */
/* This part must be outside protection */
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 14/46] KVM: fix MMIO_CONF_BASE MSR access
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (12 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 13/46] KVM: Trace shadow page lifecycle Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 15/46] KVM: ignore msi request if !level Avi Kivity
` (31 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Andre Przywara <andre.przywara@amd.com>
Some Windows versions check whether the BIOS has setup MMI/O for
config space accesses on AMD Fam10h CPUs, we say "no" by returning 0 on
reads and only allow disabling of MMI/O CfgSpace setup by igoring "0" writes.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/x86.c | 8 ++++++++
1 files changed, 8 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0e74d98..95fa45c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -844,6 +844,13 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
return 1;
}
break;
+ case MSR_FAM10H_MMIO_CONF_BASE:
+ if (data != 0) {
+ pr_unimpl(vcpu, "unimplemented MMIO_CONF_BASE wrmsr: "
+ "0x%llx\n", data);
+ return 1;
+ }
+ break;
case MSR_AMD64_NB_CFG:
break;
case MSR_IA32_DEBUGCTLMSR:
@@ -1055,6 +1062,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
case MSR_K7_EVNTSEL0:
case MSR_K8_INT_PENDING_MSG:
case MSR_AMD64_NB_CFG:
+ case MSR_FAM10H_MMIO_CONF_BASE:
data = 0;
break;
case MSR_MTRRcap:
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 15/46] KVM: ignore msi request if !level
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (13 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 14/46] KVM: fix MMIO_CONF_BASE MSR access Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 16/46] KVM: Add trace points in irqchip code Avi Kivity
` (30 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Michael S. Tsirkin <mst@redhat.com>
Irqfd sets level for interrupt to 1 and then to 0.
For MSI, check level so that a single message is sent.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
virt/kvm/irq_comm.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 56e6961..a9d7fd1 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -139,7 +139,9 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, int irq, int level)
else
clear_bit(irq_source_id, irq_state);
sig_level = !!(*irq_state);
- } else /* Deal with MSI/MSI-X */
+ } else if (!level)
+ return ret;
+ else /* Deal with MSI/MSI-X */
sig_level = 1;
/* Not possible to detect if the guest uses the PIC or the
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 16/46] KVM: Add trace points in irqchip code
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (14 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 15/46] KVM: ignore msi request if !level Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 17/46] KVM: No need to kick cpu if not in a guest mode Avi Kivity
` (29 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gleb Natapov <gleb@redhat.com>
Add tracepoint in msi/ioapic/pic set_irq() functions,
in IPI sending and in the point where IRQ is placed into
apic's IRR.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/i8259.c | 3 ++
arch/x86/kvm/lapic.c | 4 ++
arch/x86/kvm/trace.h | 85 ++++++++++++++++++++++++++++++++++++++++++++
include/trace/events/kvm.h | 56 +++++++++++++++++++++++++++++
virt/kvm/ioapic.c | 2 +
virt/kvm/irq_comm.c | 2 +
6 files changed, 152 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index 1d1bb75..e4bcbdd 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -30,6 +30,7 @@
#include "irq.h"
#include <linux/kvm_host.h>
+#include "trace.h"
static void pic_lock(struct kvm_pic *s)
__acquires(&s->lock)
@@ -190,6 +191,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)
if (irq >= 0 && irq < PIC_NUM_PINS) {
ret = pic_set_irq1(&s->pics[irq >> 3], irq & 7, level);
pic_update_irq(s);
+ trace_kvm_pic_set_irq(irq >> 3, irq & 7, s->pics[irq >> 3].elcr,
+ s->pics[irq >> 3].imr, ret == 0);
}
pic_unlock(s);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 31f1d1c..3b03ebd 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -375,6 +375,8 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
break;
result = !apic_test_and_set_irr(vector, apic);
+ trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
+ trig_mode, vector, result);
if (!result) {
if (trig_mode)
apic_debug("level trig mode repeatedly for "
@@ -493,6 +495,8 @@ static void apic_send_ipi(struct kvm_lapic *apic)
else
irq.dest_id = GET_APIC_DEST_FIELD(icr_high);
+ trace_kvm_apic_ipi(icr_low, irq.dest_id);
+
apic_debug("icr_high 0x%x, icr_low 0x%x, "
"short_hand 0x%x, dest 0x%x, trig_mode 0x%x, level 0x%x, "
"dest_mode 0x%x, delivery_mode 0x%x, vector 0x%x\n",
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 6c2c87f..0d480e7 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -264,6 +264,91 @@ TRACE_EVENT(kvm_cr,
#define trace_kvm_cr_read(cr, val) trace_kvm_cr(0, cr, val)
#define trace_kvm_cr_write(cr, val) trace_kvm_cr(1, cr, val)
+TRACE_EVENT(kvm_pic_set_irq,
+ TP_PROTO(__u8 chip, __u8 pin, __u8 elcr, __u8 imr, bool coalesced),
+ TP_ARGS(chip, pin, elcr, imr, coalesced),
+
+ TP_STRUCT__entry(
+ __field( __u8, chip )
+ __field( __u8, pin )
+ __field( __u8, elcr )
+ __field( __u8, imr )
+ __field( bool, coalesced )
+ ),
+
+ TP_fast_assign(
+ __entry->chip = chip;
+ __entry->pin = pin;
+ __entry->elcr = elcr;
+ __entry->imr = imr;
+ __entry->coalesced = coalesced;
+ ),
+
+ TP_printk("chip %u pin %u (%s%s)%s",
+ __entry->chip, __entry->pin,
+ (__entry->elcr & (1 << __entry->pin)) ? "level":"edge",
+ (__entry->imr & (1 << __entry->pin)) ? "|masked":"",
+ __entry->coalesced ? " (coalesced)" : "")
+);
+
+#define kvm_apic_dst_shorthand \
+ {0x0, "dst"}, \
+ {0x1, "self"}, \
+ {0x2, "all"}, \
+ {0x3, "all-but-self"}
+
+TRACE_EVENT(kvm_apic_ipi,
+ TP_PROTO(__u32 icr_low, __u32 dest_id),
+ TP_ARGS(icr_low, dest_id),
+
+ TP_STRUCT__entry(
+ __field( __u32, icr_low )
+ __field( __u32, dest_id )
+ ),
+
+ TP_fast_assign(
+ __entry->icr_low = icr_low;
+ __entry->dest_id = dest_id;
+ ),
+
+ TP_printk("dst %x vec %u (%s|%s|%s|%s|%s)",
+ __entry->dest_id, (u8)__entry->icr_low,
+ __print_symbolic((__entry->icr_low >> 8 & 0x7),
+ kvm_deliver_mode),
+ (__entry->icr_low & (1<<11)) ? "logical" : "physical",
+ (__entry->icr_low & (1<<14)) ? "assert" : "de-assert",
+ (__entry->icr_low & (1<<15)) ? "level" : "edge",
+ __print_symbolic((__entry->icr_low >> 18 & 0x3),
+ kvm_apic_dst_shorthand))
+);
+
+TRACE_EVENT(kvm_apic_accept_irq,
+ TP_PROTO(__u32 apicid, __u16 dm, __u8 tm, __u8 vec, bool coalesced),
+ TP_ARGS(apicid, dm, tm, vec, coalesced),
+
+ TP_STRUCT__entry(
+ __field( __u32, apicid )
+ __field( __u16, dm )
+ __field( __u8, tm )
+ __field( __u8, vec )
+ __field( bool, coalesced )
+ ),
+
+ TP_fast_assign(
+ __entry->apicid = apicid;
+ __entry->dm = dm;
+ __entry->tm = tm;
+ __entry->vec = vec;
+ __entry->coalesced = coalesced;
+ ),
+
+ TP_printk("apicid %x vec %u (%s|%s)%s",
+ __entry->apicid, __entry->vec,
+ __print_symbolic((__entry->dm >> 8 & 0x7), kvm_deliver_mode),
+ __entry->tm ? "level" : "edge",
+ __entry->coalesced ? " (coalesced)" : "")
+);
+
#endif /* _TRACE_KVM_H */
/* This part must be outside protection */
diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index 77022af..dbe1084 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -28,6 +28,62 @@ TRACE_EVENT(kvm_set_irq,
__entry->gsi, __entry->level, __entry->irq_source_id)
);
+#define kvm_deliver_mode \
+ {0x0, "Fixed"}, \
+ {0x1, "LowPrio"}, \
+ {0x2, "SMI"}, \
+ {0x3, "Res3"}, \
+ {0x4, "NMI"}, \
+ {0x5, "INIT"}, \
+ {0x6, "SIPI"}, \
+ {0x7, "ExtINT"}
+
+TRACE_EVENT(kvm_ioapic_set_irq,
+ TP_PROTO(__u64 e, int pin, bool coalesced),
+ TP_ARGS(e, pin, coalesced),
+
+ TP_STRUCT__entry(
+ __field( __u64, e )
+ __field( int, pin )
+ __field( bool, coalesced )
+ ),
+
+ TP_fast_assign(
+ __entry->e = e;
+ __entry->pin = pin;
+ __entry->coalesced = coalesced;
+ ),
+
+ TP_printk("pin %u dst %x vec=%u (%s|%s|%s%s)%s",
+ __entry->pin, (u8)(__entry->e >> 56), (u8)__entry->e,
+ __print_symbolic((__entry->e >> 8 & 0x7), kvm_deliver_mode),
+ (__entry->e & (1<<11)) ? "logical" : "physical",
+ (__entry->e & (1<<15)) ? "level" : "edge",
+ (__entry->e & (1<<16)) ? "|masked" : "",
+ __entry->coalesced ? " (coalesced)" : "")
+);
+
+TRACE_EVENT(kvm_msi_set_irq,
+ TP_PROTO(__u64 address, __u64 data),
+ TP_ARGS(address, data),
+
+ TP_STRUCT__entry(
+ __field( __u64, address )
+ __field( __u64, data )
+ ),
+
+ TP_fast_assign(
+ __entry->address = address;
+ __entry->data = data;
+ ),
+
+ TP_printk("dst %u vec %x (%s|%s|%s%s)",
+ (u8)(__entry->address >> 12), (u8)__entry->data,
+ __print_symbolic((__entry->data >> 8 & 0x7), kvm_deliver_mode),
+ (__entry->address & (1<<2)) ? "logical" : "physical",
+ (__entry->data & (1<<15)) ? "level" : "edge",
+ (__entry->address & (1<<3)) ? "|rh" : "")
+);
#define kvm_irqchips \
{KVM_IRQCHIP_PIC_MASTER, "PIC master"}, \
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 92496ff..b91fbb2 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -36,6 +36,7 @@
#include <asm/processor.h>
#include <asm/page.h>
#include <asm/current.h>
+#include <trace/events/kvm.h>
#include "ioapic.h"
#include "lapic.h"
@@ -193,6 +194,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
(!edge && !entry.fields.remote_irr))
ret = ioapic_service(ioapic, irq);
}
+ trace_kvm_ioapic_set_irq(entry.bits, irq, ret == 0);
}
return ret;
}
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index a9d7fd1..001663f 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -100,6 +100,8 @@ static int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
{
struct kvm_lapic_irq irq;
+ trace_kvm_msi_set_irq(e->msi.address_lo, e->msi.data);
+
irq.dest_id = (e->msi.address_lo &
MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
irq.vector = (e->msi.data &
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 17/46] KVM: No need to kick cpu if not in a guest mode
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (15 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 16/46] KVM: Add trace points in irqchip code Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 18/46] KVM: Always report x2apic as supported feature Avi Kivity
` (28 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gleb Natapov <gleb@redhat.com>
This will save a couple of IPIs.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/x86.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 95fa45c..e3d9040 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3506,6 +3506,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
smp_mb__after_clear_bit();
if (vcpu->requests || need_resched() || signal_pending(current)) {
+ set_bit(KVM_REQ_KICK, &vcpu->requests);
local_irq_enable();
preempt_enable();
r = 1;
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 18/46] KVM: Always report x2apic as supported feature
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (16 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 17/46] KVM: No need to kick cpu if not in a guest mode Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 19/46] KVM: PIT support for HPET legacy mode Avi Kivity
` (27 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gleb Natapov <gleb@redhat.com>
We emulate x2apic in software, so host support is not required.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/x86.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e3d9040..dfb0e37 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1504,6 +1504,9 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
case 1:
entry->edx &= kvm_supported_word0_x86_features;
entry->ecx &= kvm_supported_word4_x86_features;
+ /* we support x2apic emulation even if host does not support
+ * it since we emulate x2apic in software */
+ entry->ecx |= F(X2APIC);
break;
/* function 2 entries are STATEFUL. That is, repeated cpuid commands
* may return different values. This forces us to get_cpu() before
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 19/46] KVM: PIT support for HPET legacy mode
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (17 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 18/46] KVM: Always report x2apic as supported feature Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 20/46] KVM: add module parameters documentation Avi Kivity
` (26 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Beth Kon <eak@us.ibm.com>
When kvm is in hpet_legacy_mode, the hpet is providing the timer
interrupt and the pit should not be. So in legacy mode, the pit timer
is destroyed, but the *state* of the pit is maintained. So if kvm or
the guest tries to modify the state of the pit, this modification is
accepted, *except* that the timer isn't actually started. When we exit
hpet_legacy_mode, the current state of the pit (which is up to date
since we've been accepting modifications) is used to restart the pit
timer.
The saved_mode code in kvm_pit_load_count temporarily changes mode to
0xff in order to destroy the timer, but then restores the actual
value, again maintaining "current" state of the pit for possible later
reenablement.
[avi: add some reserved storage in the ioctl; make SET_PIT2 IOW]
Signed-off-by: Beth Kon <eak@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/include/asm/kvm.h | 9 +++++++
arch/x86/kvm/i8254.c | 22 ++++++++++++++---
arch/x86/kvm/i8254.h | 3 +-
arch/x86/kvm/x86.c | 55 +++++++++++++++++++++++++++++++++++++++++++-
include/linux/kvm.h | 6 ++++
5 files changed, 89 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
index 708b9c3..4a5fe91 100644
--- a/arch/x86/include/asm/kvm.h
+++ b/arch/x86/include/asm/kvm.h
@@ -18,6 +18,7 @@
#define __KVM_HAVE_GUEST_DEBUG
#define __KVM_HAVE_MSIX
#define __KVM_HAVE_MCE
+#define __KVM_HAVE_PIT_STATE2
/* Architectural interrupt line count. */
#define KVM_NR_INTERRUPTS 256
@@ -237,6 +238,14 @@ struct kvm_pit_state {
struct kvm_pit_channel_state channels[3];
};
+#define KVM_PIT_FLAGS_HPET_LEGACY 0x00000001
+
+struct kvm_pit_state2 {
+ struct kvm_pit_channel_state channels[3];
+ __u32 flags;
+ __u32 reserved[9];
+};
+
struct kvm_reinject_control {
__u8 pit_reinject;
__u8 reserved[31];
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 8c3ac30..9b62c57 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -332,20 +332,33 @@ static void pit_load_count(struct kvm *kvm, int channel, u32 val)
case 1:
/* FIXME: enhance mode 4 precision */
case 4:
- create_pit_timer(ps, val, 0);
+ if (!(ps->flags & KVM_PIT_FLAGS_HPET_LEGACY)) {
+ create_pit_timer(ps, val, 0);
+ }
break;
case 2:
case 3:
- create_pit_timer(ps, val, 1);
+ if (!(ps->flags & KVM_PIT_FLAGS_HPET_LEGACY)){
+ create_pit_timer(ps, val, 1);
+ }
break;
default:
destroy_pit_timer(&ps->pit_timer);
}
}
-void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val)
+void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val, int hpet_legacy_start)
{
- pit_load_count(kvm, channel, val);
+ u8 saved_mode;
+ if (hpet_legacy_start) {
+ /* save existing mode for later reenablement */
+ saved_mode = kvm->arch.vpit->pit_state.channels[0].mode;
+ kvm->arch.vpit->pit_state.channels[0].mode = 0xff; /* disable timer */
+ pit_load_count(kvm, channel, val);
+ kvm->arch.vpit->pit_state.channels[0].mode = saved_mode;
+ } else {
+ pit_load_count(kvm, channel, val);
+ }
}
static inline struct kvm_pit *dev_to_pit(struct kvm_io_device *dev)
@@ -554,6 +567,7 @@ void kvm_pit_reset(struct kvm_pit *pit)
struct kvm_kpit_channel_state *c;
mutex_lock(&pit->pit_state.lock);
+ pit->pit_state.flags = 0;
for (i = 0; i < 3; i++) {
c = &pit->pit_state.channels[i];
c->mode = 0xff;
diff --git a/arch/x86/kvm/i8254.h b/arch/x86/kvm/i8254.h
index b267018..d4c1c7f 100644
--- a/arch/x86/kvm/i8254.h
+++ b/arch/x86/kvm/i8254.h
@@ -21,6 +21,7 @@ struct kvm_kpit_channel_state {
struct kvm_kpit_state {
struct kvm_kpit_channel_state channels[3];
+ u32 flags;
struct kvm_timer pit_timer;
bool is_periodic;
u32 speaker_data_on;
@@ -49,7 +50,7 @@ struct kvm_pit {
#define KVM_PIT_CHANNEL_MASK 0x3
void kvm_inject_pit_timer_irqs(struct kvm_vcpu *vcpu);
-void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val);
+void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val, int hpet_legacy_start);
struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags);
void kvm_free_pit(struct kvm *kvm);
void kvm_pit_reset(struct kvm_pit *pit);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index dfb0e37..1fd67dd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1213,6 +1213,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_ASSIGN_DEV_IRQ:
case KVM_CAP_IRQFD:
case KVM_CAP_PIT2:
+ case KVM_CAP_PIT_STATE2:
r = 1;
break;
case KVM_CAP_COALESCED_MMIO:
@@ -2078,7 +2079,32 @@ static int kvm_vm_ioctl_set_pit(struct kvm *kvm, struct kvm_pit_state *ps)
mutex_lock(&kvm->arch.vpit->pit_state.lock);
memcpy(&kvm->arch.vpit->pit_state, ps, sizeof(struct kvm_pit_state));
- kvm_pit_load_count(kvm, 0, ps->channels[0].count);
+ kvm_pit_load_count(kvm, 0, ps->channels[0].count, 0);
+ mutex_unlock(&kvm->arch.vpit->pit_state.lock);
+ return r;
+}
+
+static int kvm_vm_ioctl_get_pit2(struct kvm *kvm, struct kvm_pit_state2 *ps)
+{
+ int r = 0;
+
+ mutex_lock(&kvm->arch.vpit->pit_state.lock);
+ memcpy(ps, &kvm->arch.vpit->pit_state, sizeof(struct kvm_pit_state2));
+ mutex_unlock(&kvm->arch.vpit->pit_state.lock);
+ return r;
+}
+
+static int kvm_vm_ioctl_set_pit2(struct kvm *kvm, struct kvm_pit_state2 *ps)
+{
+ int r = 0, start = 0;
+ u32 prev_legacy, cur_legacy;
+ mutex_lock(&kvm->arch.vpit->pit_state.lock);
+ prev_legacy = kvm->arch.vpit->pit_state.flags & KVM_PIT_FLAGS_HPET_LEGACY;
+ cur_legacy = ps->flags & KVM_PIT_FLAGS_HPET_LEGACY;
+ if (!prev_legacy && cur_legacy)
+ start = 1;
+ memcpy(&kvm->arch.vpit->pit_state, ps, sizeof(struct kvm_pit_state2));
+ kvm_pit_load_count(kvm, 0, kvm->arch.vpit->pit_state.channels[0].count, start);
mutex_unlock(&kvm->arch.vpit->pit_state.lock);
return r;
}
@@ -2140,6 +2166,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
*/
union {
struct kvm_pit_state ps;
+ struct kvm_pit_state2 ps2;
struct kvm_memory_alias alias;
struct kvm_pit_config pit_config;
} u;
@@ -2322,6 +2349,32 @@ long kvm_arch_vm_ioctl(struct file *filp,
r = 0;
break;
}
+ case KVM_GET_PIT2: {
+ r = -ENXIO;
+ if (!kvm->arch.vpit)
+ goto out;
+ r = kvm_vm_ioctl_get_pit2(kvm, &u.ps2);
+ if (r)
+ goto out;
+ r = -EFAULT;
+ if (copy_to_user(argp, &u.ps2, sizeof(u.ps2)))
+ goto out;
+ r = 0;
+ break;
+ }
+ case KVM_SET_PIT2: {
+ r = -EFAULT;
+ if (copy_from_user(&u.ps2, argp, sizeof(u.ps2)))
+ goto out;
+ r = -ENXIO;
+ if (!kvm->arch.vpit)
+ goto out;
+ r = kvm_vm_ioctl_set_pit2(kvm, &u.ps2);
+ if (r)
+ goto out;
+ r = 0;
+ break;
+ }
case KVM_REINJECT_CONTROL: {
struct kvm_reinject_control control;
r = -EFAULT;
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 76c6408..a74a1fc 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -409,6 +409,9 @@ struct kvm_guest_debug {
#define KVM_CAP_PIT2 33
#endif
#define KVM_CAP_SET_BOOT_CPU_ID 34
+#ifdef __KVM_HAVE_PIT_STATE2
+#define KVM_CAP_PIT_STATE2 35
+#endif
#ifdef KVM_CAP_IRQ_ROUTING
@@ -586,6 +589,9 @@ struct kvm_debug_guest {
#define KVM_IA64_VCPU_GET_STACK _IOR(KVMIO, 0x9a, void *)
#define KVM_IA64_VCPU_SET_STACK _IOW(KVMIO, 0x9b, void *)
+#define KVM_GET_PIT2 _IOR(KVMIO, 0x9f, struct kvm_pit_state2)
+#define KVM_SET_PIT2 _IOW(KVMIO, 0xa0, struct kvm_pit_state2)
+
#define KVM_TRC_INJ_VIRQ (KVM_TRC_HANDLER + 0x02)
#define KVM_TRC_REDELIVER_EVT (KVM_TRC_HANDLER + 0x03)
#define KVM_TRC_PEND_INTR (KVM_TRC_HANDLER + 0x04)
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 20/46] KVM: add module parameters documentation
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (18 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 19/46] KVM: PIT support for HPET legacy mode Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 21/46] KVM: make io_bus interface more robust Avi Kivity
` (25 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
Documentation/kernel-parameters.txt | 39 +++++++++++++++++++++++++++++++++++
1 files changed, 39 insertions(+), 0 deletions(-)
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index dd1a6d4..5b7a706 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -57,6 +57,7 @@ parameter is applicable:
ISAPNP ISA PnP code is enabled.
ISDN Appropriate ISDN support is enabled.
JOY Appropriate joystick support is enabled.
+ KVM Kernel Virtual Machine support is enabled.
LIBATA Libata driver is enabled
LP Printer support is enabled.
LOOP Loopback device support is enabled.
@@ -1098,6 +1099,44 @@ and is between 256 and 4096 characters. It is defined in the file
kstack=N [X86] Print N words from the kernel stack
in oops dumps.
+ kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs.
+ Default is 0 (don't ignore, but inject #GP)
+
+ kvm.oos_shadow= [KVM] Disable out-of-sync shadow paging.
+ Default is 1 (enabled)
+
+ kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM.
+ Default is 0 (off)
+
+ kvm-amd.npt= [KVM,AMD] Disable nested paging (virtualized MMU)
+ for all guests.
+ Default is 1 (enabled) if in 64bit or 32bit-PAE mode
+
+ kvm-intel.bypass_guest_pf=
+ [KVM,Intel] Disables bypassing of guest page faults
+ on Intel chips. Default is 1 (enabled)
+
+ kvm-intel.ept= [KVM,Intel] Disable extended page tables
+ (virtualized MMU) support on capable Intel chips.
+ Default is 1 (enabled)
+
+ kvm-intel.emulate_invalid_guest_state=
+ [KVM,Intel] Enable emulation of invalid guest states
+ Default is 0 (disabled)
+
+ kvm-intel.flexpriority=
+ [KVM,Intel] Disable FlexPriority feature (TPR shadow).
+ Default is 1 (enabled)
+
+ kvm-intel.unrestricted_guest=
+ [KVM,Intel] Disable unrestricted guest feature
+ (virtualized real and unpaged mode) on capable
+ Intel chips. Default is 1 (enabled)
+
+ kvm-intel.vpid= [KVM,Intel] Disable Virtual Processor Identification
+ feature (tagged TLBs) on capable Intel chips.
+ Default is 1 (enabled)
+
l2cr= [PPC]
l3cr= [PPC]
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 21/46] KVM: make io_bus interface more robust
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (19 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 20/46] KVM: add module parameters documentation Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 22/46] KVM: add ioeventfd support Avi Kivity
` (24 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gregory Haskins <ghaskins@novell.com>
Today kvm_io_bus_regsiter_dev() returns void and will internally BUG_ON
if it fails. We want to create dynamic MMIO/PIO entries driven from
userspace later in the series, so we need to enhance the code to be more
robust with the following changes:
1) Add a return value to the registration function
2) Fix up all the callsites to check the return code, handle any
failures, and percolate the error up to the caller.
3) Add an unregister function that collapses holes in the array
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/i8254.c | 20 ++++++++++++++++++--
arch/x86/kvm/i8259.c | 9 ++++++++-
include/linux/kvm_host.h | 10 +++++++---
virt/kvm/coalesced_mmio.c | 8 ++++++--
virt/kvm/ioapic.c | 8 ++++++--
virt/kvm/kvm_main.c | 39 ++++++++++++++++++++++++++++++++++-----
6 files changed, 79 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 9b62c57..137e548 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -605,6 +605,7 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
{
struct kvm_pit *pit;
struct kvm_kpit_state *pit_state;
+ int ret;
pit = kzalloc(sizeof(struct kvm_pit), GFP_KERNEL);
if (!pit)
@@ -639,14 +640,29 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
kvm_register_irq_mask_notifier(kvm, 0, &pit->mask_notifier);
kvm_iodevice_init(&pit->dev, &pit_dev_ops);
- __kvm_io_bus_register_dev(&kvm->pio_bus, &pit->dev);
+ ret = __kvm_io_bus_register_dev(&kvm->pio_bus, &pit->dev);
+ if (ret < 0)
+ goto fail;
if (flags & KVM_PIT_SPEAKER_DUMMY) {
kvm_iodevice_init(&pit->speaker_dev, &speaker_dev_ops);
- __kvm_io_bus_register_dev(&kvm->pio_bus, &pit->speaker_dev);
+ ret = __kvm_io_bus_register_dev(&kvm->pio_bus,
+ &pit->speaker_dev);
+ if (ret < 0)
+ goto fail_unregister;
}
return pit;
+
+fail_unregister:
+ __kvm_io_bus_unregister_dev(&kvm->pio_bus, &pit->dev);
+
+fail:
+ if (pit->irq_source_id >= 0)
+ kvm_free_irq_source_id(kvm, pit->irq_source_id);
+
+ kfree(pit);
+ return NULL;
}
void kvm_free_pit(struct kvm *kvm)
diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index e4bcbdd..daf4606 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -539,6 +539,8 @@ static const struct kvm_io_device_ops picdev_ops = {
struct kvm_pic *kvm_create_pic(struct kvm *kvm)
{
struct kvm_pic *s;
+ int ret;
+
s = kzalloc(sizeof(struct kvm_pic), GFP_KERNEL);
if (!s)
return NULL;
@@ -555,6 +557,11 @@ struct kvm_pic *kvm_create_pic(struct kvm *kvm)
* Initialize PIO device
*/
kvm_iodevice_init(&s->dev, &picdev_ops);
- kvm_io_bus_register_dev(kvm, &kvm->pio_bus, &s->dev);
+ ret = kvm_io_bus_register_dev(kvm, &kvm->pio_bus, &s->dev);
+ if (ret < 0) {
+ kfree(s);
+ return NULL;
+ }
+
return s;
}
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 077e8bb..983b0bd 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -64,10 +64,14 @@ int kvm_io_bus_write(struct kvm_io_bus *bus, gpa_t addr, int len,
const void *val);
int kvm_io_bus_read(struct kvm_io_bus *bus, gpa_t addr, int len,
void *val);
-void __kvm_io_bus_register_dev(struct kvm_io_bus *bus,
+int __kvm_io_bus_register_dev(struct kvm_io_bus *bus,
+ struct kvm_io_device *dev);
+int kvm_io_bus_register_dev(struct kvm *kvm, struct kvm_io_bus *bus,
+ struct kvm_io_device *dev);
+void __kvm_io_bus_unregister_dev(struct kvm_io_bus *bus,
+ struct kvm_io_device *dev);
+void kvm_io_bus_unregister_dev(struct kvm *kvm, struct kvm_io_bus *bus,
struct kvm_io_device *dev);
-void kvm_io_bus_register_dev(struct kvm *kvm, struct kvm_io_bus *bus,
- struct kvm_io_device *dev);
struct kvm_vcpu {
struct kvm *kvm;
diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
index 0352f81..04d69cd 100644
--- a/virt/kvm/coalesced_mmio.c
+++ b/virt/kvm/coalesced_mmio.c
@@ -92,6 +92,7 @@ static const struct kvm_io_device_ops coalesced_mmio_ops = {
int kvm_coalesced_mmio_init(struct kvm *kvm)
{
struct kvm_coalesced_mmio_dev *dev;
+ int ret;
dev = kzalloc(sizeof(struct kvm_coalesced_mmio_dev), GFP_KERNEL);
if (!dev)
@@ -100,9 +101,12 @@ int kvm_coalesced_mmio_init(struct kvm *kvm)
kvm_iodevice_init(&dev->dev, &coalesced_mmio_ops);
dev->kvm = kvm;
kvm->coalesced_mmio_dev = dev;
- kvm_io_bus_register_dev(kvm, &kvm->mmio_bus, &dev->dev);
- return 0;
+ ret = kvm_io_bus_register_dev(kvm, &kvm->mmio_bus, &dev->dev);
+ if (ret < 0)
+ kfree(dev);
+
+ return ret;
}
int kvm_vm_ioctl_register_coalesced_mmio(struct kvm *kvm,
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index b91fbb2..fa05f67 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -342,6 +342,7 @@ static const struct kvm_io_device_ops ioapic_mmio_ops = {
int kvm_ioapic_init(struct kvm *kvm)
{
struct kvm_ioapic *ioapic;
+ int ret;
ioapic = kzalloc(sizeof(struct kvm_ioapic), GFP_KERNEL);
if (!ioapic)
@@ -350,7 +351,10 @@ int kvm_ioapic_init(struct kvm *kvm)
kvm_ioapic_reset(ioapic);
kvm_iodevice_init(&ioapic->dev, &ioapic_mmio_ops);
ioapic->kvm = kvm;
- kvm_io_bus_register_dev(kvm, &kvm->mmio_bus, &ioapic->dev);
- return 0;
+ ret = kvm_io_bus_register_dev(kvm, &kvm->mmio_bus, &ioapic->dev);
+ if (ret < 0)
+ kfree(ioapic);
+
+ return ret;
}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index fc1b58a..9c2fd02 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2533,21 +2533,50 @@ int kvm_io_bus_read(struct kvm_io_bus *bus, gpa_t addr, int len, void *val)
return -EOPNOTSUPP;
}
-void kvm_io_bus_register_dev(struct kvm *kvm, struct kvm_io_bus *bus,
+int kvm_io_bus_register_dev(struct kvm *kvm, struct kvm_io_bus *bus,
struct kvm_io_device *dev)
{
+ int ret;
+
down_write(&kvm->slots_lock);
- __kvm_io_bus_register_dev(bus, dev);
+ ret = __kvm_io_bus_register_dev(bus, dev);
up_write(&kvm->slots_lock);
+
+ return ret;
}
/* An unlocked version. Caller must have write lock on slots_lock. */
-void __kvm_io_bus_register_dev(struct kvm_io_bus *bus,
- struct kvm_io_device *dev)
+int __kvm_io_bus_register_dev(struct kvm_io_bus *bus,
+ struct kvm_io_device *dev)
{
- BUG_ON(bus->dev_count > (NR_IOBUS_DEVS-1));
+ if (bus->dev_count > NR_IOBUS_DEVS-1)
+ return -ENOSPC;
bus->devs[bus->dev_count++] = dev;
+
+ return 0;
+}
+
+void kvm_io_bus_unregister_dev(struct kvm *kvm,
+ struct kvm_io_bus *bus,
+ struct kvm_io_device *dev)
+{
+ down_write(&kvm->slots_lock);
+ __kvm_io_bus_unregister_dev(bus, dev);
+ up_write(&kvm->slots_lock);
+}
+
+/* An unlocked version. Caller must have write lock on slots_lock. */
+void __kvm_io_bus_unregister_dev(struct kvm_io_bus *bus,
+ struct kvm_io_device *dev)
+{
+ int i;
+
+ for (i = 0; i < bus->dev_count; i++)
+ if (bus->devs[i] == dev) {
+ bus->devs[i] = bus->devs[--bus->dev_count];
+ break;
+ }
}
static struct notifier_block kvm_cpu_notifier = {
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 22/46] KVM: add ioeventfd support
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (20 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 21/46] KVM: make io_bus interface more robust Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 23/46] KVM: MMU: Fix MMU_DEBUG compile breakage Avi Kivity
` (23 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gregory Haskins <ghaskins@novell.com>
ioeventfd is a mechanism to register PIO/MMIO regions to trigger an eventfd
signal when written to by a guest. Host userspace can register any
arbitrary IO address with a corresponding eventfd and then pass the eventfd
to a specific end-point of interest for handling.
Normal IO requires a blocking round-trip since the operation may cause
side-effects in the emulated model or may return data to the caller.
Therefore, an IO in KVM traps from the guest to the host, causes a VMX/SVM
"heavy-weight" exit back to userspace, and is ultimately serviced by qemu's
device model synchronously before returning control back to the vcpu.
However, there is a subclass of IO which acts purely as a trigger for
other IO (such as to kick off an out-of-band DMA request, etc). For these
patterns, the synchronous call is particularly expensive since we really
only want to simply get our notification transmitted asychronously and
return as quickly as possible. All the sychronous infrastructure to ensure
proper data-dependencies are met in the normal IO case are just unecessary
overhead for signalling. This adds additional computational load on the
system, as well as latency to the signalling path.
Therefore, we provide a mechanism for registration of an in-kernel trigger
point that allows the VCPU to only require a very brief, lightweight
exit just long enough to signal an eventfd. This also means that any
clients compatible with the eventfd interface (which includes userspace
and kernelspace equally well) can now register to be notified. The end
result should be a more flexible and higher performance notification API
for the backend KVM hypervisor and perhipheral components.
To test this theory, we built a test-harness called "doorbell". This
module has a function called "doorbell_ring()" which simply increments a
counter for each time the doorbell is signaled. It supports signalling
from either an eventfd, or an ioctl().
We then wired up two paths to the doorbell: One via QEMU via a registered
io region and through the doorbell ioctl(). The other is direct via
ioeventfd.
You can download this test harness here:
ftp://ftp.novell.com/dev/ghaskins/doorbell.tar.bz2
The measured results are as follows:
qemu-mmio: 110000 iops, 9.09us rtt
ioeventfd-mmio: 200100 iops, 5.00us rtt
ioeventfd-pio: 367300 iops, 2.72us rtt
I didn't measure qemu-pio, because I have to figure out how to register a
PIO region with qemu's device model, and I got lazy. However, for now we
can extrapolate based on the data from the NULLIO runs of +2.56us for MMIO,
and -350ns for HC, we get:
qemu-pio: 153139 iops, 6.53us rtt
ioeventfd-hc: 412585 iops, 2.37us rtt
these are just for fun, for now, until I can gather more data.
Here is a graph for your convenience:
http://developer.novell.com/wiki/images/7/76/Iofd-chart.png
The conclusion to draw is that we save about 4us by skipping the userspace
hop.
--------------------
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/x86.c | 1 +
include/linux/kvm.h | 24 +++++
include/linux/kvm_host.h | 10 ++-
virt/kvm/eventfd.c | 251 +++++++++++++++++++++++++++++++++++++++++++++-
virt/kvm/kvm_main.c | 11 ++-
5 files changed, 293 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1fd67dd..a2f5a9c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1212,6 +1212,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_IRQ_INJECT_STATUS:
case KVM_CAP_ASSIGN_DEV_IRQ:
case KVM_CAP_IRQFD:
+ case KVM_CAP_IOEVENTFD:
case KVM_CAP_PIT2:
case KVM_CAP_PIT_STATE2:
r = 1;
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index a74a1fc..230a91a 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -307,6 +307,28 @@ struct kvm_guest_debug {
struct kvm_guest_debug_arch arch;
};
+enum {
+ kvm_ioeventfd_flag_nr_datamatch,
+ kvm_ioeventfd_flag_nr_pio,
+ kvm_ioeventfd_flag_nr_deassign,
+ kvm_ioeventfd_flag_nr_max,
+};
+
+#define KVM_IOEVENTFD_FLAG_DATAMATCH (1 << kvm_ioeventfd_flag_nr_datamatch)
+#define KVM_IOEVENTFD_FLAG_PIO (1 << kvm_ioeventfd_flag_nr_pio)
+#define KVM_IOEVENTFD_FLAG_DEASSIGN (1 << kvm_ioeventfd_flag_nr_deassign)
+
+#define KVM_IOEVENTFD_VALID_FLAG_MASK ((1 << kvm_ioeventfd_flag_nr_max) - 1)
+
+struct kvm_ioeventfd {
+ __u64 datamatch;
+ __u64 addr; /* legal pio/mmio address */
+ __u32 len; /* 1, 2, 4, or 8 bytes */
+ __s32 fd;
+ __u32 flags;
+ __u8 pad[36];
+};
+
#define KVM_TRC_SHIFT 16
/*
* kvm trace categories
@@ -412,6 +434,7 @@ struct kvm_guest_debug {
#ifdef __KVM_HAVE_PIT_STATE2
#define KVM_CAP_PIT_STATE2 35
#endif
+#define KVM_CAP_IOEVENTFD 36
#ifdef KVM_CAP_IRQ_ROUTING
@@ -520,6 +543,7 @@ struct kvm_irqfd {
#define KVM_IRQFD _IOW(KVMIO, 0x76, struct kvm_irqfd)
#define KVM_CREATE_PIT2 _IOW(KVMIO, 0x77, struct kvm_pit_config)
#define KVM_SET_BOOT_CPU_ID _IO(KVMIO, 0x78)
+#define KVM_IOEVENTFD _IOW(KVMIO, 0x79, struct kvm_ioeventfd)
/*
* ioctls for vcpu fds
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 983b0bd..6ec9fc5 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -155,6 +155,7 @@ struct kvm {
spinlock_t lock;
struct list_head items;
} irqfds;
+ struct list_head ioeventfds;
#endif
struct kvm_vm_stat stat;
struct kvm_arch arch;
@@ -528,19 +529,24 @@ static inline void kvm_free_irq_routing(struct kvm *kvm) {}
#ifdef CONFIG_HAVE_KVM_EVENTFD
-void kvm_irqfd_init(struct kvm *kvm);
+void kvm_eventfd_init(struct kvm *kvm);
int kvm_irqfd(struct kvm *kvm, int fd, int gsi, int flags);
void kvm_irqfd_release(struct kvm *kvm);
+int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args);
#else
-static inline void kvm_irqfd_init(struct kvm *kvm) {}
+static inline void kvm_eventfd_init(struct kvm *kvm) {}
static inline int kvm_irqfd(struct kvm *kvm, int fd, int gsi, int flags)
{
return -EINVAL;
}
static inline void kvm_irqfd_release(struct kvm *kvm) {}
+static inline int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
+{
+ return -ENOSYS;
+}
#endif /* CONFIG_HAVE_KVM_EVENTFD */
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 4092b8d..99017e8 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -21,6 +21,7 @@
*/
#include <linux/kvm_host.h>
+#include <linux/kvm.h>
#include <linux/workqueue.h>
#include <linux/syscalls.h>
#include <linux/wait.h>
@@ -28,6 +29,9 @@
#include <linux/file.h>
#include <linux/list.h>
#include <linux/eventfd.h>
+#include <linux/kernel.h>
+
+#include "iodev.h"
/*
* --------------------------------------------------------------------
@@ -234,10 +238,11 @@ fail:
}
void
-kvm_irqfd_init(struct kvm *kvm)
+kvm_eventfd_init(struct kvm *kvm)
{
spin_lock_init(&kvm->irqfds.lock);
INIT_LIST_HEAD(&kvm->irqfds.items);
+ INIT_LIST_HEAD(&kvm->ioeventfds);
}
/*
@@ -327,3 +332,247 @@ static void __exit irqfd_module_exit(void)
module_init(irqfd_module_init);
module_exit(irqfd_module_exit);
+
+/*
+ * --------------------------------------------------------------------
+ * ioeventfd: translate a PIO/MMIO memory write to an eventfd signal.
+ *
+ * userspace can register a PIO/MMIO address with an eventfd for receiving
+ * notification when the memory has been touched.
+ * --------------------------------------------------------------------
+ */
+
+struct _ioeventfd {
+ struct list_head list;
+ u64 addr;
+ int length;
+ struct eventfd_ctx *eventfd;
+ u64 datamatch;
+ struct kvm_io_device dev;
+ bool wildcard;
+};
+
+static inline struct _ioeventfd *
+to_ioeventfd(struct kvm_io_device *dev)
+{
+ return container_of(dev, struct _ioeventfd, dev);
+}
+
+static void
+ioeventfd_release(struct _ioeventfd *p)
+{
+ eventfd_ctx_put(p->eventfd);
+ list_del(&p->list);
+ kfree(p);
+}
+
+static bool
+ioeventfd_in_range(struct _ioeventfd *p, gpa_t addr, int len, const void *val)
+{
+ u64 _val;
+
+ if (!(addr == p->addr && len == p->length))
+ /* address-range must be precise for a hit */
+ return false;
+
+ if (p->wildcard)
+ /* all else equal, wildcard is always a hit */
+ return true;
+
+ /* otherwise, we have to actually compare the data */
+
+ BUG_ON(!IS_ALIGNED((unsigned long)val, len));
+
+ switch (len) {
+ case 1:
+ _val = *(u8 *)val;
+ break;
+ case 2:
+ _val = *(u16 *)val;
+ break;
+ case 4:
+ _val = *(u32 *)val;
+ break;
+ case 8:
+ _val = *(u64 *)val;
+ break;
+ default:
+ return false;
+ }
+
+ return _val == p->datamatch ? true : false;
+}
+
+/* MMIO/PIO writes trigger an event if the addr/val match */
+static int
+ioeventfd_write(struct kvm_io_device *this, gpa_t addr, int len,
+ const void *val)
+{
+ struct _ioeventfd *p = to_ioeventfd(this);
+
+ if (!ioeventfd_in_range(p, addr, len, val))
+ return -EOPNOTSUPP;
+
+ eventfd_signal(p->eventfd, 1);
+ return 0;
+}
+
+/*
+ * This function is called as KVM is completely shutting down. We do not
+ * need to worry about locking just nuke anything we have as quickly as possible
+ */
+static void
+ioeventfd_destructor(struct kvm_io_device *this)
+{
+ struct _ioeventfd *p = to_ioeventfd(this);
+
+ ioeventfd_release(p);
+}
+
+static const struct kvm_io_device_ops ioeventfd_ops = {
+ .write = ioeventfd_write,
+ .destructor = ioeventfd_destructor,
+};
+
+/* assumes kvm->slots_lock held */
+static bool
+ioeventfd_check_collision(struct kvm *kvm, struct _ioeventfd *p)
+{
+ struct _ioeventfd *_p;
+
+ list_for_each_entry(_p, &kvm->ioeventfds, list)
+ if (_p->addr == p->addr && _p->length == p->length &&
+ (_p->wildcard || p->wildcard ||
+ _p->datamatch == p->datamatch))
+ return true;
+
+ return false;
+}
+
+static int
+kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
+{
+ int pio = args->flags & KVM_IOEVENTFD_FLAG_PIO;
+ struct kvm_io_bus *bus = pio ? &kvm->pio_bus : &kvm->mmio_bus;
+ struct _ioeventfd *p;
+ struct eventfd_ctx *eventfd;
+ int ret;
+
+ /* must be natural-word sized */
+ switch (args->len) {
+ case 1:
+ case 2:
+ case 4:
+ case 8:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* check for range overflow */
+ if (args->addr + args->len < args->addr)
+ return -EINVAL;
+
+ /* check for extra flags that we don't understand */
+ if (args->flags & ~KVM_IOEVENTFD_VALID_FLAG_MASK)
+ return -EINVAL;
+
+ eventfd = eventfd_ctx_fdget(args->fd);
+ if (IS_ERR(eventfd))
+ return PTR_ERR(eventfd);
+
+ p = kzalloc(sizeof(*p), GFP_KERNEL);
+ if (!p) {
+ ret = -ENOMEM;
+ goto fail;
+ }
+
+ INIT_LIST_HEAD(&p->list);
+ p->addr = args->addr;
+ p->length = args->len;
+ p->eventfd = eventfd;
+
+ /* The datamatch feature is optional, otherwise this is a wildcard */
+ if (args->flags & KVM_IOEVENTFD_FLAG_DATAMATCH)
+ p->datamatch = args->datamatch;
+ else
+ p->wildcard = true;
+
+ down_write(&kvm->slots_lock);
+
+ /* Verify that there isnt a match already */
+ if (ioeventfd_check_collision(kvm, p)) {
+ ret = -EEXIST;
+ goto unlock_fail;
+ }
+
+ kvm_iodevice_init(&p->dev, &ioeventfd_ops);
+
+ ret = __kvm_io_bus_register_dev(bus, &p->dev);
+ if (ret < 0)
+ goto unlock_fail;
+
+ list_add_tail(&p->list, &kvm->ioeventfds);
+
+ up_write(&kvm->slots_lock);
+
+ return 0;
+
+unlock_fail:
+ up_write(&kvm->slots_lock);
+
+fail:
+ kfree(p);
+ eventfd_ctx_put(eventfd);
+
+ return ret;
+}
+
+static int
+kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
+{
+ int pio = args->flags & KVM_IOEVENTFD_FLAG_PIO;
+ struct kvm_io_bus *bus = pio ? &kvm->pio_bus : &kvm->mmio_bus;
+ struct _ioeventfd *p, *tmp;
+ struct eventfd_ctx *eventfd;
+ int ret = -ENOENT;
+
+ eventfd = eventfd_ctx_fdget(args->fd);
+ if (IS_ERR(eventfd))
+ return PTR_ERR(eventfd);
+
+ down_write(&kvm->slots_lock);
+
+ list_for_each_entry_safe(p, tmp, &kvm->ioeventfds, list) {
+ bool wildcard = !(args->flags & KVM_IOEVENTFD_FLAG_DATAMATCH);
+
+ if (p->eventfd != eventfd ||
+ p->addr != args->addr ||
+ p->length != args->len ||
+ p->wildcard != wildcard)
+ continue;
+
+ if (!p->wildcard && p->datamatch != args->datamatch)
+ continue;
+
+ __kvm_io_bus_unregister_dev(bus, &p->dev);
+ ioeventfd_release(p);
+ ret = 0;
+ break;
+ }
+
+ up_write(&kvm->slots_lock);
+
+ eventfd_ctx_put(eventfd);
+
+ return ret;
+}
+
+int
+kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
+{
+ if (args->flags & KVM_IOEVENTFD_FLAG_DEASSIGN)
+ return kvm_deassign_ioeventfd(kvm, args);
+
+ return kvm_assign_ioeventfd(kvm, args);
+}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9c2fd02..d7b9bbb 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -979,7 +979,7 @@ static struct kvm *kvm_create_vm(void)
spin_lock_init(&kvm->mmu_lock);
spin_lock_init(&kvm->requests_lock);
kvm_io_bus_init(&kvm->pio_bus);
- kvm_irqfd_init(kvm);
+ kvm_eventfd_init(kvm);
mutex_init(&kvm->lock);
mutex_init(&kvm->irq_lock);
kvm_io_bus_init(&kvm->mmio_bus);
@@ -2271,6 +2271,15 @@ static long kvm_vm_ioctl(struct file *filp,
r = kvm_irqfd(kvm, data.fd, data.gsi, data.flags);
break;
}
+ case KVM_IOEVENTFD: {
+ struct kvm_ioeventfd data;
+
+ r = -EFAULT;
+ if (copy_from_user(&data, argp, sizeof data))
+ goto out;
+ r = kvm_ioeventfd(kvm, &data);
+ break;
+ }
#ifdef CONFIG_KVM_APIC_ARCHITECTURE
case KVM_SET_BOOT_CPU_ID:
r = 0;
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 23/46] KVM: MMU: Fix MMU_DEBUG compile breakage
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (21 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 22/46] KVM: add ioeventfd support Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 24/46] KVM: Move exception handling to the same place as other events Avi Kivity
` (22 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/mmu.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index ac121b3..f1f0815 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1808,8 +1808,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
pgprintk("%s: setting spte %llx\n", __func__, *sptep);
pgprintk("instantiating %s PTE (%s) at %ld (%llx) addr %p\n",
is_large_pte(*sptep)? "2MB" : "4kB",
- is_present_pte(*sptep)?"RW":"R", gfn,
- *shadow_pte, sptep);
+ *sptep & PT_PRESENT_MASK ?"RW":"R", gfn,
+ *sptep, sptep);
if (!was_rmapped && is_large_pte(*sptep))
++vcpu->kvm->stat.lpages;
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 24/46] KVM: Move exception handling to the same place as other events
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (22 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 23/46] KVM: MMU: Fix MMU_DEBUG compile breakage Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 25/46] KVM: Move kvm_cpu_get_interrupt() declaration to x86 code Avi Kivity
` (21 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/x86.c | 21 +++++++++------------
1 files changed, 9 insertions(+), 12 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a2f5a9c..d2e1ffb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -223,13 +223,6 @@ void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code)
}
EXPORT_SYMBOL_GPL(kvm_queue_exception_e);
-static void __queue_exception(struct kvm_vcpu *vcpu)
-{
- kvm_x86_ops->queue_exception(vcpu, vcpu->arch.exception.nr,
- vcpu->arch.exception.has_error_code,
- vcpu->arch.exception.error_code);
-}
-
/*
* Load the pae pdptrs. Return true is they are all valid.
*/
@@ -3487,9 +3480,16 @@ static void update_cr8_intercept(struct kvm_vcpu *vcpu)
kvm_x86_ops->update_cr8_intercept(vcpu, tpr, max_irr);
}
-static void inject_pending_irq(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+static void inject_pending_event(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
{
/* try to reinject previous events if any */
+ if (vcpu->arch.exception.pending) {
+ kvm_x86_ops->queue_exception(vcpu, vcpu->arch.exception.nr,
+ vcpu->arch.exception.has_error_code,
+ vcpu->arch.exception.error_code);
+ return;
+ }
+
if (vcpu->arch.nmi_injected) {
kvm_x86_ops->set_nmi(vcpu);
return;
@@ -3570,10 +3570,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
goto out;
}
- if (vcpu->arch.exception.pending)
- __queue_exception(vcpu);
- else
- inject_pending_irq(vcpu, kvm_run);
+ inject_pending_event(vcpu, kvm_run);
/* enable NMI/IRQ window open exits if needed */
if (vcpu->arch.nmi_pending)
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 25/46] KVM: Move kvm_cpu_get_interrupt() declaration to x86 code
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (23 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 24/46] KVM: Move exception handling to the same place as other events Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 26/46] KVM: Reduce runnability interface with arch support code Avi Kivity
` (20 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gleb Natapov <gleb@redhat.com>
It is implemented only by x86.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
include/linux/kvm_host.h | 1 -
2 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 30b625d..3f4f00a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -797,5 +797,6 @@ asmlinkage void kvm_handle_fault_on_reboot(void);
int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
int kvm_age_hva(struct kvm *kvm, unsigned long hva);
int cpuid_maxphyaddr(struct kvm_vcpu *vcpu);
+int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
#endif /* _ASM_X86_KVM_HOST_H */
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6ec9fc5..2c6a5f0 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -341,7 +341,6 @@ void kvm_arch_destroy_vm(struct kvm *kvm);
void kvm_free_all_assigned_devices(struct kvm *kvm);
void kvm_arch_sync_events(struct kvm *kvm);
-int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
int kvm_cpu_has_interrupt(struct kvm_vcpu *v);
int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu);
void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 26/46] KVM: Reduce runnability interface with arch support code
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (24 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 25/46] KVM: Move kvm_cpu_get_interrupt() declaration to x86 code Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 27/46] KVM: silence lapic kernel messages that can be triggered by a guest Avi Kivity
` (19 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gleb Natapov <gleb@redhat.com>
Remove kvm_cpu_has_interrupt() and kvm_arch_interrupt_allowed() from
interface between general code and arch code. kvm_arch_vcpu_runnable()
checks for interrupts instead.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/ia64/kvm/kvm-ia64.c | 16 ++--------------
arch/powerpc/kvm/powerpc.c | 13 +------------
arch/s390/kvm/interrupt.c | 8 +-------
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/x86.c | 6 ++++--
include/linux/kvm_host.h | 2 --
virt/kvm/kvm_main.c | 4 +---
7 files changed, 11 insertions(+), 40 deletions(-)
diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index d7aa6bb..0ad09f0 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1935,19 +1935,6 @@ int kvm_highest_pending_irq(struct kvm_vcpu *vcpu)
return find_highest_bits((int *)&vpd->irr[0]);
}
-int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
-{
- if (kvm_highest_pending_irq(vcpu) != -1)
- return 1;
- return 0;
-}
-
-int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu)
-{
- /* do real check here */
- return 1;
-}
-
int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
{
return vcpu->arch.timer_fired;
@@ -1960,7 +1947,8 @@ gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
{
- return vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE;
+ return (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE) ||
+ (kvm_highest_pending_irq(vcpu) != -1);
}
int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 0341391..2a4551f 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -39,20 +39,9 @@ gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
return gfn;
}
-int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
-{
- return !!(v->arch.pending_exceptions);
-}
-
-int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu)
-{
- /* do real check here */
- return 1;
-}
-
int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
{
- return !(v->arch.msr & MSR_WE);
+ return !(v->arch.msr & MSR_WE) || !!(v->arch.pending_exceptions);
}
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 4d61341..2c2f983 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -283,7 +283,7 @@ static int __try_deliver_ckc_interrupt(struct kvm_vcpu *vcpu)
return 1;
}
-int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
+static int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
{
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
struct kvm_s390_float_interrupt *fi = vcpu->arch.local_int.float_int;
@@ -320,12 +320,6 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
return rc;
}
-int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu)
-{
- /* do real check here */
- return 1;
-}
-
int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
{
return 0;
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3f4f00a..08732d7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -797,6 +797,8 @@ asmlinkage void kvm_handle_fault_on_reboot(void);
int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
int kvm_age_hva(struct kvm *kvm, unsigned long hva);
int cpuid_maxphyaddr(struct kvm_vcpu *vcpu);
+int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
+int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu);
int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
#endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d2e1ffb..48567fa 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4886,8 +4886,10 @@ void kvm_arch_flush_shadow(struct kvm *kvm)
int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
{
return vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE
- || vcpu->arch.mp_state == KVM_MP_STATE_SIPI_RECEIVED
- || vcpu->arch.nmi_pending;
+ || vcpu->arch.mp_state == KVM_MP_STATE_SIPI_RECEIVED
+ || vcpu->arch.nmi_pending ||
+ (kvm_arch_interrupt_allowed(vcpu) &&
+ kvm_cpu_has_interrupt(vcpu));
}
void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 2c6a5f0..4af5603 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -332,7 +332,6 @@ int kvm_arch_hardware_setup(void);
void kvm_arch_hardware_unsetup(void);
void kvm_arch_check_processor_compat(void *rtn);
int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
-int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu);
void kvm_free_physmem(struct kvm *kvm);
@@ -341,7 +340,6 @@ void kvm_arch_destroy_vm(struct kvm *kvm);
void kvm_free_all_assigned_devices(struct kvm *kvm);
void kvm_arch_sync_events(struct kvm *kvm);
-int kvm_cpu_has_interrupt(struct kvm_vcpu *v);
int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu);
void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d7b9bbb..532af9b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1666,9 +1666,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
for (;;) {
prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
- if ((kvm_arch_interrupt_allowed(vcpu) &&
- kvm_cpu_has_interrupt(vcpu)) ||
- kvm_arch_vcpu_runnable(vcpu)) {
+ if (kvm_arch_vcpu_runnable(vcpu)) {
set_bit(KVM_REQ_UNHALT, &vcpu->requests);
break;
}
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 27/46] KVM: silence lapic kernel messages that can be triggered by a guest
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (25 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 26/46] KVM: Reduce runnability interface with arch support code Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 28/46] KVM: Discard unnecessary kvm_mmu_flush_tlb() in kvm_mmu_load() Avi Kivity
` (18 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gleb Natapov <gleb@redhat.com>
Some Linux versions (f8) try to read EOI register that is write only.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
---
arch/x86/kvm/lapic.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3b03ebd..fdddf48 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -591,14 +591,14 @@ static int apic_reg_read(struct kvm_lapic *apic, u32 offset, int len,
static const u64 rmask = 0x43ff01ffffffe70cULL;
if ((alignment + len) > 4) {
- printk(KERN_ERR "KVM_APIC_READ: alignment error %x %d\n",
- offset, len);
+ apic_debug("KVM_APIC_READ: alignment error %x %d\n",
+ offset, len);
return 1;
}
if (offset > 0x3f0 || !(rmask & (1ULL << (offset >> 4)))) {
- printk(KERN_ERR "KVM_APIC_READ: read reserved register %x\n",
- offset);
+ apic_debug("KVM_APIC_READ: read reserved register %x\n",
+ offset);
return 1;
}
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 28/46] KVM: Discard unnecessary kvm_mmu_flush_tlb() in kvm_mmu_load()
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (26 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 27/46] KVM: silence lapic kernel messages that can be triggered by a guest Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 29/46] KVM: MMU: fix missing locking in alloc_mmu_pages Avi Kivity
` (17 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Sheng Yang <sheng@linux.intel.com>
set_cr3() should already cover the TLB flushing.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
---
arch/x86/kvm/mmu.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index f1f0815..87c67f4 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2373,8 +2373,8 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
spin_unlock(&vcpu->kvm->mmu_lock);
if (r)
goto out;
+ /* set_cr3() should ensure TLB has been flushed */
kvm_x86_ops->set_cr3(vcpu, vcpu->arch.mmu.root_hpa);
- kvm_mmu_flush_tlb(vcpu);
out:
return r;
}
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 29/46] KVM: MMU: fix missing locking in alloc_mmu_pages
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (27 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 28/46] KVM: Discard unnecessary kvm_mmu_flush_tlb() in kvm_mmu_load() Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 30/46] KVM: s390: remove unused structs Avi Kivity
` (16 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Marcelo Tosatti <mtosatti@redhat.com>
n_requested_mmu_pages/n_free_mmu_pages are used by
kvm_mmu_change_mmu_pages to calculate the number of pages to zap.
alloc_mmu_pages, called from the vcpu initialization path, modifies this
variables without proper locking, which can result in a negative value
in kvm_mmu_change_mmu_pages (say, with cpu hotplug).
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
---
arch/x86/kvm/mmu.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 87c67f4..86c2551 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2728,12 +2728,14 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
ASSERT(vcpu);
+ spin_lock(&vcpu->kvm->mmu_lock);
if (vcpu->kvm->arch.n_requested_mmu_pages)
vcpu->kvm->arch.n_free_mmu_pages =
vcpu->kvm->arch.n_requested_mmu_pages;
else
vcpu->kvm->arch.n_free_mmu_pages =
vcpu->kvm->arch.n_alloc_mmu_pages;
+ spin_unlock(&vcpu->kvm->mmu_lock);
/*
* When emulating 32-bit mode, cr3 is only 32 bits even on x86_64.
* Therefore we need to allocate shadow page tables in the first
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 30/46] KVM: s390: remove unused structs
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (28 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 29/46] KVM: MMU: fix missing locking in alloc_mmu_pages Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 31/46] KVM: x86: use get_desc_base() and get_desc_limit() Avi Kivity
` (15 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gleb Natapov <gleb@redhat.com>
They are not used by common code without defines which s390 does not
have.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
---
arch/s390/include/asm/kvm.h | 9 ---------
1 files changed, 0 insertions(+), 9 deletions(-)
diff --git a/arch/s390/include/asm/kvm.h b/arch/s390/include/asm/kvm.h
index 0b2f829..3dfcaeb 100644
--- a/arch/s390/include/asm/kvm.h
+++ b/arch/s390/include/asm/kvm.h
@@ -15,15 +15,6 @@
*/
#include <linux/types.h>
-/* for KVM_GET_IRQCHIP and KVM_SET_IRQCHIP */
-struct kvm_pic_state {
- /* no PIC for s390 */
-};
-
-struct kvm_ioapic_state {
- /* no IOAPIC for s390 */
-};
-
/* for KVM_GET_REGS and KVM_SET_REGS */
struct kvm_regs {
/* general purpose regs for s390 */
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 31/46] KVM: x86: use get_desc_base() and get_desc_limit()
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (29 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 30/46] KVM: s390: remove unused structs Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 32/46] KVM: x86: use kvm_get_gdt() and kvm_read_ldt() Avi Kivity
` (14 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Akinobu Mita <akinobu.mita@gmail.com>
Use get_desc_base() and get_desc_limit() to get the base address and
limit in desc_struct.
Cc: Avi Kivity <avi@redhat.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
---
arch/x86/kvm/x86.c | 18 +++++-------------
1 files changed, 5 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 48567fa..0ebd684 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -142,8 +142,7 @@ unsigned long segment_base(u16 selector)
table_base = segment_base(ldt_selector);
}
d = (struct desc_struct *)(table_base + (selector & ~7));
- v = d->base0 | ((unsigned long)d->base1 << 16) |
- ((unsigned long)d->base2 << 24);
+ v = get_desc_base(d);
#ifdef CONFIG_X86_64
if (d->s == 0 && (d->type == 2 || d->type == 9 || d->type == 11))
v |= ((unsigned long)((struct ldttss_desc64 *)d)->base3) << 32;
@@ -3943,11 +3942,8 @@ static void kvm_set_segment(struct kvm_vcpu *vcpu,
static void seg_desct_to_kvm_desct(struct desc_struct *seg_desc, u16 selector,
struct kvm_segment *kvm_desct)
{
- kvm_desct->base = seg_desc->base0;
- kvm_desct->base |= seg_desc->base1 << 16;
- kvm_desct->base |= seg_desc->base2 << 24;
- kvm_desct->limit = seg_desc->limit0;
- kvm_desct->limit |= seg_desc->limit << 16;
+ kvm_desct->base = get_desc_base(seg_desc);
+ kvm_desct->limit = get_desc_limit(seg_desc);
if (seg_desc->g) {
kvm_desct->limit <<= 12;
kvm_desct->limit |= 0xfff;
@@ -4026,11 +4022,7 @@ static int save_guest_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector,
static u32 get_tss_base_addr(struct kvm_vcpu *vcpu,
struct desc_struct *seg_desc)
{
- u32 base_addr;
-
- base_addr = seg_desc->base0;
- base_addr |= (seg_desc->base1 << 16);
- base_addr |= (seg_desc->base2 << 24);
+ u32 base_addr = get_desc_base(seg_desc);
return vcpu->arch.mmu.gva_to_gpa(vcpu, base_addr);
}
@@ -4319,7 +4311,7 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason)
}
}
- if (!nseg_desc.p || (nseg_desc.limit0 | nseg_desc.limit << 16) < 0x67) {
+ if (!nseg_desc.p || get_desc_limit(&nseg_desc) < 0x67) {
kvm_queue_exception_e(vcpu, TS_VECTOR, tss_selector & 0xfffc);
return 1;
}
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 32/46] KVM: x86: use kvm_get_gdt() and kvm_read_ldt()
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (30 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 31/46] KVM: x86: use get_desc_base() and get_desc_limit() Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 33/46] KVM: VMX: Introduce KVM_SET_IDENTITY_MAP_ADDR ioctl Avi Kivity
` (13 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Akinobu Mita <akinobu.mita@gmail.com>
Use kvm_get_gdt() and kvm_read_ldt() to reduce inline assembly code.
Cc: Avi Kivity <avi@redhat.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
---
arch/x86/kvm/svm.c | 6 +++---
arch/x86/kvm/x86.c | 5 ++---
2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 8728e51..92fc0da 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -290,7 +290,7 @@ static void svm_hardware_enable(void *garbage)
struct svm_cpu_data *svm_data;
uint64_t efer;
- struct desc_ptr gdt_descr;
+ struct descriptor_table gdt_descr;
struct desc_struct *gdt;
int me = raw_smp_processor_id();
@@ -310,8 +310,8 @@ static void svm_hardware_enable(void *garbage)
svm_data->max_asid = cpuid_ebx(SVM_CPUID_FUNC) - 1;
svm_data->next_asid = svm_data->max_asid + 1;
- asm volatile ("sgdt %0" : "=m"(gdt_descr));
- gdt = (struct desc_struct *)gdt_descr.address;
+ kvm_get_gdt(&gdt_descr);
+ gdt = (struct desc_struct *)gdt_descr.base;
svm_data->tss_desc = (struct kvm_ldttss_desc *)(gdt + GDT_ENTRY_TSS);
rdmsrl(MSR_EFER, efer);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0ebd684..18ce27f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -132,13 +132,12 @@ unsigned long segment_base(u16 selector)
if (selector == 0)
return 0;
- asm("sgdt %0" : "=m"(gdt));
+ kvm_get_gdt(&gdt);
table_base = gdt.base;
if (selector & 4) { /* from ldt */
- u16 ldt_selector;
+ u16 ldt_selector = kvm_read_ldt();
- asm("sldt %0" : "=g"(ldt_selector));
table_base = segment_base(ldt_selector);
}
d = (struct desc_struct *)(table_base + (selector & ~7));
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 33/46] KVM: VMX: Introduce KVM_SET_IDENTITY_MAP_ADDR ioctl
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (31 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 32/46] KVM: x86: use kvm_get_gdt() and kvm_read_ldt() Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 34/46] KVM: PIT: Unregister ack notifier callback when freeing Avi Kivity
` (12 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Sheng Yang <sheng@linux.intel.com>
Now KVM allow guest to modify guest's physical address of EPT's identity mapping page.
(change from v1, discard unnecessary check, change ioctl to accept parameter
address rather than value)
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/vmx.c | 15 ++++++++++-----
arch/x86/kvm/x86.c | 19 +++++++++++++++++++
include/linux/kvm.h | 2 ++
4 files changed, 32 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 08732d7..e210b21 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -411,6 +411,7 @@ struct kvm_arch{
struct page *ept_identity_pagetable;
bool ept_identity_pagetable_done;
+ gpa_t ept_identity_map_addr;
unsigned long irq_sources_bitmap;
unsigned long irq_states[KVM_IOAPIC_NUM_PINS];
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c6256b9..686e1ab 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1719,7 +1719,7 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
eptp = construct_eptp(cr3);
vmcs_write64(EPT_POINTER, eptp);
guest_cr3 = is_paging(vcpu) ? vcpu->arch.cr3 :
- VMX_EPT_IDENTITY_PAGETABLE_ADDR;
+ vcpu->kvm->arch.ept_identity_map_addr;
}
vmx_flush_tlb(vcpu);
@@ -2122,7 +2122,7 @@ static int init_rmode_identity_map(struct kvm *kvm)
if (likely(kvm->arch.ept_identity_pagetable_done))
return 1;
ret = 0;
- identity_map_pfn = VMX_EPT_IDENTITY_PAGETABLE_ADDR >> PAGE_SHIFT;
+ identity_map_pfn = kvm->arch.ept_identity_map_addr >> PAGE_SHIFT;
r = kvm_clear_guest_page(kvm, identity_map_pfn, 0, PAGE_SIZE);
if (r < 0)
goto out;
@@ -2191,14 +2191,15 @@ static int alloc_identity_pagetable(struct kvm *kvm)
goto out;
kvm_userspace_mem.slot = IDENTITY_PAGETABLE_PRIVATE_MEMSLOT;
kvm_userspace_mem.flags = 0;
- kvm_userspace_mem.guest_phys_addr = VMX_EPT_IDENTITY_PAGETABLE_ADDR;
+ kvm_userspace_mem.guest_phys_addr =
+ kvm->arch.ept_identity_map_addr;
kvm_userspace_mem.memory_size = PAGE_SIZE;
r = __kvm_set_memory_region(kvm, &kvm_userspace_mem, 0);
if (r)
goto out;
kvm->arch.ept_identity_pagetable = gfn_to_page(kvm,
- VMX_EPT_IDENTITY_PAGETABLE_ADDR >> PAGE_SHIFT);
+ kvm->arch.ept_identity_map_addr >> PAGE_SHIFT);
out:
up_write(&kvm->slots_lock);
return r;
@@ -3814,9 +3815,13 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
if (alloc_apic_access_page(kvm) != 0)
goto free_vmcs;
- if (enable_ept)
+ if (enable_ept) {
+ if (!kvm->arch.ept_identity_map_addr)
+ kvm->arch.ept_identity_map_addr =
+ VMX_EPT_IDENTITY_PAGETABLE_ADDR;
if (alloc_identity_pagetable(kvm) != 0)
goto free_vmcs;
+ }
return &vmx->vcpu;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 18ce27f..2539e9a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1206,6 +1206,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_IOEVENTFD:
case KVM_CAP_PIT2:
case KVM_CAP_PIT_STATE2:
+ case KVM_CAP_SET_IDENTITY_MAP_ADDR:
r = 1;
break;
case KVM_CAP_COALESCED_MMIO:
@@ -1906,6 +1907,13 @@ static int kvm_vm_ioctl_set_tss_addr(struct kvm *kvm, unsigned long addr)
return ret;
}
+static int kvm_vm_ioctl_set_identity_map_addr(struct kvm *kvm,
+ u64 ident_addr)
+{
+ kvm->arch.ept_identity_map_addr = ident_addr;
+ return 0;
+}
+
static int kvm_vm_ioctl_set_nr_mmu_pages(struct kvm *kvm,
u32 kvm_nr_mmu_pages)
{
@@ -2169,6 +2177,17 @@ long kvm_arch_vm_ioctl(struct file *filp,
if (r < 0)
goto out;
break;
+ case KVM_SET_IDENTITY_MAP_ADDR: {
+ u64 ident_addr;
+
+ r = -EFAULT;
+ if (copy_from_user(&ident_addr, argp, sizeof ident_addr))
+ goto out;
+ r = kvm_vm_ioctl_set_identity_map_addr(kvm, ident_addr);
+ if (r < 0)
+ goto out;
+ break;
+ }
case KVM_SET_MEMORY_REGION: {
struct kvm_memory_region kvm_mem;
struct kvm_userspace_memory_region kvm_userspace_mem;
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 230a91a..f8f8900 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -435,6 +435,7 @@ struct kvm_ioeventfd {
#define KVM_CAP_PIT_STATE2 35
#endif
#define KVM_CAP_IOEVENTFD 36
+#define KVM_CAP_SET_IDENTITY_MAP_ADDR 37
#ifdef KVM_CAP_IRQ_ROUTING
@@ -512,6 +513,7 @@ struct kvm_irqfd {
#define KVM_SET_USER_MEMORY_REGION _IOW(KVMIO, 0x46,\
struct kvm_userspace_memory_region)
#define KVM_SET_TSS_ADDR _IO(KVMIO, 0x47)
+#define KVM_SET_IDENTITY_MAP_ADDR _IOW(KVMIO, 0x48, __u64)
/* Device model IOC */
#define KVM_CREATE_IRQCHIP _IO(KVMIO, 0x60)
#define KVM_IRQ_LINE _IOW(KVMIO, 0x61, struct kvm_irq_level)
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 34/46] KVM: PIT: Unregister ack notifier callback when freeing
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (32 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 33/46] KVM: VMX: Introduce KVM_SET_IDENTITY_MAP_ADDR ioctl Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 35/46] KVM: Drop obsolete cpu_get/put in make_all_cpus_request Avi Kivity
` (11 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
---
arch/x86/kvm/i8254.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 137e548..472653c 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -672,6 +672,8 @@ void kvm_free_pit(struct kvm *kvm)
if (kvm->arch.vpit) {
kvm_unregister_irq_mask_notifier(kvm, 0,
&kvm->arch.vpit->mask_notifier);
+ kvm_unregister_irq_ack_notifier(kvm,
+ &kvm->arch.vpit->pit_state.irq_ack_notifier);
mutex_lock(&kvm->arch.vpit->pit_state.lock);
timer = &kvm->arch.vpit->pit_state.pit_timer.timer;
hrtimer_cancel(timer);
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 35/46] KVM: Drop obsolete cpu_get/put in make_all_cpus_request
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (33 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 34/46] KVM: PIT: Unregister ack notifier callback when freeing Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 36/46] KVM: VMX: Avoid to return ENOTSUPP to userland Avi Kivity
` (10 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Jan Kiszka <jan.kiszka@siemens.com>
spin_lock disables preemption, so we can simply read the current cpu.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
---
virt/kvm/kvm_main.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 532af9b..646cf2a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -741,8 +741,8 @@ static bool make_all_cpus_request(struct kvm *kvm, unsigned int req)
if (alloc_cpumask_var(&cpus, GFP_ATOMIC))
cpumask_clear(cpus);
- me = get_cpu();
spin_lock(&kvm->requests_lock);
+ me = smp_processor_id();
kvm_for_each_vcpu(i, vcpu, kvm) {
if (test_and_set_bit(req, &vcpu->requests))
continue;
@@ -757,7 +757,6 @@ static bool make_all_cpus_request(struct kvm *kvm, unsigned int req)
else
called = false;
spin_unlock(&kvm->requests_lock);
- put_cpu();
free_cpumask_var(cpus);
return called;
}
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 36/46] KVM: VMX: Avoid to return ENOTSUPP to userland
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (34 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 35/46] KVM: Drop obsolete cpu_get/put in make_all_cpus_request Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 37/46] KVM: Align cr8 threshold when userspace changes cr8 Avi Kivity
` (9 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Jan Kiszka <jan.kiszka@web.de>
Choose some allowed error values for the cases VMX returned ENOTSUPP so
far as these values could be returned by the KVM_RUN IOCTL.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
---
arch/x86/kvm/vmx.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 686e1ab..c5aaa1b 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3133,7 +3133,7 @@ static int handle_apic_access(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
printk(KERN_ERR
"Fail to handle apic access vmexit! Offset is 0x%lx\n",
offset);
- return -ENOTSUPP;
+ return -ENOEXEC;
}
return 1;
}
@@ -3202,7 +3202,7 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
if (exit_qualification & (1 << 6)) {
printk(KERN_ERR "EPT: GPA exceeds GAW!\n");
- return -ENOTSUPP;
+ return -EINVAL;
}
gla_validity = (exit_qualification >> 7) & 0x3;
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 37/46] KVM: Align cr8 threshold when userspace changes cr8
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (35 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 36/46] KVM: VMX: Avoid to return ENOTSUPP to userland Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 38/46] KVM: limit lapic periodic timer frequency Avi Kivity
` (8 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Mikhail Ershov <arcezed@gmail.com>
Commit f0a3602c20 ("KVM: Move interrupt injection logic to x86.c") does not
update the cr8 intercept if the lapic is disabled, so when userspace updates
cr8, the cr8 threshold control is not updated and we are left with illegal
control fields.
Fix by explicitly resetting the cr8 threshold.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/x86.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2539e9a..d1bcc59 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4441,6 +4441,8 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
kvm_set_segment(vcpu, &sregs->tr, VCPU_SREG_TR);
kvm_set_segment(vcpu, &sregs->ldt, VCPU_SREG_LDTR);
+ update_cr8_intercept(vcpu);
+
/* Older userspace won't unhalt the vcpu on reset. */
if (kvm_vcpu_is_bsp(vcpu) && kvm_rip_read(vcpu) == 0xfff0 &&
sregs->cs.selector == 0xf000 && sregs->cs.base == 0xffff0000 &&
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 38/46] KVM: limit lapic periodic timer frequency
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (36 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 37/46] KVM: Align cr8 threshold when userspace changes cr8 Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 39/46] KVM: fix kvm_init() error handling Avi Kivity
` (7 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Marcelo Tosatti <mtosatti@redhat.com>
Otherwise its possible to starve the host by programming lapic timer
with a very high frequency.
Cc: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/lapic.c | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index fdddf48..ce195f8 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -664,6 +664,15 @@ static void start_apic_timer(struct kvm_lapic *apic)
if (!apic->lapic_timer.period)
return;
+ /*
+ * Do not allow the guest to program periodic timers with small
+ * interval, since the hrtimers are not throttled by the host
+ * scheduler.
+ */
+ if (apic_lvtt_period(apic)) {
+ if (apic->lapic_timer.period < NSEC_PER_MSEC/2)
+ apic->lapic_timer.period = NSEC_PER_MSEC/2;
+ }
hrtimer_start(&apic->lapic_timer.timer,
ktime_add_ns(now, apic->lapic_timer.period),
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 39/46] KVM: fix kvm_init() error handling
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (37 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 38/46] KVM: limit lapic periodic timer frequency Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 40/46] KVM: MMU: make rmap code aware of mapping levels Avi Kivity
` (6 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Remove debugfs file if kvm_arch_init() return error
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
virt/kvm/kvm_main.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 646cf2a..4470251 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2791,8 +2791,8 @@ out_free_0:
__free_page(bad_page);
out:
kvm_arch_exit();
- kvm_exit_debug();
out_fail:
+ kvm_exit_debug();
return r;
}
EXPORT_SYMBOL_GPL(kvm_init);
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 40/46] KVM: MMU: make rmap code aware of mapping levels
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (38 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 39/46] KVM: fix kvm_init() error handling Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 41/46] KVM: MMU: rename is_largepage_backed to mapping_level Avi Kivity
` (5 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Joerg Roedel <joerg.roedel@amd.com>
This patch removes the largepage parameter from the rmap_add function.
Together with rmap_remove this function now uses the role.level field to
find determine if the page is a huge page.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/mmu.c | 53 +++++++++++++++++++++++++++------------------------
1 files changed, 28 insertions(+), 25 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 86c2551..b93ad2c 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -479,19 +479,19 @@ static int is_largepage_backed(struct kvm_vcpu *vcpu, gfn_t large_gfn)
* Note: gfn must be unaliased before this function get called
*/
-static unsigned long *gfn_to_rmap(struct kvm *kvm, gfn_t gfn, int lpage)
+static unsigned long *gfn_to_rmap(struct kvm *kvm, gfn_t gfn, int level)
{
struct kvm_memory_slot *slot;
unsigned long idx;
slot = gfn_to_memslot(kvm, gfn);
- if (!lpage)
+ if (likely(level == PT_PAGE_TABLE_LEVEL))
return &slot->rmap[gfn - slot->base_gfn];
- idx = (gfn / KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL)) -
- (slot->base_gfn / KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL));
+ idx = (gfn / KVM_PAGES_PER_HPAGE(level)) -
+ (slot->base_gfn / KVM_PAGES_PER_HPAGE(level));
- return &slot->lpage_info[0][idx].rmap_pde;
+ return &slot->lpage_info[level - 2][idx].rmap_pde;
}
/*
@@ -507,7 +507,7 @@ static unsigned long *gfn_to_rmap(struct kvm *kvm, gfn_t gfn, int lpage)
* the spte was not added.
*
*/
-static int rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn, int lpage)
+static int rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
{
struct kvm_mmu_page *sp;
struct kvm_rmap_desc *desc;
@@ -519,7 +519,7 @@ static int rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn, int lpage)
gfn = unalias_gfn(vcpu->kvm, gfn);
sp = page_header(__pa(spte));
sp->gfns[spte - sp->spt] = gfn;
- rmapp = gfn_to_rmap(vcpu->kvm, gfn, lpage);
+ rmapp = gfn_to_rmap(vcpu->kvm, gfn, sp->role.level);
if (!*rmapp) {
rmap_printk("rmap_add: %p %llx 0->1\n", spte, *spte);
*rmapp = (unsigned long)spte;
@@ -589,7 +589,7 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
kvm_release_pfn_dirty(pfn);
else
kvm_release_pfn_clean(pfn);
- rmapp = gfn_to_rmap(kvm, sp->gfns[spte - sp->spt], is_large_pte(*spte));
+ rmapp = gfn_to_rmap(kvm, sp->gfns[spte - sp->spt], sp->role.level);
if (!*rmapp) {
printk(KERN_ERR "rmap_remove: %p %llx 0->BUG\n", spte, *spte);
BUG();
@@ -652,10 +652,10 @@ static int rmap_write_protect(struct kvm *kvm, u64 gfn)
{
unsigned long *rmapp;
u64 *spte;
- int write_protected = 0;
+ int i, write_protected = 0;
gfn = unalias_gfn(kvm, gfn);
- rmapp = gfn_to_rmap(kvm, gfn, 0);
+ rmapp = gfn_to_rmap(kvm, gfn, PT_PAGE_TABLE_LEVEL);
spte = rmap_next(kvm, rmapp, NULL);
while (spte) {
@@ -677,21 +677,24 @@ static int rmap_write_protect(struct kvm *kvm, u64 gfn)
}
/* check for huge page mappings */
- rmapp = gfn_to_rmap(kvm, gfn, 1);
- spte = rmap_next(kvm, rmapp, NULL);
- while (spte) {
- BUG_ON(!spte);
- BUG_ON(!(*spte & PT_PRESENT_MASK));
- BUG_ON((*spte & (PT_PAGE_SIZE_MASK|PT_PRESENT_MASK)) != (PT_PAGE_SIZE_MASK|PT_PRESENT_MASK));
- pgprintk("rmap_write_protect(large): spte %p %llx %lld\n", spte, *spte, gfn);
- if (is_writeble_pte(*spte)) {
- rmap_remove(kvm, spte);
- --kvm->stat.lpages;
- __set_spte(spte, shadow_trap_nonpresent_pte);
- spte = NULL;
- write_protected = 1;
+ for (i = PT_DIRECTORY_LEVEL;
+ i < PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES; ++i) {
+ rmapp = gfn_to_rmap(kvm, gfn, i);
+ spte = rmap_next(kvm, rmapp, NULL);
+ while (spte) {
+ BUG_ON(!spte);
+ BUG_ON(!(*spte & PT_PRESENT_MASK));
+ BUG_ON((*spte & (PT_PAGE_SIZE_MASK|PT_PRESENT_MASK)) != (PT_PAGE_SIZE_MASK|PT_PRESENT_MASK));
+ pgprintk("rmap_write_protect(large): spte %p %llx %lld\n", spte, *spte, gfn);
+ if (is_writeble_pte(*spte)) {
+ rmap_remove(kvm, spte);
+ --kvm->stat.lpages;
+ __set_spte(spte, shadow_trap_nonpresent_pte);
+ spte = NULL;
+ write_protected = 1;
+ }
+ spte = rmap_next(kvm, rmapp, spte);
}
- spte = rmap_next(kvm, rmapp, spte);
}
return write_protected;
@@ -1815,7 +1818,7 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
page_header_update_slot(vcpu->kvm, sptep, gfn);
if (!was_rmapped) {
- rmap_count = rmap_add(vcpu, sptep, gfn, largepage);
+ rmap_count = rmap_add(vcpu, sptep, gfn);
if (!is_rmap_spte(*sptep))
kvm_release_pfn_clean(pfn);
if (rmap_count > RMAP_RECYCLE_THRESHOLD)
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 41/46] KVM: MMU: rename is_largepage_backed to mapping_level
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (39 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 40/46] KVM: MMU: make rmap code aware of mapping levels Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 42/46] KVM: MMU: make direct mapping paths aware of mapping levels Avi Kivity
` (4 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Joerg Roedel <joerg.roedel@amd.com>
With the new name and the corresponding backend changes this function
can now support multiple hugepage sizes.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/mmu.c | 100 +++++++++++++++++++++++++++++--------------
arch/x86/kvm/paging_tmpl.h | 4 +-
2 files changed, 69 insertions(+), 35 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index b93ad2c..c707936 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -393,37 +393,52 @@ static void mmu_free_rmap_desc(struct kvm_rmap_desc *rd)
* Return the pointer to the largepage write count for a given
* gfn, handling slots that are not large page aligned.
*/
-static int *slot_largepage_idx(gfn_t gfn, struct kvm_memory_slot *slot)
+static int *slot_largepage_idx(gfn_t gfn,
+ struct kvm_memory_slot *slot,
+ int level)
{
unsigned long idx;
- idx = (gfn / KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL)) -
- (slot->base_gfn / KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL));
- return &slot->lpage_info[0][idx].write_count;
+ idx = (gfn / KVM_PAGES_PER_HPAGE(level)) -
+ (slot->base_gfn / KVM_PAGES_PER_HPAGE(level));
+ return &slot->lpage_info[level - 2][idx].write_count;
}
static void account_shadowed(struct kvm *kvm, gfn_t gfn)
{
+ struct kvm_memory_slot *slot;
int *write_count;
+ int i;
gfn = unalias_gfn(kvm, gfn);
- write_count = slot_largepage_idx(gfn,
- gfn_to_memslot_unaliased(kvm, gfn));
- *write_count += 1;
+
+ slot = gfn_to_memslot_unaliased(kvm, gfn);
+ for (i = PT_DIRECTORY_LEVEL;
+ i < PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES; ++i) {
+ write_count = slot_largepage_idx(gfn, slot, i);
+ *write_count += 1;
+ }
}
static void unaccount_shadowed(struct kvm *kvm, gfn_t gfn)
{
+ struct kvm_memory_slot *slot;
int *write_count;
+ int i;
gfn = unalias_gfn(kvm, gfn);
- write_count = slot_largepage_idx(gfn,
- gfn_to_memslot_unaliased(kvm, gfn));
- *write_count -= 1;
- WARN_ON(*write_count < 0);
+ for (i = PT_DIRECTORY_LEVEL;
+ i < PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES; ++i) {
+ slot = gfn_to_memslot_unaliased(kvm, gfn);
+ write_count = slot_largepage_idx(gfn, slot, i);
+ *write_count -= 1;
+ WARN_ON(*write_count < 0);
+ }
}
-static int has_wrprotected_page(struct kvm *kvm, gfn_t gfn)
+static int has_wrprotected_page(struct kvm *kvm,
+ gfn_t gfn,
+ int level)
{
struct kvm_memory_slot *slot;
int *largepage_idx;
@@ -431,47 +446,67 @@ static int has_wrprotected_page(struct kvm *kvm, gfn_t gfn)
gfn = unalias_gfn(kvm, gfn);
slot = gfn_to_memslot_unaliased(kvm, gfn);
if (slot) {
- largepage_idx = slot_largepage_idx(gfn, slot);
+ largepage_idx = slot_largepage_idx(gfn, slot, level);
return *largepage_idx;
}
return 1;
}
-static int host_largepage_backed(struct kvm *kvm, gfn_t gfn)
+static int host_mapping_level(struct kvm *kvm, gfn_t gfn)
{
+ unsigned long page_size = PAGE_SIZE;
struct vm_area_struct *vma;
unsigned long addr;
- int ret = 0;
+ int i, ret = 0;
addr = gfn_to_hva(kvm, gfn);
if (kvm_is_error_hva(addr))
- return ret;
+ return page_size;
down_read(¤t->mm->mmap_sem);
vma = find_vma(current->mm, addr);
- if (vma && is_vm_hugetlb_page(vma))
- ret = 1;
+ if (!vma)
+ goto out;
+
+ page_size = vma_kernel_pagesize(vma);
+
+out:
up_read(¤t->mm->mmap_sem);
+ for (i = PT_PAGE_TABLE_LEVEL;
+ i < (PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES); ++i) {
+ if (page_size >= KVM_HPAGE_SIZE(i))
+ ret = i;
+ else
+ break;
+ }
+
return ret;
}
-static int is_largepage_backed(struct kvm_vcpu *vcpu, gfn_t large_gfn)
+static int mapping_level(struct kvm_vcpu *vcpu, gfn_t large_gfn)
{
struct kvm_memory_slot *slot;
-
- if (has_wrprotected_page(vcpu->kvm, large_gfn))
- return 0;
-
- if (!host_largepage_backed(vcpu->kvm, large_gfn))
- return 0;
+ int host_level;
+ int level = PT_PAGE_TABLE_LEVEL;
slot = gfn_to_memslot(vcpu->kvm, large_gfn);
if (slot && slot->dirty_bitmap)
- return 0;
+ return PT_PAGE_TABLE_LEVEL;
- return 1;
+ host_level = host_mapping_level(vcpu->kvm, large_gfn);
+
+ if (host_level == PT_PAGE_TABLE_LEVEL)
+ return host_level;
+
+ for (level = PT_DIRECTORY_LEVEL; level <= host_level; ++level) {
+
+ if (has_wrprotected_page(vcpu->kvm, large_gfn, level))
+ break;
+ }
+
+ return level - 1;
}
/*
@@ -1733,7 +1768,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
if ((pte_access & ACC_WRITE_MASK)
|| (write_fault && !is_write_protection(vcpu) && !user_fault)) {
- if (largepage && has_wrprotected_page(vcpu->kvm, gfn)) {
+ if (largepage && has_wrprotected_page(vcpu->kvm, gfn, 1)) {
ret = 1;
spte = shadow_trap_nonpresent_pte;
goto set_pte;
@@ -1884,8 +1919,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn)
pfn_t pfn;
unsigned long mmu_seq;
- if (is_largepage_backed(vcpu, gfn &
- ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1))) {
+ if (mapping_level(vcpu, gfn) == PT_DIRECTORY_LEVEL) {
gfn &= ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1);
largepage = 1;
}
@@ -2091,8 +2125,7 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa,
if (r)
return r;
- if (is_largepage_backed(vcpu, gfn &
- ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1))) {
+ if (mapping_level(vcpu, gfn) == PT_DIRECTORY_LEVEL) {
gfn &= ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1);
largepage = 1;
}
@@ -2494,7 +2527,8 @@ static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
return;
gfn = (gpte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
- if (is_large_pte(gpte) && is_largepage_backed(vcpu, gfn)) {
+ if (is_large_pte(gpte) &&
+ (mapping_level(vcpu, gfn) == PT_DIRECTORY_LEVEL)) {
gfn &= ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1);
vcpu->arch.update_pte.largepage = 1;
}
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 36ac6d7..44f0346 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -407,8 +407,8 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr,
if (walker.level == PT_DIRECTORY_LEVEL) {
gfn_t large_gfn;
large_gfn = walker.gfn &
- ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1);
- if (is_largepage_backed(vcpu, large_gfn)) {
+ ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1);
+ if (mapping_level(vcpu, large_gfn) == PT_DIRECTORY_LEVEL) {
walker.gfn = large_gfn;
largepage = 1;
}
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 42/46] KVM: MMU: make direct mapping paths aware of mapping levels
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (40 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 41/46] KVM: MMU: rename is_largepage_backed to mapping_level Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 43/46] KVM: MMU: make page walker " Avi Kivity
` (3 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/kvm/mmu.c | 83 +++++++++++++++++++++++----------------
arch/x86/kvm/paging_tmpl.h | 6 +-
3 files changed, 53 insertions(+), 38 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e210b21..e09dc26 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -315,7 +315,7 @@ struct kvm_vcpu_arch {
struct {
gfn_t gfn; /* presumed gfn during guest pte update */
pfn_t pfn; /* pfn corresponding to that gfn */
- int largepage;
+ int level;
unsigned long mmu_seq;
} update_pte;
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c707936..110c224 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -257,7 +257,7 @@ static int is_last_spte(u64 pte, int level)
{
if (level == PT_PAGE_TABLE_LEVEL)
return 1;
- if (level == PT_DIRECTORY_LEVEL && is_large_pte(pte))
+ if (is_large_pte(pte))
return 1;
return 0;
}
@@ -753,7 +753,7 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp)
static int kvm_handle_hva(struct kvm *kvm, unsigned long hva,
int (*handler)(struct kvm *kvm, unsigned long *rmapp))
{
- int i;
+ int i, j;
int retval = 0;
/*
@@ -772,11 +772,15 @@ static int kvm_handle_hva(struct kvm *kvm, unsigned long hva,
end = start + (memslot->npages << PAGE_SHIFT);
if (hva >= start && hva < end) {
gfn_t gfn_offset = (hva - start) >> PAGE_SHIFT;
- int idx = gfn_offset /
- KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL);
+
retval |= handler(kvm, &memslot->rmap[gfn_offset]);
- retval |= handler(kvm,
- &memslot->lpage_info[0][idx].rmap_pde);
+
+ for (j = 0; j < KVM_NR_PAGE_SIZES - 1; ++j) {
+ int idx = gfn_offset;
+ idx /= KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL + j);
+ retval |= handler(kvm,
+ &memslot->lpage_info[j][idx].rmap_pde);
+ }
}
}
@@ -814,12 +818,15 @@ static int kvm_age_rmapp(struct kvm *kvm, unsigned long *rmapp)
#define RMAP_RECYCLE_THRESHOLD 1000
-static void rmap_recycle(struct kvm_vcpu *vcpu, gfn_t gfn, int lpage)
+static void rmap_recycle(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
{
unsigned long *rmapp;
+ struct kvm_mmu_page *sp;
+
+ sp = page_header(__pa(spte));
gfn = unalias_gfn(vcpu->kvm, gfn);
- rmapp = gfn_to_rmap(vcpu->kvm, gfn, lpage);
+ rmapp = gfn_to_rmap(vcpu->kvm, gfn, sp->role.level);
kvm_unmap_rmapp(vcpu->kvm, rmapp);
kvm_flush_remote_tlbs(vcpu->kvm);
@@ -1734,7 +1741,7 @@ static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
unsigned pte_access, int user_fault,
- int write_fault, int dirty, int largepage,
+ int write_fault, int dirty, int level,
gfn_t gfn, pfn_t pfn, bool speculative,
bool can_unsync)
{
@@ -1757,7 +1764,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
spte |= shadow_nx_mask;
if (pte_access & ACC_USER_MASK)
spte |= shadow_user_mask;
- if (largepage)
+ if (level > PT_PAGE_TABLE_LEVEL)
spte |= PT_PAGE_SIZE_MASK;
if (tdp_enabled)
spte |= kvm_x86_ops->get_mt_mask(vcpu, gfn,
@@ -1768,7 +1775,8 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
if ((pte_access & ACC_WRITE_MASK)
|| (write_fault && !is_write_protection(vcpu) && !user_fault)) {
- if (largepage && has_wrprotected_page(vcpu->kvm, gfn, 1)) {
+ if (level > PT_PAGE_TABLE_LEVEL &&
+ has_wrprotected_page(vcpu->kvm, gfn, level)) {
ret = 1;
spte = shadow_trap_nonpresent_pte;
goto set_pte;
@@ -1806,7 +1814,7 @@ set_pte:
static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
unsigned pt_access, unsigned pte_access,
int user_fault, int write_fault, int dirty,
- int *ptwrite, int largepage, gfn_t gfn,
+ int *ptwrite, int level, gfn_t gfn,
pfn_t pfn, bool speculative)
{
int was_rmapped = 0;
@@ -1823,7 +1831,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
* If we overwrite a PTE page pointer with a 2MB PMD, unlink
* the parent of the now unreachable PTE.
*/
- if (largepage && !is_large_pte(*sptep)) {
+ if (level > PT_PAGE_TABLE_LEVEL &&
+ !is_large_pte(*sptep)) {
struct kvm_mmu_page *child;
u64 pte = *sptep;
@@ -1836,8 +1845,9 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
} else
was_rmapped = 1;
}
+
if (set_spte(vcpu, sptep, pte_access, user_fault, write_fault,
- dirty, largepage, gfn, pfn, speculative, true)) {
+ dirty, level, gfn, pfn, speculative, true)) {
if (write_fault)
*ptwrite = 1;
kvm_x86_ops->tlb_flush(vcpu);
@@ -1857,7 +1867,7 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
if (!is_rmap_spte(*sptep))
kvm_release_pfn_clean(pfn);
if (rmap_count > RMAP_RECYCLE_THRESHOLD)
- rmap_recycle(vcpu, gfn, largepage);
+ rmap_recycle(vcpu, sptep, gfn);
} else {
if (was_writeble)
kvm_release_pfn_dirty(pfn);
@@ -1875,7 +1885,7 @@ static void nonpaging_new_cr3(struct kvm_vcpu *vcpu)
}
static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
- int largepage, gfn_t gfn, pfn_t pfn)
+ int level, gfn_t gfn, pfn_t pfn)
{
struct kvm_shadow_walk_iterator iterator;
struct kvm_mmu_page *sp;
@@ -1883,11 +1893,10 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
gfn_t pseudo_gfn;
for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
- if (iterator.level == PT_PAGE_TABLE_LEVEL
- || (largepage && iterator.level == PT_DIRECTORY_LEVEL)) {
+ if (iterator.level == level) {
mmu_set_spte(vcpu, iterator.sptep, ACC_ALL, ACC_ALL,
0, write, 1, &pt_write,
- largepage, gfn, pfn, false);
+ level, gfn, pfn, false);
++vcpu->stat.pf_fixed;
break;
}
@@ -1915,14 +1924,20 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn)
{
int r;
- int largepage = 0;
+ int level;
pfn_t pfn;
unsigned long mmu_seq;
- if (mapping_level(vcpu, gfn) == PT_DIRECTORY_LEVEL) {
- gfn &= ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1);
- largepage = 1;
- }
+ level = mapping_level(vcpu, gfn);
+
+ /*
+ * This path builds a PAE pagetable - so we can map 2mb pages at
+ * maximum. Therefore check if the level is larger than that.
+ */
+ if (level > PT_DIRECTORY_LEVEL)
+ level = PT_DIRECTORY_LEVEL;
+
+ gfn &= ~(KVM_PAGES_PER_HPAGE(level) - 1);
mmu_seq = vcpu->kvm->mmu_notifier_seq;
smp_rmb();
@@ -1938,7 +1953,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn)
if (mmu_notifier_retry(vcpu, mmu_seq))
goto out_unlock;
kvm_mmu_free_some_pages(vcpu);
- r = __direct_map(vcpu, v, write, largepage, gfn, pfn);
+ r = __direct_map(vcpu, v, write, level, gfn, pfn);
spin_unlock(&vcpu->kvm->mmu_lock);
@@ -2114,7 +2129,7 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa,
{
pfn_t pfn;
int r;
- int largepage = 0;
+ int level;
gfn_t gfn = gpa >> PAGE_SHIFT;
unsigned long mmu_seq;
@@ -2125,10 +2140,10 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa,
if (r)
return r;
- if (mapping_level(vcpu, gfn) == PT_DIRECTORY_LEVEL) {
- gfn &= ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1);
- largepage = 1;
- }
+ level = mapping_level(vcpu, gfn);
+
+ gfn &= ~(KVM_PAGES_PER_HPAGE(level) - 1);
+
mmu_seq = vcpu->kvm->mmu_notifier_seq;
smp_rmb();
pfn = gfn_to_pfn(vcpu->kvm, gfn);
@@ -2141,7 +2156,7 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa,
goto out_unlock;
kvm_mmu_free_some_pages(vcpu);
r = __direct_map(vcpu, gpa, error_code & PFERR_WRITE_MASK,
- largepage, gfn, pfn);
+ level, gfn, pfn);
spin_unlock(&vcpu->kvm->mmu_lock);
return r;
@@ -2448,7 +2463,7 @@ static void mmu_pte_write_new_pte(struct kvm_vcpu *vcpu,
const void *new)
{
if (sp->role.level != PT_PAGE_TABLE_LEVEL) {
- if (!vcpu->arch.update_pte.largepage ||
+ if (vcpu->arch.update_pte.level == PT_PAGE_TABLE_LEVEL ||
sp->role.glevels == PT32_ROOT_LEVEL) {
++vcpu->kvm->stat.mmu_pde_zapped;
return;
@@ -2498,7 +2513,7 @@ static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
u64 gpte = 0;
pfn_t pfn;
- vcpu->arch.update_pte.largepage = 0;
+ vcpu->arch.update_pte.level = PT_PAGE_TABLE_LEVEL;
if (bytes != 4 && bytes != 8)
return;
@@ -2530,7 +2545,7 @@ static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
if (is_large_pte(gpte) &&
(mapping_level(vcpu, gfn) == PT_DIRECTORY_LEVEL)) {
gfn &= ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1);
- vcpu->arch.update_pte.largepage = 1;
+ vcpu->arch.update_pte.level = PT_DIRECTORY_LEVEL;
}
vcpu->arch.update_pte.mmu_seq = vcpu->kvm->mmu_notifier_seq;
smp_rmb();
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 44f0346..b167f0d 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -253,7 +253,7 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
pt_element_t gpte;
unsigned pte_access;
pfn_t pfn;
- int largepage = vcpu->arch.update_pte.largepage;
+ int level = vcpu->arch.update_pte.level;
gpte = *(const pt_element_t *)pte;
if (~gpte & (PT_PRESENT_MASK | PT_ACCESSED_MASK)) {
@@ -272,7 +272,7 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
return;
kvm_get_pfn(pfn);
mmu_set_spte(vcpu, spte, page->role.access, pte_access, 0, 0,
- gpte & PT_DIRTY_MASK, NULL, largepage,
+ gpte & PT_DIRTY_MASK, NULL, level,
gpte_to_gfn(gpte), pfn, true);
}
@@ -306,7 +306,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
gw->pte_access & access,
user_fault, write_fault,
gw->ptes[gw->level-1] & PT_DIRTY_MASK,
- ptwrite, largepage,
+ ptwrite, level,
gw->gfn, pfn, false);
break;
}
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 43/46] KVM: MMU: make page walker aware of mapping levels
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (41 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 42/46] KVM: MMU: make direct mapping paths aware of mapping levels Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 44/46] KVM: MMU: shadow support for 1gb pages Avi Kivity
` (2 subsequent siblings)
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Joerg Roedel <joerg.roedel@amd.com>
The page walker may be used with nested paging too when accessing mmio
areas. Make it support the additional page-level too.
[ Marcelo: fix reserved bit check for 1gb pte ]
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/mmu.c | 17 +++++++++++++-
arch/x86/kvm/paging_tmpl.h | 52 +++++++++++++++++++++++--------------------
2 files changed, 44 insertions(+), 25 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 110c224..09ab643 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -108,6 +108,9 @@ module_param(oos_shadow, bool, 0644);
#define PT32_LEVEL_MASK(level) \
(((1ULL << PT32_LEVEL_BITS) - 1) << PT32_LEVEL_SHIFT(level))
+#define PT32_LVL_OFFSET_MASK(level) \
+ (PT32_BASE_ADDR_MASK & ((1ULL << (PAGE_SHIFT + (((level) - 1) \
+ * PT32_LEVEL_BITS))) - 1))
#define PT32_INDEX(address, level)\
(((address) >> PT32_LEVEL_SHIFT(level)) & ((1 << PT32_LEVEL_BITS) - 1))
@@ -116,10 +119,19 @@ module_param(oos_shadow, bool, 0644);
#define PT64_BASE_ADDR_MASK (((1ULL << 52) - 1) & ~(u64)(PAGE_SIZE-1))
#define PT64_DIR_BASE_ADDR_MASK \
(PT64_BASE_ADDR_MASK & ~((1ULL << (PAGE_SHIFT + PT64_LEVEL_BITS)) - 1))
+#define PT64_LVL_ADDR_MASK(level) \
+ (PT64_BASE_ADDR_MASK & ~((1ULL << (PAGE_SHIFT + (((level) - 1) \
+ * PT64_LEVEL_BITS))) - 1))
+#define PT64_LVL_OFFSET_MASK(level) \
+ (PT64_BASE_ADDR_MASK & ((1ULL << (PAGE_SHIFT + (((level) - 1) \
+ * PT64_LEVEL_BITS))) - 1))
#define PT32_BASE_ADDR_MASK PAGE_MASK
#define PT32_DIR_BASE_ADDR_MASK \
(PAGE_MASK & ~((1ULL << (PAGE_SHIFT + PT32_LEVEL_BITS)) - 1))
+#define PT32_LVL_ADDR_MASK(level) \
+ (PAGE_MASK & ~((1ULL << (PAGE_SHIFT + (((level) - 1) \
+ * PT32_LEVEL_BITS))) - 1))
#define PT64_PERM_MASK (PT_PRESENT_MASK | PT_WRITABLE_MASK | PT_USER_MASK \
| PT64_NX_MASK)
@@ -130,6 +142,7 @@ module_param(oos_shadow, bool, 0644);
#define PFERR_RSVD_MASK (1U << 3)
#define PFERR_FETCH_MASK (1U << 4)
+#define PT_PDPE_LEVEL 3
#define PT_DIRECTORY_LEVEL 2
#define PT_PAGE_TABLE_LEVEL 1
@@ -2273,7 +2286,9 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, int level)
context->rsvd_bits_mask[0][0] = exb_bit_rsvd |
rsvd_bits(maxphyaddr, 51);
context->rsvd_bits_mask[1][3] = context->rsvd_bits_mask[0][3];
- context->rsvd_bits_mask[1][2] = context->rsvd_bits_mask[0][2];
+ context->rsvd_bits_mask[1][2] = exb_bit_rsvd |
+ rsvd_bits(maxphyaddr, 51) |
+ rsvd_bits(13, 29);
context->rsvd_bits_mask[1][1] = exb_bit_rsvd |
rsvd_bits(maxphyaddr, 51) |
rsvd_bits(13, 20); /* large page */
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index b167f0d..578276e 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -27,7 +27,8 @@
#define guest_walker guest_walker64
#define FNAME(name) paging##64_##name
#define PT_BASE_ADDR_MASK PT64_BASE_ADDR_MASK
- #define PT_DIR_BASE_ADDR_MASK PT64_DIR_BASE_ADDR_MASK
+ #define PT_LVL_ADDR_MASK(lvl) PT64_LVL_ADDR_MASK(lvl)
+ #define PT_LVL_OFFSET_MASK(lvl) PT64_LVL_OFFSET_MASK(lvl)
#define PT_INDEX(addr, level) PT64_INDEX(addr, level)
#define PT_LEVEL_MASK(level) PT64_LEVEL_MASK(level)
#define PT_LEVEL_BITS PT64_LEVEL_BITS
@@ -43,7 +44,8 @@
#define guest_walker guest_walker32
#define FNAME(name) paging##32_##name
#define PT_BASE_ADDR_MASK PT32_BASE_ADDR_MASK
- #define PT_DIR_BASE_ADDR_MASK PT32_DIR_BASE_ADDR_MASK
+ #define PT_LVL_ADDR_MASK(lvl) PT32_LVL_ADDR_MASK(lvl)
+ #define PT_LVL_OFFSET_MASK(lvl) PT32_LVL_OFFSET_MASK(lvl)
#define PT_INDEX(addr, level) PT32_INDEX(addr, level)
#define PT_LEVEL_MASK(level) PT32_LEVEL_MASK(level)
#define PT_LEVEL_BITS PT32_LEVEL_BITS
@@ -53,8 +55,8 @@
#error Invalid PTTYPE value
#endif
-#define gpte_to_gfn FNAME(gpte_to_gfn)
-#define gpte_to_gfn_pde FNAME(gpte_to_gfn_pde)
+#define gpte_to_gfn_lvl FNAME(gpte_to_gfn_lvl)
+#define gpte_to_gfn(pte) gpte_to_gfn_lvl((pte), PT_PAGE_TABLE_LEVEL)
/*
* The guest_walker structure emulates the behavior of the hardware page
@@ -71,14 +73,9 @@ struct guest_walker {
u32 error_code;
};
-static gfn_t gpte_to_gfn(pt_element_t gpte)
+static gfn_t gpte_to_gfn_lvl(pt_element_t gpte, int lvl)
{
- return (gpte & PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
-}
-
-static gfn_t gpte_to_gfn_pde(pt_element_t gpte)
-{
- return (gpte & PT_DIR_BASE_ADDR_MASK) >> PAGE_SHIFT;
+ return (gpte & PT_LVL_ADDR_MASK(lvl)) >> PAGE_SHIFT;
}
static bool FNAME(cmpxchg_gpte)(struct kvm *kvm,
@@ -189,18 +186,24 @@ walk:
walker->ptes[walker->level - 1] = pte;
- if (walker->level == PT_PAGE_TABLE_LEVEL) {
- walker->gfn = gpte_to_gfn(pte);
- break;
- }
-
- if (walker->level == PT_DIRECTORY_LEVEL
- && (pte & PT_PAGE_SIZE_MASK)
- && (PTTYPE == 64 || is_pse(vcpu))) {
- walker->gfn = gpte_to_gfn_pde(pte);
- walker->gfn += PT_INDEX(addr, PT_PAGE_TABLE_LEVEL);
- if (PTTYPE == 32 && is_cpuid_PSE36())
+ if ((walker->level == PT_PAGE_TABLE_LEVEL) ||
+ ((walker->level == PT_DIRECTORY_LEVEL) &&
+ (pte & PT_PAGE_SIZE_MASK) &&
+ (PTTYPE == 64 || is_pse(vcpu))) ||
+ ((walker->level == PT_PDPE_LEVEL) &&
+ (pte & PT_PAGE_SIZE_MASK) &&
+ is_long_mode(vcpu))) {
+ int lvl = walker->level;
+
+ walker->gfn = gpte_to_gfn_lvl(pte, lvl);
+ walker->gfn += (addr & PT_LVL_OFFSET_MASK(lvl))
+ >> PAGE_SHIFT;
+
+ if (PTTYPE == 32 &&
+ walker->level == PT_DIRECTORY_LEVEL &&
+ is_cpuid_PSE36())
walker->gfn += pse36_gfn_delta(pte);
+
break;
}
@@ -609,9 +612,10 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
#undef PT_BASE_ADDR_MASK
#undef PT_INDEX
#undef PT_LEVEL_MASK
-#undef PT_DIR_BASE_ADDR_MASK
+#undef PT_LVL_ADDR_MASK
+#undef PT_LVL_OFFSET_MASK
#undef PT_LEVEL_BITS
#undef PT_MAX_FULL_LEVELS
#undef gpte_to_gfn
-#undef gpte_to_gfn_pde
+#undef gpte_to_gfn_lvl
#undef CMPXCHG
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 44/46] KVM: MMU: shadow support for 1gb pages
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (42 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 43/46] KVM: MMU: make page walker " Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 45/46] KVM: MMU: enable gbpages by increasing nr of pagesizes Avi Kivity
2009-08-23 11:56 ` [PATCH 46/46] KVM: report 1GB page support to userspace Avi Kivity
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Joerg Roedel <joerg.roedel@amd.com>
This patch adds support for shadow paging to the 1gb page table code in KVM.
With this code the guest can use 1gb pages even if the host does not support
them.
[ Marcelo: fix shadow page collision on pmd level if a guest 1gb page is mapped
with 4kb ptes on host level ]
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/include/asm/kvm_host.h | 1 -
arch/x86/kvm/mmu.c | 14 +----------
arch/x86/kvm/paging_tmpl.h | 43 ++++++++++++++++++--------------------
3 files changed, 22 insertions(+), 36 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e09dc26..c9fb2bc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -315,7 +315,6 @@ struct kvm_vcpu_arch {
struct {
gfn_t gfn; /* presumed gfn during guest pte update */
pfn_t pfn; /* pfn corresponding to that gfn */
- int level;
unsigned long mmu_seq;
} update_pte;
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 09ab643..1249c12 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2478,11 +2478,8 @@ static void mmu_pte_write_new_pte(struct kvm_vcpu *vcpu,
const void *new)
{
if (sp->role.level != PT_PAGE_TABLE_LEVEL) {
- if (vcpu->arch.update_pte.level == PT_PAGE_TABLE_LEVEL ||
- sp->role.glevels == PT32_ROOT_LEVEL) {
- ++vcpu->kvm->stat.mmu_pde_zapped;
- return;
- }
+ ++vcpu->kvm->stat.mmu_pde_zapped;
+ return;
}
++vcpu->kvm->stat.mmu_pte_updated;
@@ -2528,8 +2525,6 @@ static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
u64 gpte = 0;
pfn_t pfn;
- vcpu->arch.update_pte.level = PT_PAGE_TABLE_LEVEL;
-
if (bytes != 4 && bytes != 8)
return;
@@ -2557,11 +2552,6 @@ static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
return;
gfn = (gpte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
- if (is_large_pte(gpte) &&
- (mapping_level(vcpu, gfn) == PT_DIRECTORY_LEVEL)) {
- gfn &= ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1);
- vcpu->arch.update_pte.level = PT_DIRECTORY_LEVEL;
- }
vcpu->arch.update_pte.mmu_seq = vcpu->kvm->mmu_notifier_seq;
smp_rmb();
pfn = gfn_to_pfn(vcpu->kvm, gfn);
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 578276e..d2fec9c 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -256,7 +256,6 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
pt_element_t gpte;
unsigned pte_access;
pfn_t pfn;
- int level = vcpu->arch.update_pte.level;
gpte = *(const pt_element_t *)pte;
if (~gpte & (PT_PRESENT_MASK | PT_ACCESSED_MASK)) {
@@ -275,7 +274,7 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
return;
kvm_get_pfn(pfn);
mmu_set_spte(vcpu, spte, page->role.access, pte_access, 0, 0,
- gpte & PT_DIRTY_MASK, NULL, level,
+ gpte & PT_DIRTY_MASK, NULL, PT_PAGE_TABLE_LEVEL,
gpte_to_gfn(gpte), pfn, true);
}
@@ -284,7 +283,7 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
*/
static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
struct guest_walker *gw,
- int user_fault, int write_fault, int largepage,
+ int user_fault, int write_fault, int hlevel,
int *ptwrite, pfn_t pfn)
{
unsigned access = gw->pt_access;
@@ -303,8 +302,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
for_each_shadow_entry(vcpu, addr, iterator) {
level = iterator.level;
sptep = iterator.sptep;
- if (level == PT_PAGE_TABLE_LEVEL
- || (largepage && level == PT_DIRECTORY_LEVEL)) {
+ if (iterator.level == hlevel) {
mmu_set_spte(vcpu, sptep, access,
gw->pte_access & access,
user_fault, write_fault,
@@ -323,12 +321,15 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
kvm_flush_remote_tlbs(vcpu->kvm);
}
- if (level == PT_DIRECTORY_LEVEL
- && gw->level == PT_DIRECTORY_LEVEL) {
+ if (level <= gw->level) {
+ int delta = level - gw->level + 1;
direct = 1;
- if (!is_dirty_gpte(gw->ptes[level - 1]))
+ if (!is_dirty_gpte(gw->ptes[level - delta]))
access &= ~ACC_WRITE_MASK;
- table_gfn = gpte_to_gfn(gw->ptes[level - 1]);
+ table_gfn = gpte_to_gfn(gw->ptes[level - delta]);
+ /* advance table_gfn when emulating 1gb pages with 4k */
+ if (delta == 0)
+ table_gfn += PT_INDEX(addr, level);
} else {
direct = 0;
table_gfn = gw->table_gfn[level - 2];
@@ -381,7 +382,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr,
int write_pt = 0;
int r;
pfn_t pfn;
- int largepage = 0;
+ int level = PT_PAGE_TABLE_LEVEL;
unsigned long mmu_seq;
pgprintk("%s: addr %lx err %x\n", __func__, addr, error_code);
@@ -407,15 +408,11 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr,
return 0;
}
- if (walker.level == PT_DIRECTORY_LEVEL) {
- gfn_t large_gfn;
- large_gfn = walker.gfn &
- ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1);
- if (mapping_level(vcpu, large_gfn) == PT_DIRECTORY_LEVEL) {
- walker.gfn = large_gfn;
- largepage = 1;
- }
+ if (walker.level >= PT_DIRECTORY_LEVEL) {
+ level = min(walker.level, mapping_level(vcpu, walker.gfn));
+ walker.gfn = walker.gfn & ~(KVM_PAGES_PER_HPAGE(level) - 1);
}
+
mmu_seq = vcpu->kvm->mmu_notifier_seq;
smp_rmb();
pfn = gfn_to_pfn(vcpu->kvm, walker.gfn);
@@ -432,8 +429,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr,
goto out_unlock;
kvm_mmu_free_some_pages(vcpu);
sptep = FNAME(fetch)(vcpu, addr, &walker, user_fault, write_fault,
- largepage, &write_pt, pfn);
-
+ level, &write_pt, pfn);
pgprintk("%s: shadow pte %p %llx ptwrite %d\n", __func__,
sptep, *sptep, write_pt);
@@ -468,8 +464,9 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
sptep = iterator.sptep;
/* FIXME: properly handle invlpg on large guest pages */
- if (level == PT_PAGE_TABLE_LEVEL ||
- ((level == PT_DIRECTORY_LEVEL) && is_large_pte(*sptep))) {
+ if (level == PT_PAGE_TABLE_LEVEL ||
+ ((level == PT_DIRECTORY_LEVEL && is_large_pte(*sptep))) ||
+ ((level == PT_PDPE_LEVEL && is_large_pte(*sptep)))) {
struct kvm_mmu_page *sp = page_header(__pa(sptep));
pte_gpa = (sp->gfn << PAGE_SHIFT);
@@ -599,7 +596,7 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
nr_present++;
pte_access = sp->role.access & FNAME(gpte_access)(vcpu, gpte);
set_spte(vcpu, &sp->spt[i], pte_access, 0, 0,
- is_dirty_gpte(gpte), 0, gfn,
+ is_dirty_gpte(gpte), PT_PAGE_TABLE_LEVEL, gfn,
spte_to_pfn(sp->spt[i]), true, false);
}
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 45/46] KVM: MMU: enable gbpages by increasing nr of pagesizes
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (43 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 44/46] KVM: MMU: shadow support for 1gb pages Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
2009-08-23 11:56 ` [PATCH 46/46] KVM: report 1GB page support to userspace Avi Kivity
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/include/asm/kvm_host.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c9fb2bc..3315efa 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -55,7 +55,7 @@
#define UNMAPPED_GVA (~(gpa_t)0)
/* KVM Hugepage definitions for x86 */
-#define KVM_NR_PAGE_SIZES 2
+#define KVM_NR_PAGE_SIZES 3
#define KVM_HPAGE_SHIFT(x) (PAGE_SHIFT + (((x) - 1) * 9))
#define KVM_HPAGE_SIZE(x) (1UL << KVM_HPAGE_SHIFT(x))
#define KVM_HPAGE_MASK(x) (~(KVM_HPAGE_SIZE(x) - 1))
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 46/46] KVM: report 1GB page support to userspace
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
` (44 preceding siblings ...)
2009-08-23 11:56 ` [PATCH 45/46] KVM: MMU: enable gbpages by increasing nr of pagesizes Avi Kivity
@ 2009-08-23 11:56 ` Avi Kivity
45 siblings, 0 replies; 47+ messages in thread
From: Avi Kivity @ 2009-08-23 11:56 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel
From: Joerg Roedel <joerg.roedel@amd.com>
If userspace knows that the kernel part supports 1GB pages it can enable
the corresponding cpuid bit so that guests actually use GB pages.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/svm.c | 6 ++++++
arch/x86/kvm/vmx.c | 6 ++++++
arch/x86/kvm/x86.c | 3 ++-
4 files changed, 16 insertions(+), 1 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3315efa..b17d845 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -528,6 +528,8 @@ struct kvm_x86_ops {
int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
int (*get_tdp_level)(void);
u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
+ bool (*gb_page_enable)(void);
+
const struct trace_print_flags *exit_reasons_str;
};
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 92fc0da..10e718d 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2754,6 +2754,11 @@ static const struct trace_print_flags svm_exit_reasons_str[] = {
{ -1, NULL }
};
+static bool svm_gb_page_enable(void)
+{
+ return true;
+}
+
static struct kvm_x86_ops svm_x86_ops = {
.cpu_has_kvm_support = has_svm,
.disabled_by_bios = is_disabled,
@@ -2817,6 +2822,7 @@ static struct kvm_x86_ops svm_x86_ops = {
.get_mt_mask = svm_get_mt_mask,
.exit_reasons_str = svm_exit_reasons_str,
+ .gb_page_enable = svm_gb_page_enable,
};
static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c5aaa1b..32e6d20 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3908,6 +3908,11 @@ static const struct trace_print_flags vmx_exit_reasons_str[] = {
{ -1, NULL }
};
+static bool vmx_gb_page_enable(void)
+{
+ return false;
+}
+
static struct kvm_x86_ops vmx_x86_ops = {
.cpu_has_kvm_support = cpu_has_kvm_support,
.disabled_by_bios = vmx_disabled_by_bios,
@@ -3969,6 +3974,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
.get_mt_mask = vmx_get_mt_mask,
.exit_reasons_str = vmx_exit_reasons_str,
+ .gb_page_enable = vmx_gb_page_enable,
};
static int __init vmx_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d1bcc59..b407d3c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1444,6 +1444,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
u32 index, int *nent, int maxnent)
{
unsigned f_nx = is_efer_nx() ? F(NX) : 0;
+ unsigned f_gbpages = kvm_x86_ops->gb_page_enable() ? F(GBPAGES) : 0;
#ifdef CONFIG_X86_64
unsigned f_lm = F(LM);
#else
@@ -1468,7 +1469,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
F(MTRR) | F(PGE) | F(MCA) | F(CMOV) |
F(PAT) | F(PSE36) | 0 /* Reserved */ |
f_nx | 0 /* Reserved */ | F(MMXEXT) | F(MMX) |
- F(FXSR) | F(FXSR_OPT) | 0 /* GBPAGES */ | 0 /* RDTSCP */ |
+ F(FXSR) | F(FXSR_OPT) | f_gbpages | 0 /* RDTSCP */ |
0 /* Reserved */ | f_lm | F(3DNOWEXT) | F(3DNOW);
/* cpuid 1.ecx */
const u32 kvm_supported_word4_x86_features =
--
1.6.4.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
end of thread, other threads:[~2009-08-23 12:07 UTC | newest]
Thread overview: 47+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-08-23 11:55 [PATCH 00/46] KVM updates for 2.6.32 merge window (3/4) Avi Kivity
2009-08-23 11:56 ` [PATCH 01/46] KVM: Trace irq level and source id Avi Kivity
2009-08-23 11:56 ` [PATCH 02/46] KVM: Ignore PCI ECS I/O enablement Avi Kivity
2009-08-23 11:56 ` [PATCH 03/46] KVM: Trace mmio Avi Kivity
2009-08-23 11:56 ` [PATCH 04/46] KVM: Trace apic registers using their symbolic names Avi Kivity
2009-08-23 11:56 ` [PATCH 05/46] KVM: Add Directed EOI support to APIC emulation Avi Kivity
2009-08-23 11:56 ` [PATCH 06/46] KVM: x2apic interface to lapic Avi Kivity
2009-08-23 11:56 ` [PATCH 07/46] KVM: Use temporary variable to shorten lines Avi Kivity
2009-08-23 11:56 ` [PATCH 08/46] KVM: Fix apic_mmio_write return for unaligned write Avi Kivity
2009-08-23 11:56 ` [PATCH 09/46] KVM: handle AMD microcode MSR Avi Kivity
2009-08-23 11:56 ` [PATCH 10/46] Revert "KVM: x86: check for cr3 validity in ioctl_set_sregs" Avi Kivity
2009-08-23 11:56 ` [PATCH 11/46] KVM: MMU: Trace guest pagetable walker Avi Kivity
2009-08-23 11:56 ` [PATCH 12/46] KVM: Document basic API Avi Kivity
2009-08-23 11:56 ` [PATCH 13/46] KVM: Trace shadow page lifecycle Avi Kivity
2009-08-23 11:56 ` [PATCH 14/46] KVM: fix MMIO_CONF_BASE MSR access Avi Kivity
2009-08-23 11:56 ` [PATCH 15/46] KVM: ignore msi request if !level Avi Kivity
2009-08-23 11:56 ` [PATCH 16/46] KVM: Add trace points in irqchip code Avi Kivity
2009-08-23 11:56 ` [PATCH 17/46] KVM: No need to kick cpu if not in a guest mode Avi Kivity
2009-08-23 11:56 ` [PATCH 18/46] KVM: Always report x2apic as supported feature Avi Kivity
2009-08-23 11:56 ` [PATCH 19/46] KVM: PIT support for HPET legacy mode Avi Kivity
2009-08-23 11:56 ` [PATCH 20/46] KVM: add module parameters documentation Avi Kivity
2009-08-23 11:56 ` [PATCH 21/46] KVM: make io_bus interface more robust Avi Kivity
2009-08-23 11:56 ` [PATCH 22/46] KVM: add ioeventfd support Avi Kivity
2009-08-23 11:56 ` [PATCH 23/46] KVM: MMU: Fix MMU_DEBUG compile breakage Avi Kivity
2009-08-23 11:56 ` [PATCH 24/46] KVM: Move exception handling to the same place as other events Avi Kivity
2009-08-23 11:56 ` [PATCH 25/46] KVM: Move kvm_cpu_get_interrupt() declaration to x86 code Avi Kivity
2009-08-23 11:56 ` [PATCH 26/46] KVM: Reduce runnability interface with arch support code Avi Kivity
2009-08-23 11:56 ` [PATCH 27/46] KVM: silence lapic kernel messages that can be triggered by a guest Avi Kivity
2009-08-23 11:56 ` [PATCH 28/46] KVM: Discard unnecessary kvm_mmu_flush_tlb() in kvm_mmu_load() Avi Kivity
2009-08-23 11:56 ` [PATCH 29/46] KVM: MMU: fix missing locking in alloc_mmu_pages Avi Kivity
2009-08-23 11:56 ` [PATCH 30/46] KVM: s390: remove unused structs Avi Kivity
2009-08-23 11:56 ` [PATCH 31/46] KVM: x86: use get_desc_base() and get_desc_limit() Avi Kivity
2009-08-23 11:56 ` [PATCH 32/46] KVM: x86: use kvm_get_gdt() and kvm_read_ldt() Avi Kivity
2009-08-23 11:56 ` [PATCH 33/46] KVM: VMX: Introduce KVM_SET_IDENTITY_MAP_ADDR ioctl Avi Kivity
2009-08-23 11:56 ` [PATCH 34/46] KVM: PIT: Unregister ack notifier callback when freeing Avi Kivity
2009-08-23 11:56 ` [PATCH 35/46] KVM: Drop obsolete cpu_get/put in make_all_cpus_request Avi Kivity
2009-08-23 11:56 ` [PATCH 36/46] KVM: VMX: Avoid to return ENOTSUPP to userland Avi Kivity
2009-08-23 11:56 ` [PATCH 37/46] KVM: Align cr8 threshold when userspace changes cr8 Avi Kivity
2009-08-23 11:56 ` [PATCH 38/46] KVM: limit lapic periodic timer frequency Avi Kivity
2009-08-23 11:56 ` [PATCH 39/46] KVM: fix kvm_init() error handling Avi Kivity
2009-08-23 11:56 ` [PATCH 40/46] KVM: MMU: make rmap code aware of mapping levels Avi Kivity
2009-08-23 11:56 ` [PATCH 41/46] KVM: MMU: rename is_largepage_backed to mapping_level Avi Kivity
2009-08-23 11:56 ` [PATCH 42/46] KVM: MMU: make direct mapping paths aware of mapping levels Avi Kivity
2009-08-23 11:56 ` [PATCH 43/46] KVM: MMU: make page walker " Avi Kivity
2009-08-23 11:56 ` [PATCH 44/46] KVM: MMU: shadow support for 1gb pages Avi Kivity
2009-08-23 11:56 ` [PATCH 45/46] KVM: MMU: enable gbpages by increasing nr of pagesizes Avi Kivity
2009-08-23 11:56 ` [PATCH 46/46] KVM: report 1GB page support to userspace Avi Kivity
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox