* [PATCH v4 00/13] KVM/ARM vGIC support
@ 2012-11-10 15:44 Christoffer Dall
2012-11-10 15:44 ` [PATCH v4 01/13] KVM: ARM: Introduce KVM_SET_DEVICE_ADDRESS ioctl Christoffer Dall
` (12 more replies)
0 siblings, 13 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:44 UTC (permalink / raw)
To: linux-arm-kernel
The following series implements support for the virtual generic
interrupt controller architecture for KVM/ARM.
Changes since v3:
- Change KVM_SET_DEVICE_ADDRESS to use 64-bit fields
Changes since v2:
- Get rid of hardcoded guest cpu and distributor physical addresses
and instead provide the address through the KVM_SET_DEVICE_ADDRESS
ioctl.
- Fix level/edge bugs
- Fix reboot bug: retire queued, disabled interrupts
This patch series can also be pulled from:
git://github.com/virtualopensystems/linux-kvm-arm.git
branch: kvm-arm-v13-vgic
---
Christoffer Dall (2):
KVM: ARM: Introduce KVM_SET_DEVICE_ADDRESS ioctl
ARM: KVM: VGIC accept vcpu and dist base addresses from user space
Marc Zyngier (11):
ARM: KVM: Keep track of currently running vcpus
ARM: KVM: Initial VGIC infrastructure support
ARM: KVM: Initial VGIC MMIO support code
ARM: KVM: VGIC distributor handling
ARM: KVM: VGIC virtual CPU interface management
ARM: KVM: vgic: retire queued, disabled interrupts
ARM: KVM: VGIC interrupt injection
ARM: KVM: VGIC control interface world switch
ARM: KVM: VGIC initialisation code
ARM: KVM: vgic: reduce the number of vcpu kick
ARM: KVM: Add VGIC configuration option
Documentation/virtual/kvm/api.txt | 37 +
arch/arm/include/asm/kvm_arm.h | 12
arch/arm/include/asm/kvm_host.h | 17 +
arch/arm/include/asm/kvm_mmu.h | 2
arch/arm/include/asm/kvm_vgic.h | 320 +++++++++
arch/arm/include/uapi/asm/kvm.h | 13
arch/arm/kernel/asm-offsets.c | 12
arch/arm/kvm/Kconfig | 7
arch/arm/kvm/Makefile | 1
arch/arm/kvm/arm.c | 138 ++++
arch/arm/kvm/interrupts.S | 4
arch/arm/kvm/interrupts_head.S | 68 ++
arch/arm/kvm/mmio.c | 3
arch/arm/kvm/vgic.c | 1251 +++++++++++++++++++++++++++++++++++++
include/uapi/linux/kvm.h | 8
virt/kvm/kvm_main.c | 5
16 files changed, 1893 insertions(+), 5 deletions(-)
create mode 100644 arch/arm/include/asm/kvm_vgic.h
create mode 100644 arch/arm/kvm/vgic.c
--
^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v4 01/13] KVM: ARM: Introduce KVM_SET_DEVICE_ADDRESS ioctl
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
@ 2012-11-10 15:44 ` Christoffer Dall
2012-11-10 15:44 ` [PATCH v4 02/13] ARM: KVM: Keep track of currently running vcpus Christoffer Dall
` (11 subsequent siblings)
12 siblings, 0 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:44 UTC (permalink / raw)
To: linux-arm-kernel
On ARM (and possibly other architectures) some bits are specific to the
model being emulated for the guest, and user space needs a way to tell
the kernel about those bits. An example is mmio device base addresses,
where KVM must know the base address for a given device to properly
emulate mmio accesses within a certain address range or directly map a
device with virtualization extensions into the guest address space.
We try to make this API slightly more generic than our specific use
requires, but so far only the VGIC uses this feature.
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
Documentation/virtual/kvm/api.txt | 37 +++++++++++++++++++++++++++++++++++++
arch/arm/include/uapi/asm/kvm.h | 13 +++++++++++++
arch/arm/kvm/arm.c | 24 +++++++++++++++++++++++-
include/uapi/linux/kvm.h | 8 ++++++++
4 files changed, 81 insertions(+), 1 deletion(-)
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 9ffd702..35f6275 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2191,6 +2191,43 @@ This ioctl returns the guest registers that are supported for the
KVM_GET_ONE_REG/KVM_SET_ONE_REG calls.
+4.80 KVM_SET_DEVICE_ADDRESS
+
+Capability: KVM_CAP_SET_DEVICE_ADDRESS
+Architectures: arm
+Type: vm ioctl
+Parameters: struct kvm_device_address (in)
+Returns: 0 on success, -1 on error
+Errors:
+ ENODEV: The device id is unknown
+ ENXIO: Device not supported on current system
+ EEXIST: Address already set
+ E2BIG: Address outside guest physical address space
+
+struct kvm_device_address {
+ __u64 id;
+ __u64 addr;
+};
+
+Specify a device address in the guest's physical address space where guests
+can access emulated or directly exposed devices, which the host kernel needs
+to know about. The id field is an architecture specific identifier for a
+specific device.
+
+ARM divides the id field into two parts, a device id and an address type id
+specific to the individual device.
+
+  bits:  | 31 ... 16 | 15 ... 0 |
+ field: | device id | addr type id |
+
+ARM currently only requires this when using the in-kernel GIC support for the
+hardware vGIC features, using KVM_ARM_DEVICE_VGIC_V2 as the device id. When
+setting the base address for the guest's mapping of the vGIC virtual CPU
+and distributor interface, the ioctl must be called after calling
+KVM_CREATE_IRQCHIP, but before calling KVM_RUN on any of the VCPUs. Calling
+this ioctl twice for any of the base addresses will return -EEXIST.
+
+
5. The kvm_run structure
------------------------
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index 5142cab..b1c7871 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -41,6 +41,19 @@ struct kvm_regs {
#define KVM_ARM_TARGET_CORTEX_A15 0
#define KVM_ARM_NUM_TARGETS 1
+/* KVM_SET_DEVICE_ADDRESS ioctl id encoding */
+#define KVM_DEVICE_TYPE_SHIFT 0
+#define KVM_DEVICE_TYPE_MASK (0xffff << KVM_DEVICE_TYPE_SHIFT)
+#define KVM_DEVICE_ID_SHIFT 16
+#define KVM_DEVICE_ID_MASK (0xffff << KVM_DEVICE_ID_SHIFT)
+
+/* Supported device IDs */
+#define KVM_ARM_DEVICE_VGIC_V2 0
+
+/* Supported VGIC address types */
+#define KVM_VGIC_V2_ADDR_TYPE_DIST 0
+#define KVM_VGIC_V2_ADDR_TYPE_CPU 1
+
struct kvm_vcpu_init {
__u32 target;
__u32 features[7];
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 435c5cc..2cdc07b 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -164,6 +164,9 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_COALESCED_MMIO:
r = KVM_COALESCED_MMIO_PAGE_OFFSET;
break;
+ case KVM_CAP_SET_DEVICE_ADDR:
+ r = 1;
+ break;
default:
r = 0;
break;
@@ -777,10 +780,29 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
return -EINVAL;
}
+static int kvm_vm_ioctl_set_device_address(struct kvm *kvm,
+ struct kvm_device_address *dev_addr)
+{
+ return -ENODEV;
+}
+
long kvm_arch_vm_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
{
- return -EINVAL;
+ struct kvm *kvm = filp->private_data;
+ void __user *argp = (void __user *)arg;
+
+ switch (ioctl) {
+ case KVM_SET_DEVICE_ADDRESS: {
+ struct kvm_device_address dev_addr;
+
+ if (copy_from_user(&dev_addr, argp, sizeof(dev_addr)))
+ return -EFAULT;
+ return kvm_vm_ioctl_set_device_address(kvm, &dev_addr);
+ }
+ default:
+ return -EINVAL;
+ }
}
static void cpu_init_hyp_mode(void *vector)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 1db0460..e936f8f 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -635,6 +635,7 @@ struct kvm_ppc_smmu_info {
#endif
#define KVM_CAP_IRQFD_RESAMPLE 82
#define KVM_CAP_PPC_BOOKE_WATCHDOG 83
+#define KVM_CAP_SET_DEVICE_ADDR 84
#ifdef KVM_CAP_IRQ_ROUTING
@@ -782,6 +783,11 @@ struct kvm_msi {
__u8 pad[16];
};
+struct kvm_device_address {
+ __u64 id;
+ __u64 addr;
+};
+
/*
* ioctls for VM fds
*/
@@ -865,6 +871,8 @@ struct kvm_s390_ucas_mapping {
#define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce)
/* Available with KVM_CAP_RMA */
#define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma)
+/* Available with KVM_CAP_SET_DEVICE_ADDR */
+#define KVM_SET_DEVICE_ADDRESS _IOW(KVMIO, 0xaa, struct kvm_device_address)
/*
* ioctls for vcpu fds
* [PATCH v4 02/13] ARM: KVM: Keep track of currently running vcpus
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
2012-11-10 15:44 ` [PATCH v4 01/13] KVM: ARM: Introduce KVM_SET_DEVICE_ADDRESS ioctl Christoffer Dall
@ 2012-11-10 15:44 ` Christoffer Dall
2012-11-28 12:47 ` Will Deacon
2012-11-10 15:44 ` [PATCH v4 03/13] ARM: KVM: Initial VGIC infrastructure support Christoffer Dall
` (10 subsequent siblings)
12 siblings, 1 reply; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:44 UTC (permalink / raw)
To: linux-arm-kernel
From: Marc Zyngier <marc.zyngier@arm.com>
When an interrupt occurs for the guest, it is sometimes necessary
to find out which vcpu was running at that point.
Keep track of which vcpu is being run in kvm_arch_vcpu_ioctl_run(),
and allow the data to be retrieved using either:
- kvm_arm_get_running_vcpu(): returns the vcpu running at this point
on the current CPU. Can only be used in a non-preemptible context.
- kvm_get_running_vcpus(): returns the per-CPU variable holding
the running vcpus, usable for per-CPU interrupts.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
arch/arm/include/asm/kvm_host.h | 10 ++++++++++
arch/arm/kvm/arm.c | 30 ++++++++++++++++++++++++++++++
2 files changed, 40 insertions(+)
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index e7fc249..e66cd56 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -154,4 +154,14 @@ static inline int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
{
return 0;
}
+
+struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
+struct kvm_vcpu __percpu **kvm_get_running_vcpus(void);
+
+int kvm_arm_copy_coproc_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
+unsigned long kvm_arm_num_coproc_regs(struct kvm_vcpu *vcpu);
+struct kvm_one_reg;
+int kvm_arm_coproc_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
+int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
+
#endif /* __ARM_KVM_HOST_H__ */
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 2cdc07b..60b119a 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -53,11 +53,38 @@ static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
static struct vfp_hard_struct __percpu *kvm_host_vfp_state;
static unsigned long hyp_default_vectors;
+/* Per-CPU variable containing the currently running vcpu. */
+static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
+
/* The VMID used in the VTTBR */
static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
static u8 kvm_next_vmid;
static DEFINE_SPINLOCK(kvm_vmid_lock);
+static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
+{
+ BUG_ON(preemptible());
+ __get_cpu_var(kvm_arm_running_vcpu) = vcpu;
+}
+
+/**
+ * kvm_arm_get_running_vcpu - get the vcpu running on the current CPU.
+ * Must be called from non-preemptible context
+ */
+struct kvm_vcpu *kvm_arm_get_running_vcpu(void)
+{
+ BUG_ON(preemptible());
+ return __get_cpu_var(kvm_arm_running_vcpu);
+}
+
+/**
+ * kvm_get_running_vcpus - get the per-CPU variable holding the running vcpus.
+ */
+struct kvm_vcpu __percpu **kvm_get_running_vcpus(void)
+{
+ return &kvm_arm_running_vcpu;
+}
+
int kvm_arch_hardware_enable(void *garbage)
{
return 0;
@@ -299,10 +326,13 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
cpumask_clear_cpu(cpu, &vcpu->arch.require_dcache_flush);
flush_cache_all(); /* We'd really want v7_flush_dcache_all() */
}
+
+ kvm_arm_set_running_vcpu(vcpu);
}
void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
{
+ kvm_arm_set_running_vcpu(NULL);
}
int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
* [PATCH v4 03/13] ARM: KVM: Initial VGIC infrastructure support
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
2012-11-10 15:44 ` [PATCH v4 01/13] KVM: ARM: Introduce KVM_SET_DEVICE_ADDRESS ioctl Christoffer Dall
2012-11-10 15:44 ` [PATCH v4 02/13] ARM: KVM: Keep track of currently running vcpus Christoffer Dall
@ 2012-11-10 15:44 ` Christoffer Dall
2012-11-28 12:49 ` Will Deacon
2012-11-10 15:44 ` [PATCH v4 04/13] ARM: KVM: Initial VGIC MMIO support code Christoffer Dall
` (9 subsequent siblings)
12 siblings, 1 reply; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:44 UTC (permalink / raw)
To: linux-arm-kernel
From: Marc Zyngier <marc.zyngier@arm.com>
Wire the basic framework code for VGIC support. Nothing to enable
yet.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
arch/arm/include/asm/kvm_host.h | 7 ++++
arch/arm/include/asm/kvm_vgic.h | 70 +++++++++++++++++++++++++++++++++++++++
arch/arm/kvm/arm.c | 21 +++++++++++-
arch/arm/kvm/interrupts.S | 4 ++
arch/arm/kvm/mmio.c | 3 ++
virt/kvm/kvm_main.c | 5 ++-
6 files changed, 107 insertions(+), 3 deletions(-)
create mode 100644 arch/arm/include/asm/kvm_vgic.h
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index e66cd56..49ba25a 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -23,6 +23,7 @@
#include <asm/kvm_asm.h>
#include <asm/fpstate.h>
#include <asm/kvm_decode.h>
+#include <asm/kvm_vgic.h>
#define KVM_MAX_VCPUS CONFIG_KVM_ARM_MAX_VCPUS
#define KVM_MEMORY_SLOTS 32
@@ -58,6 +59,9 @@ struct kvm_arch {
/* Stage-2 page table */
pgd_t *pgd;
+
+ /* Interrupt controller */
+ struct vgic_dist vgic;
};
#define KVM_NR_MEM_OBJS 40
@@ -92,6 +96,9 @@ struct kvm_vcpu_arch {
struct vfp_hard_struct vfp_guest;
struct vfp_hard_struct *vfp_host;
+ /* VGIC state */
+ struct vgic_cpu vgic_cpu;
+
/*
* Anything that is not used directly from assembly code goes
* here.
diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
new file mode 100644
index 0000000..d75540a
--- /dev/null
+++ b/arch/arm/include/asm/kvm_vgic.h
@@ -0,0 +1,70 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Marc Zyngier <marc.zyngier@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#ifndef __ASM_ARM_KVM_VGIC_H
+#define __ASM_ARM_KVM_VGIC_H
+
+struct vgic_dist {
+};
+
+struct vgic_cpu {
+};
+
+struct kvm;
+struct kvm_vcpu;
+struct kvm_run;
+struct kvm_exit_mmio;
+
+#ifndef CONFIG_KVM_ARM_VGIC
+static inline int kvm_vgic_hyp_init(void)
+{
+ return 0;
+}
+
+static inline int kvm_vgic_init(struct kvm *kvm)
+{
+ return 0;
+}
+
+static inline int kvm_vgic_create(struct kvm *kvm)
+{
+ return 0;
+}
+
+static inline void kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu) {}
+static inline void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu) {}
+static inline void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu) {}
+
+static inline int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
+{
+ return 0;
+}
+
+static inline bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
+ struct kvm_exit_mmio *mmio)
+{
+ return false;
+}
+
+static inline int irqchip_in_kernel(struct kvm *kvm)
+{
+ return 0;
+}
+#endif
+
+#endif
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 60b119a..426828a 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -183,6 +183,9 @@ int kvm_dev_ioctl_check_extension(long ext)
{
int r;
switch (ext) {
+#ifdef CONFIG_KVM_ARM_VGIC
+ case KVM_CAP_IRQCHIP:
+#endif
case KVM_CAP_USER_MEMORY:
case KVM_CAP_DESTROY_MEMORY_REGION_WORKS:
case KVM_CAP_ONE_REG:
@@ -304,6 +307,10 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
{
/* Force users to call KVM_ARM_VCPU_INIT */
vcpu->arch.target = -1;
+
+ /* Set up VGIC */
+ kvm_vgic_vcpu_init(vcpu);
+
return 0;
}
@@ -363,7 +370,7 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
*/
int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
{
- return !!v->arch.irq_lines;
+ return !!v->arch.irq_lines || kvm_vgic_vcpu_pending_irq(v);
}
int kvm_arch_vcpu_in_guest_mode(struct kvm_vcpu *v)
@@ -633,6 +640,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
update_vttbr(vcpu->kvm);
+ kvm_vgic_sync_to_cpu(vcpu);
+
local_irq_disable();
/*
@@ -645,6 +654,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
local_irq_enable();
+ kvm_vgic_sync_from_cpu(vcpu);
continue;
}
@@ -683,6 +693,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
* Back from guest
*************************************************************/
+ kvm_vgic_sync_from_cpu(vcpu);
+
ret = handle_exit(vcpu, run, ret);
}
@@ -965,6 +977,13 @@ static int init_hyp_mode(void)
}
}
+ /*
+ * Init HYP view of VGIC
+ */
+ err = kvm_vgic_hyp_init();
+ if (err)
+ goto out_free_mappings;
+
return 0;
out_free_vfp:
free_percpu(kvm_host_vfp_state);
diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
index 7c89708..e418c9b 100644
--- a/arch/arm/kvm/interrupts.S
+++ b/arch/arm/kvm/interrupts.S
@@ -91,6 +91,8 @@ ENTRY(__kvm_vcpu_run)
save_host_regs
+ restore_vgic_state r0
+
@ Store hardware CP15 state and load guest state
read_cp15_state
write_cp15_state 1, r0
@@ -184,6 +186,8 @@ after_vfp_restore:
read_cp15_state 1, r1
write_cp15_state
+ save_vgic_state r1
+
restore_host_regs
clrex @ Clear exclusive monitor
bx lr @ return to IOCTL
diff --git a/arch/arm/kvm/mmio.c b/arch/arm/kvm/mmio.c
index d6a4ca0..eadec78a 100644
--- a/arch/arm/kvm/mmio.c
+++ b/arch/arm/kvm/mmio.c
@@ -149,6 +149,9 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run,
if (mmio.is_write)
memcpy(mmio.data, vcpu_reg(vcpu, rt), mmio.len);
+ if (vgic_handle_mmio(vcpu, run, &mmio))
+ return 1;
+
kvm_prepare_mmio(run, &mmio);
return 0;
}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2fb7319..665af96 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1880,12 +1880,13 @@ static long kvm_vcpu_ioctl(struct file *filp,
if (vcpu->kvm->mm != current->mm)
return -EIO;
-#if defined(CONFIG_S390) || defined(CONFIG_PPC)
+#if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_ARM)
/*
* Special cases: vcpu ioctls that are asynchronous to vcpu execution,
* so vcpu_load() would break it.
*/
- if (ioctl == KVM_S390_INTERRUPT || ioctl == KVM_INTERRUPT)
+ if (ioctl == KVM_S390_INTERRUPT || ioctl == KVM_INTERRUPT ||
+ ioctl == KVM_IRQ_LINE)
return kvm_arch_vcpu_ioctl(filp, ioctl, arg);
#endif
* [PATCH v4 04/13] ARM: KVM: Initial VGIC MMIO support code
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
` (2 preceding siblings ...)
2012-11-10 15:44 ` [PATCH v4 03/13] ARM: KVM: Initial VGIC infrastructure support Christoffer Dall
@ 2012-11-10 15:44 ` Christoffer Dall
2012-11-12 8:54 ` Dong Aisheng
2012-11-28 13:09 ` Will Deacon
2012-11-10 15:44 ` [PATCH v4 05/13] ARM: KVM: VGIC accept vcpu and dist base addresses from user space Christoffer Dall
` (8 subsequent siblings)
12 siblings, 2 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:44 UTC (permalink / raw)
To: linux-arm-kernel
From: Marc Zyngier <marc.zyngier@arm.com>
Wire the initial in-kernel MMIO support code for the VGIC, used
for the distributor emulation.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
arch/arm/include/asm/kvm_vgic.h | 6 +-
arch/arm/kvm/Makefile | 1
arch/arm/kvm/vgic.c | 138 +++++++++++++++++++++++++++++++++++++++
3 files changed, 144 insertions(+), 1 deletion(-)
create mode 100644 arch/arm/kvm/vgic.c
diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
index d75540a..b444ecf 100644
--- a/arch/arm/include/asm/kvm_vgic.h
+++ b/arch/arm/include/asm/kvm_vgic.h
@@ -30,7 +30,11 @@ struct kvm_vcpu;
struct kvm_run;
struct kvm_exit_mmio;
-#ifndef CONFIG_KVM_ARM_VGIC
+#ifdef CONFIG_KVM_ARM_VGIC
+bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
+ struct kvm_exit_mmio *mmio);
+
+#else
static inline int kvm_vgic_hyp_init(void)
{
return 0;
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index 8a4f396..c019f02 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -20,3 +20,4 @@ obj-$(CONFIG_KVM_ARM_HOST) += $(addprefix ../../../virt/kvm/, kvm_main.o coalesc
obj-$(CONFIG_KVM_ARM_HOST) += arm.o guest.o mmu.o emulate.o reset.o
obj-$(CONFIG_KVM_ARM_HOST) += coproc.o coproc_a15.o mmio.o decode.o
+obj-$(CONFIG_KVM_ARM_VGIC) += vgic.o
diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
new file mode 100644
index 0000000..26ada3b
--- /dev/null
+++ b/arch/arm/kvm/vgic.c
@@ -0,0 +1,138 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Marc Zyngier <marc.zyngier@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <asm/kvm_emulate.h>
+
+#define ACCESS_READ_VALUE (1 << 0)
+#define ACCESS_READ_RAZ (0 << 0)
+#define ACCESS_READ_MASK(x) ((x) & (1 << 0))
+#define ACCESS_WRITE_IGNORED (0 << 1)
+#define ACCESS_WRITE_SETBIT (1 << 1)
+#define ACCESS_WRITE_CLEARBIT (2 << 1)
+#define ACCESS_WRITE_VALUE (3 << 1)
+#define ACCESS_WRITE_MASK(x) ((x) & (3 << 1))
+
+/**
+ * vgic_reg_access - access vgic register
+ * @mmio: pointer to the data describing the mmio access
+ * @reg: pointer to the virtual backing of the vgic distributor struct
+ * @offset: offset of the access; the 2 least significant bits give the
+ *          byte offset within the accessed word
+ * @mode: ACCESS_ mode (see defines above)
+ *
+ * Helper to make vgic register access easier using one of the access
+ * modes defined for vgic register access
+ * (read,raz,write-ignored,setbit,clearbit,write)
+ */
+static void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
+ u32 offset, int mode)
+{
+ int word_offset = offset & 3;
+ int shift = word_offset * 8;
+ u32 mask;
+ u32 regval;
+
+ /*
+ * Any alignment fault should have been delivered to the guest
+ * directly (ARM ARM B3.12.7 "Prioritization of aborts").
+ */
+
+ mask = (~0U) >> (word_offset * 8);
+ if (reg)
+ regval = *reg;
+ else {
+ BUG_ON(mode != (ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED));
+ regval = 0;
+ }
+
+ if (mmio->is_write) {
+ u32 data = (*((u32 *)mmio->data) & mask) << shift;
+ switch (ACCESS_WRITE_MASK(mode)) {
+ case ACCESS_WRITE_IGNORED:
+ return;
+
+ case ACCESS_WRITE_SETBIT:
+ regval |= data;
+ break;
+
+ case ACCESS_WRITE_CLEARBIT:
+ regval &= ~data;
+ break;
+
+ case ACCESS_WRITE_VALUE:
+ regval = (regval & ~(mask << shift)) | data;
+ break;
+ }
+ *reg = regval;
+ } else {
+ switch (ACCESS_READ_MASK(mode)) {
+ case ACCESS_READ_RAZ:
+ regval = 0;
+ /* fall through */
+
+ case ACCESS_READ_VALUE:
+ *((u32 *)mmio->data) = (regval >> shift) & mask;
+ }
+ }
+}
+
+/* All this should be handled by kvm_bus_io_*()... FIXME!!! */
+struct mmio_range {
+ unsigned long base;
+ unsigned long len;
+ bool (*handle_mmio)(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
+ u32 offset);
+};
+
+static const struct mmio_range vgic_ranges[] = {
+ {}
+};
+
+static const
+struct mmio_range *find_matching_range(const struct mmio_range *ranges,
+ struct kvm_exit_mmio *mmio,
+ unsigned long base)
+{
+ const struct mmio_range *r = ranges;
+ unsigned long addr = mmio->phys_addr - base;
+
+ while (r->len) {
+ if (addr >= r->base &&
+ (addr + mmio->len) <= (r->base + r->len))
+ return r;
+ r++;
+ }
+
+ return NULL;
+}
+
+/**
+ * vgic_handle_mmio - handle an in-kernel MMIO access
+ * @vcpu: pointer to the vcpu performing the access
+ * @run:  pointer to the kvm_run structure
+ * @mmio: pointer to the data describing the access
+ *
+ * returns true if the MMIO access has been performed in kernel space,
+ * and false if it needs to be emulated in user space.
+ */
+bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exit_mmio *mmio)
+{
+	return false;
+}
* [PATCH v4 05/13] ARM: KVM: VGIC accept vcpu and dist base addresses from user space
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
` (3 preceding siblings ...)
2012-11-10 15:44 ` [PATCH v4 04/13] ARM: KVM: Initial VGIC MMIO support code Christoffer Dall
@ 2012-11-10 15:44 ` Christoffer Dall
2012-11-12 8:56 ` Dong Aisheng
2012-11-28 13:11 ` Will Deacon
2012-11-10 15:44 ` [PATCH v4 06/13] ARM: KVM: VGIC distributor handling Christoffer Dall
` (7 subsequent siblings)
12 siblings, 2 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:44 UTC (permalink / raw)
To: linux-arm-kernel
User space defines the model to emulate for a guest and should therefore
decide which addresses are used for both the virtual CPU interface
directly mapped in the guest physical address space and for the emulated
distributor interface, which is mapped in software by the in-kernel VGIC
support.
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
arch/arm/include/asm/kvm_mmu.h | 2 +
arch/arm/include/asm/kvm_vgic.h | 9 ++++++
arch/arm/kvm/arm.c | 16 ++++++++++
arch/arm/kvm/vgic.c | 61 +++++++++++++++++++++++++++++++++++++++
4 files changed, 87 insertions(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 9bd0508..0800531 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -26,6 +26,8 @@
* To save a bit of memory and to avoid alignment issues we assume 39-bit IPA
* for now, but remember that the level-1 table must be aligned to its size.
*/
+#define KVM_PHYS_SHIFT (38)
+#define KVM_PHYS_MASK ((1ULL << KVM_PHYS_SHIFT) - 1)
#define PTRS_PER_PGD2 512
#define PGD2_ORDER get_order(PTRS_PER_PGD2 * sizeof(pgd_t))
diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
index b444ecf..9ca8d21 100644
--- a/arch/arm/include/asm/kvm_vgic.h
+++ b/arch/arm/include/asm/kvm_vgic.h
@@ -20,6 +20,9 @@
#define __ASM_ARM_KVM_VGIC_H
struct vgic_dist {
+ /* Distributor and vcpu interface mapping in the guest */
+ phys_addr_t vgic_dist_base;
+ phys_addr_t vgic_cpu_base;
};
struct vgic_cpu {
@@ -31,6 +34,7 @@ struct kvm_run;
struct kvm_exit_mmio;
#ifdef CONFIG_KVM_ARM_VGIC
+int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
struct kvm_exit_mmio *mmio);
@@ -40,6 +44,11 @@ static inline int kvm_vgic_hyp_init(void)
return 0;
}
+static inline int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
+{
+ return 0;
+}
+
static inline int kvm_vgic_init(struct kvm *kvm)
{
return 0;
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 426828a..3ac1aab 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -61,6 +61,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
static u8 kvm_next_vmid;
static DEFINE_SPINLOCK(kvm_vmid_lock);
+static bool vgic_present;
+
static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
{
BUG_ON(preemptible());
@@ -825,7 +827,19 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
static int kvm_vm_ioctl_set_device_address(struct kvm *kvm,
struct kvm_device_address *dev_addr)
{
- return -ENODEV;
+ unsigned long dev_id, type;
+
+ dev_id = (dev_addr->id & KVM_DEVICE_ID_MASK) >> KVM_DEVICE_ID_SHIFT;
+ type = (dev_addr->id & KVM_DEVICE_TYPE_MASK) >> KVM_DEVICE_TYPE_SHIFT;
+
+ switch (dev_id) {
+ case KVM_ARM_DEVICE_VGIC_V2:
+ if (!vgic_present)
+ return -ENXIO;
+ return kvm_vgic_set_addr(kvm, type, dev_addr->addr);
+ default:
+ return -ENODEV;
+ }
}
long kvm_arch_vm_ioctl(struct file *filp,
diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
index 26ada3b..f85b275 100644
--- a/arch/arm/kvm/vgic.c
+++ b/arch/arm/kvm/vgic.c
@@ -22,6 +22,13 @@
#include <linux/io.h>
#include <asm/kvm_emulate.h>
+#define VGIC_ADDR_UNDEF (-1)
+#define IS_VGIC_ADDR_UNDEF(_x) ((_x) == (typeof(_x))VGIC_ADDR_UNDEF)
+
+#define VGIC_DIST_SIZE 0x1000
+#define VGIC_CPU_SIZE 0x2000
+
+
#define ACCESS_READ_VALUE (1 << 0)
#define ACCESS_READ_RAZ (0 << 0)
#define ACCESS_READ_MASK(x) ((x) & (1 << 0))
@@ -136,3 +143,57 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exi
{
return KVM_EXIT_MMIO;
}
+
+static bool vgic_ioaddr_overlap(struct kvm *kvm)
+{
+ phys_addr_t dist = kvm->arch.vgic.vgic_dist_base;
+ phys_addr_t cpu = kvm->arch.vgic.vgic_cpu_base;
+
+ if (IS_VGIC_ADDR_UNDEF(dist) || IS_VGIC_ADDR_UNDEF(cpu))
+ return false;
+ if ((dist <= cpu && dist + VGIC_DIST_SIZE > cpu) ||
+ (cpu <= dist && cpu + VGIC_CPU_SIZE > dist))
+ return true;
+ return false;
+}
+
+int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
+{
+ int r = 0;
+ struct vgic_dist *vgic = &kvm->arch.vgic;
+
+ if (addr & ~KVM_PHYS_MASK)
+ return -E2BIG;
+
+ if (addr & ~PAGE_MASK)
+ return -EINVAL;
+
+	mutex_lock(&kvm->lock);
+	switch (type) {
+	case KVM_VGIC_V2_ADDR_TYPE_DIST:
+		if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_dist_base)) {
+			r = -EEXIST;
+			goto out;
+		}
+		if (addr + VGIC_DIST_SIZE < addr) {
+			r = -EINVAL;
+			goto out;
+		}
+		kvm->arch.vgic.vgic_dist_base = addr;
+		break;
+	case KVM_VGIC_V2_ADDR_TYPE_CPU:
+		if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_cpu_base)) {
+			r = -EEXIST;
+			goto out;
+		}
+		if (addr + VGIC_CPU_SIZE < addr) {
+			r = -EINVAL;
+			goto out;
+		}
+		kvm->arch.vgic.vgic_cpu_base = addr;
+		break;
+	default:
+		r = -ENODEV;
+		goto out;
+	}
+
+	if (vgic_ioaddr_overlap(kvm)) {
+		kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
+		kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
+		r = -EINVAL;
+	}
+
+out:
+	mutex_unlock(&kvm->lock);
+	return r;
+}
* [PATCH v4 06/13] ARM: KVM: VGIC distributor handling
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
` (4 preceding siblings ...)
2012-11-10 15:44 ` [PATCH v4 05/13] ARM: KVM: VGIC accept vcpu and dist base addresses from user space Christoffer Dall
@ 2012-11-10 15:44 ` Christoffer Dall
2012-11-12 9:29 ` Dong Aisheng
2012-11-28 13:21 ` Will Deacon
2012-11-10 15:45 ` [PATCH v4 07/13] ARM: KVM: VGIC virtual CPU interface management Christoffer Dall
` (6 subsequent siblings)
12 siblings, 2 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:44 UTC (permalink / raw)
To: linux-arm-kernel
From: Marc Zyngier <marc.zyngier@arm.com>
Add the GIC distributor emulation code. A number of the GIC features
are simply ignored as they are not required to boot a Linux guest.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
arch/arm/include/asm/kvm_vgic.h | 167 ++++++++++++++
arch/arm/kvm/vgic.c | 471 +++++++++++++++++++++++++++++++++++++++
2 files changed, 637 insertions(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
index 9ca8d21..9e60b1d 100644
--- a/arch/arm/include/asm/kvm_vgic.h
+++ b/arch/arm/include/asm/kvm_vgic.h
@@ -19,10 +19,177 @@
#ifndef __ASM_ARM_KVM_VGIC_H
#define __ASM_ARM_KVM_VGIC_H
+#include <linux/kernel.h>
+#include <linux/kvm.h>
+#include <linux/kvm_host.h>
+#include <linux/irqreturn.h>
+#include <linux/spinlock.h>
+#include <linux/types.h>
+
+#define VGIC_NR_IRQS 128
+#define VGIC_NR_SHARED_IRQS (VGIC_NR_IRQS - 32)
+#define VGIC_MAX_CPUS NR_CPUS
+
+/* Sanity checks... */
+#if (VGIC_MAX_CPUS > 8)
+#error Invalid number of CPU interfaces
+#endif
+
+#if (VGIC_NR_IRQS & 31)
+#error "VGIC_NR_IRQS must be a multiple of 32"
+#endif
+
+#if (VGIC_NR_IRQS > 1024)
+#error "VGIC_NR_IRQS must be <= 1024"
+#endif
+
+/*
+ * The GIC distributor registers describing interrupts have two parts:
+ * - 32 per-CPU interrupts (SGI + PPI)
+ * - a bunch of shared interrupts (SPI)
+ */
+struct vgic_bitmap {
+ union {
+ u32 reg[1];
+ unsigned long reg_ul[0];
+ } percpu[VGIC_MAX_CPUS];
+ union {
+ u32 reg[VGIC_NR_SHARED_IRQS / 32];
+ unsigned long reg_ul[0];
+ } shared;
+};
+
+static inline u32 *vgic_bitmap_get_reg(struct vgic_bitmap *x,
+ int cpuid, u32 offset)
+{
+ offset >>= 2;
+ BUG_ON(offset > (VGIC_NR_IRQS / 32));
+ if (!offset)
+ return x->percpu[cpuid].reg;
+ else
+ return x->shared.reg + offset - 1;
+}
+
+static inline int vgic_bitmap_get_irq_val(struct vgic_bitmap *x,
+ int cpuid, int irq)
+{
+ if (irq < 32)
+ return test_bit(irq, x->percpu[cpuid].reg_ul);
+
+ return test_bit(irq - 32, x->shared.reg_ul);
+}
+
+static inline void vgic_bitmap_set_irq_val(struct vgic_bitmap *x,
+ int cpuid, int irq, int val)
+{
+ unsigned long *reg;
+
+ if (irq < 32)
+ reg = x->percpu[cpuid].reg_ul;
+ else {
+ reg = x->shared.reg_ul;
+ irq -= 32;
+ }
+
+ if (val)
+ set_bit(irq, reg);
+ else
+ clear_bit(irq, reg);
+}
+
+static inline unsigned long *vgic_bitmap_get_cpu_map(struct vgic_bitmap *x,
+ int cpuid)
+{
+ if (unlikely(cpuid >= VGIC_MAX_CPUS))
+ return NULL;
+ return x->percpu[cpuid].reg_ul;
+}
+
+static inline unsigned long *vgic_bitmap_get_shared_map(struct vgic_bitmap *x)
+{
+ return x->shared.reg_ul;
+}
+
+struct vgic_bytemap {
+ union {
+ u32 reg[8];
+ unsigned long reg_ul[0];
+ } percpu[VGIC_MAX_CPUS];
+ union {
+ u32 reg[VGIC_NR_SHARED_IRQS / 4];
+ unsigned long reg_ul[0];
+ } shared;
+};
+
+static inline u32 *vgic_bytemap_get_reg(struct vgic_bytemap *x,
+ int cpuid, u32 offset)
+{
+ offset >>= 2;
+ BUG_ON(offset > (VGIC_NR_IRQS / 4));
+ if (offset < 8)
+ return x->percpu[cpuid].reg + offset;
+ else
+ return x->shared.reg + offset - 8;
+}
+
+static inline int vgic_bytemap_get_irq_val(struct vgic_bytemap *x,
+ int cpuid, int irq)
+{
+ u32 *reg, shift;
+ shift = (irq & 3) * 8;
+ reg = vgic_bytemap_get_reg(x, cpuid, irq);
+ return (*reg >> shift) & 0xff;
+}
+
+static inline void vgic_bytemap_set_irq_val(struct vgic_bytemap *x,
+ int cpuid, int irq, int val)
+{
+ u32 *reg, shift;
+ shift = (irq & 3) * 8;
+ reg = vgic_bytemap_get_reg(x, cpuid, irq);
+ *reg &= ~(0xff << shift);
+ *reg |= (val & 0xff) << shift;
+}
+
struct vgic_dist {
+#ifdef CONFIG_KVM_ARM_VGIC
+ spinlock_t lock;
+
+ /* Virtual control interface mapping */
+ void __iomem *vctrl_base;
+
/* Distributor and vcpu interface mapping in the guest */
phys_addr_t vgic_dist_base;
phys_addr_t vgic_cpu_base;
+
+ /* Distributor enabled */
+ u32 enabled;
+
+ /* Interrupt enabled (one bit per IRQ) */
+ struct vgic_bitmap irq_enabled;
+
+ /* Interrupt 'pin' level */
+ struct vgic_bitmap irq_state;
+
+ /* Level-triggered interrupt in progress */
+ struct vgic_bitmap irq_active;
+
+ /* Interrupt priority. Not used yet. */
+ struct vgic_bytemap irq_priority;
+
+ /* Level/edge triggered */
+ struct vgic_bitmap irq_cfg;
+
+ /* Source CPU per SGI and target CPU */
+ u8 irq_sgi_sources[VGIC_MAX_CPUS][16];
+
+ /* Target CPU for each IRQ */
+ u8 irq_spi_cpu[VGIC_NR_SHARED_IRQS];
+ struct vgic_bitmap irq_spi_target[VGIC_MAX_CPUS];
+
+ /* Bitmap indicating which CPU has something pending */
+ unsigned long irq_pending_on_cpu;
+#endif
};
struct vgic_cpu {
diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
index f85b275..82feee8 100644
--- a/arch/arm/kvm/vgic.c
+++ b/arch/arm/kvm/vgic.c
@@ -22,6 +22,42 @@
#include <linux/io.h>
#include <asm/kvm_emulate.h>
+/*
+ * How the whole thing works (courtesy of Christoffer Dall):
+ *
+ * - At any time, the dist->irq_pending_on_cpu is the oracle that knows if
+ * something is pending
+ * - VGIC pending interrupts are stored on the vgic.irq_state vgic
+ * bitmap (this bitmap is updated by both user land ioctls and guest
+ * mmio ops) and indicate the 'wire' state.
+ * - Every time the bitmap changes, the irq_pending_on_cpu oracle is
+ * recalculated
+ * - To calculate the oracle, we need info for each cpu from
+ * compute_pending_for_cpu, which considers:
+ * - PPI: dist->irq_state & dist->irq_enable
+ * - SPI: dist->irq_state & dist->irq_enable & dist->irq_spi_target
+ * - irq_spi_target is a 'formatted' version of the GICD_ITARGETSRn
+ * registers, stored on each vcpu. We only keep one bit of
+ * information per interrupt, making sure that only one vcpu can
+ * accept the interrupt.
+ * - The same is true when injecting an interrupt, except that we only
+ * consider a single interrupt at a time. The irq_spi_cpu array
+ * contains the target CPU for each SPI.
+ *
+ * The handling of level interrupts adds some extra complexity. We
+ * need to track when the interrupt has been EOIed, so we can sample
+ * the 'line' again. This is achieved as such:
+ *
+ * - When a level interrupt is moved onto a vcpu, the corresponding
+ * bit in irq_active is set. As long as this bit is set, the line
+ * will be ignored for further interrupts. The interrupt is injected
+ * into the vcpu with the VGIC_LR_EOI bit set (generate a
+ * maintenance interrupt on EOI).
+ * - When the interrupt is EOIed, the maintenance interrupt fires,
+ * and clears the corresponding bit in irq_active. This allows the
+ * interrupt line to be sampled again.
+ */
+
#define VGIC_ADDR_UNDEF (-1)
#define IS_VGIC_ADDR_UNDEF(_x) ((_x) == (typeof(_x))VGIC_ADDR_UNDEF)
@@ -38,6 +74,14 @@
#define ACCESS_WRITE_VALUE (3 << 1)
#define ACCESS_WRITE_MASK(x) ((x) & (3 << 1))
+static void vgic_update_state(struct kvm *kvm);
+static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
+
+static inline int vgic_irq_is_edge(struct vgic_dist *dist, int irq)
+{
+ return vgic_bitmap_get_irq_val(&dist->irq_cfg, 0, irq);
+}
+
/**
* vgic_reg_access - access vgic register
* @mmio: pointer to the data describing the mmio access
@@ -101,6 +145,280 @@ static void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
}
}
+static bool handle_mmio_misc(struct kvm_vcpu *vcpu,
+ struct kvm_exit_mmio *mmio, u32 offset)
+{
+ u32 reg;
+ u32 u32off = offset & 3;
+
+ switch (offset & ~3) {
+ case 0: /* CTLR */
+ reg = vcpu->kvm->arch.vgic.enabled;
+ vgic_reg_access(mmio, &reg, u32off,
+ ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+ if (mmio->is_write) {
+ vcpu->kvm->arch.vgic.enabled = reg & 1;
+ vgic_update_state(vcpu->kvm);
+ return true;
+ }
+ break;
+
+ case 4: /* TYPER */
+ reg = (atomic_read(&vcpu->kvm->online_vcpus) - 1) << 5;
+ reg |= (VGIC_NR_IRQS >> 5) - 1;
+ vgic_reg_access(mmio, &reg, u32off,
+ ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+ break;
+
+ case 8: /* IIDR */
+ reg = 0x4B00043B;
+ vgic_reg_access(mmio, &reg, u32off,
+ ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+ break;
+ }
+
+ return false;
+}
+
+static bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu,
+ struct kvm_exit_mmio *mmio, u32 offset)
+{
+ vgic_reg_access(mmio, NULL, offset,
+ ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
+ return false;
+}
+
+static bool handle_mmio_set_enable_reg(struct kvm_vcpu *vcpu,
+ struct kvm_exit_mmio *mmio, u32 offset)
+{
+ u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_enabled,
+ vcpu->vcpu_id, offset);
+ vgic_reg_access(mmio, reg, offset,
+ ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
+ if (mmio->is_write) {
+ vgic_update_state(vcpu->kvm);
+ return true;
+ }
+
+ return false;
+}
+
+static bool handle_mmio_clear_enable_reg(struct kvm_vcpu *vcpu,
+ struct kvm_exit_mmio *mmio, u32 offset)
+{
+ u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_enabled,
+ vcpu->vcpu_id, offset);
+ vgic_reg_access(mmio, reg, offset,
+ ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
+ if (mmio->is_write) {
+ if (offset < 4) /* Force SGI enabled */
+ *reg |= 0xffff;
+ vgic_update_state(vcpu->kvm);
+ return true;
+ }
+
+ return false;
+}
+
+static bool handle_mmio_set_pending_reg(struct kvm_vcpu *vcpu,
+ struct kvm_exit_mmio *mmio, u32 offset)
+{
+ u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_state,
+ vcpu->vcpu_id, offset);
+ vgic_reg_access(mmio, reg, offset,
+ ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
+ if (mmio->is_write) {
+ vgic_update_state(vcpu->kvm);
+ return true;
+ }
+
+ return false;
+}
+
+static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
+ struct kvm_exit_mmio *mmio, u32 offset)
+{
+ u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_state,
+ vcpu->vcpu_id, offset);
+ vgic_reg_access(mmio, reg, offset,
+ ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
+ if (mmio->is_write) {
+ vgic_update_state(vcpu->kvm);
+ return true;
+ }
+
+ return false;
+}
+
+static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu,
+ struct kvm_exit_mmio *mmio, u32 offset)
+{
+ u32 *reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
+ vcpu->vcpu_id, offset);
+ vgic_reg_access(mmio, reg, offset,
+ ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+ return false;
+}
+
+static u32 vgic_get_target_reg(struct kvm *kvm, int irq)
+{
+ struct vgic_dist *dist = &kvm->arch.vgic;
+ struct kvm_vcpu *vcpu;
+ int i, c;
+ unsigned long *bmap;
+ u32 val = 0;
+
+ BUG_ON(irq & 3);
+ BUG_ON(irq < 32);
+
+ irq -= 32;
+
+ kvm_for_each_vcpu(c, vcpu, kvm) {
+ bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]);
+ for (i = 0; i < 4; i++)
+ if (test_bit(irq + i, bmap))
+ val |= 1 << (c + i * 8);
+ }
+
+ return val;
+}
+
+static void vgic_set_target_reg(struct kvm *kvm, u32 val, int irq)
+{
+ struct vgic_dist *dist = &kvm->arch.vgic;
+ struct kvm_vcpu *vcpu;
+ int i, c;
+ unsigned long *bmap;
+ u32 target;
+
+ BUG_ON(irq & 3);
+ BUG_ON(irq < 32);
+
+ irq -= 32;
+
+ /*
+ * Pick the LSB in each byte. This ensures we target exactly
+ * one vcpu per IRQ. If the byte is null, assume we target
+ * CPU0.
+ */
+ for (i = 0; i < 4; i++) {
+ int shift = i * 8;
+ target = ffs((val >> shift) & 0xffU);
+ target = target ? (target - 1) : 0;
+ dist->irq_spi_cpu[irq + i] = target;
+ kvm_for_each_vcpu(c, vcpu, kvm) {
+ bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]);
+ if (c == target)
+ set_bit(irq + i, bmap);
+ else
+ clear_bit(irq + i, bmap);
+ }
+ }
+}
+
+static bool handle_mmio_target_reg(struct kvm_vcpu *vcpu,
+ struct kvm_exit_mmio *mmio, u32 offset)
+{
+ u32 reg;
+
+ /* We treat the banked interrupts targets as read-only */
+ if (offset < 32) {
+ u32 roreg = 1 << vcpu->vcpu_id;
+ roreg |= roreg << 8;
+ roreg |= roreg << 16;
+
+ vgic_reg_access(mmio, &roreg, offset,
+ ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
+ return false;
+ }
+
+ reg = vgic_get_target_reg(vcpu->kvm, offset & ~3U);
+ vgic_reg_access(mmio, &reg, offset,
+ ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+ if (mmio->is_write) {
+ vgic_set_target_reg(vcpu->kvm, reg, offset & ~3U);
+ vgic_update_state(vcpu->kvm);
+ return true;
+ }
+
+ return false;
+}
+
+static u32 vgic_cfg_expand(u16 val)
+{
+ u32 res = 0;
+ int i;
+
+ for (i = 0; i < 16; i++)
+ res |= ((val >> i) & 1) << (2 * i + 1);
+
+ return res;
+}
+
+static u16 vgic_cfg_compress(u32 val)
+{
+ u16 res = 0;
+ int i;
+
+ for (i = 0; i < 16; i++)
+ res |= ((val >> (i * 2 + 1)) & 1) << i;
+
+ return res;
+}
+
+/*
+ * The distributor uses 2 bits per IRQ for the CFG register, but the
+ * LSB is always 0. As such, we only keep the upper bit, and use the
+ * two above functions to compress/expand the bits
+ */
+static bool handle_mmio_cfg_reg(struct kvm_vcpu *vcpu,
+ struct kvm_exit_mmio *mmio, u32 offset)
+{
+ u32 val;
+ u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
+ vcpu->vcpu_id, offset >> 1);
+ if (offset & 2)
+ val = *reg >> 16;
+ else
+ val = *reg & 0xffff;
+
+ val = vgic_cfg_expand(val);
+ vgic_reg_access(mmio, &val, offset,
+ ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
+ if (mmio->is_write) {
+ if (offset < 4) {
+ *reg = ~0U; /* Force PPIs/SGIs to 1 */
+ return false;
+ }
+
+ val = vgic_cfg_compress(val);
+ if (offset & 2) {
+ *reg &= 0xffff;
+ *reg |= val << 16;
+ } else {
+ *reg &= 0xffff << 16;
+ *reg |= val;
+ }
+ }
+
+ return false;
+}
+
+static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
+ struct kvm_exit_mmio *mmio, u32 offset)
+{
+ u32 reg;
+ vgic_reg_access(mmio, &reg, offset,
+ ACCESS_READ_RAZ | ACCESS_WRITE_VALUE);
+ if (mmio->is_write) {
+ vgic_dispatch_sgi(vcpu, reg);
+ vgic_update_state(vcpu->kvm);
+ return true;
+ }
+
+ return false;
+}
+
/* All this should be handled by kvm_bus_io_*()... FIXME!!! */
struct mmio_range {
unsigned long base;
@@ -110,6 +428,66 @@ struct mmio_range {
};
static const struct mmio_range vgic_ranges[] = {
+ { /* CTRL, TYPER, IIDR */
+ .base = 0,
+ .len = 12,
+ .handle_mmio = handle_mmio_misc,
+ },
+ { /* IGROUPRn */
+ .base = 0x80,
+ .len = VGIC_NR_IRQS / 8,
+ .handle_mmio = handle_mmio_raz_wi,
+ },
+ { /* ISENABLERn */
+ .base = 0x100,
+ .len = VGIC_NR_IRQS / 8,
+ .handle_mmio = handle_mmio_set_enable_reg,
+ },
+ { /* ICENABLERn */
+ .base = 0x180,
+ .len = VGIC_NR_IRQS / 8,
+ .handle_mmio = handle_mmio_clear_enable_reg,
+ },
+ { /* ISPENDRn */
+ .base = 0x200,
+ .len = VGIC_NR_IRQS / 8,
+ .handle_mmio = handle_mmio_set_pending_reg,
+ },
+ { /* ICPENDRn */
+ .base = 0x280,
+ .len = VGIC_NR_IRQS / 8,
+ .handle_mmio = handle_mmio_clear_pending_reg,
+ },
+ { /* ISACTIVERn */
+ .base = 0x300,
+ .len = VGIC_NR_IRQS / 8,
+ .handle_mmio = handle_mmio_raz_wi,
+ },
+ { /* ICACTIVERn */
+ .base = 0x380,
+ .len = VGIC_NR_IRQS / 8,
+ .handle_mmio = handle_mmio_raz_wi,
+ },
+ { /* IPRIORITYRn */
+ .base = 0x400,
+ .len = VGIC_NR_IRQS,
+ .handle_mmio = handle_mmio_priority_reg,
+ },
+ { /* ITARGETSRn */
+ .base = 0x800,
+ .len = VGIC_NR_IRQS,
+ .handle_mmio = handle_mmio_target_reg,
+ },
+ { /* ICFGRn */
+ .base = 0xC00,
+ .len = VGIC_NR_IRQS / 4,
+ .handle_mmio = handle_mmio_cfg_reg,
+ },
+ { /* SGIRn */
+ .base = 0xF00,
+ .len = 4,
+ .handle_mmio = handle_mmio_sgi_reg,
+ },
{}
};
@@ -141,7 +519,98 @@ struct mmio_range *find_matching_range(const struct mmio_range *ranges,
*/
bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exit_mmio *mmio)
{
- return KVM_EXIT_MMIO;
+ const struct mmio_range *range;
+ struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+ unsigned long base = dist->vgic_dist_base;
+ bool updated_state;
+
+ if (!irqchip_in_kernel(vcpu->kvm) ||
+ mmio->phys_addr < base ||
+ (mmio->phys_addr + mmio->len) > (base + VGIC_DIST_SIZE))
+ return false;
+
+ range = find_matching_range(vgic_ranges, mmio, base);
+ if (unlikely(!range || !range->handle_mmio)) {
+ pr_warn("Unhandled access %d %08llx %d\n",
+ mmio->is_write, mmio->phys_addr, mmio->len);
+ return false;
+ }
+
+ spin_lock(&vcpu->kvm->arch.vgic.lock);
+ updated_state = range->handle_mmio(vcpu, mmio,
+ mmio->phys_addr - range->base - base);
+ spin_unlock(&vcpu->kvm->arch.vgic.lock);
+ kvm_prepare_mmio(run, mmio);
+ kvm_handle_mmio_return(vcpu, run);
+
+ return true;
+}
+
+static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct vgic_dist *dist = &kvm->arch.vgic;
+ int nrcpus = atomic_read(&kvm->online_vcpus);
+ u8 target_cpus;
+ int sgi, mode, c, vcpu_id;
+
+ vcpu_id = vcpu->vcpu_id;
+
+ sgi = reg & 0xf;
+ target_cpus = (reg >> 16) & 0xff;
+ mode = (reg >> 24) & 3;
+
+ switch (mode) {
+ case 0: /* Targeted list from the register value */
+ if (!target_cpus)
+ return;
+ break;
+
+ case 1: /* All CPUs but self */
+ target_cpus = ((1 << nrcpus) - 1) & ~(1 << vcpu_id) & 0xff;
+ break;
+
+ case 2: /* Self only */
+ target_cpus = 1 << vcpu_id;
+ break;
+ }
+
+ kvm_for_each_vcpu(c, vcpu, kvm) {
+ if (target_cpus & 1) {
+ /* Flag the SGI as pending */
+ vgic_bitmap_set_irq_val(&dist->irq_state, c, sgi, 1);
+ dist->irq_sgi_sources[c][sgi] |= 1 << vcpu_id;
+ kvm_debug("SGI%d from CPU%d to CPU%d\n", sgi, vcpu_id, c);
+ }
+
+ target_cpus >>= 1;
+ }
+}
+
+static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
+{
+ return 0;
+}
+
+/*
+ * Update the interrupt state and determine which CPUs have pending
+ * interrupts. Must be called with distributor lock held.
+ */
+static void vgic_update_state(struct kvm *kvm)
+{
+ struct vgic_dist *dist = &kvm->arch.vgic;
+ struct kvm_vcpu *vcpu;
+ int c;
+
+ if (!dist->enabled) {
+ set_bit(0, &dist->irq_pending_on_cpu);
+ return;
+ }
+
+ kvm_for_each_vcpu(c, vcpu, kvm) {
+ if (compute_pending_for_cpu(vcpu)) {
+ pr_debug("CPU%d has pending interrupts\n", c);
+ set_bit(c, &dist->irq_pending_on_cpu);
+ }
+ }
}
static bool vgic_ioaddr_overlap(struct kvm *kvm)
* [PATCH v4 07/13] ARM: KVM: VGIC virtual CPU interface management
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
` (5 preceding siblings ...)
2012-11-10 15:44 ` [PATCH v4 06/13] ARM: KVM: VGIC distributor handling Christoffer Dall
@ 2012-11-10 15:45 ` Christoffer Dall
2012-12-03 13:23 ` Will Deacon
2012-11-10 15:45 ` [PATCH v4 08/13] ARM: KVM: vgic: retire queued, disabled interrupts Christoffer Dall
` (5 subsequent siblings)
12 siblings, 1 reply; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:45 UTC (permalink / raw)
To: linux-arm-kernel
From: Marc Zyngier <marc.zyngier@arm.com>
Add VGIC virtual CPU interface code, picking pending interrupts
from the distributor and stashing them in the VGIC control interface
list registers.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
arch/arm/include/asm/kvm_vgic.h | 41 +++++++
arch/arm/kvm/vgic.c | 226 +++++++++++++++++++++++++++++++++++++++
2 files changed, 266 insertions(+), 1 deletion(-)
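The list registers this patch fills have the bit layout declared in kvm_vgic.h (virtual IRQ ID in the low 10 bits, SGI source CPU in bits 10-12, state in bits 28-29). A small standalone sketch of that encoding, mirroring the VGIC_LR_* defines and the LR_PHYSID/MK_LR_PEND macros; the demo_lr helper is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define VGIC_LR_VIRTUALID	(0x3ff << 0)	/* bits 0-9: virtual IRQ */
#define VGIC_LR_PHYSID_CPUID	(7 << 10)	/* bits 10-12: SGI source CPU */
#define VGIC_LR_PENDING_BIT	(1 << 28)

#define LR_PHYSID(lr)		(((lr) & VGIC_LR_PHYSID_CPUID) >> 10)
#define MK_LR_PEND(src, irq)	(VGIC_LR_PENDING_BIT | ((src) << 10) | (irq))

/* Hypothetical helper: encode "SGI 7, pending, from source CPU 3". */
static uint32_t demo_lr(void)
{
	return MK_LR_PEND(3, 7);
}
```

Decoding the result with the same masks recovers both the interrupt number and the source CPU, which is what lets vgic_queue_irq piggyback a second SGI from the same source onto an already-allocated LR.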
diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
index 9e60b1d..7229324 100644
--- a/arch/arm/include/asm/kvm_vgic.h
+++ b/arch/arm/include/asm/kvm_vgic.h
@@ -193,8 +193,45 @@ struct vgic_dist {
};
struct vgic_cpu {
+#ifdef CONFIG_KVM_ARM_VGIC
+ /* per IRQ to LR mapping */
+ u8 vgic_irq_lr_map[VGIC_NR_IRQS];
+
+ /* Pending interrupts on this VCPU */
+ DECLARE_BITMAP(pending, VGIC_NR_IRQS);
+
+ /* Bitmap of used/free list registers */
+ DECLARE_BITMAP(lr_used, 64);
+
+ /* Number of list registers on this CPU */
+ int nr_lr;
+
+ /* CPU vif control registers for world switch */
+ u32 vgic_hcr;
+ u32 vgic_vmcr;
+ u32 vgic_misr; /* Saved only */
+ u32 vgic_eisr[2]; /* Saved only */
+ u32 vgic_elrsr[2]; /* Saved only */
+ u32 vgic_apr;
+ u32 vgic_lr[64]; /* A15 has only 4... */
+#endif
};
+#define VGIC_HCR_EN (1 << 0)
+#define VGIC_HCR_UIE (1 << 1)
+
+#define VGIC_LR_VIRTUALID (0x3ff << 0)
+#define VGIC_LR_PHYSID_CPUID (7 << 10)
+#define VGIC_LR_STATE (3 << 28)
+#define VGIC_LR_PENDING_BIT (1 << 28)
+#define VGIC_LR_ACTIVE_BIT (1 << 29)
+#define VGIC_LR_EOI (1 << 19)
+
+#define VGIC_MISR_EOI (1 << 0)
+#define VGIC_MISR_U (1 << 1)
+
+#define LR_EMPTY 0xff
+
struct kvm;
struct kvm_vcpu;
struct kvm_run;
@@ -202,9 +239,13 @@ struct kvm_exit_mmio;
#ifdef CONFIG_KVM_ARM_VGIC
int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
+void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu);
+void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu);
+int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
struct kvm_exit_mmio *mmio);
+#define irqchip_in_kernel(k) (!!((k)->arch.vgic.vctrl_base))
#else
static inline int kvm_vgic_hyp_init(void)
{
diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
index 82feee8..d7cdec5 100644
--- a/arch/arm/kvm/vgic.c
+++ b/arch/arm/kvm/vgic.c
@@ -587,7 +587,25 @@ static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg)
static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
{
- return 0;
+ struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+ unsigned long *pending, *enabled, *pend;
+ int vcpu_id;
+
+ vcpu_id = vcpu->vcpu_id;
+ pend = vcpu->arch.vgic_cpu.pending;
+
+ pending = vgic_bitmap_get_cpu_map(&dist->irq_state, vcpu_id);
+ enabled = vgic_bitmap_get_cpu_map(&dist->irq_enabled, vcpu_id);
+ bitmap_and(pend, pending, enabled, 32);
+
+ pending = vgic_bitmap_get_shared_map(&dist->irq_state);
+ enabled = vgic_bitmap_get_shared_map(&dist->irq_enabled);
+ bitmap_and(pend + 1, pending, enabled, VGIC_NR_SHARED_IRQS);
+ bitmap_and(pend + 1, pend + 1,
+ vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]),
+ VGIC_NR_SHARED_IRQS);
+
+ return (find_first_bit(pend, VGIC_NR_IRQS) < VGIC_NR_IRQS);
}
/*
@@ -613,6 +631,212 @@ static void vgic_update_state(struct kvm *kvm)
}
}
+#define LR_PHYSID(lr) (((lr) & VGIC_LR_PHYSID_CPUID) >> 10)
+#define MK_LR_PEND(src, irq) (VGIC_LR_PENDING_BIT | ((src) << 10) | (irq))
+/*
+ * Queue an interrupt to a CPU virtual interface. Return true on success,
+ * or false if it wasn't possible to queue it.
+ */
+static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
+{
+ struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+ struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+ int lr, is_level;
+
+ /* Sanitize the input... */
+ BUG_ON(sgi_source_id & ~7);
+ BUG_ON(sgi_source_id && irq > 15);
+ BUG_ON(irq >= VGIC_NR_IRQS);
+
+ kvm_debug("Queue IRQ%d\n", irq);
+
+ lr = vgic_cpu->vgic_irq_lr_map[irq];
+ is_level = !vgic_irq_is_edge(dist, irq);
+
+ /* Do we have an active interrupt for the same CPUID? */
+ if (lr != LR_EMPTY &&
+ (LR_PHYSID(vgic_cpu->vgic_lr[lr]) == sgi_source_id)) {
+ kvm_debug("LR%d piggyback for IRQ%d %x\n", lr, irq, vgic_cpu->vgic_lr[lr]);
+ BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
+ vgic_cpu->vgic_lr[lr] |= VGIC_LR_PENDING_BIT;
+ if (is_level)
+ vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
+ return true;
+ }
+
+ /* Try to use another LR for this interrupt */
+ lr = find_first_bit((unsigned long *)vgic_cpu->vgic_elrsr,
+ vgic_cpu->nr_lr);
+ if (lr >= vgic_cpu->nr_lr)
+ return false;
+
+ kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id);
+ vgic_cpu->vgic_lr[lr] = MK_LR_PEND(sgi_source_id, irq);
+ if (is_level)
+ vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
+
+ vgic_cpu->vgic_irq_lr_map[irq] = lr;
+ clear_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr);
+ set_bit(lr, vgic_cpu->lr_used);
+
+ return true;
+}
+
+/*
+ * Fill the list registers with pending interrupts before running the
+ * guest.
+ */
+static void __kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu)
+{
+ struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+ struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+ unsigned long *pending;
+ int i, c, vcpu_id;
+ int overflow = 0;
+
+ vcpu_id = vcpu->vcpu_id;
+
+ /*
+ * We may not have any pending interrupt, or the interrupts
+ * may have been serviced from another vcpu. In all cases,
+ * move along.
+ */
+ if (!kvm_vgic_vcpu_pending_irq(vcpu)) {
+ pr_debug("CPU%d has no pending interrupt\n", vcpu_id);
+ goto epilog;
+ }
+
+ /* SGIs */
+ pending = vgic_bitmap_get_cpu_map(&dist->irq_state, vcpu_id);
+ for_each_set_bit(i, vgic_cpu->pending, 16) {
+ unsigned long sources;
+
+ sources = dist->irq_sgi_sources[vcpu_id][i];
+ for_each_set_bit(c, &sources, 8) {
+ if (!vgic_queue_irq(vcpu, c, i)) {
+ overflow = 1;
+ continue;
+ }
+
+ clear_bit(c, &sources);
+ }
+
+ if (!sources)
+ clear_bit(i, pending);
+
+ dist->irq_sgi_sources[vcpu_id][i] = sources;
+ }
+
+ /* PPIs */
+ for_each_set_bit_from(i, vgic_cpu->pending, 32) {
+ if (!vgic_queue_irq(vcpu, 0, i)) {
+ overflow = 1;
+ continue;
+ }
+
+ clear_bit(i, pending);
+ }
+
+ /* SPIs */
+ pending = vgic_bitmap_get_shared_map(&dist->irq_state);
+ for_each_set_bit_from(i, vgic_cpu->pending, VGIC_NR_IRQS) {
+ if (vgic_bitmap_get_irq_val(&dist->irq_active, 0, i))
+ continue; /* level interrupt, already queued */
+
+ if (!vgic_queue_irq(vcpu, 0, i)) {
+ overflow = 1;
+ continue;
+ }
+
+ /* Immediate clear on edge, set active on level */
+ if (vgic_irq_is_edge(dist, i)) {
+ clear_bit(i - 32, pending);
+ clear_bit(i, vgic_cpu->pending);
+ } else {
+ vgic_bitmap_set_irq_val(&dist->irq_active, 0, i, 1);
+ }
+ }
+
+epilog:
+ if (overflow)
+ vgic_cpu->vgic_hcr |= VGIC_HCR_UIE;
+ else {
+ vgic_cpu->vgic_hcr &= ~VGIC_HCR_UIE;
+ /*
+ * We're about to run this VCPU, and we've consumed
+ * everything the distributor had in store for
+ * us. Claim we don't have anything pending. We'll
+ * adjust that if needed while exiting.
+ */
+ clear_bit(vcpu_id, &dist->irq_pending_on_cpu);
+ }
+}
+
+/*
+ * Sync back the VGIC state after a guest run. We do not really touch
+ * the distributor here (the irq_pending_on_cpu bit is safe to set),
+ * so there is no need for taking its lock.
+ */
+static void __kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu)
+{
+ struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+ struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+ int lr, pending;
+
+ /* Clear mappings for empty LRs */
+ for_each_set_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr,
+ vgic_cpu->nr_lr) {
+ int irq;
+
+ if (!test_and_clear_bit(lr, vgic_cpu->lr_used))
+ continue;
+
+ irq = vgic_cpu->vgic_lr[lr] & VGIC_LR_VIRTUALID;
+
+ BUG_ON(irq >= VGIC_NR_IRQS);
+ vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
+ }
+
+ /* Check if we still have something up our sleeve... */
+ pending = find_first_zero_bit((unsigned long *)vgic_cpu->vgic_elrsr,
+ vgic_cpu->nr_lr);
+ if (pending < vgic_cpu->nr_lr) {
+ set_bit(vcpu->vcpu_id, &dist->irq_pending_on_cpu);
+ smp_mb();
+ }
+}
+
+void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu)
+{
+ struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+ if (!irqchip_in_kernel(vcpu->kvm))
+ return;
+
+ spin_lock(&dist->lock);
+ __kvm_vgic_sync_to_cpu(vcpu);
+ spin_unlock(&dist->lock);
+}
+
+void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu)
+{
+ if (!irqchip_in_kernel(vcpu->kvm))
+ return;
+
+ __kvm_vgic_sync_from_cpu(vcpu);
+}
+
+int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
+{
+ struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+ if (!irqchip_in_kernel(vcpu->kvm))
+ return 0;
+
+ return test_bit(vcpu->vcpu_id, &dist->irq_pending_on_cpu);
+}
+
static bool vgic_ioaddr_overlap(struct kvm *kvm)
{
phys_addr_t dist = kvm->arch.vgic.vgic_dist_base;
* [PATCH v4 08/13] ARM: KVM: vgic: retire queued, disabled interrupts
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
` (6 preceding siblings ...)
2012-11-10 15:45 ` [PATCH v4 07/13] ARM: KVM: VGIC virtual CPU interface management Christoffer Dall
@ 2012-11-10 15:45 ` Christoffer Dall
2012-12-03 13:24 ` Will Deacon
2012-11-10 15:45 ` [PATCH v4 09/13] ARM: KVM: VGIC interrupt injection Christoffer Dall
` (4 subsequent siblings)
12 siblings, 1 reply; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:45 UTC (permalink / raw)
To: linux-arm-kernel
From: Marc Zyngier <marc.zyngier@arm.com>
An interrupt may have been disabled after being made pending on the
CPU interface (the classic case is a timer running while we're
rebooting the guest - the interrupt would kick as soon as the CPU
interface gets enabled, with deadly consequences).
The solution is to examine already active LRs, and check the
interrupt is still enabled. If not, just retire it.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
arch/arm/kvm/vgic.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
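The retire logic can be modelled in isolation. A toy sketch under stated assumptions (struct toy_cpu, retire_disabled and retire_demo are all illustrative names, not kernel code; LR and bitmap widths are shrunk for clarity):

```c
#include <assert.h>
#include <stdint.h>

#define NR_LR		4
#define LR_STATE	(3u << 28)
#define LR_EMPTY	0xff

/* Toy model of one CPU interface: which LRs are in use, which IRQ each
 * LR carries (low 10 bits), and which IRQs are currently enabled. */
struct toy_cpu {
	uint32_t lr[NR_LR];
	uint8_t irq_lr_map[32];
	unsigned long lr_used;		/* one bit per LR */
	unsigned long enabled;		/* one bit per IRQ */
};

/* Walk the used LRs and retire any whose interrupt has been disabled,
 * mirroring the structure of vgic_retire_disabled_irqs() above. */
static void retire_disabled(struct toy_cpu *c)
{
	for (int i = 0; i < NR_LR; i++) {
		uint32_t irq;

		if (!(c->lr_used & (1UL << i)))
			continue;

		irq = c->lr[i] & 0x3ff;
		if (!(c->enabled & (1UL << irq))) {
			c->irq_lr_map[irq] = LR_EMPTY;	/* forget the mapping */
			c->lr_used &= ~(1UL << i);	/* free the LR */
			c->lr[i] &= ~LR_STATE;		/* clear pending/active */
		}
	}
}

/* Scenario: IRQ 5 is queued in LR0 but has since been disabled, while
 * IRQ 6 in LR1 is still enabled. Returns lr_used after retiring. */
static unsigned long retire_demo(void)
{
	struct toy_cpu c = {
		.lr = { 5, 6 },
		.lr_used = 0x3,
		.enabled = 1UL << 6,
	};

	retire_disabled(&c);
	return c.lr_used;
}
```

Only the disabled interrupt's LR is reclaimed; the enabled one stays queued for the guest.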
diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
index d7cdec5..dda5623 100644
--- a/arch/arm/kvm/vgic.c
+++ b/arch/arm/kvm/vgic.c
@@ -633,6 +633,34 @@ static void vgic_update_state(struct kvm *kvm)
#define LR_PHYSID(lr) (((lr) & VGIC_LR_PHYSID_CPUID) >> 10)
#define MK_LR_PEND(src, irq) (VGIC_LR_PENDING_BIT | ((src) << 10) | (irq))
+
+/*
+ * An interrupt may have been disabled after being made pending on the
+ * CPU interface (the classic case is a timer running while we're
+ * rebooting the guest - the interrupt would kick as soon as the CPU
+ * interface gets enabled, with deadly consequences).
+ *
+ * The solution is to examine already active LRs, and check the
+ * interrupt is still enabled. If not, just retire it.
+ */
+static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
+{
+ struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+ struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+ int lr;
+
+ for_each_set_bit(lr, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
+ int irq = vgic_cpu->vgic_lr[lr] & VGIC_LR_VIRTUALID;
+
+ if (!vgic_bitmap_get_irq_val(&dist->irq_enabled,
+ vcpu->vcpu_id, irq)) {
+ vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
+ clear_bit(lr, vgic_cpu->lr_used);
+ vgic_cpu->vgic_lr[lr] &= ~VGIC_LR_STATE;
+ }
+ }
+}
+
/*
* Queue an interrupt to a CPU virtual interface. Return true on success,
* or false if it wasn't possible to queue it.
@@ -696,6 +724,8 @@ static void __kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu)
vcpu_id = vcpu->vcpu_id;
+ vgic_retire_disabled_irqs(vcpu);
+
/*
* We may not have any pending interrupt, or the interrupts
* may have been serviced from another vcpu. In all cases,
* [PATCH v4 09/13] ARM: KVM: VGIC interrupt injection
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
` (7 preceding siblings ...)
2012-11-10 15:45 ` [PATCH v4 08/13] ARM: KVM: vgic: retire queued, disabled interrupts Christoffer Dall
@ 2012-11-10 15:45 ` Christoffer Dall
2012-12-03 13:25 ` Will Deacon
2012-11-10 15:45 ` [PATCH v4 10/13] ARM: KVM: VGIC control interface world switch Christoffer Dall
` (3 subsequent siblings)
12 siblings, 1 reply; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:45 UTC (permalink / raw)
To: linux-arm-kernel
From: Marc Zyngier <marc.zyngier@arm.com>
Plug the interrupt injection code. Interrupts can now be generated
from user space.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
arch/arm/include/asm/kvm_vgic.h | 8 +++
arch/arm/kvm/arm.c | 29 +++++++++++++
arch/arm/kvm/vgic.c | 90 +++++++++++++++++++++++++++++++++++++++
3 files changed, 127 insertions(+)
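The ioctl path added here routes PPIs to a specific vcpu and SPIs to the distributor, both selected by fields packed into the irq number of KVM_IRQ_LINE. A userspace-side sketch of that packing; the shift values and type constants are assumptions based on this series' uapi additions, and the helper name is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Field layout assumed from the series' uapi changes (KVM_ARM_IRQ_*):
 * bits 24-31 select the type, bits 16-23 the vcpu, bits 0-15 the IRQ. */
#define KVM_ARM_IRQ_TYPE_SHIFT	24
#define KVM_ARM_IRQ_VCPU_SHIFT	16
#define KVM_ARM_IRQ_TYPE_SPI	1
#define KVM_ARM_IRQ_TYPE_PPI	2

/* Pack the 'irq' field of struct kvm_irq_level for KVM_IRQ_LINE. */
static uint32_t arm_irq_field(uint32_t type, uint32_t vcpu, uint32_t num)
{
	return (type << KVM_ARM_IRQ_TYPE_SHIFT) |
	       (vcpu << KVM_ARM_IRQ_VCPU_SHIFT) |
	       num;
}
```

In a VMM this value would go into the irq member of struct kvm_irq_level, alongside the level, before issuing the KVM_IRQ_LINE ioctl on the VM file descriptor.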
diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
index 7229324..6e3d303 100644
--- a/arch/arm/include/asm/kvm_vgic.h
+++ b/arch/arm/include/asm/kvm_vgic.h
@@ -241,6 +241,8 @@ struct kvm_exit_mmio;
int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu);
void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu);
+int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
+ bool level);
int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
struct kvm_exit_mmio *mmio);
@@ -271,6 +273,12 @@ static inline void kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu) {}
static inline void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu) {}
static inline void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu) {}
+static inline int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid,
+ const struct kvm_irq_level *irq)
+{
+ return 0;
+}
+
static inline int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
{
return 0;
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 3ac1aab..f43da01 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -764,10 +764,31 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level)
switch (irq_type) {
case KVM_ARM_IRQ_TYPE_CPU:
+ if (irqchip_in_kernel(kvm))
+ return -ENXIO;
+
if (irq_num > KVM_ARM_IRQ_CPU_FIQ)
return -EINVAL;
return vcpu_interrupt_line(vcpu, irq_num, level);
+#ifdef CONFIG_KVM_ARM_VGIC
+ case KVM_ARM_IRQ_TYPE_PPI:
+ if (!irqchip_in_kernel(kvm))
+ return -ENXIO;
+
+ if (irq_num < 16 || irq_num > 31)
+ return -EINVAL;
+
+ return kvm_vgic_inject_irq(kvm, vcpu->vcpu_id, irq_num, level);
+ case KVM_ARM_IRQ_TYPE_SPI:
+ if (!irqchip_in_kernel(kvm))
+ return -ENXIO;
+
+ if (irq_num < 32 || irq_num > KVM_ARM_IRQ_GIC_MAX)
+ return -EINVAL;
+
+ return kvm_vgic_inject_irq(kvm, 0, irq_num, level);
+#endif
}
return -EINVAL;
@@ -849,6 +870,14 @@ long kvm_arch_vm_ioctl(struct file *filp,
void __user *argp = (void __user *)arg;
switch (ioctl) {
+#ifdef CONFIG_KVM_ARM_VGIC
+ case KVM_CREATE_IRQCHIP: {
+ if (vgic_present)
+ return kvm_vgic_create(kvm);
+ else
+ return -EINVAL;
+ }
+#endif
case KVM_SET_DEVICE_ADDRESS: {
struct kvm_device_address dev_addr;
diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
index dda5623..70040bb 100644
--- a/arch/arm/kvm/vgic.c
+++ b/arch/arm/kvm/vgic.c
@@ -75,6 +75,7 @@
#define ACCESS_WRITE_MASK(x) ((x) & (3 << 1))
static void vgic_update_state(struct kvm *kvm);
+static void vgic_kick_vcpus(struct kvm *kvm);
static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
static inline int vgic_irq_is_edge(struct vgic_dist *dist, int irq)
@@ -542,6 +543,9 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exi
kvm_prepare_mmio(run, mmio);
kvm_handle_mmio_return(vcpu, run);
+ if (updated_state)
+ vgic_kick_vcpus(vcpu->kvm);
+
return true;
}
@@ -867,6 +871,92 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
return test_bit(vcpu->vcpu_id, &dist->irq_pending_on_cpu);
}
+static void vgic_kick_vcpus(struct kvm *kvm)
+{
+ struct kvm_vcpu *vcpu;
+ int c;
+
+ /*
+ * We've injected an interrupt, time to find out who deserves
+ * a good kick...
+ */
+ kvm_for_each_vcpu(c, vcpu, kvm) {
+ if (kvm_vgic_vcpu_pending_irq(vcpu))
+ kvm_vcpu_kick(vcpu);
+ }
+}
+
+static bool vgic_update_irq_state(struct kvm *kvm, int cpuid,
+ unsigned int irq_num, bool level)
+{
+ struct vgic_dist *dist = &kvm->arch.vgic;
+ struct kvm_vcpu *vcpu;
+ int is_edge, is_level, state;
+ int enabled;
+ bool ret = true;
+
+ spin_lock(&dist->lock);
+
+ is_edge = vgic_irq_is_edge(dist, irq_num);
+ is_level = !is_edge;
+ state = vgic_bitmap_get_irq_val(&dist->irq_state, cpuid, irq_num);
+
+ /*
+ * Only inject an interrupt if:
+ * - level triggered and we change level
+ * - edge triggered and we have a rising edge
+ */
+ if ((is_level && !(state ^ level)) || (is_edge && (state || !level))) {
+ ret = false;
+ goto out;
+ }
+
+ vgic_bitmap_set_irq_val(&dist->irq_state, cpuid, irq_num, level);
+
+ enabled = vgic_bitmap_get_irq_val(&dist->irq_enabled, cpuid, irq_num);
+
+ if (!enabled) {
+ ret = false;
+ goto out;
+ }
+
+ if (is_level && vgic_bitmap_get_irq_val(&dist->irq_active,
+ cpuid, irq_num)) {
+ /*
+ * Level interrupt in progress, will be picked up
+ * when EOId.
+ */
+ ret = false;
+ goto out;
+ }
+
+ if (irq_num >= 32)
+ cpuid = dist->irq_spi_cpu[irq_num - 32];
+
+ kvm_debug("Inject IRQ%d level %d CPU%d\n", irq_num, level, cpuid);
+
+ vcpu = kvm_get_vcpu(kvm, cpuid);
+
+ if (level) {
+ set_bit(irq_num, vcpu->arch.vgic_cpu.pending);
+ set_bit(cpuid, &dist->irq_pending_on_cpu);
+ }
+
+out:
+ spin_unlock(&dist->lock);
+
+ return ret;
+}
+
+int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
+ bool level)
+{
+ if (vgic_update_irq_state(kvm, cpuid, irq_num, level))
+ vgic_kick_vcpus(kvm);
+
+ return 0;
+}
+
static bool vgic_ioaddr_overlap(struct kvm *kvm)
{
phys_addr_t dist = kvm->arch.vgic.vgic_dist_base;
^ permalink raw reply related [flat|nested] 58+ messages in thread
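[Editor's note: the guard in vgic_update_irq_state() above encodes a small truth table — level-triggered interrupts are only injected on a change of line level, edge-triggered ones only on a rising edge. The sketch below models that guard in Python for illustration; it is not kernel code and all names are hypothetical.]

```python
def should_inject(is_edge, state, level):
    """Model of the guard in vgic_update_irq_state(): reject when
    (level-triggered and the line level is unchanged) or
    (edge-triggered and this is not a rising edge)."""
    is_level = not is_edge
    if (is_level and not (state ^ level)) or (is_edge and (state or not level)):
        return False
    return True

# Level-triggered: inject only when the emulated line actually changes state.
assert should_inject(is_edge=False, state=0, level=1) is True
assert should_inject(is_edge=False, state=1, level=1) is False
# Edge-triggered: inject only on a rising edge of a currently-low line.
assert should_inject(is_edge=True, state=0, level=1) is True
assert should_inject(is_edge=True, state=0, level=0) is False
assert should_inject(is_edge=True, state=1, level=1) is False
```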
* [PATCH v4 10/13] ARM: KVM: VGIC control interface world switch
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
` (8 preceding siblings ...)
2012-11-10 15:45 ` [PATCH v4 09/13] ARM: KVM: VGIC interrupt injection Christoffer Dall
@ 2012-11-10 15:45 ` Christoffer Dall
2012-12-03 13:31 ` Will Deacon
2012-11-10 15:45 ` [PATCH v4 11/13] ARM: KVM: VGIC initialisation code Christoffer Dall
` (2 subsequent siblings)
12 siblings, 1 reply; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:45 UTC (permalink / raw)
To: linux-arm-kernel
From: Marc Zyngier <marc.zyngier@arm.com>
Enable the VGIC control interface state to be saved and restored on world switch.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
arch/arm/include/asm/kvm_arm.h | 12 +++++++
arch/arm/kernel/asm-offsets.c | 12 +++++++
arch/arm/kvm/interrupts_head.S | 68 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 92 insertions(+)
diff --git a/arch/arm/include/asm/kvm_arm.h b/arch/arm/include/asm/kvm_arm.h
index 246afd7..8f5dd22 100644
--- a/arch/arm/include/asm/kvm_arm.h
+++ b/arch/arm/include/asm/kvm_arm.h
@@ -192,4 +192,16 @@
#define HSR_EC_DABT (0x24)
#define HSR_EC_DABT_HYP (0x25)
+/* GICH offsets */
+#define GICH_HCR 0x0
+#define GICH_VTR 0x4
+#define GICH_VMCR 0x8
+#define GICH_MISR 0x10
+#define GICH_EISR0 0x20
+#define GICH_EISR1 0x24
+#define GICH_ELRSR0 0x30
+#define GICH_ELRSR1 0x34
+#define GICH_APR 0xf0
+#define GICH_LR0 0x100
+
#endif /* __ARM_KVM_ARM_H__ */
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 95cab37..39b6221 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -167,6 +167,18 @@ int main(void)
DEFINE(VCPU_HxFAR, offsetof(struct kvm_vcpu, arch.hxfar));
DEFINE(VCPU_HPFAR, offsetof(struct kvm_vcpu, arch.hpfar));
DEFINE(VCPU_HYP_PC, offsetof(struct kvm_vcpu, arch.hyp_pc));
+#ifdef CONFIG_KVM_ARM_VGIC
+ DEFINE(VCPU_VGIC_CPU, offsetof(struct kvm_vcpu, arch.vgic_cpu));
+ DEFINE(VGIC_CPU_HCR, offsetof(struct vgic_cpu, vgic_hcr));
+ DEFINE(VGIC_CPU_VMCR, offsetof(struct vgic_cpu, vgic_vmcr));
+ DEFINE(VGIC_CPU_MISR, offsetof(struct vgic_cpu, vgic_misr));
+ DEFINE(VGIC_CPU_EISR, offsetof(struct vgic_cpu, vgic_eisr));
+ DEFINE(VGIC_CPU_ELRSR, offsetof(struct vgic_cpu, vgic_elrsr));
+ DEFINE(VGIC_CPU_APR, offsetof(struct vgic_cpu, vgic_apr));
+ DEFINE(VGIC_CPU_LR, offsetof(struct vgic_cpu, vgic_lr));
+ DEFINE(VGIC_CPU_NR_LR, offsetof(struct vgic_cpu, nr_lr));
+ DEFINE(KVM_VGIC_VCTRL, offsetof(struct kvm, arch.vgic.vctrl_base));
+#endif
DEFINE(KVM_VTTBR, offsetof(struct kvm, arch.vttbr));
#endif
return 0;
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index 2ac8b4a..c2423d8 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -341,6 +341,45 @@
* @vcpup: Register pointing to VCPU struct
*/
.macro save_vgic_state vcpup
+#ifdef CONFIG_KVM_ARM_VGIC
+ /* Get VGIC VCTRL base into r2 */
+ ldr r2, [\vcpup, #VCPU_KVM]
+ ldr r2, [r2, #KVM_VGIC_VCTRL]
+ cmp r2, #0
+ beq 2f
+
+ /* Compute the address of struct vgic_cpu */
+ add r11, \vcpup, #VCPU_VGIC_CPU
+
+ /* Save all interesting registers */
+ ldr r3, [r2, #GICH_HCR]
+ ldr r4, [r2, #GICH_VMCR]
+ ldr r5, [r2, #GICH_MISR]
+ ldr r6, [r2, #GICH_EISR0]
+ ldr r7, [r2, #GICH_EISR1]
+ ldr r8, [r2, #GICH_ELRSR0]
+ ldr r9, [r2, #GICH_ELRSR1]
+ ldr r10, [r2, #GICH_APR]
+
+ str r3, [r11, #VGIC_CPU_HCR]
+ str r4, [r11, #VGIC_CPU_VMCR]
+ str r5, [r11, #VGIC_CPU_MISR]
+ str r6, [r11, #VGIC_CPU_EISR]
+ str r7, [r11, #(VGIC_CPU_EISR + 4)]
+ str r8, [r11, #VGIC_CPU_ELRSR]
+ str r9, [r11, #(VGIC_CPU_ELRSR + 4)]
+ str r10, [r11, #VGIC_CPU_APR]
+
+ /* Save list registers */
+ add r2, r2, #GICH_LR0
+ add r3, r11, #VGIC_CPU_LR
+ ldr r4, [r11, #VGIC_CPU_NR_LR]
+1: ldr r6, [r2], #4
+ str r6, [r3], #4
+ subs r4, r4, #1
+ bne 1b
+2:
+#endif
.endm
/*
@@ -348,6 +387,35 @@
* @vcpup: Register pointing to VCPU struct
*/
.macro restore_vgic_state vcpup
+#ifdef CONFIG_KVM_ARM_VGIC
+ /* Get VGIC VCTRL base into r2 */
+ ldr r2, [\vcpup, #VCPU_KVM]
+ ldr r2, [r2, #KVM_VGIC_VCTRL]
+ cmp r2, #0
+ beq 2f
+
+ /* Compute the address of struct vgic_cpu */
+ add r11, \vcpup, #VCPU_VGIC_CPU
+
+ /* We only restore a minimal set of registers */
+ ldr r3, [r11, #VGIC_CPU_HCR]
+ ldr r4, [r11, #VGIC_CPU_VMCR]
+ ldr r8, [r11, #VGIC_CPU_APR]
+
+ str r3, [r2, #GICH_HCR]
+ str r4, [r2, #GICH_VMCR]
+ str r8, [r2, #GICH_APR]
+
+ /* Restore list registers */
+ add r2, r2, #GICH_LR0
+ add r3, r11, #VGIC_CPU_LR
+ ldr r4, [r11, #VGIC_CPU_NR_LR]
+1: ldr r6, [r3], #4
+ str r6, [r2], #4
+ subs r4, r4, #1
+ bne 1b
+2:
+#endif
.endm
/* Configures the HSTR (Hyp System Trap Register) on entry/return
^ permalink raw reply related [flat|nested] 58+ messages in thread
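[Editor's note: the list-register save loop in save_vgic_state above walks nr_lr consecutive 32-bit GICH list registers starting at GICH_LR0 (offset 0x100, each register 4 bytes apart, hence the `lr << 2` addressing used elsewhere in this series). A minimal Python model of that loop, with `gich_read` as a hypothetical stand-in for a 32-bit MMIO read:]

```python
GICH_LR0 = 0x100  # offset of the first list register, from the patch above

def save_list_registers(gich_read, nr_lr):
    """gich_read(offset) models readl() at that GICH offset; the loop
    copies nr_lr consecutive 32-bit list registers into a save area."""
    return [gich_read(GICH_LR0 + (lr << 2)) for lr in range(nr_lr)]

# Fake GICH interface with 4 list registers (the Cortex-A15 count noted
# in this series).
regs = {0x100: 0xdead0001, 0x104: 0xdead0002, 0x108: 0xdead0003, 0x10c: 0xdead0004}
saved = save_list_registers(regs.__getitem__, nr_lr=4)
assert saved == [0xdead0001, 0xdead0002, 0xdead0003, 0xdead0004]
```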
* [PATCH v4 11/13] ARM: KVM: VGIC initialisation code
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
` (9 preceding siblings ...)
2012-11-10 15:45 ` [PATCH v4 10/13] ARM: KVM: VGIC control interface world switch Christoffer Dall
@ 2012-11-10 15:45 ` Christoffer Dall
2012-12-05 10:43 ` Will Deacon
2012-11-10 15:45 ` [PATCH v4 12/13] ARM: KVM: vgic: reduce the number of vcpu kick Christoffer Dall
2012-11-10 15:45 ` [PATCH v4 13/13] ARM: KVM: Add VGIC configuration option Christoffer Dall
12 siblings, 1 reply; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:45 UTC (permalink / raw)
To: linux-arm-kernel
From: Marc Zyngier <marc.zyngier@arm.com>
Add the init code for the hypervisor, the virtual machine, and
the virtual CPUs.
An interrupt handler is also wired up to handle the VGIC maintenance
interrupt, which is used to deal with level-triggered interrupts and LR
underflows.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
arch/arm/include/asm/kvm_vgic.h | 11 ++
arch/arm/kvm/arm.c | 14 ++
arch/arm/kvm/vgic.c | 237 +++++++++++++++++++++++++++++++++++++++
3 files changed, 258 insertions(+), 4 deletions(-)
diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
index 6e3d303..a8e7a93 100644
--- a/arch/arm/include/asm/kvm_vgic.h
+++ b/arch/arm/include/asm/kvm_vgic.h
@@ -154,6 +154,7 @@ static inline void vgic_bytemap_set_irq_val(struct vgic_bytemap *x,
struct vgic_dist {
#ifdef CONFIG_KVM_ARM_VGIC
spinlock_t lock;
+ bool ready;
/* Virtual control interface mapping */
void __iomem *vctrl_base;
@@ -239,6 +240,10 @@ struct kvm_exit_mmio;
#ifdef CONFIG_KVM_ARM_VGIC
int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
+int kvm_vgic_hyp_init(void);
+int kvm_vgic_init(struct kvm *kvm);
+int kvm_vgic_create(struct kvm *kvm);
+void kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu);
void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu);
void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu);
int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
@@ -248,6 +253,7 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
struct kvm_exit_mmio *mmio);
#define irqchip_in_kernel(k) (!!((k)->arch.vgic.vctrl_base))
+#define vgic_initialized(k) ((k)->arch.vgic.ready)
#else
static inline int kvm_vgic_hyp_init(void)
{
@@ -294,6 +300,11 @@ static inline int irqchip_in_kernel(struct kvm *kvm)
{
return 0;
}
+
+static inline bool vgic_initialized(struct kvm *kvm)
+{
+ return true;
+}
#endif
#endif
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index f43da01..a633d9d 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -187,6 +187,8 @@ int kvm_dev_ioctl_check_extension(long ext)
switch (ext) {
#ifdef CONFIG_KVM_ARM_VGIC
case KVM_CAP_IRQCHIP:
+ r = vgic_present;
+ break;
#endif
case KVM_CAP_USER_MEMORY:
case KVM_CAP_DESTROY_MEMORY_REGION_WORKS:
@@ -623,6 +625,14 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (unlikely(vcpu->arch.target < 0))
return -ENOEXEC;
+ /* Initialize the VGIC before running a vcpu the first time on this VM */
+ if (unlikely(irqchip_in_kernel(vcpu->kvm) &&
+ !vgic_initialized(vcpu->kvm))) {
+ ret = kvm_vgic_init(vcpu->kvm);
+ if (ret)
+ return ret;
+ }
+
if (run->exit_reason == KVM_EXIT_MMIO) {
ret = kvm_handle_mmio_return(vcpu, vcpu->run);
if (ret)
@@ -1024,8 +1034,8 @@ static int init_hyp_mode(void)
* Init HYP view of VGIC
*/
err = kvm_vgic_hyp_init();
- if (err)
- goto out_free_mappings;
+ if (!err)
+ vgic_present = true;
return 0;
out_free_vfp:
diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
index 70040bb..415ddb8 100644
--- a/arch/arm/kvm/vgic.c
+++ b/arch/arm/kvm/vgic.c
@@ -20,7 +20,14 @@
#include <linux/kvm_host.h>
#include <linux/interrupt.h>
#include <linux/io.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_irq.h>
+
#include <asm/kvm_emulate.h>
+#include <asm/hardware/gic.h>
+#include <asm/kvm_arm.h>
+#include <asm/kvm_mmu.h>
/*
* How the whole thing works (courtesy of Christoffer Dall):
@@ -59,11 +66,18 @@
*/
#define VGIC_ADDR_UNDEF (-1)
-#define IS_VGIC_ADDR_UNDEF(_x) ((_x) == (typeof(_x))VGIC_ADDR_UNDEF)
+#define IS_VGIC_ADDR_UNDEF(_x) ((_x) == VGIC_ADDR_UNDEF)
#define VGIC_DIST_SIZE 0x1000
#define VGIC_CPU_SIZE 0x2000
+/* Physical address of vgic virtual cpu interface */
+static phys_addr_t vgic_vcpu_base;
+
+/* Virtual control interface base address */
+static void __iomem *vgic_vctrl_base;
+
+static struct device_node *vgic_node;
#define ACCESS_READ_VALUE (1 << 0)
#define ACCESS_READ_RAZ (0 << 0)
@@ -527,7 +541,7 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exi
if (!irqchip_in_kernel(vcpu->kvm) ||
mmio->phys_addr < base ||
- (mmio->phys_addr + mmio->len) > (base + dist->vgic_dist_size))
+ (mmio->phys_addr + mmio->len) > (base + VGIC_DIST_SIZE))
return false;
range = find_matching_range(vgic_ranges, mmio, base);
@@ -957,6 +971,225 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
return 0;
}
+static irqreturn_t vgic_maintenance_handler(int irq, void *data)
+{
+ struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)data;
+ struct vgic_dist *dist;
+ struct vgic_cpu *vgic_cpu;
+
+ if (WARN(!vcpu,
+ "VGIC interrupt on CPU %d with no vcpu\n", smp_processor_id()))
+ return IRQ_HANDLED;
+
+ vgic_cpu = &vcpu->arch.vgic_cpu;
+ dist = &vcpu->kvm->arch.vgic;
+ kvm_debug("MISR = %08x\n", vgic_cpu->vgic_misr);
+
+ /*
+ * We do not need to take the distributor lock here, since the only
+ * action we perform is clearing the irq_active_bit for an EOIed
+ * level interrupt. There is a potential race with
+ * the queuing of an interrupt in __kvm_sync_to_cpu(), where we check
+ * if the interrupt is already active. Two possibilities:
+ *
+ * - The queuing is occurring on the same vcpu: cannot happen, as we're
+ * already in the context of this vcpu, and executing the handler
+ * - The interrupt has been migrated to another vcpu, and we ignore
+ * this interrupt for this run. Big deal. It is still pending though,
+ * and will get considered when this vcpu exits.
+ */
+ if (vgic_cpu->vgic_misr & VGIC_MISR_EOI) {
+ /*
+ * Some level interrupts have been EOIed. Clear their
+ * active bit.
+ */
+ int lr, irq;
+
+ for_each_set_bit(lr, (unsigned long *)vgic_cpu->vgic_eisr,
+ vgic_cpu->nr_lr) {
+ irq = vgic_cpu->vgic_lr[lr] & VGIC_LR_VIRTUALID;
+
+ vgic_bitmap_set_irq_val(&dist->irq_active,
+ vcpu->vcpu_id, irq, 0);
+ vgic_cpu->vgic_lr[lr] &= ~VGIC_LR_EOI;
+ writel_relaxed(vgic_cpu->vgic_lr[lr],
+ dist->vctrl_base + GICH_LR0 + (lr << 2));
+
+ /* Any additional pending interrupt? */
+ if (vgic_bitmap_get_irq_val(&dist->irq_state,
+ vcpu->vcpu_id, irq)) {
+ set_bit(irq, vcpu->arch.vgic_cpu.pending);
+ set_bit(vcpu->vcpu_id,
+ &dist->irq_pending_on_cpu);
+ } else {
+ clear_bit(irq, vgic_cpu->pending);
+ }
+ }
+ }
+
+ if (vgic_cpu->vgic_misr & VGIC_MISR_U) {
+ vgic_cpu->vgic_hcr &= ~VGIC_HCR_UIE;
+ writel_relaxed(vgic_cpu->vgic_hcr, dist->vctrl_base + GICH_HCR);
+ }
+
+ return IRQ_HANDLED;
+}
+
+void kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
+{
+ struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+ struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+ u32 reg;
+ int i;
+
+ if (!irqchip_in_kernel(vcpu->kvm))
+ return;
+
+ for (i = 0; i < VGIC_NR_IRQS; i++) {
+ if (i < 16)
+ vgic_bitmap_set_irq_val(&dist->irq_enabled,
+ vcpu->vcpu_id, i, 1);
+ if (i < 32)
+ vgic_bitmap_set_irq_val(&dist->irq_cfg,
+ vcpu->vcpu_id, i, 1);
+
+ vgic_cpu->vgic_irq_lr_map[i] = LR_EMPTY;
+ }
+
+ BUG_ON(!vcpu->kvm->arch.vgic.vctrl_base);
+ reg = readl_relaxed(vcpu->kvm->arch.vgic.vctrl_base + GICH_VTR);
+ vgic_cpu->nr_lr = (reg & 0x1f) + 1;
+
+ reg = readl_relaxed(vcpu->kvm->arch.vgic.vctrl_base + GICH_VMCR);
+ vgic_cpu->vgic_vmcr = reg | (0x1f << 27); /* Priority */
+
+ vgic_cpu->vgic_hcr |= VGIC_HCR_EN; /* Get the show on the road... */
+}
+
+static void vgic_init_maintenance_interrupt(void *info)
+{
+ unsigned int *irqp = info;
+
+ enable_percpu_irq(*irqp, 0);
+}
+
+int kvm_vgic_hyp_init(void)
+{
+ int ret;
+ unsigned int irq;
+ struct resource vctrl_res;
+ struct resource vcpu_res;
+
+ vgic_node = of_find_compatible_node(NULL, NULL, "arm,cortex-a15-gic");
+ if (!vgic_node)
+ return -ENODEV;
+
+ irq = irq_of_parse_and_map(vgic_node, 0);
+ if (!irq)
+ return -ENXIO;
+
+ ret = request_percpu_irq(irq, vgic_maintenance_handler,
+ "vgic", kvm_get_running_vcpus());
+ if (ret) {
+ kvm_err("Cannot register interrupt %d\n", irq);
+ return ret;
+ }
+
+ ret = of_address_to_resource(vgic_node, 2, &vctrl_res);
+ if (ret) {
+ kvm_err("Cannot obtain VCTRL resource\n");
+ goto out_free_irq;
+ }
+
+ vgic_vctrl_base = of_iomap(vgic_node, 2);
+ if (!vgic_vctrl_base) {
+ kvm_err("Cannot ioremap VCTRL\n");
+ ret = -ENOMEM;
+ goto out_free_irq;
+ }
+
+ ret = create_hyp_io_mappings(vgic_vctrl_base,
+ vgic_vctrl_base + resource_size(&vctrl_res),
+ vctrl_res.start);
+ if (ret) {
+ kvm_err("Cannot map VCTRL into hyp\n");
+ goto out_unmap;
+ }
+
+ kvm_info("%s@%llx IRQ%d\n", vgic_node->name, vctrl_res.start, irq);
+ on_each_cpu(vgic_init_maintenance_interrupt, &irq, 1);
+
+ if (of_address_to_resource(vgic_node, 3, &vcpu_res)) {
+ kvm_err("Cannot obtain VCPU resource\n");
+ ret = -ENXIO;
+ goto out_unmap;
+ }
+ vgic_vcpu_base = vcpu_res.start;
+
+ return 0;
+
+out_unmap:
+ iounmap(vgic_vctrl_base);
+out_free_irq:
+ free_percpu_irq(irq, kvm_get_running_vcpus());
+
+ return ret;
+}
+
+int kvm_vgic_init(struct kvm *kvm)
+{
+ int ret = 0, i;
+
+ mutex_lock(&kvm->lock);
+
+ if (vgic_initialized(kvm))
+ goto out;
+
+ if (IS_VGIC_ADDR_UNDEF(kvm->arch.vgic.vgic_dist_base) ||
+ IS_VGIC_ADDR_UNDEF(kvm->arch.vgic.vgic_cpu_base)) {
+ kvm_err("Need to set vgic cpu and dist addresses first\n");
+ ret = -ENXIO;
+ goto out;
+ }
+
+ ret = kvm_phys_addr_ioremap(kvm, kvm->arch.vgic.vgic_cpu_base,
+ vgic_vcpu_base, VGIC_CPU_SIZE);
+ if (ret) {
+ kvm_err("Unable to remap VGIC CPU to VCPU\n");
+ goto out;
+ }
+
+ for (i = 32; i < VGIC_NR_IRQS; i += 4)
+ vgic_set_target_reg(kvm, 0, i);
+
+ kvm->arch.vgic.ready = true;
+out:
+ mutex_unlock(&kvm->lock);
+ return ret;
+}
+
+int kvm_vgic_create(struct kvm *kvm)
+{
+ int ret;
+
+ mutex_lock(&kvm->lock);
+
+ if (atomic_read(&kvm->online_vcpus) || kvm->arch.vgic.vctrl_base) {
+ ret = -EEXIST;
+ goto out;
+ }
+
+ spin_lock_init(&kvm->arch.vgic.lock);
+ kvm->arch.vgic.vctrl_base = vgic_vctrl_base;
+ kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
+ kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
+
+ ret = 0;
+out:
+ mutex_unlock(&kvm->lock);
+ return ret;
+}
+
static bool vgic_ioaddr_overlap(struct kvm *kvm)
{
phys_addr_t dist = kvm->arch.vgic.vgic_dist_base;
^ permalink raw reply related [flat|nested] 58+ messages in thread
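[Editor's note: the EOI branch of vgic_maintenance_handler() above clears the active bit of an EOId level interrupt, then re-pends it if the emulated line is still asserted in the distributor, or retires it otherwise. An illustrative Python model of that per-LR decision, with bitmaps simplified to sets (names hypothetical):]

```python
def handle_eoi(dist_state, active, vcpu_pending, irq):
    """Sketch of the per-LR EOI handling in vgic_maintenance_handler():
    clear the active bit, then re-pend the interrupt if the emulated
    line is still high in the distributor, or retire it otherwise."""
    active.discard(irq)
    if irq in dist_state:       # line still asserted: pick it up on next entry
        vcpu_pending.add(irq)
    else:                       # line dropped: fully retire the interrupt
        vcpu_pending.discard(irq)

# Level IRQ 40 is EOId while the device still drives the line high:
dist_state, active, pending = {40}, {40}, set()
handle_eoi(dist_state, active, pending, 40)
assert 40 not in active and 40 in pending

# Same EOI after the line was lowered: the interrupt is fully retired.
dist_state, active, pending = set(), {40}, {40}
handle_eoi(dist_state, active, pending, 40)
assert 40 not in active and 40 not in pending
```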
* [PATCH v4 12/13] ARM: KVM: vgic: reduce the number of vcpu kick
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
` (10 preceding siblings ...)
2012-11-10 15:45 ` [PATCH v4 11/13] ARM: KVM: VGIC initialisation code Christoffer Dall
@ 2012-11-10 15:45 ` Christoffer Dall
2012-12-05 10:43 ` Will Deacon
2012-12-05 11:16 ` Russell King - ARM Linux
2012-11-10 15:45 ` [PATCH v4 13/13] ARM: KVM: Add VGIC configuration option Christoffer Dall
12 siblings, 2 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:45 UTC (permalink / raw)
To: linux-arm-kernel
From: Marc Zyngier <marc.zyngier@arm.com>
If we have level interrupts already programmed to fire on a vcpu,
there is no reason to kick it after injecting a new interrupt,
as we're guaranteed to exit when the level interrupt is EOId
(VGIC_LR_EOI is set).
That exit will force a reload of the VGIC, injecting the new interrupts.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
arch/arm/include/asm/kvm_vgic.h | 10 ++++++++++
arch/arm/kvm/arm.c | 10 +++++++++-
arch/arm/kvm/vgic.c | 10 ++++++++--
3 files changed, 27 insertions(+), 3 deletions(-)
diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
index a8e7a93..7d2662c 100644
--- a/arch/arm/include/asm/kvm_vgic.h
+++ b/arch/arm/include/asm/kvm_vgic.h
@@ -215,6 +215,9 @@ struct vgic_cpu {
u32 vgic_elrsr[2]; /* Saved only */
u32 vgic_apr;
u32 vgic_lr[64]; /* A15 has only 4... */
+
+ /* Number of level-triggered interrupts in progress */
+ atomic_t irq_active_count;
#endif
};
@@ -254,6 +257,8 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
#define irqchip_in_kernel(k) (!!((k)->arch.vgic.vctrl_base))
#define vgic_initialized(k) ((k)->arch.vgic.ready)
+#define vgic_active_irq(v) (atomic_read(&(v)->arch.vgic_cpu.irq_active_count) != 0)
+
#else
static inline int kvm_vgic_hyp_init(void)
{
@@ -305,6 +310,11 @@ static inline bool vgic_initialized(struct kvm *kvm)
{
return true;
}
+
+static inline int vgic_active_irq(struct kvm_vcpu *vcpu)
+{
+ return 0;
+}
#endif
#endif
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index a633d9d..1716f12 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -94,7 +94,15 @@ int kvm_arch_hardware_enable(void *garbage)
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
{
- return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
+ if (kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE) {
+ if (vgic_active_irq(vcpu) &&
+ cmpxchg(&vcpu->mode, EXITING_GUEST_MODE, IN_GUEST_MODE) == EXITING_GUEST_MODE)
+ return 0;
+
+ return 1;
+ }
+
+ return 0;
}
void kvm_arch_hardware_disable(void *garbage)
diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
index 415ddb8..146de1d 100644
--- a/arch/arm/kvm/vgic.c
+++ b/arch/arm/kvm/vgic.c
@@ -705,8 +705,10 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
kvm_debug("LR%d piggyback for IRQ%d %x\n", lr, irq, vgic_cpu->vgic_lr[lr]);
BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
vgic_cpu->vgic_lr[lr] |= VGIC_LR_PENDING_BIT;
- if (is_level)
+ if (is_level) {
vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
+ atomic_inc(&vgic_cpu->irq_active_count);
+ }
return true;
}
@@ -718,8 +720,10 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id);
vgic_cpu->vgic_lr[lr] = MK_LR_PEND(sgi_source_id, irq);
- if (is_level)
+ if (is_level) {
vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
+ atomic_inc(&vgic_cpu->irq_active_count);
+ }
vgic_cpu->vgic_irq_lr_map[irq] = lr;
clear_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr);
@@ -1011,6 +1015,8 @@ static irqreturn_t vgic_maintenance_handler(int irq, void *data)
vgic_bitmap_set_irq_val(&dist->irq_active,
vcpu->vcpu_id, irq, 0);
+ atomic_dec(&vgic_cpu->irq_active_count);
+ smp_mb();
vgic_cpu->vgic_lr[lr] &= ~VGIC_LR_EOI;
writel_relaxed(vgic_cpu->vgic_lr[lr],
dist->vctrl_base + GICH_LR0 + (lr << 2));
^ permalink raw reply related [flat|nested] 58+ messages in thread
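[Editor's note: the kick-elision this patch adds to kvm_arch_vcpu_should_kick() can be modeled as below. This is an illustrative Python sketch of the behavior described in the commit message — skip the IPI when a level interrupt is already in flight, restoring the mode word via cmpxchg — not the kernel implementation; cmpxchg is modeled as a plain compare-and-set.]

```python
IN_GUEST_MODE, EXITING_GUEST_MODE, OUTSIDE_GUEST_MODE = range(3)

class VCPU:
    def __init__(self, mode, irq_active_count):
        self.mode = mode
        self.irq_active_count = irq_active_count

def cmpxchg(vcpu, old, new):
    """Single-threaded model of atomic compare-and-exchange on vcpu.mode."""
    cur = vcpu.mode
    if cur == old:
        vcpu.mode = new
    return cur

def should_kick(vcpu):
    """Sketch of kvm_arch_vcpu_should_kick() from this patch: a vcpu with
    a level interrupt in flight will exit on its own at EOI, so the IPI
    can be elided (flipping the mode word back to IN_GUEST_MODE)."""
    if cmpxchg(vcpu, IN_GUEST_MODE, EXITING_GUEST_MODE) == IN_GUEST_MODE:
        if (vcpu.irq_active_count > 0 and
                cmpxchg(vcpu, EXITING_GUEST_MODE,
                        IN_GUEST_MODE) == EXITING_GUEST_MODE):
            return False        # it will exit at EOI anyway
        return True             # genuinely needs a kick
    return False                # not in guest mode: nothing to interrupt

assert should_kick(VCPU(IN_GUEST_MODE, irq_active_count=0)) is True
assert should_kick(VCPU(IN_GUEST_MODE, irq_active_count=1)) is False
assert should_kick(VCPU(OUTSIDE_GUEST_MODE, irq_active_count=0)) is False
```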
* [PATCH v4 13/13] ARM: KVM: Add VGIC configuration option
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
` (11 preceding siblings ...)
2012-11-10 15:45 ` [PATCH v4 12/13] ARM: KVM: vgic: reduce the number of vcpu kick Christoffer Dall
@ 2012-11-10 15:45 ` Christoffer Dall
2012-11-10 19:52 ` Sergei Shtylyov
12 siblings, 1 reply; 58+ messages in thread
From: Christoffer Dall @ 2012-11-10 15:45 UTC (permalink / raw)
To: linux-arm-kernel
From: Marc Zyngier <marc.zyngier@arm.com>
It is now possible to select the VGIC configuration option.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
arch/arm/kvm/Kconfig | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index bfa3174..3c979ce 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -51,6 +51,13 @@ config KVM_ARM_MAX_VCPUS
large, so only choose a reasonable number that you expect to
actually use.
+config KVM_ARM_VGIC
+ bool "KVM support for Virtual GIC"
+ depends on KVM_ARM_HOST && OF
+ select HAVE_KVM_IRQCHIP
+ ---help---
+ Adds support for a hardware assisted, in-kernel GIC emulation.
+
source drivers/virtio/Kconfig
endif # VIRTUALIZATION
^ permalink raw reply related [flat|nested] 58+ messages in thread
* [PATCH v4 13/13] ARM: KVM: Add VGIC configuration option
2012-11-10 15:45 ` [PATCH v4 13/13] ARM: KVM: Add VGIC configuration option Christoffer Dall
@ 2012-11-10 19:52 ` Sergei Shtylyov
0 siblings, 0 replies; 58+ messages in thread
From: Sergei Shtylyov @ 2012-11-10 19:52 UTC (permalink / raw)
To: linux-arm-kernel
Hello.
On 10-11-2012 19:45, Christoffer Dall wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
> It is now possible to select the VGIC configuration option.
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> ---
> arch/arm/kvm/Kconfig | 7 +++++++
> 1 file changed, 7 insertions(+)
> diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
> index bfa3174..3c979ce 100644
> --- a/arch/arm/kvm/Kconfig
> +++ b/arch/arm/kvm/Kconfig
> @@ -51,6 +51,13 @@ config KVM_ARM_MAX_VCPUS
> large, so only choose a reasonable number that you expect to
> actually use.
>
> +config KVM_ARM_VGIC
> + bool "KVM support for Virtual GIC"
Indent with tabs uniformly, please.
> + depends on KVM_ARM_HOST && OF
> + select HAVE_KVM_IRQCHIP
> + ---help---
> + Adds support for a hardware assisted, in-kernel GIC emulation.
> +
WBR, Sergei
^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v4 04/13] ARM: KVM: Initial VGIC MMIO support code
2012-11-10 15:44 ` [PATCH v4 04/13] ARM: KVM: Initial VGIC MMIO support code Christoffer Dall
@ 2012-11-12 8:54 ` Dong Aisheng
2012-11-13 13:32 ` Christoffer Dall
2012-11-28 13:09 ` Will Deacon
1 sibling, 1 reply; 58+ messages in thread
From: Dong Aisheng @ 2012-11-12 8:54 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 10, 2012 at 04:44:44PM +0100, Christoffer Dall wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
>
> Wire the initial in-kernel MMIO support code for the VGIC, used
> for the distributor emulation.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> ---
> arch/arm/include/asm/kvm_vgic.h | 6 +-
> arch/arm/kvm/Makefile | 1
> arch/arm/kvm/vgic.c | 138 +++++++++++++++++++++++++++++++++++++++
> 3 files changed, 144 insertions(+), 1 deletion(-)
> create mode 100644 arch/arm/kvm/vgic.c
>
> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
> index d75540a..b444ecf 100644
> --- a/arch/arm/include/asm/kvm_vgic.h
> +++ b/arch/arm/include/asm/kvm_vgic.h
> @@ -30,7 +30,11 @@ struct kvm_vcpu;
> struct kvm_run;
> struct kvm_exit_mmio;
>
> -#ifndef CONFIG_KVM_ARM_VGIC
> +#ifdef CONFIG_KVM_ARM_VGIC
> +bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> + struct kvm_exit_mmio *mmio);
> +
> +#else
> static inline int kvm_vgic_hyp_init(void)
> {
> return 0;
> diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
> index 8a4f396..c019f02 100644
> --- a/arch/arm/kvm/Makefile
> +++ b/arch/arm/kvm/Makefile
> @@ -20,3 +20,4 @@ obj-$(CONFIG_KVM_ARM_HOST) += $(addprefix ../../../virt/kvm/, kvm_main.o coalesc
>
> obj-$(CONFIG_KVM_ARM_HOST) += arm.o guest.o mmu.o emulate.o reset.o
> obj-$(CONFIG_KVM_ARM_HOST) += coproc.o coproc_a15.o mmio.o decode.o
> +obj-$(CONFIG_KVM_ARM_VGIC) += vgic.o
> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
> new file mode 100644
> index 0000000..26ada3b
> --- /dev/null
> +++ b/arch/arm/kvm/vgic.c
> @@ -0,0 +1,138 @@
> +/*
> + * Copyright (C) 2012 ARM Ltd.
> + * Author: Marc Zyngier <marc.zyngier@arm.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> + */
> +
> +#include <linux/kvm.h>
> +#include <linux/kvm_host.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <asm/kvm_emulate.h>
> +
> +#define ACCESS_READ_VALUE (1 << 0)
> +#define ACCESS_READ_RAZ (0 << 0)
> +#define ACCESS_READ_MASK(x) ((x) & (1 << 0))
> +#define ACCESS_WRITE_IGNORED (0 << 1)
> +#define ACCESS_WRITE_SETBIT (1 << 1)
> +#define ACCESS_WRITE_CLEARBIT (2 << 1)
> +#define ACCESS_WRITE_VALUE (3 << 1)
> +#define ACCESS_WRITE_MASK(x) ((x) & (3 << 1))
> +
> +/**
> + * vgic_reg_access - access vgic register
> + * @mmio: pointer to the data describing the mmio access
> + * @reg: pointer to the virtual backing of the vgic distributor struct
Is this correct?
> + * @offset: least significant 2 bits used for word offset
> + * @mode: ACCESS_ mode (see defines above)
> + *
> + * Helper to make vgic register access easier using one of the access
> + * modes defined for vgic register access
> + * (read,raz,write-ignored,setbit,clearbit,write)
> + */
> +static void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
> + u32 offset, int mode)
> +{
> + int word_offset = offset & 3;
> + int shift = word_offset * 8;
> + u32 mask;
> + u32 regval;
> +
> + /*
> + * Any alignment fault should have been delivered to the guest
> + * directly (ARM ARM B3.12.7 "Prioritization of aborts").
> + */
> +
> + mask = (~0U) >> (word_offset * 8);
> + if (reg)
> + regval = *reg;
> + else {
> + BUG_ON(mode != (ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED));
> + regval = 0;
> + }
> +
> + if (mmio->is_write) {
> + u32 data = (*((u32 *)mmio->data) & mask) << shift;
> + switch (ACCESS_WRITE_MASK(mode)) {
> + case ACCESS_WRITE_IGNORED:
> + return;
> +
> + case ACCESS_WRITE_SETBIT:
> + regval |= data;
> + break;
> +
> + case ACCESS_WRITE_CLEARBIT:
> + regval &= ~data;
> + break;
> +
> + case ACCESS_WRITE_VALUE:
> + regval = (regval & ~(mask << shift)) | data;
> + break;
> + }
> + *reg = regval;
> + } else {
> + switch (ACCESS_READ_MASK(mode)) {
> + case ACCESS_READ_RAZ:
> + regval = 0;
> + /* fall through */
> +
> + case ACCESS_READ_VALUE:
> + *((u32 *)mmio->data) = (regval >> shift) & mask;
> + }
> + }
> +}
> +
> +/* All this should be handled by kvm_bus_io_*()... FIXME!!! */
> +struct mmio_range {
> + unsigned long base;
> + unsigned long len;
> + bool (*handle_mmio)(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
> + u32 offset);
> +};
> +
> +static const struct mmio_range vgic_ranges[] = {
> + {}
> +};
> +
> +static const
> +struct mmio_range *find_matching_range(const struct mmio_range *ranges,
> + struct kvm_exit_mmio *mmio,
> + unsigned long base)
> +{
> + const struct mmio_range *r = ranges;
> + unsigned long addr = mmio->phys_addr - base;
> +
> + while (r->len) {
> + if (addr >= r->base &&
> + (addr + mmio->len) <= (r->base + r->len))
> + return r;
> + r++;
> + }
> +
> + return NULL;
> +}
> +
> +/**
> + * vgic_handle_mmio - handle an in-kernel MMIO access
> + * @vcpu: pointer to the vcpu performing the access
> + * @mmio: pointer to the data describing the access
Can we also have @run here?
> + *
> + * returns true if the MMIO access has been performed in kernel space,
> + * and false if it needs to be emulated in user space.
> + */
> +bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exit_mmio *mmio)
> +{
> + return KVM_EXIT_MMIO;
> +}
>
Regards
Dong Aisheng
^ permalink raw reply [flat|nested] 58+ messages in thread
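[Editor's note: the write half of vgic_reg_access() quoted above dispatches on the ACCESS_WRITE_* mode bits to ignore, set, clear, or replace register bits. A minimal Python model of the full-word case (word_offset == 0, so shift == 0 and the mask is 0xffffffff), using the same mode constants as the patch:]

```python
ACCESS_READ_VALUE     = 1 << 0
ACCESS_READ_RAZ       = 0 << 0
ACCESS_WRITE_IGNORED  = 0 << 1
ACCESS_WRITE_SETBIT   = 1 << 1
ACCESS_WRITE_CLEARBIT = 2 << 1
ACCESS_WRITE_VALUE    = 3 << 1

def reg_write(regval, data, mode):
    """Model of the write switch in vgic_reg_access() for a full-word
    access: returns the new 32-bit register value."""
    wmode = mode & (3 << 1)            # ACCESS_WRITE_MASK(mode)
    if wmode == ACCESS_WRITE_IGNORED:
        return regval
    if wmode == ACCESS_WRITE_SETBIT:
        return regval | data
    if wmode == ACCESS_WRITE_CLEARBIT:
        return regval & ~data & 0xffffffff
    return data                        # ACCESS_WRITE_VALUE

assert reg_write(0xf0, 0x0f, ACCESS_WRITE_SETBIT) == 0xff
assert reg_write(0xff, 0x0f, ACCESS_WRITE_CLEARBIT) == 0xf0
assert reg_write(0xff, 0x12, ACCESS_WRITE_VALUE) == 0x12
assert reg_write(0xff, 0x12, ACCESS_WRITE_IGNORED) == 0xff
```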
* [PATCH v4 05/13] ARM: KVM: VGIC accept vcpu and dist base addresses from user space
2012-11-10 15:44 ` [PATCH v4 05/13] ARM: KVM: VGIC accept vcpu and dist base addresses from user space Christoffer Dall
@ 2012-11-12 8:56 ` Dong Aisheng
2012-11-13 13:35 ` Christoffer Dall
2012-11-28 13:11 ` Will Deacon
1 sibling, 1 reply; 58+ messages in thread
From: Dong Aisheng @ 2012-11-12 8:56 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 10, 2012 at 04:44:51PM +0100, Christoffer Dall wrote:
[...]
> +int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
> +{
> + int r = 0;
> + struct vgic_dist *vgic = &kvm->arch.vgic;
> +
> + if (addr & ~KVM_PHYS_MASK)
> + return -E2BIG;
> +
> + if (addr & ~PAGE_MASK)
> + return -EINVAL;
> +
> + mutex_lock(&kvm->lock);
> + switch (type) {
> + case KVM_VGIC_V2_ADDR_TYPE_DIST:
> + if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_dist_base))
> + return -EEXIST;
> + if (addr + VGIC_DIST_SIZE < addr)
> + return -EINVAL;
> + kvm->arch.vgic.vgic_dist_base = addr;
> + break;
> + case KVM_VGIC_V2_ADDR_TYPE_CPU:
> + if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_cpu_base))
> + return -EEXIST;
> + if (addr + VGIC_CPU_SIZE < addr)
> + return -EINVAL;
> + kvm->arch.vgic.vgic_cpu_base = addr;
> + break;
> + default:
> + r = -ENODEV;
> + }
> +
> + if (vgic_ioaddr_overlap(kvm)) {
> + kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
> + kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
Missing mutex_unlock?
> + return -EINVAL;
> + }
> +
> + mutex_unlock(&kvm->lock);
> + return r;
> +}
>
Regards
Dong Aisheng
* [PATCH v4 06/13] ARM: KVM: VGIC distributor handling
2012-11-10 15:44 ` [PATCH v4 06/13] ARM: KVM: VGIC distributor handling Christoffer Dall
@ 2012-11-12 9:29 ` Dong Aisheng
2012-11-13 13:38 ` Christoffer Dall
2012-11-28 13:21 ` Will Deacon
1 sibling, 1 reply; 58+ messages in thread
From: Dong Aisheng @ 2012-11-12 9:29 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 10, 2012 at 04:44:58PM +0100, Christoffer Dall wrote:
[...]
> @@ -141,7 +519,98 @@ struct mmio_range *find_matching_range(const struct mmio_range *ranges,
> */
> bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exit_mmio *mmio)
> {
> - return KVM_EXIT_MMIO;
> + const struct mmio_range *range;
> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> + unsigned long base = dist->vgic_dist_base;
> + bool updated_state;
> +
> + if (!irqchip_in_kernel(vcpu->kvm) ||
> + mmio->phys_addr < base ||
> + (mmio->phys_addr + mmio->len) > (base + dist->vgic_dist_size))
> + return false;
> +
> + range = find_matching_range(vgic_ranges, mmio, base);
> + if (unlikely(!range || !range->handle_mmio)) {
> + pr_warn("Unhandled access %d %08llx %d\n",
> + mmio->is_write, mmio->phys_addr, mmio->len);
> + return false;
> + }
> +
> + spin_lock(&vcpu->kvm->arch.vgic.lock);
> + updated_state = range->handle_mmio(vcpu, mmio,mmio->phys_addr - range->base - base);
Missing space after ','.
Checkpatch may fail here.
Regards
Dong Aisheng
* [PATCH v4 04/13] ARM: KVM: Initial VGIC MMIO support code
2012-11-12 8:54 ` Dong Aisheng
@ 2012-11-13 13:32 ` Christoffer Dall
0 siblings, 0 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-11-13 13:32 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Nov 12, 2012 at 3:54 AM, Dong Aisheng <b29396@freescale.com> wrote:
> On Sat, Nov 10, 2012 at 04:44:44PM +0100, Christoffer Dall wrote:
>> From: Marc Zyngier <marc.zyngier@arm.com>
>>
>> Wire the initial in-kernel MMIO support code for the VGIC, used
>> for the distributor emulation.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>> ---
>> arch/arm/include/asm/kvm_vgic.h | 6 +-
>> arch/arm/kvm/Makefile | 1
>> arch/arm/kvm/vgic.c | 138 +++++++++++++++++++++++++++++++++++++++
>> 3 files changed, 144 insertions(+), 1 deletion(-)
>> create mode 100644 arch/arm/kvm/vgic.c
>>
>> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
>> index d75540a..b444ecf 100644
>> --- a/arch/arm/include/asm/kvm_vgic.h
>> +++ b/arch/arm/include/asm/kvm_vgic.h
>> @@ -30,7 +30,11 @@ struct kvm_vcpu;
>> struct kvm_run;
>> struct kvm_exit_mmio;
>>
>> -#ifndef CONFIG_KVM_ARM_VGIC
>> +#ifdef CONFIG_KVM_ARM_VGIC
>> +bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>> + struct kvm_exit_mmio *mmio);
>> +
>> +#else
>> static inline int kvm_vgic_hyp_init(void)
>> {
>> return 0;
>> diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
>> index 8a4f396..c019f02 100644
>> --- a/arch/arm/kvm/Makefile
>> +++ b/arch/arm/kvm/Makefile
>> @@ -20,3 +20,4 @@ obj-$(CONFIG_KVM_ARM_HOST) += $(addprefix ../../../virt/kvm/, kvm_main.o coalesc
>>
>> obj-$(CONFIG_KVM_ARM_HOST) += arm.o guest.o mmu.o emulate.o reset.o
>> obj-$(CONFIG_KVM_ARM_HOST) += coproc.o coproc_a15.o mmio.o decode.o
>> +obj-$(CONFIG_KVM_ARM_VGIC) += vgic.o
>> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
>> new file mode 100644
>> index 0000000..26ada3b
>> --- /dev/null
>> +++ b/arch/arm/kvm/vgic.c
>> @@ -0,0 +1,138 @@
>> +/*
>> + * Copyright (C) 2012 ARM Ltd.
>> + * Author: Marc Zyngier <marc.zyngier@arm.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write to the Free Software
>> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
>> + */
>> +
>> +#include <linux/kvm.h>
>> +#include <linux/kvm_host.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/io.h>
>> +#include <asm/kvm_emulate.h>
>> +
>> +#define ACCESS_READ_VALUE (1 << 0)
>> +#define ACCESS_READ_RAZ (0 << 0)
>> +#define ACCESS_READ_MASK(x) ((x) & (1 << 0))
>> +#define ACCESS_WRITE_IGNORED (0 << 1)
>> +#define ACCESS_WRITE_SETBIT (1 << 1)
>> +#define ACCESS_WRITE_CLEARBIT (2 << 1)
>> +#define ACCESS_WRITE_VALUE (3 << 1)
>> +#define ACCESS_WRITE_MASK(x) ((x) & (3 << 1))
>> +
>> +/**
>> + * vgic_reg_access - access vgic register
>> + * @mmio: pointer to the data describing the mmio access
>> + * @reg: pointer to the virtual backing of the vgic distributor struct
>
> Is this correct?
>
>> + * @offset: least significant 2 bits used for word offset
>> + * @mode: ACCESS_ mode (see defines above)
>> + *
>> + * Helper to make vgic register access easier using one of the access
>> + * modes defined for vgic register access
>> + * (read,raz,write-ignored,setbit,clearbit,write)
>> + */
>> +static void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
>> + u32 offset, int mode)
>> +{
>> + int word_offset = offset & 3;
>> + int shift = word_offset * 8;
>> + u32 mask;
>> + u32 regval;
>> +
>> + /*
>> + * Any alignment fault should have been delivered to the guest
>> + * directly (ARM ARM B3.12.7 "Prioritization of aborts").
>> + */
>> +
>> + mask = (~0U) >> (word_offset * 8);
>> + if (reg)
>> + regval = *reg;
>> + else {
>> + BUG_ON(mode != (ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED));
>> + regval = 0;
>> + }
>> +
>> + if (mmio->is_write) {
>> + u32 data = (*((u32 *)mmio->data) & mask) << shift;
>> + switch (ACCESS_WRITE_MASK(mode)) {
>> + case ACCESS_WRITE_IGNORED:
>> + return;
>> +
>> + case ACCESS_WRITE_SETBIT:
>> + regval |= data;
>> + break;
>> +
>> + case ACCESS_WRITE_CLEARBIT:
>> + regval &= ~data;
>> + break;
>> +
>> + case ACCESS_WRITE_VALUE:
>> + regval = (regval & ~(mask << shift)) | data;
>> + break;
>> + }
>> + *reg = regval;
>> + } else {
>> + switch (ACCESS_READ_MASK(mode)) {
>> + case ACCESS_READ_RAZ:
>> + regval = 0;
>> + /* fall through */
>> +
>> + case ACCESS_READ_VALUE:
>> + *((u32 *)mmio->data) = (regval >> shift) & mask;
>> + }
>> + }
>> +}
>> +
>> +/* All this should be handled by kvm_bus_io_*()... FIXME!!! */
>> +struct mmio_range {
>> + unsigned long base;
>> + unsigned long len;
>> + bool (*handle_mmio)(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
>> + u32 offset);
>> +};
>> +
>> +static const struct mmio_range vgic_ranges[] = {
>> + {}
>> +};
>> +
>> +static const
>> +struct mmio_range *find_matching_range(const struct mmio_range *ranges,
>> + struct kvm_exit_mmio *mmio,
>> + unsigned long base)
>> +{
>> + const struct mmio_range *r = ranges;
>> + unsigned long addr = mmio->phys_addr - base;
>> +
>> + while (r->len) {
>> + if (addr >= r->base &&
>> + (addr + mmio->len) <= (r->base + r->len))
>> + return r;
>> + r++;
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +/**
>> + * vgic_handle_mmio - handle an in-kernel MMIO access
>> + * @vcpu: pointer to the vcpu performing the access
>> + * @mmio: pointer to the data describing the access
>
> Can we also have @run here?
>
>> + *
>> + * returns true if the MMIO access has been performed in kernel space,
>> + * and false if it needs to be emulated in user space.
>> + */
>> +bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exit_mmio *mmio)
>> +{
>> + return KVM_EXIT_MMIO;
>> +}
>>
thanks,
fixed
-Christoffer
* [PATCH v4 05/13] ARM: KVM: VGIC accept vcpu and dist base addresses from user space
2012-11-12 8:56 ` Dong Aisheng
@ 2012-11-13 13:35 ` Christoffer Dall
0 siblings, 0 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-11-13 13:35 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Nov 12, 2012 at 3:56 AM, Dong Aisheng <b29396@freescale.com> wrote:
> On Sat, Nov 10, 2012 at 04:44:51PM +0100, Christoffer Dall wrote:
> [...]
>> +int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
>> +{
>> + int r = 0;
>> + struct vgic_dist *vgic = &kvm->arch.vgic;
>> +
>> + if (addr & ~KVM_PHYS_MASK)
>> + return -E2BIG;
>> +
>> + if (addr & ~PAGE_MASK)
>> + return -EINVAL;
>> +
>> + mutex_lock(&kvm->lock);
>> + switch (type) {
>> + case KVM_VGIC_V2_ADDR_TYPE_DIST:
>> + if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_dist_base))
>> + return -EEXIST;
>> + if (addr + VGIC_DIST_SIZE < addr)
>> + return -EINVAL;
>> + kvm->arch.vgic.vgic_dist_base = addr;
>> + break;
>> + case KVM_VGIC_V2_ADDR_TYPE_CPU:
>> + if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_cpu_base))
>> + return -EEXIST;
>> + if (addr + VGIC_CPU_SIZE < addr)
>> + return -EINVAL;
>> + kvm->arch.vgic.vgic_cpu_base = addr;
>> + break;
>> + default:
>> + r = -ENODEV;
>> + }
>> +
>> + if (vgic_ioaddr_overlap(kvm)) {
>> + kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
>> + kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
>
> Missing mutex_unlock?
indeed, should be r = -EINVAL.
nice catch!
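The fixed control flow can be sketched in user space (an illustrative reconstruction only, with a pthread mutex standing in for the kernel's `kvm->lock` and plain integers for the errno values; the final kernel patch may differ). The point is that every error path taken after the lock is acquired funnels through a single unlock:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types (illustrative only). */
#define VGIC_ADDR_UNDEF (~(uint64_t)0)
#define IS_VGIC_ADDR_UNDEF(a) ((a) == VGIC_ADDR_UNDEF)
#define VGIC_DIST_SIZE 0x1000

struct kvm {
	pthread_mutex_t lock;
	uint64_t vgic_dist_base;
};

/*
 * Sketch of the corrected shape: set r and goto out instead of
 * returning with the mutex still held.
 */
static int kvm_vgic_set_dist_addr(struct kvm *kvm, uint64_t addr)
{
	int r = 0;

	pthread_mutex_lock(&kvm->lock);
	if (!IS_VGIC_ADDR_UNDEF(kvm->vgic_dist_base)) {
		r = -17; /* -EEXIST */
		goto out;
	}
	if (addr + VGIC_DIST_SIZE < addr) { /* wrap-around check */
		r = -22; /* -EINVAL */
		goto out;
	}
	kvm->vgic_dist_base = addr;
out:
	pthread_mutex_unlock(&kvm->lock);
	return r;
}
```

A second call with the base already set now fails with the lock released, which is exactly the case Dong spotted.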
* [PATCH v4 06/13] ARM: KVM: VGIC distributor handling
2012-11-12 9:29 ` Dong Aisheng
@ 2012-11-13 13:38 ` Christoffer Dall
0 siblings, 0 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-11-13 13:38 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Nov 12, 2012 at 4:29 AM, Dong Aisheng <b29396@freescale.com> wrote:
> On Sat, Nov 10, 2012 at 04:44:58PM +0100, Christoffer Dall wrote:
> [...]
>> @@ -141,7 +519,98 @@ struct mmio_range *find_matching_range(const struct mmio_range *ranges,
>> */
>> bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exit_mmio *mmio)
>> {
>> - return KVM_EXIT_MMIO;
>> + const struct mmio_range *range;
>> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> + unsigned long base = dist->vgic_dist_base;
>> + bool updated_state;
>> +
>> + if (!irqchip_in_kernel(vcpu->kvm) ||
>> + mmio->phys_addr < base ||
>> + (mmio->phys_addr + mmio->len) > (base + dist->vgic_dist_size))
>> + return false;
>> +
>> + range = find_matching_range(vgic_ranges, mmio, base);
>> + if (unlikely(!range || !range->handle_mmio)) {
>> + pr_warn("Unhandled access %d %08llx %d\n",
>> + mmio->is_write, mmio->phys_addr, mmio->len);
>> + return false;
>> + }
>> +
>> + spin_lock(&vcpu->kvm->arch.vgic.lock);
>> + updated_state = range->handle_mmio(vcpu, mmio,mmio->phys_addr - range->base - base);
> Missing space after ','.
> Checkpatch may fail here.
>
thanks,
-Christoffer
* [PATCH v4 02/13] ARM: KVM: Keep track of currently running vcpus
2012-11-10 15:44 ` [PATCH v4 02/13] ARM: KVM: Keep track of currently running vcpus Christoffer Dall
@ 2012-11-28 12:47 ` Will Deacon
2012-11-28 13:15 ` Marc Zyngier
2012-11-30 22:39 ` Christoffer Dall
0 siblings, 2 replies; 58+ messages in thread
From: Will Deacon @ 2012-11-28 12:47 UTC (permalink / raw)
To: linux-arm-kernel
Just a bunch of typos in this one :)
On Sat, Nov 10, 2012 at 03:44:30PM +0000, Christoffer Dall wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
>
> When an interrupt occurs for the guest, it is sometimes necessary
> to find out which vcpu was running at that point.
>
> Keep track of which vcpu is being tun in kvm_arch_vcpu_ioctl_run(),
run
> and allow the data to be retrived using either:
retrieved
> - kvm_arm_get_running_vcpu(): returns the vcpu running at this point
> on the current CPU. Can only be used in a non-preemptable context.
preemptible
> - kvm_arm_get_running_vcpus(): returns the per-CPU variable holding
> the the running vcpus, useable for per-CPU interrupts.
-the
usable
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> ---
> arch/arm/include/asm/kvm_host.h | 10 ++++++++++
> arch/arm/kvm/arm.c | 30 ++++++++++++++++++++++++++++++
> 2 files changed, 40 insertions(+)
>
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index e7fc249..e66cd56 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -154,4 +154,14 @@ static inline int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
> {
> return 0;
> }
> +
> +struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
> +struct kvm_vcpu __percpu **kvm_get_running_vcpus(void);
DECLARE_PER_CPU?
> +int kvm_arm_copy_coproc_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
> +unsigned long kvm_arm_num_coproc_regs(struct kvm_vcpu *vcpu);
> +struct kvm_one_reg;
> +int kvm_arm_coproc_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
> +int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
> +
> #endif /* __ARM_KVM_HOST_H__ */
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 2cdc07b..60b119a 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -53,11 +53,38 @@ static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
> static struct vfp_hard_struct __percpu *kvm_host_vfp_state;
> static unsigned long hyp_default_vectors;
>
> +/* Per-CPU variable containing the currently running vcpu. */
> +static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
> +
> /* The VMID used in the VTTBR */
> static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
> static u8 kvm_next_vmid;
> static DEFINE_SPINLOCK(kvm_vmid_lock);
>
> +static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
> +{
> + BUG_ON(preemptible());
> + __get_cpu_var(kvm_arm_running_vcpu) = vcpu;
> +}
> +
> +/**
> + * kvm_arm_get_running_vcpu - get the vcpu running on the current CPU.
> + * Must be called from non-preemptible context
> + */
> +struct kvm_vcpu *kvm_arm_get_running_vcpu(void)
> +{
> + BUG_ON(preemptible());
> + return __get_cpu_var(kvm_arm_running_vcpu);
> +}
> +
> +/**
> + * kvm_arm_get_running_vcpus - get the per-CPU array on currently running vcpus.
> + */
s/on/of/ ?
Will
* [PATCH v4 03/13] ARM: KVM: Initial VGIC infrastructure support
2012-11-10 15:44 ` [PATCH v4 03/13] ARM: KVM: Initial VGIC infrastructure support Christoffer Dall
@ 2012-11-28 12:49 ` Will Deacon
2012-11-28 13:09 ` Marc Zyngier
2012-12-01 2:19 ` Christoffer Dall
0 siblings, 2 replies; 58+ messages in thread
From: Will Deacon @ 2012-11-28 12:49 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 10, 2012 at 03:44:37PM +0000, Christoffer Dall wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
>
> Wire the basic framework code for VGIC support. Nothing to enable
> yet.
Again, not sure how useful this patch is. Might as well merge it with code
that actually does something. Couple of comments inline anyway...
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> ---
> arch/arm/include/asm/kvm_host.h | 7 ++++
> arch/arm/include/asm/kvm_vgic.h | 70 +++++++++++++++++++++++++++++++++++++++
> arch/arm/kvm/arm.c | 21 +++++++++++-
> arch/arm/kvm/interrupts.S | 4 ++
> arch/arm/kvm/mmio.c | 3 ++
> virt/kvm/kvm_main.c | 5 ++-
> 6 files changed, 107 insertions(+), 3 deletions(-)
> create mode 100644 arch/arm/include/asm/kvm_vgic.h
[...]
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 60b119a..426828a 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -183,6 +183,9 @@ int kvm_dev_ioctl_check_extension(long ext)
> {
> int r;
> switch (ext) {
> +#ifdef CONFIG_KVM_ARM_VGIC
> + case KVM_CAP_IRQCHIP:
> +#endif
> case KVM_CAP_USER_MEMORY:
> case KVM_CAP_DESTROY_MEMORY_REGION_WORKS:
> case KVM_CAP_ONE_REG:
> @@ -304,6 +307,10 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
> {
> /* Force users to call KVM_ARM_VCPU_INIT */
> vcpu->arch.target = -1;
> +
> + /* Set up VGIC */
> + kvm_vgic_vcpu_init(vcpu);
> +
> return 0;
> }
>
> @@ -363,7 +370,7 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
> */
> int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
> {
> - return !!v->arch.irq_lines;
> + return !!v->arch.irq_lines || kvm_vgic_vcpu_pending_irq(v);
> }
So interrupt injection without the in-kernel GIC updates irq_lines, but the
in-kernel GIC has its own separate data structures? Why can't the in-kernel GIC
just use irq_lines instead of irq_pending_on_cpu?
>
> int kvm_arch_vcpu_in_guest_mode(struct kvm_vcpu *v)
> @@ -633,6 +640,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>
> update_vttbr(vcpu->kvm);
>
> + kvm_vgic_sync_to_cpu(vcpu);
> +
> local_irq_disable();
>
> /*
> @@ -645,6 +654,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>
> if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
> local_irq_enable();
> + kvm_vgic_sync_from_cpu(vcpu);
> continue;
> }
For VFP, we use different terminology (sync and flush). I don't think they're
any clearer than what you have, but the consistency would be nice.
Given that both these functions are run with interrupts enabled, why doesn't
the second require a lock for updating dist->irq_pending_on_cpu? I notice
there's a random smp_mb() over there...
>
> @@ -683,6 +693,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
> * Back from guest
> *************************************************************/
>
> + kvm_vgic_sync_from_cpu(vcpu);
Likewise.
> ret = handle_exit(vcpu, run, ret);
> }
>
> @@ -965,6 +977,13 @@ static int init_hyp_mode(void)
> }
> }
>
> + /*
> + * Init HYP view of VGIC
> + */
> + err = kvm_vgic_hyp_init();
> + if (err)
> + goto out_free_mappings;
> +
> return 0;
> out_free_vfp:
> free_percpu(kvm_host_vfp_state);
[...]
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 2fb7319..665af96 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1880,12 +1880,13 @@ static long kvm_vcpu_ioctl(struct file *filp,
> if (vcpu->kvm->mm != current->mm)
> return -EIO;
>
> -#if defined(CONFIG_S390) || defined(CONFIG_PPC)
> +#if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_ARM)
> /*
> * Special cases: vcpu ioctls that are asynchronous to vcpu execution,
> * so vcpu_load() would break it.
> */
> - if (ioctl == KVM_S390_INTERRUPT || ioctl == KVM_INTERRUPT)
> + if (ioctl == KVM_S390_INTERRUPT || ioctl == KVM_INTERRUPT ||
> + ioctl == KVM_IRQ_LINE)
> return kvm_arch_vcpu_ioctl(filp, ioctl, arg);
> #endif
Separate patch?
Will
* [PATCH v4 04/13] ARM: KVM: Initial VGIC MMIO support code
2012-11-10 15:44 ` [PATCH v4 04/13] ARM: KVM: Initial VGIC MMIO support code Christoffer Dall
2012-11-12 8:54 ` Dong Aisheng
@ 2012-11-28 13:09 ` Will Deacon
2012-11-28 13:44 ` Marc Zyngier
1 sibling, 1 reply; 58+ messages in thread
From: Will Deacon @ 2012-11-28 13:09 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 10, 2012 at 03:44:44PM +0000, Christoffer Dall wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
>
> Wire the initial in-kernel MMIO support code for the VGIC, used
> for the distributor emulation.
[...]
> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
> new file mode 100644
> index 0000000..26ada3b
> --- /dev/null
> +++ b/arch/arm/kvm/vgic.c
> @@ -0,0 +1,138 @@
> +/*
> + * Copyright (C) 2012 ARM Ltd.
> + * Author: Marc Zyngier <marc.zyngier@arm.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> + */
> +
> +#include <linux/kvm.h>
> +#include <linux/kvm_host.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <asm/kvm_emulate.h>
> +
> +#define ACCESS_READ_VALUE (1 << 0)
> +#define ACCESS_READ_RAZ (0 << 0)
> +#define ACCESS_READ_MASK(x) ((x) & (1 << 0))
> +#define ACCESS_WRITE_IGNORED (0 << 1)
> +#define ACCESS_WRITE_SETBIT (1 << 1)
> +#define ACCESS_WRITE_CLEARBIT (2 << 1)
> +#define ACCESS_WRITE_VALUE (3 << 1)
> +#define ACCESS_WRITE_MASK(x) ((x) & (3 << 1))
> +
> +/**
> + * vgic_reg_access - access vgic register
> + * @mmio: pointer to the data describing the mmio access
> + * @reg: pointer to the virtual backing of the vgic distributor struct
> + * @offset: least significant 2 bits used for word offset
> + * @mode: ACCESS_ mode (see defines above)
> + *
> + * Helper to make vgic register access easier using one of the access
> + * modes defined for vgic register access
> + * (read,raz,write-ignored,setbit,clearbit,write)
> + */
> +static void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
> + u32 offset, int mode)
> +{
> + int word_offset = offset & 3;
You can get rid of this variable.
> + int shift = word_offset * 8;
shift = (offset & 3) << 3;
> + u32 mask;
> + u32 regval;
> +
> + /*
> + * Any alignment fault should have been delivered to the guest
> + * directly (ARM ARM B3.12.7 "Prioritization of aborts").
> + */
> +
> + mask = (~0U) >> (word_offset * 8);
then use shift here.
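The suggested rewrite is a pure refactor; the equivalence of the two forms (and the resulting byte-lane masks) can be checked with a few assertions in a stand-alone sketch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte-lane shift and mask for a sub-word access at 'offset',
 * written in the form Will suggests: compute shift once, reuse it
 * for the mask.
 */
static void lane(uint32_t offset, int *shift, uint32_t *mask)
{
	*shift = (offset & 3) << 3;	/* same as (offset & 3) * 8 */
	*mask  = (~0U) >> *shift;	/* reuse shift for the mask */
}
```

Only the low two bits of the offset matter, and the maximum shift is 24, so the right shift of `~0U` is always well defined.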
> + if (reg)
> + regval = *reg;
> + else {
Use braces for the if clause.
> + BUG_ON(mode != (ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED));
> + regval = 0;
> + }
> +
> + if (mmio->is_write) {
> + u32 data = (*((u32 *)mmio->data) & mask) << shift;
> + switch (ACCESS_WRITE_MASK(mode)) {
> + case ACCESS_WRITE_IGNORED:
> + return;
> +
> + case ACCESS_WRITE_SETBIT:
> + regval |= data;
> + break;
> +
> + case ACCESS_WRITE_CLEARBIT:
> + regval &= ~data;
> + break;
> +
> + case ACCESS_WRITE_VALUE:
> + regval = (regval & ~(mask << shift)) | data;
> + break;
> + }
> + *reg = regval;
> + } else {
> + switch (ACCESS_READ_MASK(mode)) {
> + case ACCESS_READ_RAZ:
> + regval = 0;
> + /* fall through */
> +
> + case ACCESS_READ_VALUE:
> + *((u32 *)mmio->data) = (regval >> shift) & mask;
> + }
> + }
> +}
It might be a good idea to have some port accessors for mmio->data otherwise
you'll likely get endianness issues creeping in.
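One possible shape for such accessors (a user-space sketch with hypothetical names; in the kernel the equivalent would be `le32_to_cpu()`/`cpu_to_le32()` conversions rather than manual byte assembly) treats the mmio data buffer as explicitly little-endian regardless of host byte order:

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit little-endian value from the mmio data buffer. */
static uint32_t mmio_data_read32(const uint8_t *data)
{
	return (uint32_t)data[0]         |
	       ((uint32_t)data[1] << 8)  |
	       ((uint32_t)data[2] << 16) |
	       ((uint32_t)data[3] << 24);
}

/* Write a 32-bit value to the mmio data buffer in little-endian order. */
static void mmio_data_write32(uint8_t *data, uint32_t val)
{
	data[0] = val & 0xff;
	data[1] = (val >> 8) & 0xff;
	data[2] = (val >> 16) & 0xff;
	data[3] = (val >> 24) & 0xff;
}
```

With accessors like these, the `*((u32 *)mmio->data)` casts in `vgic_reg_access()` would go through a single well-defined conversion point instead of relying on host byte order.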
> +/* All this should be handled by kvm_bus_io_*()... FIXME!!! */
I don't follow this comment :) Can you either make it clearer (and less
alarming!) or just drop it please?
> +struct mmio_range {
> + unsigned long base;
> + unsigned long len;
> + bool (*handle_mmio)(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
> + u32 offset);
> +};
Why not make offset a phys_addr_t?
> +static const struct mmio_range vgic_ranges[] = {
> + {}
> +};
> +
> +static const
> +struct mmio_range *find_matching_range(const struct mmio_range *ranges,
> + struct kvm_exit_mmio *mmio,
> + unsigned long base)
> +{
> + const struct mmio_range *r = ranges;
> + unsigned long addr = mmio->phys_addr - base;
Same here, I don't think we want to truncate everything to unsigned long.
> + while (r->len) {
> + if (addr >= r->base &&
> + (addr + mmio->len) <= (r->base + r->len))
> + return r;
> + r++;
> + }
Hmm, does this work correctly for adjacent mmio devices where addr sits
on the boundary (i.e. the first address of the second device)?
Will
* [PATCH v4 03/13] ARM: KVM: Initial VGIC infrastructure support
2012-11-28 12:49 ` Will Deacon
@ 2012-11-28 13:09 ` Marc Zyngier
2012-11-28 14:13 ` Will Deacon
2012-12-01 2:19 ` Christoffer Dall
1 sibling, 1 reply; 58+ messages in thread
From: Marc Zyngier @ 2012-11-28 13:09 UTC (permalink / raw)
To: linux-arm-kernel
On 28/11/12 12:49, Will Deacon wrote:
> On Sat, Nov 10, 2012 at 03:44:37PM +0000, Christoffer Dall wrote:
>> From: Marc Zyngier <marc.zyngier@arm.com>
>>
>> Wire the basic framework code for VGIC support. Nothing to enable
>> yet.
>
> Again, not sure how useful this patch is. Might as well merge it with code
> that actually does something. Couple of comments inline anyway...
>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>> ---
>> arch/arm/include/asm/kvm_host.h | 7 ++++
>> arch/arm/include/asm/kvm_vgic.h | 70 +++++++++++++++++++++++++++++++++++++++
>> arch/arm/kvm/arm.c | 21 +++++++++++-
>> arch/arm/kvm/interrupts.S | 4 ++
>> arch/arm/kvm/mmio.c | 3 ++
>> virt/kvm/kvm_main.c | 5 ++-
>> 6 files changed, 107 insertions(+), 3 deletions(-)
>> create mode 100644 arch/arm/include/asm/kvm_vgic.h
>
> [...]
>
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 60b119a..426828a 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -183,6 +183,9 @@ int kvm_dev_ioctl_check_extension(long ext)
>> {
>> int r;
>> switch (ext) {
>> +#ifdef CONFIG_KVM_ARM_VGIC
>> + case KVM_CAP_IRQCHIP:
>> +#endif
>> case KVM_CAP_USER_MEMORY:
>> case KVM_CAP_DESTROY_MEMORY_REGION_WORKS:
>> case KVM_CAP_ONE_REG:
>> @@ -304,6 +307,10 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>> {
>> /* Force users to call KVM_ARM_VCPU_INIT */
>> vcpu->arch.target = -1;
>> +
>> + /* Set up VGIC */
>> + kvm_vgic_vcpu_init(vcpu);
>> +
>> return 0;
>> }
>>
>> @@ -363,7 +370,7 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
>> */
>> int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
>> {
>> - return !!v->arch.irq_lines;
>> + return !!v->arch.irq_lines || kvm_vgic_vcpu_pending_irq(v);
>> }
>
> So interrupt injection without the in-kernel GIC updates irq_lines, but the
> in-kernel GIC has its own separate data structures? Why can't the in-kernel GIC
> just use irq_lines instead of irq_pending_on_cpu?
They serve very different purposes:
- irq_lines directly controls the IRQ and FIQ lines (it is or-ed into
the HCR register before entering the guest)
- irq_pending_on_cpu deals with the CPU interface, and only that. Plus,
it is a kernel only thing. What triggers the interrupt on the guest is
the presence of list registers with a pending state.
You signal interrupts one way or the other.
>
>>
>> int kvm_arch_vcpu_in_guest_mode(struct kvm_vcpu *v)
>> @@ -633,6 +640,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>
>> update_vttbr(vcpu->kvm);
>>
>> + kvm_vgic_sync_to_cpu(vcpu);
>> +
>> local_irq_disable();
>>
>> /*
>> @@ -645,6 +654,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>
>> if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
>> local_irq_enable();
>> + kvm_vgic_sync_from_cpu(vcpu);
>> continue;
>> }
>
> For VFP, we use different terminology (sync and flush). I don't think they're
> any clearer than what you have, but the consistency would be nice.
Which one maps to which?
> Given that both these functions are run with interrupts enabled, why doesn't
> the second require a lock for updating dist->irq_pending_on_cpu? I notice
> there's a random smp_mb() over there...
Updating *only* irq_pending_on_cpu doesn't require the lock (set_bit()
should be safe, and I think the smp_mb() is a leftover of some debugging
hack). kvm_vgic_to_cpu() does a lot more (it picks interrupts from the
distributor, hence requires the lock to be taken).
>>
>> @@ -683,6 +693,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> * Back from guest
>> *************************************************************/
>>
>> + kvm_vgic_sync_from_cpu(vcpu);
>
> Likewise.
>
>> ret = handle_exit(vcpu, run, ret);
>> }
>>
>> @@ -965,6 +977,13 @@ static int init_hyp_mode(void)
>> }
>> }
>>
>> + /*
>> + * Init HYP view of VGIC
>> + */
>> + err = kvm_vgic_hyp_init();
>> + if (err)
>> + goto out_free_mappings;
>> +
>> return 0;
>> out_free_vfp:
>> free_percpu(kvm_host_vfp_state);
>
> [...]
>
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 2fb7319..665af96 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -1880,12 +1880,13 @@ static long kvm_vcpu_ioctl(struct file *filp,
>> if (vcpu->kvm->mm != current->mm)
>> return -EIO;
>>
>> -#if defined(CONFIG_S390) || defined(CONFIG_PPC)
>> +#if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_ARM)
>> /*
>> * Special cases: vcpu ioctls that are asynchronous to vcpu execution,
>> * so vcpu_load() would break it.
>> */
>> - if (ioctl == KVM_S390_INTERRUPT || ioctl == KVM_INTERRUPT)
>> + if (ioctl == KVM_S390_INTERRUPT || ioctl == KVM_INTERRUPT ||
>> + ioctl == KVM_IRQ_LINE)
>> return kvm_arch_vcpu_ioctl(filp, ioctl, arg);
>> #endif
>
> Separate patch?
Probably, yes.
M.
--
Jazz is not dead. It just smells funny...
* [PATCH v4 05/13] ARM: KVM: VGIC accept vcpu and dist base addresses from user space
2012-11-10 15:44 ` [PATCH v4 05/13] ARM: KVM: VGIC accept vcpu and dist base addresses from user space Christoffer Dall
2012-11-12 8:56 ` Dong Aisheng
@ 2012-11-28 13:11 ` Will Deacon
2012-11-28 13:22 ` [kvmarm] " Marc Zyngier
2012-12-01 2:52 ` Christoffer Dall
1 sibling, 2 replies; 58+ messages in thread
From: Will Deacon @ 2012-11-28 13:11 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 10, 2012 at 03:44:51PM +0000, Christoffer Dall wrote:
> User space defines the model to emulate to a guest and should therefore
> decide which addresses are used for both the virtual CPU interface
> directly mapped in the guest physical address space and for the emulated
> distributor interface, which is mapped in software by the in-kernel VGIC
> support.
>
> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> ---
> arch/arm/include/asm/kvm_mmu.h | 2 +
> arch/arm/include/asm/kvm_vgic.h | 9 ++++++
> arch/arm/kvm/arm.c | 16 ++++++++++
> arch/arm/kvm/vgic.c | 61 +++++++++++++++++++++++++++++++++++++++
> 4 files changed, 87 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 9bd0508..0800531 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -26,6 +26,8 @@
> * To save a bit of memory and to avoid alignment issues we assume 39-bit IPA
> * for now, but remember that the level-1 table must be aligned to its size.
> */
> +#define KVM_PHYS_SHIFT (38)
Seems a bit small...
> +#define KVM_PHYS_MASK ((1ULL << KVM_PHYS_SHIFT) - 1)
> #define PTRS_PER_PGD2 512
> #define PGD2_ORDER get_order(PTRS_PER_PGD2 * sizeof(pgd_t))
>
> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
> index b444ecf..9ca8d21 100644
> --- a/arch/arm/include/asm/kvm_vgic.h
> +++ b/arch/arm/include/asm/kvm_vgic.h
> @@ -20,6 +20,9 @@
> #define __ASM_ARM_KVM_VGIC_H
>
> struct vgic_dist {
> + /* Distributor and vcpu interface mapping in the guest */
> + phys_addr_t vgic_dist_base;
> + phys_addr_t vgic_cpu_base;
> };
>
> struct vgic_cpu {
> @@ -31,6 +34,7 @@ struct kvm_run;
> struct kvm_exit_mmio;
>
> #ifdef CONFIG_KVM_ARM_VGIC
> +int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
> bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> struct kvm_exit_mmio *mmio);
>
> @@ -40,6 +44,11 @@ static inline int kvm_vgic_hyp_init(void)
> return 0;
> }
>
> +static inline int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
> +{
> + return 0;
> +}
> +
> static inline int kvm_vgic_init(struct kvm *kvm)
> {
> return 0;
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 426828a..3ac1aab 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -61,6 +61,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
> static u8 kvm_next_vmid;
> static DEFINE_SPINLOCK(kvm_vmid_lock);
>
> +static bool vgic_present;
> +
> static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
> {
> BUG_ON(preemptible());
> @@ -825,7 +827,19 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
> static int kvm_vm_ioctl_set_device_address(struct kvm *kvm,
> struct kvm_device_address *dev_addr)
> {
> - return -ENODEV;
> + unsigned long dev_id, type;
> +
> + dev_id = (dev_addr->id & KVM_DEVICE_ID_MASK) >> KVM_DEVICE_ID_SHIFT;
> + type = (dev_addr->id & KVM_DEVICE_TYPE_MASK) >> KVM_DEVICE_TYPE_SHIFT;
> +
> + switch (dev_id) {
> + case KVM_ARM_DEVICE_VGIC_V2:
> + if (!vgic_present)
> + return -ENXIO;
> + return kvm_vgic_set_addr(kvm, type, dev_addr->addr);
> + default:
> + return -ENODEV;
> + }
> }
>
> long kvm_arch_vm_ioctl(struct file *filp,
> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
> index 26ada3b..f85b275 100644
> --- a/arch/arm/kvm/vgic.c
> +++ b/arch/arm/kvm/vgic.c
> @@ -22,6 +22,13 @@
> #include <linux/io.h>
> #include <asm/kvm_emulate.h>
>
> +#define VGIC_ADDR_UNDEF (-1)
> +#define IS_VGIC_ADDR_UNDEF(_x) ((_x) == (typeof(_x))VGIC_ADDR_UNDEF)
> +
> +#define VGIC_DIST_SIZE 0x1000
> +#define VGIC_CPU_SIZE 0x2000
These defines might be useful to userspace so that they don't request the
distributor and the cpu interface to be placed too close together (been there,
done that :).
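A userspace sketch of the check Will is describing, assuming the two size
constants do get exported (the values are copied from the patch above; the
helper name is purely illustrative):

```c
#include <stdbool.h>
#include <stdint.h>

/* Values from the patch; the suggestion is to export them (e.g. via a
 * uapi header) so userspace can sanity-check its layout before calling
 * KVM_SET_DEVICE_ADDRESS. */
#define VGIC_DIST_SIZE 0x1000
#define VGIC_CPU_SIZE  0x2000

/* Hypothetical helper: true if the two regions don't collide. */
static bool vgic_bases_ok(uint64_t dist_base, uint64_t cpu_base)
{
        bool dist_hits_cpu = dist_base <= cpu_base &&
                             dist_base + VGIC_DIST_SIZE > cpu_base;
        bool cpu_hits_dist = cpu_base <= dist_base &&
                             cpu_base + VGIC_CPU_SIZE > dist_base;
        return !(dist_hits_cpu || cpu_hits_dist);
}
```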
> +
> +
> #define ACCESS_READ_VALUE (1 << 0)
> #define ACCESS_READ_RAZ (0 << 0)
> #define ACCESS_READ_MASK(x) ((x) & (1 << 0))
> @@ -136,3 +143,57 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exi
> {
> return KVM_EXIT_MMIO;
> }
> +
> +static bool vgic_ioaddr_overlap(struct kvm *kvm)
> +{
> + phys_addr_t dist = kvm->arch.vgic.vgic_dist_base;
> + phys_addr_t cpu = kvm->arch.vgic.vgic_cpu_base;
> +
> + if (IS_VGIC_ADDR_UNDEF(dist) || IS_VGIC_ADDR_UNDEF(cpu))
> + return false;
> + if ((dist <= cpu && dist + VGIC_DIST_SIZE > cpu) ||
> + (cpu <= dist && cpu + VGIC_CPU_SIZE > dist))
> + return true;
> + return false;
Just return the predicate that you're testing.
> +}
> +
> +int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
> +{
> + int r = 0;
> + struct vgic_dist *vgic = &kvm->arch.vgic;
> +
> + if (addr & ~KVM_PHYS_MASK)
> + return -E2BIG;
> +
> + if (addr & ~PAGE_MASK)
> + return -EINVAL;
> +
> + mutex_lock(&kvm->lock);
> + switch (type) {
> + case KVM_VGIC_V2_ADDR_TYPE_DIST:
> + if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_dist_base))
> + return -EEXIST;
> + if (addr + VGIC_DIST_SIZE < addr)
> + return -EINVAL;
I think somebody else pointed out the missing mutex_unlocks on the failure
paths.
> + kvm->arch.vgic.vgic_dist_base = addr;
> + break;
> + case KVM_VGIC_V2_ADDR_TYPE_CPU:
> + if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_cpu_base))
> + return -EEXIST;
> + if (addr + VGIC_CPU_SIZE < addr)
> + return -EINVAL;
> + kvm->arch.vgic.vgic_cpu_base = addr;
> + break;
> + default:
> + r = -ENODEV;
> + }
> +
> + if (vgic_ioaddr_overlap(kvm)) {
> + kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
> + kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
> + return -EINVAL;
Perhaps we could put all the address checking in one place, so that the wrapping
round zero checks and the overlap checks can be in the same function?
> + }
> +
> + mutex_unlock(&kvm->lock);
> + return r;
> +}
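Putting both suggestions together, returning the predicate directly and
keeping the wrap-around and overlap checks in one helper, could look
roughly like this (a sketch against the structures in this patch, not the
committed code; `uint64_t` stands in for phys_addr_t and the error value
is spelled out since this compiles outside the kernel):

```c
#include <stdint.h>

#define VGIC_DIST_SIZE  0x1000
#define VGIC_CPU_SIZE   0x2000
#define VGIC_ADDR_UNDEF (~(uint64_t)0)

/* One place for all the address sanity checking: wrap-around of each
 * region plus the mutual overlap test. Returns 0 or -EINVAL (-22). */
static int vgic_ioaddr_check(uint64_t dist, uint64_t cpu)
{
        /* Wrapping round zero? */
        if (dist != VGIC_ADDR_UNDEF && dist + VGIC_DIST_SIZE < dist)
                return -22;
        if (cpu != VGIC_ADDR_UNDEF && cpu + VGIC_CPU_SIZE < cpu)
                return -22;

        /* The overlap test only makes sense once both are set... */
        if (dist == VGIC_ADDR_UNDEF || cpu == VGIC_ADDR_UNDEF)
                return 0;

        /* ...and then we can just return the predicate. */
        return ((dist <= cpu && dist + VGIC_DIST_SIZE > cpu) ||
                (cpu <= dist && cpu + VGIC_CPU_SIZE > dist)) ? -22 : 0;
}
```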
Will
* [PATCH v4 02/13] ARM: KVM: Keep track of currently running vcpus
2012-11-28 12:47 ` Will Deacon
@ 2012-11-28 13:15 ` Marc Zyngier
2012-11-30 22:39 ` Christoffer Dall
1 sibling, 0 replies; 58+ messages in thread
From: Marc Zyngier @ 2012-11-28 13:15 UTC (permalink / raw)
To: linux-arm-kernel
On 28/11/12 12:47, Will Deacon wrote:
> Just a bunch of typos in this one :)
Typos? me? ;-)
>
> On Sat, Nov 10, 2012 at 03:44:30PM +0000, Christoffer Dall wrote:
>> From: Marc Zyngier <marc.zyngier@arm.com>
>>
>> When an interrupt occurs for the guest, it is sometimes necessary
>> to find out which vcpu was running at that point.
>>
>> Keep track of which vcpu is being tun in kvm_arch_vcpu_ioctl_run(),
>
> run
>
>> and allow the data to be retrived using either:
>
> retrieved
>
>> - kvm_arm_get_running_vcpu(): returns the vcpu running at this point
>> on the current CPU. Can only be used in a non-preemptable context.
>
> preemptible
>
>> - kvm_arm_get_running_vcpus(): returns the per-CPU variable holding
>> the the running vcpus, useable for per-CPU interrupts.
>
> -the
> usable
>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>> ---
>> arch/arm/include/asm/kvm_host.h | 10 ++++++++++
>> arch/arm/kvm/arm.c | 30 ++++++++++++++++++++++++++++++
>> 2 files changed, 40 insertions(+)
>>
>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> index e7fc249..e66cd56 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -154,4 +154,14 @@ static inline int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
>> {
>> return 0;
>> }
>> +
>> +struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
>> +struct kvm_vcpu __percpu **kvm_get_running_vcpus(void);
>
> DECLARE_PER_CPU?
Ah, nice one. I didn't even know it existed!
>> +int kvm_arm_copy_coproc_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
>> +unsigned long kvm_arm_num_coproc_regs(struct kvm_vcpu *vcpu);
>> +struct kvm_one_reg;
>> +int kvm_arm_coproc_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
>> +int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
>> +
>> #endif /* __ARM_KVM_HOST_H__ */
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 2cdc07b..60b119a 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -53,11 +53,38 @@ static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
>> static struct vfp_hard_struct __percpu *kvm_host_vfp_state;
>> static unsigned long hyp_default_vectors;
>>
>> +/* Per-CPU variable containing the currently running vcpu. */
>> +static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
>> +
>> /* The VMID used in the VTTBR */
>> static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
>> static u8 kvm_next_vmid;
>> static DEFINE_SPINLOCK(kvm_vmid_lock);
>>
>> +static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
>> +{
>> + BUG_ON(preemptible());
>> + __get_cpu_var(kvm_arm_running_vcpu) = vcpu;
>> +}
>> +
>> +/**
>> + * kvm_arm_get_running_vcpu - get the vcpu running on the current CPU.
>> + * Must be called from non-preemptible context
>> + */
>> +struct kvm_vcpu *kvm_arm_get_running_vcpu(void)
>> +{
>> + BUG_ON(preemptible());
>> + return __get_cpu_var(kvm_arm_running_vcpu);
>> +}
>> +
>> +/**
>> + * kvm_arm_get_running_vcpus - get the per-CPU array on currently running vcpus.
>> + */
>
> s/on/of/ ?
Indeed.
Thanks,
M.
--
Jazz is not dead. It just smells funny...
* [PATCH v4 06/13] ARM: KVM: VGIC distributor handling
2012-11-10 15:44 ` [PATCH v4 06/13] ARM: KVM: VGIC distributor handling Christoffer Dall
2012-11-12 9:29 ` Dong Aisheng
@ 2012-11-28 13:21 ` Will Deacon
2012-11-28 14:35 ` Marc Zyngier
1 sibling, 1 reply; 58+ messages in thread
From: Will Deacon @ 2012-11-28 13:21 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 10, 2012 at 03:44:58PM +0000, Christoffer Dall wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
>
> Add the GIC distributor emulation code. A number of the GIC features
> are simply ignored as they are not required to boot a Linux guest.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> ---
> arch/arm/include/asm/kvm_vgic.h | 167 ++++++++++++++
> arch/arm/kvm/vgic.c | 471 +++++++++++++++++++++++++++++++++++++++
> 2 files changed, 637 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
> index 9ca8d21..9e60b1d 100644
> --- a/arch/arm/include/asm/kvm_vgic.h
> +++ b/arch/arm/include/asm/kvm_vgic.h
> @@ -19,10 +19,177 @@
> #ifndef __ASM_ARM_KVM_VGIC_H
> #define __ASM_ARM_KVM_VGIC_H
>
> +#include <linux/kernel.h>
> +#include <linux/kvm.h>
> +#include <linux/kvm_host.h>
> +#include <linux/irqreturn.h>
> +#include <linux/spinlock.h>
> +#include <linux/types.h>
> +
> +#define VGIC_NR_IRQS 128
#define VGIC_NR_PRIVATE_IRQS 32?
> +#define VGIC_NR_SHARED_IRQS (VGIC_NR_IRQS - 32)
then subtract it here
> +#define VGIC_MAX_CPUS NR_CPUS
We already have KVM_MAX_VCPUS, why do we need another?
> +
> +/* Sanity checks... */
> +#if (VGIC_MAX_CPUS > 8)
> +#error Invalid number of CPU interfaces
> +#endif
> +
> +#if (VGIC_NR_IRQS & 31)
> +#error "VGIC_NR_IRQS must be a multiple of 32"
> +#endif
> +
> +#if (VGIC_NR_IRQS > 1024)
> +#error "VGIC_NR_IRQS must be <= 1024"
> +#endif
Maybe put each check directly below the #define being tested, to make it
super-obvious to people thinking of changing the constants?
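Something along these lines, perhaps, also folding in the
VGIC_NR_PRIVATE_IRQS idea from above (the exact names and wording of the
#error messages are just suggestions):

```c
#define VGIC_NR_PRIVATE_IRQS    32      /* 16 SGIs + 16 PPIs */
#if (VGIC_NR_PRIVATE_IRQS != 32)
#error "The GIC architecture mandates 32 private interrupts per CPU"
#endif

#define VGIC_NR_IRQS            128
#if (VGIC_NR_IRQS & 31)
#error "VGIC_NR_IRQS must be a multiple of 32"
#endif
#if (VGIC_NR_IRQS > 1024)
#error "VGIC_NR_IRQS must be <= 1024"
#endif

#define VGIC_NR_SHARED_IRQS     (VGIC_NR_IRQS - VGIC_NR_PRIVATE_IRQS)
```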
> +/*
> + * The GIC distributor registers describing interrupts have two parts:
> + * - 32 per-CPU interrupts (SGI + PPI)
> + * - a bunch of shared interrups (SPI)
interrupts
> + */
> +struct vgic_bitmap {
> + union {
> + u32 reg[1];
> + unsigned long reg_ul[0];
> + } percpu[VGIC_MAX_CPUS];
> + union {
> + u32 reg[VGIC_NR_SHARED_IRQS / 32];
> + unsigned long reg_ul[0];
> + } shared;
> +};
Whoa, this is nasty!
Firstly, let's replace the `32' with BITS_PER_BYTE * sizeof(u32) for fun. Secondly, can
we make the reg_ul arrays sized using the BITS_TO_LONGS macro?
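As a sketch of both points (BITS_PER_BYTE * sizeof(u32) for the magic 32,
BITS_TO_LONGS for the unsigned long view), with userspace stand-ins for
the kernel macros so the snippet stands alone:

```c
#include <stdint.h>

typedef uint32_t u32;

#define VGIC_NR_IRQS            128
#define VGIC_NR_PRIVATE_IRQS    32
#define VGIC_NR_SHARED_IRQS     (VGIC_NR_IRQS - VGIC_NR_PRIVATE_IRQS)
#define VGIC_MAX_CPUS           8

/* Userspace stand-ins for the kernel's macros. */
#define BITS_PER_BYTE           8
#define BITS_PER_U32            (BITS_PER_BYTE * sizeof(u32))
#define BITS_TO_LONGS(n)        (((n) + BITS_PER_BYTE * sizeof(long) - 1) / \
                                 (BITS_PER_BYTE * sizeof(long)))

/* Note: on 32-bit ARM sizeof(long) == sizeof(u32), so the two views have
 * the same size; on a 64-bit host BITS_TO_LONGS(32) rounds the per-cpu
 * view up to 8 bytes, which is presumably why the patch used reg_ul[0]. */
struct vgic_bitmap {
        union {
                u32 reg[VGIC_NR_PRIVATE_IRQS / BITS_PER_U32];
                unsigned long reg_ul[BITS_TO_LONGS(VGIC_NR_PRIVATE_IRQS)];
        } percpu[VGIC_MAX_CPUS];
        union {
                u32 reg[VGIC_NR_SHARED_IRQS / BITS_PER_U32];
                unsigned long reg_ul[BITS_TO_LONGS(VGIC_NR_SHARED_IRQS)];
        } shared;
};
```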
> +
> +static inline u32 *vgic_bitmap_get_reg(struct vgic_bitmap *x,
> + int cpuid, u32 offset)
> +{
> + offset >>= 2;
> + BUG_ON(offset > (VGIC_NR_IRQS / 32));
Hmm, where is offset sanity-checked before here? Do you just rely on all
trapped accesses being valid?
> + if (!offset)
> + return x->percpu[cpuid].reg;
> + else
> + return x->shared.reg + offset - 1;
An alternative to this would be to have a single array, with the per-cpu
interrupts all laid out at the start and a macro to convert an offset to
an index. Might make the code more readable and the struct definition more
concise.
> +}
> +
> +static inline int vgic_bitmap_get_irq_val(struct vgic_bitmap *x,
> + int cpuid, int irq)
> +{
> + if (irq < 32)
VGIC_NR_PRIVATE_IRQS (unless you go with the suggestion above)
> + return test_bit(irq, x->percpu[cpuid].reg_ul);
> +
> + return test_bit(irq - 32, x->shared.reg_ul);
> +}
> +
> +static inline void vgic_bitmap_set_irq_val(struct vgic_bitmap *x,
> + int cpuid, int irq, int val)
> +{
> + unsigned long *reg;
> +
> + if (irq < 32)
> + reg = x->percpu[cpuid].reg_ul;
> + else {
> + reg = x->shared.reg_ul;
> + irq -= 32;
> + }
Likewise.
> +
> + if (val)
> + set_bit(irq, reg);
> + else
> + clear_bit(irq, reg);
> +}
> +
> +static inline unsigned long *vgic_bitmap_get_cpu_map(struct vgic_bitmap *x,
> + int cpuid)
> +{
> + if (unlikely(cpuid >= VGIC_MAX_CPUS))
> + return NULL;
> + return x->percpu[cpuid].reg_ul;
> +}
> +
> +static inline unsigned long *vgic_bitmap_get_shared_map(struct vgic_bitmap *x)
> +{
> + return x->shared.reg_ul;
> +}
> +
> +struct vgic_bytemap {
> + union {
> + u32 reg[8];
> + unsigned long reg_ul[0];
> + } percpu[VGIC_MAX_CPUS];
> + union {
> + u32 reg[VGIC_NR_SHARED_IRQS / 4];
> + unsigned long reg_ul[0];
> + } shared;
> +};
Argh, it's another one! :)
> +
> +static inline u32 *vgic_bytemap_get_reg(struct vgic_bytemap *x,
> + int cpuid, u32 offset)
> +{
> + offset >>= 2;
> + BUG_ON(offset > (VGIC_NR_IRQS / 4));
> + if (offset < 4)
> + return x->percpu[cpuid].reg + offset;
> + else
> + return x->shared.reg + offset - 8;
> +}
> +
> +static inline int vgic_bytemap_get_irq_val(struct vgic_bytemap *x,
> + int cpuid, int irq)
> +{
> + u32 *reg, shift;
> + shift = (irq & 3) * 8;
> + reg = vgic_bytemap_get_reg(x, cpuid, irq);
> + return (*reg >> shift) & 0xff;
> +}
> +
> +static inline void vgic_bytemap_set_irq_val(struct vgic_bytemap *x,
> + int cpuid, int irq, int val)
> +{
> + u32 *reg, shift;
> + shift = (irq & 3) * 8;
> + reg = vgic_bytemap_get_reg(x, cpuid, irq);
> + *reg &= ~(0xff << shift);
> + *reg |= (val & 0xff) << shift;
> +}
> +
> struct vgic_dist {
> +#ifdef CONFIG_KVM_ARM_VGIC
> + spinlock_t lock;
> +
> + /* Virtual control interface mapping */
> + void __iomem *vctrl_base;
> +
> /* Distributor and vcpu interface mapping in the guest */
> phys_addr_t vgic_dist_base;
> phys_addr_t vgic_cpu_base;
> +
> + /* Distributor enabled */
> + u32 enabled;
> +
> + /* Interrupt enabled (one bit per IRQ) */
> + struct vgic_bitmap irq_enabled;
> +
> + /* Interrupt 'pin' level */
> + struct vgic_bitmap irq_state;
> +
> + /* Level-triggered interrupt in progress */
> + struct vgic_bitmap irq_active;
> +
> + /* Interrupt priority. Not used yet. */
> + struct vgic_bytemap irq_priority;
What would the bitmap component of the bytemap represent for priorities?
> +
> + /* Level/edge triggered */
> + struct vgic_bitmap irq_cfg;
> +
> + /* Source CPU per SGI and target CPU */
> + u8 irq_sgi_sources[VGIC_MAX_CPUS][16];
Ah, I guess my VGIC_NR_PRIVATE_IRQS suggestion should be further subdivided...
> + /* Target CPU for each IRQ */
> + u8 irq_spi_cpu[VGIC_NR_SHARED_IRQS];
> + struct vgic_bitmap irq_spi_target[VGIC_MAX_CPUS];
> +
> + /* Bitmap indicating which CPU has something pending */
> + unsigned long irq_pending_on_cpu;
> +#endif
> };
>
> struct vgic_cpu {
> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
> index f85b275..82feee8 100644
> --- a/arch/arm/kvm/vgic.c
> +++ b/arch/arm/kvm/vgic.c
> @@ -22,6 +22,42 @@
> #include <linux/io.h>
> #include <asm/kvm_emulate.h>
>
> +/*
> + * How the whole thing works (courtesy of Christoffer Dall):
> + *
> + * - At any time, the dist->irq_pending_on_cpu is the oracle that knows if
> + * something is pending
> + * - VGIC pending interrupts are stored on the vgic.irq_state vgic
> + * bitmap (this bitmap is updated by both user land ioctls and guest
> + * mmio ops) and indicate the 'wire' state.
> + * - Every time the bitmap changes, the irq_pending_on_cpu oracle is
> + * recalculated
> + * - To calculate the oracle, we need info for each cpu from
> + * compute_pending_for_cpu, which considers:
> + * - PPI: dist->irq_state & dist->irq_enable
> + * - SPI: dist->irq_state & dist->irq_enable & dist->irq_spi_target
> + * - irq_spi_target is a 'formatted' version of the GICD_ICFGR
> + * registers, stored on each vcpu. We only keep one bit of
> + * information per interrupt, making sure that only one vcpu can
> + * accept the interrupt.
> + * - The same is true when injecting an interrupt, except that we only
> + * consider a single interrupt at a time. The irq_spi_cpu array
> + * contains the target CPU for each SPI.
> + *
> + * The handling of level interrupts adds some extra complexity. We
> + * need to track when the interrupt has been EOIed, so we can sample
> + * the 'line' again. This is achieved as such:
> + *
> + * - When a level interrupt is moved onto a vcpu, the corresponding
> + * bit in irq_active is set. As long as this bit is set, the line
> + * will be ignored for further interrupts. The interrupt is injected
> + * into the vcpu with the VGIC_LR_EOI bit set (generate a
> + * maintenance interrupt on EOI).
> + * - When the interrupt is EOIed, the maintenance interrupt fires,
> + * and clears the corresponding bit in irq_active. This allow the
> + * interrupt line to be sampled again.
> + */
> +
> #define VGIC_ADDR_UNDEF (-1)
> #define IS_VGIC_ADDR_UNDEF(_x) ((_x) == (typeof(_x))VGIC_ADDR_UNDEF)
>
> @@ -38,6 +74,14 @@
> #define ACCESS_WRITE_VALUE (3 << 1)
> #define ACCESS_WRITE_MASK(x) ((x) & (3 << 1))
>
> +static void vgic_update_state(struct kvm *kvm);
> +static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
> +
> +static inline int vgic_irq_is_edge(struct vgic_dist *dist, int irq)
> +{
> + return vgic_bitmap_get_irq_val(&dist->irq_cfg, 0, irq);
> +}
so vgic_bitmap_get_irq_val returns 0 for level and anything else for edge?
Maybe an enum or something could make this clearer? Also, why not take a vcpu
or cpuid parameter to pass through, rather than assuming 0?
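An enum along these lines, perhaps (names are illustrative only, not
anything from the patch):

```c
/* 0 in the cfg bitmap means level-triggered, non-zero means edge; an enum
 * makes call sites read naturally. */
enum vgic_trigger {
        VGIC_CFG_LEVEL = 0,
        VGIC_CFG_EDGE  = 1,
};

static enum vgic_trigger vgic_irq_trigger(int cfg_bit)
{
        return cfg_bit ? VGIC_CFG_EDGE : VGIC_CFG_LEVEL;
}
```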
> +
> /**
> * vgic_reg_access - access vgic register
> * @mmio: pointer to the data describing the mmio access
> @@ -101,6 +145,280 @@ static void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
> }
> }
>
> +static bool handle_mmio_misc(struct kvm_vcpu *vcpu,
> + struct kvm_exit_mmio *mmio, u32 offset)
> +{
> + u32 reg;
> + u32 u32off = offset & 3;
u32off? Bitten by a regex perhaps?
> +
> + switch (offset & ~3) {
> + case 0: /* CTLR */
> + reg = vcpu->kvm->arch.vgic.enabled;
> + vgic_reg_access(mmio, &reg, u32off,
> + ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
> + if (mmio->is_write) {
> + vcpu->kvm->arch.vgic.enabled = reg & 1;
> + vgic_update_state(vcpu->kvm);
> + return true;
> + }
> + break;
> +
> + case 4: /* TYPER */
> + reg = (atomic_read(&vcpu->kvm->online_vcpus) - 1) << 5;
> + reg |= (VGIC_NR_IRQS >> 5) - 1;
> + vgic_reg_access(mmio, &reg, u32off,
> + ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> + break;
> +
> + case 8: /* IIDR */
> + reg = 0x4B00043B;
> + vgic_reg_access(mmio, &reg, u32off,
> + ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> + break;
> + }
> +
> + return false;
> +}
> +
> +static bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu,
> + struct kvm_exit_mmio *mmio, u32 offset)
> +{
> + vgic_reg_access(mmio, NULL, offset,
> + ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
> + return false;
> +}
> +
> +static bool handle_mmio_set_enable_reg(struct kvm_vcpu *vcpu,
> + struct kvm_exit_mmio *mmio, u32 offset)
> +{
> + u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_enabled,
> + vcpu->vcpu_id, offset);
> + vgic_reg_access(mmio, reg, offset,
> + ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
> + if (mmio->is_write) {
> + vgic_update_state(vcpu->kvm);
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static bool handle_mmio_clear_enable_reg(struct kvm_vcpu *vcpu,
> + struct kvm_exit_mmio *mmio, u32 offset)
> +{
> + u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_enabled,
> + vcpu->vcpu_id, offset);
> + vgic_reg_access(mmio, reg, offset,
> + ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
> + if (mmio->is_write) {
> + if (offset < 4) /* Force SGI enabled */
> + *reg |= 0xffff;
> + vgic_update_state(vcpu->kvm);
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static bool handle_mmio_set_pending_reg(struct kvm_vcpu *vcpu,
> + struct kvm_exit_mmio *mmio, u32 offset)
> +{
> + u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_state,
> + vcpu->vcpu_id, offset);
> + vgic_reg_access(mmio, reg, offset,
> + ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
> + if (mmio->is_write) {
> + vgic_update_state(vcpu->kvm);
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
> + struct kvm_exit_mmio *mmio, u32 offset)
> +{
> + u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_state,
> + vcpu->vcpu_id, offset);
> + vgic_reg_access(mmio, reg, offset,
> + ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
> + if (mmio->is_write) {
> + vgic_update_state(vcpu->kvm);
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu,
> + struct kvm_exit_mmio *mmio, u32 offset)
> +{
> + u32 *reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
> + vcpu->vcpu_id, offset);
> + vgic_reg_access(mmio, reg, offset,
> + ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
> + return false;
> +}
What do you gain from returning a bool from the MMIO handlers? Why not assume
that state has always been updated and kick the vcpus if something is pending?
> +
> +static u32 vgic_get_target_reg(struct kvm *kvm, int irq)
> +{
> + struct vgic_dist *dist = &kvm->arch.vgic;
> + struct kvm_vcpu *vcpu;
> + int i, c;
> + unsigned long *bmap;
> + u32 val = 0;
> +
> + BUG_ON(irq & 3);
> + BUG_ON(irq < 32);
Again, these look scary because I can't see the offset sanity checking for
the MMIO traps...
> +
> + irq -= 32;
> +
> + kvm_for_each_vcpu(c, vcpu, kvm) {
> + bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]);
> + for (i = 0; i < 4; i++)
Is that 4 from sizeof(unsigned long)?
> + if (test_bit(irq + i, bmap))
> + val |= 1 << (c + i * 8);
> + }
> +
> + return val;
> +}
> +
> +static void vgic_set_target_reg(struct kvm *kvm, u32 val, int irq)
> +{
> + struct vgic_dist *dist = &kvm->arch.vgic;
> + struct kvm_vcpu *vcpu;
> + int i, c;
> + unsigned long *bmap;
> + u32 target;
> +
> + BUG_ON(irq & 3);
> + BUG_ON(irq < 32);
> +
> + irq -= 32;
> +
> + /*
> + * Pick the LSB in each byte. This ensures we target exactly
> + * one vcpu per IRQ. If the byte is null, assume we target
> + * CPU0.
> + */
> + for (i = 0; i < 4; i++) {
> + int shift = i * 8;
Is this from BITS_PER_BYTE?
> + target = ffs((val >> shift) & 0xffU);
> + target = target ? (target - 1) : 0;
__ffs?
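(For reference: ffs() is 1-based with 0 meaning "no bit set", hence the
`target ? target - 1 : 0` dance; the kernel's __ffs() is 0-based but
undefined for 0, so the empty-byte fallback still needs handling either
way. A userspace illustration of the byte-to-target computation:)

```c
#include <strings.h>    /* POSIX ffs() */

/* Pick the target CPU from one byte of ITARGETSR: lowest set bit wins,
 * an empty byte falls back to CPU0, as the comment in the patch says. */
static int pick_target(unsigned int byte)
{
        int target = ffs(byte & 0xffU); /* 1-based, 0 if no bit set */
        return target ? target - 1 : 0;
}
```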
> + dist->irq_spi_cpu[irq + i] = target;
> + kvm_for_each_vcpu(c, vcpu, kvm) {
> + bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]);
> + if (c == target)
> + set_bit(irq + i, bmap);
> + else
> + clear_bit(irq + i, bmap);
> + }
> + }
> +}
> +
> +static bool handle_mmio_target_reg(struct kvm_vcpu *vcpu,
> + struct kvm_exit_mmio *mmio, u32 offset)
> +{
> + u32 reg;
> +
> + /* We treat the banked interrupts targets as read-only */
> + if (offset < 32) {
> + u32 roreg = 1 << vcpu->vcpu_id;
> + roreg |= roreg << 8;
> + roreg |= roreg << 16;
> +
> + vgic_reg_access(mmio, &roreg, offset,
> + ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
> + return false;
> + }
> +
> + reg = vgic_get_target_reg(vcpu->kvm, offset & ~3U);
> + vgic_reg_access(mmio, &reg, offset,
> + ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
> + if (mmio->is_write) {
> + vgic_set_target_reg(vcpu->kvm, reg, offset & ~3U);
> + vgic_update_state(vcpu->kvm);
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static u32 vgic_cfg_expand(u16 val)
> +{
> + u32 res = 0;
> + int i;
> +
> + for (i = 0; i < 16; i++)
> + res |= (val >> i) << (2 * i + 1);
Ok, you've lost me on this one but replacing some of the magic numbers with
the constants they represent would be much appreciated, please!
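With the magic numbers named, the pair might read like the sketch below
(names invented for illustration, not from the patch). One nit while
renaming: `(val >> i)` above isn't masked down to a single bit, so a set
bit can bleed into other IRQs' slots (e.g. val = 0x4: the i = 0 term sets
bit 3, which the compress side attributes to IRQ 1); the sketch adds the
`& 1`.

```c
#include <stdint.h>

typedef uint16_t u16;
typedef uint32_t u32;

#define VGIC_NR_CFG_IRQS        16      /* IRQs covered per 16-bit half */
#define VGIC_CFG_BITS_PER_IRQ   2       /* GICD_ICFGRn: 2 bits per IRQ... */
#define VGIC_CFG_EDGE_SHIFT     1       /* ...of which only the MSB is used */

/* Spread 16 one-bit-per-IRQ values out to their 2-bit GICD_ICFGR slots. */
static u32 vgic_cfg_expand(u16 val)
{
        u32 res = 0;
        int i;

        for (i = 0; i < VGIC_NR_CFG_IRQS; i++)
                res |= (u32)((val >> i) & 1)
                        << (VGIC_CFG_BITS_PER_IRQ * i + VGIC_CFG_EDGE_SHIFT);
        return res;
}

/* ...and collapse them back down to one bit per IRQ. */
static u16 vgic_cfg_compress(u32 val)
{
        u16 res = 0;
        int i;

        for (i = 0; i < VGIC_NR_CFG_IRQS; i++)
                res |= ((val >> (VGIC_CFG_BITS_PER_IRQ * i +
                                 VGIC_CFG_EDGE_SHIFT)) & 1) << i;
        return res;
}
```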
> +
> + return res;
> +}
> +
> +static u16 vgic_cfg_compress(u32 val)
> +{
> + u16 res = 0;
> + int i;
> +
> + for (i = 0; i < 16; i++)
> + res |= (val >> (i * 2 + 1)) << i;
> +
> + return res;
> +}
> +
> +/*
> + * The distributor uses 2 bits per IRQ for the CFG register, but the
> + * LSB is always 0. As such, we only keep the upper bit, and use the
> + * two above functions to compress/expand the bits
> + */
> +static bool handle_mmio_cfg_reg(struct kvm_vcpu *vcpu,
> + struct kvm_exit_mmio *mmio, u32 offset)
> +{
> + u32 val;
> + u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
> + vcpu->vcpu_id, offset >> 1);
> + if (offset & 2)
> + val = *reg >> 16;
> + else
> + val = *reg & 0xffff;
> +
> + val = vgic_cfg_expand(val);
> + vgic_reg_access(mmio, &val, offset,
> + ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
> + if (mmio->is_write) {
> + if (offset < 4) {
> + *reg = ~0U; /* Force PPIs/SGIs to 1 */
> + return false;
> + }
> +
> + val = vgic_cfg_compress(val);
> + if (offset & 2) {
> + *reg &= 0xffff;
> + *reg |= val << 16;
> + } else {
> + *reg &= 0xffff << 16;
> + *reg |= val;
> + }
> + }
> +
> + return false;
> +}
> +
> +static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
> + struct kvm_exit_mmio *mmio, u32 offset)
> +{
> + u32 reg;
> + vgic_reg_access(mmio, &reg, offset,
> + ACCESS_READ_RAZ | ACCESS_WRITE_VALUE);
> + if (mmio->is_write) {
> + vgic_dispatch_sgi(vcpu, reg);
> + vgic_update_state(vcpu->kvm);
> + return true;
> + }
> +
> + return false;
> +}
> +
> /* All this should be handled by kvm_bus_io_*()... FIXME!!! */
> struct mmio_range {
> unsigned long base;
> @@ -110,6 +428,66 @@ struct mmio_range {
> };
>
> static const struct mmio_range vgic_ranges[] = {
> + { /* CTRL, TYPER, IIDR */
> + .base = 0,
> + .len = 12,
> + .handle_mmio = handle_mmio_misc,
> + },
> + { /* IGROUPRn */
> + .base = 0x80,
> + .len = VGIC_NR_IRQS / 8,
> + .handle_mmio = handle_mmio_raz_wi,
> + },
> + { /* ISENABLERn */
> + .base = 0x100,
> + .len = VGIC_NR_IRQS / 8,
> + .handle_mmio = handle_mmio_set_enable_reg,
> + },
> + { /* ICENABLERn */
> + .base = 0x180,
> + .len = VGIC_NR_IRQS / 8,
> + .handle_mmio = handle_mmio_clear_enable_reg,
> + },
> + { /* ISPENDRn */
> + .base = 0x200,
> + .len = VGIC_NR_IRQS / 8,
> + .handle_mmio = handle_mmio_set_pending_reg,
> + },
> + { /* ICPENDRn */
> + .base = 0x280,
> + .len = VGIC_NR_IRQS / 8,
> + .handle_mmio = handle_mmio_clear_pending_reg,
> + },
> + { /* ISACTIVERn */
> + .base = 0x300,
> + .len = VGIC_NR_IRQS / 8,
> + .handle_mmio = handle_mmio_raz_wi,
> + },
> + { /* ICACTIVERn */
> + .base = 0x380,
> + .len = VGIC_NR_IRQS / 8,
> + .handle_mmio = handle_mmio_raz_wi,
> + },
> + { /* IPRIORITYRn */
> + .base = 0x400,
> + .len = VGIC_NR_IRQS,
> + .handle_mmio = handle_mmio_priority_reg,
> + },
> + { /* ITARGETSRn */
> + .base = 0x800,
> + .len = VGIC_NR_IRQS,
> + .handle_mmio = handle_mmio_target_reg,
> + },
> + { /* ICFGRn */
> + .base = 0xC00,
> + .len = VGIC_NR_IRQS / 4,
> + .handle_mmio = handle_mmio_cfg_reg,
> + },
> + { /* SGIRn */
> + .base = 0xF00,
> + .len = 4,
> + .handle_mmio = handle_mmio_sgi_reg,
> + },
Why not #define the offset values for the base fields instead of commenting
the literals?
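Something like the distributor offset names used by the host GIC driver,
presumably (values from the GICv2 distributor register map; exact macro
names may differ from what eventually gets used):

```c
/* GICv2 distributor register offsets, roughly as in the host GIC driver. */
#define GIC_DIST_CTRL           0x000
#define GIC_DIST_CTR            0x004
#define GIC_DIST_IGROUP         0x080
#define GIC_DIST_ENABLE_SET     0x100
#define GIC_DIST_ENABLE_CLEAR   0x180
#define GIC_DIST_PENDING_SET    0x200
#define GIC_DIST_PENDING_CLEAR  0x280
#define GIC_DIST_ACTIVE_SET     0x300
#define GIC_DIST_ACTIVE_CLEAR   0x380
#define GIC_DIST_PRI            0x400
#define GIC_DIST_TARGET         0x800
#define GIC_DIST_CONFIG         0xc00
#define GIC_DIST_SOFTINT        0xf00
```

so the table entries become e.g. `.base = GIC_DIST_ENABLE_SET,` with no
comment needed.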
Will
* [kvmarm] [PATCH v4 05/13] ARM: KVM: VGIC accept vcpu and dist base addresses from user space
2012-11-28 13:11 ` Will Deacon
@ 2012-11-28 13:22 ` Marc Zyngier
2012-12-01 2:52 ` Christoffer Dall
1 sibling, 0 replies; 58+ messages in thread
From: Marc Zyngier @ 2012-11-28 13:22 UTC (permalink / raw)
To: linux-arm-kernel
On 28/11/12 13:11, Will Deacon wrote:
> On Sat, Nov 10, 2012 at 03:44:51PM +0000, Christoffer Dall wrote:
>> User space defines the model to emulate to a guest and should therefore
>> decide which addresses are used for both the virtual CPU interface
>> directly mapped in the guest physical address space and for the emulated
>> distributor interface, which is mapped in software by the in-kernel VGIC
>> support.
>>
>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>> ---
>> arch/arm/include/asm/kvm_mmu.h | 2 +
>> arch/arm/include/asm/kvm_vgic.h | 9 ++++++
>> arch/arm/kvm/arm.c | 16 ++++++++++
>> arch/arm/kvm/vgic.c | 61 +++++++++++++++++++++++++++++++++++++++
>> 4 files changed, 87 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index 9bd0508..0800531 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -26,6 +26,8 @@
>> * To save a bit of memory and to avoid alignment issues we assume 39-bit IPA
>> * for now, but remember that the level-1 table must be aligned to its size.
>> */
>> +#define KVM_PHYS_SHIFT (38)
>
> Seems a bit small...
It's now been fixed to be 40 bits.
>> +#define KVM_PHYS_MASK ((1ULL << KVM_PHYS_SHIFT) - 1)
>> #define PTRS_PER_PGD2 512
>> #define PGD2_ORDER get_order(PTRS_PER_PGD2 * sizeof(pgd_t))
>>
>> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
>> index b444ecf..9ca8d21 100644
>> --- a/arch/arm/include/asm/kvm_vgic.h
>> +++ b/arch/arm/include/asm/kvm_vgic.h
>> @@ -20,6 +20,9 @@
>> #define __ASM_ARM_KVM_VGIC_H
>>
>> struct vgic_dist {
>> + /* Distributor and vcpu interface mapping in the guest */
>> + phys_addr_t vgic_dist_base;
>> + phys_addr_t vgic_cpu_base;
>> };
>>
>> struct vgic_cpu {
>> @@ -31,6 +34,7 @@ struct kvm_run;
>> struct kvm_exit_mmio;
>>
>> #ifdef CONFIG_KVM_ARM_VGIC
>> +int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
>> bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>> struct kvm_exit_mmio *mmio);
>>
>> @@ -40,6 +44,11 @@ static inline int kvm_vgic_hyp_init(void)
>> return 0;
>> }
>>
>> +static inline int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
>> +{
>> + return 0;
>> +}
>> +
>> static inline int kvm_vgic_init(struct kvm *kvm)
>> {
>> return 0;
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 426828a..3ac1aab 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -61,6 +61,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
>> static u8 kvm_next_vmid;
>> static DEFINE_SPINLOCK(kvm_vmid_lock);
>>
>> +static bool vgic_present;
>> +
>> static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
>> {
>> BUG_ON(preemptible());
>> @@ -825,7 +827,19 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>> static int kvm_vm_ioctl_set_device_address(struct kvm *kvm,
>> struct kvm_device_address *dev_addr)
>> {
>> - return -ENODEV;
>> + unsigned long dev_id, type;
>> +
>> + dev_id = (dev_addr->id & KVM_DEVICE_ID_MASK) >> KVM_DEVICE_ID_SHIFT;
>> + type = (dev_addr->id & KVM_DEVICE_TYPE_MASK) >> KVM_DEVICE_TYPE_SHIFT;
>> +
>> + switch (dev_id) {
>> + case KVM_ARM_DEVICE_VGIC_V2:
>> + if (!vgic_present)
>> + return -ENXIO;
>> + return kvm_vgic_set_addr(kvm, type, dev_addr->addr);
>> + default:
>> + return -ENODEV;
>> + }
>> }
>>
>> long kvm_arch_vm_ioctl(struct file *filp,
>> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
>> index 26ada3b..f85b275 100644
>> --- a/arch/arm/kvm/vgic.c
>> +++ b/arch/arm/kvm/vgic.c
>> @@ -22,6 +22,13 @@
>> #include <linux/io.h>
>> #include <asm/kvm_emulate.h>
>>
>> +#define VGIC_ADDR_UNDEF (-1)
>> +#define IS_VGIC_ADDR_UNDEF(_x) ((_x) == (typeof(_x))VGIC_ADDR_UNDEF)
>> +
>> +#define VGIC_DIST_SIZE 0x1000
>> +#define VGIC_CPU_SIZE 0x2000
>
> These defines might be useful to userspace so that they don't request the
> distributor and the cpu interface to be placed too close together (been there,
> done that :).
Fair enough.
>> +
>> +
>> #define ACCESS_READ_VALUE (1 << 0)
>> #define ACCESS_READ_RAZ (0 << 0)
>> #define ACCESS_READ_MASK(x) ((x) & (1 << 0))
>> @@ -136,3 +143,57 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exi
>> {
>> return KVM_EXIT_MMIO;
>> }
>> +
>> +static bool vgic_ioaddr_overlap(struct kvm *kvm)
>> +{
>> + phys_addr_t dist = kvm->arch.vgic.vgic_dist_base;
>> + phys_addr_t cpu = kvm->arch.vgic.vgic_cpu_base;
>> +
>> + if (IS_VGIC_ADDR_UNDEF(dist) || IS_VGIC_ADDR_UNDEF(cpu))
>> + return false;
>> + if ((dist <= cpu && dist + VGIC_DIST_SIZE > cpu) ||
>> + (cpu <= dist && cpu + VGIC_CPU_SIZE > dist))
>> + return true;
>> + return false;
>
> Just return the predicate that you're testing.
>
>> +}
>> +
>> +int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
>> +{
>> + int r = 0;
>> + struct vgic_dist *vgic = &kvm->arch.vgic;
>> +
>> + if (addr & ~KVM_PHYS_MASK)
>> + return -E2BIG;
>> +
>> + if (addr & ~PAGE_MASK)
>> + return -EINVAL;
>> +
>> + mutex_lock(&kvm->lock);
>> + switch (type) {
>> + case KVM_VGIC_V2_ADDR_TYPE_DIST:
>> + if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_dist_base))
>> + return -EEXIST;
>> + if (addr + VGIC_DIST_SIZE < addr)
>> + return -EINVAL;
>
> I think somebody else pointed out the missing mutex_unlocks on the failure
> paths.
Yes, it's been fixed in the tree already.
>> + kvm->arch.vgic.vgic_dist_base = addr;
>> + break;
>> + case KVM_VGIC_V2_ADDR_TYPE_CPU:
>> + if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_cpu_base))
>> + return -EEXIST;
>> + if (addr + VGIC_CPU_SIZE < addr)
>> + return -EINVAL;
>> + kvm->arch.vgic.vgic_cpu_base = addr;
>> + break;
>> + default:
>> + r = -ENODEV;
>> + }
>> +
>> + if (vgic_ioaddr_overlap(kvm)) {
>> + kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
>> + kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
>> + return -EINVAL;
>
> Perhaps we could put all the address checking in one place, so that the wrapping
> round zero checks and the overlap checks can be in the same function?
>
>> + }
>> +
>> + mutex_unlock(&kvm->lock);
>> + return r;
>> +}
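The suggestion to gather the address checks in one place could be sketched like this (hypothetical helper name, user-space model without the kvm mutex):

```c
#include <errno.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

#define VGIC_ADDR_UNDEF        ((phys_addr_t)-1)
#define IS_VGIC_ADDR_UNDEF(_x) ((_x) == VGIC_ADDR_UNDEF)

/*
 * Hypothetical vgic_ioaddr_assign(): all the validity checks in one
 * place -- "already set", wrap around zero, and overlap with the
 * other region -- before the base is committed.
 */
static int vgic_ioaddr_assign(phys_addr_t *ioaddr, phys_addr_t addr,
			      phys_addr_t size,
			      phys_addr_t other, phys_addr_t other_size)
{
	if (!IS_VGIC_ADDR_UNDEF(*ioaddr))
		return -EEXIST;
	if (addr + size < addr)		/* wraps around zero */
		return -EINVAL;
	if (!IS_VGIC_ADDR_UNDEF(other) &&
	    ((addr <= other && addr + size > other) ||
	     (other <= addr && other + other_size > addr)))
		return -EINVAL;		/* overlaps the other region */

	*ioaddr = addr;
	return 0;
}
```

With this shape, the switch in kvm_vgic_set_addr() only picks the field and sizes, and every failure path can share a single unlock.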
>
> Will
> _______________________________________________
> kvmarm mailing list
> kvmarm at lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
>
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v4 04/13] ARM: KVM: Initial VGIC MMIO support code
2012-11-28 13:09 ` Will Deacon
@ 2012-11-28 13:44 ` Marc Zyngier
0 siblings, 0 replies; 58+ messages in thread
From: Marc Zyngier @ 2012-11-28 13:44 UTC (permalink / raw)
To: linux-arm-kernel
On 28/11/12 13:09, Will Deacon wrote:
> On Sat, Nov 10, 2012 at 03:44:44PM +0000, Christoffer Dall wrote:
>> From: Marc Zyngier <marc.zyngier@arm.com>
>>
>> Wire the initial in-kernel MMIO support code for the VGIC, used
>> for the distributor emulation.
>
> [...]
>
>> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
>> new file mode 100644
>> index 0000000..26ada3b
>> --- /dev/null
>> +++ b/arch/arm/kvm/vgic.c
>> @@ -0,0 +1,138 @@
>> +/*
>> + * Copyright (C) 2012 ARM Ltd.
>> + * Author: Marc Zyngier <marc.zyngier@arm.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write to the Free Software
>> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
>> + */
>> +
>> +#include <linux/kvm.h>
>> +#include <linux/kvm_host.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/io.h>
>> +#include <asm/kvm_emulate.h>
>> +
>> +#define ACCESS_READ_VALUE (1 << 0)
>> +#define ACCESS_READ_RAZ (0 << 0)
>> +#define ACCESS_READ_MASK(x) ((x) & (1 << 0))
>> +#define ACCESS_WRITE_IGNORED (0 << 1)
>> +#define ACCESS_WRITE_SETBIT (1 << 1)
>> +#define ACCESS_WRITE_CLEARBIT (2 << 1)
>> +#define ACCESS_WRITE_VALUE (3 << 1)
>> +#define ACCESS_WRITE_MASK(x) ((x) & (3 << 1))
>> +
>> +/**
>> + * vgic_reg_access - access vgic register
>> + * @mmio: pointer to the data describing the mmio access
>> + * @reg: pointer to the virtual backing of the vgic distributor struct
>> + * @offset: least significant 2 bits used for word offset
>> + * @mode: ACCESS_ mode (see defines above)
>> + *
>> + * Helper to make vgic register access easier using one of the access
>> + * modes defined for vgic register access
>> + * (read,raz,write-ignored,setbit,clearbit,write)
>> + */
>> +static void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
>> + u32 offset, int mode)
>> +{
>> + int word_offset = offset & 3;
>
> You can get rid of this variable.
>
>> + int shift = word_offset * 8;
>
> shift = (offset & 3) << 3;
>
>> + u32 mask;
>> + u32 regval;
>> +
>> + /*
>> + * Any alignment fault should have been delivered to the guest
>> + * directly (ARM ARM B3.12.7 "Prioritization of aborts").
>> + */
>> +
>> + mask = (~0U) >> (word_offset * 8);
>
> then use shift here.
Sure.
>> + if (reg)
>> + regval = *reg;
>> + else {
>
> Use braces for the if clause.
Indeed.
>
>> + BUG_ON(mode != (ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED));
>> + regval = 0;
>> + }
>> +
>> + if (mmio->is_write) {
>> + u32 data = (*((u32 *)mmio->data) & mask) << shift;
>> + switch (ACCESS_WRITE_MASK(mode)) {
>> + case ACCESS_WRITE_IGNORED:
>> + return;
>> +
>> + case ACCESS_WRITE_SETBIT:
>> + regval |= data;
>> + break;
>> +
>> + case ACCESS_WRITE_CLEARBIT:
>> + regval &= ~data;
>> + break;
>> +
>> + case ACCESS_WRITE_VALUE:
>> + regval = (regval & ~(mask << shift)) | data;
>> + break;
>> + }
>> + *reg = regval;
>> + } else {
>> + switch (ACCESS_READ_MASK(mode)) {
>> + case ACCESS_READ_RAZ:
>> + regval = 0;
>> + /* fall through */
>> +
>> + case ACCESS_READ_VALUE:
>> + *((u32 *)mmio->data) = (regval >> shift) & mask;
>> + }
>> + }
>> +}
>
> It might be a good idea to have some port accessors for mmio->data otherwise
> you'll likely get endianness issues creeping in.
Aarrgh! This depends on the relative endianness of host and guest. I'm
feeling slightly sick...
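The accessors being hinted at could look like this (names are made up for illustration; the model assumes a little-endian guest, so the data buffer is byte-swapped explicitly instead of being cast to a u32 pointer):

```c
#include <stdint.h>

struct kvm_exit_mmio {
	int is_write;
	uint8_t data[8];
	uint32_t len;
};

/*
 * Illustrative accessors: mmio->data always holds the value in guest
 * byte order (assumed little-endian here), so the helpers assemble
 * and split it byte by byte, independent of host endianness.
 */
static uint32_t mmio_data_read(const struct kvm_exit_mmio *mmio, uint32_t mask)
{
	const uint8_t *p = mmio->data;
	uint32_t v = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
		     (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;

	return v & mask;
}

static void mmio_data_write(struct kvm_exit_mmio *mmio, uint32_t mask,
			    uint32_t value)
{
	uint32_t v = value & mask;

	mmio->data[0] = v & 0xff;
	mmio->data[1] = (v >> 8) & 0xff;
	mmio->data[2] = (v >> 16) & 0xff;
	mmio->data[3] = (v >> 24) & 0xff;
}

/* a demo access record so the behaviour can be checked standalone */
static struct kvm_exit_mmio demo_mmio = { .is_write = 1, .len = 4 };
```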
>> +/* All this should be handled by kvm_bus_io_*()... FIXME!!! */
>
> I don't follow this comment :) Can you either make it clearer (and less
> alarming!) or just drop it please?
The non-alarming version should read:
/*
* I would have liked to use the kvm_bus_io_*() API instead, but
* it cannot cope with banked registers (only the VM pointer is
* passed around, and we need the vcpu). One of these days, someone
* please fix it!
*/
>
>> +struct mmio_range {
>> + unsigned long base;
>> + unsigned long len;
>> + bool (*handle_mmio)(struct kvm_vcpu *vcpu, struct kvm_exit_mmio *mmio,
>> + u32 offset);
>> +};
>
> Why not make offset a phys_addr_t?
Very good point.
>> +static const struct mmio_range vgic_ranges[] = {
>> + {}
>> +};
>> +
>> +static const
>> +struct mmio_range *find_matching_range(const struct mmio_range *ranges,
>> + struct kvm_exit_mmio *mmio,
>> + unsigned long base)
>> +{
>> + const struct mmio_range *r = ranges;
>> + unsigned long addr = mmio->phys_addr - base;
>
> Same here, I don't think we want to truncate everything to unsigned long.
Indeed.
>> + while (r->len) {
>> + if (addr >= r->base &&
>> + (addr + mmio->len) <= (r->base + r->len))
>> + return r;
>> + r++;
>> + }
>
> Hmm, does this work correctly for adjacent mmio devices where addr sits
> on the boundary (i.e. the first address of the second device)?
I think it is OK. We basically check that both ends of the access are
within a single range (we do not bother with cross-range accesses).
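The boundary behaviour can be checked with a small user-space model of the walk: an access matches only when it fits entirely inside one range, so an access starting on the boundary address falls through to the second device, and one crossing the boundary matches nothing.

```c
#include <stddef.h>

struct mmio_range {
	unsigned long base;
	unsigned long len;
};

/* two adjacent devices, as in the boundary question above */
static const struct mmio_range sample_ranges[] = {
	{ .base = 0x000, .len = 0x100 },
	{ .base = 0x100, .len = 0x100 },
	{}	/* sentinel: len == 0 terminates the walk */
};

static const struct mmio_range *
find_matching_range(const struct mmio_range *r,
		    unsigned long addr, unsigned long len)
{
	while (r->len) {
		if (addr >= r->base && addr + len <= r->base + r->len)
			return r;
		r++;
	}
	return NULL;
}
```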
M.
--
Jazz is not dead. It just smells funny...
* [PATCH v4 03/13] ARM: KVM: Initial VGIC infrastructure support
2012-11-28 13:09 ` Marc Zyngier
@ 2012-11-28 14:13 ` Will Deacon
0 siblings, 0 replies; 58+ messages in thread
From: Will Deacon @ 2012-11-28 14:13 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Nov 28, 2012 at 01:09:37PM +0000, Marc Zyngier wrote:
> On 28/11/12 12:49, Will Deacon wrote:
> > On Sat, Nov 10, 2012 at 03:44:37PM +0000, Christoffer Dall wrote:
> >> @@ -363,7 +370,7 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
> >> */
> >> int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
> >> {
> >> - return !!v->arch.irq_lines;
> >> + return !!v->arch.irq_lines || kvm_vgic_vcpu_pending_irq(v);
> >> }
> >
> > So interrupt injection without the in-kernel GIC updates irq_lines, but the
> > in-kernel GIC has its own separate data structures? Why can't the in-kernel GIC
> > just use irq_lines instead of irq_pending_on_cpu?
>
> They serve very different purposes:
> - irq_lines directly controls the IRQ and FIQ lines (it is or-ed into
> the HCR register before entering the guest)
> - irq_pending_on_cpu deals with the CPU interface, and only that. Plus,
> it is a kernel only thing. What triggers the interrupt on the guest is
> the presence of list registers with a pending state.
>
> You signal interrupts one way or the other.
Ok, thanks for the explanation. I suspect that we could use (another)
cosmetic change then. How about cpui_irq_pending and hcr_irq_pending?
> >> int kvm_arch_vcpu_in_guest_mode(struct kvm_vcpu *v)
> >> @@ -633,6 +640,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >>
> >> update_vttbr(vcpu->kvm);
> >>
> >> + kvm_vgic_sync_to_cpu(vcpu);
> >> +
> >> local_irq_disable();
> >>
> >> /*
> >> @@ -645,6 +654,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >>
> >> if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
> >> local_irq_enable();
> >> + kvm_vgic_sync_from_cpu(vcpu);
> >> continue;
> >> }
> >
> > For VFP, we use different terminology (sync and flush). I don't think they're
> > any clearer than what you have, but the consistency would be nice.
>
> Which one maps to which?
sync: hardware -> data structure
flush: data structure -> hardware
> > Given that both these functions are run with interrupts enabled, why doesn't
> > the second require a lock for updating dist->irq_pending_on_cpu? I notice
> > there's a random smp_mb() over there...
>
> Updating *only* irq_pending_on_cpu doesn't require the lock (set_bit()
> should be safe, and I think the smp_mb() is a leftover of some debugging
hack). kvm_vgic_to_cpu() does a lot more (it picks interrupts from the
distributor, and hence requires the lock to be taken).
Ok, if the barrier is just a hangover from something else and you don't have
any races with test/clear operations then you should be alright.
Will
* [PATCH v4 06/13] ARM: KVM: VGIC distributor handling
2012-11-28 13:21 ` Will Deacon
@ 2012-11-28 14:35 ` Marc Zyngier
0 siblings, 0 replies; 58+ messages in thread
From: Marc Zyngier @ 2012-11-28 14:35 UTC (permalink / raw)
To: linux-arm-kernel
On 28/11/12 13:21, Will Deacon wrote:
> On Sat, Nov 10, 2012 at 03:44:58PM +0000, Christoffer Dall wrote:
>> From: Marc Zyngier <marc.zyngier@arm.com>
>>
>> Add the GIC distributor emulation code. A number of the GIC features
>> are simply ignored as they are not required to boot a Linux guest.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>> ---
>> arch/arm/include/asm/kvm_vgic.h | 167 ++++++++++++++
>> arch/arm/kvm/vgic.c | 471 +++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 637 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
>> index 9ca8d21..9e60b1d 100644
>> --- a/arch/arm/include/asm/kvm_vgic.h
>> +++ b/arch/arm/include/asm/kvm_vgic.h
>> @@ -19,10 +19,177 @@
>> #ifndef __ASM_ARM_KVM_VGIC_H
>> #define __ASM_ARM_KVM_VGIC_H
>>
>> +#include <linux/kernel.h>
>> +#include <linux/kvm.h>
>> +#include <linux/kvm_host.h>
>> +#include <linux/irqreturn.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/types.h>
>> +
>> +#define VGIC_NR_IRQS 128
>
> #define VGIC_NR_PRIVATE_IRQS 32?
>
>> +#define VGIC_NR_SHARED_IRQS (VGIC_NR_IRQS - 32)
>
> then subtract it here
Sure.
>> +#define VGIC_MAX_CPUS NR_CPUS
>
> We already have KVM_MAX_VCPUS, why do we need another?
They really should be the same, and this NR_CPUS is a bug that has
already been fixed.
>> +
>> +/* Sanity checks... */
>> +#if (VGIC_MAX_CPUS > 8)
>> +#error Invalid number of CPU interfaces
>> +#endif
>> +
>> +#if (VGIC_NR_IRQS & 31)
>> +#error "VGIC_NR_IRQS must be a multiple of 32"
>> +#endif
>> +
>> +#if (VGIC_NR_IRQS > 1024)
>> +#error "VGIC_NR_IRQS must be <= 1024"
>> +#endif
>
> Maybe put each check directly below the #define being tested, to make it
> super-obvious to people thinking of changing the constants?
OK.
>> +/*
>> + * The GIC distributor registers describing interrupts have two parts:
>> + * - 32 per-CPU interrupts (SGI + PPI)
>> + * - a bunch of shared interrups (SPI)
>
> interrupts
>
>> + */
>> +struct vgic_bitmap {
>> + union {
>> + u32 reg[1];
>> + unsigned long reg_ul[0];
>> + } percpu[VGIC_MAX_CPUS];
>> + union {
>> + u32 reg[VGIC_NR_SHARED_IRQS / 32];
>> + unsigned long reg_ul[0];
>> + } shared;
>> +};
>
> Whoa, this is nasty!
>
> Firstly, let's replace the `32' with sizeof(u32) for fun. Secondly, can
> we make the reg_ul arrays sized using the BITS_TO_LONGS macro?
This has already been replaced with:
struct vgic_bitmap {
union {
u32 reg[1];
DECLARE_BITMAP(reg_ul, 32);
} percpu[VGIC_MAX_CPUS];
union {
u32 reg[VGIC_NR_SHARED_IRQS / 32];
DECLARE_BITMAP(reg_ul, VGIC_NR_SHARED_IRQS);
} shared;
};
which should address most of your concerns. As for the sizeof(u32), I
think assuming that u32 has a grand total of 32 bits is safe enough ;-)
>> +
>> +static inline u32 *vgic_bitmap_get_reg(struct vgic_bitmap *x,
>> + int cpuid, u32 offset)
>> +{
>> + offset >>= 2;
>> + BUG_ON(offset > (VGIC_NR_IRQS / 32));
>
> Hmm, where is offset sanity-checked before here? Do you just rely on all
> trapped accesses being valid?
You've already validated the access as being in a valid range. The offset
is just derived from the access address. The BUG_ON() is a leftover from
early debugging and should go away now.
>> + if (!offset)
>> + return x->percpu[cpuid].reg;
>> + else
>> + return x->shared.reg + offset - 1;
>
> An alternative to this would be to have a single array, with the per-cpu
> interrupts all laid out at the start and a macro to convert an offset to
> an index. Might make the code more readable and the struct definition more
> concise.
I'll try and see if this makes the code more palatable.
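The single-array alternative might look something like this sketch (illustrative names; the banked private words sit at the front, one per cpu, followed by the shared words, and a macro turns a (cpuid, byte offset) pair into an index):

```c
#include <stdint.h>

#define VGIC_MAX_CPUS        8
#define VGIC_NR_IRQS         128
#define VGIC_NR_PRIVATE_IRQS 32
#define VGIC_NR_SHARED_IRQS  (VGIC_NR_IRQS - VGIC_NR_PRIVATE_IRQS)

struct vgic_bitmap_flat {
	uint32_t reg[VGIC_MAX_CPUS + VGIC_NR_SHARED_IRQS / 32];
};

/* offset 0 is the banked per-cpu word; anything else is shared */
#define REG_INDEX(cpuid, offset) \
	((offset) ? VGIC_MAX_CPUS + ((offset) >> 2) - 1 : (cpuid))

static struct vgic_bitmap_flat demo_bitmap;

static uint32_t *vgic_bitmap_get_reg(struct vgic_bitmap_flat *x,
				     int cpuid, uint32_t offset)
{
	return &x->reg[REG_INDEX(cpuid, offset)];
}
```

This reproduces the indexing of the quoted accessor (offset 0 selects the banked word, offset 4 the first shared word) without the nested unions.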
>> +}
>> +
>> +static inline int vgic_bitmap_get_irq_val(struct vgic_bitmap *x,
>> + int cpuid, int irq)
>> +{
>> + if (irq < 32)
>
> VGIC_NR_PRIVATE_IRQS (inless you go with the suggestion above)
yep.
>> + return test_bit(irq, x->percpu[cpuid].reg_ul);
>> +
>> + return test_bit(irq - 32, x->shared.reg_ul);
>> +}
>> +
>> +static inline void vgic_bitmap_set_irq_val(struct vgic_bitmap *x,
>> + int cpuid, int irq, int val)
>> +{
>> + unsigned long *reg;
>> +
>> + if (irq < 32)
>> + reg = x->percpu[cpuid].reg_ul;
>> + else {
>> + reg = x->shared.reg_ul;
>> + irq -= 32;
>> + }
>
> Likewise.
>
>> +
>> + if (val)
>> + set_bit(irq, reg);
>> + else
>> + clear_bit(irq, reg);
>> +}
>> +
>> +static inline unsigned long *vgic_bitmap_get_cpu_map(struct vgic_bitmap *x,
>> + int cpuid)
>> +{
>> + if (unlikely(cpuid >= VGIC_MAX_CPUS))
>> + return NULL;
>> + return x->percpu[cpuid].reg_ul;
>> +}
>> +
>> +static inline unsigned long *vgic_bitmap_get_shared_map(struct vgic_bitmap *x)
>> +{
>> + return x->shared.reg_ul;
>> +}
>> +
>> +struct vgic_bytemap {
>> + union {
>> + u32 reg[8];
>> + unsigned long reg_ul[0];
>> + } percpu[VGIC_MAX_CPUS];
>> + union {
>> + u32 reg[VGIC_NR_SHARED_IRQS / 4];
>> + unsigned long reg_ul[0];
>> + } shared;
>> +};
>
> Argh, it's another one! :)
Rest assured it is the last one, and it doesn't get much use either. ;-)
>> +
>> +static inline u32 *vgic_bytemap_get_reg(struct vgic_bytemap *x,
>> + int cpuid, u32 offset)
>> +{
>> + offset >>= 2;
>> + BUG_ON(offset > (VGIC_NR_IRQS / 4));
>> + if (offset < 4)
>> + return x->percpu[cpuid].reg + offset;
>> + else
>> + return x->shared.reg + offset - 8;
>> +}
>> +
>> +static inline int vgic_bytemap_get_irq_val(struct vgic_bytemap *x,
>> + int cpuid, int irq)
>> +{
>> + u32 *reg, shift;
>> + shift = (irq & 3) * 8;
>> + reg = vgic_bytemap_get_reg(x, cpuid, irq);
>> + return (*reg >> shift) & 0xff;
>> +}
>> +
>> +static inline void vgic_bytemap_set_irq_val(struct vgic_bytemap *x,
>> + int cpuid, int irq, int val)
>> +{
>> + u32 *reg, shift;
>> + shift = (irq & 3) * 8;
>> + reg = vgic_bytemap_get_reg(x, cpuid, irq);
>> + *reg &= ~(0xff << shift);
>> + *reg |= (val & 0xff) << shift;
>> +}
>> +
>> struct vgic_dist {
>> +#ifdef CONFIG_KVM_ARM_VGIC
>> + spinlock_t lock;
>> +
>> + /* Virtual control interface mapping */
>> + void __iomem *vctrl_base;
>> +
>> /* Distributor and vcpu interface mapping in the guest */
>> phys_addr_t vgic_dist_base;
>> phys_addr_t vgic_cpu_base;
>> +
>> + /* Distributor enabled */
>> + u32 enabled;
>> +
>> + /* Interrupt enabled (one bit per IRQ) */
>> + struct vgic_bitmap irq_enabled;
>> +
>> + /* Interrupt 'pin' level */
>> + struct vgic_bitmap irq_state;
>> +
>> + /* Level-triggered interrupt in progress */
>> + struct vgic_bitmap irq_active;
>> +
>> + /* Interrupt priority. Not used yet. */
>> + struct vgic_bytemap irq_priority;
>
> What would the bitmap component of the bytemap represent for priorities?
Exactly nothing. These fields will die a horrible death in a few seconds ;-)
>> +
>> + /* Level/edge triggered */
>> + struct vgic_bitmap irq_cfg;
>> +
>> + /* Source CPU per SGI and target CPU */
>> + u8 irq_sgi_sources[VGIC_MAX_CPUS][16];
>
> Ah, I guess my VGIC_NR_PRIVATE_IRQS interrupt should be further divided...
Could do indeed.
>> + /* Target CPU for each IRQ */
>> + u8 irq_spi_cpu[VGIC_NR_SHARED_IRQS];
>> + struct vgic_bitmap irq_spi_target[VGIC_MAX_CPUS];
>> +
>> + /* Bitmap indicating which CPU has something pending */
>> + unsigned long irq_pending_on_cpu;
>> +#endif
>> };
>>
>> struct vgic_cpu {
>> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
>> index f85b275..82feee8 100644
>> --- a/arch/arm/kvm/vgic.c
>> +++ b/arch/arm/kvm/vgic.c
>> @@ -22,6 +22,42 @@
>> #include <linux/io.h>
>> #include <asm/kvm_emulate.h>
>>
>> +/*
>> + * How the whole thing works (courtesy of Christoffer Dall):
>> + *
>> + * - At any time, the dist->irq_pending_on_cpu is the oracle that knows if
>> + * something is pending
>> + * - VGIC pending interrupts are stored on the vgic.irq_state vgic
>> + * bitmap (this bitmap is updated by both user land ioctls and guest
>> + * mmio ops) and indicate the 'wire' state.
>> + * - Every time the bitmap changes, the irq_pending_on_cpu oracle is
>> + * recalculated
>> + * - To calculate the oracle, we need info for each cpu from
>> + * compute_pending_for_cpu, which considers:
>> + * - PPI: dist->irq_state & dist->irq_enable
>> + * - SPI: dist->irq_state & dist->irq_enable & dist->irq_spi_target
>> + * - irq_spi_target is a 'formatted' version of the GICD_ICFGR
>> + * registers, stored on each vcpu. We only keep one bit of
>> + * information per interrupt, making sure that only one vcpu can
>> + * accept the interrupt.
>> + * - The same is true when injecting an interrupt, except that we only
>> + * consider a single interrupt at a time. The irq_spi_cpu array
>> + * contains the target CPU for each SPI.
>> + *
>> + * The handling of level interrupts adds some extra complexity. We
>> + * need to track when the interrupt has been EOIed, so we can sample
>> + * the 'line' again. This is achieved as such:
>> + *
>> + * - When a level interrupt is moved onto a vcpu, the corresponding
>> + * bit in irq_active is set. As long as this bit is set, the line
>> + * will be ignored for further interrupts. The interrupt is injected
>> + * into the vcpu with the VGIC_LR_EOI bit set (generate a
>> + * maintenance interrupt on EOI).
>> + * - When the interrupt is EOIed, the maintenance interrupt fires,
>> + * and clears the corresponding bit in irq_active. This allow the
>> + * interrupt line to be sampled again.
>> + */
>> +
>> #define VGIC_ADDR_UNDEF (-1)
>> #define IS_VGIC_ADDR_UNDEF(_x) ((_x) == (typeof(_x))VGIC_ADDR_UNDEF)
>>
>> @@ -38,6 +74,14 @@
>> #define ACCESS_WRITE_VALUE (3 << 1)
>> #define ACCESS_WRITE_MASK(x) ((x) & (3 << 1))
>>
>> +static void vgic_update_state(struct kvm *kvm);
>> +static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
>> +
>> +static inline int vgic_irq_is_edge(struct vgic_dist *dist, int irq)
>> +{
>> + return vgic_bitmap_get_irq_val(&dist->irq_cfg, 0, irq);
>> +}
>
> so vgic_bitmap_get_irq_val returns 0 for level and anything else for edge?
> Maybe an enum or something could make this clearer? Also, why not take a vcpu
> or cpuid parameter to pass through, rather than assuming 0?
Fair enough.
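One possible shape for the suggested enum (illustrative names, with the config bitmap modelled as a plain array in user space):

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* give the two trigger types names instead of a bare int */
enum vgic_irq_trigger {
	VGIC_IRQ_LEVEL,
	VGIC_IRQ_EDGE,
};

/* demo config bitmap: a set bit means the IRQ is edge-triggered */
static unsigned long demo_cfg[4] = { 1UL << 5 };

static enum vgic_irq_trigger vgic_irq_trigger_type(const unsigned long *cfg,
						   int irq)
{
	int set = (cfg[irq / BITS_PER_LONG] >> (irq % BITS_PER_LONG)) & 1;

	return set ? VGIC_IRQ_EDGE : VGIC_IRQ_LEVEL;
}
```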
>> +
>> /**
>> * vgic_reg_access - access vgic register
>> * @mmio: pointer to the data describing the mmio access
>> @@ -101,6 +145,280 @@ static void vgic_reg_access(struct kvm_exit_mmio *mmio, u32 *reg,
>> }
>> }
>>
>> +static bool handle_mmio_misc(struct kvm_vcpu *vcpu,
>> + struct kvm_exit_mmio *mmio, u32 offset)
>> +{
>> + u32 reg;
>> + u32 u32off = offset & 3;
>
> u32off? Bitten by a regex perhaps?
No. It used to be "off", which described the offset in a u32. People
complained about the lack of clarity, hence u32off. Maybe word_offset?
>> +
>> + switch (offset & ~3) {
>> + case 0: /* CTLR */
>> + reg = vcpu->kvm->arch.vgic.enabled;
>> + vgic_reg_access(mmio, &reg, u32off,
>> + ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
>> + if (mmio->is_write) {
>> + vcpu->kvm->arch.vgic.enabled = reg & 1;
>> + vgic_update_state(vcpu->kvm);
>> + return true;
>> + }
>> + break;
>> +
>> + case 4: /* TYPER */
>> + reg = (atomic_read(&vcpu->kvm->online_vcpus) - 1) << 5;
>> + reg |= (VGIC_NR_IRQS >> 5) - 1;
>> + vgic_reg_access(mmio, &reg, u32off,
>> + ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
>> + break;
>> +
>> + case 8: /* IIDR */
>> + reg = 0x4B00043B;
>> + vgic_reg_access(mmio, &reg, u32off,
>> + ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
>> + break;
>> + }
>> +
>> + return false;
>> +}
>> +
>> +static bool handle_mmio_raz_wi(struct kvm_vcpu *vcpu,
>> + struct kvm_exit_mmio *mmio, u32 offset)
>> +{
>> + vgic_reg_access(mmio, NULL, offset,
>> + ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
>> + return false;
>> +}
>> +
>> +static bool handle_mmio_set_enable_reg(struct kvm_vcpu *vcpu,
>> + struct kvm_exit_mmio *mmio, u32 offset)
>> +{
>> + u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_enabled,
>> + vcpu->vcpu_id, offset);
>> + vgic_reg_access(mmio, reg, offset,
>> + ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
>> + if (mmio->is_write) {
>> + vgic_update_state(vcpu->kvm);
>> + return true;
>> + }
>> +
>> + return false;
>> +}
>> +
>> +static bool handle_mmio_clear_enable_reg(struct kvm_vcpu *vcpu,
>> + struct kvm_exit_mmio *mmio, u32 offset)
>> +{
>> + u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_enabled,
>> + vcpu->vcpu_id, offset);
>> + vgic_reg_access(mmio, reg, offset,
>> + ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
>> + if (mmio->is_write) {
>> + if (offset < 4) /* Force SGI enabled */
>> + *reg |= 0xffff;
>> + vgic_update_state(vcpu->kvm);
>> + return true;
>> + }
>> +
>> + return false;
>> +}
>> +
>> +static bool handle_mmio_set_pending_reg(struct kvm_vcpu *vcpu,
>> + struct kvm_exit_mmio *mmio, u32 offset)
>> +{
>> + u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_state,
>> + vcpu->vcpu_id, offset);
>> + vgic_reg_access(mmio, reg, offset,
>> + ACCESS_READ_VALUE | ACCESS_WRITE_SETBIT);
>> + if (mmio->is_write) {
>> + vgic_update_state(vcpu->kvm);
>> + return true;
>> + }
>> +
>> + return false;
>> +}
>> +
>> +static bool handle_mmio_clear_pending_reg(struct kvm_vcpu *vcpu,
>> + struct kvm_exit_mmio *mmio, u32 offset)
>> +{
>> + u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_state,
>> + vcpu->vcpu_id, offset);
>> + vgic_reg_access(mmio, reg, offset,
>> + ACCESS_READ_VALUE | ACCESS_WRITE_CLEARBIT);
>> + if (mmio->is_write) {
>> + vgic_update_state(vcpu->kvm);
>> + return true;
>> + }
>> +
>> + return false;
>> +}
>> +
>> +static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu,
>> + struct kvm_exit_mmio *mmio, u32 offset)
>> +{
>> + u32 *reg = vgic_bytemap_get_reg(&vcpu->kvm->arch.vgic.irq_priority,
>> + vcpu->vcpu_id, offset);
>> + vgic_reg_access(mmio, reg, offset,
>> + ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
>> + return false;
>> +}
>
> What do you gain from returning a bool from the MMIO handlers? Why not assume
> that state has always been updated and kick the vcpus if something is pending?
Computing the state for the whole distributor is quite expensive (we
used to compute it all at each interrupt injection, and switching to
single-bit operations made a measurable difference).
>> +
>> +static u32 vgic_get_target_reg(struct kvm *kvm, int irq)
>> +{
>> + struct vgic_dist *dist = &kvm->arch.vgic;
>> + struct kvm_vcpu *vcpu;
>> + int i, c;
>> + unsigned long *bmap;
>> + u32 val = 0;
>> +
>> + BUG_ON(irq & 3);
>> + BUG_ON(irq < 32);
>
> Again, these look scary because I can't see the offset sanity checking for
> the MMIO traps...
Range checking again.
>> +
>> + irq -= 32;
>> +
>> + kvm_for_each_vcpu(c, vcpu, kvm) {
>> + bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]);
>> + for (i = 0; i < 4; i++)
>
> Is that 4 from sizeof(unsigned long)?
From the definition of the target registers, actually (4 bytes per
register, one byte per interrupt, one CPU per bit).
>
>> + if (test_bit(irq + i, bmap))
>> + val |= 1 << (c + i * 8);
>> + }
>> +
>> + return val;
>> +}
>> +
>> +static void vgic_set_target_reg(struct kvm *kvm, u32 val, int irq)
>> +{
>> + struct vgic_dist *dist = &kvm->arch.vgic;
>> + struct kvm_vcpu *vcpu;
>> + int i, c;
>> + unsigned long *bmap;
>> + u32 target;
>> +
>> + BUG_ON(irq & 3);
>> + BUG_ON(irq < 32);
>> +
>> + irq -= 32;
>> +
>> + /*
>> + * Pick the LSB in each byte. This ensures we target exactly
>> + * one vcpu per IRQ. If the byte is null, assume we target
>> + * CPU0.
>> + */
>> + for (i = 0; i < 4; i++) {
>> + int shift = i * 8;
>
> Is this from BITS_PER_BYTE?
Good point.
>> + target = ffs((val >> shift) & 0xffU);
>> + target = target ? (target - 1) : 0;
>
> __ffs?
Does this look better?
target = (val >> shift) & 0xffU;
if (target)
target = __ffs(target);
>
>> + dist->irq_spi_cpu[irq + i] = target;
>> + kvm_for_each_vcpu(c, vcpu, kvm) {
>> + bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]);
>> + if (c == target)
>> + set_bit(irq + i, bmap);
>> + else
>> + clear_bit(irq + i, bmap);
>> + }
>> + }
>> +}
>> +
>> +static bool handle_mmio_target_reg(struct kvm_vcpu *vcpu,
>> + struct kvm_exit_mmio *mmio, u32 offset)
>> +{
>> + u32 reg;
>> +
>> + /* We treat the banked interrupts targets as read-only */
>> + if (offset < 32) {
>> + u32 roreg = 1 << vcpu->vcpu_id;
>> + roreg |= roreg << 8;
>> + roreg |= roreg << 16;
>> +
>> + vgic_reg_access(mmio, &roreg, offset,
>> + ACCESS_READ_VALUE | ACCESS_WRITE_IGNORED);
>> + return false;
>> + }
>> +
>> + reg = vgic_get_target_reg(vcpu->kvm, offset & ~3U);
>> + vgic_reg_access(mmio, &reg, offset,
>> + ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
>> + if (mmio->is_write) {
>> + vgic_set_target_reg(vcpu->kvm, reg, offset & ~3U);
>> + vgic_update_state(vcpu->kvm);
>> + return true;
>> + }
>> +
>> + return false;
>> +}
>> +
>> +static u32 vgic_cfg_expand(u16 val)
>> +{
>> + u32 res = 0;
>> + int i;
>> +
>> + for (i = 0; i < 16; i++)
>> + res |= (val >> i) << (2 * i + 1);
>
> Ok, you've lost me on this one but replacing some of the magic numbers with
> the constants they represent would be much appreciated, please!
The comment below contains some explanation (but the code is buggy and
has been fixed already). I'll try to come up with some detailed
explanation about what this code is trying to do.
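For what it's worth, a corrected user-space model of the pair is below. ICFGR uses two bits per interrupt but bit 0 of each pair is always zero, so only the odd bit is stored; the quoted expand forgets to mask `(val >> i)` down to a single bit before shifting, which smears the remaining high bits of val across neighbouring pairs.

```c
#include <stdint.h>

/* spread each of the 16 stored bits to bit (2*i + 1) of the result */
static uint32_t vgic_cfg_expand(uint16_t val)
{
	uint32_t res = 0;
	int i;

	for (i = 0; i < 16; i++)
		res |= (uint32_t)((val >> i) & 1) << (2 * i + 1);
	return res;
}

/* inverse operation: keep only bit (2*i + 1) of each pair */
static uint16_t vgic_cfg_compress(uint32_t val)
{
	uint16_t res = 0;
	int i;

	for (i = 0; i < 16; i++)
		res |= (uint16_t)(((val >> (2 * i + 1)) & 1) << i);
	return res;
}
```

The two functions round-trip: compress(expand(x)) == x for any 16-bit x.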
>> +
>> + return res;
>> +}
>> +
>> +static u16 vgic_cfg_compress(u32 val)
>> +{
>> + u16 res = 0;
>> + int i;
>> +
>> + for (i = 0; i < 16; i++)
>> + res |= (val >> (i * 2 + 1)) << i;
>> +
>> + return res;
>> +}
>> +
>> +/*
>> + * The distributor uses 2 bits per IRQ for the CFG register, but the
>> + * LSB is always 0. As such, we only keep the upper bit, and use the
>> + * two above functions to compress/expand the bits
>> + */
>> +static bool handle_mmio_cfg_reg(struct kvm_vcpu *vcpu,
>> + struct kvm_exit_mmio *mmio, u32 offset)
>> +{
>> + u32 val;
>> + u32 *reg = vgic_bitmap_get_reg(&vcpu->kvm->arch.vgic.irq_cfg,
>> + vcpu->vcpu_id, offset >> 1);
>> + if (offset & 2)
>> + val = *reg >> 16;
>> + else
>> + val = *reg & 0xffff;
>> +
>> + val = vgic_cfg_expand(val);
>> + vgic_reg_access(mmio, &val, offset,
>> + ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
>> + if (mmio->is_write) {
>> + if (offset < 4) {
>> + *reg = ~0U; /* Force PPIs/SGIs to 1 */
>> + return false;
>> + }
>> +
>> + val = vgic_cfg_compress(val);
>> + if (offset & 2) {
>> + *reg &= 0xffff;
>> + *reg |= val << 16;
>> + } else {
>> + *reg &= 0xffff << 16;
>> + *reg |= val;
>> + }
>> + }
>> +
>> + return false;
>> +}
>> +
>> +static bool handle_mmio_sgi_reg(struct kvm_vcpu *vcpu,
>> + struct kvm_exit_mmio *mmio, u32 offset)
>> +{
>> + u32 reg;
>> + vgic_reg_access(mmio, &reg, offset,
>> + ACCESS_READ_RAZ | ACCESS_WRITE_VALUE);
>> + if (mmio->is_write) {
>> + vgic_dispatch_sgi(vcpu, reg);
>> + vgic_update_state(vcpu->kvm);
>> + return true;
>> + }
>> +
>> + return false;
>> +}
>> +
>> /* All this should be handled by kvm_bus_io_*()... FIXME!!! */
>> struct mmio_range {
>> unsigned long base;
>> @@ -110,6 +428,66 @@ struct mmio_range {
>> };
>>
>> static const struct mmio_range vgic_ranges[] = {
>> + { /* CTRL, TYPER, IIDR */
>> + .base = 0,
>> + .len = 12,
>> + .handle_mmio = handle_mmio_misc,
>> + },
>> + { /* IGROUPRn */
>> + .base = 0x80,
>> + .len = VGIC_NR_IRQS / 8,
>> + .handle_mmio = handle_mmio_raz_wi,
>> + },
>> + { /* ISENABLERn */
>> + .base = 0x100,
>> + .len = VGIC_NR_IRQS / 8,
>> + .handle_mmio = handle_mmio_set_enable_reg,
>> + },
>> + { /* ICENABLERn */
>> + .base = 0x180,
>> + .len = VGIC_NR_IRQS / 8,
>> + .handle_mmio = handle_mmio_clear_enable_reg,
>> + },
>> + { /* ISPENDRn */
>> + .base = 0x200,
>> + .len = VGIC_NR_IRQS / 8,
>> + .handle_mmio = handle_mmio_set_pending_reg,
>> + },
>> + { /* ICPENDRn */
>> + .base = 0x280,
>> + .len = VGIC_NR_IRQS / 8,
>> + .handle_mmio = handle_mmio_clear_pending_reg,
>> + },
>> + { /* ISACTIVERn */
>> + .base = 0x300,
>> + .len = VGIC_NR_IRQS / 8,
>> + .handle_mmio = handle_mmio_raz_wi,
>> + },
>> + { /* ICACTIVERn */
>> + .base = 0x380,
>> + .len = VGIC_NR_IRQS / 8,
>> + .handle_mmio = handle_mmio_raz_wi,
>> + },
>> + { /* IPRIORITYRn */
>> + .base = 0x400,
>> + .len = VGIC_NR_IRQS,
>> + .handle_mmio = handle_mmio_priority_reg,
>> + },
>> + { /* ITARGETSRn */
>> + .base = 0x800,
>> + .len = VGIC_NR_IRQS,
>> + .handle_mmio = handle_mmio_target_reg,
>> + },
>> + { /* ICFGRn */
>> + .base = 0xC00,
>> + .len = VGIC_NR_IRQS / 4,
>> + .handle_mmio = handle_mmio_cfg_reg,
>> + },
>> + { /* SGIRn */
>> + .base = 0xF00,
>> + .len = 4,
>> + .handle_mmio = handle_mmio_sgi_reg,
>> + },
>
> Why not #define the offset values for the base fields instead of commenting
> the literals?
I've had a separate discussion with RobH about that, and this is in the
pipe for the GIC move into drivers.
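The named offsets would presumably look something like the following, loosely modelled on the existing GIC_DIST_* defines (values copied from the .base literals in the quoted table; the names here are illustrative):

```c
/* GICv2 distributor register map offsets */
#define GIC_DIST_CTRL           0x000
#define GIC_DIST_CTR            0x004	/* GICD_TYPER */
#define GIC_DIST_IIDR           0x008
#define GIC_DIST_IGROUP         0x080
#define GIC_DIST_ENABLE_SET     0x100
#define GIC_DIST_ENABLE_CLEAR   0x180
#define GIC_DIST_PENDING_SET    0x200
#define GIC_DIST_PENDING_CLEAR  0x280
#define GIC_DIST_ACTIVE_SET     0x300
#define GIC_DIST_ACTIVE_CLEAR   0x380
#define GIC_DIST_PRI            0x400
#define GIC_DIST_TARGET         0x800
#define GIC_DIST_CONFIG         0xc00
#define GIC_DIST_SOFTINT        0xf00
```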
M.
--
Jazz is not dead. It just smells funny...
* [PATCH v4 02/13] ARM: KVM: Keep track of currently running vcpus
2012-11-28 12:47 ` Will Deacon
2012-11-28 13:15 ` Marc Zyngier
@ 2012-11-30 22:39 ` Christoffer Dall
1 sibling, 0 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-11-30 22:39 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Nov 28, 2012 at 7:47 AM, Will Deacon <will.deacon@arm.com> wrote:
> Just a bunch of typos in this one :)
>
> On Sat, Nov 10, 2012 at 03:44:30PM +0000, Christoffer Dall wrote:
>> From: Marc Zyngier <marc.zyngier@arm.com>
>>
>> When an interrupt occurs for the guest, it is sometimes necessary
>> to find out which vcpu was running at that point.
>>
>> Keep track of which vcpu is being tun in kvm_arch_vcpu_ioctl_run(),
>
> run
>
>> and allow the data to be retrived using either:
>
> retrieved
>
>> - kvm_arm_get_running_vcpu(): returns the vcpu running at this point
>> on the current CPU. Can only be used in a non-preemptable context.
>
> preemptible
>
>> - kvm_arm_get_running_vcpus(): returns the per-CPU variable holding
>> the the running vcpus, useable for per-CPU interrupts.
>
> -the
> usable
>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>> ---
>> arch/arm/include/asm/kvm_host.h | 10 ++++++++++
>> arch/arm/kvm/arm.c | 30 ++++++++++++++++++++++++++++++
>> 2 files changed, 40 insertions(+)
>>
>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> index e7fc249..e66cd56 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -154,4 +154,14 @@ static inline int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
>> {
>> return 0;
>> }
>> +
>> +struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
>> +struct kvm_vcpu __percpu **kvm_get_running_vcpus(void);
>
> DECLARE_PER_CPU?
>
for a function prototype, how does that work?
>> +int kvm_arm_copy_coproc_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
>> +unsigned long kvm_arm_num_coproc_regs(struct kvm_vcpu *vcpu);
>> +struct kvm_one_reg;
>> +int kvm_arm_coproc_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
>> +int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
>> +
>> #endif /* __ARM_KVM_HOST_H__ */
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 2cdc07b..60b119a 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -53,11 +53,38 @@ static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
>> static struct vfp_hard_struct __percpu *kvm_host_vfp_state;
>> static unsigned long hyp_default_vectors;
>>
>> +/* Per-CPU variable containing the currently running vcpu. */
>> +static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
>> +
>> /* The VMID used in the VTTBR */
>> static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
>> static u8 kvm_next_vmid;
>> static DEFINE_SPINLOCK(kvm_vmid_lock);
>>
>> +static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
>> +{
>> + BUG_ON(preemptible());
>> + __get_cpu_var(kvm_arm_running_vcpu) = vcpu;
>> +}
>> +
>> +/**
>> + * kvm_arm_get_running_vcpu - get the vcpu running on the current CPU.
>> + * Must be called from non-preemptible context
>> + */
>> +struct kvm_vcpu *kvm_arm_get_running_vcpu(void)
>> +{
>> + BUG_ON(preemptible());
>> + return __get_cpu_var(kvm_arm_running_vcpu);
>> +}
>> +
>> +/**
>> + * kvm_arm_get_running_vcpus - get the per-CPU array on currently running vcpus.
>> + */
>
> s/on/of/ ?
>
typos fixed.
Thanks,
-Christoffer
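For readers following along, the per-CPU running-vcpu tracking discussed above can be sketched in plain userspace C. This is an illustrative model only: the array indexed by CPU id is a hypothetical stand-in for the kernel's DEFINE_PER_CPU/__get_cpu_var machinery, and the function names mirror (but are not) the kernel ones.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

struct kvm_vcpu { int vcpu_id; };

/* Stand-in for DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu):
 * one slot per CPU, zero-initialised to NULL. */
static struct kvm_vcpu *running_vcpu[NR_CPUS];

/* Models kvm_arm_set_running_vcpu(): record which vcpu runs on this CPU.
 * In the kernel this must be called with preemption disabled so the
 * caller cannot migrate between reading the CPU id and the store. */
static void set_running_vcpu(int cpu, struct kvm_vcpu *vcpu)
{
	running_vcpu[cpu] = vcpu;
}

/* Models kvm_arm_get_running_vcpu(): only meaningful while the caller
 * is pinned to the CPU (non-preemptible context in the real code). */
static struct kvm_vcpu *get_running_vcpu(int cpu)
{
	return running_vcpu[cpu];
}
```

A caller would set the slot on vcpu_load, read it from interrupt context, and clear it (set NULL) on vcpu_put.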
* [PATCH v4 03/13] ARM: KVM: Initial VGIC infrastructure support
2012-11-28 12:49 ` Will Deacon
2012-11-28 13:09 ` Marc Zyngier
@ 2012-12-01 2:19 ` Christoffer Dall
1 sibling, 0 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-12-01 2:19 UTC (permalink / raw)
To: linux-arm-kernel
[...]
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 2fb7319..665af96 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -1880,12 +1880,13 @@ static long kvm_vcpu_ioctl(struct file *filp,
>> if (vcpu->kvm->mm != current->mm)
>> return -EIO;
>>
>> -#if defined(CONFIG_S390) || defined(CONFIG_PPC)
>> +#if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_ARM)
>> /*
>> * Special cases: vcpu ioctls that are asynchronous to vcpu execution,
>> * so vcpu_load() would break it.
>> */
>> - if (ioctl == KVM_S390_INTERRUPT || ioctl == KVM_INTERRUPT)
>> + if (ioctl == KVM_S390_INTERRUPT || ioctl == KVM_INTERRUPT ||
>> + ioctl == KVM_IRQ_LINE)
>> return kvm_arch_vcpu_ioctl(filp, ioctl, arg);
>> #endif
>
> Separate patch?
>
KVM_IRQ_LINE is a VM ioctl, not a vcpu ioctl, so this hunk is left over
from the earlier mess of exposing the same ioctl on both the VCPU and
the VM.
It's gone!
-Christoffer
* [PATCH v4 05/13] ARM: KVM: VGIC accept vcpu and dist base addresses from user space
2012-11-28 13:11 ` Will Deacon
2012-11-28 13:22 ` [kvmarm] " Marc Zyngier
@ 2012-12-01 2:52 ` Christoffer Dall
2012-12-01 15:57 ` Christoffer Dall
2012-12-03 10:40 ` Will Deacon
1 sibling, 2 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-12-01 2:52 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Nov 28, 2012 at 8:11 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Sat, Nov 10, 2012 at 03:44:51PM +0000, Christoffer Dall wrote:
>> User space defines the model to emulate to a guest and should therefore
>> decide which addresses are used for both the virtual CPU interface
>> directly mapped in the guest physical address space and for the emulated
>> distributor interface, which is mapped in software by the in-kernel VGIC
>> support.
>>
>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>> ---
>> arch/arm/include/asm/kvm_mmu.h | 2 +
>> arch/arm/include/asm/kvm_vgic.h | 9 ++++++
>> arch/arm/kvm/arm.c | 16 ++++++++++
>> arch/arm/kvm/vgic.c | 61 +++++++++++++++++++++++++++++++++++++++
>> 4 files changed, 87 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index 9bd0508..0800531 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -26,6 +26,8 @@
>> * To save a bit of memory and to avoid alignment issues we assume 39-bit IPA
>> * for now, but remember that the level-1 table must be aligned to its size.
>> */
>> +#define KVM_PHYS_SHIFT (38)
>
> Seems a bit small...
>
>> +#define KVM_PHYS_MASK ((1ULL << KVM_PHYS_SHIFT) - 1)
>> #define PTRS_PER_PGD2 512
>> #define PGD2_ORDER get_order(PTRS_PER_PGD2 * sizeof(pgd_t))
>>
>> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
>> index b444ecf..9ca8d21 100644
>> --- a/arch/arm/include/asm/kvm_vgic.h
>> +++ b/arch/arm/include/asm/kvm_vgic.h
>> @@ -20,6 +20,9 @@
>> #define __ASM_ARM_KVM_VGIC_H
>>
>> struct vgic_dist {
>> + /* Distributor and vcpu interface mapping in the guest */
>> + phys_addr_t vgic_dist_base;
>> + phys_addr_t vgic_cpu_base;
>> };
>>
>> struct vgic_cpu {
>> @@ -31,6 +34,7 @@ struct kvm_run;
>> struct kvm_exit_mmio;
>>
>> #ifdef CONFIG_KVM_ARM_VGIC
>> +int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
>> bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>> struct kvm_exit_mmio *mmio);
>>
>> @@ -40,6 +44,11 @@ static inline int kvm_vgic_hyp_init(void)
>> return 0;
>> }
>>
>> +static inline int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
>> +{
>> + return 0;
>> +}
>> +
>> static inline int kvm_vgic_init(struct kvm *kvm)
>> {
>> return 0;
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 426828a..3ac1aab 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -61,6 +61,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
>> static u8 kvm_next_vmid;
>> static DEFINE_SPINLOCK(kvm_vmid_lock);
>>
>> +static bool vgic_present;
>> +
>> static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
>> {
>> BUG_ON(preemptible());
>> @@ -825,7 +827,19 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>> static int kvm_vm_ioctl_set_device_address(struct kvm *kvm,
>> struct kvm_device_address *dev_addr)
>> {
>> - return -ENODEV;
>> + unsigned long dev_id, type;
>> +
>> + dev_id = (dev_addr->id & KVM_DEVICE_ID_MASK) >> KVM_DEVICE_ID_SHIFT;
>> + type = (dev_addr->id & KVM_DEVICE_TYPE_MASK) >> KVM_DEVICE_TYPE_SHIFT;
>> +
>> + switch (dev_id) {
>> + case KVM_ARM_DEVICE_VGIC_V2:
>> + if (!vgic_present)
>> + return -ENXIO;
>> + return kvm_vgic_set_addr(kvm, type, dev_addr->addr);
>> + default:
>> + return -ENODEV;
>> + }
>> }
>>
>> long kvm_arch_vm_ioctl(struct file *filp,
>> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
>> index 26ada3b..f85b275 100644
>> --- a/arch/arm/kvm/vgic.c
>> +++ b/arch/arm/kvm/vgic.c
>> @@ -22,6 +22,13 @@
>> #include <linux/io.h>
>> #include <asm/kvm_emulate.h>
>>
>> +#define VGIC_ADDR_UNDEF (-1)
>> +#define IS_VGIC_ADDR_UNDEF(_x) ((_x) == (typeof(_x))VGIC_ADDR_UNDEF)
>> +
>> +#define VGIC_DIST_SIZE 0x1000
>> +#define VGIC_CPU_SIZE 0x2000
>
> These defines might be useful to userspace so that they don't request the
> distributor and the cpu interface to be place too close together (been there,
> done that :).
>
>> +
>> +
>> #define ACCESS_READ_VALUE (1 << 0)
>> #define ACCESS_READ_RAZ (0 << 0)
>> #define ACCESS_READ_MASK(x) ((x) & (1 << 0))
>> @@ -136,3 +143,57 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exi
>> {
>> return KVM_EXIT_MMIO;
>> }
>> +
>> +static bool vgic_ioaddr_overlap(struct kvm *kvm)
>> +{
>> + phys_addr_t dist = kvm->arch.vgic.vgic_dist_base;
>> + phys_addr_t cpu = kvm->arch.vgic.vgic_cpu_base;
>> +
>> + if (IS_VGIC_ADDR_UNDEF(dist) || IS_VGIC_ADDR_UNDEF(cpu))
>> + return false;
>> + if ((dist <= cpu && dist + VGIC_DIST_SIZE > cpu) ||
>> + (cpu <= dist && cpu + VGIC_CPU_SIZE > dist))
>> + return true;
>> + return false;
>
> Just return the predicate that you're testing.
>
>> +}
>> +
>> +int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
>> +{
>> + int r = 0;
>> + struct vgic_dist *vgic = &kvm->arch.vgic;
>> +
>> + if (addr & ~KVM_PHYS_MASK)
>> + return -E2BIG;
>> +
>> + if (addr & ~PAGE_MASK)
>> + return -EINVAL;
>> +
>> + mutex_lock(&kvm->lock);
>> + switch (type) {
>> + case KVM_VGIC_V2_ADDR_TYPE_DIST:
>> + if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_dist_base))
>> + return -EEXIST;
>> + if (addr + VGIC_DIST_SIZE < addr)
>> + return -EINVAL;
>
> I think somebody else pointed out the missing mutex_unlocks on the failure
> paths.
>
>> + kvm->arch.vgic.vgic_dist_base = addr;
>> + break;
>> + case KVM_VGIC_V2_ADDR_TYPE_CPU:
>> + if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_cpu_base))
>> + return -EEXIST;
>> + if (addr + VGIC_CPU_SIZE < addr)
>> + return -EINVAL;
>> + kvm->arch.vgic.vgic_cpu_base = addr;
>> + break;
>> + default:
>> + r = -ENODEV;
>> + }
>> +
>> + if (vgic_ioaddr_overlap(kvm)) {
>> + kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
>> + kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
>> + return -EINVAL;
>
> Perhaps we could put all the address checking in one place, so that the wrapping
> round zero checks and the overlap checks can be in the same function?
>
Like this (?):
diff --git a/Documentation/virtual/kvm/api.txt
b/Documentation/virtual/kvm/api.txt
index 7f057a2..0b6b95e 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2163,6 +2163,7 @@ Errors:
ENXIO: Device not supported on current system
EEXIST: Address already set
E2BIG: Address outside guest physical address space
+ EBUSY: Address overlaps with other device range
struct kvm_device_address {
__u64 id;
diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
index f697c14..660fe24 100644
--- a/arch/arm/kvm/vgic.c
+++ b/arch/arm/kvm/vgic.c
@@ -1230,11 +1230,28 @@ static bool vgic_ioaddr_overlap(struct kvm *kvm)
phys_addr_t cpu = kvm->arch.vgic.vgic_cpu_base;
if (IS_VGIC_ADDR_UNDEF(dist) || IS_VGIC_ADDR_UNDEF(cpu))
- return false;
+ return 0;
if ((dist <= cpu && dist + VGIC_DIST_SIZE > cpu) ||
(cpu <= dist && cpu + VGIC_CPU_SIZE > dist))
- return true;
- return false;
+ return -EBUSY;
+ return 0;
+}
+
+static int vgic_ioaddr_assign(struct kvm *kvm, phys_addr_t *ioaddr,
+ phys_addr_t addr, phys_addr_t size)
+{
+ int ret;
+
+ if (!IS_VGIC_ADDR_UNDEF(*ioaddr))
+ return -EEXIST;
+ if (addr + size < addr)
+ return -EINVAL;
+
+ ret = vgic_ioaddr_overlap(kvm);
+ if (ret)
+ return ret;
+ *ioaddr = addr;
+ return ret;
}
int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
@@ -1251,29 +1268,21 @@ int kvm_vgic_set_addr(struct kvm *kvm,
unsigned long type, u64 addr)
mutex_lock(&kvm->lock);
switch (type) {
case KVM_VGIC_V2_ADDR_TYPE_DIST:
- if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_dist_base))
- return -EEXIST;
- if (addr + VGIC_DIST_SIZE < addr)
- return -EINVAL;
- kvm->arch.vgic.vgic_dist_base = addr;
+ r = vgic_ioaddr_assign(kvm, &vgic->vgic_dist_base,
+ addr, VGIC_DIST_SIZE);
break;
case KVM_VGIC_V2_ADDR_TYPE_CPU:
if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_cpu_base))
return -EEXIST;
if (addr + VGIC_CPU_SIZE < addr)
return -EINVAL;
- kvm->arch.vgic.vgic_cpu_base = addr;
+ r = vgic_ioaddr_assign(kvm, &vgic->vgic_cpu_base,
+ addr, VGIC_CPU_SIZE);
break;
default:
r = -ENODEV;
}
- if (vgic_ioaddr_overlap(kvm)) {
- kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
- kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
- r = -EINVAL;
- }
-
mutex_unlock(&kvm->lock);
return r;
}
--
Thanks,
-Christoffer
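As an aside, the overlap test at the heart of vgic_ioaddr_overlap() is the standard half-open-interval intersection check, and the suggestion to "return the predicate" reduces it to a single expression. A userspace sketch (with the sizes from the patch; the helper name is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

#define VGIC_DIST_SIZE 0x1000
#define VGIC_CPU_SIZE  0x2000

/* Two half-open ranges [dist, dist + VGIC_DIST_SIZE) and
 * [cpu, cpu + VGIC_CPU_SIZE) intersect iff each one starts
 * strictly before the other ends. */
static int ranges_overlap(uint64_t dist, uint64_t cpu)
{
	return dist < cpu + VGIC_CPU_SIZE && cpu < dist + VGIC_DIST_SIZE;
}
```

Adjacent ranges (cpu base exactly at dist + VGIC_DIST_SIZE, or vice versa) do not overlap, which is the placement userspace typically wants.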
* [PATCH v4 05/13] ARM: KVM: VGIC accept vcpu and dist base addresses from user space
2012-12-01 2:52 ` Christoffer Dall
@ 2012-12-01 15:57 ` Christoffer Dall
2012-12-03 10:40 ` Will Deacon
1 sibling, 0 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-12-01 15:57 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Nov 30, 2012 at 9:52 PM, Christoffer Dall
<c.dall@virtualopensystems.com> wrote:
> On Wed, Nov 28, 2012 at 8:11 AM, Will Deacon <will.deacon@arm.com> wrote:
>> On Sat, Nov 10, 2012 at 03:44:51PM +0000, Christoffer Dall wrote:
>>> User space defines the model to emulate to a guest and should therefore
>>> decide which addresses are used for both the virtual CPU interface
>>> directly mapped in the guest physical address space and for the emulated
>>> distributor interface, which is mapped in software by the in-kernel VGIC
>>> support.
>>>
>>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>>> ---
>>> arch/arm/include/asm/kvm_mmu.h | 2 +
>>> arch/arm/include/asm/kvm_vgic.h | 9 ++++++
>>> arch/arm/kvm/arm.c | 16 ++++++++++
>>> arch/arm/kvm/vgic.c | 61 +++++++++++++++++++++++++++++++++++++++
>>> 4 files changed, 87 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>>> index 9bd0508..0800531 100644
>>> --- a/arch/arm/include/asm/kvm_mmu.h
>>> +++ b/arch/arm/include/asm/kvm_mmu.h
>>> @@ -26,6 +26,8 @@
>>> * To save a bit of memory and to avoid alignment issues we assume 39-bit IPA
>>> * for now, but remember that the level-1 table must be aligned to its size.
>>> */
>>> +#define KVM_PHYS_SHIFT (38)
>>
>> Seems a bit small...
>>
>>> +#define KVM_PHYS_MASK ((1ULL << KVM_PHYS_SHIFT) - 1)
>>> #define PTRS_PER_PGD2 512
>>> #define PGD2_ORDER get_order(PTRS_PER_PGD2 * sizeof(pgd_t))
>>>
>>> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
>>> index b444ecf..9ca8d21 100644
>>> --- a/arch/arm/include/asm/kvm_vgic.h
>>> +++ b/arch/arm/include/asm/kvm_vgic.h
>>> @@ -20,6 +20,9 @@
>>> #define __ASM_ARM_KVM_VGIC_H
>>>
>>> struct vgic_dist {
>>> + /* Distributor and vcpu interface mapping in the guest */
>>> + phys_addr_t vgic_dist_base;
>>> + phys_addr_t vgic_cpu_base;
>>> };
>>>
>>> struct vgic_cpu {
>>> @@ -31,6 +34,7 @@ struct kvm_run;
>>> struct kvm_exit_mmio;
>>>
>>> #ifdef CONFIG_KVM_ARM_VGIC
>>> +int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
>>> bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>>> struct kvm_exit_mmio *mmio);
>>>
>>> @@ -40,6 +44,11 @@ static inline int kvm_vgic_hyp_init(void)
>>> return 0;
>>> }
>>>
>>> +static inline int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
>>> +{
>>> + return 0;
>>> +}
>>> +
>>> static inline int kvm_vgic_init(struct kvm *kvm)
>>> {
>>> return 0;
>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>> index 426828a..3ac1aab 100644
>>> --- a/arch/arm/kvm/arm.c
>>> +++ b/arch/arm/kvm/arm.c
>>> @@ -61,6 +61,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
>>> static u8 kvm_next_vmid;
>>> static DEFINE_SPINLOCK(kvm_vmid_lock);
>>>
>>> +static bool vgic_present;
>>> +
>>> static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
>>> {
>>> BUG_ON(preemptible());
>>> @@ -825,7 +827,19 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>>> static int kvm_vm_ioctl_set_device_address(struct kvm *kvm,
>>> struct kvm_device_address *dev_addr)
>>> {
>>> - return -ENODEV;
>>> + unsigned long dev_id, type;
>>> +
>>> + dev_id = (dev_addr->id & KVM_DEVICE_ID_MASK) >> KVM_DEVICE_ID_SHIFT;
>>> + type = (dev_addr->id & KVM_DEVICE_TYPE_MASK) >> KVM_DEVICE_TYPE_SHIFT;
>>> +
>>> + switch (dev_id) {
>>> + case KVM_ARM_DEVICE_VGIC_V2:
>>> + if (!vgic_present)
>>> + return -ENXIO;
>>> + return kvm_vgic_set_addr(kvm, type, dev_addr->addr);
>>> + default:
>>> + return -ENODEV;
>>> + }
>>> }
>>>
>>> long kvm_arch_vm_ioctl(struct file *filp,
>>> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
>>> index 26ada3b..f85b275 100644
>>> --- a/arch/arm/kvm/vgic.c
>>> +++ b/arch/arm/kvm/vgic.c
>>> @@ -22,6 +22,13 @@
>>> #include <linux/io.h>
>>> #include <asm/kvm_emulate.h>
>>>
>>> +#define VGIC_ADDR_UNDEF (-1)
>>> +#define IS_VGIC_ADDR_UNDEF(_x) ((_x) == (typeof(_x))VGIC_ADDR_UNDEF)
>>> +
>>> +#define VGIC_DIST_SIZE 0x1000
>>> +#define VGIC_CPU_SIZE 0x2000
>>
>> These defines might be useful to userspace so that they don't request the
>> distributor and the cpu interface to be place too close together (been there,
>> done that :).
>>
>>> +
>>> +
>>> #define ACCESS_READ_VALUE (1 << 0)
>>> #define ACCESS_READ_RAZ (0 << 0)
>>> #define ACCESS_READ_MASK(x) ((x) & (1 << 0))
>>> @@ -136,3 +143,57 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exi
>>> {
>>> return KVM_EXIT_MMIO;
>>> }
>>> +
>>> +static bool vgic_ioaddr_overlap(struct kvm *kvm)
>>> +{
>>> + phys_addr_t dist = kvm->arch.vgic.vgic_dist_base;
>>> + phys_addr_t cpu = kvm->arch.vgic.vgic_cpu_base;
>>> +
>>> + if (IS_VGIC_ADDR_UNDEF(dist) || IS_VGIC_ADDR_UNDEF(cpu))
>>> + return false;
>>> + if ((dist <= cpu && dist + VGIC_DIST_SIZE > cpu) ||
>>> + (cpu <= dist && cpu + VGIC_CPU_SIZE > dist))
>>> + return true;
>>> + return false;
>>
>> Just return the predicate that you're testing.
>>
>>> +}
>>> +
>>> +int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
>>> +{
>>> + int r = 0;
>>> + struct vgic_dist *vgic = &kvm->arch.vgic;
>>> +
>>> + if (addr & ~KVM_PHYS_MASK)
>>> + return -E2BIG;
>>> +
>>> + if (addr & ~PAGE_MASK)
>>> + return -EINVAL;
>>> +
>>> + mutex_lock(&kvm->lock);
>>> + switch (type) {
>>> + case KVM_VGIC_V2_ADDR_TYPE_DIST:
>>> + if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_dist_base))
>>> + return -EEXIST;
>>> + if (addr + VGIC_DIST_SIZE < addr)
>>> + return -EINVAL;
>>
>> I think somebody else pointed out the missing mutex_unlocks on the failure
>> paths.
>>
>>> + kvm->arch.vgic.vgic_dist_base = addr;
>>> + break;
>>> + case KVM_VGIC_V2_ADDR_TYPE_CPU:
>>> + if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_cpu_base))
>>> + return -EEXIST;
>>> + if (addr + VGIC_CPU_SIZE < addr)
>>> + return -EINVAL;
>>> + kvm->arch.vgic.vgic_cpu_base = addr;
>>> + break;
>>> + default:
>>> + r = -ENODEV;
>>> + }
>>> +
>>> + if (vgic_ioaddr_overlap(kvm)) {
>>> + kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
>>> + kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
>>> + return -EINVAL;
>>
>> Perhaps we could put all the address checking in one place, so that the wrapping
>> round zero checks and the overlap checks can be in the same function?
>>
>
> Like this (?):
>
and by this, I mean this:
diff --git a/Documentation/virtual/kvm/api.txt
b/Documentation/virtual/kvm/api.txt
index 7f057a2..0b6b95e 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2163,6 +2163,7 @@ Errors:
ENXIO: Device not supported on current system
EEXIST: Address already set
E2BIG: Address outside guest physical address space
+ EBUSY: Address overlaps with other device range
struct kvm_device_address {
__u64 id;
diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
index f697c14..c666b95 100644
--- a/arch/arm/kvm/vgic.c
+++ b/arch/arm/kvm/vgic.c
@@ -1230,11 +1230,28 @@ static bool vgic_ioaddr_overlap(struct kvm *kvm)
phys_addr_t cpu = kvm->arch.vgic.vgic_cpu_base;
if (IS_VGIC_ADDR_UNDEF(dist) || IS_VGIC_ADDR_UNDEF(cpu))
- return false;
+ return 0;
if ((dist <= cpu && dist + VGIC_DIST_SIZE > cpu) ||
(cpu <= dist && cpu + VGIC_CPU_SIZE > dist))
- return true;
- return false;
+ return -EBUSY;
+ return 0;
+}
+
+static int vgic_ioaddr_assign(struct kvm *kvm, phys_addr_t *ioaddr,
+ phys_addr_t addr, phys_addr_t size)
+{
+ int ret;
+
+ if (!IS_VGIC_ADDR_UNDEF(*ioaddr))
+ return -EEXIST;
+ if (addr + size < addr)
+ return -EINVAL;
+
+ ret = vgic_ioaddr_overlap(kvm);
+ if (ret)
+ return ret;
+ *ioaddr = addr;
+ return ret;
}
int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
@@ -1251,29 +1268,17 @@ int kvm_vgic_set_addr(struct kvm *kvm,
unsigned long type, u64 addr)
mutex_lock(&kvm->lock);
switch (type) {
case KVM_VGIC_V2_ADDR_TYPE_DIST:
- if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_dist_base))
- return -EEXIST;
- if (addr + VGIC_DIST_SIZE < addr)
- return -EINVAL;
- kvm->arch.vgic.vgic_dist_base = addr;
+ r = vgic_ioaddr_assign(kvm, &vgic->vgic_dist_base,
+ addr, VGIC_DIST_SIZE);
break;
case KVM_VGIC_V2_ADDR_TYPE_CPU:
- if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_cpu_base))
- return -EEXIST;
- if (addr + VGIC_CPU_SIZE < addr)
- return -EINVAL;
- kvm->arch.vgic.vgic_cpu_base = addr;
+ r = vgic_ioaddr_assign(kvm, &vgic->vgic_cpu_base,
+ addr, VGIC_CPU_SIZE);
break;
default:
r = -ENODEV;
}
- if (vgic_ioaddr_overlap(kvm)) {
- kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
- kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
- r = -EINVAL;
- }
-
mutex_unlock(&kvm->lock);
return r;
}
--
Thanks,
-Christoffer
* [PATCH v4 05/13] ARM: KVM: VGIC accept vcpu and dist base addresses from user space
2012-12-01 2:52 ` Christoffer Dall
2012-12-01 15:57 ` Christoffer Dall
@ 2012-12-03 10:40 ` Will Deacon
1 sibling, 0 replies; 58+ messages in thread
From: Will Deacon @ 2012-12-03 10:40 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Dec 01, 2012 at 02:52:13AM +0000, Christoffer Dall wrote:
> On Wed, Nov 28, 2012 at 8:11 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Sat, Nov 10, 2012 at 03:44:51PM +0000, Christoffer Dall wrote:
> >> + kvm->arch.vgic.vgic_dist_base = addr;
> >> + break;
> >> + case KVM_VGIC_V2_ADDR_TYPE_CPU:
> >> + if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_cpu_base))
> >> + return -EEXIST;
> >> + if (addr + VGIC_CPU_SIZE < addr)
> >> + return -EINVAL;
> >> + kvm->arch.vgic.vgic_cpu_base = addr;
> >> + break;
> >> + default:
> >> + r = -ENODEV;
> >> + }
> >> +
> >> + if (vgic_ioaddr_overlap(kvm)) {
> >> + kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
> >> + kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
> >> + return -EINVAL;
> >
> > Perhaps we could put all the address checking in one place, so that the wrapping
> > round zero checks and the overlap checks can be in the same function?
> >
>
> Like this (?):
Almost:
> diff --git a/Documentation/virtual/kvm/api.txt
> b/Documentation/virtual/kvm/api.txt
> index 7f057a2..0b6b95e 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -2163,6 +2163,7 @@ Errors:
> ENXIO: Device not supported on current system
> EEXIST: Address already set
> E2BIG: Address outside guest physical address space
> + EBUSY: Address overlaps with other device range
>
> struct kvm_device_address {
> __u64 id;
> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
> index f697c14..660fe24 100644
> --- a/arch/arm/kvm/vgic.c
> +++ b/arch/arm/kvm/vgic.c
> @@ -1230,11 +1230,28 @@ static bool vgic_ioaddr_overlap(struct kvm *kvm)
> phys_addr_t cpu = kvm->arch.vgic.vgic_cpu_base;
>
> if (IS_VGIC_ADDR_UNDEF(dist) || IS_VGIC_ADDR_UNDEF(cpu))
> - return false;
> + return 0;
> if ((dist <= cpu && dist + VGIC_DIST_SIZE > cpu) ||
> (cpu <= dist && cpu + VGIC_CPU_SIZE > dist))
> - return true;
> - return false;
> + return -EBUSY;
> + return 0;
> +}
> +
> +static int vgic_ioaddr_assign(struct kvm *kvm, phys_addr_t *ioaddr,
> + phys_addr_t addr, phys_addr_t size)
> +{
> + int ret;
> +
> + if (!IS_VGIC_ADDR_UNDEF(*ioaddr))
> + return -EEXIST;
> + if (addr + size < addr)
> + return -EINVAL;
> +
> + ret = vgic_ioaddr_overlap(kvm);
> + if (ret)
> + return ret;
> + *ioaddr = addr;
> + return ret;
> }
>
> int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr)
> @@ -1251,29 +1268,21 @@ int kvm_vgic_set_addr(struct kvm *kvm,
> unsigned long type, u64 addr)
> mutex_lock(&kvm->lock);
> switch (type) {
> case KVM_VGIC_V2_ADDR_TYPE_DIST:
> - if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_dist_base))
> - return -EEXIST;
> - if (addr + VGIC_DIST_SIZE < addr)
> - return -EINVAL;
> - kvm->arch.vgic.vgic_dist_base = addr;
> + r = vgic_ioaddr_assign(kvm, &vgic->vgic_dist_base,
> + addr, VGIC_DIST_SIZE);
> break;
> case KVM_VGIC_V2_ADDR_TYPE_CPU:
> if (!IS_VGIC_ADDR_UNDEF(vgic->vgic_cpu_base))
> return -EEXIST;
> if (addr + VGIC_CPU_SIZE < addr)
> return -EINVAL;
> - kvm->arch.vgic.vgic_cpu_base = addr;
> + r = vgic_ioaddr_assign(kvm, &vgic->vgic_cpu_base,
> + addr, VGIC_CPU_SIZE);
You've left the checking in here, so now everything is checked twice.
Also, you should probably touch base with Marc as I'm under the impression
that both of you are looking at addressing my comments so it would be good
to avoid tripping over each other on this.
Will
* [PATCH v4 07/13] ARM: KVM: VGIC virtual CPU interface management
2012-11-10 15:45 ` [PATCH v4 07/13] ARM: KVM: VGIC virtual CPU interface management Christoffer Dall
@ 2012-12-03 13:23 ` Will Deacon
2012-12-03 14:11 ` Marc Zyngier
0 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2012-12-03 13:23 UTC (permalink / raw)
To: linux-arm-kernel
Hi Marc,
I've managed to look at some more of the vgic code, so here is some more
feedback. I've still not got to the end of the series, but there's light at
the end of the tunnel...
On Sat, Nov 10, 2012 at 03:45:05PM +0000, Christoffer Dall wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
>
> Add VGIC virtual CPU interface code, picking pending interrupts
> from the distributor and stashing them in the VGIC control interface
> list registers.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> ---
> arch/arm/include/asm/kvm_vgic.h | 41 +++++++
> arch/arm/kvm/vgic.c | 226 +++++++++++++++++++++++++++++++++++++++
> 2 files changed, 266 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
> index 9e60b1d..7229324 100644
> --- a/arch/arm/include/asm/kvm_vgic.h
> +++ b/arch/arm/include/asm/kvm_vgic.h
> @@ -193,8 +193,45 @@ struct vgic_dist {
> };
>
> struct vgic_cpu {
> +#ifdef CONFIG_KVM_ARM_VGIC
> + /* per IRQ to LR mapping */
> + u8 vgic_irq_lr_map[VGIC_NR_IRQS];
per IRQ?
> +
> + /* Pending interrupts on this VCPU */
> + DECLARE_BITMAP( pending, VGIC_NR_IRQS);
> +
> + /* Bitmap of used/free list registers */
> + DECLARE_BITMAP( lr_used, 64);
> +
> + /* Number of list registers on this CPU */
> + int nr_lr;
> +
> + /* CPU vif control registers for world switch */
> + u32 vgic_hcr;
> + u32 vgic_vmcr;
> + u32 vgic_misr; /* Saved only */
> + u32 vgic_eisr[2]; /* Saved only */
> + u32 vgic_elrsr[2]; /* Saved only */
> + u32 vgic_apr;
> + u32 vgic_lr[64]; /* A15 has only 4... */
> +#endif
> };
Looks like we should have a #define for the maximum number of list registers,
so we keep vgic_lr and lr_used in sync.
>
> +#define VGIC_HCR_EN (1 << 0)
> +#define VGIC_HCR_UIE (1 << 1)
> +
> +#define VGIC_LR_VIRTUALID (0x3ff << 0)
> +#define VGIC_LR_PHYSID_CPUID (7 << 10)
> +#define VGIC_LR_STATE (3 << 28)
> +#define VGIC_LR_PENDING_BIT (1 << 28)
> +#define VGIC_LR_ACTIVE_BIT (1 << 29)
> +#define VGIC_LR_EOI (1 << 19)
> +
> +#define VGIC_MISR_EOI (1 << 0)
> +#define VGIC_MISR_U (1 << 1)
> +
> +#define LR_EMPTY 0xff
> +
Could stick these in asm/hardware/gic.h. I know they're not used by the gic
driver, but they're the same piece of architecture so it's probably worth
keeping in one place.
You'd probably also want a s/VGIC/GICH/
> struct kvm;
> struct kvm_vcpu;
> struct kvm_run;
> @@ -202,9 +239,13 @@ struct kvm_exit_mmio;
>
> #ifdef CONFIG_KVM_ARM_VGIC
> int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
> +void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu);
> +void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu);
> +int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> struct kvm_exit_mmio *mmio);
>
> +#define irqchip_in_kernel(k) (!!((k)->arch.vgic.vctrl_base))
> #else
> static inline int kvm_vgic_hyp_init(void)
> {
> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
> index 82feee8..d7cdec5 100644
> --- a/arch/arm/kvm/vgic.c
> +++ b/arch/arm/kvm/vgic.c
> @@ -587,7 +587,25 @@ static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg)
>
> static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
> {
> - return 0;
> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> + unsigned long *pending, *enabled, *pend;
> + int vcpu_id;
> +
> + vcpu_id = vcpu->vcpu_id;
> + pend = vcpu->arch.vgic_cpu.pending;
> +
> + pending = vgic_bitmap_get_cpu_map(&dist->irq_state, vcpu_id);
> + enabled = vgic_bitmap_get_cpu_map(&dist->irq_enabled, vcpu_id);
> + bitmap_and(pend, pending, enabled, 32);
pend and pending! vcpu_pending and dist_pending?
> +
> + pending = vgic_bitmap_get_shared_map(&dist->irq_state);
> + enabled = vgic_bitmap_get_shared_map(&dist->irq_enabled);
> + bitmap_and(pend + 1, pending, enabled, VGIC_NR_SHARED_IRQS);
> + bitmap_and(pend + 1, pend + 1,
> + vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]),
> + VGIC_NR_SHARED_IRQS);
> +
> + return (find_first_bit(pend, VGIC_NR_IRQS) < VGIC_NR_IRQS);
> }
>
> /*
> @@ -613,6 +631,212 @@ static void vgic_update_state(struct kvm *kvm)
> }
> }
>
> +#define LR_PHYSID(lr) (((lr) & VGIC_LR_PHYSID_CPUID) >> 10)
Is VGIC_LR_PHYSID_CPUID wide enough for this? The CPUID is only 3 bits, but
the interrupt ID could be larger. Or do you not supported hardware interrupt
forwarding? (in which case, LR_PHYSID is a misleading name).
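To make the width concern concrete, here is the bit layout the macros in the patch assume, extracted in standalone C. Note that VGIC_LR_PHYSID_CPUID masks only 3 bits at bit 10 (enough for an SGI source CPU id, not a full hardware interrupt ID), which is exactly Will's point about the LR_PHYSID name:

```c
#include <assert.h>
#include <stdint.h>

/* Field masks from the patch: bits [9:0] virtual ID, bits [12:10]
 * the (3-bit) source CPUID, bit 28 the pending state. */
#define VGIC_LR_VIRTUALID	(0x3ff << 0)
#define VGIC_LR_PHYSID_CPUID	(7 << 10)
#define VGIC_LR_PENDING_BIT	(1 << 28)

#define LR_PHYSID(lr)		(((lr) & VGIC_LR_PHYSID_CPUID) >> 10)
#define MK_LR_PEND(src, irq)	(VGIC_LR_PENDING_BIT | ((src) << 10) | (irq))
```

So MK_LR_PEND()/LR_PHYSID() round-trip a source CPU id of 0..7, and anything wider would silently be truncated by the 3-bit mask.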
> +#define MK_LR_PEND(src, irq) (VGIC_LR_PENDING_BIT | ((src) << 10) | (irq))
> +/*
> + * Queue an interrupt to a CPU virtual interface. Return true on success,
> + * or false if it wasn't possible to queue it.
> + */
> +static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
> +{
> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> + int lr, is_level;
> +
> + /* Sanitize the input... */
> + BUG_ON(sgi_source_id & ~7);
sgi_source_id > MAX_SGI_SOURCES (or whatever we end up having for the SGI
and PPI limits).
> + BUG_ON(sgi_source_id && irq > 15);
irq > MAX_PPI_SOURCES
> + BUG_ON(irq >= VGIC_NR_IRQS);
> +
> + kvm_debug("Queue IRQ%d\n", irq);
> +
> + lr = vgic_cpu->vgic_irq_lr_map[irq];
> + is_level = !vgic_irq_is_edge(dist, irq);
> +
> + /* Do we have an active interrupt for the same CPUID? */
> + if (lr != LR_EMPTY &&
> + (LR_PHYSID(vgic_cpu->vgic_lr[lr]) == sgi_source_id)) {
Ok, so this does return the source.
> + kvm_debug("LR%d piggyback for IRQ%d %x\n", lr, irq, vgic_cpu->vgic_lr[lr]);
> + BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
> + vgic_cpu->vgic_lr[lr] |= VGIC_LR_PENDING_BIT;
> + if (is_level)
> + vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
> + return true;
> + }
> +
> + /* Try to use another LR for this interrupt */
> + lr = find_first_bit((unsigned long *)vgic_cpu->vgic_elrsr,
> + vgic_cpu->nr_lr);
> + if (lr >= vgic_cpu->nr_lr)
> + return false;
> +
> + kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id);
> + vgic_cpu->vgic_lr[lr] = MK_LR_PEND(sgi_source_id, irq);
> + if (is_level)
> + vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
> +
> + vgic_cpu->vgic_irq_lr_map[irq] = lr;
> + clear_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr);
> + set_bit(lr, vgic_cpu->lr_used);
> +
> + return true;
> +}
I can't help but feel that this could be made cleaner by moving the
level-specific EOI handling out into a separate function.
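One way the suggested refactoring could look, as a sketch: pull the "set pending, and request EOI maintenance for level interrupts" step into a small helper so both the piggyback path and the fresh-LR path share it. The helper name is hypothetical, not from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define VGIC_LR_PENDING_BIT	(1u << 28)
#define VGIC_LR_EOI		(1u << 19)

/* Mark a list-register value pending; level-triggered interrupts
 * additionally request an EOI maintenance interrupt so the
 * distributor state can be resampled on completion. */
static uint32_t lr_set_pending(uint32_t lr, int is_level)
{
	lr |= VGIC_LR_PENDING_BIT;
	if (is_level)
		lr |= VGIC_LR_EOI;
	return lr;
}
```

With this, both branches of vgic_queue_irq() would reduce to `vgic_cpu->vgic_lr[lr] = lr_set_pending(..., is_level);` and the edge/level distinction lives in exactly one place.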
> +
> +/*
> + * Fill the list registers with pending interrupts before running the
> + * guest.
> + */
> +static void __kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu)
> +{
> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> + unsigned long *pending;
> + int i, c, vcpu_id;
> + int overflow = 0;
> +
> + vcpu_id = vcpu->vcpu_id;
> +
> + /*
> + * We may not have any pending interrupt, or the interrupts
> + * may have been serviced from another vcpu. In all cases,
> + * move along.
> + */
> + if (!kvm_vgic_vcpu_pending_irq(vcpu)) {
> + pr_debug("CPU%d has no pending interrupt\n", vcpu_id);
> + goto epilog;
> + }
> +
> + /* SGIs */
> + pending = vgic_bitmap_get_cpu_map(&dist->irq_state, vcpu_id);
> + for_each_set_bit(i, vgic_cpu->pending, 16) {
> + unsigned long sources;
> +
> + sources = dist->irq_sgi_sources[vcpu_id][i];
> + for_each_set_bit(c, &sources, 8) {
> + if (!vgic_queue_irq(vcpu, c, i)) {
> + overflow = 1;
> + continue;
> + }
If there are multiple sources, why do you need to queue the interrupt
multiple times? I would have thought it could be collapsed into one.
> +
> + clear_bit(c, &sources);
> + }
> +
> + if (!sources)
> + clear_bit(i, pending);
What does this signify and how does it happen? An SGI without a source
sounds pretty weird...
> +
> + dist->irq_sgi_sources[vcpu_id][i] = sources;
> + }
> +
> + /* PPIs */
> + for_each_set_bit_from(i, vgic_cpu->pending, 32) {
> + if (!vgic_queue_irq(vcpu, 0, i)) {
> + overflow = 1;
> + continue;
> + }
> +
> + clear_bit(i, pending);
You could lose the `continue' and stick the clear_bit in an else clause
(same for SGIs and SPIs).
> + }
> +
> +
> + /* SPIs */
> + pending = vgic_bitmap_get_shared_map(&dist->irq_state);
> + for_each_set_bit_from(i, vgic_cpu->pending, VGIC_NR_IRQS) {
> + if (vgic_bitmap_get_irq_val(&dist->irq_active, 0, i))
> + continue; /* level interrupt, already queued */
> +
> + if (!vgic_queue_irq(vcpu, 0, i)) {
> + overflow = 1;
> + continue;
> + }
> +
> + /* Immediate clear on edge, set active on level */
> + if (vgic_irq_is_edge(dist, i)) {
> + clear_bit(i - 32, pending);
> + clear_bit(i, vgic_cpu->pending);
> + } else {
> + vgic_bitmap_set_irq_val(&dist->irq_active, 0, i, 1);
> + }
> + }
Hmm, more of this edge/level handling trying to use the same code and it
not really working.
> +
> +epilog:
> + if (overflow)
> + vgic_cpu->vgic_hcr |= VGIC_HCR_UIE;
> + else {
> + vgic_cpu->vgic_hcr &= ~VGIC_HCR_UIE;
> + /*
> + * We're about to run this VCPU, and we've consumed
> + * everything the distributor had in store for
> + * us. Claim we don't have anything pending. We'll
> + * adjust that if needed while exiting.
> + */
> + clear_bit(vcpu_id, &dist->irq_pending_on_cpu);
> + }
> +}
> +
> +/*
> + * Sync back the VGIC state after a guest run. We do not really touch
> + * the distributor here (the irq_pending_on_cpu bit is safe to set),
> + * so there is no need for taking its lock.
> + */
> +static void __kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu)
> +{
> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> + int lr, pending;
> +
> + /* Clear mappings for empty LRs */
> + for_each_set_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr,
> + vgic_cpu->nr_lr) {
> + int irq;
> +
> + if (!test_and_clear_bit(lr, vgic_cpu->lr_used))
> + continue;
> +
> + irq = vgic_cpu->vgic_lr[lr] & VGIC_LR_VIRTUALID;
> +
> + BUG_ON(irq >= VGIC_NR_IRQS);
> + vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
> + }
> +
> + /* Check if we still have something up our sleeve... */
> + pending = find_first_zero_bit((unsigned long *)vgic_cpu->vgic_elrsr,
> + vgic_cpu->nr_lr);
Does this rely on timeliness of maintenance interrupts with respect to
EOIs in the guest? i.e. if a maintenance interrupt is delayed (I can't
see anything in the spec stating that they're synchronous) and you end up
taking one here, will you accidentally re-pend the interrupt?
> + if (pending < vgic_cpu->nr_lr) {
> + set_bit(vcpu->vcpu_id, &dist->irq_pending_on_cpu);
> + smp_mb();
What's this barrier for?
Will
^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v4 08/13] ARM: KVM: vgic: retire queued, disabled interrupts
2012-11-10 15:45 ` [PATCH v4 08/13] ARM: KVM: vgic: retire queued, disabled interrupts Christoffer Dall
@ 2012-12-03 13:24 ` Will Deacon
0 siblings, 0 replies; 58+ messages in thread
From: Will Deacon @ 2012-12-03 13:24 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 10, 2012 at 03:45:11PM +0000, Christoffer Dall wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
>
> An interrupt may have been disabled after being made pending on the
> CPU interface (the classic case is a timer running while we're
> rebooting the guest - the interrupt would kick as soon as the CPU
> interface gets enabled, with deadly consequences).
>
> The solution is to examine already active LRs, and check the
> interrupt is still enabled. If not, just retire it.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> ---
> arch/arm/kvm/vgic.c | 30 ++++++++++++++++++++++++++++++
> 1 file changed, 30 insertions(+)
>
> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
> index d7cdec5..dda5623 100644
> --- a/arch/arm/kvm/vgic.c
> +++ b/arch/arm/kvm/vgic.c
> @@ -633,6 +633,34 @@ static void vgic_update_state(struct kvm *kvm)
>
> #define LR_PHYSID(lr) (((lr) & VGIC_LR_PHYSID_CPUID) >> 10)
> #define MK_LR_PEND(src, irq) (VGIC_LR_PENDING_BIT | ((src) << 10) | (irq))
> +
> +/*
> + * An interrupt may have been disabled after being made pending on the
> + * CPU interface (the classic case is a timer running while we're
> + * rebooting the guest - the interrupt would kick as soon as the CPU
> + * interface gets enabled, with deadly consequences).
> + *
> + * The solution is to examine already active LRs, and check the
> + * interrupt is still enabled. If not, just retire it.
> + */
> +static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
> +{
> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> + int lr;
> +
> + for_each_set_bit(lr, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
> + int irq = vgic_cpu->vgic_lr[lr] & VGIC_LR_VIRTUALID;
> +
> + if (!vgic_bitmap_get_irq_val(&dist->irq_enabled,
> + vcpu->vcpu_id, irq)) {
> + vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
> + clear_bit(lr, vgic_cpu->lr_used);
> + vgic_cpu->vgic_lr[lr] &= ~VGIC_LR_STATE;
> + }
> + }
> +}
> +
> /*
> * Queue an interrupt to a CPU virtual interface. Return true on success,
> * or false if it wasn't possible to queue it.
> @@ -696,6 +724,8 @@ static void __kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu)
>
> vcpu_id = vcpu->vcpu_id;
>
> + vgic_retire_disabled_irqs(vcpu);
Wouldn't it be better to do this when the interrupt is disabled, rather
than do the checking in the sync_to_cpu path?
Will
* [PATCH v4 09/13] ARM: KVM: VGIC interrupt injection
2012-11-10 15:45 ` [PATCH v4 09/13] ARM: KVM: VGIC interrupt injection Christoffer Dall
@ 2012-12-03 13:25 ` Will Deacon
2012-12-03 14:21 ` Marc Zyngier
0 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2012-12-03 13:25 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 10, 2012 at 03:45:18PM +0000, Christoffer Dall wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
>
> Plug the interrupt injection code. Interrupts can now be generated
> from user space.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> ---
> arch/arm/include/asm/kvm_vgic.h | 8 +++
> arch/arm/kvm/arm.c | 29 +++++++++++++
> arch/arm/kvm/vgic.c | 90 +++++++++++++++++++++++++++++++++++++++
> 3 files changed, 127 insertions(+)
>
> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
> index 7229324..6e3d303 100644
> --- a/arch/arm/include/asm/kvm_vgic.h
> +++ b/arch/arm/include/asm/kvm_vgic.h
> @@ -241,6 +241,8 @@ struct kvm_exit_mmio;
> int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
> void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu);
> void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu);
> +int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> + bool level);
> int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
> bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> struct kvm_exit_mmio *mmio);
> @@ -271,6 +273,12 @@ static inline void kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu) {}
> static inline void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu) {}
> static inline void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu) {}
>
> +static inline int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid,
> + const struct kvm_irq_level *irq)
> +{
> + return 0;
> +}
> +
> static inline int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
> {
> return 0;
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 3ac1aab..f43da01 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -764,10 +764,31 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level)
>
> switch (irq_type) {
> case KVM_ARM_IRQ_TYPE_CPU:
> + if (irqchip_in_kernel(kvm))
> + return -ENXIO;
> +
> if (irq_num > KVM_ARM_IRQ_CPU_FIQ)
> return -EINVAL;
>
> return vcpu_interrupt_line(vcpu, irq_num, level);
> +#ifdef CONFIG_KVM_ARM_VGIC
> + case KVM_ARM_IRQ_TYPE_PPI:
> + if (!irqchip_in_kernel(kvm))
> + return -ENXIO;
> +
> + if (irq_num < 16 || irq_num > 31)
> + return -EINVAL;
It's our favourite two numbers again! :)
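Replacing the bare 16/31 with named constants would make the range checks self-documenting. A sketch with hypothetical macro names (the series does not define these; the ranges themselves are fixed by the GIC architecture: SGIs are IRQs 0-15, PPIs 16-31, SPIs 32 and up):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the GIC architecture fixes the ranges. */
#define VGIC_NR_SGIS	16	/* SGIs: IRQ 0-15 */
#define VGIC_NR_PRIVATE	32	/* SGIs + PPIs: IRQ 0-31 */

static bool irq_num_is_ppi(unsigned int irq_num)
{
	return irq_num >= VGIC_NR_SGIS && irq_num < VGIC_NR_PRIVATE;
}

static bool irq_num_is_spi(unsigned int irq_num, unsigned int max)
{
	return irq_num >= VGIC_NR_PRIVATE && irq_num <= max;
}
```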
> +
> + return kvm_vgic_inject_irq(kvm, vcpu->vcpu_id, irq_num, level);
> + case KVM_ARM_IRQ_TYPE_SPI:
> + if (!irqchip_in_kernel(kvm))
> + return -ENXIO;
> +
> + if (irq_num < 32 || irq_num > KVM_ARM_IRQ_GIC_MAX)
> + return -EINVAL;
> +
> + return kvm_vgic_inject_irq(kvm, 0, irq_num, level);
> +#endif
> }
>
> return -EINVAL;
> @@ -849,6 +870,14 @@ long kvm_arch_vm_ioctl(struct file *filp,
> void __user *argp = (void __user *)arg;
>
> switch (ioctl) {
> +#ifdef CONFIG_KVM_ARM_VGIC
> + case KVM_CREATE_IRQCHIP: {
> + if (vgic_present)
> + return kvm_vgic_create(kvm);
> + else
> + return -EINVAL;
ENXIO? At least, that's what you use when setting the GIC addresses.
> + }
> +#endif
> case KVM_SET_DEVICE_ADDRESS: {
> struct kvm_device_address dev_addr;
>
> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
> index dda5623..70040bb 100644
> --- a/arch/arm/kvm/vgic.c
> +++ b/arch/arm/kvm/vgic.c
> @@ -75,6 +75,7 @@
> #define ACCESS_WRITE_MASK(x) ((x) & (3 << 1))
>
> static void vgic_update_state(struct kvm *kvm);
> +static void vgic_kick_vcpus(struct kvm *kvm);
> static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
>
> static inline int vgic_irq_is_edge(struct vgic_dist *dist, int irq)
> @@ -542,6 +543,9 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exi
> kvm_prepare_mmio(run, mmio);
> kvm_handle_mmio_return(vcpu, run);
>
> + if (updated_state)
> + vgic_kick_vcpus(vcpu->kvm);
> +
> return true;
> }
>
> @@ -867,6 +871,92 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
> return test_bit(vcpu->vcpu_id, &dist->irq_pending_on_cpu);
> }
>
> +static void vgic_kick_vcpus(struct kvm *kvm)
> +{
> + struct kvm_vcpu *vcpu;
> + int c;
> +
> + /*
> + * We've injected an interrupt, time to find out who deserves
> + * a good kick...
> + */
> + kvm_for_each_vcpu(c, vcpu, kvm) {
> + if (kvm_vgic_vcpu_pending_irq(vcpu))
> + kvm_vcpu_kick(vcpu);
> + }
> +}
> +
> +static bool vgic_update_irq_state(struct kvm *kvm, int cpuid,
> + unsigned int irq_num, bool level)
> +{
> + struct vgic_dist *dist = &kvm->arch.vgic;
> + struct kvm_vcpu *vcpu;
> + int is_edge, is_level, state;
> + int enabled;
> + bool ret = true;
> +
> + spin_lock(&dist->lock);
> +
> + is_edge = vgic_irq_is_edge(dist, irq_num);
> + is_level = !is_edge;
> + state = vgic_bitmap_get_irq_val(&dist->irq_state, cpuid, irq_num);
> +
> + /*
> + * Only inject an interrupt if:
> + * - level triggered and we change level
> + * - edge triggered and we have a rising edge
> + */
> + if ((is_level && !(state ^ level)) || (is_edge && (state || !level))) {
> + ret = false;
> + goto out;
> + }
Eek, more of the edge/level combo. Can this be restructured so that we
have vgic_update_{edge,level}_irq_state, which are called from here
appropriately?
Will
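As a sketch of the split Will suggests (hypothetical helper names; the predicates are derived from the combined condition in the patch, each returning true when the interrupt should be injected):

```c
#include <assert.h>
#include <stdbool.h>

/* Level-triggered: only a change of line level matters. */
static bool vgic_update_level_irq_state(bool state, bool level)
{
	return state != level;
}

/* Edge-triggered: only a rising edge on a non-pending line matters. */
static bool vgic_update_edge_irq_state(bool state, bool level)
{
	return !state && level;
}

/* The combined rejection predicate from the patch, for comparison. */
static bool combined_rejects(bool is_level, bool state, bool level)
{
	bool is_edge = !is_level;
	return (is_level && !(state ^ level)) || (is_edge && (state || !level));
}
```

The two helpers are term-for-term equivalent to negating the combined condition, so the restructuring is purely for readability.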
* [PATCH v4 10/13] ARM: KVM: VGIC control interface world switch
2012-11-10 15:45 ` [PATCH v4 10/13] ARM: KVM: VGIC control interface world switch Christoffer Dall
@ 2012-12-03 13:31 ` Will Deacon
2012-12-03 14:26 ` Marc Zyngier
0 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2012-12-03 13:31 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 10, 2012 at 03:45:25PM +0000, Christoffer Dall wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
>
> Enable the VGIC control interface to be save-restored on world switch.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> ---
> arch/arm/include/asm/kvm_arm.h | 12 +++++++
> arch/arm/kernel/asm-offsets.c | 12 +++++++
> arch/arm/kvm/interrupts_head.S | 68 ++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 92 insertions(+)
>
> diff --git a/arch/arm/include/asm/kvm_arm.h b/arch/arm/include/asm/kvm_arm.h
> index 246afd7..8f5dd22 100644
> --- a/arch/arm/include/asm/kvm_arm.h
> +++ b/arch/arm/include/asm/kvm_arm.h
> @@ -192,4 +192,16 @@
> #define HSR_EC_DABT (0x24)
> #define HSR_EC_DABT_HYP (0x25)
>
> +/* GICH offsets */
> +#define GICH_HCR 0x0
> +#define GICH_VTR 0x4
> +#define GICH_VMCR 0x8
> +#define GICH_MISR 0x10
> +#define GICH_EISR0 0x20
> +#define GICH_EISR1 0x24
> +#define GICH_ELRSR0 0x30
> +#define GICH_ELRSR1 0x34
> +#define GICH_APR 0xf0
> +#define GICH_LR0 0x100
Similar thing to the other new gic defines -- they're probably better off
in gic.h
> +
> #endif /* __ARM_KVM_ARM_H__ */
> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
> index 95cab37..39b6221 100644
> --- a/arch/arm/kernel/asm-offsets.c
> +++ b/arch/arm/kernel/asm-offsets.c
> @@ -167,6 +167,18 @@ int main(void)
> DEFINE(VCPU_HxFAR, offsetof(struct kvm_vcpu, arch.hxfar));
> DEFINE(VCPU_HPFAR, offsetof(struct kvm_vcpu, arch.hpfar));
> DEFINE(VCPU_HYP_PC, offsetof(struct kvm_vcpu, arch.hyp_pc));
> +#ifdef CONFIG_KVM_ARM_VGIC
> + DEFINE(VCPU_VGIC_CPU, offsetof(struct kvm_vcpu, arch.vgic_cpu));
> + DEFINE(VGIC_CPU_HCR, offsetof(struct vgic_cpu, vgic_hcr));
> + DEFINE(VGIC_CPU_VMCR, offsetof(struct vgic_cpu, vgic_vmcr));
> + DEFINE(VGIC_CPU_MISR, offsetof(struct vgic_cpu, vgic_misr));
> + DEFINE(VGIC_CPU_EISR, offsetof(struct vgic_cpu, vgic_eisr));
> + DEFINE(VGIC_CPU_ELRSR, offsetof(struct vgic_cpu, vgic_elrsr));
> + DEFINE(VGIC_CPU_APR, offsetof(struct vgic_cpu, vgic_apr));
> + DEFINE(VGIC_CPU_LR, offsetof(struct vgic_cpu, vgic_lr));
> + DEFINE(VGIC_CPU_NR_LR, offsetof(struct vgic_cpu, nr_lr));
> + DEFINE(KVM_VGIC_VCTRL, offsetof(struct kvm, arch.vgic.vctrl_base));
> +#endif
> DEFINE(KVM_VTTBR, offsetof(struct kvm, arch.vttbr));
> #endif
> return 0;
> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
> index 2ac8b4a..c2423d8 100644
> --- a/arch/arm/kvm/interrupts_head.S
> +++ b/arch/arm/kvm/interrupts_head.S
> @@ -341,6 +341,45 @@
> * @vcpup: Register pointing to VCPU struct
> */
> .macro save_vgic_state vcpup
> +#ifdef CONFIG_KVM_ARM_VGIC
> + /* Get VGIC VCTRL base into r2 */
> + ldr r2, [\vcpup, #VCPU_KVM]
> + ldr r2, [r2, #KVM_VGIC_VCTRL]
> + cmp r2, #0
> + beq 2f
> +
> + /* Compute the address of struct vgic_cpu */
> + add r11, \vcpup, #VCPU_VGIC_CPU
Given that we're dealing with constants, it would be more efficient to
express this addition as part of the immediate offset and let gas spit
out the final computed address for the stores below.
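What Will describes, folding the struct offset into each access instead of materialising the base address in r11 first, might look like this (sketch only; it assumes `VCPU_VGIC_CPU` plus each field offset still fits in the immediate offset field of `str`, which is limited to 4095 on ARM):

```asm
	/* No "add r11, \vcpup, #VCPU_VGIC_CPU" needed: */
	str	r3, [\vcpup, #(VCPU_VGIC_CPU + VGIC_CPU_HCR)]
	str	r4, [\vcpup, #(VCPU_VGIC_CPU + VGIC_CPU_VMCR)]
	str	r5, [\vcpup, #(VCPU_VGIC_CPU + VGIC_CPU_MISR)]
```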
> +
> + /* Save all interesting registers */
> + ldr r3, [r2, #GICH_HCR]
> + ldr r4, [r2, #GICH_VMCR]
> + ldr r5, [r2, #GICH_MISR]
> + ldr r6, [r2, #GICH_EISR0]
> + ldr r7, [r2, #GICH_EISR1]
> + ldr r8, [r2, #GICH_ELRSR0]
> + ldr r9, [r2, #GICH_ELRSR1]
> + ldr r10, [r2, #GICH_APR]
> +
> + str r3, [r11, #VGIC_CPU_HCR]
> + str r4, [r11, #VGIC_CPU_VMCR]
> + str r5, [r11, #VGIC_CPU_MISR]
> + str r6, [r11, #VGIC_CPU_EISR]
> + str r7, [r11, #(VGIC_CPU_EISR + 4)]
> + str r8, [r11, #VGIC_CPU_ELRSR]
> + str r9, [r11, #(VGIC_CPU_ELRSR + 4)]
> + str r10, [r11, #VGIC_CPU_APR]
> +
> + /* Save list registers */
> + add r2, r2, #GICH_LR0
> + add r3, r11, #VGIC_CPU_LR
> + ldr r4, [r11, #VGIC_CPU_NR_LR]
> +1: ldr r6, [r2], #4
> + str r6, [r3], #4
> + subs r4, r4, #1
> + bne 1b
> +2:
> +#endif
> .endm
Will
* [PATCH v4 07/13] ARM: KVM: VGIC virtual CPU interface management
2012-12-03 13:23 ` Will Deacon
@ 2012-12-03 14:11 ` Marc Zyngier
2012-12-03 14:34 ` Will Deacon
2012-12-03 14:54 ` Christoffer Dall
0 siblings, 2 replies; 58+ messages in thread
From: Marc Zyngier @ 2012-12-03 14:11 UTC (permalink / raw)
To: linux-arm-kernel
On 03/12/12 13:23, Will Deacon wrote:
> Hi Marc,
>
> I've managed to look at some more of the vgic code, so here is some more
> feedback. I've still not got to the end of the series, but there's light at
> the end of the tunnel...
>
> On Sat, Nov 10, 2012 at 03:45:05PM +0000, Christoffer Dall wrote:
>> From: Marc Zyngier <marc.zyngier@arm.com>
>>
>> Add VGIC virtual CPU interface code, picking pending interrupts
>> from the distributor and stashing them in the VGIC control interface
>> list registers.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>> ---
>> arch/arm/include/asm/kvm_vgic.h | 41 +++++++
>> arch/arm/kvm/vgic.c | 226 +++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 266 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
>> index 9e60b1d..7229324 100644
>> --- a/arch/arm/include/asm/kvm_vgic.h
>> +++ b/arch/arm/include/asm/kvm_vgic.h
>> @@ -193,8 +193,45 @@ struct vgic_dist {
>> };
>>
>> struct vgic_cpu {
>> +#ifdef CONFIG_KVM_ARM_VGIC
>> + /* per IRQ to LR mapping */
>> + u8 vgic_irq_lr_map[VGIC_NR_IRQS];
>
> per IRQ?
Yes. We need to track which IRQ maps to which LR (so we can piggyback a
pending interrupt on an active one).
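The piggybacking Marc describes can be sketched as a toy model (names mirror the patch, but the structure and values here are invented for illustration): if the IRQ already occupies a list register, re-set its pending bit rather than consuming a second LR.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LR_EMPTY	0xff
#define NR_IRQS		64
#define NR_LRS		4
#define LR_PENDING_BIT	(1u << 28)

struct toy_vgic_cpu {
	uint8_t  irq_lr_map[NR_IRQS];	/* per-IRQ to LR mapping */
	uint32_t lr[NR_LRS];
};

/* Returns the LR used for the interrupt. */
static int toy_queue_irq(struct toy_vgic_cpu *c, int irq, int free_lr)
{
	int lr = c->irq_lr_map[irq];

	if (lr != LR_EMPTY) {			/* piggyback on active LR */
		c->lr[lr] |= LR_PENDING_BIT;
		return lr;
	}
	c->lr[free_lr] = LR_PENDING_BIT | irq;	/* allocate a fresh LR */
	c->irq_lr_map[irq] = free_lr;
	return free_lr;
}
```

Queuing the same IRQ twice touches only one LR, which is exactly why the per-IRQ map is needed.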
>> +
>> + /* Pending interrupts on this VCPU */
>> + DECLARE_BITMAP( pending, VGIC_NR_IRQS);
>> +
>> + /* Bitmap of used/free list registers */
>> + DECLARE_BITMAP( lr_used, 64);
>> +
>> + /* Number of list registers on this CPU */
>> + int nr_lr;
>> +
>> + /* CPU vif control registers for world switch */
>> + u32 vgic_hcr;
>> + u32 vgic_vmcr;
>> + u32 vgic_misr; /* Saved only */
>> + u32 vgic_eisr[2]; /* Saved only */
>> + u32 vgic_elrsr[2]; /* Saved only */
>> + u32 vgic_apr;
>> + u32 vgic_lr[64]; /* A15 has only 4... */
>> +#endif
>> };
>
> Looks like we should have a #define for the maximum number of list registers,
> so we keep vgic_lr and lr_user in sync.
Indeed.
>>
>> +#define VGIC_HCR_EN (1 << 0)
>> +#define VGIC_HCR_UIE (1 << 1)
>> +
>> +#define VGIC_LR_VIRTUALID (0x3ff << 0)
>> +#define VGIC_LR_PHYSID_CPUID (7 << 10)
>> +#define VGIC_LR_STATE (3 << 28)
>> +#define VGIC_LR_PENDING_BIT (1 << 28)
>> +#define VGIC_LR_ACTIVE_BIT (1 << 29)
>> +#define VGIC_LR_EOI (1 << 19)
>> +
>> +#define VGIC_MISR_EOI (1 << 0)
>> +#define VGIC_MISR_U (1 << 1)
>> +
>> +#define LR_EMPTY 0xff
>> +
>
> Could stick these in asm/hardware/gic.h. I know they're not used by the gic
> driver, but they're the same piece of architecture so it's probably worth
> keeping in one place.
This is on my list of things to do once the GIC code is shared between
arm and arm64. Could do it earlier if that makes more sense.
> You'd probably also want a s/VGIC/GICH/
Sure.
>> struct kvm;
>> struct kvm_vcpu;
>> struct kvm_run;
>> @@ -202,9 +239,13 @@ struct kvm_exit_mmio;
>>
>> #ifdef CONFIG_KVM_ARM_VGIC
>> int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
>> +void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu);
>> +void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu);
>> +int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>> bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>> struct kvm_exit_mmio *mmio);
>>
>> +#define irqchip_in_kernel(k) (!!((k)->arch.vgic.vctrl_base))
>> #else
>> static inline int kvm_vgic_hyp_init(void)
>> {
>> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
>> index 82feee8..d7cdec5 100644
>> --- a/arch/arm/kvm/vgic.c
>> +++ b/arch/arm/kvm/vgic.c
>> @@ -587,7 +587,25 @@ static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg)
>>
>> static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
>> {
>> - return 0;
>> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> + unsigned long *pending, *enabled, *pend;
>> + int vcpu_id;
>> +
>> + vcpu_id = vcpu->vcpu_id;
>> + pend = vcpu->arch.vgic_cpu.pending;
>> +
>> + pending = vgic_bitmap_get_cpu_map(&dist->irq_state, vcpu_id);
>> + enabled = vgic_bitmap_get_cpu_map(&dist->irq_enabled, vcpu_id);
>> + bitmap_and(pend, pending, enabled, 32);
>
> pend and pending! vcpu_pending and dist_pending?
A lot of that code has already been reworked. See:
https://lists.cs.columbia.edu/pipermail/kvmarm/2012-November/004138.html
>> +
>> + pending = vgic_bitmap_get_shared_map(&dist->irq_state);
>> + enabled = vgic_bitmap_get_shared_map(&dist->irq_enabled);
>> + bitmap_and(pend + 1, pending, enabled, VGIC_NR_SHARED_IRQS);
>> + bitmap_and(pend + 1, pend + 1,
>> + vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]),
>> + VGIC_NR_SHARED_IRQS);
>> +
>> + return (find_first_bit(pend, VGIC_NR_IRQS) < VGIC_NR_IRQS);
>> }
>>
>> /*
>> @@ -613,6 +631,212 @@ static void vgic_update_state(struct kvm *kvm)
>> }
>> }
>>
>> +#define LR_PHYSID(lr) (((lr) & VGIC_LR_PHYSID_CPUID) >> 10)
>
> Is VGIC_LR_PHYSID_CPUID wide enough for this? The CPUID is only 3 bits, but
> the interrupt ID could be larger. Or do you not supported hardware interrupt
> forwarding? (in which case, LR_PHYSID is a misleading name).
Hardware interrupt forwarding is not supported. PHYSID is the name of
the actual field in the spec, hence the name of the macro. LR_CPUID?
>> +#define MK_LR_PEND(src, irq) (VGIC_LR_PENDING_BIT | ((src) << 10) | (irq))
>> +/*
>> + * Queue an interrupt to a CPU virtual interface. Return true on success,
>> + * or false if it wasn't possible to queue it.
>> + */
>> +static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
>> +{
>> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> + int lr, is_level;
>> +
>> + /* Sanitize the input... */
>> + BUG_ON(sgi_source_id & ~7);
>
> sgi_source_id > MAX_SGI_SOURCES (or whatever we end up having for the SGI
> and PPI limits).
OK.
>> + BUG_ON(sgi_source_id && irq > 15);
>
> irq > MAX_PPI_SOURCES
OK.
>> + BUG_ON(irq >= VGIC_NR_IRQS);
>> +
>> + kvm_debug("Queue IRQ%d\n", irq);
>> +
>> + lr = vgic_cpu->vgic_irq_lr_map[irq];
>> + is_level = !vgic_irq_is_edge(dist, irq);
>> +
>> + /* Do we have an active interrupt for the same CPUID? */
>> + if (lr != LR_EMPTY &&
>> + (LR_PHYSID(vgic_cpu->vgic_lr[lr]) == sgi_source_id)) {
>
> Ok, so this does return the source.
>
>> + kvm_debug("LR%d piggyback for IRQ%d %x\n", lr, irq, vgic_cpu->vgic_lr[lr]);
>> + BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
>> + vgic_cpu->vgic_lr[lr] |= VGIC_LR_PENDING_BIT;
>> + if (is_level)
>> + vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
>> + return true;
>> + }
>> +
>> + /* Try to use another LR for this interrupt */
>> + lr = find_first_bit((unsigned long *)vgic_cpu->vgic_elrsr,
>> + vgic_cpu->nr_lr);
>> + if (lr >= vgic_cpu->nr_lr)
>> + return false;
>> +
>> + kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id);
>> + vgic_cpu->vgic_lr[lr] = MK_LR_PEND(sgi_source_id, irq);
>> + if (is_level)
>> + vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
>> +
>> + vgic_cpu->vgic_irq_lr_map[irq] = lr;
>> + clear_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr);
>> + set_bit(lr, vgic_cpu->lr_used);
>> +
>> + return true;
>> +}
>
> I can't help but feel that this could be made cleaner by moving the
> level-specific EOI handling out into a separate function.
Do you mean having two functions, one for edge and the other for level?
Seems overkill to me. I could move the "if (is_level) ..." to a common
spot though.
>> +
>> +/*
>> + * Fill the list registers with pending interrupts before running the
>> + * guest.
>> + */
>> +static void __kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu)
>> +{
>> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> + unsigned long *pending;
>> + int i, c, vcpu_id;
>> + int overflow = 0;
>> +
>> + vcpu_id = vcpu->vcpu_id;
>> +
>> + /*
>> + * We may not have any pending interrupt, or the interrupts
>> + * may have been serviced from another vcpu. In all cases,
>> + * move along.
>> + */
>> + if (!kvm_vgic_vcpu_pending_irq(vcpu)) {
>> + pr_debug("CPU%d has no pending interrupt\n", vcpu_id);
>> + goto epilog;
>> + }
>> +
>> + /* SGIs */
>> + pending = vgic_bitmap_get_cpu_map(&dist->irq_state, vcpu_id);
>> + for_each_set_bit(i, vgic_cpu->pending, 16) {
>> + unsigned long sources;
>> +
>> + sources = dist->irq_sgi_sources[vcpu_id][i];
>> + for_each_set_bit(c, &sources, 8) {
>> + if (!vgic_queue_irq(vcpu, c, i)) {
>> + overflow = 1;
>> + continue;
>> + }
>
> If there are multiple sources, why do you need to queue the interrupt
> multiple times? I would have thought it could be collapsed into one.
Because SGIs from different sources *are* different interrupts. In an
n-CPU system (with n > 2), you could have some message passing system
based on interrupts, and you'd need to know which CPU is pinging you.
>> +
>> + clear_bit(c, &sources);
>> + }
>> +
>> + if (!sources)
>> + clear_bit(i, pending);
>
> What does this signify and how does it happen? An SGI without a source
> sounds pretty weird...
See the clear_bit() just above. Once all the sources for this SGI are
cleared, we can make the interrupt not pending anymore.
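The source-draining Marc describes can be modelled in a few lines (toy sketch, invented names; `queueable` stands in for "an LR was available for that source"): each successfully queued source bit is cleared, and the SGI stays pending only while sources remain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the SGI must remain pending (sources left over). */
static bool drain_sgi_sources(uint8_t *sources, uint8_t queueable)
{
	for (int c = 0; c < 8; c++) {
		uint8_t bit = 1u << c;

		if ((*sources & bit) && (queueable & bit))
			*sources &= ~bit;	/* queued: retire this source */
	}
	return *sources != 0;
}
```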
>> +
>> + dist->irq_sgi_sources[vcpu_id][i] = sources;
>> + }
>> +
>> + /* PPIs */
>> + for_each_set_bit_from(i, vgic_cpu->pending, 32) {
>> + if (!vgic_queue_irq(vcpu, 0, i)) {
>> + overflow = 1;
>> + continue;
>> + }
>> +
>> + clear_bit(i, pending);
>
> You could lose the `continue' and stick the clear_bit in an else clause
> (same for SGIs and SPIs).
Sure.
>> + }
>> +
>> +
>> + /* SPIs */
>> + pending = vgic_bitmap_get_shared_map(&dist->irq_state);
>> + for_each_set_bit_from(i, vgic_cpu->pending, VGIC_NR_IRQS) {
>> + if (vgic_bitmap_get_irq_val(&dist->irq_active, 0, i))
>> + continue; /* level interrupt, already queued */
>> +
>> + if (!vgic_queue_irq(vcpu, 0, i)) {
>> + overflow = 1;
>> + continue;
>> + }
>> +
>> + /* Immediate clear on edge, set active on level */
>> + if (vgic_irq_is_edge(dist, i)) {
>> + clear_bit(i - 32, pending);
>> + clear_bit(i, vgic_cpu->pending);
>> + } else {
>> + vgic_bitmap_set_irq_val(&dist->irq_active, 0, i, 1);
>> + }
>> + }
>
> Hmm, more of this edge/level handling trying to use the same code and it
> not really working.
Hmmm. Let me think of a better way to do this without ending up
duplicating too much code (it is complicated enough that I don't want to
maintain two copies of it).
>> +
>> +epilog:
>> + if (overflow)
>> + vgic_cpu->vgic_hcr |= VGIC_HCR_UIE;
>> + else {
>> + vgic_cpu->vgic_hcr &= ~VGIC_HCR_UIE;
>> + /*
>> + * We're about to run this VCPU, and we've consumed
>> + * everything the distributor had in store for
>> + * us. Claim we don't have anything pending. We'll
>> + * adjust that if needed while exiting.
>> + */
>> + clear_bit(vcpu_id, &dist->irq_pending_on_cpu);
>> + }
>> +}
>> +
>> +/*
>> + * Sync back the VGIC state after a guest run. We do not really touch
>> + * the distributor here (the irq_pending_on_cpu bit is safe to set),
>> + * so there is no need for taking its lock.
>> + */
>> +static void __kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu)
>> +{
>> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> + int lr, pending;
>> +
>> + /* Clear mappings for empty LRs */
>> + for_each_set_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr,
>> + vgic_cpu->nr_lr) {
>> + int irq;
>> +
>> + if (!test_and_clear_bit(lr, vgic_cpu->lr_used))
>> + continue;
>> +
>> + irq = vgic_cpu->vgic_lr[lr] & VGIC_LR_VIRTUALID;
>> +
>> + BUG_ON(irq >= VGIC_NR_IRQS);
>> + vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
>> + }
>> +
>> + /* Check if we still have something up our sleeve... */
>> + pending = find_first_zero_bit((unsigned long *)vgic_cpu->vgic_elrsr,
>> + vgic_cpu->nr_lr);
>
> Does this rely on timeliness of maintenance interrupts with respect to
> EOIs in the guest? i.e. if a maintenance interrupt is delayed (I can't
> see anything in the spec stating that they're synchronous) and you end up
> taking one here, will you accidentally re-pend the interrupt?
I don't think so. ELRSR only indicates that the list register is empty.
If we find a zero bit there, we flag that this vcpu has at least one
pending interrupt (in its list registers). A delayed maintenance
interrupt may race with this by also setting this bit if an interrupt is
still in the active state after being EOIed, but that's not a problem
(we just set_bit twice). A race between clear and set would be
problematic though.
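A toy version of the ELRSR scan being discussed (simplified to a 64-bit word; in the hardware register a 1 bit means "this LR is empty", so the first 0 bit, if any, is an LR still holding an interrupt):

```c
#include <assert.h>
#include <stdint.h>

/* Returns the index of the first busy LR, or nr_lr if all are empty
 * (mirroring find_first_zero_bit semantics). */
static int first_busy_lr(uint64_t elrsr, int nr_lr)
{
	for (int lr = 0; lr < nr_lr; lr++)
		if (!(elrsr & (1ull << lr)))
			return lr;
	return nr_lr;
}
```

When the result is below `nr_lr`, the vcpu still has at least one interrupt in its list registers, hence the `set_bit` on `irq_pending_on_cpu`.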
>> + if (pending < vgic_cpu->nr_lr) {
>> + set_bit(vcpu->vcpu_id, &dist->irq_pending_on_cpu);
>> + smp_mb();
>
> What's this barrier for?
It is strategically placed to entertain the reviewer. And it does its
job! I'll nuke it, now that you found it. ;-)
M.
--
Jazz is not dead. It just smells funny...
* [PATCH v4 09/13] ARM: KVM: VGIC interrupt injection
2012-12-03 13:25 ` Will Deacon
@ 2012-12-03 14:21 ` Marc Zyngier
2012-12-03 14:58 ` Christoffer Dall
2012-12-03 19:13 ` Christoffer Dall
0 siblings, 2 replies; 58+ messages in thread
From: Marc Zyngier @ 2012-12-03 14:21 UTC (permalink / raw)
To: linux-arm-kernel
On 03/12/12 13:25, Will Deacon wrote:
> On Sat, Nov 10, 2012 at 03:45:18PM +0000, Christoffer Dall wrote:
>> From: Marc Zyngier <marc.zyngier@arm.com>
>>
>> Plug the interrupt injection code. Interrupts can now be generated
>> from user space.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>> ---
>> arch/arm/include/asm/kvm_vgic.h | 8 +++
>> arch/arm/kvm/arm.c | 29 +++++++++++++
>> arch/arm/kvm/vgic.c | 90 +++++++++++++++++++++++++++++++++++++++
>> 3 files changed, 127 insertions(+)
>>
>> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
>> index 7229324..6e3d303 100644
>> --- a/arch/arm/include/asm/kvm_vgic.h
>> +++ b/arch/arm/include/asm/kvm_vgic.h
>> @@ -241,6 +241,8 @@ struct kvm_exit_mmio;
>> int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
>> void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu);
>> void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu);
>> +int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>> + bool level);
>> int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>> bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>> struct kvm_exit_mmio *mmio);
>> @@ -271,6 +273,12 @@ static inline void kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu) {}
>> static inline void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu) {}
>> static inline void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu) {}
>>
>> +static inline int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid,
>> + const struct kvm_irq_level *irq)
>> +{
>> + return 0;
>> +}
>> +
>> static inline int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
>> {
>> return 0;
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 3ac1aab..f43da01 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -764,10 +764,31 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level)
>>
>> switch (irq_type) {
>> case KVM_ARM_IRQ_TYPE_CPU:
>> + if (irqchip_in_kernel(kvm))
>> + return -ENXIO;
>> +
>> if (irq_num > KVM_ARM_IRQ_CPU_FIQ)
>> return -EINVAL;
>>
>> return vcpu_interrupt_line(vcpu, irq_num, level);
>> +#ifdef CONFIG_KVM_ARM_VGIC
>> + case KVM_ARM_IRQ_TYPE_PPI:
>> + if (!irqchip_in_kernel(kvm))
>> + return -ENXIO;
>> +
>> + if (irq_num < 16 || irq_num > 31)
>> + return -EINVAL;
>
> It's our favourite two numbers again! :)
I already fixed a number of them. Probably missed this one though.
>> +
>> + return kvm_vgic_inject_irq(kvm, vcpu->vcpu_id, irq_num, level);
>> + case KVM_ARM_IRQ_TYPE_SPI:
>> + if (!irqchip_in_kernel(kvm))
>> + return -ENXIO;
>> +
>> + if (irq_num < 32 || irq_num > KVM_ARM_IRQ_GIC_MAX)
>> + return -EINVAL;
>> +
>> + return kvm_vgic_inject_irq(kvm, 0, irq_num, level);
>> +#endif
>> }
>>
>> return -EINVAL;
>> @@ -849,6 +870,14 @@ long kvm_arch_vm_ioctl(struct file *filp,
>> void __user *argp = (void __user *)arg;
>>
>> switch (ioctl) {
>> +#ifdef CONFIG_KVM_ARM_VGIC
>> + case KVM_CREATE_IRQCHIP: {
>> + if (vgic_present)
>> + return kvm_vgic_create(kvm);
>> + else
>> + return -EINVAL;
>
> ENXIO? At least, that's what you use when setting the GIC addresses.
-EINVAL seems to be one of the values other archs are using. -ENXIO is
not one of them for KVM_CREATE_IRQCHIP. Doesn't mean they are right, but
for the sake of keeping userspace happy, I'm not really inclined to
change this.
Christoffer?
>> + }
>> +#endif
>> case KVM_SET_DEVICE_ADDRESS: {
>> struct kvm_device_address dev_addr;
>>
>> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
>> index dda5623..70040bb 100644
>> --- a/arch/arm/kvm/vgic.c
>> +++ b/arch/arm/kvm/vgic.c
>> @@ -75,6 +75,7 @@
>> #define ACCESS_WRITE_MASK(x) ((x) & (3 << 1))
>>
>> static void vgic_update_state(struct kvm *kvm);
>> +static void vgic_kick_vcpus(struct kvm *kvm);
>> static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
>>
>> static inline int vgic_irq_is_edge(struct vgic_dist *dist, int irq)
>> @@ -542,6 +543,9 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exi
>> kvm_prepare_mmio(run, mmio);
>> kvm_handle_mmio_return(vcpu, run);
>>
>> + if (updated_state)
>> + vgic_kick_vcpus(vcpu->kvm);
>> +
>> return true;
>> }
>>
>> @@ -867,6 +871,92 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
>> return test_bit(vcpu->vcpu_id, &dist->irq_pending_on_cpu);
>> }
>>
>> +static void vgic_kick_vcpus(struct kvm *kvm)
>> +{
>> + struct kvm_vcpu *vcpu;
>> + int c;
>> +
>> + /*
>> + * We've injected an interrupt, time to find out who deserves
>> + * a good kick...
>> + */
>> + kvm_for_each_vcpu(c, vcpu, kvm) {
>> + if (kvm_vgic_vcpu_pending_irq(vcpu))
>> + kvm_vcpu_kick(vcpu);
>> + }
>> +}
>> +
>> +static bool vgic_update_irq_state(struct kvm *kvm, int cpuid,
>> + unsigned int irq_num, bool level)
>> +{
>> + struct vgic_dist *dist = &kvm->arch.vgic;
>> + struct kvm_vcpu *vcpu;
>> + int is_edge, is_level, state;
>> + int enabled;
>> + bool ret = true;
>> +
>> + spin_lock(&dist->lock);
>> +
>> + is_edge = vgic_irq_is_edge(dist, irq_num);
>> + is_level = !is_edge;
>> + state = vgic_bitmap_get_irq_val(&dist->irq_state, cpuid, irq_num);
>> +
>> + /*
>> + * Only inject an interrupt if:
>> + * - level triggered and we change level
>> + * - edge triggered and we have a rising edge
>> + */
>> + if ((is_level && !(state ^ level)) || (is_edge && (state || !level))) {
>> + ret = false;
>> + goto out;
>> + }
>
> Eek, more of the edge/level combo. Can this be restructured so that we
> have vgic_update_{edge,level}_irq_state, which are called from here
> appropriately?
I'll have a look.
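If it helps, the split could look something like this (helper names are hypothetical; each predicate is the positive form of the corresponding half of the combined condition above):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical split of the combined check: each helper decides
 * whether a new 'level' on the line should inject, given the
 * currently latched 'state'. */
static bool vgic_level_irq_should_inject(bool state, bool level)
{
	/* Level-triggered: inject only when the level actually changes. */
	return state != level;
}

static bool vgic_edge_irq_should_inject(bool state, bool level)
{
	/* Edge-triggered: inject only on a rising edge that is not
	 * already latched as pending. */
	return !state && level;
}
```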
Thanks,
M.
--
Jazz is not dead. It just smells funny...
* [PATCH v4 10/13] ARM: KVM: VGIC control interface world switch
2012-12-03 13:31 ` Will Deacon
@ 2012-12-03 14:26 ` Marc Zyngier
0 siblings, 0 replies; 58+ messages in thread
From: Marc Zyngier @ 2012-12-03 14:26 UTC (permalink / raw)
To: linux-arm-kernel
On 03/12/12 13:31, Will Deacon wrote:
> On Sat, Nov 10, 2012 at 03:45:25PM +0000, Christoffer Dall wrote:
>> From: Marc Zyngier <marc.zyngier@arm.com>
>>
>> Enable the VGIC control interface to be save-restored on world switch.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>> ---
>> arch/arm/include/asm/kvm_arm.h | 12 +++++++
>> arch/arm/kernel/asm-offsets.c | 12 +++++++
>> arch/arm/kvm/interrupts_head.S | 68 ++++++++++++++++++++++++++++++++++++++++
>> 3 files changed, 92 insertions(+)
>>
>> diff --git a/arch/arm/include/asm/kvm_arm.h b/arch/arm/include/asm/kvm_arm.h
>> index 246afd7..8f5dd22 100644
>> --- a/arch/arm/include/asm/kvm_arm.h
>> +++ b/arch/arm/include/asm/kvm_arm.h
>> @@ -192,4 +192,16 @@
>> #define HSR_EC_DABT (0x24)
>> #define HSR_EC_DABT_HYP (0x25)
>>
>> +/* GICH offsets */
>> +#define GICH_HCR 0x0
>> +#define GICH_VTR 0x4
>> +#define GICH_VMCR 0x8
>> +#define GICH_MISR 0x10
>> +#define GICH_EISR0 0x20
>> +#define GICH_EISR1 0x24
>> +#define GICH_ELRSR0 0x30
>> +#define GICH_ELRSR1 0x34
>> +#define GICH_APR 0xf0
>> +#define GICH_LR0 0x100
>
> Similar thing to the other new gic defines -- they're probably better off
> in gic.h
Agreed.
>> +
>> #endif /* __ARM_KVM_ARM_H__ */
>> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
>> index 95cab37..39b6221 100644
>> --- a/arch/arm/kernel/asm-offsets.c
>> +++ b/arch/arm/kernel/asm-offsets.c
>> @@ -167,6 +167,18 @@ int main(void)
>> DEFINE(VCPU_HxFAR, offsetof(struct kvm_vcpu, arch.hxfar));
>> DEFINE(VCPU_HPFAR, offsetof(struct kvm_vcpu, arch.hpfar));
>> DEFINE(VCPU_HYP_PC, offsetof(struct kvm_vcpu, arch.hyp_pc));
>> +#ifdef CONFIG_KVM_ARM_VGIC
>> + DEFINE(VCPU_VGIC_CPU, offsetof(struct kvm_vcpu, arch.vgic_cpu));
>> + DEFINE(VGIC_CPU_HCR, offsetof(struct vgic_cpu, vgic_hcr));
>> + DEFINE(VGIC_CPU_VMCR, offsetof(struct vgic_cpu, vgic_vmcr));
>> + DEFINE(VGIC_CPU_MISR, offsetof(struct vgic_cpu, vgic_misr));
>> + DEFINE(VGIC_CPU_EISR, offsetof(struct vgic_cpu, vgic_eisr));
>> + DEFINE(VGIC_CPU_ELRSR, offsetof(struct vgic_cpu, vgic_elrsr));
>> + DEFINE(VGIC_CPU_APR, offsetof(struct vgic_cpu, vgic_apr));
>> + DEFINE(VGIC_CPU_LR, offsetof(struct vgic_cpu, vgic_lr));
>> + DEFINE(VGIC_CPU_NR_LR, offsetof(struct vgic_cpu, nr_lr));
>> + DEFINE(KVM_VGIC_VCTRL, offsetof(struct kvm, arch.vgic.vctrl_base));
>> +#endif
>> DEFINE(KVM_VTTBR, offsetof(struct kvm, arch.vttbr));
>> #endif
>> return 0;
>> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
>> index 2ac8b4a..c2423d8 100644
>> --- a/arch/arm/kvm/interrupts_head.S
>> +++ b/arch/arm/kvm/interrupts_head.S
>> @@ -341,6 +341,45 @@
>> * @vcpup: Register pointing to VCPU struct
>> */
>> .macro save_vgic_state vcpup
>> +#ifdef CONFIG_KVM_ARM_VGIC
>> + /* Get VGIC VCTRL base into r2 */
>> + ldr r2, [\vcpup, #VCPU_KVM]
>> + ldr r2, [r2, #KVM_VGIC_VCTRL]
>> + cmp r2, #0
>> + beq 2f
>> +
>> + /* Compute the address of struct vgic_cpu */
>> + add r11, \vcpup, #VCPU_VGIC_CPU
>
> Given that we're dealing with constants, it would be more efficient to
> express this addition as part of the immediate offset and let gas spit
> out the final computed address for the stores below.
We had that for a while, and with the kvm_vcpu structure growing, we
ended up having fields out of the reach of an immediate offset. Is
kvm_vcpu too big? Yes.
>> +
>> + /* Save all interesting registers */
>> + ldr r3, [r2, #GICH_HCR]
>> + ldr r4, [r2, #GICH_VMCR]
>> + ldr r5, [r2, #GICH_MISR]
>> + ldr r6, [r2, #GICH_EISR0]
>> + ldr r7, [r2, #GICH_EISR1]
>> + ldr r8, [r2, #GICH_ELRSR0]
>> + ldr r9, [r2, #GICH_ELRSR1]
>> + ldr r10, [r2, #GICH_APR]
>> +
>> + str r3, [r11, #VGIC_CPU_HCR]
>> + str r4, [r11, #VGIC_CPU_VMCR]
>> + str r5, [r11, #VGIC_CPU_MISR]
>> + str r6, [r11, #VGIC_CPU_EISR]
>> + str r7, [r11, #(VGIC_CPU_EISR + 4)]
>> + str r8, [r11, #VGIC_CPU_ELRSR]
>> + str r9, [r11, #(VGIC_CPU_ELRSR + 4)]
>> + str r10, [r11, #VGIC_CPU_APR]
>> +
>> + /* Save list registers */
>> + add r2, r2, #GICH_LR0
>> + add r3, r11, #VGIC_CPU_LR
>> + ldr r4, [r11, #VGIC_CPU_NR_LR]
>> +1: ldr r6, [r2], #4
>> + str r6, [r3], #4
>> + subs r4, r4, #1
>> + bne 1b
>> +2:
>> +#endif
>> .endm
>
> Will
>
--
Jazz is not dead. It just smells funny...
* [PATCH v4 07/13] ARM: KVM: VGIC virtual CPU interface management
2012-12-03 14:11 ` Marc Zyngier
@ 2012-12-03 14:34 ` Will Deacon
2012-12-03 15:24 ` Marc Zyngier
2012-12-03 14:54 ` Christoffer Dall
1 sibling, 1 reply; 58+ messages in thread
From: Will Deacon @ 2012-12-03 14:34 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Dec 03, 2012 at 02:11:03PM +0000, Marc Zyngier wrote:
> On 03/12/12 13:23, Will Deacon wrote:
> >>
> >> +#define VGIC_HCR_EN (1 << 0)
> >> +#define VGIC_HCR_UIE (1 << 1)
> >> +
> >> +#define VGIC_LR_VIRTUALID (0x3ff << 0)
> >> +#define VGIC_LR_PHYSID_CPUID (7 << 10)
> >> +#define VGIC_LR_STATE (3 << 28)
> >> +#define VGIC_LR_PENDING_BIT (1 << 28)
> >> +#define VGIC_LR_ACTIVE_BIT (1 << 29)
> >> +#define VGIC_LR_EOI (1 << 19)
> >> +
> >> +#define VGIC_MISR_EOI (1 << 0)
> >> +#define VGIC_MISR_U (1 << 1)
> >> +
> >> +#define LR_EMPTY 0xff
> >> +
> >
> > Could stick these in asm/hardware/gic.h. I know they're not used by the gic
> > driver, but they're the same piece of architecture so it's probably worth
> > keeping in one place.
>
> This is on my list of things to do once the GIC code is shared between
> arm and arm64. Could do it earlier if that makes more sense.
Might as well as I found some others in a later patch too.
> >> static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
> >> {
> >> - return 0;
> >> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >> + unsigned long *pending, *enabled, *pend;
> >> + int vcpu_id;
> >> +
> >> + vcpu_id = vcpu->vcpu_id;
> >> + pend = vcpu->arch.vgic_cpu.pending;
> >> +
> >> + pending = vgic_bitmap_get_cpu_map(&dist->irq_state, vcpu_id);
> >> + enabled = vgic_bitmap_get_cpu_map(&dist->irq_enabled, vcpu_id);
> >> + bitmap_and(pend, pending, enabled, 32);
> >
> > pend and pending! vcpu_pending and dist_pending?
>
> A lot of that code has already been reworked. See:
> https://lists.cs.columbia.edu/pipermail/kvmarm/2012-November/004138.html
Argh, too much code! Ok, as long as it's being looked at.
> >> +
> >> + pending = vgic_bitmap_get_shared_map(&dist->irq_state);
> >> + enabled = vgic_bitmap_get_shared_map(&dist->irq_enabled);
> >> + bitmap_and(pend + 1, pending, enabled, VGIC_NR_SHARED_IRQS);
> >> + bitmap_and(pend + 1, pend + 1,
> >> + vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]),
> >> + VGIC_NR_SHARED_IRQS);
> >> +
> >> + return (find_first_bit(pend, VGIC_NR_IRQS) < VGIC_NR_IRQS);
> >> }
> >>
> >> /*
> >> @@ -613,6 +631,212 @@ static void vgic_update_state(struct kvm *kvm)
> >> }
> >> }
> >>
> >> +#define LR_PHYSID(lr) (((lr) & VGIC_LR_PHYSID_CPUID) >> 10)
> >
> > Is VGIC_LR_PHYSID_CPUID wide enough for this? The CPUID is only 3 bits, but
> the interrupt ID could be larger. Or do you not support hardware interrupt
> > forwarding? (in which case, LR_PHYSID is a misleading name).
>
> Hardware interrupt forwarding is not supported. PHYSID is the name of
> the actual field in the spec, hence the name of the macro. LR_CPUID?
Sure.
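i.e. something like the following standalone sketch of the renamed accessor (only the 3-bit CPUID part of the field is ever consumed, since hardware interrupt forwarding is unsupported):

```c
#include <assert.h>
#include <stdint.h>

#define VGIC_LR_PHYSID_CPUID	(7 << 10)

/* Renamed from LR_PHYSID: only the CPUID bits of the list register's
 * PHYSID field are used, so LR_CPUID is the less misleading name. */
#define LR_CPUID(lr)	(((lr) & VGIC_LR_PHYSID_CPUID) >> 10)
```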
> >> + kvm_debug("LR%d piggyback for IRQ%d %x\n", lr, irq, vgic_cpu->vgic_lr[lr]);
> >> + BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
> >> + vgic_cpu->vgic_lr[lr] |= VGIC_LR_PENDING_BIT;
> >> + if (is_level)
> >> + vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
> >> + return true;
> >> + }
> >> +
> >> + /* Try to use another LR for this interrupt */
> >> + lr = find_first_bit((unsigned long *)vgic_cpu->vgic_elrsr,
> >> + vgic_cpu->nr_lr);
> >> + if (lr >= vgic_cpu->nr_lr)
> >> + return false;
> >> +
> >> + kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id);
> >> + vgic_cpu->vgic_lr[lr] = MK_LR_PEND(sgi_source_id, irq);
> >> + if (is_level)
> >> + vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
> >> +
> >> + vgic_cpu->vgic_irq_lr_map[irq] = lr;
> >> + clear_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr);
> >> + set_bit(lr, vgic_cpu->lr_used);
> >> +
> >> + return true;
> >> +}
> >
> > I can't help but feel that this could be made cleaner by moving the
> > level-specific EOI handling out into a separate function.
>
> Do you mean having two functions, one for edge and the other for level?
> Seems overkill to me. I could move the "if (is_level) ..." to a common
> spot though.
Indeed, you could just have something like vgic_eoi_irq and call that
in one place, letting that function do the level check.
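A minimal sketch of that idea (function name and shape are hypothetical; the real helper would likely operate on the vgic_cpu state directly):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VGIC_LR_EOI	(1 << 19)

/* Hypothetical vgic_eoi_irq-style helper: do the level check in one
 * place and tag the list register value with the EOI maintenance bit,
 * instead of repeating "if (is_level) ... |= VGIC_LR_EOI" at each
 * call site. */
static uint32_t vgic_lr_set_eoi(uint32_t lr_val, bool is_level)
{
	if (is_level)
		lr_val |= VGIC_LR_EOI;
	return lr_val;
}
```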
> >> +
> >> +/*
> >> + * Fill the list registers with pending interrupts before running the
> >> + * guest.
> >> + */
> >> +static void __kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu)
> >> +{
> >> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >> + unsigned long *pending;
> >> + int i, c, vcpu_id;
> >> + int overflow = 0;
> >> +
> >> + vcpu_id = vcpu->vcpu_id;
> >> +
> >> + /*
> >> + * We may not have any pending interrupt, or the interrupts
> >> + * may have been serviced from another vcpu. In all cases,
> >> + * move along.
> >> + */
> >> + if (!kvm_vgic_vcpu_pending_irq(vcpu)) {
> >> + pr_debug("CPU%d has no pending interrupt\n", vcpu_id);
> >> + goto epilog;
> >> + }
> >> +
> >> + /* SGIs */
> >> + pending = vgic_bitmap_get_cpu_map(&dist->irq_state, vcpu_id);
> >> + for_each_set_bit(i, vgic_cpu->pending, 16) {
> >> + unsigned long sources;
> >> +
> >> + sources = dist->irq_sgi_sources[vcpu_id][i];
> >> + for_each_set_bit(c, &sources, 8) {
> >> + if (!vgic_queue_irq(vcpu, c, i)) {
> >> + overflow = 1;
> >> + continue;
> >> + }
> >
> > If there are multiple sources, why do you need to queue the interrupt
> > multiple times? I would have thought it could be collapsed into one.
>
> Because SGIs from different sources *are* different interrupts. In an
> n-CPU system (with n > 2), you could have some message passing system
> based on interrupts, and you'd need to know which CPU is pinging you.
Ok, fair point.
> >> +
> >> + clear_bit(c, &sources);
> >> + }
> >> +
> >> + if (!sources)
> >> + clear_bit(i, pending);
> >
> > What does this signify and how does it happen? An SGI without a source
> > sounds pretty weird...
>
> See the clear_bit() just above. Once all the sources for this SGI are
> cleared, we can make the interrupt not pending anymore.
Yup, missed that.
> >> +/*
> >> + * Sync back the VGIC state after a guest run. We do not really touch
> >> + * the distributor here (the irq_pending_on_cpu bit is safe to set),
> >> + * so there is no need for taking its lock.
> >> + */
> >> +static void __kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu)
> >> +{
> >> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >> + int lr, pending;
> >> +
> >> + /* Clear mappings for empty LRs */
> >> + for_each_set_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr,
> >> + vgic_cpu->nr_lr) {
> >> + int irq;
> >> +
> >> + if (!test_and_clear_bit(lr, vgic_cpu->lr_used))
> >> + continue;
> >> +
> >> + irq = vgic_cpu->vgic_lr[lr] & VGIC_LR_VIRTUALID;
> >> +
> >> + BUG_ON(irq >= VGIC_NR_IRQS);
> >> + vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
> >> + }
> >> +
> >> + /* Check if we still have something up our sleeve... */
> >> + pending = find_first_zero_bit((unsigned long *)vgic_cpu->vgic_elrsr,
> >> + vgic_cpu->nr_lr);
> >
> > Does this rely on timeliness of maintenance interrupts with respect to
> > EOIs in the guest? i.e. if a maintenance interrupt is delayed (I can't
> > see anything in the spec stating that they're synchronous) and you end up
> > taking one here, will you accidentally re-pend the interrupt?
>
> I don't think so. ELRSR only indicates that the list register is empty.
> If we find a zero bit there, we flag that this vcpu has at least one
> pending interrupt (in its list registers). A delayed maintenance
> interrupt may race with this by also setting this bit if an interrupt is
> still in the active state after being EOIed, but that's not a problem
> (we just set_bit twice). A race between clear and set would be
> problematic though.
Hmm, yes, the EOI maintenance handler only sets pending IRQs. So, to turn it
around, how about __kvm_vgic_sync_to_cpu? There is a comment in the
maintenance handler about it:
* level interrupt. There is a potential race with
* the queuing of an interrupt in __kvm_sync_to_cpu(), where we check
* if the interrupt is already active. Two possibilities:
*
* - The queuing is occurring on the same vcpu: cannot happen, as we're
* already in the context of this vcpu, and executing the handler
Does this still apply if the maintenance interrupt comes in late? It will
then look like the stopped vcpu just EOId an interrupt...
> >> + if (pending < vgic_cpu->nr_lr) {
> >> + set_bit(vcpu->vcpu_id, &dist->irq_pending_on_cpu);
> >> + smp_mb();
> >
> > What's this barrier for?
>
> It is strategically placed to entertain the reviewer. And it does its
> job! I'll nuke it, now that you found it. ;-)
Excellent! I think there may be another one on the horizon when I get into
the maintenance interrupt handler proper too. Looking forward to it.
Will
* [PATCH v4 07/13] ARM: KVM: VGIC virtual CPU interface management
2012-12-03 14:11 ` Marc Zyngier
2012-12-03 14:34 ` Will Deacon
@ 2012-12-03 14:54 ` Christoffer Dall
1 sibling, 0 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-12-03 14:54 UTC (permalink / raw)
To: linux-arm-kernel
[...]
>
>>> +
>>> + clear_bit(c, &sources);
>>> + }
>>> +
>>> + if (!sources)
>>> + clear_bit(i, pending);
>>
>> What does this signify and how does it happen? An SGI without a source
>> sounds pretty weird...
>
> See the clear_bit() just above. Once all the sources for this SGI are
> cleared, we can make the interrupt not pending anymore.
>
every time I read the code, I get completely bogged down trying to
understand this case, and I tell myself we should put a comment here.
Then I understand why it happens and think, oh, it's obvious, no
comment needed. But now I have (almost) forgotten again. Could we add a
comment?
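For what it's worth, a self-contained model of the invariant (hypothetical names, plain C rather than kernel code) might make it explicit:

```c
#include <assert.h>
#include <stdint.h>

/* Each pending SGI carries a bitmap of source CPUs. A per-source copy
 * is queued into a free list register; the SGI is retired from the
 * distributor's pending state only once every source bit is clear. */
static int queue_sgi_sources(uint8_t *sources, int free_lrs)
{
	int queued = 0;
	int c;

	for (c = 0; c < 8; c++) {
		if (!(*sources & (1 << c)))
			continue;
		if (queued == free_lrs)		/* no free LR: leave source set */
			break;
		*sources &= ~(1 << c);		/* this source's copy is in an LR */
		queued++;
	}

	/* Caller clears the SGI's pending bit iff *sources == 0: an SGI
	 * with no remaining sources has been fully delivered. */
	return queued;
}
```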
-Christoffer
* [PATCH v4 09/13] ARM: KVM: VGIC interrupt injection
2012-12-03 14:21 ` Marc Zyngier
@ 2012-12-03 14:58 ` Christoffer Dall
2012-12-03 19:13 ` Christoffer Dall
1 sibling, 0 replies; 58+ messages in thread
From: Christoffer Dall @ 2012-12-03 14:58 UTC (permalink / raw)
To: linux-arm-kernel
[...]
>>> +
>>> +static bool vgic_update_irq_state(struct kvm *kvm, int cpuid,
>>> + unsigned int irq_num, bool level)
>>> +{
>>> + struct vgic_dist *dist = &kvm->arch.vgic;
>>> + struct kvm_vcpu *vcpu;
>>> + int is_edge, is_level, state;
>>> + int enabled;
>>> + bool ret = true;
>>> +
>>> + spin_lock(&dist->lock);
>>> +
>>> + is_edge = vgic_irq_is_edge(dist, irq_num);
>>> + is_level = !is_edge;
>>> + state = vgic_bitmap_get_irq_val(&dist->irq_state, cpuid, irq_num);
>>> +
>>> + /*
>>> + * Only inject an interrupt if:
>>> + * - level triggered and we change level
>>> + * - edge triggered and we have a rising edge
>>> + */
>>> + if ((is_level && !(state ^ level)) || (is_edge && (state || !level))) {
>>> + ret = false;
>>> + goto out;
>>> + }
>>
>> Eek, more of the edge/level combo. Can this be restructured so that we
>> have vgic_update_{edge,level}_irq_state, which are called from here
>> appropriately?
>
> I'll have a look.
>
oh, you're no fun anymore. That if statement is one of the funniest
pieces of this code.
-Christoffer
* [PATCH v4 07/13] ARM: KVM: VGIC virtual CPU interface management
2012-12-03 14:34 ` Will Deacon
@ 2012-12-03 15:24 ` Marc Zyngier
0 siblings, 0 replies; 58+ messages in thread
From: Marc Zyngier @ 2012-12-03 15:24 UTC (permalink / raw)
To: linux-arm-kernel
On 03/12/12 14:34, Will Deacon wrote:
> On Mon, Dec 03, 2012 at 02:11:03PM +0000, Marc Zyngier wrote:
>> On 03/12/12 13:23, Will Deacon wrote:
>>>>
>>>> +#define VGIC_HCR_EN (1 << 0)
>>>> +#define VGIC_HCR_UIE (1 << 1)
>>>> +
>>>> +#define VGIC_LR_VIRTUALID (0x3ff << 0)
>>>> +#define VGIC_LR_PHYSID_CPUID (7 << 10)
>>>> +#define VGIC_LR_STATE (3 << 28)
>>>> +#define VGIC_LR_PENDING_BIT (1 << 28)
>>>> +#define VGIC_LR_ACTIVE_BIT (1 << 29)
>>>> +#define VGIC_LR_EOI (1 << 19)
>>>> +
>>>> +#define VGIC_MISR_EOI (1 << 0)
>>>> +#define VGIC_MISR_U (1 << 1)
>>>> +
>>>> +#define LR_EMPTY 0xff
>>>> +
>>>
>>> Could stick these in asm/hardware/gic.h. I know they're not used by the gic
>>> driver, but they're the same piece of architecture so it's probably worth
>>> keeping in one place.
>>
>> This is on my list of things to do once the GIC code is shared between
>> arm and arm64. Could do it earlier if that makes more sense.
>
> Might as well as I found some others in a later patch too.
>
>>>> static int compute_pending_for_cpu(struct kvm_vcpu *vcpu)
>>>> {
>>>> - return 0;
>>>> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>>> + unsigned long *pending, *enabled, *pend;
>>>> + int vcpu_id;
>>>> +
>>>> + vcpu_id = vcpu->vcpu_id;
>>>> + pend = vcpu->arch.vgic_cpu.pending;
>>>> +
>>>> + pending = vgic_bitmap_get_cpu_map(&dist->irq_state, vcpu_id);
>>>> + enabled = vgic_bitmap_get_cpu_map(&dist->irq_enabled, vcpu_id);
>>>> + bitmap_and(pend, pending, enabled, 32);
>>>
>>> pend and pending! vcpu_pending and dist_pending?
>>
>> A lot of that code has already been reworked. See:
>> https://lists.cs.columbia.edu/pipermail/kvmarm/2012-November/004138.html
>
> Argh, too much code! Ok, as long as it's being looked at.
>
>>>> +
>>>> + pending = vgic_bitmap_get_shared_map(&dist->irq_state);
>>>> + enabled = vgic_bitmap_get_shared_map(&dist->irq_enabled);
>>>> + bitmap_and(pend + 1, pending, enabled, VGIC_NR_SHARED_IRQS);
>>>> + bitmap_and(pend + 1, pend + 1,
>>>> + vgic_bitmap_get_shared_map(&dist->irq_spi_target[vcpu_id]),
>>>> + VGIC_NR_SHARED_IRQS);
>>>> +
>>>> + return (find_first_bit(pend, VGIC_NR_IRQS) < VGIC_NR_IRQS);
>>>> }
>>>>
>>>> /*
>>>> @@ -613,6 +631,212 @@ static void vgic_update_state(struct kvm *kvm)
>>>> }
>>>> }
>>>>
>>>> +#define LR_PHYSID(lr) (((lr) & VGIC_LR_PHYSID_CPUID) >> 10)
>>>
>>> Is VGIC_LR_PHYSID_CPUID wide enough for this? The CPUID is only 3 bits, but
>>> the interrupt ID could be larger. Or do you not support hardware interrupt
>>> forwarding? (in which case, LR_PHYSID is a misleading name).
>>
>> Hardware interrupt forwarding is not supported. PHYSID is the name of
>> the actual field in the spec, hence the name of the macro. LR_CPUID?
>
> Sure.
>
>>>> + kvm_debug("LR%d piggyback for IRQ%d %x\n", lr, irq, vgic_cpu->vgic_lr[lr]);
>>>> + BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
>>>> + vgic_cpu->vgic_lr[lr] |= VGIC_LR_PENDING_BIT;
>>>> + if (is_level)
>>>> + vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
>>>> + return true;
>>>> + }
>>>> +
>>>> + /* Try to use another LR for this interrupt */
>>>> + lr = find_first_bit((unsigned long *)vgic_cpu->vgic_elrsr,
>>>> + vgic_cpu->nr_lr);
>>>> + if (lr >= vgic_cpu->nr_lr)
>>>> + return false;
>>>> +
>>>> + kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id);
>>>> + vgic_cpu->vgic_lr[lr] = MK_LR_PEND(sgi_source_id, irq);
>>>> + if (is_level)
>>>> + vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
>>>> +
>>>> + vgic_cpu->vgic_irq_lr_map[irq] = lr;
>>>> + clear_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr);
>>>> + set_bit(lr, vgic_cpu->lr_used);
>>>> +
>>>> + return true;
>>>> +}
>>>
>>> I can't help but feel that this could be made cleaner by moving the
>>> level-specific EOI handling out into a separate function.
>>
>> Do you mean having two functions, one for edge and the other for level?
>> Seems overkill to me. I could move the "if (is_level) ..." to a common
>> spot though.
>
> Indeed, you could just have something like vgic_eoi_irq and call that
> in one place, letting that function do the level check.
>
>>>> +
>>>> +/*
>>>> + * Fill the list registers with pending interrupts before running the
>>>> + * guest.
>>>> + */
>>>> +static void __kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu)
>>>> +{
>>>> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>>> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>>> + unsigned long *pending;
>>>> + int i, c, vcpu_id;
>>>> + int overflow = 0;
>>>> +
>>>> + vcpu_id = vcpu->vcpu_id;
>>>> +
>>>> + /*
>>>> + * We may not have any pending interrupt, or the interrupts
>>>> + * may have been serviced from another vcpu. In all cases,
>>>> + * move along.
>>>> + */
>>>> + if (!kvm_vgic_vcpu_pending_irq(vcpu)) {
>>>> + pr_debug("CPU%d has no pending interrupt\n", vcpu_id);
>>>> + goto epilog;
>>>> + }
>>>> +
>>>> + /* SGIs */
>>>> + pending = vgic_bitmap_get_cpu_map(&dist->irq_state, vcpu_id);
>>>> + for_each_set_bit(i, vgic_cpu->pending, 16) {
>>>> + unsigned long sources;
>>>> +
>>>> + sources = dist->irq_sgi_sources[vcpu_id][i];
>>>> + for_each_set_bit(c, &sources, 8) {
>>>> + if (!vgic_queue_irq(vcpu, c, i)) {
>>>> + overflow = 1;
>>>> + continue;
>>>> + }
>>>
>>> If there are multiple sources, why do you need to queue the interrupt
>>> multiple times? I would have thought it could be collapsed into one.
>>
>> Because SGIs from different sources *are* different interrupts. In an
>> n-CPU system (with n > 2), you could have some message passing system
>> based on interrupts, and you'd need to know which CPU is pinging you.
>
> Ok, fair point.
>
>>>> +
>>>> + clear_bit(c, &sources);
>>>> + }
>>>> +
>>>> + if (!sources)
>>>> + clear_bit(i, pending);
>>>
>>> What does this signify and how does it happen? An SGI without a source
>>> sounds pretty weird...
>>
>> See the clear_bit() just above. Once all the sources for this SGI are
>> cleared, we can make the interrupt not pending anymore.
>
> Yup, missed that.
>
>>>> +/*
>>>> + * Sync back the VGIC state after a guest run. We do not really touch
>>>> + * the distributor here (the irq_pending_on_cpu bit is safe to set),
>>>> + * so there is no need for taking its lock.
>>>> + */
>>>> +static void __kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu)
>>>> +{
>>>> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>>> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>>> + int lr, pending;
>>>> +
>>>> + /* Clear mappings for empty LRs */
>>>> + for_each_set_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr,
>>>> + vgic_cpu->nr_lr) {
>>>> + int irq;
>>>> +
>>>> + if (!test_and_clear_bit(lr, vgic_cpu->lr_used))
>>>> + continue;
>>>> +
>>>> + irq = vgic_cpu->vgic_lr[lr] & VGIC_LR_VIRTUALID;
>>>> +
>>>> + BUG_ON(irq >= VGIC_NR_IRQS);
>>>> + vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
>>>> + }
>>>> +
>>>> + /* Check if we still have something up our sleeve... */
>>>> + pending = find_first_zero_bit((unsigned long *)vgic_cpu->vgic_elrsr,
>>>> + vgic_cpu->nr_lr);
>>>
>>> Does this rely on timeliness of maintenance interrupts with respect to
>>> EOIs in the guest? i.e. if a maintenance interrupt is delayed (I can't
>>> see anything in the spec stating that they're synchronous) and you end up
>>> taking one here, will you accidentally re-pend the interrupt?
>>
>> I don't think so. ELRSR only indicates that the list register is empty.
>> If we find a zero bit there, we flag that this vcpu has at least one
>> pending interrupt (in its list registers). A delayed maintenance
>> interrupt may race with this by also setting this bit if an interrupt is
>> still in the active state after being EOIed, but that's not a problem
>> (we just set_bit twice). A race between clear and set would be
>> problematic though.
>
> Hmm, yes, the EOI maintenance handler only sets pending IRQs. So, to turn it
> around, how about __kvm_vgic_sync_to_cpu? There is a comment in the
> maintenance handler about it:
>
>
> * level interrupt. There is a potential race with
> * the queuing of an interrupt in __kvm_sync_to_cpu(), where we check
> * if the interrupt is already active. Two possibilities:
> *
> * - The queuing is occurring on the same vcpu: cannot happen, as we're
> * already in the context of this vcpu, and executing the handler
>
>
> Does this still apply if the maintenance interrupt comes in late? It will
> then look like the stopped vcpu just EOId an interrupt...
Gniiii... Yup, there is a race at the end of __kvm_vgic_sync_to_cpu(),
when we decide we've injected all non-active pending interrupts. The
maintenance interrupt could fire just before the clear_bit, and we'd
lose the now-pending interrupt for a round. Probably not fatal, but still.
I think I'll use spin_lock_irqsave() in kvm_vgic_sync_to_cpu(), it will
save me a lot of headache.
But the ugliest thing with the maintenance interrupt is that if it is
delayed for long enough, you could end up messing with the wrong vcpu,
or no vcpu at all. But I don't think there is much you can do about
this. If your hardware is broken enough to deliver late VGIC interrupts,
we're screwed.
>>>> + if (pending < vgic_cpu->nr_lr) {
>>>> + set_bit(vcpu->vcpu_id, &dist->irq_pending_on_cpu);
>>>> + smp_mb();
>>>
>>> What's this barrier for?
>>
>> It is strategically placed to entertain the reviewer. And it does its
>> job! I'll nuke it, now that you found it. ;-)
>
> Excellent! I think there may be another one on the horizon when I get into
> the maintenance interrupt handler proper too. Looking forward to it.
Enjoy!
M.
--
Jazz is not dead. It just smells funny...
* [PATCH v4 09/13] ARM: KVM: VGIC interrupt injection
2012-12-03 14:21 ` Marc Zyngier
2012-12-03 14:58 ` Christoffer Dall
@ 2012-12-03 19:13 ` Christoffer Dall
2012-12-03 19:22 ` Marc Zyngier
1 sibling, 1 reply; 58+ messages in thread
From: Christoffer Dall @ 2012-12-03 19:13 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Dec 3, 2012 at 9:21 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 03/12/12 13:25, Will Deacon wrote:
>> On Sat, Nov 10, 2012 at 03:45:18PM +0000, Christoffer Dall wrote:
>>> From: Marc Zyngier <marc.zyngier@arm.com>
>>>
>>> Plug the interrupt injection code. Interrupts can now be generated
>>> from user space.
>>>
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>>> ---
>>> arch/arm/include/asm/kvm_vgic.h | 8 +++
>>> arch/arm/kvm/arm.c | 29 +++++++++++++
>>> arch/arm/kvm/vgic.c | 90 +++++++++++++++++++++++++++++++++++++++
>>> 3 files changed, 127 insertions(+)
>>>
>>> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
>>> index 7229324..6e3d303 100644
>>> --- a/arch/arm/include/asm/kvm_vgic.h
>>> +++ b/arch/arm/include/asm/kvm_vgic.h
>>> @@ -241,6 +241,8 @@ struct kvm_exit_mmio;
>>> int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
>>> void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu);
>>> void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu);
>>> +int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>> + bool level);
>>> int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>> bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>>> struct kvm_exit_mmio *mmio);
>>> @@ -271,6 +273,12 @@ static inline void kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu) {}
>>> static inline void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu) {}
>>> static inline void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu) {}
>>>
>>> +static inline int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid,
>>> + const struct kvm_irq_level *irq)
>>> +{
>>> + return 0;
>>> +}
>>> +
>>> static inline int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
>>> {
>>> return 0;
>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>> index 3ac1aab..f43da01 100644
>>> --- a/arch/arm/kvm/arm.c
>>> +++ b/arch/arm/kvm/arm.c
>>> @@ -764,10 +764,31 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level)
>>>
>>> switch (irq_type) {
>>> case KVM_ARM_IRQ_TYPE_CPU:
>>> + if (irqchip_in_kernel(kvm))
>>> + return -ENXIO;
>>> +
>>> if (irq_num > KVM_ARM_IRQ_CPU_FIQ)
>>> return -EINVAL;
>>>
>>> return vcpu_interrupt_line(vcpu, irq_num, level);
>>> +#ifdef CONFIG_KVM_ARM_VGIC
>>> + case KVM_ARM_IRQ_TYPE_PPI:
>>> + if (!irqchip_in_kernel(kvm))
>>> + return -ENXIO;
>>> +
>>> + if (irq_num < 16 || irq_num > 31)
>>> + return -EINVAL;
>>
>> It's our favourite two numbers again! :)
>
> I already fixed a number of them. Probably missed this one though.
>
>>> +
>>> + return kvm_vgic_inject_irq(kvm, vcpu->vcpu_id, irq_num, level);
>>> + case KVM_ARM_IRQ_TYPE_SPI:
>>> + if (!irqchip_in_kernel(kvm))
>>> + return -ENXIO;
>>> +
>>> + if (irq_num < 32 || irq_num > KVM_ARM_IRQ_GIC_MAX)
>>> + return -EINVAL;
>>> +
>>> + return kvm_vgic_inject_irq(kvm, 0, irq_num, level);
>>> +#endif
>>> }
>>>
>>> return -EINVAL;
>>> @@ -849,6 +870,14 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>> void __user *argp = (void __user *)arg;
>>>
>>> switch (ioctl) {
>>> +#ifdef CONFIG_KVM_ARM_VGIC
>>> + case KVM_CREATE_IRQCHIP: {
>>> + if (vgic_present)
>>> + return kvm_vgic_create(kvm);
>>> + else
>>> + return -EINVAL;
>>
>> ENXIO? At least, that's what you use when setting the GIC addresses.
>
> -EINVAL seems to be one of the values other archs are using. -ENXIO is
> not one of them for KVM_CREATE_IRQCHIP. Doesn't mean they are right, but
> for the sake of keeping userspace happy, I'm not really inclined to
> change this.
>
We don't have user space code relying on this, and EINVAL is
misleading, so let's use ENXIO to be consistent with
SET_DEVICE_ADDRESS. No error values are specified in the API docs, so
we should use the most appropriate one.
You fix?
-Christoffer
^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v4 09/13] ARM: KVM: VGIC interrupt injection
2012-12-03 19:13 ` Christoffer Dall
@ 2012-12-03 19:22 ` Marc Zyngier
0 siblings, 0 replies; 58+ messages in thread
From: Marc Zyngier @ 2012-12-03 19:22 UTC (permalink / raw)
To: linux-arm-kernel
On 03/12/12 19:13, Christoffer Dall wrote:
> On Mon, Dec 3, 2012 at 9:21 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On 03/12/12 13:25, Will Deacon wrote:
>>> On Sat, Nov 10, 2012 at 03:45:18PM +0000, Christoffer Dall wrote:
>>>> From: Marc Zyngier <marc.zyngier@arm.com>
>>>>
>>>> Plug the interrupt injection code. Interrupts can now be generated
>>>> from user space.
>>>>
>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>>>> ---
>>>> arch/arm/include/asm/kvm_vgic.h | 8 +++
>>>> arch/arm/kvm/arm.c | 29 +++++++++++++
>>>> arch/arm/kvm/vgic.c | 90 +++++++++++++++++++++++++++++++++++++++
>>>> 3 files changed, 127 insertions(+)
>>>>
>>>> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
>>>> index 7229324..6e3d303 100644
>>>> --- a/arch/arm/include/asm/kvm_vgic.h
>>>> +++ b/arch/arm/include/asm/kvm_vgic.h
>>>> @@ -241,6 +241,8 @@ struct kvm_exit_mmio;
>>>> int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
>>>> void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu);
>>>> void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu);
>>>> +int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>> + bool level);
>>>> int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>> bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>>>> struct kvm_exit_mmio *mmio);
>>>> @@ -271,6 +273,12 @@ static inline void kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu) {}
>>>> static inline void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu) {}
>>>> static inline void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu) {}
>>>>
>>>> +static inline int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid,
>>>> + const struct kvm_irq_level *irq)
>>>> +{
>>>> + return 0;
>>>> +}
>>>> +
>>>> static inline int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
>>>> {
>>>> return 0;
>>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>>> index 3ac1aab..f43da01 100644
>>>> --- a/arch/arm/kvm/arm.c
>>>> +++ b/arch/arm/kvm/arm.c
>>>> @@ -764,10 +764,31 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level)
>>>>
>>>> switch (irq_type) {
>>>> case KVM_ARM_IRQ_TYPE_CPU:
>>>> + if (irqchip_in_kernel(kvm))
>>>> + return -ENXIO;
>>>> +
>>>> if (irq_num > KVM_ARM_IRQ_CPU_FIQ)
>>>> return -EINVAL;
>>>>
>>>> return vcpu_interrupt_line(vcpu, irq_num, level);
>>>> +#ifdef CONFIG_KVM_ARM_VGIC
>>>> + case KVM_ARM_IRQ_TYPE_PPI:
>>>> + if (!irqchip_in_kernel(kvm))
>>>> + return -ENXIO;
>>>> +
>>>> + if (irq_num < 16 || irq_num > 31)
>>>> + return -EINVAL;
>>>
>>> It's our favourite two numbers again! :)
>>
>> I already fixed a number of them. Probably missed this one though.
>>
>>>> +
>>>> + return kvm_vgic_inject_irq(kvm, vcpu->vcpu_id, irq_num, level);
>>>> + case KVM_ARM_IRQ_TYPE_SPI:
>>>> + if (!irqchip_in_kernel(kvm))
>>>> + return -ENXIO;
>>>> +
>>>> + if (irq_num < 32 || irq_num > KVM_ARM_IRQ_GIC_MAX)
>>>> + return -EINVAL;
>>>> +
>>>> + return kvm_vgic_inject_irq(kvm, 0, irq_num, level);
>>>> +#endif
>>>> }
>>>>
>>>> return -EINVAL;
>>>> @@ -849,6 +870,14 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>>> void __user *argp = (void __user *)arg;
>>>>
>>>> switch (ioctl) {
>>>> +#ifdef CONFIG_KVM_ARM_VGIC
>>>> + case KVM_CREATE_IRQCHIP: {
>>>> + if (vgic_present)
>>>> + return kvm_vgic_create(kvm);
>>>> + else
>>>> + return -EINVAL;
>>>
>>> ENXIO? At least, that's what you use when setting the GIC addresses.
>>
>> -EINVAL seems to be one of the values other archs are using. -ENXIO is
>> not one of them for KVM_CREATE_IRQCHIP. Doesn't mean they are right, but
>> for the sake of keeping userspace happy, I'm not really inclined to
>> change this.
>>
>
> We don't have user space code relying on this, and EINVAL is
> misleading, so let's use ENXIO to be consistent with
> SET_DEVICE_ADDRESS. No error values are specified in the API docs, so
> we should use the most appropriate one.
>
> You fix?
Yes, I'll fix it as part of the whole vgic series.
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v4 11/13] ARM: KVM: VGIC initialisation code
2012-11-10 15:45 ` [PATCH v4 11/13] ARM: KVM: VGIC initialisation code Christoffer Dall
@ 2012-12-05 10:43 ` Will Deacon
0 siblings, 0 replies; 58+ messages in thread
From: Will Deacon @ 2012-12-05 10:43 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 10, 2012 at 03:45:32PM +0000, Christoffer Dall wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
>
> Add the init code for the hypervisor, the virtual machine, and
> the virtual CPUs.
>
> An interrupt handler is also wired to allow the VGIC maintenance
> interrupts, used to deal with level triggered interrupts and LR
> underflows.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> ---
> arch/arm/include/asm/kvm_vgic.h | 11 ++
> arch/arm/kvm/arm.c | 14 ++
> arch/arm/kvm/vgic.c | 237 +++++++++++++++++++++++++++++++++++++++
> 3 files changed, 258 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
> index 6e3d303..a8e7a93 100644
> --- a/arch/arm/include/asm/kvm_vgic.h
> +++ b/arch/arm/include/asm/kvm_vgic.h
> @@ -154,6 +154,7 @@ static inline void vgic_bytemap_set_irq_val(struct vgic_bytemap *x,
> struct vgic_dist {
> #ifdef CONFIG_KVM_ARM_VGIC
> spinlock_t lock;
> + bool ready;
>
> /* Virtual control interface mapping */
> void __iomem *vctrl_base;
> @@ -239,6 +240,10 @@ struct kvm_exit_mmio;
>
> #ifdef CONFIG_KVM_ARM_VGIC
> int kvm_vgic_set_addr(struct kvm *kvm, unsigned long type, u64 addr);
> +int kvm_vgic_hyp_init(void);
> +int kvm_vgic_init(struct kvm *kvm);
> +int kvm_vgic_create(struct kvm *kvm);
> +void kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu);
> void kvm_vgic_sync_to_cpu(struct kvm_vcpu *vcpu);
> void kvm_vgic_sync_from_cpu(struct kvm_vcpu *vcpu);
> int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> @@ -248,6 +253,7 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> struct kvm_exit_mmio *mmio);
>
> #define irqchip_in_kernel(k) (!!((k)->arch.vgic.vctrl_base))
> +#define vgic_initialized(k) ((k)->arch.vgic.ready)
> #else
> static inline int kvm_vgic_hyp_init(void)
> {
> @@ -294,6 +300,11 @@ static inline int irqchip_in_kernel(struct kvm *kvm)
> {
> return 0;
> }
> +
> +static inline bool vgic_initialized(struct kvm *kvm)
> +{
> + return true;
> +}
> #endif
Shouldn't this return false?
>
> #endif
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index f43da01..a633d9d 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -187,6 +187,8 @@ int kvm_dev_ioctl_check_extension(long ext)
> switch (ext) {
> #ifdef CONFIG_KVM_ARM_VGIC
> case KVM_CAP_IRQCHIP:
> + r = vgic_present;
> + break;
> #endif
> case KVM_CAP_USER_MEMORY:
> case KVM_CAP_DESTROY_MEMORY_REGION_WORKS:
> @@ -623,6 +625,14 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
> if (unlikely(vcpu->arch.target < 0))
> return -ENOEXEC;
>
> + /* Initalize the VGIC before running a vcpu the first time on this VM */
Initialize
> + if (unlikely(irqchip_in_kernel(vcpu->kvm) &&
Why unlikely? I'd just drop the hint.
> + !vgic_initialized(vcpu->kvm))) {
> + ret = kvm_vgic_init(vcpu->kvm);
> + if (ret)
> + return ret;
> + }
> +
> if (run->exit_reason == KVM_EXIT_MMIO) {
> ret = kvm_handle_mmio_return(vcpu, vcpu->run);
> if (ret)
> @@ -1024,8 +1034,8 @@ static int init_hyp_mode(void)
> * Init HYP view of VGIC
> */
> err = kvm_vgic_hyp_init();
> - if (err)
> - goto out_free_mappings;
> + if (!err)
> + vgic_present = true;
>
> return 0;
Is that an intended change in behaviour? If kvm_vgic_hyp_init fails, you'll
return 0...
> out_free_vfp:
> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
> index 70040bb..415ddb8 100644
> --- a/arch/arm/kvm/vgic.c
> +++ b/arch/arm/kvm/vgic.c
> @@ -20,7 +20,14 @@
> #include <linux/kvm_host.h>
> #include <linux/interrupt.h>
> #include <linux/io.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_irq.h>
> +
> #include <asm/kvm_emulate.h>
> +#include <asm/hardware/gic.h>
> +#include <asm/kvm_arm.h>
> +#include <asm/kvm_mmu.h>
>
> /*
> * How the whole thing works (courtesy of Christoffer Dall):
> @@ -59,11 +66,18 @@
> */
>
> #define VGIC_ADDR_UNDEF (-1)
> -#define IS_VGIC_ADDR_UNDEF(_x) ((_x) == (typeof(_x))VGIC_ADDR_UNDEF)
> +#define IS_VGIC_ADDR_UNDEF(_x) ((_x) == VGIC_ADDR_UNDEF)
Huh?
>
> #define VGIC_DIST_SIZE 0x1000
> #define VGIC_CPU_SIZE 0x2000
>
> +/* Physical address of vgic virtual cpu interface */
> +static phys_addr_t vgic_vcpu_base;
> +
> +/* Virtual control interface base address */
> +static void __iomem *vgic_vctrl_base;
> +
> +static struct device_node *vgic_node;
>
> #define ACCESS_READ_VALUE (1 << 0)
> #define ACCESS_READ_RAZ (0 << 0)
> @@ -527,7 +541,7 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_exi
>
> if (!irqchip_in_kernel(vcpu->kvm) ||
> mmio->phys_addr < base ||
> - (mmio->phys_addr + mmio->len) > (base + dist->vgic_dist_size))
> + (mmio->phys_addr + mmio->len) > (base + VGIC_DIST_SIZE))
> return false;
Again, this is an odd hunk. It looks like you've accidentally got some older
context lying around in this patch.
>
> range = find_matching_range(vgic_ranges, mmio, base);
> @@ -957,6 +971,225 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> return 0;
> }
>
> +static irqreturn_t vgic_maintenance_handler(int irq, void *data)
> +{
> + struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)data;
> + struct vgic_dist *dist;
> + struct vgic_cpu *vgic_cpu;
> +
> + if (WARN(!vcpu,
> + "VGIC interrupt on CPU %d with no vcpu\n", smp_processor_id()))
> + return IRQ_HANDLED;
We need to get an answer about the potential delaying of this interrupt,
otherwise this might kick more often than we'd like.
> +
> + vgic_cpu = &vcpu->arch.vgic_cpu;
> + dist = &vcpu->kvm->arch.vgic;
> + kvm_debug("MISR = %08x\n", vgic_cpu->vgic_misr);
> +
> + /*
> + * We do not need to take the distributor lock here, since the only
> + * action we perform is clearing the irq_active_bit for an EOIed
> + * level interrupt. There is a potential race with
> + * the queuing of an interrupt in __kvm_sync_to_cpu(), where we check
> + * if the interrupt is already active. Two possibilities:
> + *
> + * - The queuing is occuring on the same vcpu: cannot happen, as we're
> + * already in the context of this vcpu, and executing the handler
> + * - The interrupt has been migrated to another vcpu, and we ignore
> + * this interrupt for this run. Big deal. It is still pending though,
> + * and will get considered when this vcpu exits.
> + */
We've already discussed this in an earlier thread.
> + if (vgic_cpu->vgic_misr & VGIC_MISR_EOI) {
> + /*
> + * Some level interrupts have been EOIed. Clear their
> + * active bit.
> + */
> + int lr, irq;
> +
> + for_each_set_bit(lr, (unsigned long *)vgic_cpu->vgic_eisr,
> + vgic_cpu->nr_lr) {
> + irq = vgic_cpu->vgic_lr[lr] & VGIC_LR_VIRTUALID;
> +
> + vgic_bitmap_set_irq_val(&dist->irq_active,
> + vcpu->vcpu_id, irq, 0);
> + vgic_cpu->vgic_lr[lr] &= ~VGIC_LR_EOI;
> + writel_relaxed(vgic_cpu->vgic_lr[lr],
> + dist->vctrl_base + GICH_LR0 + (lr << 2));
> +
> + /* Any additionnal pending interrupt? */
additional
> + if (vgic_bitmap_get_irq_val(&dist->irq_state,
> + vcpu->vcpu_id, irq)) {
> + set_bit(irq, vcpu->arch.vgic_cpu.pending);
> + set_bit(vcpu->vcpu_id,
> + &dist->irq_pending_on_cpu);
> + } else {
> + clear_bit(irq, vgic_cpu->pending);
> + }
> + }
> + }
> +
> + if (vgic_cpu->vgic_misr & VGIC_MISR_U) {
> + vgic_cpu->vgic_hcr &= ~VGIC_HCR_UIE;
> + writel_relaxed(vgic_cpu->vgic_hcr, dist->vctrl_base + GICH_HCR);
> + }
> +
> + return IRQ_HANDLED;
> +}
I wonder whether we could instead simplify the irq handler so it just sets
a flag indicating that the list registers need updating, then we pick that
up on the resume path and keep all the work in one place. Hard to tell.
> +
> +void kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
> +{
> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> + u32 reg;
> + int i;
> +
> + if (!irqchip_in_kernel(vcpu->kvm))
> + return;
> +
> + for (i = 0; i < VGIC_NR_IRQS; i++) {
> + if (i < 16)
> + vgic_bitmap_set_irq_val(&dist->irq_enabled,
> + vcpu->vcpu_id, i, 1);
> + if (i < 32)
> + vgic_bitmap_set_irq_val(&dist->irq_cfg,
> + vcpu->vcpu_id, i, 1);
> +
Usual comments here :)
> + vgic_cpu->vgic_irq_lr_map[i] = LR_EMPTY;
> + }
> +
> + BUG_ON(!vcpu->kvm->arch.vgic.vctrl_base);
> + reg = readl_relaxed(vcpu->kvm->arch.vgic.vctrl_base + GICH_VTR);
> + vgic_cpu->nr_lr = (reg & 0x1f) + 1;
> +
> + reg = readl_relaxed(vcpu->kvm->arch.vgic.vctrl_base + GICH_VMCR);
> + vgic_cpu->vgic_vmcr = reg | (0x1f << 27); /* Priority */
Why isn't this defined with the other registers?
> +
> + vgic_cpu->vgic_hcr |= VGIC_HCR_EN; /* Get the show on the road... */
> +}
> +
> +static void vgic_init_maintenance_interrupt(void *info)
> +{
> + unsigned int *irqp = info;
> +
> + enable_percpu_irq(*irqp, 0);
> +}
> +
> +int kvm_vgic_hyp_init(void)
> +{
> + int ret;
> + unsigned int irq;
> + struct resource vctrl_res;
> + struct resource vcpu_res;
> +
> + vgic_node = of_find_compatible_node(NULL, NULL, "arm,cortex-a15-gic");
> + if (!vgic_node)
> + return -ENODEV;
> +
> + irq = irq_of_parse_and_map(vgic_node, 0);
> + if (!irq)
> + return -ENXIO;
Don't you need an of_node_put here to decrement the refcount before returning?
> +
> + ret = request_percpu_irq(irq, vgic_maintenance_handler,
> + "vgic", kvm_get_running_vcpus());
> + if (ret) {
> + kvm_err("Cannot register interrupt %d\n", irq);
> + return ret;
Likewise (and similarly throughout this function).
> + }
> +
> + ret = of_address_to_resource(vgic_node, 2, &vctrl_res);
> + if (ret) {
> + kvm_err("Cannot obtain VCTRL resource\n");
> + goto out_free_irq;
> + }
> +
> + vgic_vctrl_base = of_iomap(vgic_node, 2);
> + if (!vgic_vctrl_base) {
> + kvm_err("Cannot ioremap VCTRL\n");
> + ret = -ENOMEM;
> + goto out_free_irq;
> + }
> +
> + ret = create_hyp_io_mappings(vgic_vctrl_base,
> + vgic_vctrl_base + resource_size(&vctrl_res),
> + vctrl_res.start);
> + if (ret) {
> + kvm_err("Cannot map VCTRL into hyp\n");
> + goto out_unmap;
> + }
> +
> + kvm_info("%s@%llx IRQ%d\n", vgic_node->name, vctrl_res.start, irq);
> + on_each_cpu(vgic_init_maintenance_interrupt, &irq, 1);
What if all your CPUs aren't online at this point? Do you need a hotplug
notifier to re-enable the PPI? (you should try hotplug on the host actually,
it could be hilarious fun :)
> +
> + if (of_address_to_resource(vgic_node, 3, &vcpu_res)) {
> + kvm_err("Cannot obtain VCPU resource\n");
> + ret = -ENXIO;
> + goto out_unmap;
> + }
> + vgic_vcpu_base = vcpu_res.start;
> +
> + return 0;
> +
> +out_unmap:
> + iounmap(vgic_vctrl_base);
> +out_free_irq:
> + free_percpu_irq(irq, kvm_get_running_vcpus());
> +
> + return ret;
> +}
> +
> +int kvm_vgic_init(struct kvm *kvm)
> +{
> + int ret = 0, i;
> +
> + mutex_lock(&kvm->lock);
> +
> + if (vgic_initialized(kvm))
> + goto out;
> +
> + if (IS_VGIC_ADDR_UNDEF(kvm->arch.vgic.vgic_dist_base) ||
> + IS_VGIC_ADDR_UNDEF(kvm->arch.vgic.vgic_cpu_base)) {
> + kvm_err("Need to set vgic cpu and dist addresses first\n");
> + ret = -ENXIO;
> + goto out;
> + }
> +
> + ret = kvm_phys_addr_ioremap(kvm, kvm->arch.vgic.vgic_cpu_base,
> + vgic_vcpu_base, VGIC_CPU_SIZE);
> + if (ret) {
> + kvm_err("Unable to remap VGIC CPU to VCPU\n");
> + goto out;
> + }
> +
> + for (i = 32; i < VGIC_NR_IRQS; i += 4)
> + vgic_set_target_reg(kvm, 0, i);
> +
> + kvm->arch.vgic.ready = true;
> +out:
> + mutex_unlock(&kvm->lock);
> + return ret;
> +}
> +
> +int kvm_vgic_create(struct kvm *kvm)
> +{
> + int ret;
> +
> + mutex_lock(&kvm->lock);
> +
> + if (atomic_read(&kvm->online_vcpus) || kvm->arch.vgic.vctrl_base) {
> + ret = -EEXIST;
> + goto out;
> + }
> +
> + spin_lock_init(&kvm->arch.vgic.lock);
> + kvm->arch.vgic.vctrl_base = vgic_vctrl_base;
> + kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
> + kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
> +
> + ret = 0;
Might as well assign this when it's declared.
Will
^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v4 12/13] ARM: KVM: vgic: reduce the number of vcpu kick
2012-11-10 15:45 ` [PATCH v4 12/13] ARM: KVM: vgic: reduce the number of vcpu kick Christoffer Dall
@ 2012-12-05 10:43 ` Will Deacon
2012-12-05 10:58 ` Russell King - ARM Linux
2012-12-05 11:16 ` Russell King - ARM Linux
1 sibling, 1 reply; 58+ messages in thread
From: Will Deacon @ 2012-12-05 10:43 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Nov 10, 2012 at 03:45:39PM +0000, Christoffer Dall wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
>
> If we have level interrupts already programmed to fire on a vcpu,
> there is no reason to kick it after injecting a new interrupt,
> as we're guaranteed that we'll exit when the level interrupt will
> be EOId (VGIC_LR_EOI is set).
>
> The exit will force a reload of the VGIC, injecting the new interrupts.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> ---
> arch/arm/include/asm/kvm_vgic.h | 10 ++++++++++
> arch/arm/kvm/arm.c | 10 +++++++++-
> arch/arm/kvm/vgic.c | 10 ++++++++--
> 3 files changed, 27 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
> index a8e7a93..7d2662c 100644
> --- a/arch/arm/include/asm/kvm_vgic.h
> +++ b/arch/arm/include/asm/kvm_vgic.h
> @@ -215,6 +215,9 @@ struct vgic_cpu {
> u32 vgic_elrsr[2]; /* Saved only */
> u32 vgic_apr;
> u32 vgic_lr[64]; /* A15 has only 4... */
> +
> + /* Number of level-triggered interrupt in progress */
> + atomic_t irq_active_count;
> #endif
> };
>
> @@ -254,6 +257,8 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>
> #define irqchip_in_kernel(k) (!!((k)->arch.vgic.vctrl_base))
> #define vgic_initialized(k) ((k)->arch.vgic.ready)
> +#define vgic_active_irq(v) (atomic_read(&(v)->arch.vgic_cpu.irq_active_count) == 0)
When is the atomic_t initialised to zero? I can only see increments.
> +
> #else
> static inline int kvm_vgic_hyp_init(void)
> {
> @@ -305,6 +310,11 @@ static inline bool vgic_initialized(struct kvm *kvm)
> {
> return true;
> }
> +
> +static inline int vgic_active_irq(struct kvm_vcpu *vcpu)
> +{
> + return 0;
> +}
> #endif
>
> #endif
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index a633d9d..1716f12 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -94,7 +94,15 @@ int kvm_arch_hardware_enable(void *garbage)
>
> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
> {
> - return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
> + if (kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE) {
> + if (vgic_active_irq(vcpu) &&
> + cmpxchg(&vcpu->mode, EXITING_GUEST_MODE, IN_GUEST_MODE) == EXITING_GUEST_MODE)
> + return 0;
> +
> + return 1;
> + }
> +
> + return 0;
That's pretty nasty... why don't you check if there's an active interrupt before
trying to change the vcpu mode? That way, you can avoid the double cmpxchg.
> }
>
> void kvm_arch_hardware_disable(void *garbage)
> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
> index 415ddb8..146de1d 100644
> --- a/arch/arm/kvm/vgic.c
> +++ b/arch/arm/kvm/vgic.c
> @@ -705,8 +705,10 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
> kvm_debug("LR%d piggyback for IRQ%d %x\n", lr, irq, vgic_cpu->vgic_lr[lr]);
> BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
> vgic_cpu->vgic_lr[lr] |= VGIC_LR_PENDING_BIT;
> - if (is_level)
> + if (is_level) {
> vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
> + atomic_inc(&vgic_cpu->irq_active_count);
> + }
> return true;
> }
>
> @@ -718,8 +720,10 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
>
> kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id);
> vgic_cpu->vgic_lr[lr] = MK_LR_PEND(sgi_source_id, irq);
> - if (is_level)
> + if (is_level) {
> vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
> + atomic_inc(&vgic_cpu->irq_active_count);
> + }
>
> vgic_cpu->vgic_irq_lr_map[irq] = lr;
> clear_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr);
> @@ -1011,6 +1015,8 @@ static irqreturn_t vgic_maintenance_handler(int irq, void *data)
>
> vgic_bitmap_set_irq_val(&dist->irq_active,
> vcpu->vcpu_id, irq, 0);
> + atomic_dec(&vgic_cpu->irq_active_count);
> + smp_mb();
If you actually need this, try smp_mb__after_atomic_dec although of course
I'd like to know why it's required :)
Will
^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v4 12/13] ARM: KVM: vgic: reduce the number of vcpu kick
2012-12-05 10:43 ` Will Deacon
@ 2012-12-05 10:58 ` Russell King - ARM Linux
2012-12-05 12:17 ` Marc Zyngier
0 siblings, 1 reply; 58+ messages in thread
From: Russell King - ARM Linux @ 2012-12-05 10:58 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Dec 05, 2012 at 10:43:58AM +0000, Will Deacon wrote:
> On Sat, Nov 10, 2012 at 03:45:39PM +0000, Christoffer Dall wrote:
> > From: Marc Zyngier <marc.zyngier@arm.com>
> >
> > If we have level interrupts already programmed to fire on a vcpu,
> > there is no reason to kick it after injecting a new interrupt,
> > as we're guaranteed that we'll exit when the level interrupt will
> > be EOId (VGIC_LR_EOI is set).
> >
> > The exit will force a reload of the VGIC, injecting the new interrupts.
> >
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> > ---
> > arch/arm/include/asm/kvm_vgic.h | 10 ++++++++++
> > arch/arm/kvm/arm.c | 10 +++++++++-
> > arch/arm/kvm/vgic.c | 10 ++++++++--
> > 3 files changed, 27 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
> > index a8e7a93..7d2662c 100644
> > --- a/arch/arm/include/asm/kvm_vgic.h
> > +++ b/arch/arm/include/asm/kvm_vgic.h
> > @@ -215,6 +215,9 @@ struct vgic_cpu {
> > u32 vgic_elrsr[2]; /* Saved only */
> > u32 vgic_apr;
> > u32 vgic_lr[64]; /* A15 has only 4... */
> > +
> > + /* Number of level-triggered interrupt in progress */
> > + atomic_t irq_active_count;
> > #endif
> > };
> >
> > @@ -254,6 +257,8 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >
> > #define irqchip_in_kernel(k) (!!((k)->arch.vgic.vctrl_base))
> > #define vgic_initialized(k) ((k)->arch.vgic.ready)
> > +#define vgic_active_irq(v) (atomic_read(&(v)->arch.vgic_cpu.irq_active_count) == 0)
>
> When is the atomic_t initialised to zero? I can only see increments.
I'd question whether an atomic type is correct for this; the only
protection that it's offering is to ensure that the atomic increment
and decrement occur atomically - there's nothing else that they're doing
in this code.
If those atomic increments and decrements are occurring beneath a common
lock, then using atomic types is just mere code obfuscation.
For example, I'd like to question the correctness of this:
+ if (vgic_active_irq(vcpu) &&
+ cmpxchg(&vcpu->mode, EXITING_GUEST_MODE, IN_GUEST_MODE) == EXITING_GUEST_MODE)
What if vgic_active_irq() reads the atomic type, immediately after it gets
decremented to zero before the cmpxchg() is executed? Would that be a
problem?
If yes, yet again this illustrates why the use of atomic types leads people
down the path of believing that their code somehow becomes magically safe
through the use of this smoke-screen. IMHO, every use of atomic_t must be
questioned and carefully analysed before it gets into the kernel - many
are buggy through assumptions that atomic_t buys you something magic.
^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v4 12/13] ARM: KVM: vgic: reduce the number of vcpu kick
2012-11-10 15:45 ` [PATCH v4 12/13] ARM: KVM: vgic: reduce the number of vcpu kick Christoffer Dall
2012-12-05 10:43 ` Will Deacon
@ 2012-12-05 11:16 ` Russell King - ARM Linux
1 sibling, 0 replies; 58+ messages in thread
From: Russell King - ARM Linux @ 2012-12-05 11:16 UTC (permalink / raw)
To: linux-arm-kernel
For the sake of public education, let me rewrite this patch a bit to
illustrate why atomic_t's are bad, and then people can review this
instead.
Every change I've made here is functionally equivalent to the behaviour
of the atomic type; I have not added any new bugs here that aren't
present in the original code.
It is my hope that through education like this, people will see that
atomic types have no magic properties, and their use does not make
code automatically race free and correct; in fact, the inappropriate
use of atomic types is pure obfuscation and causes confusion.
On Sat, Nov 10, 2012 at 04:45:39PM +0100, Christoffer Dall wrote:
> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
> index a8e7a93..7d2662c 100644
> --- a/arch/arm/include/asm/kvm_vgic.h
> +++ b/arch/arm/include/asm/kvm_vgic.h
> @@ -215,6 +215,9 @@ struct vgic_cpu {
> u32 vgic_elrsr[2]; /* Saved only */
> u32 vgic_apr;
> u32 vgic_lr[64]; /* A15 has only 4... */
> +
> + /* Number of level-triggered interrupt in progress */
> + atomic_t irq_active_count;
+ int irq_active_count;
> #endif
> };
>
> @@ -254,6 +257,8 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>
> #define irqchip_in_kernel(k) (!!((k)->arch.vgic.vctrl_base))
> #define vgic_initialized(k) ((k)->arch.vgic.ready)
> +#define vgic_active_irq(v) (atomic_read(&(v)->arch.vgic_cpu.irq_active_count) == 0)
> +
+#define vgic_active_irq(v) ((v)->arch.vgic_cpu.irq_active_count)
> #else
> static inline int kvm_vgic_hyp_init(void)
> {
> @@ -305,6 +310,11 @@ static inline bool vgic_initialized(struct kvm *kvm)
> {
> return true;
> }
> +
> +static inline int vgic_active_irq(struct kvm_vcpu *vcpu)
> +{
> + return 0;
> +}
> #endif
>
> #endif
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index a633d9d..1716f12 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -94,7 +94,15 @@ int kvm_arch_hardware_enable(void *garbage)
>
> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
> {
> - return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
> + if (kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE) {
> + if (vgic_active_irq(vcpu) &&
> + cmpxchg(&vcpu->mode, EXITING_GUEST_MODE, IN_GUEST_MODE) == EXITING_GUEST_MODE)
> + return 0;
So with the above change to the macro, this becomes:
+ if (vcpu->arch.vgic_cpu.irq_active_count &&
+ cmpxchg(&vcpu->mode, EXITING_GUEST_MODE, IN_GUEST_MODE) == EXITING_GUEST_MODE)
> +
> + return 1;
> + }
> +
> + return 0;
> }
>
> void kvm_arch_hardware_disable(void *garbage)
> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
> index 415ddb8..146de1d 100644
> --- a/arch/arm/kvm/vgic.c
> +++ b/arch/arm/kvm/vgic.c
> @@ -705,8 +705,10 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
> kvm_debug("LR%d piggyback for IRQ%d %x\n", lr, irq, vgic_cpu->vgic_lr[lr]);
> BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
> vgic_cpu->vgic_lr[lr] |= VGIC_LR_PENDING_BIT;
> - if (is_level)
> + if (is_level) {
> vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
> + atomic_inc(&vgic_cpu->irq_active_count);
+ spin_lock_irqsave(&atomic_lock, flags);
+ vgic_cpu->irq_active_count++;
+ spin_unlock_irqrestore(&atomic_lock, flags);
> + }
> return true;
> }
>
> @@ -718,8 +720,10 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
>
> kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id);
> vgic_cpu->vgic_lr[lr] = MK_LR_PEND(sgi_source_id, irq);
> - if (is_level)
> + if (is_level) {
> vgic_cpu->vgic_lr[lr] |= VGIC_LR_EOI;
> + atomic_inc(&vgic_cpu->irq_active_count);
+ spin_lock_irqsave(&atomic_lock, flags);
+ vgic_cpu->irq_active_count++;
+ spin_unlock_irqrestore(&atomic_lock, flags);
> + }
>
> vgic_cpu->vgic_irq_lr_map[irq] = lr;
> clear_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr);
> @@ -1011,6 +1015,8 @@ static irqreturn_t vgic_maintenance_handler(int irq, void *data)
>
> vgic_bitmap_set_irq_val(&dist->irq_active,
> vcpu->vcpu_id, irq, 0);
> + atomic_dec(&vgic_cpu->irq_active_count);
+ spin_lock_irqsave(&atomic_lock, flags);
+ vgic_cpu->irq_active_count--;
+ spin_unlock_irqrestore(&atomic_lock, flags);
> + smp_mb();
> vgic_cpu->vgic_lr[lr] &= ~VGIC_LR_EOI;
> writel_relaxed(vgic_cpu->vgic_lr[lr],
> dist->vctrl_base + GICH_LR0 + (lr << 2));
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 58+ messages in thread
* [PATCH v4 12/13] ARM: KVM: vgic: reduce the number of vcpu kick
2012-12-05 10:58 ` Russell King - ARM Linux
@ 2012-12-05 12:17 ` Marc Zyngier
2012-12-05 12:29 ` Russell King - ARM Linux
0 siblings, 1 reply; 58+ messages in thread
From: Marc Zyngier @ 2012-12-05 12:17 UTC (permalink / raw)
To: linux-arm-kernel
On 05/12/12 10:58, Russell King - ARM Linux wrote:
> On Wed, Dec 05, 2012 at 10:43:58AM +0000, Will Deacon wrote:
>> On Sat, Nov 10, 2012 at 03:45:39PM +0000, Christoffer Dall wrote:
>>> From: Marc Zyngier <marc.zyngier@arm.com>
>>>
>>> If we have level interrupts already programmed to fire on a vcpu,
>>> there is no reason to kick it after injecting a new interrupt,
>>> as we're guaranteed that we'll exit when the level interrupt will
>>> be EOId (VGIC_LR_EOI is set).
>>>
>>> The exit will force a reload of the VGIC, injecting the new interrupts.
>>>
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>>> ---
>>> arch/arm/include/asm/kvm_vgic.h | 10 ++++++++++
>>> arch/arm/kvm/arm.c | 10 +++++++++-
>>> arch/arm/kvm/vgic.c | 10 ++++++++--
>>> 3 files changed, 27 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
>>> index a8e7a93..7d2662c 100644
>>> --- a/arch/arm/include/asm/kvm_vgic.h
>>> +++ b/arch/arm/include/asm/kvm_vgic.h
>>> @@ -215,6 +215,9 @@ struct vgic_cpu {
>>> u32 vgic_elrsr[2]; /* Saved only */
>>> u32 vgic_apr;
>>> u32 vgic_lr[64]; /* A15 has only 4... */
>>> +
>>> + /* Number of level-triggered interrupt in progress */
>>> + atomic_t irq_active_count;
>>> #endif
>>> };
>>>
>>> @@ -254,6 +257,8 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>>>
>>> #define irqchip_in_kernel(k) (!!((k)->arch.vgic.vctrl_base))
>>> #define vgic_initialized(k) ((k)->arch.vgic.ready)
>>> +#define vgic_active_irq(v) (atomic_read(&(v)->arch.vgic_cpu.irq_active_count) == 0)
>>
>> When is the atomic_t initialised to zero? I can only see increments.
>
> I'd question whether an atomic type is correct for this; the only
> protection that it's offering is to ensure that the atomic increment
> and decrement occur atomically - there's nothing else that they're doing
> in this code.
>
> If those atomic increments and decrements are occurring beneath a common
> lock, then using atomic types is just mere code obfuscation.
No, they occur on code paths that do not have a common lock (one of them
being an interrupt handler). This may change though, after one comment
Will made earlier (the thing about delayed interrupts).
If these two code sections become mutually exclusive, then indeed there
will be no point in having an atomic type anymore.
> For example, I'd like to question the correctness of this:
>
> + if (vgic_active_irq(vcpu) &&
> + cmpxchg(&vcpu->mode, EXITING_GUEST_MODE, IN_GUEST_MODE) == EXITING_GUEST_MODE)
>
> What if vgic_active_irq() reads the atomic type, immediately after it gets
> decremented to zero before the cmpxchg() is executed? Would that be a
> problem?
I do not think so. If the value gets decremented, it means we took a
maintenance interrupt, which means we exited the guest at some point.
Two possibilities:
- We're not in guest mode anymore (vcpu->mode = OUTSIDE_GUEST_MODE), and
cmpxchg will fail, hence signaling the guest to reload its state. This
is not needed (the guest will reload its state anyway), but doesn't
cause any harm.
- We're back into the guest (vcpu->mode = IN_GUEST_MODE), and cmpxchg
will fail as well, triggering a reload which is needed this time.
M.
--
Jazz is not dead. It just smells funny...
* [PATCH v4 12/13] ARM: KVM: vgic: reduce the number of vcpu kick
2012-12-05 12:17 ` Marc Zyngier
@ 2012-12-05 12:29 ` Russell King - ARM Linux
2012-12-05 13:40 ` Marc Zyngier
0 siblings, 1 reply; 58+ messages in thread
From: Russell King - ARM Linux @ 2012-12-05 12:29 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Dec 05, 2012 at 12:17:57PM +0000, Marc Zyngier wrote:
> On 05/12/12 10:58, Russell King - ARM Linux wrote:
> > On Wed, Dec 05, 2012 at 10:43:58AM +0000, Will Deacon wrote:
> >> On Sat, Nov 10, 2012 at 03:45:39PM +0000, Christoffer Dall wrote:
> >>> From: Marc Zyngier <marc.zyngier@arm.com>
> >>>
> >>> If we have level interrupts already programmed to fire on a vcpu,
> >>> there is no reason to kick it after injecting a new interrupt,
> >>> as we're guaranteed that we'll exit when the level interrupt will
> >>> be EOId (VGIC_LR_EOI is set).
> >>>
> >>> The exit will force a reload of the VGIC, injecting the new interrupts.
> >>>
> >>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
> >>> ---
> >>> arch/arm/include/asm/kvm_vgic.h | 10 ++++++++++
> >>> arch/arm/kvm/arm.c | 10 +++++++++-
> >>> arch/arm/kvm/vgic.c | 10 ++++++++--
> >>> 3 files changed, 27 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
> >>> index a8e7a93..7d2662c 100644
> >>> --- a/arch/arm/include/asm/kvm_vgic.h
> >>> +++ b/arch/arm/include/asm/kvm_vgic.h
> >>> @@ -215,6 +215,9 @@ struct vgic_cpu {
> >>> u32 vgic_elrsr[2]; /* Saved only */
> >>> u32 vgic_apr;
> >>> u32 vgic_lr[64]; /* A15 has only 4... */
> >>> +
> >>> + /* Number of level-triggered interrupt in progress */
> >>> + atomic_t irq_active_count;
> >>> #endif
> >>> };
> >>>
> >>> @@ -254,6 +257,8 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >>>
> >>> #define irqchip_in_kernel(k) (!!((k)->arch.vgic.vctrl_base))
> >>> #define vgic_initialized(k) ((k)->arch.vgic.ready)
> >>> +#define vgic_active_irq(v) (atomic_read(&(v)->arch.vgic_cpu.irq_active_count) == 0)
> >>
> >> When is the atomic_t initialised to zero? I can only see increments.
> >
> > I'd question whether an atomic type is correct for this; the only
> > protection that it's offering is to ensure that the atomic increment
> > and decrement occur atomically - there's nothing else that they're doing
> > in this code.
> >
> > If those atomic increments and decrements are occurring beneath a common
> > lock, then using atomic types is just mere code obfuscation.
>
> No, they occur on code paths that do not have a common lock (one of them
> being an interrupt handler). This may change though, after one comment
> Will made earlier (the thing about delayed interrupts).
>
> If these two code sections become mutually exclusive, then indeed there
> will be no point in having an atomic type anymore.
>
> > For example, I'd like to question the correctness of this:
> >
> > + if (vgic_active_irq(vcpu) &&
> > + cmpxchg(&vcpu->mode, EXITING_GUEST_MODE, IN_GUEST_MODE) == EXITING_GUEST_MODE)
> >
> > What if vgic_active_irq() reads the atomic type, immediately after it gets
> > decremented to zero before the cmpxchg() is executed? Would that be a
> > problem?
>
> I do not think so. If the value gets decremented, it means we took a
> maintenance interrupt, which means we exited the guest at some point.
> Two possibilities:
>
> - We're not in guest mode anymore (vcpu->mode = OUTSIDE_GUEST_MODE), and
> cmpxchg will fail, hence signaling the guest to reload its state. This
> is not needed (the guest will reload its state anyway), but doesn't
> cause any harm.
What is the relative ordering of the atomic decrement and setting
vcpu->mode to be OUTSIDE_GUEST_MODE ? Is there a window where we have
decremented this atomic type but vcpu->mode is still set to IN_GUEST_MODE?
> - We're back into the guest (vcpu->mode = IN_GUEST_MODE), and cmpxchg
> will fail as well, triggering a reload which is needed this time.
Well, the whole code looks really weird to me, especially that:
+ if (kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE) {
+ if (vgic_active_irq(vcpu) &&
+ cmpxchg(&vcpu->mode, EXITING_GUEST_MODE, IN_GUEST_MODE) == EXITING_GUEST_MODE)
+ return 0;
+
+ return 1;
+ }
I've no idea what kvm_vcpu_exiting_guest_mode() is (it doesn't exist in
any tree I have access to)...
In any case, look at the version I converted to spinlocks and see whether
you think the code looks reasonable in that form. If it doesn't then it
isn't reasonable in atomic types either.
* [PATCH v4 12/13] ARM: KVM: vgic: reduce the number of vcpu kick
2012-12-05 12:29 ` Russell King - ARM Linux
@ 2012-12-05 13:40 ` Marc Zyngier
2012-12-05 15:55 ` Russell King - ARM Linux
0 siblings, 1 reply; 58+ messages in thread
From: Marc Zyngier @ 2012-12-05 13:40 UTC (permalink / raw)
To: linux-arm-kernel
On 05/12/12 12:29, Russell King - ARM Linux wrote:
> On Wed, Dec 05, 2012 at 12:17:57PM +0000, Marc Zyngier wrote:
>> On 05/12/12 10:58, Russell King - ARM Linux wrote:
>>> On Wed, Dec 05, 2012 at 10:43:58AM +0000, Will Deacon wrote:
>>>> On Sat, Nov 10, 2012 at 03:45:39PM +0000, Christoffer Dall wrote:
>>>>> From: Marc Zyngier <marc.zyngier@arm.com>
>>>>>
>>>>> If we have level interrupts already programmed to fire on a vcpu,
>>>>> there is no reason to kick it after injecting a new interrupt,
>>>>> as we're guaranteed that we'll exit when the level interrupt will
>>>>> be EOId (VGIC_LR_EOI is set).
>>>>>
>>>>> The exit will force a reload of the VGIC, injecting the new interrupts.
>>>>>
>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
>>>>> ---
>>>>> arch/arm/include/asm/kvm_vgic.h | 10 ++++++++++
>>>>> arch/arm/kvm/arm.c | 10 +++++++++-
>>>>> arch/arm/kvm/vgic.c | 10 ++++++++--
>>>>> 3 files changed, 27 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm/include/asm/kvm_vgic.h b/arch/arm/include/asm/kvm_vgic.h
>>>>> index a8e7a93..7d2662c 100644
>>>>> --- a/arch/arm/include/asm/kvm_vgic.h
>>>>> +++ b/arch/arm/include/asm/kvm_vgic.h
>>>>> @@ -215,6 +215,9 @@ struct vgic_cpu {
>>>>> u32 vgic_elrsr[2]; /* Saved only */
>>>>> u32 vgic_apr;
>>>>> u32 vgic_lr[64]; /* A15 has only 4... */
>>>>> +
>>>>> + /* Number of level-triggered interrupt in progress */
>>>>> + atomic_t irq_active_count;
>>>>> #endif
>>>>> };
>>>>>
>>>>> @@ -254,6 +257,8 @@ bool vgic_handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *run,
>>>>>
>>>>> #define irqchip_in_kernel(k) (!!((k)->arch.vgic.vctrl_base))
>>>>> #define vgic_initialized(k) ((k)->arch.vgic.ready)
>>>>> +#define vgic_active_irq(v) (atomic_read(&(v)->arch.vgic_cpu.irq_active_count) == 0)
>>>>
>>>> When is the atomic_t initialised to zero? I can only see increments.
>>>
>>> I'd question whether an atomic type is correct for this; the only
>>> protection that it's offering is to ensure that the atomic increment
>>> and decrement occur atomically - there's nothing else that they're doing
>>> in this code.
>>>
>>> If those atomic increments and decrements are occurring beneath a common
>>> lock, then using atomic types is just mere code obfuscation.
>>
>> No, they occur on code paths that do not have a common lock (one of them
>> being an interrupt handler). This may change though, after one comment
>> Will made earlier (the thing about delayed interrupts).
>>
>> If these two code sections become mutually exclusive, then indeed there
>> will be no point in having an atomic type anymore.
>>
>>> For example, I'd like to question the correctness of this:
>>>
>>> + if (vgic_active_irq(vcpu) &&
>>> + cmpxchg(&vcpu->mode, EXITING_GUEST_MODE, IN_GUEST_MODE) == EXITING_GUEST_MODE)
>>>
>>> What if vgic_active_irq() reads the atomic type, immediately after it gets
>>> decremented to zero before the cmpxchg() is executed? Would that be a
>>> problem?
>>
>> I do not think so. If the value gets decremented, it means we took a
>> maintenance interrupt, which means we exited the guest at some point.
>> Two possibilities:
>>
>> - We're not in guest mode anymore (vcpu->mode = OUTSIDE_GUEST_MODE), and
>> cmpxchg will fail, hence signaling the guest to reload its state. This
>> is not needed (the guest will reload its state anyway), but doesn't
>> cause any harm.
>
> What is the relative ordering of the atomic decrement and setting
> vcpu->mode to be OUTSIDE_GUEST_MODE ? Is there a window where we have
> decremented this atomic type but vcpu->mode is still set to IN_GUEST_MODE?
OUTSIDE_GUEST_MODE always occurs first, while interrupts are still
masked in SVC. We then unmask the interrupts, causing the maintenance
interrupt to be handled. Only this handler causes the active count to be
decremented.
>> - We're back into the guest (vcpu->mode = IN_GUEST_MODE), and cmpxchg
>> will fail as well, triggering a reload which is needed this time.
>
> Well, the whole code looks really weird to me, especially that:
>
> + if (kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE) {
> + if (vgic_active_irq(vcpu) &&
> + cmpxchg(&vcpu->mode, EXITING_GUEST_MODE, IN_GUEST_MODE) == EXITING_GUEST_MODE)
> + return 0;
> +
> + return 1;
> + }
>
> I've no idea what kvm_vcpu_exiting_guest_mode() is (it doesn't exist in
> any tree I have access to)...
You should find it in include/linux/kvm_host.h. It reads:
static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu)
{
	return cmpxchg(&vcpu->mode, IN_GUEST_MODE, EXITING_GUEST_MODE);
}
Admittedly, the whole sequence should be rewritten to be clearer. What
it does is "If we're running a guest and there is no active interrupt,
then kick the guest".
It probably means the above code should read:
	if (!vgic_active_irq(vcpu))
		return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;

	return 0;
> In any case, look at the version I converted to spinlocks and see whether
> you think the code looks reasonable in that form. If it doesn't then it
> isn't reasonable in atomic types either.
I had a look, and I don't find it problematic.
M.
--
Jazz is not dead. It just smells funny...
* [PATCH v4 12/13] ARM: KVM: vgic: reduce the number of vcpu kick
2012-12-05 13:40 ` Marc Zyngier
@ 2012-12-05 15:55 ` Russell King - ARM Linux
0 siblings, 0 replies; 58+ messages in thread
From: Russell King - ARM Linux @ 2012-12-05 15:55 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Dec 05, 2012 at 01:40:24PM +0000, Marc Zyngier wrote:
> Admittedly, the whole sequence should be rewritten to be clearer. What
> it does is "If we're running a guest and there is no active interrupt,
> then kick the guest".
On the whole this entire thing should be written clearer; from the
explanations you've given it seems that the only reason this code works
is because you're relying on several behaviours all coming together to
achieve the right result - which makes for fragile code.
You're partly relying on atomic types to ensure that the increment and
decrement happen exclusively. You're then relying on a combination of
IRQ protection and cmpxchg() to ensure that the non-atomic read of the
atomic type won't be a problem.
This doesn't inspire confidence, and I have big concerns over whether
this code will still be understandable in a number of years' time.
And I still wonder how safe this is even with your explanations. IRQ
disabling only works for the local CPU core, so I still have questions
over this wrt an SMP host OS.
2012-11-10 15:44 [PATCH v4 00/13] KVM/ARM vGIC support Christoffer Dall
2012-11-10 15:44 ` [PATCH v4 01/13] KVM: ARM: Introduce KVM_SET_DEVICE_ADDRESS ioctl Christoffer Dall
2012-11-10 15:44 ` [PATCH v4 02/13] ARM: KVM: Keep track of currently running vcpus Christoffer Dall
2012-11-28 12:47 ` Will Deacon
2012-11-28 13:15 ` Marc Zyngier
2012-11-30 22:39 ` Christoffer Dall
2012-11-10 15:44 ` [PATCH v4 03/13] ARM: KVM: Initial VGIC infrastructure support Christoffer Dall
2012-11-28 12:49 ` Will Deacon
2012-11-28 13:09 ` Marc Zyngier
2012-11-28 14:13 ` Will Deacon
2012-12-01 2:19 ` Christoffer Dall
2012-11-10 15:44 ` [PATCH v4 04/13] ARM: KVM: Initial VGIC MMIO support code Christoffer Dall
2012-11-12 8:54 ` Dong Aisheng
2012-11-13 13:32 ` Christoffer Dall
2012-11-28 13:09 ` Will Deacon
2012-11-28 13:44 ` Marc Zyngier
2012-11-10 15:44 ` [PATCH v4 05/13] ARM: KVM: VGIC accept vcpu and dist base addresses from user space Christoffer Dall
2012-11-12 8:56 ` Dong Aisheng
2012-11-13 13:35 ` Christoffer Dall
2012-11-28 13:11 ` Will Deacon
2012-11-28 13:22 ` [kvmarm] " Marc Zyngier
2012-12-01 2:52 ` Christoffer Dall
2012-12-01 15:57 ` Christoffer Dall
2012-12-03 10:40 ` Will Deacon
2012-11-10 15:44 ` [PATCH v4 06/13] ARM: KVM: VGIC distributor handling Christoffer Dall
2012-11-12 9:29 ` Dong Aisheng
2012-11-13 13:38 ` Christoffer Dall
2012-11-28 13:21 ` Will Deacon
2012-11-28 14:35 ` Marc Zyngier
2012-11-10 15:45 ` [PATCH v4 07/13] ARM: KVM: VGIC virtual CPU interface management Christoffer Dall
2012-12-03 13:23 ` Will Deacon
2012-12-03 14:11 ` Marc Zyngier
2012-12-03 14:34 ` Will Deacon
2012-12-03 15:24 ` Marc Zyngier
2012-12-03 14:54 ` Christoffer Dall
2012-11-10 15:45 ` [PATCH v4 08/13] ARM: KVM: vgic: retire queued, disabled interrupts Christoffer Dall
2012-12-03 13:24 ` Will Deacon
2012-11-10 15:45 ` [PATCH v4 09/13] ARM: KVM: VGIC interrupt injection Christoffer Dall
2012-12-03 13:25 ` Will Deacon
2012-12-03 14:21 ` Marc Zyngier
2012-12-03 14:58 ` Christoffer Dall
2012-12-03 19:13 ` Christoffer Dall
2012-12-03 19:22 ` Marc Zyngier
2012-11-10 15:45 ` [PATCH v4 10/13] ARM: KVM: VGIC control interface world switch Christoffer Dall
2012-12-03 13:31 ` Will Deacon
2012-12-03 14:26 ` Marc Zyngier
2012-11-10 15:45 ` [PATCH v4 11/13] ARM: KVM: VGIC initialisation code Christoffer Dall
2012-12-05 10:43 ` Will Deacon
2012-11-10 15:45 ` [PATCH v4 12/13] ARM: KVM: vgic: reduce the number of vcpu kick Christoffer Dall
2012-12-05 10:43 ` Will Deacon
2012-12-05 10:58 ` Russell King - ARM Linux
2012-12-05 12:17 ` Marc Zyngier
2012-12-05 12:29 ` Russell King - ARM Linux
2012-12-05 13:40 ` Marc Zyngier
2012-12-05 15:55 ` Russell King - ARM Linux
2012-12-05 11:16 ` Russell King - ARM Linux
2012-11-10 15:45 ` [PATCH v4 13/13] ARM: KVM: Add VGIC configuration option Christoffer Dall
2012-11-10 19:52 ` Sergei Shtylyov