* [RFC PATCH 0/5] PSCI system off and reset for KVM ARM/ARM64
@ 2013-10-16 17:02 Anup Patel
2013-10-16 17:02 ` [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation Anup Patel
` (5 more replies)
0 siblings, 6 replies; 40+ messages in thread
From: Anup Patel @ 2013-10-16 17:02 UTC (permalink / raw)
To: linux-arm-kernel
The Power State and Coordination Interface (PSCI) specification defines
SYSTEM_OFF and SYSTEM_RESET functions for system poweroff and reboot.
This patchset adds:
1. Emulation of PSCI SYSTEM_OFF and SYSTEM_RESET functions in
KVM ARM/ARM64 by forwarding them to user space (QEMU or KVMTOOL)
2. System poweroff and reboot using PSCI for ARM/ARM64 kernel
Anup Patel (5):
ARM/ARM64: KVM: Update user space API header for PSCI emulation
ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user
space
KVM: Add documentation for KVM_EXIT_PSCI exit reason
ARM: psci: Add support for system reboot and poweroff
ARM64: psci: Add support for system reboot and poweroff
Documentation/virtual/kvm/api.txt | 13 +++++++++
arch/arm/include/asm/kvm_psci.h | 25 +++++++++++++++-
arch/arm/include/uapi/asm/kvm.h | 2 ++
arch/arm/kernel/psci.c | 36 +++++++++++++++++++++++
arch/arm/kvm/arm.c | 12 ++++++--
arch/arm/kvm/handle_exit.c | 12 ++++++--
arch/arm/kvm/psci.c | 57 ++++++++++++++++++++++++++++++++++---
arch/arm64/include/asm/kvm_psci.h | 25 +++++++++++++++-
arch/arm64/include/uapi/asm/kvm.h | 2 ++
arch/arm64/kernel/psci.c | 36 +++++++++++++++++++++++
arch/arm64/kvm/handle_exit.c | 24 ++++++++++++----
include/uapi/linux/kvm.h | 7 +++++
12 files changed, 233 insertions(+), 18 deletions(-)
--
1.7.9.5
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-16 17:02 [RFC PATCH 0/5] PSCI system off and reset for KVM ARM/ARM64 Anup Patel
@ 2013-10-16 17:02 ` Anup Patel
2013-10-16 20:30 ` Christoffer Dall
2013-10-16 22:11 ` Christoffer Dall
2013-10-16 17:02 ` [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space Anup Patel
` (4 subsequent siblings)
5 siblings, 2 replies; 40+ messages in thread
From: Anup Patel @ 2013-10-16 17:02 UTC (permalink / raw)
To: linux-arm-kernel
Update user space API interface headers for providing information to
user space needed to emulate PSCI function calls in user space (i.e.
QEMU or KVMTOOL).
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
---
include/uapi/linux/kvm.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index e32e776..dae2664 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -171,6 +171,7 @@ struct kvm_pit_config {
#define KVM_EXIT_WATCHDOG 21
#define KVM_EXIT_S390_TSCH 22
#define KVM_EXIT_EPR 23
+#define KVM_EXIT_PSCI 24
/* For KVM_EXIT_INTERNAL_ERROR */
/* Emulate instruction failed. */
@@ -301,6 +302,12 @@ struct kvm_run {
struct {
__u32 epr;
} epr;
+ /* KVM_EXIT_PSCI */
+ struct {
+ __u32 fn;
+ __u64 args[7];
+ __u64 ret[4];
+ } psci;
/* Fix the size of the union. */
char padding[256];
};
--
1.7.9.5
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space
2013-10-16 17:02 [RFC PATCH 0/5] PSCI system off and reset for KVM ARM/ARM64 Anup Patel
2013-10-16 17:02 ` [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation Anup Patel
@ 2013-10-16 17:02 ` Anup Patel
2013-10-16 22:22 ` Christoffer Dall
2013-10-17 8:37 ` Marc Zyngier
2013-10-16 17:02 ` [RFC PATCH 3/5] KVM: Add documentation for KVM_EXIT_PSCI exit reason Anup Patel
` (3 subsequent siblings)
5 siblings, 2 replies; 40+ messages in thread
From: Anup Patel @ 2013-10-16 17:02 UTC (permalink / raw)
To: linux-arm-kernel
The PSCI SYSTEM_OFF and SYSTEM_RESET functions are VM or Guest level
functions hence cannot be emulated by the in-kernel PSCI emulation code.
To tackle this, we forward PSCI SYSTEM_OFF and SYSTEM_RESET function
calls from Guest to user space (i.e. QEMU or KVMTOOL) via KVM run
structure with KVM_EXIT_PSCI exit reason.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
---
arch/arm/include/asm/kvm_psci.h | 25 +++++++++++++++-
arch/arm/include/uapi/asm/kvm.h | 2 ++
arch/arm/kvm/arm.c | 12 ++++++--
arch/arm/kvm/handle_exit.c | 12 ++++++--
arch/arm/kvm/psci.c | 57 ++++++++++++++++++++++++++++++++++---
arch/arm64/include/asm/kvm_psci.h | 25 +++++++++++++++-
arch/arm64/include/uapi/asm/kvm.h | 2 ++
arch/arm64/kvm/handle_exit.c | 24 ++++++++++++----
8 files changed, 141 insertions(+), 18 deletions(-)
diff --git a/arch/arm/include/asm/kvm_psci.h b/arch/arm/include/asm/kvm_psci.h
index 9a83d98..783566f 100644
--- a/arch/arm/include/asm/kvm_psci.h
+++ b/arch/arm/include/asm/kvm_psci.h
@@ -18,6 +18,29 @@
#ifndef __ARM_KVM_PSCI_H__
#define __ARM_KVM_PSCI_H__
-bool kvm_psci_call(struct kvm_vcpu *vcpu);
+#include <linux/kvm_host.h>
+#include <asm/kvm_asm.h>
+#include <asm/kvm_arm.h>
+
+/*
+ * The in-kernel PSCI emulation code wants to use a copy of run->psci,
+ * which is an anonymous type. Use our own type instead.
+ */
+struct kvm_exit_psci {
+ u32 fn;
+ u64 args[7];
+};
+
+static inline void kvm_prepare_psci(struct kvm_run *run,
+ struct kvm_exit_psci *psci)
+{
+ run->psci.fn = psci->fn;
+ memcpy(&run->psci.args, &psci->args, sizeof(run->psci.args));
+ memset(&run->psci.ret, 0, sizeof(run->psci.ret));
+ run->exit_reason = KVM_EXIT_PSCI;
+}
+
+int kvm_handle_psci_return(struct kvm_vcpu *vcpu, struct kvm_run *run);
+int kvm_psci_call(struct kvm_vcpu *vcpu, struct kvm_run *run);
#endif /* __ARM_KVM_PSCI_H__ */
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index c1ee007..205cf0e 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -171,6 +171,8 @@ struct kvm_arch_memory_slot {
#define KVM_PSCI_FN_CPU_OFF KVM_PSCI_FN(1)
#define KVM_PSCI_FN_CPU_ON KVM_PSCI_FN(2)
#define KVM_PSCI_FN_MIGRATE KVM_PSCI_FN(3)
+#define KVM_PSCI_FN_SYSTEM_OFF KVM_PSCI_FN(4)
+#define KVM_PSCI_FN_SYSTEM_RESET KVM_PSCI_FN(5)
#define KVM_PSCI_RET_SUCCESS 0
#define KVM_PSCI_RET_NI ((unsigned long)-1)
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index cc5adb9..5ffd9a3 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -459,7 +459,7 @@ static void update_vttbr(struct kvm *kvm)
spin_unlock(&kvm_vmid_lock);
}
-static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
+static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
if (likely(vcpu->arch.has_run_once))
return 0;
@@ -483,7 +483,7 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
*/
if (test_and_clear_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features)) {
*vcpu_reg(vcpu, 0) = KVM_PSCI_FN_CPU_OFF;
- kvm_psci_call(vcpu);
+ kvm_psci_call(vcpu, run);
}
return 0;
@@ -520,7 +520,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (unlikely(!kvm_vcpu_initialized(vcpu)))
return -ENOEXEC;
- ret = kvm_vcpu_first_run_init(vcpu);
+ ret = kvm_vcpu_first_run_init(vcpu, vcpu->run);
if (ret)
return ret;
@@ -530,6 +530,12 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
return ret;
}
+ if (run->exit_reason == KVM_EXIT_PSCI) {
+ ret = kvm_handle_psci_return(vcpu, vcpu->run);
+ if (ret)
+ return ret;
+ }
+
if (vcpu->sigset_active)
sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
index df4c82d..1a12e6c 100644
--- a/arch/arm/kvm/handle_exit.c
+++ b/arch/arm/kvm/handle_exit.c
@@ -40,14 +40,20 @@ static int handle_svc_hyp(struct kvm_vcpu *vcpu, struct kvm_run *run)
static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
+ int ret;
+
trace_kvm_hvc(*vcpu_pc(vcpu), *vcpu_reg(vcpu, 0),
kvm_vcpu_hvc_get_imm(vcpu));
- if (kvm_psci_call(vcpu))
+ ret = kvm_psci_call(vcpu, run);
+ if (!ret)
+ return 1;
+ else if (ret == -EINVAL) {
+ kvm_inject_undefined(vcpu);
return 1;
+ }
- kvm_inject_undefined(vcpu);
- return 1;
+ return 0;
}
static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
index 86a693a..72c23a7 100644
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -71,6 +71,45 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
return KVM_PSCI_RET_SUCCESS;
}
+static void kvm_psci_system_off(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+ struct kvm_exit_psci psci;
+
+ psci.fn = KVM_PSCI_FN_SYSTEM_OFF;
+ memset(&psci.args, 0, sizeof(psci.args));
+ kvm_prepare_psci(run, &psci);
+}
+
+static void kvm_psci_system_reset(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+ struct kvm_exit_psci psci;
+
+ psci.fn = KVM_PSCI_FN_SYSTEM_RESET;
+ memset(&psci.args, 0, sizeof(psci.args));
+ kvm_prepare_psci(run, &psci);
+}
+
+/**
+ * kvm_handle_psci_return -- Handle PSCI after user space emulation
+ * @vcpu: The VCPU pointer
+ * @run: The VCPU run struct containing the psci data
+ *
+ * This should only be called after returning from userspace for
+ * PSCI emulation.
+ */
+int kvm_handle_psci_return(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+ /*
+ * Currently, the PSCI functions passed to user space for emulation
+ * are SYSTEM_OFF and SYSTEM_RESET. These PSCI functions are not
+ * expected to return back after emulating in user space hence by
+ * default we return -EINVAL to avoid user space from doing RUN ioctl
+ * after handling KVM_EXIT_PSCI.
+ */
+
+ return -EINVAL;
+}
+
/**
* kvm_psci_call - handle PSCI call if r0 value is in range
* @vcpu: Pointer to the VCPU struct
@@ -81,8 +120,9 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
* function number specified in r0 is withing the PSCI range, and false
* otherwise.
*/
-bool kvm_psci_call(struct kvm_vcpu *vcpu)
+int kvm_psci_call(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
+ int ret = 0;
unsigned long psci_fn = *vcpu_reg(vcpu, 0) & ~((u32) 0);
unsigned long val;
@@ -98,11 +138,20 @@ bool kvm_psci_call(struct kvm_vcpu *vcpu)
case KVM_PSCI_FN_MIGRATE:
val = KVM_PSCI_RET_NI;
break;
-
+ case KVM_PSCI_FN_SYSTEM_OFF:
+ kvm_psci_system_off(vcpu, run);
+ val = KVM_PSCI_RET_SUCCESS;
+ ret = -EINTR;
+ break;
+ case KVM_PSCI_FN_SYSTEM_RESET:
+ kvm_psci_system_reset(vcpu, run);
+ val = KVM_PSCI_RET_SUCCESS;
+ ret = -EINTR;
+ break;
default:
- return false;
+ return -EINVAL;
}
*vcpu_reg(vcpu, 0) = val;
- return true;
+ return ret;
}
diff --git a/arch/arm64/include/asm/kvm_psci.h b/arch/arm64/include/asm/kvm_psci.h
index e301a48..db90649 100644
--- a/arch/arm64/include/asm/kvm_psci.h
+++ b/arch/arm64/include/asm/kvm_psci.h
@@ -18,6 +18,29 @@
#ifndef __ARM64_KVM_PSCI_H__
#define __ARM64_KVM_PSCI_H__
-bool kvm_psci_call(struct kvm_vcpu *vcpu);
+#include <linux/kvm_host.h>
+#include <asm/kvm_asm.h>
+#include <asm/kvm_arm.h>
+
+/*
+ * The in-kernel PSCI emulation code wants to use a copy of run->psci,
+ * which is an anonymous type. Use our own type instead.
+ */
+struct kvm_exit_psci {
+ u32 fn;
+ u64 args[7];
+};
+
+static inline void kvm_prepare_psci(struct kvm_run *run,
+ struct kvm_exit_psci *psci)
+{
+ run->psci.fn = psci->fn;
+ memcpy(&run->psci.args, &psci->args, sizeof(run->psci.args));
+ memset(&run->psci.ret, 0, sizeof(run->psci.ret));
+ run->exit_reason = KVM_EXIT_PSCI;
+}
+
+int kvm_handle_psci_return(struct kvm_vcpu *vcpu, struct kvm_run *run);
+int kvm_psci_call(struct kvm_vcpu *vcpu, struct kvm_run *run);
#endif /* __ARM64_KVM_PSCI_H__ */
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index d9f026b..f678902 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -158,6 +158,8 @@ struct kvm_arch_memory_slot {
#define KVM_PSCI_FN_CPU_OFF KVM_PSCI_FN(1)
#define KVM_PSCI_FN_CPU_ON KVM_PSCI_FN(2)
#define KVM_PSCI_FN_MIGRATE KVM_PSCI_FN(3)
+#define KVM_PSCI_FN_SYSTEM_OFF KVM_PSCI_FN(4)
+#define KVM_PSCI_FN_SYSTEM_RESET KVM_PSCI_FN(5)
#define KVM_PSCI_RET_SUCCESS 0
#define KVM_PSCI_RET_NI ((unsigned long)-1)
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 9beaca0..28e20bb 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -30,20 +30,32 @@ typedef int (*exit_handle_fn)(struct kvm_vcpu *, struct kvm_run *);
static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
- if (kvm_psci_call(vcpu))
+ int ret;
+
+ ret = kvm_psci_call(vcpu, run);
+ if (!ret)
+ return 1;
+ else if (ret == -EINVAL) {
+ kvm_inject_undefined(vcpu);
return 1;
+ }
- kvm_inject_undefined(vcpu);
- return 1;
+ return 0;
}
static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
- if (kvm_psci_call(vcpu))
+ int ret;
+
+ ret = kvm_psci_call(vcpu, run);
+ if (!ret)
+ return 1;
+ else if (ret == -EINVAL) {
+ kvm_inject_undefined(vcpu);
return 1;
+ }
- kvm_inject_undefined(vcpu);
- return 1;
+ return 0;
}
/**
--
1.7.9.5
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [RFC PATCH 3/5] KVM: Add documentation for KVM_EXIT_PSCI exit reason
2013-10-16 17:02 [RFC PATCH 0/5] PSCI system off and reset for KVM ARM/ARM64 Anup Patel
2013-10-16 17:02 ` [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation Anup Patel
2013-10-16 17:02 ` [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space Anup Patel
@ 2013-10-16 17:02 ` Anup Patel
2013-10-16 17:02 ` [RFC PATCH 4/5] ARM: psci: Add support for system reboot and poweroff Anup Patel
` (2 subsequent siblings)
5 siblings, 0 replies; 40+ messages in thread
From: Anup Patel @ 2013-10-16 17:02 UTC (permalink / raw)
To: linux-arm-kernel
This patch adds documentation for KVM_EXIT_PSCI exit reason and
PSCI call info in KVM run structure.
We use exit reason KVM_EXIT_PSCI to forward PSCI calls such as
SYSTEM_OFF and SYSTEM_RESET to user space (i.e. QEMU or KVMTOOL)
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
---
Documentation/virtual/kvm/api.txt | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 076b849..215774d 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2655,6 +2655,19 @@ It gets triggered whenever both KVM_CAP_PPC_EPR are enabled and an
external interrupt has just been delivered into the guest. User space
should put the acknowledged interrupt vector into the 'epr' field.
+ /* KVM_EXIT_PSCI */
+ struct {
+ __u32 fn;
+ __u64 args[7];
+ __u64 ret[4];
+ } psci;
+
+ARM/ARM64 specific. The KVM ARM/ARM64 emulates Power State and Coordination
+Interface (PSCI) for the Guest. This exit occurs when Guest issues a PSCI
+function call to KVM ARM/ARM64 which is not emulated by in-kernel PSCI
+emulation and needs to be emulated in user space (i.e. QEMU or KVMTOOL).
+Examples of such PSCI functions are SYSTEM_OFF and SYSTEM_RESET.
+
/* Fix the size of the union. */
char padding[256];
};
--
1.7.9.5
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [RFC PATCH 4/5] ARM: psci: Add support for system reboot and poweroff
2013-10-16 17:02 [RFC PATCH 0/5] PSCI system off and reset for KVM ARM/ARM64 Anup Patel
` (2 preceding siblings ...)
2013-10-16 17:02 ` [RFC PATCH 3/5] KVM: Add documentation for KVM_EXIT_PSCI exit reason Anup Patel
@ 2013-10-16 17:02 ` Anup Patel
2013-10-16 22:17 ` Rob Herring
2013-10-16 17:02 ` [RFC PATCH 5/5] ARM64: " Anup Patel
2013-10-16 17:08 ` [RFC PATCH 0/5] PSCI system off and reset for KVM ARM/ARM64 Anup Patel
5 siblings, 1 reply; 40+ messages in thread
From: Anup Patel @ 2013-10-16 17:02 UTC (permalink / raw)
To: linux-arm-kernel
We have PSCI SYSTEM_OFF and SYSTEM_RESET function call emulation
available when running as Guest using KVM ARM.
This patch implements system reboot and poweroff using PSCI
SYSTEM_OFF and SYSTEM_RESET.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
---
arch/arm/kernel/psci.c | 36 ++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/arch/arm/kernel/psci.c b/arch/arm/kernel/psci.c
index 4693188..30d9d65 100644
--- a/arch/arm/kernel/psci.c
+++ b/arch/arm/kernel/psci.c
@@ -17,11 +17,13 @@
#include <linux/init.h>
#include <linux/of.h>
+#include <linux/pm.h>
#include <asm/compiler.h>
#include <asm/errno.h>
#include <asm/opcodes-sec.h>
#include <asm/opcodes-virt.h>
+#include <asm/system_misc.h>
#include <asm/psci.h>
struct psci_operations psci_ops;
@@ -33,6 +35,8 @@ enum psci_function {
PSCI_FN_CPU_ON,
PSCI_FN_CPU_OFF,
PSCI_FN_MIGRATE,
+ PSCI_FN_SYSTEM_OFF,
+ PSCI_FN_SYSTEM_RESET,
PSCI_FN_MAX,
};
@@ -153,6 +157,28 @@ static int psci_migrate(unsigned long cpuid)
return psci_to_linux_errno(err);
}
+static void psci_power_off(void)
+{
+ int err;
+ u32 fn;
+
+ fn = psci_function_id[PSCI_FN_SYSTEM_OFF];
+ err = invoke_psci_fn(fn, 0, 0, 0);
+ if (err)
+ pr_warning("%s: failed\n", __func__);
+}
+
+static void psci_restart(enum reboot_mode reboot_mode, const char *cmd)
+{
+ int err;
+ u32 fn;
+
+ fn = psci_function_id[PSCI_FN_SYSTEM_RESET];
+ err = invoke_psci_fn(fn, 0, 0, 0);
+ if (err)
+ pr_warning("%s: failed\n", __func__);
+}
+
static const struct of_device_id psci_of_match[] __initconst = {
{ .compatible = "arm,psci", },
{},
@@ -204,6 +230,16 @@ void __init psci_init(void)
psci_ops.migrate = psci_migrate;
}
+ if (!of_property_read_u32(np, "system_off", &id)) {
+ psci_function_id[PSCI_FN_SYSTEM_OFF] = id;
+ pm_power_off = psci_power_off;
+ }
+
+ if (!of_property_read_u32(np, "system_reset", &id)) {
+ psci_function_id[PSCI_FN_SYSTEM_RESET] = id;
+ arm_pm_restart = psci_restart;
+ }
+
out_put_node:
of_node_put(np);
return;
--
1.7.9.5
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [RFC PATCH 5/5] ARM64: psci: Add support for system reboot and poweroff
2013-10-16 17:02 [RFC PATCH 0/5] PSCI system off and reset for KVM ARM/ARM64 Anup Patel
` (3 preceding siblings ...)
2013-10-16 17:02 ` [RFC PATCH 4/5] ARM: psci: Add support for system reboot and poweroff Anup Patel
@ 2013-10-16 17:02 ` Anup Patel
2013-10-16 17:08 ` [RFC PATCH 0/5] PSCI system off and reset for KVM ARM/ARM64 Anup Patel
5 siblings, 0 replies; 40+ messages in thread
From: Anup Patel @ 2013-10-16 17:02 UTC (permalink / raw)
To: linux-arm-kernel
We have PSCI SYSTEM_OFF and SYSTEM_RESET function call emulation
available when running as Guest using KVM ARM64.
This patch implements system reboot and poweroff using PSCI
SYSTEM_OFF and SYSTEM_RESET.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
---
arch/arm64/kernel/psci.c | 36 ++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
index 14f73c4..a157e4d 100644
--- a/arch/arm64/kernel/psci.c
+++ b/arch/arm64/kernel/psci.c
@@ -17,9 +17,11 @@
#include <linux/init.h>
#include <linux/of.h>
+#include <linux/pm.h>
#include <asm/compiler.h>
#include <asm/errno.h>
+#include <asm/system_misc.h>
#include <asm/psci.h>
struct psci_operations psci_ops;
@@ -31,6 +33,8 @@ enum psci_function {
PSCI_FN_CPU_ON,
PSCI_FN_CPU_OFF,
PSCI_FN_MIGRATE,
+ PSCI_FN_SYSTEM_OFF,
+ PSCI_FN_SYSTEM_RESET,
PSCI_FN_MAX,
};
@@ -151,6 +155,28 @@ static int psci_migrate(unsigned long cpuid)
return psci_to_linux_errno(err);
}
+static void psci_power_off(void)
+{
+ int err;
+ u32 fn;
+
+ fn = psci_function_id[PSCI_FN_SYSTEM_OFF];
+ err = invoke_psci_fn(fn, 0, 0, 0);
+ if (err)
+ pr_warning("%s: failed\n", __func__);
+}
+
+static void psci_restart(enum reboot_mode reboot_mode, const char *cmd)
+{
+ int err;
+ u32 fn;
+
+ fn = psci_function_id[PSCI_FN_SYSTEM_RESET];
+ err = invoke_psci_fn(fn, 0, 0, 0);
+ if (err)
+ pr_warning("%s: failed\n", __func__);
+}
+
static const struct of_device_id psci_of_match[] __initconst = {
{ .compatible = "arm,psci", },
{},
@@ -205,6 +231,16 @@ int __init psci_init(void)
psci_ops.migrate = psci_migrate;
}
+ if (!of_property_read_u32(np, "system_off", &id)) {
+ psci_function_id[PSCI_FN_SYSTEM_OFF] = id;
+ pm_power_off = psci_power_off;
+ }
+
+ if (!of_property_read_u32(np, "system_reset", &id)) {
+ psci_function_id[PSCI_FN_SYSTEM_RESET] = id;
+ arm_pm_restart = psci_restart;
+ }
+
out_put_node:
of_node_put(np);
return err;
--
1.7.9.5
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [RFC PATCH 0/5] PSCI system off and reset for KVM ARM/ARM64
2013-10-16 17:02 [RFC PATCH 0/5] PSCI system off and reset for KVM ARM/ARM64 Anup Patel
` (4 preceding siblings ...)
2013-10-16 17:02 ` [RFC PATCH 5/5] ARM64: " Anup Patel
@ 2013-10-16 17:08 ` Anup Patel
5 siblings, 0 replies; 40+ messages in thread
From: Anup Patel @ 2013-10-16 17:08 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Oct 16, 2013 at 10:32 PM, Anup Patel <anup.patel@linaro.org> wrote:
> The Power State and Coordination Interface (PSCI) specification defines
> SYSTEM_OFF and SYSTEM_RESET functions for system poweroff and reboot.
>
> This patchset adds:
> 1. Emulation of PSCI SYSTEM_OFF and SYSTEM_RESET functions in
> KVM ARM/ARM64 by forwarding them to user space (QEMU or KVMTOOL)
> 2. System poweroff and reboot using PSCI for ARM/ARM64 kernel
>
> Anup Patel (5):
> ARM/ARM64: KVM: Update user space API header for PSCI emulation
> ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user
> space
> KVM: Add documentation for KVM_EXIT_PSCI exit reason
> ARM: psci: Add support for system reboot and poweroff
> ARM64: psci: Add support for system reboot and poweroff
>
> Documentation/virtual/kvm/api.txt | 13 +++++++++
> arch/arm/include/asm/kvm_psci.h | 25 +++++++++++++++-
> arch/arm/include/uapi/asm/kvm.h | 2 ++
> arch/arm/kernel/psci.c | 36 +++++++++++++++++++++++
> arch/arm/kvm/arm.c | 12 ++++++--
> arch/arm/kvm/handle_exit.c | 12 ++++++--
> arch/arm/kvm/psci.c | 57 ++++++++++++++++++++++++++++++++++---
> arch/arm64/include/asm/kvm_psci.h | 25 +++++++++++++++-
> arch/arm64/include/uapi/asm/kvm.h | 2 ++
> arch/arm64/kernel/psci.c | 36 +++++++++++++++++++++++
> arch/arm64/kvm/handle_exit.c | 24 ++++++++++++----
> include/uapi/linux/kvm.h | 7 +++++
> 12 files changed, 233 insertions(+), 18 deletions(-)
>
> --
> 1.7.9.5
>
> _______________________________________________
> kvmarm mailing list
> kvmarm at lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
Hi All,
If any of you want to try this patchset then you can temporarily
use the KVMTOOL patch attached here.
We will soon have QEMU side changes for emulating PSCI
SYSTEM_OFF and SYSTEM_RESET.
Regards,
Anup
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kvmtool_psci_emulation.patch
Type: application/octet-stream
Size: 1593 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20131016/0284206b/attachment.obj>
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-16 17:02 ` [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation Anup Patel
@ 2013-10-16 20:30 ` Christoffer Dall
2013-10-17 6:25 ` Anup Patel
2013-10-16 22:11 ` Christoffer Dall
1 sibling, 1 reply; 40+ messages in thread
From: Christoffer Dall @ 2013-10-16 20:30 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
> Update user space API interface headers for providing information to
> user space needed to emulate PSCI function calls in user space (i.e.
> QEMU or KVMTOOL).
>
> Signed-off-by: Anup Patel <anup.patel@linaro.org>
> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
> ---
> include/uapi/linux/kvm.h | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index e32e776..dae2664 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -171,6 +171,7 @@ struct kvm_pit_config {
> #define KVM_EXIT_WATCHDOG 21
> #define KVM_EXIT_S390_TSCH 22
> #define KVM_EXIT_EPR 23
> +#define KVM_EXIT_PSCI 24
>
> /* For KVM_EXIT_INTERNAL_ERROR */
> /* Emulate instruction failed. */
> @@ -301,6 +302,12 @@ struct kvm_run {
> struct {
> __u32 epr;
> } epr;
> + /* KVM_EXIT_PSCI */
> + struct {
> + __u32 fn;
> + __u64 args[7];
> + __u64 ret[4];
> + } psci;
> /* Fix the size of the union. */
> char padding[256];
> };
> --
> 1.7.9.5
>
I think you'd need a KVM_CAP_PSCI or something here so that QEMU can
know to mmap this much, no?
Also, it would be easier if you just added the documentation for this
change together with this patch IMHO.
-Christoffer
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-16 17:02 ` [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation Anup Patel
2013-10-16 20:30 ` Christoffer Dall
@ 2013-10-16 22:11 ` Christoffer Dall
2013-10-17 6:45 ` Anup Patel
1 sibling, 1 reply; 40+ messages in thread
From: Christoffer Dall @ 2013-10-16 22:11 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
> Update user space API interface headers for providing information to
> user space needed to emulate PSCI function calls in user space (i.e.
> QEMU or KVMTOOL).
>
> Signed-off-by: Anup Patel <anup.patel@linaro.org>
> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
> ---
> include/uapi/linux/kvm.h | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index e32e776..dae2664 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -171,6 +171,7 @@ struct kvm_pit_config {
> #define KVM_EXIT_WATCHDOG 21
> #define KVM_EXIT_S390_TSCH 22
> #define KVM_EXIT_EPR 23
> +#define KVM_EXIT_PSCI 24
>
> /* For KVM_EXIT_INTERNAL_ERROR */
> /* Emulate instruction failed. */
> @@ -301,6 +302,12 @@ struct kvm_run {
> struct {
> __u32 epr;
> } epr;
> + /* KVM_EXIT_PSCI */
> + struct {
> + __u32 fn;
> + __u64 args[7];
> + __u64 ret[4];
> + } psci;
> /* Fix the size of the union. */
> char padding[256];
> };
> --
> 1.7.9.5
>
I am also wondering if this is not solving a very specific need without
thinking a little more carefully about this problem.
We have previously discussed the need for some secure side emulation
in QEMU, and I think perhaps we need something more generic which allows
user space to handle SMC calls and/or allows user space to "inject" some
secure world runtime that the kernel can run in a partially or fully
isolated container to handle SMC calls.
Peter raised this issue previously and pointed to a proposal he had as
well.
Is there a technical reason why we need something specifically directed
to PSCI?
-Christoffer
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 4/5] ARM: psci: Add support for system reboot and poweroff
2013-10-16 17:02 ` [RFC PATCH 4/5] ARM: psci: Add support for system reboot and poweroff Anup Patel
@ 2013-10-16 22:17 ` Rob Herring
2013-10-17 5:08 ` Anup Patel
0 siblings, 1 reply; 40+ messages in thread
From: Rob Herring @ 2013-10-16 22:17 UTC (permalink / raw)
To: linux-arm-kernel
On 10/16/2013 12:02 PM, Anup Patel wrote:
> We have PSCI SYSTEM_OFF and SYSTEM_RESET function call emulation
> available when running as Guest using KVM ARM.
>
> This patch implements system reboot and poweroff using PSCI
> SYSTEM_OFF and SYSTEM_RESET.
I've done a similar patch [1] which also needs binding documentation to
go with it. The last version of which there is no agreement on is here [2].
Rob
[1] http://www.spinics.net/lists/arm-kernel/msg262217.html
[2] http://www.spinics.net/lists/devicetree/msg05348.html
>
>
> Signed-off-by: Anup Patel <anup.patel@linaro.org>
> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
> ---
> arch/arm/kernel/psci.c | 36 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
>
> diff --git a/arch/arm/kernel/psci.c b/arch/arm/kernel/psci.c
> index 4693188..30d9d65 100644
> --- a/arch/arm/kernel/psci.c
> +++ b/arch/arm/kernel/psci.c
> @@ -17,11 +17,13 @@
>
> #include <linux/init.h>
> #include <linux/of.h>
> +#include <linux/pm.h>
>
> #include <asm/compiler.h>
> #include <asm/errno.h>
> #include <asm/opcodes-sec.h>
> #include <asm/opcodes-virt.h>
> +#include <asm/system_misc.h>
> #include <asm/psci.h>
>
> struct psci_operations psci_ops;
> @@ -33,6 +35,8 @@ enum psci_function {
> PSCI_FN_CPU_ON,
> PSCI_FN_CPU_OFF,
> PSCI_FN_MIGRATE,
> + PSCI_FN_SYSTEM_OFF,
> + PSCI_FN_SYSTEM_RESET,
> PSCI_FN_MAX,
> };
>
> @@ -153,6 +157,28 @@ static int psci_migrate(unsigned long cpuid)
> return psci_to_linux_errno(err);
> }
>
> +static void psci_power_off(void)
> +{
> + int err;
> + u32 fn;
> +
> + fn = psci_function_id[PSCI_FN_SYSTEM_OFF];
> + err = invoke_psci_fn(fn, 0, 0, 0);
> + if (err)
> + pr_warning("%s: failed\n", __func__);
> +}
> +
> +static void psci_restart(enum reboot_mode reboot_mode, const char *cmd)
> +{
> + int err;
> + u32 fn;
> +
> + fn = psci_function_id[PSCI_FN_SYSTEM_RESET];
> + err = invoke_psci_fn(fn, 0, 0, 0);
> + if (err)
> + pr_warning("%s: failed\n", __func__);
> +}
> +
> static const struct of_device_id psci_of_match[] __initconst = {
> { .compatible = "arm,psci", },
> {},
> @@ -204,6 +230,16 @@ void __init psci_init(void)
> psci_ops.migrate = psci_migrate;
> }
>
> + if (!of_property_read_u32(np, "system_off", &id)) {
> + psci_function_id[PSCI_FN_SYSTEM_OFF] = id;
> + pm_power_off = psci_power_off;
> + }
> +
> + if (!of_property_read_u32(np, "system_reset", &id)) {
> + psci_function_id[PSCI_FN_SYSTEM_RESET] = id;
> + arm_pm_restart = psci_restart;
> + }
> +
> out_put_node:
> of_node_put(np);
> return;
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space
2013-10-16 17:02 ` [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space Anup Patel
@ 2013-10-16 22:22 ` Christoffer Dall
2013-10-17 5:52 ` Anup Patel
2013-10-17 8:37 ` Marc Zyngier
1 sibling, 1 reply; 40+ messages in thread
From: Christoffer Dall @ 2013-10-16 22:22 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Oct 16, 2013 at 10:32:31PM +0530, Anup Patel wrote:
> The PSCI SYSTEM_OFF and SYSTEM_RESET functions are VM or Guest level
> functions hence cannot be emulated by the in-kernel PSCI emulation code.
>
> To tackle this, we forward PSCI SYSTEM_OFF and SYSTEM_RESET function
> calls from Guest to user space (i.e. QEMU or KVMTOOL) via KVM run
> structure with KVM_EXIT_PSCI exit reason.
>
> Signed-off-by: Anup Patel <anup.patel@linaro.org>
> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
> ---
> arch/arm/include/asm/kvm_psci.h | 25 +++++++++++++++-
> arch/arm/include/uapi/asm/kvm.h | 2 ++
> arch/arm/kvm/arm.c | 12 ++++++--
> arch/arm/kvm/handle_exit.c | 12 ++++++--
> arch/arm/kvm/psci.c | 57 ++++++++++++++++++++++++++++++++++---
> arch/arm64/include/asm/kvm_psci.h | 25 +++++++++++++++-
> arch/arm64/include/uapi/asm/kvm.h | 2 ++
> arch/arm64/kvm/handle_exit.c | 24 ++++++++++++----
> 8 files changed, 141 insertions(+), 18 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_psci.h b/arch/arm/include/asm/kvm_psci.h
> index 9a83d98..783566f 100644
> --- a/arch/arm/include/asm/kvm_psci.h
> +++ b/arch/arm/include/asm/kvm_psci.h
> @@ -18,6 +18,29 @@
> #ifndef __ARM_KVM_PSCI_H__
> #define __ARM_KVM_PSCI_H__
>
> -bool kvm_psci_call(struct kvm_vcpu *vcpu);
> +#include <linux/kvm_host.h>
> +#include <asm/kvm_asm.h>
> +#include <asm/kvm_arm.h>
> +
> +/*
> + * The in-kernel PSCI emulation code wants to use a copy of run->psci,
> + * which is an anonymous type. Use our own type instead.
> + */
> +struct kvm_exit_psci {
> + u32 fn;
> + u64 args[7];
> +};
> +
> +static inline void kvm_prepare_psci(struct kvm_run *run,
> + struct kvm_exit_psci *psci)
> +{
> + run->psci.fn = psci->fn;
> + memcpy(&run->psci.args, &psci->args, sizeof(run->psci.args));
> + memset(&run->psci.ret, 0, sizeof(run->psci.ret));
> + run->exit_reason = KVM_EXIT_PSCI;
> +}
> +
> +int kvm_handle_psci_return(struct kvm_vcpu *vcpu, struct kvm_run *run);
> +int kvm_psci_call(struct kvm_vcpu *vcpu, struct kvm_run *run);
>
> #endif /* __ARM_KVM_PSCI_H__ */
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index c1ee007..205cf0e 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -171,6 +171,8 @@ struct kvm_arch_memory_slot {
> #define KVM_PSCI_FN_CPU_OFF KVM_PSCI_FN(1)
> #define KVM_PSCI_FN_CPU_ON KVM_PSCI_FN(2)
> #define KVM_PSCI_FN_MIGRATE KVM_PSCI_FN(3)
> +#define KVM_PSCI_FN_SYSTEM_OFF KVM_PSCI_FN(4)
> +#define KVM_PSCI_FN_SYSTEM_RESET KVM_PSCI_FN(5)
>
> #define KVM_PSCI_RET_SUCCESS 0
> #define KVM_PSCI_RET_NI ((unsigned long)-1)
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index cc5adb9..5ffd9a3 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -459,7 +459,7 @@ static void update_vttbr(struct kvm *kvm)
> spin_unlock(&kvm_vmid_lock);
> }
>
> -static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> +static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu, struct kvm_run *run)
why this change? run can always be derived from vcpu->run.
> {
> if (likely(vcpu->arch.has_run_once))
> return 0;
> @@ -483,7 +483,7 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> */
> if (test_and_clear_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features)) {
> *vcpu_reg(vcpu, 0) = KVM_PSCI_FN_CPU_OFF;
> - kvm_psci_call(vcpu);
> + kvm_psci_call(vcpu, run);
> }
>
> return 0;
> @@ -520,7 +520,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
> if (unlikely(!kvm_vcpu_initialized(vcpu)))
> return -ENOEXEC;
>
> - ret = kvm_vcpu_first_run_init(vcpu);
> + ret = kvm_vcpu_first_run_init(vcpu, vcpu->run);
> if (ret)
> return ret;
>
> @@ -530,6 +530,12 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
> return ret;
> }
>
> + if (run->exit_reason == KVM_EXIT_PSCI) {
> + ret = kvm_handle_psci_return(vcpu, vcpu->run);
> + if (ret)
> + return ret;
> + }
> +
> if (vcpu->sigset_active)
> sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
>
> diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
> index df4c82d..1a12e6c 100644
> --- a/arch/arm/kvm/handle_exit.c
> +++ b/arch/arm/kvm/handle_exit.c
> @@ -40,14 +40,20 @@ static int handle_svc_hyp(struct kvm_vcpu *vcpu, struct kvm_run *run)
>
> static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
> {
> + int ret;
> +
> trace_kvm_hvc(*vcpu_pc(vcpu), *vcpu_reg(vcpu, 0),
> kvm_vcpu_hvc_get_imm(vcpu));
>
> - if (kvm_psci_call(vcpu))
> + ret = kvm_psci_call(vcpu, run);
> + if (!ret)
> + return 1;
> + else if (ret == -EINVAL) {
> + kvm_inject_undefined(vcpu);
> return 1;
> + }
>
> - kvm_inject_undefined(vcpu);
> - return 1;
> + return 0;
> }
>
> static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
> diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
> index 86a693a..72c23a7 100644
> --- a/arch/arm/kvm/psci.c
> +++ b/arch/arm/kvm/psci.c
> @@ -71,6 +71,45 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
> return KVM_PSCI_RET_SUCCESS;
> }
>
> +static void kvm_psci_system_off(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +{
> + struct kvm_exit_psci psci;
> +
> + psci.fn = KVM_PSCI_FN_SYSTEM_OFF;
> + memset(&psci.args, 0, sizeof(psci.args));
> + kvm_prepare_psci(run, &psci);
> +}
> +
> +static void kvm_psci_system_reset(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +{
> + struct kvm_exit_psci psci;
> +
> + psci.fn = KVM_PSCI_FN_SYSTEM_RESET;
> + memset(&psci.args, 0, sizeof(psci.args));
> + kvm_prepare_psci(run, &psci);
> +}
> +
> +/**
> + * kvm_handle_psci_return -- Handle PSCI after user space emulation
> + * @vcpu: The VCPU pointer
> + * @run: The VCPU run struct containing the psci data
> + *
> + * This should only be called after returning from userspace for
> + * PSCI emulation.
> + */
> +int kvm_handle_psci_return(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +{
> + /*
> + * Currently, the PSCI functions passed to user space for emulation
> + * are SYSTEM_OFF and SYSTEM_RESET. These PSCI functions are not
> + * expected to return back after emulating in user space hence by
> + * default we return -EINVAL to avoid user space from doing RUN ioctl
> + * after handling KVM_EXIT_PSCI.
> + */
> +
> + return -EINVAL;
> +}
> +
why would reset not return back after emulation?
also, do we need to impose this check or can we get rid of this all
together, if user space messes up, it's up to user space...
Are there any of the other PSCI functions that need specific handling in
the kernel on the return path?
> /**
> * kvm_psci_call - handle PSCI call if r0 value is in range
> * @vcpu: Pointer to the VCPU struct
> @@ -81,8 +120,9 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
> * function number specified in r0 is withing the PSCI range, and false
> * otherwise.
> */
> -bool kvm_psci_call(struct kvm_vcpu *vcpu)
> +int kvm_psci_call(struct kvm_vcpu *vcpu, struct kvm_run *run)
why add run here as well?
> {
> + int ret = 0;
> unsigned long psci_fn = *vcpu_reg(vcpu, 0) & ~((u32) 0);
> unsigned long val;
>
> @@ -98,11 +138,20 @@ bool kvm_psci_call(struct kvm_vcpu *vcpu)
> case KVM_PSCI_FN_MIGRATE:
> val = KVM_PSCI_RET_NI;
> break;
> -
> + case KVM_PSCI_FN_SYSTEM_OFF:
> + kvm_psci_system_off(vcpu, run);
> + val = KVM_PSCI_RET_SUCCESS;
> + ret = -EINTR;
> + break;
> + case KVM_PSCI_FN_SYSTEM_RESET:
> + kvm_psci_system_reset(vcpu, run);
> + val = KVM_PSCI_RET_SUCCESS;
> + ret = -EINTR;
> + break;
> default:
> - return false;
> + return -EINVAL;
> }
>
> *vcpu_reg(vcpu, 0) = val;
> - return true;
> + return ret;
> }
> diff --git a/arch/arm64/include/asm/kvm_psci.h b/arch/arm64/include/asm/kvm_psci.h
> index e301a48..db90649 100644
> --- a/arch/arm64/include/asm/kvm_psci.h
> +++ b/arch/arm64/include/asm/kvm_psci.h
> @@ -18,6 +18,29 @@
> #ifndef __ARM64_KVM_PSCI_H__
> #define __ARM64_KVM_PSCI_H__
>
> -bool kvm_psci_call(struct kvm_vcpu *vcpu);
> +#include <linux/kvm_host.h>
> +#include <asm/kvm_asm.h>
> +#include <asm/kvm_arm.h>
> +
> +/*
> + * The in-kernel PSCI emulation code wants to use a copy of run->psci,
> + * which is an anonymous type. Use our own type instead.
> + */
> +struct kvm_exit_psci {
> + u32 fn;
> + u64 args[7];
> +};
> +
> +static inline void kvm_prepare_psci(struct kvm_run *run,
> + struct kvm_exit_psci *psci)
> +{
> + run->psci.fn = psci->fn;
> + memcpy(&run->psci.args, &psci->args, sizeof(run->psci.args));
> + memset(&run->psci.ret, 0, sizeof(run->psci.ret));
> + run->exit_reason = KVM_EXIT_PSCI;
> +}
> +
> +int kvm_handle_psci_return(struct kvm_vcpu *vcpu, struct kvm_run *run);
> +int kvm_psci_call(struct kvm_vcpu *vcpu, struct kvm_run *run);
>
> #endif /* __ARM64_KVM_PSCI_H__ */
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index d9f026b..f678902 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -158,6 +158,8 @@ struct kvm_arch_memory_slot {
> #define KVM_PSCI_FN_CPU_OFF KVM_PSCI_FN(1)
> #define KVM_PSCI_FN_CPU_ON KVM_PSCI_FN(2)
> #define KVM_PSCI_FN_MIGRATE KVM_PSCI_FN(3)
> +#define KVM_PSCI_FN_SYSTEM_OFF KVM_PSCI_FN(4)
> +#define KVM_PSCI_FN_SYSTEM_RESET KVM_PSCI_FN(5)
>
> #define KVM_PSCI_RET_SUCCESS 0
> #define KVM_PSCI_RET_NI ((unsigned long)-1)
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 9beaca0..28e20bb 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -30,20 +30,32 @@ typedef int (*exit_handle_fn)(struct kvm_vcpu *, struct kvm_run *);
>
> static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
> {
> - if (kvm_psci_call(vcpu))
> + int ret;
> +
> + ret = kvm_psci_call(vcpu, run);
> + if (!ret)
> + return 1;
> + else if (ret == -EINVAL) {
> + kvm_inject_undefined(vcpu);
> return 1;
> + }
>
> - kvm_inject_undefined(vcpu);
> - return 1;
> + return 0;
> }
>
> static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
> {
> - if (kvm_psci_call(vcpu))
> + int ret;
> +
> + ret = kvm_psci_call(vcpu, run);
> + if (!ret)
> + return 1;
> + else if (ret == -EINVAL) {
> + kvm_inject_undefined(vcpu);
> return 1;
> + }
>
> - kvm_inject_undefined(vcpu);
> - return 1;
> + return 0;
> }
>
> /**
> --
> 1.7.9.5
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 4/5] ARM: psci: Add support for system reboot and poweroff
2013-10-16 22:17 ` Rob Herring
@ 2013-10-17 5:08 ` Anup Patel
2013-10-17 9:50 ` Marc Zyngier
0 siblings, 1 reply; 40+ messages in thread
From: Anup Patel @ 2013-10-17 5:08 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 17, 2013 at 3:47 AM, Rob Herring <robherring2@gmail.com> wrote:
> On 10/16/2013 12:02 PM, Anup Patel wrote:
>> We have PSCI SYSTEM_OFF and SYSTEM_RESET function call emulation
>> available when running as Guest using KVM ARM.
>>
>> This patch implements system reboot and poweroff using PSCI
>> SYSTEM_OFF and SYSTEM_RESET.
>
> I've done a similar patch [1] which also needs binding documentation to
> go with it. The last version of which there is no agreement on is here [2].
>
> Rob
>
> [1] http://www.spinics.net/lists/arm-kernel/msg262217.html
> [2] http://www.spinics.net/lists/devicetree/msg05348.html
Thanks, for the early warning.
I would be happy to go with your patch since you have already the device tree
bindings covered.
IMHO, we should try to go ahead with existing scheme of PSCI device tree
bindings for SYSTEM_OFF and SYSTEM_RESET. Later anyone can propose
revised PSCI device tree bindings. Its not nice to hold off good changes just
because people don't agree on the device tree bindings.
Also, are you planning to revise the device tree bindings for PSCI ?
--
Anup
>
>>
>>
>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>> ---
>> arch/arm/kernel/psci.c | 36 ++++++++++++++++++++++++++++++++++++
>> 1 file changed, 36 insertions(+)
>>
>> diff --git a/arch/arm/kernel/psci.c b/arch/arm/kernel/psci.c
>> index 4693188..30d9d65 100644
>> --- a/arch/arm/kernel/psci.c
>> +++ b/arch/arm/kernel/psci.c
>> @@ -17,11 +17,13 @@
>>
>> #include <linux/init.h>
>> #include <linux/of.h>
>> +#include <linux/pm.h>
>>
>> #include <asm/compiler.h>
>> #include <asm/errno.h>
>> #include <asm/opcodes-sec.h>
>> #include <asm/opcodes-virt.h>
>> +#include <asm/system_misc.h>
>> #include <asm/psci.h>
>>
>> struct psci_operations psci_ops;
>> @@ -33,6 +35,8 @@ enum psci_function {
>> PSCI_FN_CPU_ON,
>> PSCI_FN_CPU_OFF,
>> PSCI_FN_MIGRATE,
>> + PSCI_FN_SYSTEM_OFF,
>> + PSCI_FN_SYSTEM_RESET,
>> PSCI_FN_MAX,
>> };
>>
>> @@ -153,6 +157,28 @@ static int psci_migrate(unsigned long cpuid)
>> return psci_to_linux_errno(err);
>> }
>>
>> +static void psci_power_off(void)
>> +{
>> + int err;
>> + u32 fn;
>> +
>> + fn = psci_function_id[PSCI_FN_SYSTEM_OFF];
>> + err = invoke_psci_fn(fn, 0, 0, 0);
>> + if (err)
>> + pr_warning("%s: failed\n", __func__);
>> +}
>> +
>> +static void psci_restart(enum reboot_mode reboot_mode, const char *cmd)
>> +{
>> + int err;
>> + u32 fn;
>> +
>> + fn = psci_function_id[PSCI_FN_SYSTEM_RESET];
>> + err = invoke_psci_fn(fn, 0, 0, 0);
>> + if (err)
>> + pr_warning("%s: failed\n", __func__);
>> +}
>> +
>> static const struct of_device_id psci_of_match[] __initconst = {
>> { .compatible = "arm,psci", },
>> {},
>> @@ -204,6 +230,16 @@ void __init psci_init(void)
>> psci_ops.migrate = psci_migrate;
>> }
>>
>> + if (!of_property_read_u32(np, "system_off", &id)) {
>> + psci_function_id[PSCI_FN_SYSTEM_OFF] = id;
>> + pm_power_off = psci_power_off;
>> + }
>> +
>> + if (!of_property_read_u32(np, "system_reset", &id)) {
>> + psci_function_id[PSCI_FN_SYSTEM_RESET] = id;
>> + arm_pm_restart = psci_restart;
>> + }
>> +
>> out_put_node:
>> of_node_put(np);
>> return;
>>
>
> _______________________________________________
> kvmarm mailing list
> kvmarm at lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space
2013-10-16 22:22 ` Christoffer Dall
@ 2013-10-17 5:52 ` Anup Patel
0 siblings, 0 replies; 40+ messages in thread
From: Anup Patel @ 2013-10-17 5:52 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 17, 2013 at 3:52 AM, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
> On Wed, Oct 16, 2013 at 10:32:31PM +0530, Anup Patel wrote:
>> The PSCI SYSTEM_OFF and SYSTEM_RESET functions are VM or Guest level
>> functions hence cannot be emulated by the in-kernel PSCI emulation code.
>>
>> To tackle this, we forward PSCI SYSTEM_OFF and SYSTEM_RESET function
>> calls from Guest to user space (i.e. QEMU or KVMTOOL) via KVM run
>> structure with KVM_EXIT_PSCI exit reason.
>>
>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>> ---
>> arch/arm/include/asm/kvm_psci.h | 25 +++++++++++++++-
>> arch/arm/include/uapi/asm/kvm.h | 2 ++
>> arch/arm/kvm/arm.c | 12 ++++++--
>> arch/arm/kvm/handle_exit.c | 12 ++++++--
>> arch/arm/kvm/psci.c | 57 ++++++++++++++++++++++++++++++++++---
>> arch/arm64/include/asm/kvm_psci.h | 25 +++++++++++++++-
>> arch/arm64/include/uapi/asm/kvm.h | 2 ++
>> arch/arm64/kvm/handle_exit.c | 24 ++++++++++++----
>> 8 files changed, 141 insertions(+), 18 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_psci.h b/arch/arm/include/asm/kvm_psci.h
>> index 9a83d98..783566f 100644
>> --- a/arch/arm/include/asm/kvm_psci.h
>> +++ b/arch/arm/include/asm/kvm_psci.h
>> @@ -18,6 +18,29 @@
>> #ifndef __ARM_KVM_PSCI_H__
>> #define __ARM_KVM_PSCI_H__
>>
>> -bool kvm_psci_call(struct kvm_vcpu *vcpu);
>> +#include <linux/kvm_host.h>
>> +#include <asm/kvm_asm.h>
>> +#include <asm/kvm_arm.h>
>> +
>> +/*
>> + * The in-kernel PSCI emulation code wants to use a copy of run->psci,
>> + * which is an anonymous type. Use our own type instead.
>> + */
>> +struct kvm_exit_psci {
>> + u32 fn;
>> + u64 args[7];
>> +};
>> +
>> +static inline void kvm_prepare_psci(struct kvm_run *run,
>> + struct kvm_exit_psci *psci)
>> +{
>> + run->psci.fn = psci->fn;
>> + memcpy(&run->psci.args, &psci->args, sizeof(run->psci.args));
>> + memset(&run->psci.ret, 0, sizeof(run->psci.ret));
>> + run->exit_reason = KVM_EXIT_PSCI;
>> +}
>> +
>> +int kvm_handle_psci_return(struct kvm_vcpu *vcpu, struct kvm_run *run);
>> +int kvm_psci_call(struct kvm_vcpu *vcpu, struct kvm_run *run);
>>
>> #endif /* __ARM_KVM_PSCI_H__ */
>> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
>> index c1ee007..205cf0e 100644
>> --- a/arch/arm/include/uapi/asm/kvm.h
>> +++ b/arch/arm/include/uapi/asm/kvm.h
>> @@ -171,6 +171,8 @@ struct kvm_arch_memory_slot {
>> #define KVM_PSCI_FN_CPU_OFF KVM_PSCI_FN(1)
>> #define KVM_PSCI_FN_CPU_ON KVM_PSCI_FN(2)
>> #define KVM_PSCI_FN_MIGRATE KVM_PSCI_FN(3)
>> +#define KVM_PSCI_FN_SYSTEM_OFF KVM_PSCI_FN(4)
>> +#define KVM_PSCI_FN_SYSTEM_RESET KVM_PSCI_FN(5)
>>
>> #define KVM_PSCI_RET_SUCCESS 0
>> #define KVM_PSCI_RET_NI ((unsigned long)-1)
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index cc5adb9..5ffd9a3 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -459,7 +459,7 @@ static void update_vttbr(struct kvm *kvm)
>> spin_unlock(&kvm_vmid_lock);
>> }
>>
>> -static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
>> +static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu, struct kvm_run *run)
>
> why this change? run can always be derived from vcpu->run.
Because thats how kvm_handle_mmio_return() and handle_exit() are
implemented.
If you think this is inappropriate then function prototypes for
kvm_handle_mmio_return() and handle_exit() should also change.
IMHO, we should be consistent in usage of vcpu->run.
>
>> {
>> if (likely(vcpu->arch.has_run_once))
>> return 0;
>> @@ -483,7 +483,7 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
>> */
>> if (test_and_clear_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features)) {
>> *vcpu_reg(vcpu, 0) = KVM_PSCI_FN_CPU_OFF;
>> - kvm_psci_call(vcpu);
>> + kvm_psci_call(vcpu, run);
>> }
>>
>> return 0;
>> @@ -520,7 +520,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> if (unlikely(!kvm_vcpu_initialized(vcpu)))
>> return -ENOEXEC;
>>
>> - ret = kvm_vcpu_first_run_init(vcpu);
>> + ret = kvm_vcpu_first_run_init(vcpu, vcpu->run);
>> if (ret)
>> return ret;
>>
>> @@ -530,6 +530,12 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> return ret;
>> }
>>
>> + if (run->exit_reason == KVM_EXIT_PSCI) {
>> + ret = kvm_handle_psci_return(vcpu, vcpu->run);
>> + if (ret)
>> + return ret;
>> + }
>> +
>> if (vcpu->sigset_active)
>> sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
>>
>> diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
>> index df4c82d..1a12e6c 100644
>> --- a/arch/arm/kvm/handle_exit.c
>> +++ b/arch/arm/kvm/handle_exit.c
>> @@ -40,14 +40,20 @@ static int handle_svc_hyp(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>
>> static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> {
>> + int ret;
>> +
>> trace_kvm_hvc(*vcpu_pc(vcpu), *vcpu_reg(vcpu, 0),
>> kvm_vcpu_hvc_get_imm(vcpu));
>>
>> - if (kvm_psci_call(vcpu))
>> + ret = kvm_psci_call(vcpu, run);
>> + if (!ret)
>> + return 1;
>> + else if (ret == -EINVAL) {
>> + kvm_inject_undefined(vcpu);
>> return 1;
>> + }
>>
>> - kvm_inject_undefined(vcpu);
>> - return 1;
>> + return 0;
>> }
>>
>> static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
>> index 86a693a..72c23a7 100644
>> --- a/arch/arm/kvm/psci.c
>> +++ b/arch/arm/kvm/psci.c
>> @@ -71,6 +71,45 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
>> return KVM_PSCI_RET_SUCCESS;
>> }
>>
>> +static void kvm_psci_system_off(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +{
>> + struct kvm_exit_psci psci;
>> +
>> + psci.fn = KVM_PSCI_FN_SYSTEM_OFF;
>> + memset(&psci.args, 0, sizeof(psci.args));
>> + kvm_prepare_psci(run, &psci);
>> +}
>> +
>> +static void kvm_psci_system_reset(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +{
>> + struct kvm_exit_psci psci;
>> +
>> + psci.fn = KVM_PSCI_FN_SYSTEM_RESET;
>> + memset(&psci.args, 0, sizeof(psci.args));
>> + kvm_prepare_psci(run, &psci);
>> +}
>> +
>> +/**
>> + * kvm_handle_psci_return -- Handle PSCI after user space emulation
>> + * @vcpu: The VCPU pointer
>> + * @run: The VCPU run struct containing the psci data
>> + *
>> + * This should only be called after returning from userspace for
>> + * PSCI emulation.
>> + */
>> +int kvm_handle_psci_return(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +{
>> + /*
>> + * Currently, the PSCI functions passed to user space for emulation
>> + * are SYSTEM_OFF and SYSTEM_RESET. These PSCI functions are not
>> + * expected to return back after emulating in user space hence by
>> + * default we return -EINVAL to avoid user space from doing RUN ioctl
>> + * after handling KVM_EXIT_PSCI.
>> + */
>> +
>> + return -EINVAL;
>> +}
>> +
>
> why would reset not return back after emulation?
>
> also, do we need to impose this check or can we get rid of this all
> together, if user space messes up, it's up to user space...
Yes, I think we should allow SYSTEM_RESET to return. I'll update this
with a check.
Actually, when we reach here we also need to update X0-X3 (for ARM R0-R3)
with return values in run->psci. I'll add this in subsequent revision.
>
> Are there any of the other PSCI functions that need specific handling in
> the kernel on the return path?
So far only SYSTEM_RESET and SYSTEM_OFF as per current PSCI spec.
I think people might be also be interested in emulating vendor specific PSCI
calls in user space (QEMU or KVMTOOL or Some proprietary software).
>
>> /**
>> * kvm_psci_call - handle PSCI call if r0 value is in range
>> * @vcpu: Pointer to the VCPU struct
>> @@ -81,8 +120,9 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
>> * function number specified in r0 is withing the PSCI range, and false
>> * otherwise.
>> */
>> -bool kvm_psci_call(struct kvm_vcpu *vcpu)
>> +int kvm_psci_call(struct kvm_vcpu *vcpu, struct kvm_run *run)
>
> why add run here as well?
Please see my previous comments.
--
Anup
>
>> {
>> + int ret = 0;
>> unsigned long psci_fn = *vcpu_reg(vcpu, 0) & ~((u32) 0);
>> unsigned long val;
>>
>> @@ -98,11 +138,20 @@ bool kvm_psci_call(struct kvm_vcpu *vcpu)
>> case KVM_PSCI_FN_MIGRATE:
>> val = KVM_PSCI_RET_NI;
>> break;
>> -
>> + case KVM_PSCI_FN_SYSTEM_OFF:
>> + kvm_psci_system_off(vcpu, run);
>> + val = KVM_PSCI_RET_SUCCESS;
>> + ret = -EINTR;
>> + break;
>> + case KVM_PSCI_FN_SYSTEM_RESET:
>> + kvm_psci_system_reset(vcpu, run);
>> + val = KVM_PSCI_RET_SUCCESS;
>> + ret = -EINTR;
>> + break;
>> default:
>> - return false;
>> + return -EINVAL;
>> }
>>
>> *vcpu_reg(vcpu, 0) = val;
>> - return true;
>> + return ret;
>> }
>> diff --git a/arch/arm64/include/asm/kvm_psci.h b/arch/arm64/include/asm/kvm_psci.h
>> index e301a48..db90649 100644
>> --- a/arch/arm64/include/asm/kvm_psci.h
>> +++ b/arch/arm64/include/asm/kvm_psci.h
>> @@ -18,6 +18,29 @@
>> #ifndef __ARM64_KVM_PSCI_H__
>> #define __ARM64_KVM_PSCI_H__
>>
>> -bool kvm_psci_call(struct kvm_vcpu *vcpu);
>> +#include <linux/kvm_host.h>
>> +#include <asm/kvm_asm.h>
>> +#include <asm/kvm_arm.h>
>> +
>> +/*
>> + * The in-kernel PSCI emulation code wants to use a copy of run->psci,
>> + * which is an anonymous type. Use our own type instead.
>> + */
>> +struct kvm_exit_psci {
>> + u32 fn;
>> + u64 args[7];
>> +};
>> +
>> +static inline void kvm_prepare_psci(struct kvm_run *run,
>> + struct kvm_exit_psci *psci)
>> +{
>> + run->psci.fn = psci->fn;
>> + memcpy(&run->psci.args, &psci->args, sizeof(run->psci.args));
>> + memset(&run->psci.ret, 0, sizeof(run->psci.ret));
>> + run->exit_reason = KVM_EXIT_PSCI;
>> +}
>> +
>> +int kvm_handle_psci_return(struct kvm_vcpu *vcpu, struct kvm_run *run);
>> +int kvm_psci_call(struct kvm_vcpu *vcpu, struct kvm_run *run);
>>
>> #endif /* __ARM64_KVM_PSCI_H__ */
>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
>> index d9f026b..f678902 100644
>> --- a/arch/arm64/include/uapi/asm/kvm.h
>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>> @@ -158,6 +158,8 @@ struct kvm_arch_memory_slot {
>> #define KVM_PSCI_FN_CPU_OFF KVM_PSCI_FN(1)
>> #define KVM_PSCI_FN_CPU_ON KVM_PSCI_FN(2)
>> #define KVM_PSCI_FN_MIGRATE KVM_PSCI_FN(3)
>> +#define KVM_PSCI_FN_SYSTEM_OFF KVM_PSCI_FN(4)
>> +#define KVM_PSCI_FN_SYSTEM_RESET KVM_PSCI_FN(5)
>>
>> #define KVM_PSCI_RET_SUCCESS 0
>> #define KVM_PSCI_RET_NI ((unsigned long)-1)
>> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
>> index 9beaca0..28e20bb 100644
>> --- a/arch/arm64/kvm/handle_exit.c
>> +++ b/arch/arm64/kvm/handle_exit.c
>> @@ -30,20 +30,32 @@ typedef int (*exit_handle_fn)(struct kvm_vcpu *, struct kvm_run *);
>>
>> static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> {
>> - if (kvm_psci_call(vcpu))
>> + int ret;
>> +
>> + ret = kvm_psci_call(vcpu, run);
>> + if (!ret)
>> + return 1;
>> + else if (ret == -EINVAL) {
>> + kvm_inject_undefined(vcpu);
>> return 1;
>> + }
>>
>> - kvm_inject_undefined(vcpu);
>> - return 1;
>> + return 0;
>> }
>>
>> static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> {
>> - if (kvm_psci_call(vcpu))
>> + int ret;
>> +
>> + ret = kvm_psci_call(vcpu, run);
>> + if (!ret)
>> + return 1;
>> + else if (ret == -EINVAL) {
>> + kvm_inject_undefined(vcpu);
>> return 1;
>> + }
>>
>> - kvm_inject_undefined(vcpu);
>> - return 1;
>> + return 0;
>> }
>>
>> /**
>> --
>> 1.7.9.5
>>
> _______________________________________________
> kvmarm mailing list
> kvmarm at lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-16 20:30 ` Christoffer Dall
@ 2013-10-17 6:25 ` Anup Patel
0 siblings, 0 replies; 40+ messages in thread
From: Anup Patel @ 2013-10-17 6:25 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 17, 2013 at 2:00 AM, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
>> Update user space API interface headers for providing information to
>> user space needed to emulate PSCI function calls in user space (i.e.
>> QEMU or KVMTOOL).
>>
>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>> ---
>> include/uapi/linux/kvm.h | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index e32e776..dae2664 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
>> #define KVM_EXIT_WATCHDOG 21
>> #define KVM_EXIT_S390_TSCH 22
>> #define KVM_EXIT_EPR 23
>> +#define KVM_EXIT_PSCI 24
>>
>> /* For KVM_EXIT_INTERNAL_ERROR */
>> /* Emulate instruction failed. */
>> @@ -301,6 +302,12 @@ struct kvm_run {
>> struct {
>> __u32 epr;
>> } epr;
>> + /* KVM_EXIT_PSCI */
>> + struct {
>> + __u32 fn;
>> + __u64 args[7];
>> + __u64 ret[4];
>> + } psci;
>> /* Fix the size of the union. */
>> char padding[256];
>> };
>> --
>> 1.7.9.5
>>
>
> I think you'd need a KVM_CAP_PSCI or something here so that QEMU can
> know to mmap this much, no?
>
> Also, it would be easier if you just added the documentation for this
> change together with this patch IMHO.
Yes, I think its good atleast advertise KVM_CAP_PSCI to QEMU. I'll add it
in revised patch.
>
> -Christoffer
> _______________________________________________
> kvmarm mailing list
> kvmarm at lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
--
Anup
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-16 22:11 ` Christoffer Dall
@ 2013-10-17 6:45 ` Anup Patel
2013-10-17 8:47 ` Marc Zyngier
0 siblings, 1 reply; 40+ messages in thread
From: Anup Patel @ 2013-10-17 6:45 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
>> Update user space API interface headers for providing information to
>> user space needed to emulate PSCI function calls in user space (i.e.
>> QEMU or KVMTOOL).
>>
>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>> ---
>> include/uapi/linux/kvm.h | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index e32e776..dae2664 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
>> #define KVM_EXIT_WATCHDOG 21
>> #define KVM_EXIT_S390_TSCH 22
>> #define KVM_EXIT_EPR 23
>> +#define KVM_EXIT_PSCI 24
>>
>> /* For KVM_EXIT_INTERNAL_ERROR */
>> /* Emulate instruction failed. */
>> @@ -301,6 +302,12 @@ struct kvm_run {
>> struct {
>> __u32 epr;
>> } epr;
>> + /* KVM_EXIT_PSCI */
>> + struct {
>> + __u32 fn;
>> + __u64 args[7];
>> + __u64 ret[4];
>> + } psci;
>> /* Fix the size of the union. */
>> char padding[256];
>> };
>> --
>> 1.7.9.5
>>
> I am also wondering if this is not solving a very specific need without
> thinking a little more carefully about this problem.
No, its not solving a specific problem.
In fact, its more general because we pass complete info required to
emulate a PSCI call in user space.
(Please refer PSCI calling convention)
>
> We have previously discussed the need for some secure side emulation
> in QEMU, and I think perhaps we need something more generic which allows
> user space to handle SMC calls and/or allows user space to "inject" some
> secure world runtime that the kernel can run in a partially or fully
> isolated container to handle SMC calls.
>
> Peter raised this issue previously and pointed to a proposal he had as
> well.
If required we can have an additional field in kvm_run->psci which tells
whether the PSCI call is an SMC call or HVC call.
>
> Is there a technical reason why we need something specifically directed
> to PSCI?
Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
of adding a separate VirtIO device for System reboot and System poweroff.
Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
emulation in user space we would also have an infrastructure for adding
emulation of new PSCI calls in user space.
--
Anup
>
> -Christoffer
> _______________________________________________
> kvmarm mailing list
> kvmarm at lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space
2013-10-16 17:02 ` [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space Anup Patel
2013-10-16 22:22 ` Christoffer Dall
@ 2013-10-17 8:37 ` Marc Zyngier
2013-10-17 9:10 ` Peter Maydell
2013-10-17 11:13 ` Anup Patel
1 sibling, 2 replies; 40+ messages in thread
From: Marc Zyngier @ 2013-10-17 8:37 UTC (permalink / raw)
To: linux-arm-kernel
On 16/10/13 18:02, Anup Patel wrote:
> The PSCI SYSTEM_OFF and SYSTEM_RESET functions are VM or Guest level
> functions hence cannot be emulated by the in-kernel PSCI emulation code.
Why can't we implement system-wide functionality in the kernel? I fail
to see the issue here.
> To tackle this, we forward PSCI SYSTEM_OFF and SYSTEM_RESET function
> calls from Guest to user space (i.e. QEMU or KVMTOOL) via KVM run
> structure with KVM_EXIT_PSCI exit reason.
I'm really not keen on this approach. Having part of the PSCI
implementation offloaded to userspace means we don't have a complete
implementation in KVM anymore, and we end-up duplicating functionality
all over the place.
Also, OFF and RESET are not PSCI specific concepts, and could be
implemented in various ways. I'm more inclined to return a
*standardized* exit code that the various platforms can interpret.
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 6:45 ` Anup Patel
@ 2013-10-17 8:47 ` Marc Zyngier
2013-10-17 11:10 ` Anup Patel
0 siblings, 1 reply; 40+ messages in thread
From: Marc Zyngier @ 2013-10-17 8:47 UTC (permalink / raw)
To: linux-arm-kernel
On 17/10/13 07:45, Anup Patel wrote:
> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
> <christoffer.dall@linaro.org> wrote:
>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
>>> Update user space API interface headers for providing information to
>>> user space needed to emulate PSCI function calls in user space (i.e.
>>> QEMU or KVMTOOL).
>>>
>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>>> ---
>>> include/uapi/linux/kvm.h | 7 +++++++
>>> 1 file changed, 7 insertions(+)
>>>
>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>> index e32e776..dae2664 100644
>>> --- a/include/uapi/linux/kvm.h
>>> +++ b/include/uapi/linux/kvm.h
>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
>>> #define KVM_EXIT_WATCHDOG 21
>>> #define KVM_EXIT_S390_TSCH 22
>>> #define KVM_EXIT_EPR 23
>>> +#define KVM_EXIT_PSCI 24
>>>
>>> /* For KVM_EXIT_INTERNAL_ERROR */
>>> /* Emulate instruction failed. */
>>> @@ -301,6 +302,12 @@ struct kvm_run {
>>> struct {
>>> __u32 epr;
>>> } epr;
>>> + /* KVM_EXIT_PSCI */
>>> + struct {
>>> + __u32 fn;
>>> + __u64 args[7];
>>> + __u64 ret[4];
>>> + } psci;
>>> /* Fix the size of the union. */
>>> char padding[256];
>>> };
>>> --
>>> 1.7.9.5
>>>
>> I am also wondering if this is not solving a very specific need without
>> thinking a little more carefully about this problem.
>
> No, its not solving a specific problem.
>
> In fact, its more general because we pass complete info required to
> emulate a PSCI call in user space.
> (Please refer PSCI calling convention)
>
>>
>> We have previously discussed the need for some secure side emulation
>> in QEMU, and I think perhaps we need something more generic which allows
>> user space to handle SMC calls and/or allows user space to "inject" some
>> secure world runtime that the kernel can run in a partially or fully
>> isolated container to handle SMC calls.
>>
>> Peter raised this issue previously and pointed to a proposal he had as
>> well.
>
> If required we can have an additional field in kvm_run->psci which tells
> whether the PSCI call is an SMC call or HVC call.
>
>>
>> Is there a technical reason why we need something specifically directed
>> to PSCI?
>
> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
> of adding a separate VirtIO device for System reboot and System poweroff.
>
> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
> emulation in user space we would also have an infrastructure for adding
> emulation of new PSCI calls in user space.
And I strongly oppose to that. It creates consistency issues (what if
userspace implements one version of PSCI, and the kernel another?), and
also some really horrible situations: Imagine you implement the SUSPEND
operation in userspace, and want to wake the vcpu up with an interrupt.
You'd end-up having to keep track of the state in the kernel, having to
forward the interrupt event to userspace...
So really, no.
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space
2013-10-17 8:37 ` Marc Zyngier
@ 2013-10-17 9:10 ` Peter Maydell
2013-10-17 9:21 ` Marc Zyngier
2013-10-17 11:07 ` Anup Patel
2013-10-17 11:13 ` Anup Patel
1 sibling, 2 replies; 40+ messages in thread
From: Peter Maydell @ 2013-10-17 9:10 UTC (permalink / raw)
To: linux-arm-kernel
On 17 October 2013 09:37, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 16/10/13 18:02, Anup Patel wrote:
>> The PSCI SYSTEM_OFF and SYSTEM_RESET functions are VM or Guest level
>> functions hence cannot be emulated by the in-kernel PSCI emulation code.
>
> Why can't we implement system-wide functionality in the kernel? I fail
> to see the issue here.
Because the kernel isn't emulating the whole board, and you need
to power off or reset the whole board, not just the CPUs.
> I'm really not keen on this approach. Having part of the PSCI
> implementation offloaded to userspace means we don't have a complete
> implementation in KVM anymore, and we end-up duplicating functionality
> all over the place.
>
> Also, OFF and RESET are not PSCI specific concepts, and could be
> implemented in various ways. I'm more inclined to return a
> *standardized* exit code that the various platforms can interpret.
Maybe we should have a more generic "kernel can't handle this,
toss it to userspace" API? That might also fit in with supporting
guests that want to make SMC calls to an emulated monitor...
-- PMM
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space
2013-10-17 9:10 ` Peter Maydell
@ 2013-10-17 9:21 ` Marc Zyngier
2013-10-17 9:31 ` Peter Maydell
2013-10-17 18:34 ` Christoffer Dall
2013-10-17 11:07 ` Anup Patel
1 sibling, 2 replies; 40+ messages in thread
From: Marc Zyngier @ 2013-10-17 9:21 UTC (permalink / raw)
To: linux-arm-kernel
On 17/10/13 10:10, Peter Maydell wrote:
> On 17 October 2013 09:37, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On 16/10/13 18:02, Anup Patel wrote:
>>> The PSCI SYSTEM_OFF and SYSTEM_RESET functions are VM or Guest level
>>> functions hence cannot be emulated by the in-kernel PSCI emulation code.
>>
>> Why can't we implement system-wide functionality in the kernel? I fail
>> to see the issue here.
>
> Because the kernel isn't emulating the whole board, and you need
> to power off or reset the whole board, not just the CPUs.
In which case we can forward a generic event, once KVM has dealt with
the CPUs.
>> I'm really not keen on this approach. Having part of the PSCI
>> implementation offloaded to userspace means we don't have a complete
>> implementation in KVM anymore, and we end-up duplicating functionality
>> all over the place.
>>
>> Also, OFF and RESET are not PSCI specific concepts, and could be
>> implemented in various ways. I'm more inclined to return a
>> *standardized* exit code that the various platforms can interpret.
>
> Maybe we should have a more generic "kernel can't handle this,
> toss it to userspace" API? That might also fit in with supporting
> guests that want to make SMC calls to an emulated monitor...
Indeed.
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space
2013-10-17 9:21 ` Marc Zyngier
@ 2013-10-17 9:31 ` Peter Maydell
2013-10-17 18:34 ` Christoffer Dall
1 sibling, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2013-10-17 9:31 UTC (permalink / raw)
To: linux-arm-kernel
On 17 October 2013 10:21, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 17/10/13 10:10, Peter Maydell wrote:
>> On 17 October 2013 09:37, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> On 16/10/13 18:02, Anup Patel wrote:
>>>> The PSCI SYSTEM_OFF and SYSTEM_RESET functions are VM or Guest level
>>>> functions hence cannot be emulated by the in-kernel PSCI emulation code.
>>>
>>> Why can't we implement system-wide functionality in the kernel? I fail
>>> to see the issue here.
>>
>> Because the kernel isn't emulating the whole board, and you need
>> to power off or reset the whole board, not just the CPUs.
>
> In which case we can forward a generic event, once KVM has dealt with
> the CPUs.
Ideally per-CPU reset should be driven by userspace, incidentally.
We ought to have an ioctl for "hey, reset this CPU': at the moment
we have to fake it up by having QEMU feed the CPU back the register
values it read on powerup, which is kind of ugly.
-- PMM
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 4/5] ARM: psci: Add support for system reboot and poweroff
2013-10-17 5:08 ` Anup Patel
@ 2013-10-17 9:50 ` Marc Zyngier
0 siblings, 0 replies; 40+ messages in thread
From: Marc Zyngier @ 2013-10-17 9:50 UTC (permalink / raw)
To: linux-arm-kernel
On 17/10/13 06:08, Anup Patel wrote:
> On Thu, Oct 17, 2013 at 3:47 AM, Rob Herring <robherring2@gmail.com> wrote:
>> On 10/16/2013 12:02 PM, Anup Patel wrote:
>>> We have PSCI SYSTEM_OFF and SYSTEM_RESET function call emulation
>>> available when running as Guest using KVM ARM.
>>>
>>> This patch implements system reboot and poweroff using PSCI
>>> SYSTEM_OFF and SYSTEM_RESET.
>>
>> I've done a similar patch [1] which also needs binding documentation to
>> go with it. The last version of which there is no agreement on is here [2].
>>
>> Rob
>>
>> [1] http://www.spinics.net/lists/arm-kernel/msg262217.html
>> [2] http://www.spinics.net/lists/devicetree/msg05348.html
>
> Thanks, for the early warning.
>
> I would be happy to go with your patch since you have already the device tree
> bindings covered.
>
> IMHO, we should try to go ahead with existing scheme of PSCI device tree
> bindings for SYSTEM_OFF and SYSTEM_RESET. Later anyone can propose
> revised PSCI device tree bindings. Its not nice to hold off good changes just
> because people don't agree on the device tree bindings.
*Shrug*
DT is an ABI. Even more for virtualization. It can't be revised without
keeping the old version alive. Once it is in, it is for good, and we
have support it forever. So we'd better get it right from the beginning,
no matter how long it takes.
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space
2013-10-17 9:10 ` Peter Maydell
2013-10-17 9:21 ` Marc Zyngier
@ 2013-10-17 11:07 ` Anup Patel
2013-10-17 11:13 ` Marc Zyngier
1 sibling, 1 reply; 40+ messages in thread
From: Anup Patel @ 2013-10-17 11:07 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 17, 2013 at 2:40 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 17 October 2013 09:37, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On 16/10/13 18:02, Anup Patel wrote:
>>> The PSCI SYSTEM_OFF and SYSTEM_RESET functions are VM or Guest level
>>> functions hence cannot be emulated by the in-kernel PSCI emulation code.
>>
>> Why can't we implement system-wide functionality in the kernel? I fail
>> to see the issue here.
>
> Because the kernel isn't emulating the whole board, and you need
> to power off or reset the whole board, not just the CPUs.
>
>> I'm really not keen on this approach. Having part of the PSCI
>> implementation offloaded to userspace means we don't have a complete
>> implementation in KVM anymore, and we end-up duplicating functionality
>> all over the place.
>>
>> Also, OFF and RESET are not PSCI specific concepts, and could be
>> implemented in various ways. I'm more inclined to return a
>> *standardized* exit code that the various platforms can interpret.
>
> Maybe we should have a more generic "kernel can't handle this,
> toss it to userspace" API? That might also fit in with supporting
> guests that want to make SMC calls to an emulated monitor...
Excatly, that is what this patch does by forwarding PSCI SYSTEM_OFF
and SYSTEM_RESET to user space (QEMU or KVMTOOL).
--
Anup
>
> -- PMM
> _______________________________________________
> kvmarm mailing list
> kvmarm at lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 8:47 ` Marc Zyngier
@ 2013-10-17 11:10 ` Anup Patel
2013-10-17 11:21 ` Marc Zyngier
0 siblings, 1 reply; 40+ messages in thread
From: Anup Patel @ 2013-10-17 11:10 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 17, 2013 at 2:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 17/10/13 07:45, Anup Patel wrote:
>> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
>> <christoffer.dall@linaro.org> wrote:
>>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
>>>> Update user space API interface headers for providing information to
>>>> user space needed to emulate PSCI function calls in user space (i.e.
>>>> QEMU or KVMTOOL).
>>>>
>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>>>> ---
>>>> include/uapi/linux/kvm.h | 7 +++++++
>>>> 1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>> index e32e776..dae2664 100644
>>>> --- a/include/uapi/linux/kvm.h
>>>> +++ b/include/uapi/linux/kvm.h
>>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
>>>> #define KVM_EXIT_WATCHDOG 21
>>>> #define KVM_EXIT_S390_TSCH 22
>>>> #define KVM_EXIT_EPR 23
>>>> +#define KVM_EXIT_PSCI 24
>>>>
>>>> /* For KVM_EXIT_INTERNAL_ERROR */
>>>> /* Emulate instruction failed. */
>>>> @@ -301,6 +302,12 @@ struct kvm_run {
>>>> struct {
>>>> __u32 epr;
>>>> } epr;
>>>> + /* KVM_EXIT_PSCI */
>>>> + struct {
>>>> + __u32 fn;
>>>> + __u64 args[7];
>>>> + __u64 ret[4];
>>>> + } psci;
>>>> /* Fix the size of the union. */
>>>> char padding[256];
>>>> };
>>>> --
>>>> 1.7.9.5
>>>>
>>> I am also wondering if this is not solving a very specific need without
>>> thinking a little more carefully about this problem.
>>
>> No, its not solving a specific problem.
>>
>> In fact, its more general because we pass complete info required to
>> emulate a PSCI call in user space.
>> (Please refer PSCI calling convention)
>>
>>>
>>> We have previously discussed the need for some secure side emulation
>>> in QEMU, and I think perhaps we need something more generic which allows
>>> user space to handle SMC calls and/or allows user space to "inject" some
>>> secure world runtime that the kernel can run in a partially or fully
>>> isolated container to handle SMC calls.
>>>
>>> Peter raised this issue previously and pointed to a proposal he had as
>>> well.
>>
>> If required we can have an additional field in kvm_run->psci which tells
>> whether the PSCI call is an SMC call or HVC call.
>>
>>>
>>> Is there a technical reason why we need something specifically directed
>>> to PSCI?
>>
>> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
>> of adding a separate VirtIO device for System reboot and System poweroff.
>>
>> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
>> emulation in user space we would also have an infrastructure for adding
>> emulation of new PSCI calls in user space.
>
> And I strongly oppose to that. It creates consistency issues (what if
> userspace implements one version of PSCI, and the kernel another?), and
> also some really horrible situations: Imagine you implement the SUSPEND
> operation in userspace, and want to wake the vcpu up with an interrupt.
> You'd end-up having to keep track of the state in the kernel, having to
> forward the interrupt event to userspace...
It is not about emulating all PSCI functions in user space. Its about forwarding
system-level PSCI functions or PSCI functions which cannot be emulated in
kernel to user space.
--
Anup
>
> So really, no.
>
> M.
> --
> Jazz is not dead. It just smells funny...
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space
2013-10-17 11:07 ` Anup Patel
@ 2013-10-17 11:13 ` Marc Zyngier
0 siblings, 0 replies; 40+ messages in thread
From: Marc Zyngier @ 2013-10-17 11:13 UTC (permalink / raw)
To: linux-arm-kernel
On 17/10/13 12:07, Anup Patel wrote:
> On Thu, Oct 17, 2013 at 2:40 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
>> On 17 October 2013 09:37, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> On 16/10/13 18:02, Anup Patel wrote:
>>>> The PSCI SYSTEM_OFF and SYSTEM_RESET functions are VM or Guest level
>>>> functions hence cannot be emulated by the in-kernel PSCI emulation code.
>>>
>>> Why can't we implement system-wide functionality in the kernel? I fail
>>> to see the issue here.
>>
>> Because the kernel isn't emulating the whole board, and you need
>> to power off or reset the whole board, not just the CPUs.
>>
>>> I'm really not keen on this approach. Having part of the PSCI
>>> implementation offloaded to userspace means we don't have a complete
>>> implementation in KVM anymore, and we end-up duplicating functionality
>>> all over the place.
>>>
>>> Also, OFF and RESET are not PSCI specific concepts, and could be
>>> implemented in various ways. I'm more inclined to return a
>>> *standardized* exit code that the various platforms can interpret.
>>
>> Maybe we should have a more generic "kernel can't handle this,
>> toss it to userspace" API? That might also fit in with supporting
>> guests that want to make SMC calls to an emulated monitor...
>
> Excatly, that is what this patch does by forwarding PSCI SYSTEM_OFF
> and SYSTEM_RESET to user space (QEMU or KVMTOOL).
I think you missed the words standardized and generic above.
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space
2013-10-17 8:37 ` Marc Zyngier
2013-10-17 9:10 ` Peter Maydell
@ 2013-10-17 11:13 ` Anup Patel
2013-10-17 18:29 ` Christoffer Dall
1 sibling, 1 reply; 40+ messages in thread
From: Anup Patel @ 2013-10-17 11:13 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 17, 2013 at 2:07 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 16/10/13 18:02, Anup Patel wrote:
>> The PSCI SYSTEM_OFF and SYSTEM_RESET functions are VM or Guest level
>> functions hence cannot be emulated by the in-kernel PSCI emulation code.
>
> Why can't we implement system-wide functionality in the kernel? I fail
> to see the issue here.
>
>> To tackle this, we forward PSCI SYSTEM_OFF and SYSTEM_RESET function
>> calls from Guest to user space (i.e. QEMU or KVMTOOL) via KVM run
>> structure with KVM_EXIT_PSCI exit reason.
>
> I'm really not keen on this approach. Having part of the PSCI
> implementation offloaded to userspace means we don't have a complete
> implementation in KVM anymore, and we end-up duplicating functionality
> all over the place.
>
> Also, OFF and RESET are not PSCI specific concepts, and could be
> implemented in various ways. I'm more inclined to return a
> *standardized* exit code that the various platforms can interpret.
Please refer to the documentation of SYSTEM_OFF and SYSTEM_RESET
functions in the PSCI specifications.
(More preciesly section 5.10 and 5.11 of
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0022b/index.html)
--
Anup
>
> M.
> --
> Jazz is not dead. It just smells funny...
>
>
> _______________________________________________
> kvmarm mailing list
> kvmarm at lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 11:10 ` Anup Patel
@ 2013-10-17 11:21 ` Marc Zyngier
2013-10-17 11:30 ` Anup Patel
0 siblings, 1 reply; 40+ messages in thread
From: Marc Zyngier @ 2013-10-17 11:21 UTC (permalink / raw)
To: linux-arm-kernel
On 17/10/13 12:10, Anup Patel wrote:
> On Thu, Oct 17, 2013 at 2:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On 17/10/13 07:45, Anup Patel wrote:
>>> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
>>> <christoffer.dall@linaro.org> wrote:
>>>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
>>>>> Update user space API interface headers for providing information to
>>>>> user space needed to emulate PSCI function calls in user space (i.e.
>>>>> QEMU or KVMTOOL).
>>>>>
>>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>>>>> ---
>>>>> include/uapi/linux/kvm.h | 7 +++++++
>>>>> 1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>> index e32e776..dae2664 100644
>>>>> --- a/include/uapi/linux/kvm.h
>>>>> +++ b/include/uapi/linux/kvm.h
>>>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
>>>>> #define KVM_EXIT_WATCHDOG 21
>>>>> #define KVM_EXIT_S390_TSCH 22
>>>>> #define KVM_EXIT_EPR 23
>>>>> +#define KVM_EXIT_PSCI 24
>>>>>
>>>>> /* For KVM_EXIT_INTERNAL_ERROR */
>>>>> /* Emulate instruction failed. */
>>>>> @@ -301,6 +302,12 @@ struct kvm_run {
>>>>> struct {
>>>>> __u32 epr;
>>>>> } epr;
>>>>> + /* KVM_EXIT_PSCI */
>>>>> + struct {
>>>>> + __u32 fn;
>>>>> + __u64 args[7];
>>>>> + __u64 ret[4];
>>>>> + } psci;
>>>>> /* Fix the size of the union. */
>>>>> char padding[256];
>>>>> };
>>>>> --
>>>>> 1.7.9.5
>>>>>
>>>> I am also wondering if this is not solving a very specific need without
>>>> thinking a little more carefully about this problem.
>>>
>>> No, its not solving a specific problem.
>>>
>>> In fact, its more general because we pass complete info required to
>>> emulate a PSCI call in user space.
>>> (Please refer PSCI calling convention)
>>>
>>>>
>>>> We have previously discussed the need for some secure side emulation
>>>> in QEMU, and I think perhaps we need something more generic which allows
>>>> user space to handle SMC calls and/or allows user space to "inject" some
>>>> secure world runtime that the kernel can run in a partially or fully
>>>> isolated container to handle SMC calls.
>>>>
>>>> Peter raised this issue previously and pointed to a proposal he had as
>>>> well.
>>>
>>> If required we can have an additional field in kvm_run->psci which tells
>>> whether the PSCI call is an SMC call or HVC call.
>>>
>>>>
>>>> Is there a technical reason why we need something specifically directed
>>>> to PSCI?
>>>
>>> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
>>> of adding a separate VirtIO device for System reboot and System poweroff.
>>>
>>> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
>>> emulation in user space we would also have an infrastructure for adding
>>> emulation of new PSCI calls in user space.
>>
>> And I strongly oppose to that. It creates consistency issues (what if
>> userspace implements one version of PSCI, and the kernel another?), and
>> also some really horrible situations: Imagine you implement the SUSPEND
>> operation in userspace, and want to wake the vcpu up with an interrupt.
>> You'd end-up having to keep track of the state in the kernel, having to
>> forward the interrupt event to userspace...
>
> It is not about emulating all PSCI functions in user space. Its about forwarding
> system-level PSCI functions or PSCI functions which cannot be emulated in
> kernel to user space.
The CPU parts of PSCI can perfectly be implemented in the kernel.
Then you can return something to userspace indicating what just
happened. And it doesn't have to be PSCI specific.
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 11:21 ` Marc Zyngier
@ 2013-10-17 11:30 ` Anup Patel
2013-10-17 11:49 ` Alexander Graf
2013-10-17 11:52 ` Marc Zyngier
0 siblings, 2 replies; 40+ messages in thread
From: Anup Patel @ 2013-10-17 11:30 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 17, 2013 at 4:51 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 17/10/13 12:10, Anup Patel wrote:
>> On Thu, Oct 17, 2013 at 2:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> On 17/10/13 07:45, Anup Patel wrote:
>>>> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
>>>> <christoffer.dall@linaro.org> wrote:
>>>>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
>>>>>> Update user space API interface headers for providing information to
>>>>>> user space needed to emulate PSCI function calls in user space (i.e.
>>>>>> QEMU or KVMTOOL).
>>>>>>
>>>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>>>>>> ---
>>>>>> include/uapi/linux/kvm.h | 7 +++++++
>>>>>> 1 file changed, 7 insertions(+)
>>>>>>
>>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>>> index e32e776..dae2664 100644
>>>>>> --- a/include/uapi/linux/kvm.h
>>>>>> +++ b/include/uapi/linux/kvm.h
>>>>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
>>>>>> #define KVM_EXIT_WATCHDOG 21
>>>>>> #define KVM_EXIT_S390_TSCH 22
>>>>>> #define KVM_EXIT_EPR 23
>>>>>> +#define KVM_EXIT_PSCI 24
>>>>>>
>>>>>> /* For KVM_EXIT_INTERNAL_ERROR */
>>>>>> /* Emulate instruction failed. */
>>>>>> @@ -301,6 +302,12 @@ struct kvm_run {
>>>>>> struct {
>>>>>> __u32 epr;
>>>>>> } epr;
>>>>>> + /* KVM_EXIT_PSCI */
>>>>>> + struct {
>>>>>> + __u32 fn;
>>>>>> + __u64 args[7];
>>>>>> + __u64 ret[4];
>>>>>> + } psci;
>>>>>> /* Fix the size of the union. */
>>>>>> char padding[256];
>>>>>> };
>>>>>> --
>>>>>> 1.7.9.5
>>>>>>
>>>>> I am also wondering if this is not solving a very specific need without
>>>>> thinking a little more carefully about this problem.
>>>>
>>>> No, its not solving a specific problem.
>>>>
>>>> In fact, its more general because we pass complete info required to
>>>> emulate a PSCI call in user space.
>>>> (Please refer PSCI calling convention)
>>>>
>>>>>
>>>>> We have previously discussed the need for some secure side emulation
>>>>> in QEMU, and I think perhaps we need something more generic which allows
>>>>> user space to handle SMC calls and/or allows user space to "inject" some
>>>>> secure world runtime that the kernel can run in a partially or fully
>>>>> isolated container to handle SMC calls.
>>>>>
>>>>> Peter raised this issue previously and pointed to a proposal he had as
>>>>> well.
>>>>
>>>> If required we can have an additional field in kvm_run->psci which tells
>>>> whether the PSCI call is an SMC call or HVC call.
>>>>
>>>>>
>>>>> Is there a technical reason why we need something specifically directed
>>>>> to PSCI?
>>>>
>>>> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
>>>> of adding a separate VirtIO device for System reboot and System poweroff.
>>>>
>>>> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
>>>> emulation in user space we would also have an infrastructure for adding
>>>> emulation of new PSCI calls in user space.
>>>
>>> And I strongly oppose to that. It creates consistency issues (what if
>>> userspace implements one version of PSCI, and the kernel another?), and
>>> also some really horrible situations: Imagine you implement the SUSPEND
>>> operation in userspace, and want to wake the vcpu up with an interrupt.
>>> You'd end-up having to keep track of the state in the kernel, having to
>>> forward the interrupt event to userspace...
>>
>> It is not about emulating all PSCI functions in user space. Its about forwarding
>> system-level PSCI functions or PSCI functions which cannot be emulated in
>> kernel to user space.
>
> The CPU parts of PSCI can perfectly be implemented in the kernel.
Agreed. This patches does the same.
>
> Then you can return something to userspace indicating what just
> happened. And it doesn't have to be PSCI specific.
Are you suggesting that everytime we want to emulate some new
PSCI call with help from user space (e.g. SYSTEM_OFF and
SYSTEM_RESET), we add new exit reasons and just keep on
increasing KVM exit reasons ?
Why can't the exit reason and exit info in struct kvm_run be
PSCI specific ?
On the contrary, it will be good to have exit reason and exit info
PSCI specific because we have PSCI specification which tells
how it is to be emulated ?
--
Anup
>
> M.
> --
> Jazz is not dead. It just smells funny...
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 11:30 ` Anup Patel
@ 2013-10-17 11:49 ` Alexander Graf
2013-10-17 11:55 ` Marc Zyngier
2013-10-17 11:52 ` Marc Zyngier
1 sibling, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-10-17 11:49 UTC (permalink / raw)
To: linux-arm-kernel
On 17.10.2013, at 13:30, Anup Patel <anup@brainfault.org> wrote:
> On Thu, Oct 17, 2013 at 4:51 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On 17/10/13 12:10, Anup Patel wrote:
>>> On Thu, Oct 17, 2013 at 2:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>> On 17/10/13 07:45, Anup Patel wrote:
>>>>> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
>>>>> <christoffer.dall@linaro.org> wrote:
>>>>>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
>>>>>>> Update user space API interface headers for providing information to
>>>>>>> user space needed to emulate PSCI function calls in user space (i.e.
>>>>>>> QEMU or KVMTOOL).
>>>>>>>
>>>>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>>>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>>>>>>> ---
>>>>>>> include/uapi/linux/kvm.h | 7 +++++++
>>>>>>> 1 file changed, 7 insertions(+)
>>>>>>>
>>>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>>>> index e32e776..dae2664 100644
>>>>>>> --- a/include/uapi/linux/kvm.h
>>>>>>> +++ b/include/uapi/linux/kvm.h
>>>>>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
>>>>>>> #define KVM_EXIT_WATCHDOG 21
>>>>>>> #define KVM_EXIT_S390_TSCH 22
>>>>>>> #define KVM_EXIT_EPR 23
>>>>>>> +#define KVM_EXIT_PSCI 24
>>>>>>>
>>>>>>> /* For KVM_EXIT_INTERNAL_ERROR */
>>>>>>> /* Emulate instruction failed. */
>>>>>>> @@ -301,6 +302,12 @@ struct kvm_run {
>>>>>>> struct {
>>>>>>> __u32 epr;
>>>>>>> } epr;
>>>>>>> + /* KVM_EXIT_PSCI */
>>>>>>> + struct {
>>>>>>> + __u32 fn;
>>>>>>> + __u64 args[7];
>>>>>>> + __u64 ret[4];
>>>>>>> + } psci;
>>>>>>> /* Fix the size of the union. */
>>>>>>> char padding[256];
>>>>>>> };
>>>>>>> --
>>>>>>> 1.7.9.5
>>>>>>>
>>>>>> I am also wondering if this is not solving a very specific need without
>>>>>> thinking a little more carefully about this problem.
>>>>>
>>>>> No, its not solving a specific problem.
>>>>>
>>>>> In fact, its more general because we pass complete info required to
>>>>> emulate a PSCI call in user space.
>>>>> (Please refer PSCI calling convention)
>>>>>
>>>>>>
>>>>>> We have previously discussed the need for some secure side emulation
>>>>>> in QEMU, and I think perhaps we need something more generic which allows
>>>>>> user space to handle SMC calls and/or allows user space to "inject" some
>>>>>> secure world runtime that the kernel can run in a partially or fully
>>>>>> isolated container to handle SMC calls.
>>>>>>
>>>>>> Peter raised this issue previously and pointed to a proposal he had as
>>>>>> well.
>>>>>
>>>>> If required we can have an additional field in kvm_run->psci which tells
>>>>> whether the PSCI call is an SMC call or HVC call.
>>>>>
>>>>>>
>>>>>> Is there a technical reason why we need something specifically directed
>>>>>> to PSCI?
>>>>>
>>>>> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
>>>>> of adding a separate VirtIO device for System reboot and System poweroff.
>>>>>
>>>>> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
>>>>> emulation in user space we would also have an infrastructure for adding
>>>>> emulation of new PSCI calls in user space.
>>>>
>>>> And I strongly oppose to that. It creates consistency issues (what if
>>>> userspace implements one version of PSCI, and the kernel another?), and
>>>> also some really horrible situations: Imagine you implement the SUSPEND
>>>> operation in userspace, and want to wake the vcpu up with an interrupt.
>>>> You'd end-up having to keep track of the state in the kernel, having to
>>>> forward the interrupt event to userspace...
>>>
>>> It is not about emulating all PSCI functions in user space. Its about forwarding
>>> system-level PSCI functions or PSCI functions which cannot be emulated in
>>> kernel to user space.
>>
>> The CPU parts of PSCI can perfectly be implemented in the kernel.
>
> Agreed. This patches does the same.
>
>>
>> Then you can return something to userspace indicating what just
>> happened. And it doesn't have to be PSCI specific.
>
> Are you suggesting that everytime we want to emulate some new
> PSCI call with help from user space (e.g. SYSTEM_OFF and
> SYSTEM_RESET), we add new exit reasons and just keep on
> increasing KVM exit reasons ?
>
> Why can't the exit reason and exit info in struct kvm_run be
> PSCI specific ?
>
> On the contrary, it will be good to have exit reason and exit info
> PSCI specific because we have PSCI specification which tells
> how it is to be emulated ?
I completely agree with Marc that split-brain ownership of any address space (and PSCI is basically one) is a very bad idea.
However, so far the only solution I've seen mentioned is that the kernel owns PSCI (read: decodes it) and then drives user space with explicit commands.
Couldn't we reverse this logic? User space owns PSCI. By default all PSCI calls go to user space. If a PSCI call makes more sense to be executed by kvm, it can explicitly route it to be handled by kvm instead.
That way the owner is still at a single spot and we can fast path the few cases that may be performance critical or a lot easier to handle in kvm.
The good part about this is that we get consistency in QEMU with the TCG PSCI handlers along the way.
Alex
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 11:30 ` Anup Patel
2013-10-17 11:49 ` Alexander Graf
@ 2013-10-17 11:52 ` Marc Zyngier
1 sibling, 0 replies; 40+ messages in thread
From: Marc Zyngier @ 2013-10-17 11:52 UTC (permalink / raw)
To: linux-arm-kernel
On 17/10/13 12:30, Anup Patel wrote:
> On Thu, Oct 17, 2013 at 4:51 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On 17/10/13 12:10, Anup Patel wrote:
>>> On Thu, Oct 17, 2013 at 2:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>> On 17/10/13 07:45, Anup Patel wrote:
>>>>> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
>>>>> <christoffer.dall@linaro.org> wrote:
>>>>>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
>>>>>>> Update user space API interface headers for providing information to
>>>>>>> user space needed to emulate PSCI function calls in user space (i.e.
>>>>>>> QEMU or KVMTOOL).
>>>>>>>
>>>>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>>>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>>>>>>> ---
>>>>>>> include/uapi/linux/kvm.h | 7 +++++++
>>>>>>> 1 file changed, 7 insertions(+)
>>>>>>>
>>>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>>>> index e32e776..dae2664 100644
>>>>>>> --- a/include/uapi/linux/kvm.h
>>>>>>> +++ b/include/uapi/linux/kvm.h
>>>>>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
>>>>>>> #define KVM_EXIT_WATCHDOG 21
>>>>>>> #define KVM_EXIT_S390_TSCH 22
>>>>>>> #define KVM_EXIT_EPR 23
>>>>>>> +#define KVM_EXIT_PSCI 24
>>>>>>>
>>>>>>> /* For KVM_EXIT_INTERNAL_ERROR */
>>>>>>> /* Emulate instruction failed. */
>>>>>>> @@ -301,6 +302,12 @@ struct kvm_run {
>>>>>>> struct {
>>>>>>> __u32 epr;
>>>>>>> } epr;
>>>>>>> + /* KVM_EXIT_PSCI */
>>>>>>> + struct {
>>>>>>> + __u32 fn;
>>>>>>> + __u64 args[7];
>>>>>>> + __u64 ret[4];
>>>>>>> + } psci;
>>>>>>> /* Fix the size of the union. */
>>>>>>> char padding[256];
>>>>>>> };
>>>>>>> --
>>>>>>> 1.7.9.5
>>>>>>>
>>>>>> I am also wondering if this is not solving a very specific need without
>>>>>> thinking a little more carefully about this problem.
>>>>>
>>>>> No, its not solving a specific problem.
>>>>>
>>>>> In fact, its more general because we pass complete info required to
>>>>> emulate a PSCI call in user space.
>>>>> (Please refer PSCI calling convention)
>>>>>
>>>>>>
>>>>>> We have previously discussed the need for some secure side emulation
>>>>>> in QEMU, and I think perhaps we need something more generic which allows
>>>>>> user space to handle SMC calls and/or allows user space to "inject" some
>>>>>> secure world runtime that the kernel can run in a partially or fully
>>>>>> isolated container to handle SMC calls.
>>>>>>
>>>>>> Peter raised this issue previously and pointed to a proposal he had as
>>>>>> well.
>>>>>
>>>>> If required we can have an additional field in kvm_run->psci which tells
>>>>> whether the PSCI call is an SMC call or HVC call.
>>>>>
>>>>>>
>>>>>> Is there a technical reason why we need something specifically directed
>>>>>> to PSCI?
>>>>>
>>>>> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
>>>>> of adding a separate VirtIO device for System reboot and System poweroff.
>>>>>
>>>>> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
>>>>> emulation in user space we would also have an infrastructure for adding
>>>>> emulation of new PSCI calls in user space.
>>>>
>>>> And I strongly oppose to that. It creates consistency issues (what if
>>>> userspace implements one version of PSCI, and the kernel another?), and
>>>> also some really horrible situations: Imagine you implement the SUSPEND
>>>> operation in userspace, and want to wake the vcpu up with an interrupt.
>>>> You'd end-up having to keep track of the state in the kernel, having to
>>>> forward the interrupt event to userspace...
>>>
>>> It is not about emulating all PSCI functions in user space. Its about forwarding
>>> system-level PSCI functions or PSCI functions which cannot be emulated in
>>> kernel to user space.
>>
>> The CPU parts of PSCI can perfectly be implemented in the kernel.
>
> Agreed. This patches does the same.
No it doesn't. There is more to it than just signalling userspace.
Things along the line of putting CPUs in reset, for example.
Now, for the platform part, I want this to be part of a much larger
discussion around how we offload platform-related things to userspace.
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 11:49 ` Alexander Graf
@ 2013-10-17 11:55 ` Marc Zyngier
2013-10-17 12:01 ` Alexander Graf
2013-10-17 15:32 ` Anup Patel
0 siblings, 2 replies; 40+ messages in thread
From: Marc Zyngier @ 2013-10-17 11:55 UTC (permalink / raw)
To: linux-arm-kernel
On 17/10/13 12:49, Alexander Graf wrote:
>
> On 17.10.2013, at 13:30, Anup Patel <anup@brainfault.org> wrote:
>
>> On Thu, Oct 17, 2013 at 4:51 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> On 17/10/13 12:10, Anup Patel wrote:
>>>> On Thu, Oct 17, 2013 at 2:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>> On 17/10/13 07:45, Anup Patel wrote:
>>>>>> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
>>>>>> <christoffer.dall@linaro.org> wrote:
>>>>>>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
>>>>>>>> Update user space API interface headers for providing information to
>>>>>>>> user space needed to emulate PSCI function calls in user space (i.e.
>>>>>>>> QEMU or KVMTOOL).
>>>>>>>>
>>>>>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>>>>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>>>>>>>> ---
>>>>>>>> include/uapi/linux/kvm.h | 7 +++++++
>>>>>>>> 1 file changed, 7 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>>>>> index e32e776..dae2664 100644
>>>>>>>> --- a/include/uapi/linux/kvm.h
>>>>>>>> +++ b/include/uapi/linux/kvm.h
>>>>>>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
>>>>>>>> #define KVM_EXIT_WATCHDOG 21
>>>>>>>> #define KVM_EXIT_S390_TSCH 22
>>>>>>>> #define KVM_EXIT_EPR 23
>>>>>>>> +#define KVM_EXIT_PSCI 24
>>>>>>>>
>>>>>>>> /* For KVM_EXIT_INTERNAL_ERROR */
>>>>>>>> /* Emulate instruction failed. */
>>>>>>>> @@ -301,6 +302,12 @@ struct kvm_run {
>>>>>>>> struct {
>>>>>>>> __u32 epr;
>>>>>>>> } epr;
>>>>>>>> + /* KVM_EXIT_PSCI */
>>>>>>>> + struct {
>>>>>>>> + __u32 fn;
>>>>>>>> + __u64 args[7];
>>>>>>>> + __u64 ret[4];
>>>>>>>> + } psci;
>>>>>>>> /* Fix the size of the union. */
>>>>>>>> char padding[256];
>>>>>>>> };
>>>>>>>> --
>>>>>>>> 1.7.9.5
>>>>>>>>
>>>>>>> I am also wondering if this is not solving a very specific need without
>>>>>>> thinking a little more carefully about this problem.
>>>>>>
>>>>>> No, its not solving a specific problem.
>>>>>>
>>>>>> In fact, its more general because we pass complete info required to
>>>>>> emulate a PSCI call in user space.
>>>>>> (Please refer PSCI calling convention)
>>>>>>
>>>>>>>
>>>>>>> We have previously discussed the need for some secure side emulation
>>>>>>> in QEMU, and I think perhaps we need something more generic which allows
>>>>>>> user space to handle SMC calls and/or allows user space to "inject" some
>>>>>>> secure world runtime that the kernel can run in a partially or fully
>>>>>>> isolated container to handle SMC calls.
>>>>>>>
>>>>>>> Peter raised this issue previously and pointed to a proposal he had as
>>>>>>> well.
>>>>>>
>>>>>> If required we can have an additional field in kvm_run->psci which tells
>>>>>> whether the PSCI call is an SMC call or HVC call.
>>>>>>
>>>>>>>
>>>>>>> Is there a technical reason why we need something specifically directed
>>>>>>> to PSCI?
>>>>>>
>>>>>> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
>>>>>> of adding a separate VirtIO device for System reboot and System poweroff.
>>>>>>
>>>>>> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
>>>>>> emulation in user space we would also have an infrastructure for adding
>>>>>> emulation of new PSCI calls in user space.
>>>>>
>>>>> And I strongly oppose to that. It creates consistency issues (what if
>>>>> userspace implements one version of PSCI, and the kernel another?), and
>>>>> also some really horrible situations: Imagine you implement the SUSPEND
>>>>> operation in userspace, and want to wake the vcpu up with an interrupt.
>>>>> You'd end-up having to keep track of the state in the kernel, having to
>>>>> forward the interrupt event to userspace...
>>>>
>>>> It is not about emulating all PSCI functions in user space. Its about forwarding
>>>> system-level PSCI functions or PSCI functions which cannot be emulated in
>>>> kernel to user space.
>>>
>>> The CPU parts of PSCI can perfectly be implemented in the kernel.
>>
>> Agreed. This patches does the same.
>>
>>>
>>> Then you can return something to userspace indicating what just
>>> happened. And it doesn't have to be PSCI specific.
>>
>> Are you suggesting that everytime we want to emulate some new
>> PSCI call with help from user space (e.g. SYSTEM_OFF and
>> SYSTEM_RESET), we add new exit reasons and just keep on
>> increasing KVM exit reasons ?
>>
>> Why can't the exit reason and exit info in struct kvm_run be
>> PSCI specific ?
>>
>> On the contrary, it will be good to have exit reason and exit info
>> PSCI specific because we have PSCI specification which tells
>> how it is to be emulated ?
>
> I completely agree with Marc that split-brain ownership of any address space (and PSCI is basically one) is a very bad idea.
>
> However, so far the only solution I've seen mentioned is that the kernel owns PSCI (read: decodes it) and then drives user space with explicit commands.
>
> Couldn't we reverse this logic? User space owns PSCI. By default all PSCI calls go to user space. If a PSCI call makes more sense to be executed by kvm, it can explicitly route it to be handled by kvm instead.
>
> That way the owner is still at a single spot and we can fast path the few cases that may be performance critical or a lot easier to handle in kvm.
>
> The good part about this is that we get consistency in QEMU with the TCG PSCI handlers along the way.
The only nag here is that you can't do that for every function: SUSPEND
is one, for example. Once your vcpu is suspended, you need to to wake it
up with an interrupt, which are not routed to userspace (TFFT!).
So it becomes yet another can of worms, and I rather keep it simple.
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 11:55 ` Marc Zyngier
@ 2013-10-17 12:01 ` Alexander Graf
2013-10-17 19:04 ` Christoffer Dall
2013-10-17 15:32 ` Anup Patel
1 sibling, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-10-17 12:01 UTC (permalink / raw)
To: linux-arm-kernel
On 17.10.2013, at 13:55, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 17/10/13 12:49, Alexander Graf wrote:
>>
>> On 17.10.2013, at 13:30, Anup Patel <anup@brainfault.org> wrote:
>>
>>> On Thu, Oct 17, 2013 at 4:51 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>> On 17/10/13 12:10, Anup Patel wrote:
>>>>> On Thu, Oct 17, 2013 at 2:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>>> On 17/10/13 07:45, Anup Patel wrote:
>>>>>>> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
>>>>>>> <christoffer.dall@linaro.org> wrote:
>>>>>>>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
>>>>>>>>> Update user space API interface headers for providing information to
>>>>>>>>> user space needed to emulate PSCI function calls in user space (i.e.
>>>>>>>>> QEMU or KVMTOOL).
>>>>>>>>>
>>>>>>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>>>>>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>>>>>>>>> ---
>>>>>>>>> include/uapi/linux/kvm.h | 7 +++++++
>>>>>>>>> 1 file changed, 7 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>>>>>> index e32e776..dae2664 100644
>>>>>>>>> --- a/include/uapi/linux/kvm.h
>>>>>>>>> +++ b/include/uapi/linux/kvm.h
>>>>>>>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
>>>>>>>>> #define KVM_EXIT_WATCHDOG 21
>>>>>>>>> #define KVM_EXIT_S390_TSCH 22
>>>>>>>>> #define KVM_EXIT_EPR 23
>>>>>>>>> +#define KVM_EXIT_PSCI 24
>>>>>>>>>
>>>>>>>>> /* For KVM_EXIT_INTERNAL_ERROR */
>>>>>>>>> /* Emulate instruction failed. */
>>>>>>>>> @@ -301,6 +302,12 @@ struct kvm_run {
>>>>>>>>> struct {
>>>>>>>>> __u32 epr;
>>>>>>>>> } epr;
>>>>>>>>> + /* KVM_EXIT_PSCI */
>>>>>>>>> + struct {
>>>>>>>>> + __u32 fn;
>>>>>>>>> + __u64 args[7];
>>>>>>>>> + __u64 ret[4];
>>>>>>>>> + } psci;
>>>>>>>>> /* Fix the size of the union. */
>>>>>>>>> char padding[256];
>>>>>>>>> };
>>>>>>>>> --
>>>>>>>>> 1.7.9.5
>>>>>>>>>
>>>>>>>> I am also wondering if this is not solving a very specific need without
>>>>>>>> thinking a little more carefully about this problem.
>>>>>>>
>>>>>>> No, its not solving a specific problem.
>>>>>>>
>>>>>>> In fact, its more general because we pass complete info required to
>>>>>>> emulate a PSCI call in user space.
>>>>>>> (Please refer PSCI calling convention)
>>>>>>>
>>>>>>>>
>>>>>>>> We have previously discussed the need for some secure side emulation
>>>>>>>> in QEMU, and I think perhaps we need something more generic which allows
>>>>>>>> user space to handle SMC calls and/or allows user space to "inject" some
>>>>>>>> secure world runtime that the kernel can run in a partially or fully
>>>>>>>> isolated container to handle SMC calls.
>>>>>>>>
>>>>>>>> Peter raised this issue previously and pointed to a proposal he had as
>>>>>>>> well.
>>>>>>>
>>>>>>> If required we can have an additional field in kvm_run->psci which tells
>>>>>>> whether the PSCI call is an SMC call or HVC call.
>>>>>>>
>>>>>>>>
>>>>>>>> Is there a technical reason why we need something specifically directed
>>>>>>>> to PSCI?
>>>>>>>
>>>>>>> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
>>>>>>> of adding a separate VirtIO device for System reboot and System poweroff.
>>>>>>>
>>>>>>> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
>>>>>>> emulation in user space we would also have an infrastructure for adding
>>>>>>> emulation of new PSCI calls in user space.
>>>>>>
>>>>>> And I strongly oppose to that. It creates consistency issues (what if
>>>>>> userspace implements one version of PSCI, and the kernel another?), and
>>>>>> also some really horrible situations: Imagine you implement the SUSPEND
>>>>>> operation in userspace, and want to wake the vcpu up with an interrupt.
>>>>>> You'd end-up having to keep track of the state in the kernel, having to
>>>>>> forward the interrupt event to userspace...
>>>>>
>>>>> It is not about emulating all PSCI functions in user space. Its about forwarding
>>>>> system-level PSCI functions or PSCI functions which cannot be emulated in
>>>>> kernel to user space.
>>>>
>>>> The CPU parts of PSCI can perfectly be implemented in the kernel.
>>>
>>> Agreed. This patches does the same.
>>>
>>>>
>>>> Then you can return something to userspace indicating what just
>>>> happened. And it doesn't have to be PSCI specific.
>>>
>>> Are you suggesting that everytime we want to emulate some new
>>> PSCI call with help from user space (e.g. SYSTEM_OFF and
>>> SYSTEM_RESET), we add new exit reasons and just keep on
>>> increasing KVM exit reasons ?
>>>
>>> Why can't the exit reason and exit info in struct kvm_run be
>>> PSCI specific ?
>>>
>>> On the contrary, it will be good to have exit reason and exit info
>>> PSCI specific because we have PSCI specification which tells
>>> how it is to be emulated ?
>>
>> I completely agree with Marc that split-brain ownership of any address space (and PSCI is basically one) is a very bad idea.
>>
>> However, so far the only solution I've seen mentioned is that the kernel owns PSCI (read: decodes it) and then drives user space with explicit commands.
>>
>> Couldn't we reverse this logic? User space owns PSCI. By default all PSCI calls go to user space. If a PSCI call makes more sense to be executed by kvm, it can explicitly route it to be handled by kvm instead.
>>
>> That way the owner is still at a single spot and we can fast path the few cases that may be performance critical or a lot easier to handle in kvm.
>>
>> The good part about this is that we get consistency in QEMU with the TCG PSCI handlers along the way.
>
> The only nag here is that you can't do that for every function: SUSPEND
> is one, for example. Once your vcpu is suspended, you need to to wake it
> up with an interrupt, which are not routed to userspace (TFFT!).
Not sure I understand. Can't you just vcpu_kick() it with a posix signal to get it out of vcpu_run() and unset the "suspended" state? If you guarantee that you don't get spurious exits out of SUSPEND you need to be able to set/unset that bit anyways for migration.
Alex
>
> So it becomes yet another can of worms, and I rather keep it simple.
>
> M.
> --
> Jazz is not dead. It just smells funny...
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 11:55 ` Marc Zyngier
2013-10-17 12:01 ` Alexander Graf
@ 2013-10-17 15:32 ` Anup Patel
1 sibling, 0 replies; 40+ messages in thread
From: Anup Patel @ 2013-10-17 15:32 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 17, 2013 at 5:25 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 17/10/13 12:49, Alexander Graf wrote:
>>
>> On 17.10.2013, at 13:30, Anup Patel <anup@brainfault.org> wrote:
>>
>>> On Thu, Oct 17, 2013 at 4:51 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>> On 17/10/13 12:10, Anup Patel wrote:
>>>>> On Thu, Oct 17, 2013 at 2:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>>> On 17/10/13 07:45, Anup Patel wrote:
>>>>>>> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
>>>>>>> <christoffer.dall@linaro.org> wrote:
>>>>>>>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
>>>>>>>>> Update user space API interface headers for providing information to
>>>>>>>>> user space needed to emulate PSCI function calls in user space (i.e.
>>>>>>>>> QEMU or KVMTOOL).
>>>>>>>>>
>>>>>>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>>>>>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>>>>>>>>> ---
>>>>>>>>> include/uapi/linux/kvm.h | 7 +++++++
>>>>>>>>> 1 file changed, 7 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>>>>>> index e32e776..dae2664 100644
>>>>>>>>> --- a/include/uapi/linux/kvm.h
>>>>>>>>> +++ b/include/uapi/linux/kvm.h
>>>>>>>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
>>>>>>>>> #define KVM_EXIT_WATCHDOG 21
>>>>>>>>> #define KVM_EXIT_S390_TSCH 22
>>>>>>>>> #define KVM_EXIT_EPR 23
>>>>>>>>> +#define KVM_EXIT_PSCI 24
>>>>>>>>>
>>>>>>>>> /* For KVM_EXIT_INTERNAL_ERROR */
>>>>>>>>> /* Emulate instruction failed. */
>>>>>>>>> @@ -301,6 +302,12 @@ struct kvm_run {
>>>>>>>>> struct {
>>>>>>>>> __u32 epr;
>>>>>>>>> } epr;
>>>>>>>>> + /* KVM_EXIT_PSCI */
>>>>>>>>> + struct {
>>>>>>>>> + __u32 fn;
>>>>>>>>> + __u64 args[7];
>>>>>>>>> + __u64 ret[4];
>>>>>>>>> + } psci;
>>>>>>>>> /* Fix the size of the union. */
>>>>>>>>> char padding[256];
>>>>>>>>> };
>>>>>>>>> --
>>>>>>>>> 1.7.9.5
>>>>>>>>>
>>>>>>>> I am also wondering if this is not solving a very specific need without
>>>>>>>> thinking a little more carefully about this problem.
>>>>>>>
>>>>>>> No, its not solving a specific problem.
>>>>>>>
>>>>>>> In fact, its more general because we pass complete info required to
>>>>>>> emulate a PSCI call in user space.
>>>>>>> (Please refer PSCI calling convention)
>>>>>>>
>>>>>>>>
>>>>>>>> We have previously discussed the need for some secure side emulation
>>>>>>>> in QEMU, and I think perhaps we need something more generic which allows
>>>>>>>> user space to handle SMC calls and/or allows user space to "inject" some
>>>>>>>> secure world runtime that the kernel can run in a partially or fully
>>>>>>>> isolated container to handle SMC calls.
>>>>>>>>
>>>>>>>> Peter raised this issue previously and pointed to a proposal he had as
>>>>>>>> well.
>>>>>>>
>>>>>>> If required we can have an additional field in kvm_run->psci which tells
>>>>>>> whether the PSCI call is an SMC call or HVC call.
>>>>>>>
>>>>>>>>
>>>>>>>> Is there a technical reason why we need something specifically directed
>>>>>>>> to PSCI?
>>>>>>>
>>>>>>> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
>>>>>>> of adding a separate VirtIO device for System reboot and System poweroff.
>>>>>>>
>>>>>>> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
>>>>>>> emulation in user space we would also have an infrastructure for adding
>>>>>>> emulation of new PSCI calls in user space.
>>>>>>
>>>>>> And I strongly oppose to that. It creates consistency issues (what if
>>>>>> userspace implements one version of PSCI, and the kernel another?), and
>>>>>> also some really horrible situations: Imagine you implement the SUSPEND
>>>>>> operation in userspace, and want to wake the vcpu up with an interrupt.
>>>>>> You'd end-up having to keep track of the state in the kernel, having to
>>>>>> forward the interrupt event to userspace...
>>>>>
>>>>> It is not about emulating all PSCI functions in user space. Its about forwarding
>>>>> system-level PSCI functions or PSCI functions which cannot be emulated in
>>>>> kernel to user space.
>>>>
>>>> The CPU parts of PSCI can perfectly be implemented in the kernel.
>>>
>>> Agreed. This patches does the same.
>>>
>>>>
>>>> Then you can return something to userspace indicating what just
>>>> happened. And it doesn't have to be PSCI specific.
>>>
>>> Are you suggesting that everytime we want to emulate some new
>>> PSCI call with help from user space (e.g. SYSTEM_OFF and
>>> SYSTEM_RESET), we add new exit reasons and just keep on
>>> increasing KVM exit reasons ?
>>>
>>> Why can't the exit reason and exit info in struct kvm_run be
>>> PSCI specific ?
>>>
>>> On the contrary, it will be good to have exit reason and exit info
>>> PSCI specific because we have PSCI specification which tells
>>> how it is to be emulated ?
>>
>> I completely agree with Marc that split-brain ownership of any address space (and PSCI is basically one) is a very bad idea.
>>
>> However, so far the only solution I've seen mentioned is that the kernel owns PSCI (read: decodes it) and then drives user space with explicit commands.
>>
>> Couldn't we reverse this logic? User space owns PSCI. By default all PSCI calls go to user space. If a PSCI call makes more sense to be executed by kvm, it can explicitly route it to be handled by kvm instead.
>>
>> That way the owner is still at a single spot and we can fast path the few cases that may be performance critical or a lot easier to handle in kvm.
>>
>> The good part about this is that we get consistency in QEMU with the TCG PSCI handlers along the way.
>
> The only nag here is that you can't do that for every function: SUSPEND
> is one, for example. Once your vcpu is suspended, you need to to wake it
> up with an interrupt, which are not routed to userspace (TFFT!).
SUSPEND, OFF, and ON are VCPU level PSCI functions hence have to be
done in kernel KVM code.
SYSTEM_OFF and SYSTEM_RESET on the other had are board-level
PSCI functions hence have to be done in user space.
--
Anup
>
> So it becomes yet another can of worms, and I rather keep it simple.
>
> M.
> --
> Jazz is not dead. It just smells funny...
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space
2013-10-17 11:13 ` Anup Patel
@ 2013-10-17 18:29 ` Christoffer Dall
0 siblings, 0 replies; 40+ messages in thread
From: Christoffer Dall @ 2013-10-17 18:29 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 17, 2013 at 04:43:59PM +0530, Anup Patel wrote:
> On Thu, Oct 17, 2013 at 2:07 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> > On 16/10/13 18:02, Anup Patel wrote:
> >> The PSCI SYSTEM_OFF and SYSTEM_RESET functions are VM or Guest level
> >> functions hence cannot be emulated by the in-kernel PSCI emulation code.
> >
> > Why can't we implement system-wide functionality in the kernel? I fail
> > to see the issue here.
> >
> >> To tackle this, we forward PSCI SYSTEM_OFF and SYSTEM_RESET function
> >> calls from Guest to user space (i.e. QEMU or KVMTOOL) via KVM run
> >> structure with KVM_EXIT_PSCI exit reason.
> >
> > I'm really not keen on this approach. Having part of the PSCI
> > implementation offloaded to userspace means we don't have a complete
> > implementation in KVM anymore, and we end-up duplicating functionality
> > all over the place.
> >
> > Also, OFF and RESET are not PSCI specific concepts, and could be
> > implemented in various ways. I'm more inclined to return a
> > *standardized* exit code that the various platforms can interpret.
>
> Please refer to the documentation of SYSTEM_OFF and SYSTEM_RESET
> functions in the PSCI specifications.
> (More preciesly section 5.10 and 5.11 of
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0022b/index.html)
>
Anup,
I think everyone knows where to find the specs, and I'm even more sure
that Marc knows the details of PSCI.
If there is a certain aspect of the spec that you find helpful to this
discussion that supports an implementation or design decision, I suggest
you try more carefully to explain why you think it is the right
solution, possibly using references to the specs in order to help people
out.
-Christoffer
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space
2013-10-17 9:21 ` Marc Zyngier
2013-10-17 9:31 ` Peter Maydell
@ 2013-10-17 18:34 ` Christoffer Dall
2013-10-18 4:18 ` Anup Patel
1 sibling, 1 reply; 40+ messages in thread
From: Christoffer Dall @ 2013-10-17 18:34 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 17, 2013 at 10:21:11AM +0100, Marc Zyngier wrote:
> On 17/10/13 10:10, Peter Maydell wrote:
> > On 17 October 2013 09:37, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >> On 16/10/13 18:02, Anup Patel wrote:
> >>> The PSCI SYSTEM_OFF and SYSTEM_RESET functions are VM or Guest level
> >>> functions hence cannot be emulated by the in-kernel PSCI emulation code.
> >>
> >> Why can't we implement system-wide functionality in the kernel? I fail
> >> to see the issue here.
> >
> > Because the kernel isn't emulating the whole board, and you need
> > to power off or reset the whole board, not just the CPUs.
>
> In which case we can forward a generic event, once KVM has dealt with
> the CPUs.
>
> >> I'm really not keen on this approach. Having part of the PSCI
> >> implementation offloaded to userspace means we don't have a complete
> >> implementation in KVM anymore, and we end-up duplicating functionality
> >> all over the place.
> >>
> >> Also, OFF and RESET are not PSCI specific concepts, and could be
> >> implemented in various ways. I'm more inclined to return a
> >> *standardized* exit code that the various platforms can interpret.
> >
> > Maybe we should have a more generic "kernel can't handle this,
> > toss it to userspace" API? That might also fit in with supporting
> > guests that want to make SMC calls to an emulated monitor...
>
> Indeed.
>
Agreed as well. That's what I was trying to say with a more generic
solution, and exiting on a subset of SMC/HVC calls for PSCI is a weird
compromise.
After all, PSCI is nothing more than acting on SMC or HVC calls, and I'm
quite sure there will be a need for user space to emulate handling of
SMC calls sooner or later, and that is the interface we need.
We may end up with both kernel and user space support for PSCI in that
case, but naturally it would always be *either* the kernel handling the
whole thing, *or* user space handling the whole thing.
In both cases we may need extra functionality to communicate between
user space and KVM, such as suspending a vcpu from user space and waking
it up on in-kernel generated timer interrupts, or, conversely, telling
user space that there was a PSCI call to turn off the system.
>From my point of view the most difficult part to figure out is how to
support general handling of secure monitor services in user space. I
don't see any particular problems with expanding the exit reasons with
things like "turn off VM" or "reset VM"?
-Christoffer
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 12:01 ` Alexander Graf
@ 2013-10-17 19:04 ` Christoffer Dall
2013-10-17 22:06 ` Alexander Graf
0 siblings, 1 reply; 40+ messages in thread
From: Christoffer Dall @ 2013-10-17 19:04 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 17, 2013 at 02:01:18PM +0200, Alexander Graf wrote:
>
> On 17.10.2013, at 13:55, Marc Zyngier <marc.zyngier@arm.com> wrote:
>
> > On 17/10/13 12:49, Alexander Graf wrote:
> >>
> >> On 17.10.2013, at 13:30, Anup Patel <anup@brainfault.org> wrote:
> >>
> >>> On Thu, Oct 17, 2013 at 4:51 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >>>> On 17/10/13 12:10, Anup Patel wrote:
> >>>>> On Thu, Oct 17, 2013 at 2:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >>>>>> On 17/10/13 07:45, Anup Patel wrote:
> >>>>>>> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
> >>>>>>> <christoffer.dall@linaro.org> wrote:
> >>>>>>>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
> >>>>>>>>> Update user space API interface headers for providing information to
> >>>>>>>>> user space needed to emulate PSCI function calls in user space (i.e.
> >>>>>>>>> QEMU or KVMTOOL).
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
> >>>>>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
> >>>>>>>>> ---
> >>>>>>>>> include/uapi/linux/kvm.h | 7 +++++++
> >>>>>>>>> 1 file changed, 7 insertions(+)
> >>>>>>>>>
> >>>>>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> >>>>>>>>> index e32e776..dae2664 100644
> >>>>>>>>> --- a/include/uapi/linux/kvm.h
> >>>>>>>>> +++ b/include/uapi/linux/kvm.h
> >>>>>>>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
> >>>>>>>>> #define KVM_EXIT_WATCHDOG 21
> >>>>>>>>> #define KVM_EXIT_S390_TSCH 22
> >>>>>>>>> #define KVM_EXIT_EPR 23
> >>>>>>>>> +#define KVM_EXIT_PSCI 24
> >>>>>>>>>
> >>>>>>>>> /* For KVM_EXIT_INTERNAL_ERROR */
> >>>>>>>>> /* Emulate instruction failed. */
> >>>>>>>>> @@ -301,6 +302,12 @@ struct kvm_run {
> >>>>>>>>> struct {
> >>>>>>>>> __u32 epr;
> >>>>>>>>> } epr;
> >>>>>>>>> + /* KVM_EXIT_PSCI */
> >>>>>>>>> + struct {
> >>>>>>>>> + __u32 fn;
> >>>>>>>>> + __u64 args[7];
> >>>>>>>>> + __u64 ret[4];
> >>>>>>>>> + } psci;
> >>>>>>>>> /* Fix the size of the union. */
> >>>>>>>>> char padding[256];
> >>>>>>>>> };
> >>>>>>>>> --
> >>>>>>>>> 1.7.9.5
> >>>>>>>>>
> >>>>>>>> I am also wondering if this is not solving a very specific need without
> >>>>>>>> thinking a little more carefully about this problem.
> >>>>>>>
> >>>>>>> No, its not solving a specific problem.
> >>>>>>>
> >>>>>>> In fact, its more general because we pass complete info required to
> >>>>>>> emulate a PSCI call in user space.
> >>>>>>> (Please refer PSCI calling convention)
> >>>>>>>
> >>>>>>>>
> >>>>>>>> We have previously discussed the need for some secure side emulation
> >>>>>>>> in QEMU, and I think perhaps we need something more generic which allows
> >>>>>>>> user space to handle SMC calls and/or allows user space to "inject" some
> >>>>>>>> secure world runtime that the kernel can run in a partially or fully
> >>>>>>>> isolated container to handle SMC calls.
> >>>>>>>>
> >>>>>>>> Peter raised this issue previously and pointed to a proposal he had as
> >>>>>>>> well.
> >>>>>>>
> >>>>>>> If required we can have an additional field in kvm_run->psci which tells
> >>>>>>> whether the PSCI call is an SMC call or HVC call.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Is there a technical reason why we need something specifically directed
> >>>>>>>> to PSCI?
> >>>>>>>
> >>>>>>> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
> >>>>>>> of adding a separate VirtIO device for System reboot and System poweroff.
> >>>>>>>
> >>>>>>> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
> >>>>>>> emulation in user space we would also have an infrastructure for adding
> >>>>>>> emulation of new PSCI calls in user space.
> >>>>>>
> >>>>>> And I strongly oppose to that. It creates consistency issues (what if
> >>>>>> userspace implements one version of PSCI, and the kernel another?), and
> >>>>>> also some really horrible situations: Imagine you implement the SUSPEND
> >>>>>> operation in userspace, and want to wake the vcpu up with an interrupt.
> >>>>>> You'd end-up having to keep track of the state in the kernel, having to
> >>>>>> forward the interrupt event to userspace...
> >>>>>
> >>>>> It is not about emulating all PSCI functions in user space. Its about forwarding
> >>>>> system-level PSCI functions or PSCI functions which cannot be emulated in
> >>>>> kernel to user space.
> >>>>
> >>>> The CPU parts of PSCI can perfectly be implemented in the kernel.
> >>>
> >>> Agreed. This patches does the same.
> >>>
> >>>>
> >>>> Then you can return something to userspace indicating what just
> >>>> happened. And it doesn't have to be PSCI specific.
> >>>
> >>> Are you suggesting that everytime we want to emulate some new
> >>> PSCI call with help from user space (e.g. SYSTEM_OFF and
> >>> SYSTEM_RESET), we add new exit reasons and just keep on
> >>> increasing KVM exit reasons ?
> >>>
> >>> Why can't the exit reason and exit info in struct kvm_run be
> >>> PSCI specific ?
> >>>
> >>> On the contrary, it will be good to have exit reason and exit info
> >>> PSCI specific because we have PSCI specification which tells
> >>> how it is to be emulated ?
> >>
> >> I completely agree with Marc that split-brain ownership of any address space (and PSCI is basically one) is a very bad idea.
> >>
> >> However, so far the only solution I've seen mentioned is that the kernel owns PSCI (read: decodes it) and then drives user space with explicit commands.
> >>
> >> Couldn't we reverse this logic? User space owns PSCI. By default all PSCI calls go to user space. If a PSCI call makes more sense to be executed by kvm, it can explicitly route it to be handled by kvm instead.
> >>
> >> That way the owner is still at a single spot and we can fast path the few cases that may be performance critical or a lot easier to handle in kvm.
> >>
> >> The good part about this is that we get consistency in QEMU with the TCG PSCI handlers along the way.
> >
> > The only nag here is that you can't do that for every function: SUSPEND
> > is one, for example. Once your vcpu is suspended, you need to to wake it
> > up with an interrupt, which are not routed to userspace (TFFT!).
>
> Not sure I understand. Can't you just vcpu_kick() it with a posix signal to get it out of vcpu_run() and unset the "suspended" state? If you guarantee that you don't get spurious exits out of SUSPEND you need to be able to set/unset that bit anyways for migration.
>
Right now the suspended state is private on the vcpu (the
vcpu->arch.pause if I'm not mistaken), so QEMU does not have a way to
set/unset this.
I think we got around this for migration, because a suspended CPU could
always be woken up spuriously according to the ARM specs (Peter, am I
mixing things up here?) so we simply wake everything up after migration
and it's up to the VM to go to sleep again... Of course, this may not be
ideal anyhow, and we could implement something to coordinate this state
between user space and the kernel.
If user space puts the thread to sleep waiting for a signal, does
vcpu_kick() currently send such a signal and wake you up or is it
something we'd have to add?
-Christoffer
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 19:04 ` Christoffer Dall
@ 2013-10-17 22:06 ` Alexander Graf
2013-10-17 22:24 ` Christoffer Dall
0 siblings, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-10-17 22:06 UTC (permalink / raw)
To: linux-arm-kernel
On 17.10.2013, at 21:04, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Thu, Oct 17, 2013 at 02:01:18PM +0200, Alexander Graf wrote:
>>
>> On 17.10.2013, at 13:55, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>
>>> On 17/10/13 12:49, Alexander Graf wrote:
>>>>
>>>> On 17.10.2013, at 13:30, Anup Patel <anup@brainfault.org> wrote:
>>>>
>>>>> On Thu, Oct 17, 2013 at 4:51 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>>> On 17/10/13 12:10, Anup Patel wrote:
>>>>>>> On Thu, Oct 17, 2013 at 2:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>>>>> On 17/10/13 07:45, Anup Patel wrote:
>>>>>>>>> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
>>>>>>>>> <christoffer.dall@linaro.org> wrote:
>>>>>>>>>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
>>>>>>>>>>> Update user space API interface headers for providing information to
>>>>>>>>>>> user space needed to emulate PSCI function calls in user space (i.e.
>>>>>>>>>>> QEMU or KVMTOOL).
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>>>>>>>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>>>>>>>>>>> ---
>>>>>>>>>>> include/uapi/linux/kvm.h | 7 +++++++
>>>>>>>>>>> 1 file changed, 7 insertions(+)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>>>>>>>> index e32e776..dae2664 100644
>>>>>>>>>>> --- a/include/uapi/linux/kvm.h
>>>>>>>>>>> +++ b/include/uapi/linux/kvm.h
>>>>>>>>>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
>>>>>>>>>>> #define KVM_EXIT_WATCHDOG 21
>>>>>>>>>>> #define KVM_EXIT_S390_TSCH 22
>>>>>>>>>>> #define KVM_EXIT_EPR 23
>>>>>>>>>>> +#define KVM_EXIT_PSCI 24
>>>>>>>>>>>
>>>>>>>>>>> /* For KVM_EXIT_INTERNAL_ERROR */
>>>>>>>>>>> /* Emulate instruction failed. */
>>>>>>>>>>> @@ -301,6 +302,12 @@ struct kvm_run {
>>>>>>>>>>> struct {
>>>>>>>>>>> __u32 epr;
>>>>>>>>>>> } epr;
>>>>>>>>>>> + /* KVM_EXIT_PSCI */
>>>>>>>>>>> + struct {
>>>>>>>>>>> + __u32 fn;
>>>>>>>>>>> + __u64 args[7];
>>>>>>>>>>> + __u64 ret[4];
>>>>>>>>>>> + } psci;
>>>>>>>>>>> /* Fix the size of the union. */
>>>>>>>>>>> char padding[256];
>>>>>>>>>>> };
>>>>>>>>>>> --
>>>>>>>>>>> 1.7.9.5
>>>>>>>>>>>
>>>>>>>>>> I am also wondering if this is not solving a very specific need without
>>>>>>>>>> thinking a little more carefully about this problem.
>>>>>>>>>
>>>>>>>>> No, its not solving a specific problem.
>>>>>>>>>
>>>>>>>>> In fact, its more general because we pass complete info required to
>>>>>>>>> emulate a PSCI call in user space.
>>>>>>>>> (Please refer PSCI calling convention)
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> We have previously discussed the need for some secure side emulation
>>>>>>>>>> in QEMU, and I think perhaps we need something more generic which allows
>>>>>>>>>> user space to handle SMC calls and/or allows user space to "inject" some
>>>>>>>>>> secure world runtime that the kernel can run in a partially or fully
>>>>>>>>>> isolated container to handle SMC calls.
>>>>>>>>>>
>>>>>>>>>> Peter raised this issue previously and pointed to a proposal he had as
>>>>>>>>>> well.
>>>>>>>>>
>>>>>>>>> If required we can have an additional field in kvm_run->psci which tells
>>>>>>>>> whether the PSCI call is an SMC call or HVC call.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Is there a technical reason why we need something specifically directed
>>>>>>>>>> to PSCI?
>>>>>>>>>
>>>>>>>>> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
>>>>>>>>> of adding a separate VirtIO device for System reboot and System poweroff.
>>>>>>>>>
>>>>>>>>> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
>>>>>>>>> emulation in user space we would also have an infrastructure for adding
>>>>>>>>> emulation of new PSCI calls in user space.
>>>>>>>>
>>>>>>>> And I strongly oppose to that. It creates consistency issues (what if
>>>>>>>> userspace implements one version of PSCI, and the kernel another?), and
>>>>>>>> also some really horrible situations: Imagine you implement the SUSPEND
>>>>>>>> operation in userspace, and want to wake the vcpu up with an interrupt.
>>>>>>>> You'd end-up having to keep track of the state in the kernel, having to
>>>>>>>> forward the interrupt event to userspace...
>>>>>>>
>>>>>>> It is not about emulating all PSCI functions in user space. Its about forwarding
>>>>>>> system-level PSCI functions or PSCI functions which cannot be emulated in
>>>>>>> kernel to user space.
>>>>>>
>>>>>> The CPU parts of PSCI can perfectly be implemented in the kernel.
>>>>>
>>>>> Agreed. This patches does the same.
>>>>>
>>>>>>
>>>>>> Then you can return something to userspace indicating what just
>>>>>> happened. And it doesn't have to be PSCI specific.
>>>>>
>>>>> Are you suggesting that everytime we want to emulate some new
>>>>> PSCI call with help from user space (e.g. SYSTEM_OFF and
>>>>> SYSTEM_RESET), we add new exit reasons and just keep on
>>>>> increasing KVM exit reasons ?
>>>>>
>>>>> Why can't the exit reason and exit info in struct kvm_run be
>>>>> PSCI specific ?
>>>>>
>>>>> On the contrary, it will be good to have exit reason and exit info
>>>>> PSCI specific because we have PSCI specification which tells
>>>>> how it is to be emulated ?
>>>>
>>>> I completely agree with Marc that split-brain ownership of any address space (and PSCI is basically one) is a very bad idea.
>>>>
>>>> However, so far the only solution I've seen mentioned is that the kernel owns PSCI (read: decodes it) and then drives user space with explicit commands.
>>>>
>>>> Couldn't we reverse this logic? User space owns PSCI. By default all PSCI calls go to user space. If a PSCI call makes more sense to be executed by kvm, it can explicitly route it to be handled by kvm instead.
>>>>
>>>> That way the owner is still at a single spot and we can fast path the few cases that may be performance critical or a lot easier to handle in kvm.
>>>>
>>>> The good part about this is that we get consistency in QEMU with the TCG PSCI handlers along the way.
>>>
>>> The only nag here is that you can't do that for every function: SUSPEND
>>> is one, for example. Once your vcpu is suspended, you need to to wake it
>>> up with an interrupt, which are not routed to userspace (TFFT!).
>>
>> Not sure I understand. Can't you just vcpu_kick() it with a posix signal to get it out of vcpu_run() and unset the "suspended" state? If you guarantee that you don't get spurious exits out of SUSPEND you need to be able to set/unset that bit anyways for migration.
>>
> Right now the suspended state is private on the vcpu (the
> vcpu->arch.pause if I'm not mistaken), so QEMU does not have a way to
> set/unset this.
>
> I think we got around this for migration, because a suspended CPU could
> always be woken up spuriously according to the ARM specs (Peter, am I
> mixing things up here?) so we simply wake everything up after migration
> and it's up to the VM to go to sleep again... Of course, this may not be
> ideal anyhow, and we could implement something to coordinate this state
> between user space and the kernel.
This is basically how we deal with this on ppc as well. You are never guaranteed that sleeping doesn't wake you up randomly. But this also means that you want to unset vcpu->arch.pause every time you stop waiting for events and go back to user space (-EINTR).
> If user space puts the thread to sleep waiting for a signal, does
> vcpu_kick() currently send such a signal and wake you up or is it
> something we'd have to add?
Why would user space put a thread to sleep? The kernel pauses the vcpu and user space would just send a signal, sending it back into user space which realizes it doesn't have to do anything, goes back into kvm_run() and thus breaks the pausing.
Or did I misunderstand the problem? :)
Alex
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 22:06 ` Alexander Graf
@ 2013-10-17 22:24 ` Christoffer Dall
2013-10-17 22:26 ` Alexander Graf
0 siblings, 1 reply; 40+ messages in thread
From: Christoffer Dall @ 2013-10-17 22:24 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Oct 18, 2013 at 12:06:29AM +0200, Alexander Graf wrote:
>
> On 17.10.2013, at 21:04, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>
> > On Thu, Oct 17, 2013 at 02:01:18PM +0200, Alexander Graf wrote:
> >>
> >> On 17.10.2013, at 13:55, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >>
> >>> On 17/10/13 12:49, Alexander Graf wrote:
> >>>>
> >>>> On 17.10.2013, at 13:30, Anup Patel <anup@brainfault.org> wrote:
> >>>>
> >>>>> On Thu, Oct 17, 2013 at 4:51 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >>>>>> On 17/10/13 12:10, Anup Patel wrote:
> >>>>>>> On Thu, Oct 17, 2013 at 2:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >>>>>>>> On 17/10/13 07:45, Anup Patel wrote:
> >>>>>>>>> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
> >>>>>>>>> <christoffer.dall@linaro.org> wrote:
> >>>>>>>>>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
> >>>>>>>>>>> Update user space API interface headers for providing information to
> >>>>>>>>>>> user space needed to emulate PSCI function calls in user space (i.e.
> >>>>>>>>>>> QEMU or KVMTOOL).
> >>>>>>>>>>>
> >>>>>>>>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
> >>>>>>>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
> >>>>>>>>>>> ---
> >>>>>>>>>>> include/uapi/linux/kvm.h | 7 +++++++
> >>>>>>>>>>> 1 file changed, 7 insertions(+)
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> >>>>>>>>>>> index e32e776..dae2664 100644
> >>>>>>>>>>> --- a/include/uapi/linux/kvm.h
> >>>>>>>>>>> +++ b/include/uapi/linux/kvm.h
> >>>>>>>>>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
> >>>>>>>>>>> #define KVM_EXIT_WATCHDOG 21
> >>>>>>>>>>> #define KVM_EXIT_S390_TSCH 22
> >>>>>>>>>>> #define KVM_EXIT_EPR 23
> >>>>>>>>>>> +#define KVM_EXIT_PSCI 24
> >>>>>>>>>>>
> >>>>>>>>>>> /* For KVM_EXIT_INTERNAL_ERROR */
> >>>>>>>>>>> /* Emulate instruction failed. */
> >>>>>>>>>>> @@ -301,6 +302,12 @@ struct kvm_run {
> >>>>>>>>>>> struct {
> >>>>>>>>>>> __u32 epr;
> >>>>>>>>>>> } epr;
> >>>>>>>>>>> + /* KVM_EXIT_PSCI */
> >>>>>>>>>>> + struct {
> >>>>>>>>>>> + __u32 fn;
> >>>>>>>>>>> + __u64 args[7];
> >>>>>>>>>>> + __u64 ret[4];
> >>>>>>>>>>> + } psci;
> >>>>>>>>>>> /* Fix the size of the union. */
> >>>>>>>>>>> char padding[256];
> >>>>>>>>>>> };
> >>>>>>>>>>> --
> >>>>>>>>>>> 1.7.9.5
> >>>>>>>>>>>
> >>>>>>>>>> I am also wondering if this is not solving a very specific need without
> >>>>>>>>>> thinking a little more carefully about this problem.
> >>>>>>>>>
> >>>>>>>>> No, its not solving a specific problem.
> >>>>>>>>>
> >>>>>>>>> In fact, its more general because we pass complete info required to
> >>>>>>>>> emulate a PSCI call in user space.
> >>>>>>>>> (Please refer PSCI calling convention)
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> We have previously discussed the need for some secure side emulation
> >>>>>>>>>> in QEMU, and I think perhaps we need something more generic which allows
> >>>>>>>>>> user space to handle SMC calls and/or allows user space to "inject" some
> >>>>>>>>>> secure world runtime that the kernel can run in a partially or fully
> >>>>>>>>>> isolated container to handle SMC calls.
> >>>>>>>>>>
> >>>>>>>>>> Peter raised this issue previously and pointed to a proposal he had as
> >>>>>>>>>> well.
> >>>>>>>>>
> >>>>>>>>> If required we can have an additional field in kvm_run->psci which tells
> >>>>>>>>> whether the PSCI call is an SMC call or HVC call.
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Is there a technical reason why we need something specifically directed
> >>>>>>>>>> to PSCI?
> >>>>>>>>>
> >>>>>>>>> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
> >>>>>>>>> of adding a separate VirtIO device for System reboot and System poweroff.
> >>>>>>>>>
> >>>>>>>>> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
> >>>>>>>>> emulation in user space we would also have an infrastructure for adding
> >>>>>>>>> emulation of new PSCI calls in user space.
> >>>>>>>>
> >>>>>>>> And I strongly oppose to that. It creates consistency issues (what if
> >>>>>>>> userspace implements one version of PSCI, and the kernel another?), and
> >>>>>>>> also some really horrible situations: Imagine you implement the SUSPEND
> >>>>>>>> operation in userspace, and want to wake the vcpu up with an interrupt.
> >>>>>>>> You'd end-up having to keep track of the state in the kernel, having to
> >>>>>>>> forward the interrupt event to userspace...
> >>>>>>>
> >>>>>>> It is not about emulating all PSCI functions in user space. Its about forwarding
> >>>>>>> system-level PSCI functions or PSCI functions which cannot be emulated in
> >>>>>>> kernel to user space.
> >>>>>>
> >>>>>> The CPU parts of PSCI can perfectly be implemented in the kernel.
> >>>>>
> >>>>> Agreed. This patches does the same.
> >>>>>
> >>>>>>
> >>>>>> Then you can return something to userspace indicating what just
> >>>>>> happened. And it doesn't have to be PSCI specific.
> >>>>>
> >>>>> Are you suggesting that everytime we want to emulate some new
> >>>>> PSCI call with help from user space (e.g. SYSTEM_OFF and
> >>>>> SYSTEM_RESET), we add new exit reasons and just keep on
> >>>>> increasing KVM exit reasons ?
> >>>>>
> >>>>> Why can't the exit reason and exit info in struct kvm_run be
> >>>>> PSCI specific ?
> >>>>>
> >>>>> On the contrary, it will be good to have exit reason and exit info
> >>>>> PSCI specific because we have PSCI specification which tells
> >>>>> how it is to be emulated ?
> >>>>
> >>>> I completely agree with Marc that split-brain ownership of any address space (and PSCI is basically one) is a very bad idea.
> >>>>
> >>>> However, so far the only solution I've seen mentioned is that the kernel owns PSCI (read: decodes it) and then drives user space with explicit commands.
> >>>>
> >>>> Couldn't we reverse this logic? User space owns PSCI. By default all PSCI calls go to user space. If a PSCI call makes more sense to be executed by kvm, it can explicitly route it to be handled by kvm instead.
> >>>>
> >>>> That way the owner is still at a single spot and we can fast path the few cases that may be performance critical or a lot easier to handle in kvm.
> >>>>
> >>>> The good part about this is that we get consistency in QEMU with the TCG PSCI handlers along the way.
> >>>
> >>> The only nag here is that you can't do that for every function: SUSPEND
> >>> is one, for example. Once your vcpu is suspended, you need to to wake it
> >>> up with an interrupt, which are not routed to userspace (TFFT!).
> >>
> >> Not sure I understand. Can't you just vcpu_kick() it with a posix signal to get it out of vcpu_run() and unset the "suspended" state? If you guarantee that you don't get spurious exits out of SUSPEND you need to be able to set/unset that bit anyways for migration.
> >>
> > Right now the suspended state is private on the vcpu (the
> > vcpu->arch.pause if I'm not mistaken), so QEMU does not have a way to
> > set/unset this.
> >
> > I think we got around this for migration, because a suspended CPU could
> > always be woken up spuriously according to the ARM specs (Peter, am I
> > mixing things up here?) so we simply wake everything up after migration
> > and it's up to the VM to go to sleep again... Of course, this may not be
> > ideal anyhow, and we could implement something to coordinate this state
> > between user space and the kernel.
>
> This is basically how we deal with this on ppc as well. You are never guaranteed that sleeping doesn't wake you up randomly. But this also means that you want to unset vcpu->arch.pause every time you stop waiting for events and go back to user space (-EINTR).
>
Not sure what you mean, in which case do you need to unset the
vcpu->arch.pause? We unset this is there's an interrupt to the vcpu
detected inside the kernel, coming from either user space or from within
a device inside the kernel (like the arch timer).
> > If user space puts the thread to sleep waiting for a signal, does
> > vcpu_kick() currently send such a signal and wake you up or is it
> > something we'd have to add?
>
> Why would user space put a thread to sleep? The kernel pauses the vcpu and user space would just send a signal, sending it back into user space which realizes it doesn't have to do anything, goes back into kvm_run() and thus breaks the pausing.
>
> Or did I misunderstand the problem? :)
>
Oh, I mean if you wanted to implement vcpu suspend in user space, you
could either somehow set vcpu->pause from user space or you could idle,
not call KVM_VCPU_RUN until you receive a signal on that thread. Is
this crazy rambling?
-Christoffer
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 22:24 ` Christoffer Dall
@ 2013-10-17 22:26 ` Alexander Graf
2013-10-18 3:34 ` Christoffer Dall
0 siblings, 1 reply; 40+ messages in thread
From: Alexander Graf @ 2013-10-17 22:26 UTC (permalink / raw)
To: linux-arm-kernel
On 18.10.2013, at 00:24, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Fri, Oct 18, 2013 at 12:06:29AM +0200, Alexander Graf wrote:
>>
>> On 17.10.2013, at 21:04, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>>
>>> On Thu, Oct 17, 2013 at 02:01:18PM +0200, Alexander Graf wrote:
>>>>
>>>> On 17.10.2013, at 13:55, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>
>>>>> On 17/10/13 12:49, Alexander Graf wrote:
>>>>>>
>>>>>> On 17.10.2013, at 13:30, Anup Patel <anup@brainfault.org> wrote:
>>>>>>
>>>>>>> On Thu, Oct 17, 2013 at 4:51 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>>>>> On 17/10/13 12:10, Anup Patel wrote:
>>>>>>>>> On Thu, Oct 17, 2013 at 2:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>>>>>>> On 17/10/13 07:45, Anup Patel wrote:
>>>>>>>>>>> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
>>>>>>>>>>> <christoffer.dall@linaro.org> wrote:
>>>>>>>>>>>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
>>>>>>>>>>>>> Update user space API interface headers for providing information to
>>>>>>>>>>>>> user space needed to emulate PSCI function calls in user space (i.e.
>>>>>>>>>>>>> QEMU or KVMTOOL).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>>>>>>>>>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> include/uapi/linux/kvm.h | 7 +++++++
>>>>>>>>>>>>> 1 file changed, 7 insertions(+)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>>>>>>>>>> index e32e776..dae2664 100644
>>>>>>>>>>>>> --- a/include/uapi/linux/kvm.h
>>>>>>>>>>>>> +++ b/include/uapi/linux/kvm.h
>>>>>>>>>>>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
>>>>>>>>>>>>> #define KVM_EXIT_WATCHDOG 21
>>>>>>>>>>>>> #define KVM_EXIT_S390_TSCH 22
>>>>>>>>>>>>> #define KVM_EXIT_EPR 23
>>>>>>>>>>>>> +#define KVM_EXIT_PSCI 24
>>>>>>>>>>>>>
>>>>>>>>>>>>> /* For KVM_EXIT_INTERNAL_ERROR */
>>>>>>>>>>>>> /* Emulate instruction failed. */
>>>>>>>>>>>>> @@ -301,6 +302,12 @@ struct kvm_run {
>>>>>>>>>>>>> struct {
>>>>>>>>>>>>> __u32 epr;
>>>>>>>>>>>>> } epr;
>>>>>>>>>>>>> + /* KVM_EXIT_PSCI */
>>>>>>>>>>>>> + struct {
>>>>>>>>>>>>> + __u32 fn;
>>>>>>>>>>>>> + __u64 args[7];
>>>>>>>>>>>>> + __u64 ret[4];
>>>>>>>>>>>>> + } psci;
>>>>>>>>>>>>> /* Fix the size of the union. */
>>>>>>>>>>>>> char padding[256];
>>>>>>>>>>>>> };
>>>>>>>>>>>>> --
>>>>>>>>>>>>> 1.7.9.5
>>>>>>>>>>>>>
>>>>>>>>>>>> I am also wondering if this is not solving a very specific need without
>>>>>>>>>>>> thinking a little more carefully about this problem.
>>>>>>>>>>>
>>>>>>>>>>> No, its not solving a specific problem.
>>>>>>>>>>>
>>>>>>>>>>> In fact, its more general because we pass complete info required to
>>>>>>>>>>> emulate a PSCI call in user space.
>>>>>>>>>>> (Please refer PSCI calling convention)
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> We have previously discussed the need for some secure side emulation
>>>>>>>>>>>> in QEMU, and I think perhaps we need something more generic which allows
>>>>>>>>>>>> user space to handle SMC calls and/or allows user space to "inject" some
>>>>>>>>>>>> secure world runtime that the kernel can run in a partially or fully
>>>>>>>>>>>> isolated container to handle SMC calls.
>>>>>>>>>>>>
>>>>>>>>>>>> Peter raised this issue previously and pointed to a proposal he had as
>>>>>>>>>>>> well.
>>>>>>>>>>>
>>>>>>>>>>> If required we can have an additional field in kvm_run->psci which tells
>>>>>>>>>>> whether the PSCI call is an SMC call or HVC call.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Is there a technical reason why we need something specifically directed
>>>>>>>>>>>> to PSCI?
>>>>>>>>>>>
>>>>>>>>>>> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
>>>>>>>>>>> of adding a separate VirtIO device for System reboot and System poweroff.
>>>>>>>>>>>
>>>>>>>>>>> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
>>>>>>>>>>> emulation in user space we would also have an infrastructure for adding
>>>>>>>>>>> emulation of new PSCI calls in user space.
>>>>>>>>>>
>>>>>>>>>> And I strongly oppose to that. It creates consistency issues (what if
>>>>>>>>>> userspace implements one version of PSCI, and the kernel another?), and
>>>>>>>>>> also some really horrible situations: Imagine you implement the SUSPEND
>>>>>>>>>> operation in userspace, and want to wake the vcpu up with an interrupt.
>>>>>>>>>> You'd end-up having to keep track of the state in the kernel, having to
>>>>>>>>>> forward the interrupt event to userspace...
>>>>>>>>>
>>>>>>>>> It is not about emulating all PSCI functions in user space. Its about forwarding
>>>>>>>>> system-level PSCI functions or PSCI functions which cannot be emulated in
>>>>>>>>> kernel to user space.
>>>>>>>>
>>>>>>>> The CPU parts of PSCI can perfectly be implemented in the kernel.
>>>>>>>
>>>>>>> Agreed. This patches does the same.
>>>>>>>
>>>>>>>>
>>>>>>>> Then you can return something to userspace indicating what just
>>>>>>>> happened. And it doesn't have to be PSCI specific.
>>>>>>>
>>>>>>> Are you suggesting that everytime we want to emulate some new
>>>>>>> PSCI call with help from user space (e.g. SYSTEM_OFF and
>>>>>>> SYSTEM_RESET), we add new exit reasons and just keep on
>>>>>>> increasing KVM exit reasons ?
>>>>>>>
>>>>>>> Why can't the exit reason and exit info in struct kvm_run be
>>>>>>> PSCI specific ?
>>>>>>>
>>>>>>> On the contrary, it will be good to have exit reason and exit info
>>>>>>> PSCI specific because we have PSCI specification which tells
>>>>>>> how it is to be emulated ?
>>>>>>
>>>>>> I completely agree with Marc that split-brain ownership of any address space (and PSCI is basically one) is a very bad idea.
>>>>>>
>>>>>> However, so far the only solution I've seen mentioned is that the kernel owns PSCI (read: decodes it) and then drives user space with explicit commands.
>>>>>>
>>>>>> Couldn't we reverse this logic? User space owns PSCI. By default all PSCI calls go to user space. If a PSCI call makes more sense to be executed by kvm, it can explicitly route it to be handled by kvm instead.
>>>>>>
>>>>>> That way the owner is still at a single spot and we can fast path the few cases that may be performance critical or a lot easier to handle in kvm.
>>>>>>
>>>>>> The good part about this is that we get consistency in QEMU with the TCG PSCI handlers along the way.
>>>>>
>>>>> The only nag here is that you can't do that for every function: SUSPEND
>>>>> is one, for example. Once your vcpu is suspended, you need to to wake it
>>>>> up with an interrupt, which are not routed to userspace (TFFT!).
>>>>
>>>> Not sure I understand. Can't you just vcpu_kick() it with a posix signal to get it out of vcpu_run() and unset the "suspended" state? If you guarantee that you don't get spurious exits out of SUSPEND you need to be able to set/unset that bit anyways for migration.
>>>>
>>> Right now the suspended state is private on the vcpu (the
>>> vcpu->arch.pause if I'm not mistaken), so QEMU does not have a way to
>>> set/unset this.
>>>
>>> I think we got around this for migration, because a suspended CPU could
>>> always be woken up spuriously according to the ARM specs (Peter, am I
>>> mixing things up here?) so we simply wake everything up after migration
>>> and it's up to the VM to go to sleep again... Of course, this may not be
>>> ideal anyhow, and we could implement something to coordinate this state
>>> between user space and the kernel.
>>
>> This is basically how we deal with this on ppc as well. You are never guaranteed that sleeping doesn't wake you up randomly. But this also means that you want to unset vcpu->arch.pause every time you stop waiting for events and go back to user space (-EINTR).
>>
>
> Not sure what you mean, in which case do you need to unset the
> vcpu->arch.pause? We unset this is there's an interrupt to the vcpu
> detected inside the kernel, coming from either user space or from within
> a device inside the kernel (like the arch timer).
>
>>> If user space puts the thread to sleep waiting for a signal, does
>>> vcpu_kick() currently send such a signal and wake you up or is it
>>> something we'd have to add?
>>
>> Why would user space put a thread to sleep? The kernel pauses the vcpu and user space would just send a signal, sending it back into user space which realizes it doesn't have to do anything, goes back into kvm_run() and thus breaks the pausing.
>>
>> Or did I misunderstand the problem? :)
>>
> Oh, I mean if you wanted to implement vcpu suspend in user space, you
> could either somehow set vcpu->pause from user space or you could idle,
> not call KVM_VCPU_RUN until you receive a signal on that thread. Is
> this crazy rambling?
Ah, I thing I get where you're aiming at. I think it makes sense to leave ownership of the vcpu in the kernel's hands. So you would indicate that the vcpu should be suspended and then call VCPU_RUN.
But take a look at x86's mpstate. IIRC that's one of the problems it's trying to solve.
Alex
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation
2013-10-17 22:26 ` Alexander Graf
@ 2013-10-18 3:34 ` Christoffer Dall
0 siblings, 0 replies; 40+ messages in thread
From: Christoffer Dall @ 2013-10-18 3:34 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Oct 18, 2013 at 12:26:38AM +0200, Alexander Graf wrote:
>
> On 18.10.2013, at 00:24, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>
> > On Fri, Oct 18, 2013 at 12:06:29AM +0200, Alexander Graf wrote:
> >>
> >> On 17.10.2013, at 21:04, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> >>
> >>> On Thu, Oct 17, 2013 at 02:01:18PM +0200, Alexander Graf wrote:
> >>>>
> >>>> On 17.10.2013, at 13:55, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >>>>
> >>>>> On 17/10/13 12:49, Alexander Graf wrote:
> >>>>>>
> >>>>>> On 17.10.2013, at 13:30, Anup Patel <anup@brainfault.org> wrote:
> >>>>>>
> >>>>>>> On Thu, Oct 17, 2013 at 4:51 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >>>>>>>> On 17/10/13 12:10, Anup Patel wrote:
> >>>>>>>>> On Thu, Oct 17, 2013 at 2:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >>>>>>>>>> On 17/10/13 07:45, Anup Patel wrote:
> >>>>>>>>>>> On Thu, Oct 17, 2013 at 3:41 AM, Christoffer Dall
> >>>>>>>>>>> <christoffer.dall@linaro.org> wrote:
> >>>>>>>>>>>> On Wed, Oct 16, 2013 at 10:32:30PM +0530, Anup Patel wrote:
> >>>>>>>>>>>>> Update user space API interface headers for providing information to
> >>>>>>>>>>>>> user space needed to emulate PSCI function calls in user space (i.e.
> >>>>>>>>>>>>> QEMU or KVMTOOL).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
> >>>>>>>>>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
> >>>>>>>>>>>>> ---
> >>>>>>>>>>>>> include/uapi/linux/kvm.h | 7 +++++++
> >>>>>>>>>>>>> 1 file changed, 7 insertions(+)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> >>>>>>>>>>>>> index e32e776..dae2664 100644
> >>>>>>>>>>>>> --- a/include/uapi/linux/kvm.h
> >>>>>>>>>>>>> +++ b/include/uapi/linux/kvm.h
> >>>>>>>>>>>>> @@ -171,6 +171,7 @@ struct kvm_pit_config {
> >>>>>>>>>>>>> #define KVM_EXIT_WATCHDOG 21
> >>>>>>>>>>>>> #define KVM_EXIT_S390_TSCH 22
> >>>>>>>>>>>>> #define KVM_EXIT_EPR 23
> >>>>>>>>>>>>> +#define KVM_EXIT_PSCI 24
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> /* For KVM_EXIT_INTERNAL_ERROR */
> >>>>>>>>>>>>> /* Emulate instruction failed. */
> >>>>>>>>>>>>> @@ -301,6 +302,12 @@ struct kvm_run {
> >>>>>>>>>>>>> struct {
> >>>>>>>>>>>>> __u32 epr;
> >>>>>>>>>>>>> } epr;
> >>>>>>>>>>>>> + /* KVM_EXIT_PSCI */
> >>>>>>>>>>>>> + struct {
> >>>>>>>>>>>>> + __u32 fn;
> >>>>>>>>>>>>> + __u64 args[7];
> >>>>>>>>>>>>> + __u64 ret[4];
> >>>>>>>>>>>>> + } psci;
> >>>>>>>>>>>>> /* Fix the size of the union. */
> >>>>>>>>>>>>> char padding[256];
> >>>>>>>>>>>>> };
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> 1.7.9.5
> >>>>>>>>>>>>>
> >>>>>>>>>>>> I am also wondering if this is not solving a very specific need without
> >>>>>>>>>>>> thinking a little more carefully about this problem.
> >>>>>>>>>>>
> >>>>>>>>>>> No, its not solving a specific problem.
> >>>>>>>>>>>
> >>>>>>>>>>> In fact, its more general because we pass complete info required to
> >>>>>>>>>>> emulate a PSCI call in user space.
> >>>>>>>>>>> (Please refer PSCI calling convention)
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> We have previously discussed the need for some secure side emulation
> >>>>>>>>>>>> in QEMU, and I think perhaps we need something more generic which allows
> >>>>>>>>>>>> user space to handle SMC calls and/or allows user space to "inject" some
> >>>>>>>>>>>> secure world runtime that the kernel can run in a partially or fully
> >>>>>>>>>>>> isolated container to handle SMC calls.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Peter raised this issue previously and pointed to a proposal he had as
> >>>>>>>>>>>> well.
> >>>>>>>>>>>
> >>>>>>>>>>> If required we can have an additional field in kvm_run->psci which tells
> >>>>>>>>>>> whether the PSCI call is an SMC call or HVC call.
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Is there a technical reason why we need something specifically directed
> >>>>>>>>>>>> to PSCI?
> >>>>>>>>>>>
> >>>>>>>>>>> Its quite natural to add this to PSCI emulation in KVM ARM/ARM64 instead
> >>>>>>>>>>> of adding a separate VirtIO device for System reboot and System poweroff.
> >>>>>>>>>>>
> >>>>>>>>>>> Also in the process of implementing SYSTEM_OFF and SYSTEM_RESET
> >>>>>>>>>>> emulation in user space we would also have an infrastructure for adding
> >>>>>>>>>>> emulation of new PSCI calls in user space.
> >>>>>>>>>>
> >>>>>>>>>> And I strongly oppose to that. It creates consistency issues (what if
> >>>>>>>>>> userspace implements one version of PSCI, and the kernel another?), and
> >>>>>>>>>> also some really horrible situations: Imagine you implement the SUSPEND
> >>>>>>>>>> operation in userspace, and want to wake the vcpu up with an interrupt.
> >>>>>>>>>> You'd end-up having to keep track of the state in the kernel, having to
> >>>>>>>>>> forward the interrupt event to userspace...
> >>>>>>>>>
> >>>>>>>>> It is not about emulating all PSCI functions in user space. Its about forwarding
> >>>>>>>>> system-level PSCI functions or PSCI functions which cannot be emulated in
> >>>>>>>>> kernel to user space.
> >>>>>>>>
> >>>>>>>> The CPU parts of PSCI can perfectly be implemented in the kernel.
> >>>>>>>
> >>>>>>> Agreed. This patches does the same.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Then you can return something to userspace indicating what just
> >>>>>>>> happened. And it doesn't have to be PSCI specific.
> >>>>>>>
> >>>>>>> Are you suggesting that everytime we want to emulate some new
> >>>>>>> PSCI call with help from user space (e.g. SYSTEM_OFF and
> >>>>>>> SYSTEM_RESET), we add new exit reasons and just keep on
> >>>>>>> increasing KVM exit reasons ?
> >>>>>>>
> >>>>>>> Why can't the exit reason and exit info in struct kvm_run be
> >>>>>>> PSCI specific ?
> >>>>>>>
> >>>>>>> On the contrary, it will be good to have exit reason and exit info
> >>>>>>> PSCI specific because we have PSCI specification which tells
> >>>>>>> how it is to be emulated ?
> >>>>>>
> >>>>>> I completely agree with Marc that split-brain ownership of any address space (and PSCI is basically one) is a very bad idea.
> >>>>>>
> >>>>>> However, so far the only solution I've seen mentioned is that the kernel owns PSCI (read: decodes it) and then drives user space with explicit commands.
> >>>>>>
> >>>>>> Couldn't we reverse this logic? User space owns PSCI. By default all PSCI calls go to user space. If a PSCI call makes more sense to be executed by kvm, it can explicitly route it to be handled by kvm instead.
> >>>>>>
> >>>>>> That way the owner is still at a single spot and we can fast path the few cases that may be performance critical or a lot easier to handle in kvm.
> >>>>>>
> >>>>>> The good part about this is that we get consistency in QEMU with the TCG PSCI handlers along the way.
> >>>>>
> >>>>> The only nag here is that you can't do that for every function: SUSPEND
> >>>>> is one, for example. Once your vcpu is suspended, you need to to wake it
> >>>>> up with an interrupt, which are not routed to userspace (TFFT!).
> >>>>
> >>>> Not sure I understand. Can't you just vcpu_kick() it with a posix signal to get it out of vcpu_run() and unset the "suspended" state? If you guarantee that you don't get spurious exits out of SUSPEND you need to be able to set/unset that bit anyways for migration.
> >>>>
> >>> Right now the suspended state is private on the vcpu (the
> >>> vcpu->arch.pause if I'm not mistaken), so QEMU does not have a way to
> >>> set/unset this.
> >>>
> >>> I think we got around this for migration, because a suspended CPU could
> >>> always be woken up spuriously according to the ARM specs (Peter, am I
> >>> mixing things up here?) so we simply wake everything up after migration
> >>> and it's up to the VM to go to sleep again... Of course, this may not be
> >>> ideal anyhow, and we could implement something to coordinate this state
> >>> between user space and the kernel.
> >>
> >> This is basically how we deal with this on ppc as well. You are never guaranteed that sleeping doesn't wake you up randomly. But this also means that you want to unset vcpu->arch.pause every time you stop waiting for events and go back to user space (-EINTR).
> >>
> >
> > Not sure what you mean, in which case do you need to unset the
> > vcpu->arch.pause? We unset this is there's an interrupt to the vcpu
> > detected inside the kernel, coming from either user space or from within
> > a device inside the kernel (like the arch timer).
> >
> >>> If user space puts the thread to sleep waiting for a signal, does
> >>> vcpu_kick() currently send such a signal and wake you up or is it
> >>> something we'd have to add?
> >>
> >> Why would user space put a thread to sleep? The kernel pauses the vcpu and user space would just send a signal, sending it back into user space which realizes it doesn't have to do anything, goes back into kvm_run() and thus breaks the pausing.
> >>
> >> Or did I misunderstand the problem? :)
> >>
> > Oh, I mean if you wanted to implement vcpu suspend in user space, you
> > could either somehow set vcpu->pause from user space or you could idle,
> > not call KVM_VCPU_RUN until you receive a signal on that thread. Is
> > this crazy rambling?
>
> Ah, I thing I get where you're aiming at. I think it makes sense to leave ownership of the vcpu in the kernel's hands. So you would indicate that the vcpu should be suspended and then call VCPU_RUN.
>
Probably, and thinking about it, we can stil have the kernel handle PSCI
completely and let user space handle other secure calls without
limitations, which is probably the best thing anyhow.
> But take a look at x86's mpstate. IIRC that's one of the problems it's trying to solve.
>
Will do.
-Christoffer
^ permalink raw reply [flat|nested] 40+ messages in thread
* [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space
2013-10-17 18:34 ` Christoffer Dall
@ 2013-10-18 4:18 ` Anup Patel
0 siblings, 0 replies; 40+ messages in thread
From: Anup Patel @ 2013-10-18 4:18 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Oct 18, 2013 at 12:04 AM, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
> On Thu, Oct 17, 2013 at 10:21:11AM +0100, Marc Zyngier wrote:
>> On 17/10/13 10:10, Peter Maydell wrote:
>> > On 17 October 2013 09:37, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> >> On 16/10/13 18:02, Anup Patel wrote:
>> >>> The PSCI SYSTEM_OFF and SYSTEM_RESET functions are VM or Guest level
>> >>> functions hence cannot be emulated by the in-kernel PSCI emulation code.
>> >>
>> >> Why can't we implement system-wide functionality in the kernel? I fail
>> >> to see the issue here.
>> >
>> > Because the kernel isn't emulating the whole board, and you need
>> > to power off or reset the whole board, not just the CPUs.
>>
>> In which case we can forward a generic event, once KVM has dealt with
>> the CPUs.
>>
>> >> I'm really not keen on this approach. Having part of the PSCI
>> >> implementation offloaded to userspace means we don't have a complete
>> >> implementation in KVM anymore, and we end-up duplicating functionality
>> >> all over the place.
>> >>
>> >> Also, OFF and RESET are not PSCI specific concepts, and could be
>> >> implemented in various ways. I'm more inclined to return a
>> >> *standardized* exit code that the various platforms can interpret.
>> >
>> > Maybe we should have a more generic "kernel can't handle this,
>> > toss it to userspace" API? That might also fit in with supporting
>> > guests that want to make SMC calls to an emulated monitor...
>>
>> Indeed.
>>
> Agreed as well. That's what I was trying to say with a more generic
> solution, and exiting on a subset of SMC/HVC calls for PSCI is a weird
> compromise.
OK.
>
> After all, PSCI is nothing more than acting on SMC or HVC calls, and I'm
> quite sure there will be a need for user space to emulate handling of
> SMC calls sooner or later, and that is the interface we need.
>
> We may end up with both kernel and user space support for PSCI in that
> case, but naturally it would always be *either* the kernel handling the
> whole thing, *or* user space handling the whole thing.
>
> In both cases we may need extra functionality to communicate between
> user space and KVM, such as suspending a vcpu from user space and waking
> it up on in-kernel generated timer interrupts, or, conversely, telling
> user space that there was a PSCI call to turn off the system.
Agreed. Lets keep "user space emulation of PSCI for SMC/HVC" out of
discussions because we can implement PSCI SYSTEM_OFF and
SYSTEM_RESET without that.
>
> >From my point of view the most difficult part to figure out is how to
> support general handling of secure monitor services in user space. I
> don't see any particular problems with expanding the exit reasons with
> things like "turn off VM" or "reset VM"?
Fine with me.
I'll keep separate exit reasons for HVC-based SYSTEM_OFF and
SYSTEM_RESET.
If everyone agrees with this then I'll change it accordingly in revised patch.
(Marc? Christoffer? PMM? ...)
--
Anup
>
> -Christoffer
> _______________________________________________
> kvmarm mailing list
> kvmarm at lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
^ permalink raw reply [flat|nested] 40+ messages in thread
end of thread, other threads:[~2013-10-18 4:18 UTC | newest]
Thread overview: 40+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-16 17:02 [RFC PATCH 0/5] PSCI system off and reset for KVM ARM/ARM64 Anup Patel
2013-10-16 17:02 ` [RFC PATCH 1/5] ARM/ARM64: KVM: Update user space API header for PSCI emulation Anup Patel
2013-10-16 20:30 ` Christoffer Dall
2013-10-17 6:25 ` Anup Patel
2013-10-16 22:11 ` Christoffer Dall
2013-10-17 6:45 ` Anup Patel
2013-10-17 8:47 ` Marc Zyngier
2013-10-17 11:10 ` Anup Patel
2013-10-17 11:21 ` Marc Zyngier
2013-10-17 11:30 ` Anup Patel
2013-10-17 11:49 ` Alexander Graf
2013-10-17 11:55 ` Marc Zyngier
2013-10-17 12:01 ` Alexander Graf
2013-10-17 19:04 ` Christoffer Dall
2013-10-17 22:06 ` Alexander Graf
2013-10-17 22:24 ` Christoffer Dall
2013-10-17 22:26 ` Alexander Graf
2013-10-18 3:34 ` Christoffer Dall
2013-10-17 15:32 ` Anup Patel
2013-10-17 11:52 ` Marc Zyngier
2013-10-16 17:02 ` [RFC PATCH 2/5] ARM/ARM64: KVM: Forward PSCI SYSTEM_OFF and SYSTEM_RESET to user space Anup Patel
2013-10-16 22:22 ` Christoffer Dall
2013-10-17 5:52 ` Anup Patel
2013-10-17 8:37 ` Marc Zyngier
2013-10-17 9:10 ` Peter Maydell
2013-10-17 9:21 ` Marc Zyngier
2013-10-17 9:31 ` Peter Maydell
2013-10-17 18:34 ` Christoffer Dall
2013-10-18 4:18 ` Anup Patel
2013-10-17 11:07 ` Anup Patel
2013-10-17 11:13 ` Marc Zyngier
2013-10-17 11:13 ` Anup Patel
2013-10-17 18:29 ` Christoffer Dall
2013-10-16 17:02 ` [RFC PATCH 3/5] KVM: Add documentation for KVM_EXIT_PSCI exit reason Anup Patel
2013-10-16 17:02 ` [RFC PATCH 4/5] ARM: psci: Add support for system reboot and poweroff Anup Patel
2013-10-16 22:17 ` Rob Herring
2013-10-17 5:08 ` Anup Patel
2013-10-17 9:50 ` Marc Zyngier
2013-10-16 17:02 ` [RFC PATCH 5/5] ARM64: " Anup Patel
2013-10-16 17:08 ` [RFC PATCH 0/5] PSCI system off and reset for KVM ARM/ARM64 Anup Patel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).