* [RFC PATCH 00/11] kvm/arm: trap-me-harder implementation
@ 2025-06-17 16:33 Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 01/11] target/arm: allow gdb to read ARM_CP_NORAW regs (!upstream) Alex Bennée
` (10 more replies)
0 siblings, 11 replies; 15+ messages in thread
From: Alex Bennée @ 2025-06-17 16:33 UTC (permalink / raw)
To: qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini,
Philippe Mathieu-Daudé, Alex Bennée
The following is an RFC to explore how KVM would look if we forwarded
almost all traps back to QEMU to deal with.
Why - won't it be horribly slow?
--------------------------------
Maybe, that's why it's an RFC.
Traditionally KVM tries to avoid full vmexits to QEMU because the
additional context switches add to the latency of servicing requests.
For things like the GIC, where latency really matters, the normal KVM
approach is to implement it in the kernel and then just leave QEMU to
handle state saving and migration matters.
Where we have to exit, for example for device emulation, platforms
like VirtIO try really hard to minimise the number of times we exit
for any data transfer.
However hypervisors can't virtualise everything and for some QEMU
use-cases you might want to run the full software stack (firmware,
hypervisor et al). This is the idea for the proposed SplitAccel where
EL1/EL0 are run under a hypervisor and EL2+ get run under TCG's
emulation. For this to work QEMU needs to be aware of the whole system
state and have full control over anything that is virtualised by the
hypervisor. We have an initial PoC for SplitAccel that works with
HVF's much simpler programming model.
This series is a precursor to implementing a SplitAccel for KVM and
investigates how hacky it might look.
Kernel
------
For this to work you need a modified kernel. You can find my tree
here:
https://git.linaro.org/plugins/gitiles/people/alex.bennee/linux/+/refs/heads/kvm/trap-me-harder
I will be posting the kernel patches to LKML in due course but the
changes are pretty simple. We add a new creation flag
(KVM_VM_TYPE_ARM_TRAP_ALL) that, when activated, implements an
alternative table in KVM's handle_exit() code.
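As a rough illustration of how the creation flag composes with the existing IPA size field, the sketch below builds a VM type value. The two constants are copied from the header update in patch 3 of this series (they come from the modified kernel tree, not mainline); make_vm_type() and demo_make_vm_type() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as they appear in the linux-headers update in this series. */
#define KVM_VM_TYPE_ARM_IPA_SIZE_MASK 0xffULL
#define KVM_VM_TYPE_ARM_TRAP_ALL      0x10000ULL

/*
 * Hypothetical helper: combine the requested IPA size (bits [7:0])
 * with the trap-all flag to form the VM type value.
 */
uint64_t make_vm_type(unsigned ipa_bits, bool trap_all)
{
    uint64_t type = ipa_bits & KVM_VM_TYPE_ARM_IPA_SIZE_MASK;

    if (trap_all) {
        type |= KVM_VM_TYPE_ARM_TRAP_ALL;
    }
    return type;
}

/* Small self-check showing the expected bit layout. */
void demo_make_vm_type(void)
{
    assert(make_vm_type(48, true) == 0x10030ULL);  /* 48 == 0x30 */
    assert(make_vm_type(48, false) == 0x30ULL);
    assert(make_vm_type(0, false) == 0);           /* legacy default type */
}
```

The flag sits well above the 8-bit IPA field, so the two can be combined without disturbing each other.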
The ESR_ELx_EC_IABT_LOW/ESR_ELx_EC_DABT_LOW exceptions are still
handled by KVM as the kernel generally has to deal with paging in the
required memory. I've also left the debug exceptions to be processed
in KVM as the handling of pstate gets tricky and takes care when
re-entering the guest.
Everything else exits with a new exit reason called
KVM_EXIT_ARM_TRAP_HARDER, which exposes the ESR_EL1 and a few other
registers so QEMU can deal with things.
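To make the userspace side concrete, here is a minimal sketch of splitting a 64-bit ESR value into the fields a handler would dispatch on. The bit positions follow the architectural ESR_ELx layout (EC in [31:26], IL in bit 25, ISS in [24:0]) and the ISS2 position used in patch 8; struct esr_fields and decode_esr() are hypothetical names for this example and are not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded syndrome fields. */
struct esr_fields {
    unsigned ec;    /* exception class, bits [31:26] */
    unsigned il;    /* instruction length bit, bit 25 */
    uint32_t iss;   /* instruction specific syndrome, bits [24:0] */
    uint32_t iss2;  /* ISS2, bits [55:32] */
};

struct esr_fields decode_esr(uint64_t esr)
{
    struct esr_fields f;

    f.ec   = (unsigned)((esr >> 26) & 0x3f);
    f.il   = (unsigned)((esr >> 25) & 0x1);
    f.iss  = (uint32_t)(esr & 0x1ffffff);
    f.iss2 = (uint32_t)((esr >> 32) & 0xffffff);
    return f;
}

/* Self-check: EC 0x18 is a trapped AArch64 MSR/MRS/system instruction. */
void demo_decode_esr(void)
{
    struct esr_fields f = decode_esr(0x6234F801ULL);

    assert(f.ec == 0x18);
    assert(f.il == 1);
    assert(f.iss == 0x34F801);
    assert(f.iss2 == 0);
}
```

A handler would typically switch on the ec field and pass iss to a class-specific decoder.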
QEMU Patches
------------
Patches 1-2 - minor tweaks that make debugging easier
Patch 3 - bring in the uapi headers from Kernel
Patches 4-5 - plumbing in -accel kvm,trap-harder=on
Patches 6-7 - allow creation of an out-of-kernel GIC (kernel-irqchip=off)
Patches 8-11- trap handlers for the kvm_arm_handle_hard_trap path
Testing
-------
Currently I'm testing everything inside an emulated QEMU, so the guest
host is booted with a standard Debian Trixie, although I use virtiofsd
to mount my real host home inside the guest host's home:
./qemu-system-aarch64 \
-machine type=virt,virtualization=on,pflash0=rom,pflash1=efivars,gic-version=max \
-blockdev node-name=rom,driver=file,filename=$(pwd)/pc-bios/edk2-aarch64-code.fd,read-only=true \
-blockdev node-name=efivars,driver=file,filename=$HOME/images/qemu-arm64-efivars \
-cpu cortex-a76 \
-m 8192 \
-object memory-backend-memfd,id=mem,size=8G,share=on \
-numa node,memdev=mem \
-smp 4 \
-accel tcg \
-serial mon:stdio \
-device virtio-net-pci,netdev=unet \
-netdev user,id=unet,hostfwd=tcp::2222-:22 \
-device virtio-scsi-pci \
-device scsi-hd,drive=hd \
-blockdev driver=raw,node-name=hd,file.driver=host_device,file.filename=/dev/zen-ssd2/trixie-arm64,discard=unmap \
-kernel /home/alex/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image \
-append "root=/dev/sda2" \
-chardev socket,id=vfs,path=/tmp/virtiofsd.sock \
-device vhost-user-fs-pci,chardev=vfs,tag=home \
-display none -s -S
Inside the guest host I have built QEMU with:
../../configure --disable-docs \
--enable-debug-info --extra-ldflags=-gsplit-dwarf \
--disable-tcg --disable-xen --disable-tools \
--target-list=aarch64-softmmu
make qemu-system-aarch64 -j$(nproc)
Even with a cut down configuration this can take a while to build
under softmmu emulation!
And finally I can boot my guest image with:
./qemu-system-aarch64 \
-machine type=virt,gic-version=3 \
-cpu host \
-smp 1 \
-accel kvm,kernel-irqchip=off,trap-harder=on \
-serial mon:stdio \
-m 4096 \
-kernel ~/lsrc/linux.git/builds/arm64.initramfs/arch/arm64/boot/Image \
-append "console=ttyAMA0" \
-display none -d unimp,trace:kvm_hypercall,trace:kvm_wfx_trap
And you can witness the system slowly booting up. Currently the system
hangs before displaying the login prompt because it's not being woken
up from the WFI:
[ 0.315642] Serial: AMBA PL011 UART driver
[ 0.345625] 9000000.pl011: ttyAMA0 at MMIO 0x9000000 (irq = 13, base_baud = 0) is a PL011 rev1
[ 0.348138] printk: console [ttyAMA0] enabled
Saving 256 bits of creditable seed for next boot
Starting syslogd: OK
Starting klogd: OK
Running sysctl: OK
Populating /dev using udev: done
Starting system message bus: done
Starting network: udhcpc: started, v1.37.0
kvm_wfx_trap 0: WFI @ 0xffffffc080cf9be4
Next steps
----------
I need to figure out what's going on with the WFI failing. I also
intend to boot up my AArch64 system and try it out on real hardware.
Then I can start looking into the actual performance and what
bottlenecks this might introduce.
Once Philippe has posted the SplitAccel RFC I can look at what it
would take to integrate this approach so we can boot a full-stack with
EL3/EL2 starting.
Alex Bennée (11):
target/arm: allow gdb to read ARM_CP_NORAW regs (!upstream)
target/arm: re-arrange debug_cp_reginfo
linux-headers: Update to Linux 6.15.1 with trap-mem-harder (WIP)
kvm: expose a trap-harder option to the command line
target/arm: enable KVM_VM_TYPE_ARM_TRAP_ALL when asked
kvm/arm: allow out-of kernel GICv3 to work with KVM
target/arm: clamp value on icc_bpr_write to account for RES0 fields
kvm/arm: plumb in a basic trap harder handler
kvm/arm: implement sysreg trap handler
kvm/arm: implement a basic hypercall handler
kvm/arm: implement WFx traps for KVM
include/standard-headers/linux/virtio_pci.h | 1 +
include/system/kvm_int.h | 4 +
linux-headers/linux/kvm.h | 8 +
linux-headers/linux/vhost.h | 4 +-
target/arm/kvm_arm.h | 17 ++
target/arm/syndrome.h | 4 +
hw/arm/virt.c | 18 +-
hw/intc/arm_gicv3_common.c | 4 -
hw/intc/arm_gicv3_cpuif.c | 5 +-
target/arm/cpu.c | 2 +-
target/arm/debug_helper.c | 12 +-
target/arm/gdbstub.c | 6 +-
target/arm/helper.c | 15 +-
target/arm/kvm-stub.c | 5 +
target/arm/kvm.c | 243 ++++++++++++++++++++
hw/intc/Kconfig | 2 +-
target/arm/trace-events | 4 +
17 files changed, 334 insertions(+), 20 deletions(-)
--
2.47.2
* [RFC PATCH 01/11] target/arm: allow gdb to read ARM_CP_NORAW regs (!upstream)
2025-06-17 16:33 [RFC PATCH 00/11] kvm/arm: trap-me-harder implementation Alex Bennée
@ 2025-06-17 16:33 ` Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 02/11] target/arm: re-arrange debug_cp_reginfo Alex Bennée
` (9 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Alex Bennée @ 2025-06-17 16:33 UTC (permalink / raw)
To: qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini,
Philippe Mathieu-Daudé, Alex Bennée
Before this we suppressed all ARM_CP_NO_RAW registers from being listed
under GDB. This includes useful registers like CurrentEL, which gets
tagged as ARM_CP_NO_RAW because it is one of the ARM_CP_SPECIAL_MASK
registers. These are registers TCG can directly compute because we
have the information at compile time, but which until now had no
readfn.
Add a .readfn to return the CurrentEL and then loosen the restrictions
in arm_register_sysreg_for_feature to allow ARM_CP_NO_RAW registers to
be read if there is a readfn available.
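The loosened rule can be summarised as a small predicate. This is only a sketch of the decision described above: the EX_CP_* flag values are made up for the example (they are not QEMU's real ARM_CP_* values) and expose_to_gdb() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits only; QEMU's real ARM_CP_* values differ. */
#define EX_CP_NO_RAW (1u << 0)
#define EX_CP_NO_GDB (1u << 1)

/*
 * Hypothetical predicate: should a register be listed for gdb?
 *  - NO_GDB always hides the register.
 *  - NO_RAW hides it only when there is no read helper to call.
 */
bool expose_to_gdb(unsigned type, bool has_readfn)
{
    if (type & EX_CP_NO_GDB) {
        return false;
    }
    if ((type & EX_CP_NO_RAW) && !has_readfn) {
        return false;
    }
    return true;
}

/* Self-check covering the cases the commit message talks about. */
void demo_expose_to_gdb(void)
{
    assert(expose_to_gdb(0, false));             /* plain register */
    assert(!expose_to_gdb(EX_CP_NO_RAW, false)); /* still hidden */
    assert(expose_to_gdb(EX_CP_NO_RAW, true));   /* newly visible */
    assert(!expose_to_gdb(EX_CP_NO_GDB, true));  /* explicit opt-out */
}
```

With this rule CURRENTEL becomes visible because it now has a readfn, while other NO_RAW registers without helpers stay hidden.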
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-ID: <20250507165840.401623-1-alex.bennee@linaro.org>
---
vRFC
- this is a useful debugging aid but a bit haphazard for
up-streaming. See thread comments for details.
---
target/arm/gdbstub.c | 6 +++++-
target/arm/helper.c | 15 ++++++++++++++-
2 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
index ce4497ad7c..029678ac9a 100644
--- a/target/arm/gdbstub.c
+++ b/target/arm/gdbstub.c
@@ -282,7 +282,11 @@ static void arm_register_sysreg_for_feature(gpointer key, gpointer value,
CPUARMState *env = &cpu->env;
DynamicGDBFeatureInfo *dyn_feature = &cpu->dyn_sysreg_feature;
- if (!(ri->type & (ARM_CP_NO_RAW | ARM_CP_NO_GDB))) {
+ if (!(ri->type & ARM_CP_NO_GDB)) {
+ /* skip ARM_CP_NO_RAW if there are no helper functions */
+ if ((ri->type & ARM_CP_NO_RAW) && !ri->readfn) {
+ return;
+ }
if (arm_feature(env, ARM_FEATURE_AARCH64)) {
if (ri->state == ARM_CP_STATE_AA64) {
arm_gen_one_feature_sysreg(&param->builder, dyn_feature,
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 7631210287..8501c06b93 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4996,6 +4996,17 @@ static void ic_ivau_write(CPUARMState *env, const ARMCPRegInfo *ri,
}
#endif
+/*
+ * Normally the current_el is known at translation time and we can
+ * emit the result directly in TCG code. However this helper exists
+ * only so we can also expose CURRENTEL to gdb.
+ */
+static uint64_t aa64_currentel_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+ int el = arm_current_el(env);
+ return el;
+}
+
static const ARMCPRegInfo v8_cp_reginfo[] = {
/*
* Minimal set of EL0-visible registers. This will need to be expanded
@@ -5034,7 +5045,9 @@ static const ARMCPRegInfo v8_cp_reginfo[] = {
},
{ .name = "CURRENTEL", .state = ARM_CP_STATE_AA64,
.opc0 = 3, .opc1 = 0, .opc2 = 2, .crn = 4, .crm = 2,
- .access = PL1_R, .type = ARM_CP_CURRENTEL },
+ .access = PL1_R, .type = ARM_CP_CURRENTEL,
+ .readfn = aa64_currentel_read
+ },
/*
* Instruction cache ops. All of these except `IC IVAU` NOP because we
* don't emulate caches.
--
2.47.2
* [RFC PATCH 02/11] target/arm: re-arrange debug_cp_reginfo
2025-06-17 16:33 [RFC PATCH 00/11] kvm/arm: trap-me-harder implementation Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 01/11] target/arm: allow gdb to read ARM_CP_NORAW regs (!upstream) Alex Bennée
@ 2025-06-17 16:33 ` Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 03/11] linux-headers: Update to Linux 6.15.1 with trap-mem-harder (WIP) Alex Bennée
` (8 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Alex Bennée @ 2025-06-17 16:33 UTC (permalink / raw)
To: qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini,
Philippe Mathieu-Daudé, Alex Bennée
Although we are using structure initialisation the order of the
op[012]/cr[nm] fields doesn't match the rest of the code base.
Re-organise to be consistent and help the poor engineer who is
grepping for system registers.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
target/arm/debug_helper.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
index 69fb1d0d9f..8130ff78de 100644
--- a/target/arm/debug_helper.c
+++ b/target/arm/debug_helper.c
@@ -948,19 +948,21 @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
* DBGDSAR is deprecated and must RAZ from v8 anyway, so it has no AArch64
* accessor.
*/
- { .name = "DBGDRAR", .cp = 14, .crn = 1, .crm = 0, .opc1 = 0, .opc2 = 0,
+ { .name = "DBGDRAR", .cp = 14,
+ .opc0 = 0, .crn = 1, .crm = 0, .opc1 = 0, .opc2 = 0,
.access = PL0_R, .accessfn = access_tdra,
.type = ARM_CP_CONST | ARM_CP_NO_GDB, .resetvalue = 0 },
{ .name = "MDRAR_EL1", .state = ARM_CP_STATE_AA64,
- .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 0,
+ .opc0 = 2, .crn = 1, .crm = 0, .opc1 = 0, .opc2 = 0,
.access = PL1_R, .accessfn = access_tdra,
.type = ARM_CP_CONST, .resetvalue = 0 },
- { .name = "DBGDSAR", .cp = 14, .crn = 2, .crm = 0, .opc1 = 0, .opc2 = 0,
+ { .name = "DBGDSAR", .cp = 14,
+ .opc0 = 0, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
.access = PL0_R, .accessfn = access_tdra,
.type = ARM_CP_CONST | ARM_CP_NO_GDB, .resetvalue = 0 },
/* Monitor debug system control register; the 32-bit alias is DBGDSCRext. */
- { .name = "MDSCR_EL1", .state = ARM_CP_STATE_BOTH,
- .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 2,
+ { .name = "MDSCR_EL1", .state = ARM_CP_STATE_BOTH, .cp = 14,
+ .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 2,
.access = PL1_RW, .accessfn = access_tda,
.fgt = FGT_MDSCR_EL1,
.nv2_redirect_offset = 0x158,
--
2.47.2
* [RFC PATCH 03/11] linux-headers: Update to Linux 6.15.1 with trap-mem-harder (WIP)
2025-06-17 16:33 [RFC PATCH 00/11] kvm/arm: trap-me-harder implementation Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 01/11] target/arm: allow gdb to read ARM_CP_NORAW regs (!upstream) Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 02/11] target/arm: re-arrange debug_cp_reginfo Alex Bennée
@ 2025-06-17 16:33 ` Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 04/11] kvm: expose a trap-harder option to the command line Alex Bennée
` (7 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Alex Bennée @ 2025-06-17 16:33 UTC (permalink / raw)
To: qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini,
Philippe Mathieu-Daudé, Alex Bennée
Import headers for trap-me-harder, based on 6.15.1.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
include/standard-headers/linux/virtio_pci.h | 1 +
linux-headers/linux/kvm.h | 8 ++++++++
linux-headers/linux/vhost.h | 4 ++--
3 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/include/standard-headers/linux/virtio_pci.h b/include/standard-headers/linux/virtio_pci.h
index 91fec6f502..09e964e6ee 100644
--- a/include/standard-headers/linux/virtio_pci.h
+++ b/include/standard-headers/linux/virtio_pci.h
@@ -246,6 +246,7 @@ struct virtio_pci_cfg_cap {
#define VIRTIO_ADMIN_CMD_LIST_USE 0x1
/* Admin command group type. */
+#define VIRTIO_ADMIN_GROUP_TYPE_SELF 0x0
#define VIRTIO_ADMIN_GROUP_TYPE_SRIOV 0x1
/* Transitional device admin command. */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 99cc82a275..bb51fb179b 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -178,6 +178,7 @@ struct kvm_xen_exit {
#define KVM_EXIT_NOTIFY 37
#define KVM_EXIT_LOONGARCH_IOCSR 38
#define KVM_EXIT_MEMORY_FAULT 39
+#define KVM_EXIT_ARM_TRAP_HARDER 40
/* For KVM_EXIT_INTERNAL_ERROR */
/* Emulate instruction failed. */
@@ -439,6 +440,12 @@ struct kvm_run {
__u64 gpa;
__u64 size;
} memory_fault;
+ /* KVM_EXIT_ARM_TRAP_HARDER */
+ struct {
+ __u64 esr;
+ __u64 elr;
+ __u64 far;
+ } arm_trap_harder;
/* Fix the size of the union. */
char padding[256];
};
@@ -645,6 +652,7 @@ struct kvm_enable_cap {
#define KVM_VM_TYPE_ARM_IPA_SIZE_MASK 0xffULL
#define KVM_VM_TYPE_ARM_IPA_SIZE(x) \
((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
+#define KVM_VM_TYPE_ARM_TRAP_ALL 0x10000ULL
/*
* ioctls for /dev/kvm fds:
*/
diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
index b95dd84eef..d4b3e2ae13 100644
--- a/linux-headers/linux/vhost.h
+++ b/linux-headers/linux/vhost.h
@@ -28,10 +28,10 @@
/* Set current process as the (exclusive) owner of this file descriptor. This
* must be called before any other vhost command. Further calls to
- * VHOST_OWNER_SET fail until VHOST_OWNER_RESET is called. */
+ * VHOST_SET_OWNER fail until VHOST_RESET_OWNER is called. */
#define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
/* Give up ownership, and reset the device to default values.
- * Allows subsequent call to VHOST_OWNER_SET to succeed. */
+ * Allows subsequent call to VHOST_SET_OWNER to succeed. */
#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
/* Set up/modify memory layout */
--
2.47.2
* [RFC PATCH 04/11] kvm: expose a trap-harder option to the command line
2025-06-17 16:33 [RFC PATCH 00/11] kvm/arm: trap-me-harder implementation Alex Bennée
` (2 preceding siblings ...)
2025-06-17 16:33 ` [RFC PATCH 03/11] linux-headers: Update to Linux 6.15.1 with trap-mem-harder (WIP) Alex Bennée
@ 2025-06-17 16:33 ` Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 05/11] target/arm: enable KVM_VM_TYPE_ARM_TRAP_ALL when asked Alex Bennée
` (6 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Alex Bennée @ 2025-06-17 16:33 UTC (permalink / raw)
To: qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini,
Philippe Mathieu-Daudé, Alex Bennée
It would be nice to only have the variable for this in a KVM_ARM_STATE,
but currently everything is just held together in the common KVMState.
Only KVM ARM can set the flag though.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
include/system/kvm_int.h | 4 ++++
target/arm/kvm.c | 19 +++++++++++++++++++
2 files changed, 23 insertions(+)
diff --git a/include/system/kvm_int.h b/include/system/kvm_int.h
index 756a3c0a25..a1e306b7b7 100644
--- a/include/system/kvm_int.h
+++ b/include/system/kvm_int.h
@@ -122,6 +122,10 @@ struct KVMState
OnOffAuto kernel_irqchip_split;
bool sync_mmu;
bool guest_state_protected;
+
+ /* currently Arm only, but we have no KVMArmState */
+ bool trap_harder;
+
uint64_t manual_dirty_log_protect;
/*
* Older POSIX says that ioctl numbers are signed int, but in
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 74fda8b809..8b1719bfc1 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1615,6 +1615,18 @@ static void kvm_arch_set_eager_split_size(Object *obj, Visitor *v,
s->kvm_eager_split_size = value;
}
+static bool kvm_arch_get_trap_harder(Object *obj, Error **errp)
+{
+ KVMState *s = KVM_STATE(obj);
+ return s->trap_harder;
+}
+
+static void kvm_arch_set_trap_harder(Object *obj, bool value, Error **errp)
+{
+ KVMState *s = KVM_STATE(obj);
+ s->trap_harder = value;
+}
+
void kvm_arch_accel_class_init(ObjectClass *oc)
{
object_class_property_add(oc, "eager-split-size", "size",
@@ -1623,6 +1635,13 @@ void kvm_arch_accel_class_init(ObjectClass *oc)
object_class_property_set_description(oc, "eager-split-size",
"Eager Page Split chunk size for hugepages. (default: 0, disabled)");
+
+ object_class_property_add_bool(oc, "trap-harder",
+ kvm_arch_get_trap_harder,
+ kvm_arch_set_trap_harder);
+
+ object_class_property_set_description(oc, "trap-harder",
+ "Trap harder mode traps almost everything to QEMU (default: off)");
}
int kvm_arch_insert_hw_breakpoint(vaddr addr, vaddr len, int type)
--
2.47.2
* [RFC PATCH 05/11] target/arm: enable KVM_VM_TYPE_ARM_TRAP_ALL when asked
2025-06-17 16:33 [RFC PATCH 00/11] kvm/arm: trap-me-harder implementation Alex Bennée
` (3 preceding siblings ...)
2025-06-17 16:33 ` [RFC PATCH 04/11] kvm: expose a trap-harder option to the command line Alex Bennée
@ 2025-06-17 16:33 ` Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 06/11] kvm/arm: allow out-of kernel GICv3 to work with KVM Alex Bennée
` (5 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Alex Bennée @ 2025-06-17 16:33 UTC (permalink / raw)
To: qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini,
Philippe Mathieu-Daudé, Alex Bennée
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
target/arm/kvm_arm.h | 9 +++++++++
hw/arm/virt.c | 7 +++++--
target/arm/kvm.c | 7 +++++++
3 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 7dc83caed5..a4f68e14cb 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -191,6 +191,15 @@ bool kvm_arm_sve_supported(void);
*/
bool kvm_arm_mte_supported(void);
+/**
+ * kvm_arm_get_type: return the base KVM type flags
+ * @ms: Machine state handle
+ *
+ * Returns the base type flags, usually zero. These will be combined
+ * with the IPA flags from below.
+ */
+int kvm_arm_get_type(MachineState *ms);
+
/**
* kvm_arm_get_max_vm_ipa_size:
* @ms: Machine state handle
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9a6cd085a3..55433f8fce 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3037,11 +3037,14 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
/*
* for arm64 kvm_type [7-0] encodes the requested number of bits
- * in the IPA address space
+ * in the IPA address space.
+ *
+ * For trap-me-harder we apply KVM_VM_TYPE_ARM_TRAP_ALL
*/
static int virt_kvm_type(MachineState *ms, const char *type_str)
{
VirtMachineState *vms = VIRT_MACHINE(ms);
+ int kvm_type = kvm_arm_get_type(ms);
int max_vm_pa_size, requested_pa_size;
bool fixed_ipa;
@@ -3071,7 +3074,7 @@ static int virt_kvm_type(MachineState *ms, const char *type_str)
* the implicit legacy 40b IPA setting, in which case the kvm_type
* must be 0.
*/
- return fixed_ipa ? 0 : requested_pa_size;
+ return fixed_ipa ? kvm_type : deposit32(kvm_type, 0, 8, requested_pa_size);
}
static int virt_hvf_get_physical_address_range(MachineState *ms)
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 8b1719bfc1..ed0f6024d6 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -515,6 +515,13 @@ int kvm_arm_get_max_vm_ipa_size(MachineState *ms, bool *fixed_ipa)
return ret > 0 ? ret : 40;
}
+int kvm_arm_get_type(MachineState *ms)
+{
+ KVMState *s = KVM_STATE(ms->accelerator);
+
+ return s->trap_harder ? KVM_VM_TYPE_ARM_TRAP_ALL : 0;
+}
+
int kvm_arch_get_default_type(MachineState *ms)
{
bool fixed_ipa;
--
2.47.2
* [RFC PATCH 06/11] kvm/arm: allow out-of kernel GICv3 to work with KVM
2025-06-17 16:33 [RFC PATCH 00/11] kvm/arm: trap-me-harder implementation Alex Bennée
` (4 preceding siblings ...)
2025-06-17 16:33 ` [RFC PATCH 05/11] target/arm: enable KVM_VM_TYPE_ARM_TRAP_ALL when asked Alex Bennée
@ 2025-06-17 16:33 ` Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 07/11] target/arm: clamp value on icc_bpr_write to account for RES0 fields Alex Bennée
` (4 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Alex Bennée @ 2025-06-17 16:33 UTC (permalink / raw)
To: qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini,
Philippe Mathieu-Daudé, Alex Bennée
Previously we suppressed this option as KVM would get confused if it
started trapping GIC system registers without a GIC configured.
However if we know we are trapping harder we can allow it much like we
do for HVF.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
target/arm/kvm_arm.h | 8 ++++++++
hw/arm/virt.c | 11 +++++++++--
hw/intc/arm_gicv3_common.c | 4 ----
target/arm/cpu.c | 2 +-
target/arm/kvm.c | 6 ++++++
hw/intc/Kconfig | 2 +-
6 files changed, 25 insertions(+), 8 deletions(-)
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index a4f68e14cb..008a72ccd4 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -200,6 +200,14 @@ bool kvm_arm_mte_supported(void);
*/
int kvm_arm_get_type(MachineState *ms);
+/**
+ * kvm_arm_is_trapping_harder: return true if trapping harder
+ * @ms: Machine state handle
+ *
+ * return true if trapping harder
+ */
+bool kvm_arm_is_trapping_harder(MachineState *ms);
+
/**
* kvm_arm_get_max_vm_ipa_size:
* @ms: Machine state handle
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 55433f8fce..e117433cc7 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1998,9 +1998,16 @@ static void finalize_gic_version(VirtMachineState *vms)
gics_supported |= VIRT_GIC_VERSION_3_MASK;
}
} else if (kvm_enabled() && !kvm_irqchip_in_kernel()) {
- /* KVM w/o kernel irqchip can only deal with GICv2 */
+ MachineState *ms = MACHINE(vms);
gics_supported |= VIRT_GIC_VERSION_2_MASK;
- accel_name = "KVM with kernel-irqchip=off";
+ if (kvm_arm_is_trapping_harder(ms) &&
+ module_object_class_by_name("arm-gicv3")) {
+ gics_supported |= VIRT_GIC_VERSION_3_MASK;
+ accel_name = "TMH KVM with kernel-irqchip=off";
+ } else {
+ /* KVM w/o kernel irqchip can only deal with GICv2 */
+ accel_name = "KVM with kernel-irqchip=off";
+ }
} else if (tcg_enabled() || hvf_enabled() || qtest_enabled()) {
gics_supported |= VIRT_GIC_VERSION_2_MASK;
if (module_object_class_by_name("arm-gicv3")) {
diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index 1cee68193c..9a46afaa0d 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -662,10 +662,6 @@ const char *gicv3_class_name(void)
if (kvm_irqchip_in_kernel()) {
return "kvm-arm-gicv3";
} else {
- if (kvm_enabled()) {
- error_report("Userspace GICv3 is not supported with KVM");
- exit(1);
- }
return "arm-gicv3";
}
}
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index e025e241ed..f7618a3038 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1463,7 +1463,7 @@ static void arm_cpu_initfn(Object *obj)
# endif
#else
/* Our inbound IRQ and FIQ lines */
- if (kvm_enabled()) {
+ if (kvm_enabled() && kvm_irqchip_in_kernel()) {
/*
* VIRQ, VFIQ, NMI, VINMI are unused with KVM but we add
* them to maintain the same interface as non-KVM CPUs.
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index ed0f6024d6..c5374d12cf 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -522,6 +522,12 @@ int kvm_arm_get_type(MachineState *ms)
return s->trap_harder ? KVM_VM_TYPE_ARM_TRAP_ALL : 0;
}
+bool kvm_arm_is_trapping_harder(MachineState *ms)
+{
+ KVMState *s = KVM_STATE(ms->accelerator);
+ return s->trap_harder;
+}
+
int kvm_arch_get_default_type(MachineState *ms)
{
bool fixed_ipa;
diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig
index 7547528f2c..0eb37364a7 100644
--- a/hw/intc/Kconfig
+++ b/hw/intc/Kconfig
@@ -23,7 +23,7 @@ config APIC
config ARM_GIC
bool
- select ARM_GICV3 if TCG
+ select ARM_GICV3 # can be used by TCG, HVF or KVM
select ARM_GIC_KVM if KVM
select MSI_NONBROKEN
--
2.47.2
* [RFC PATCH 07/11] target/arm: clamp value on icc_bpr_write to account for RES0 fields
2025-06-17 16:33 [RFC PATCH 00/11] kvm/arm: trap-me-harder implementation Alex Bennée
` (5 preceding siblings ...)
2025-06-17 16:33 ` [RFC PATCH 06/11] kvm/arm: allow out-of kernel GICv3 to work with KVM Alex Bennée
@ 2025-06-17 16:33 ` Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 08/11] kvm/arm: plumb in a basic trap harder handler Alex Bennée
` (3 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Alex Bennée @ 2025-06-17 16:33 UTC (permalink / raw)
To: qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini,
Philippe Mathieu-Daudé, Alex Bennée
If the user writes a large value to the register but with the bottom
bits unset we could end up with something illegal. By clamping ahead
of the check we at least ensure we won't assert(bpr > 0) later in the
GIC interface code.
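The effect of clamping first can be seen with a stand-alone version of the deposit operation. This is a sketch under assumptions: deposit64_sketch() mimics the behaviour of QEMU's deposit64() as used in the diff (insert a field of the given length at the given start bit), and clamp_bpr() is a hypothetical wrapper for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-alone stand-in for a deposit64-style helper: replace the
 * 'length' bits of 'value' starting at bit 'start' with the low
 * bits of 'fieldval'.
 */
uint64_t deposit64_sketch(uint64_t value, int start, int length,
                          uint64_t fieldval)
{
    uint64_t mask;

    assert(start >= 0 && length > 0 && length <= 64 - start);
    mask = (~0ULL >> (64 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Keep only bits [2:0] of a BPR write; everything above is RES0. */
uint64_t clamp_bpr(uint64_t value)
{
    return deposit64_sketch(0, 0, 3, value);
}

void demo_clamp_bpr(void)
{
    assert(clamp_bpr(0x5) == 0x5);   /* in-range value is unchanged */
    assert(clamp_bpr(0xf) == 0x7);   /* upper bits are dropped */
    assert(clamp_bpr(0x100) == 0x0); /* large value, low bits clear */
}
```

The last case is the interesting one: 0x100 compares as large, so a minimum-value check done before masking would not raise it, yet masking afterwards would store 0. Clamping ahead of the check lets the check see the value that will actually be stored.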
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
hw/intc/arm_gicv3_cpuif.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index 4b4cf09157..165f7e9c2f 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -1797,6 +1797,9 @@ static void icc_bpr_write(CPUARMState *env, const ARMCPRegInfo *ri,
trace_gicv3_icc_bpr_write(ri->crm == 8 ? 0 : 1,
gicv3_redist_affid(cs), value);
+ /* clamp the value to 2:0, the rest is RES0 */
+ value = deposit64(0, 0, 3, value);
+
if (grp == GICV3_G1 && gicv3_use_ns_bank(env)) {
grp = GICV3_G1NS;
}
@@ -1820,7 +1823,7 @@ static void icc_bpr_write(CPUARMState *env, const ARMCPRegInfo *ri,
value = minval;
}
- cs->icc_bpr[grp] = value & 7;
+ cs->icc_bpr[grp] = value;
gicv3_cpuif_update(cs);
}
--
2.47.2
* [RFC PATCH 08/11] kvm/arm: plumb in a basic trap harder handler
2025-06-17 16:33 [RFC PATCH 00/11] kvm/arm: trap-me-harder implementation Alex Bennée
` (6 preceding siblings ...)
2025-06-17 16:33 ` [RFC PATCH 07/11] target/arm: clamp value on icc_bpr_write to account for RES0 fields Alex Bennée
@ 2025-06-17 16:33 ` Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 09/11] kvm/arm: implement sysreg trap handler Alex Bennée
` (2 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Alex Bennée @ 2025-06-17 16:33 UTC (permalink / raw)
To: qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini,
Philippe Mathieu-Daudé, Alex Bennée
Currently we do nothing but report we don't handle anything and let
KVM come to a halt.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
target/arm/syndrome.h | 4 ++++
target/arm/kvm-stub.c | 5 +++++
target/arm/kvm.c | 44 +++++++++++++++++++++++++++++++++++++++++++
3 files changed, 53 insertions(+)
diff --git a/target/arm/syndrome.h b/target/arm/syndrome.h
index 3244e0740d..29b95bdd36 100644
--- a/target/arm/syndrome.h
+++ b/target/arm/syndrome.h
@@ -88,6 +88,10 @@ typedef enum {
#define ARM_EL_ISV_SHIFT 24
#define ARM_EL_IL (1 << ARM_EL_IL_SHIFT)
#define ARM_EL_ISV (1 << ARM_EL_ISV_SHIFT)
+#define ARM_EL_ISS_SHIFT 0
+#define ARM_EL_ISS_LENGTH 25
+#define ARM_EL_ISS2_SHIFT 32
+#define ARM_EL_ISS2_LENGTH 24
/* In the Data Abort syndrome */
#define ARM_EL_VNCR (1 << 13)
diff --git a/target/arm/kvm-stub.c b/target/arm/kvm-stub.c
index 34e57fab01..765efb1848 100644
--- a/target/arm/kvm-stub.c
+++ b/target/arm/kvm-stub.c
@@ -60,6 +60,11 @@ void kvm_arm_add_vcpu_properties(ARMCPU *cpu)
g_assert_not_reached();
}
+int kvm_arm_get_type(MachineState *ms)
+{
+ g_assert_not_reached();
+}
+
int kvm_arm_get_max_vm_ipa_size(MachineState *ms, bool *fixed_ipa)
{
g_assert_not_reached();
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index c5374d12cf..f2255cfdc8 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1414,6 +1414,43 @@ static bool kvm_arm_handle_debug(ARMCPU *cpu,
return false;
}
+/**
+ * kvm_arm_handle_hard_trap:
+ * @cpu: ARMCPU
+ * @esr: full exception syndrome register
+ * @elr: exception link return address
+ * @far: fault address (if used)
+ *
+ * Returns: 0 if the exception has been handled, < 0 otherwise
+ */
+static int kvm_arm_handle_hard_trap(ARMCPU *cpu,
+ uint64_t esr,
+ uint64_t elr,
+ uint64_t far)
+{
+ CPUState *cs = CPU(cpu);
+ int esr_ec = extract64(esr, ARM_EL_EC_SHIFT, ARM_EL_EC_LENGTH);
+ int esr_iss = extract64(esr, ARM_EL_ISS_SHIFT, ARM_EL_ISS_LENGTH);
+ int esr_iss2 = extract64(esr, ARM_EL_ISS2_SHIFT, ARM_EL_ISS2_LENGTH);
+ int esr_il = extract64(esr, ARM_EL_IL_SHIFT, 1);
+
+ /*
+ * Ensure register state is synchronised
+ *
+ * This sets vcpu->vcpu_dirty which should ensure the registers
+ * are synced back to KVM before we restart.
+ */
+ kvm_cpu_synchronize_state(cs);
+
+ switch (esr_ec) {
+ default:
+ qemu_log_mask(LOG_UNIMP, "%s: unhandled EC: %x/%x/%x/%d\n",
+ __func__, esr_ec, esr_iss, esr_iss2, esr_il);
+ return -1;
+ }
+}
+
+
int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
{
ARMCPU *cpu = ARM_CPU(cs);
@@ -1430,9 +1467,16 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
ret = kvm_arm_handle_dabt_nisv(cpu, run->arm_nisv.esr_iss,
run->arm_nisv.fault_ipa);
break;
+ case KVM_EXIT_ARM_TRAP_HARDER:
+ ret = kvm_arm_handle_hard_trap(cpu,
+ run->arm_trap_harder.esr,
+ run->arm_trap_harder.elr,
+ run->arm_trap_harder.far);
+ break;
default:
qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
__func__, run->exit_reason);
+ ret = -1;
break;
}
return ret;
--
2.47.2
* [RFC PATCH 09/11] kvm/arm: implement sysreg trap handler
2025-06-17 16:33 [RFC PATCH 00/11] kvm/arm: trap-me-harder implementation Alex Bennée
` (7 preceding siblings ...)
2025-06-17 16:33 ` [RFC PATCH 08/11] kvm/arm: plumb in a basic trap harder handler Alex Bennée
@ 2025-06-17 16:33 ` Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 10/11] kvm/arm: implement a basic hypercall handler Alex Bennée
2025-06-17 16:33 ` [RFC PATCH 11/11] kvm/arm: implement WFx traps for KVM Alex Bennée
10 siblings, 0 replies; 15+ messages in thread
From: Alex Bennée @ 2025-06-17 16:33 UTC (permalink / raw)
To: qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini,
Philippe Mathieu-Daudé, Alex Bennée
Fortunately all the information about which sysreg is being accessed
should be in the ISS field of the ESR. Once we process that we can
figure out what we need to do.
[AJB: the read/write stuff should probably go into a shared helper].
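For reference, the ISS layout the handler relies on can be sketched in
standalone C. This is only an illustration under stated assumptions, not the
patch code: SysregAccess, bits() and decode_sysreg_iss() are made-up names,
and the field positions are those of the ESR_ELx ISS encoding for trapped
MSR/MRS accesses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields; not a QEMU type. */
typedef struct {
    unsigned op0, op1, op2, crn, crm, rt;
    bool is_read;
} SysregAccess;

/* Return 'len' bits of 'v' starting at bit 'start' (like QEMU's extract32). */
static uint32_t bits(uint32_t v, unsigned start, unsigned len)
{
    return (v >> start) & ((1u << len) - 1u);
}

/* Split the ISS of a trapped MSR/MRS into its register encoding. */
static SysregAccess decode_sysreg_iss(uint32_t iss)
{
    SysregAccess a = {
        .op0 = bits(iss, 20, 2),
        .op2 = bits(iss, 17, 3),
        .op1 = bits(iss, 14, 3),
        .crn = bits(iss, 10, 4),
        .rt = bits(iss, 5, 5),
        .crm = bits(iss, 1, 4),
        .is_read = bits(iss, 0, 1) != 0, /* 1 = MRS (read), 0 = MSR (write) */
    };
    return a;
}
```

For example, an ISS of 0x300061 decodes to op0=3, op1=0, crn=0, crm=0, op2=0
(MIDR_EL1), with rt=3 and the direction bit set, i.e. a read into x3.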
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
target/arm/kvm.c | 95 +++++++++++++++++++++++++++++++++++++++++
target/arm/trace-events | 2 +
2 files changed, 97 insertions(+)
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index f2255cfdc8..0a852af126 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -24,6 +24,7 @@
#include "system/runstate.h"
#include "system/kvm.h"
#include "system/kvm_int.h"
+#include "cpregs.h"
#include "kvm_arm.h"
#include "cpu.h"
#include "trace.h"
@@ -1414,6 +1415,98 @@ static bool kvm_arm_handle_debug(ARMCPU *cpu,
return false;
}
+/*
+ * To handle system register traps we should be able to extract the
+ * register encoding from the ESR's ISS field and go from there.
+ */
+static int kvm_arm_handle_sysreg_trap(ARMCPU *cpu,
+ uint64_t esr_iss,
+ uint64_t elr)
+{
+ int op0 = extract32(esr_iss, 20, 2);
+ int op2 = extract32(esr_iss, 17, 3);
+ int op1 = extract32(esr_iss, 14, 3);
+ int crn = extract32(esr_iss, 10, 4);
+ int rt = extract32(esr_iss, 5, 5);
+ int crm = extract32(esr_iss, 1, 4);
+ bool is_read = extract32(esr_iss, 0, 1);
+
+ uint32_t key = ENCODE_AA64_CP_REG(CP_REG_ARM64_SYSREG_CP, crn, crm, op0, op1, op2);
+ const ARMCPRegInfo *ri = get_arm_cp_reginfo(cpu->cp_regs, key);
+
+ if (ri) {
+ CPUARMState *env = &cpu->env;
+ uint64_t val = 0;
+ bool take_bql = ri->type & ARM_CP_IO;
+
+ if (ri->accessfn) {
+ if (ri->accessfn(env, ri, is_read) != CP_ACCESS_OK) {
+ g_assert_not_reached();
+ }
+ }
+
+ if (take_bql) {
+ bql_lock();
+ }
+
+ if (is_read) {
+ if (ri->type & ARM_CP_CONST) {
+ val = ri->resetvalue;
+ } else if (ri->readfn) {
+ val = ri->readfn(env, ri);
+ } else {
+ val = CPREG_FIELD64(env, ri);
+ }
+ trace_kvm_sysreg_read(ri->name, val);
+
+ if (rt < 31) {
+ env->xregs[rt] = val;
+ } else {
+ /* this would be deeply weird */
+ g_assert_not_reached();
+ }
+ } else {
+ /* x31 == zero reg */
+ if (rt < 31) {
+ val = env->xregs[rt];
+ }
+
+ if (ri->writefn) {
+ ri->writefn(env, ri, val);
+ } else {
+ CPREG_FIELD64(env, ri) = val;
+ }
+ trace_kvm_sysreg_write(ri->name, val);
+ }
+
+ if (take_bql) {
+ bql_unlock();
+ }
+
+ /*
+ * Set PC to return.
+ *
+ * Note elr_el2 doesn't seem to be what we need so let's
+ * rely on env->pc being correct.
+ *
+ * TODO We currently skip to the next instruction
+ * unconditionally but that is at odds with the kernel's code
+ * which only does that conditionally (see kvm_handle_sys_reg
+ * -> perform_access):
+ *
+ * if (likely(r->access(vcpu, params, r)))
+ * kvm_incr_pc(vcpu);
+ *
+ */
+ env->pc = env->pc + 4;
+ return 0;
+ }
+
+ fprintf(stderr, "%s: @ %" PRIx64 " failed to find sysreg crn:%d crm:%d op0:%d op1:%d op2:%d\n",
+ __func__, elr, crn, crm, op0, op1, op2);
+ return -1;
+}
+
/**
* kvm_arm_handle_hard_trap:
* @cpu: ARMCPU
@@ -1443,6 +1536,8 @@ static int kvm_arm_handle_hard_trap(ARMCPU *cpu,
kvm_cpu_synchronize_state(cs);
switch (esr_ec) {
+ case EC_SYSTEMREGISTERTRAP:
+ return kvm_arm_handle_sysreg_trap(cpu, esr_iss, elr);
default:
qemu_log_mask(LOG_UNIMP, "%s: unhandled EC: %x/%x/%x/%d\n",
__func__, esr_ec, esr_iss, esr_iss2, esr_il);
diff --git a/target/arm/trace-events b/target/arm/trace-events
index 4438dce7be..69bb4d370d 100644
--- a/target/arm/trace-events
+++ b/target/arm/trace-events
@@ -13,3 +13,5 @@ arm_gt_update_irq(int timer, int irqstate) "gt_update_irq: timer %d irqstate %d"
# kvm.c
kvm_arm_fixup_msi_route(uint64_t iova, uint64_t gpa) "MSI iova = 0x%"PRIx64" is translated into 0x%"PRIx64
+kvm_sysreg_read(const char *name, uint64_t val) "%s => 0x%" PRIx64
+kvm_sysreg_write(const char *name, uint64_t val) "%s <= 0x%" PRIx64
--
2.47.2
* [RFC PATCH 10/11] kvm/arm: implement a basic hypercall handler
2025-06-17 16:33 [RFC PATCH 00/11] kvm/arm: trap-me-harder implementation Alex Bennée
` (8 preceding siblings ...)
2025-06-17 16:33 ` [RFC PATCH 09/11] kvm/arm: implement sysreg trap handler Alex Bennée
@ 2025-06-17 16:33 ` Alex Bennée
2025-08-22 7:12 ` Philippe Mathieu-Daudé
2025-08-22 7:15 ` Philippe Mathieu-Daudé
2025-06-17 16:33 ` [RFC PATCH 11/11] kvm/arm: implement WFx traps for KVM Alex Bennée
10 siblings, 2 replies; 15+ messages in thread
From: Alex Bennée @ 2025-06-17 16:33 UTC (permalink / raw)
To: qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini,
Philippe Mathieu-Daudé, Alex Bennée
For now just deal with the basic version probe we see during startup.
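As a rough standalone sketch of the probe handling, the same three calls can
be answered without any QEMU headers. The PSCI_* names below are local
stand-ins for the QEMU_PSCI_* macros used in the patch, with values taken from
the PSCI specification; psci_probe() is a hypothetical helper, not a QEMU
function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the QEMU_PSCI_* macros (values from the PSCI spec). */
#define PSCI_FN_VERSION            0x84000000u
#define PSCI_FN_MIGRATE_INFO_TYPE  0x84000006u
#define PSCI_FN_FEATURES           0x8400000au
#define PSCI_RET_NOT_SUPPORTED     (-1)
#define PSCI_TOS_NOT_REQUIRED      2
/* Version is returned as major in bits [31:16], minor in bits [15:0]. */
#define PSCI_VERSION(maj, min)     ((int32_t)(((maj) << 16) | (min)))

/*
 * Answer the handful of calls seen during guest startup.
 * Returns false (leaving *x0 untouched) for anything else.
 */
static bool psci_probe(uint64_t fn, uint64_t *x0)
{
    int32_t ret;

    switch (fn) {
    case PSCI_FN_VERSION:
        ret = PSCI_VERSION(1, 1);
        break;
    case PSCI_FN_MIGRATE_INFO_TYPE:
        ret = PSCI_TOS_NOT_REQUIRED; /* no trusted OS to migrate */
        break;
    case PSCI_FN_FEATURES:
        ret = PSCI_RET_NOT_SUPPORTED;
        break;
    default:
        return false;
    }

    /* Negative error codes are sign-extended into the 64-bit register. */
    *x0 = (uint64_t)(int64_t)ret;
    return true;
}
```

With this sketch a PSCI_VERSION call leaves 0x10001 (version 1.1) in x0,
while PSCI_FEATURES leaves all ones, i.e. NOT_SUPPORTED.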
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
target/arm/kvm.c | 44 +++++++++++++++++++++++++++++++++++++++++
target/arm/trace-events | 1 +
2 files changed, 45 insertions(+)
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 0a852af126..1280e2c1e8 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1507,6 +1507,43 @@ static int kvm_arm_handle_sysreg_trap(ARMCPU *cpu,
return -1;
}
+/*
+ * The guest is making a hypercall or firmware call. We can handle a
+ * limited number of them (e.g. PSCI) but we can't emulate a true
+ * firmware. This is an abbreviated version of
+ * kvm_smccc_call_handler() in the kernel and the TCG only arm_handle_psci_call().
+ *
+ * In the SplitAccel case we would be transitioning to execute EL2+
+ * under TCG.
+ */
+static int kvm_arm_handle_hypercall(ARMCPU *cpu,
+ int esr_ec)
+{
+ CPUARMState *env = &cpu->env;
+ int32_t ret = 0;
+
+ trace_kvm_hypercall(esr_ec, env->xregs[0]);
+
+ switch (env->xregs[0]) {
+ case QEMU_PSCI_0_2_FN_PSCI_VERSION:
+ ret = QEMU_PSCI_VERSION_1_1;
+ break;
+ case QEMU_PSCI_0_2_FN_MIGRATE_INFO_TYPE:
+ ret = QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED; /* No trusted OS */
+ break;
+ case QEMU_PSCI_1_0_FN_PSCI_FEATURES:
+ ret = QEMU_PSCI_RET_NOT_SUPPORTED;
+ break;
+ default:
+ qemu_log_mask(LOG_UNIMP, "%s: unhandled hypercall %"PRIx64"\n",
+ __func__, env->xregs[0]);
+ return -1;
+ }
+
+ env->xregs[0] = ret;
+ return 0;
+}
+
/**
* kvm_arm_handle_hard_trap:
* @cpu: ARMCPU
@@ -1538,6 +1575,13 @@ static int kvm_arm_handle_hard_trap(ARMCPU *cpu,
switch (esr_ec) {
case EC_SYSTEMREGISTERTRAP:
return kvm_arm_handle_sysreg_trap(cpu, esr_iss, elr);
+ case EC_AA32_SVC:
+ case EC_AA32_HVC:
+ case EC_AA32_SMC:
+ case EC_AA64_SVC:
+ case EC_AA64_HVC:
+ case EC_AA64_SMC:
+ return kvm_arm_handle_hypercall(cpu, esr_ec);
default:
qemu_log_mask(LOG_UNIMP, "%s: unhandled EC: %x/%x/%x/%d\n",
__func__, esr_ec, esr_iss, esr_iss2, esr_il);
diff --git a/target/arm/trace-events b/target/arm/trace-events
index 69bb4d370d..10cdba92a3 100644
--- a/target/arm/trace-events
+++ b/target/arm/trace-events
@@ -15,3 +15,4 @@ arm_gt_update_irq(int timer, int irqstate) "gt_update_irq: timer %d irqstate %d"
kvm_arm_fixup_msi_route(uint64_t iova, uint64_t gpa) "MSI iova = 0x%"PRIx64" is translated into 0x%"PRIx64
kvm_sysreg_read(const char *name, uint64_t val) "%s => 0x%" PRIx64
kvm_sysreg_write(const char *name, uint64_t val) "%s <= 0x%" PRIx64
+kvm_hypercall(int ec, uint64_t arg0) "%d: %"PRIx64
--
2.47.2
* [RFC PATCH 11/11] kvm/arm: implement WFx traps for KVM
2025-06-17 16:33 [RFC PATCH 00/11] kvm/arm: trap-me-harder implementation Alex Bennée
` (9 preceding siblings ...)
2025-06-17 16:33 ` [RFC PATCH 10/11] kvm/arm: implement a basic hypercall handler Alex Bennée
@ 2025-06-17 16:33 ` Alex Bennée
10 siblings, 0 replies; 15+ messages in thread
From: Alex Bennée @ 2025-06-17 16:33 UTC (permalink / raw)
To: qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini,
Philippe Mathieu-Daudé, Alex Bennée
This allows the vCPU guest core to go to sleep on a WFx instruction.
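The trapped instruction is identified by the two-bit TI field at the bottom
of the ISS. A minimal standalone sketch of that decode follows; wfx_name()
and wfx_has_timeout() are illustrative names only, and the sketch assumes
the TI encoding with the timeout variants (WFIT/WFET) in the upper two
values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Indexed by ISS.TI (bits [1:0]). */
static const char *const wfx_names[] = { "WFI", "WFE", "WFIT", "WFET" };

/* Name of the trapped instruction. */
static const char *wfx_name(uint32_t iss)
{
    return wfx_names[iss & 3u];
}

/* TI bit 1 distinguishes the timeout variants (WFIT/WFET). */
static bool wfx_has_timeout(uint32_t iss)
{
    return (iss & 2u) != 0;
}
```

A handler that wanted to honour the timeout forms could use the second
helper to decide whether to arm a timer before putting the vCPU to sleep;
the patch itself does not do this.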
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
target/arm/kvm.c | 28 ++++++++++++++++++++++++++++
target/arm/trace-events | 1 +
2 files changed, 29 insertions(+)
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 1280e2c1e8..63ba8573a2 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1544,6 +1544,32 @@ static int kvm_arm_handle_hypercall(ARMCPU *cpu,
return 0;
}
+/*
+ * It would be perfectly fine to return immediately from any WFE/WFI
+ * trap; however, that would mean we spend a lot of time bouncing
+ * between the hypervisor and QEMU when things are idle.
+ */
+
+static const char *wfx_insn[] = {
+ "WFI",
+ "WFE",
+ "WFIT",
+ "WFET"
+};
+
+static int kvm_arm_handle_wfx(CPUState *cs, int esr_iss)
+{
+ int ti = extract32(esr_iss, 0, 2);
+ ARMCPU *cpu = ARM_CPU(cs);
+ CPUARMState *env = &cpu->env;
+
+ trace_kvm_wfx_trap(cs->cpu_index, wfx_insn[ti], env->pc);
+
+ /* stop the CPU, return to the top of the loop */
+ cs->stop = true;
+ return EXCP_YIELD;
+}
+
/**
* kvm_arm_handle_hard_trap:
* @cpu: ARMCPU
@@ -1582,6 +1608,8 @@ static int kvm_arm_handle_hard_trap(ARMCPU *cpu,
case EC_AA64_HVC:
case EC_AA64_SMC:
return kvm_arm_handle_hypercall(cpu, esr_ec);
+ case EC_WFX_TRAP:
+ return kvm_arm_handle_wfx(cs, esr_iss);
default:
qemu_log_mask(LOG_UNIMP, "%s: unhandled EC: %x/%x/%x/%d\n",
__func__, esr_ec, esr_iss, esr_iss2, esr_il);
diff --git a/target/arm/trace-events b/target/arm/trace-events
index 10cdba92a3..bb02da12ab 100644
--- a/target/arm/trace-events
+++ b/target/arm/trace-events
@@ -16,3 +16,4 @@ kvm_arm_fixup_msi_route(uint64_t iova, uint64_t gpa) "MSI iova = 0x%"PRIx64" is
kvm_sysreg_read(const char *name, uint64_t val) "%s => 0x%" PRIx64
kvm_sysreg_write(const char *name, uint64_t val) "%s <= 0x%" PRIx64
kvm_hypercall(int ec, uint64_t arg0) "%d: %"PRIx64
+kvm_wfx_trap(int vcpu, const char *insn, uint64_t vaddr) "%d: %s @ 0x%" PRIx64
--
2.47.2
* Re: [RFC PATCH 10/11] kvm/arm: implement a basic hypercall handler
2025-06-17 16:33 ` [RFC PATCH 10/11] kvm/arm: implement a basic hypercall handler Alex Bennée
@ 2025-08-22 7:12 ` Philippe Mathieu-Daudé
2025-08-22 7:55 ` Manos Pitsidianakis
2025-08-22 7:15 ` Philippe Mathieu-Daudé
1 sibling, 1 reply; 15+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-22 7:12 UTC (permalink / raw)
To: Alex Bennée, qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini,
Pierrick Bouvier
On 17/6/25 18:33, Alex Bennée wrote:
> For now just deal with the basic version probe we see during startup.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
> target/arm/kvm.c | 44 +++++++++++++++++++++++++++++++++++++++++
> target/arm/trace-events | 1 +
> 2 files changed, 45 insertions(+)
>
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 0a852af126..1280e2c1e8 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -1507,6 +1507,43 @@ static int kvm_arm_handle_sysreg_trap(ARMCPU *cpu,
> return -1;
> }
>
> +/*
> + * The guest is making a hypercall or firmware call. We can handle a
> + * limited number of them (e.g. PSCI) but we can't emulate a true
> + * firmware. This is an abbreviated version of
> + * kvm_smccc_call_handler() in the kernel and the TCG only arm_handle_psci_call().
> + *
> + * In the SplitAccel case we would be transitioning to execute EL2+
> + * under TCG.
> + */
> +static int kvm_arm_handle_hypercall(ARMCPU *cpu,
> + int esr_ec)
> +{
> + CPUARMState *env = &cpu->env;
> + int32_t ret = 0;
> +
> + trace_kvm_hypercall(esr_ec, env->xregs[0]);
> +
> + switch (env->xregs[0]) {
> + case QEMU_PSCI_0_2_FN_PSCI_VERSION:
> + ret = QEMU_PSCI_VERSION_1_1;
> + break;
> + case QEMU_PSCI_0_2_FN_MIGRATE_INFO_TYPE:
> + ret = QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED; /* No trusted OS */
> + break;
> + case QEMU_PSCI_1_0_FN_PSCI_FEATURES:
> + ret = QEMU_PSCI_RET_NOT_SUPPORTED;
> + break;
> + default:
> + qemu_log_mask(LOG_UNIMP, "%s: unhandled hypercall %"PRIx64"\n",
> + __func__, env->xregs[0]);
> + return -1;
> + }
> +
> + env->xregs[0] = ret;
> + return 0;
> +}
> +
> /**
> * kvm_arm_handle_hard_trap:
> * @cpu: ARMCPU
> @@ -1538,6 +1575,13 @@ static int kvm_arm_handle_hard_trap(ARMCPU *cpu,
> switch (esr_ec) {
> case EC_SYSTEMREGISTERTRAP:
> return kvm_arm_handle_sysreg_trap(cpu, esr_iss, elr);
> + case EC_AA32_SVC:
> + case EC_AA32_HVC:
> + case EC_AA32_SMC:
> + case EC_AA64_SVC:
> + case EC_AA64_HVC:
> + case EC_AA64_SMC:
Should we increment $pc for SVC/SMC?
The instruction operation pseudocode [*] is:
preferred_exception_return = ThisInstrAddr(64);
[*]
https://developer.arm.com/documentation/ddi0602/2022-06/Shared-Pseudocode/AArch64-Exceptions?lang=en
> + return kvm_arm_handle_hypercall(cpu, esr_ec);
> default:
> qemu_log_mask(LOG_UNIMP, "%s: unhandled EC: %x/%x/%x/%d\n",
> __func__, esr_ec, esr_iss, esr_iss2, esr_il);
> diff --git a/target/arm/trace-events b/target/arm/trace-events
> index 69bb4d370d..10cdba92a3 100644
> --- a/target/arm/trace-events
> +++ b/target/arm/trace-events
> @@ -15,3 +15,4 @@ arm_gt_update_irq(int timer, int irqstate) "gt_update_irq: timer %d irqstate %d"
> kvm_arm_fixup_msi_route(uint64_t iova, uint64_t gpa) "MSI iova = 0x%"PRIx64" is translated into 0x%"PRIx64
> kvm_sysreg_read(const char *name, uint64_t val) "%s => 0x%" PRIx64
> kvm_sysreg_write(const char *name, uint64_t val) "%s <= 0x%" PRIx64
> +kvm_hypercall(int ec, uint64_t arg0) "%d: %"PRIx64
* Re: [RFC PATCH 10/11] kvm/arm: implement a basic hypercall handler
2025-06-17 16:33 ` [RFC PATCH 10/11] kvm/arm: implement a basic hypercall handler Alex Bennée
2025-08-22 7:12 ` Philippe Mathieu-Daudé
@ 2025-08-22 7:15 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 15+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-08-22 7:15 UTC (permalink / raw)
To: Alex Bennée, qemu-devel
Cc: Cornelia Huck, qemu-arm, Mark Burton, Michael S. Tsirkin,
Alexander Graf, kvm, Peter Maydell, Paolo Bonzini
On 17/6/25 18:33, Alex Bennée wrote:
> For now just deal with the basic version probe we see during startup.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
> target/arm/kvm.c | 44 +++++++++++++++++++++++++++++++++++++++++
> target/arm/trace-events | 1 +
> 2 files changed, 45 insertions(+)
> +/*
> + * The guest is making a hypercall or firmware call. We can handle a
> + * limited number of them (e.g. PSCI) but we can't emulate a true
> + * firmware. This is an abbreviated version of
> + * kvm_smccc_call_handler() in the kernel and the TCG only arm_handle_psci_call().
> + *
> + * In the SplitAccel case we would be transitioning to execute EL2+
> + * under TCG.
> + */
> +static int kvm_arm_handle_hypercall(ARMCPU *cpu,
> + int esr_ec)
> +{
> + CPUARMState *env = &cpu->env;
> + int32_t ret = 0;
> +
> + trace_kvm_hypercall(esr_ec, env->xregs[0]);
> +
Should we make arm_is_psci_call() generic to be able to use it here?
> + switch (env->xregs[0]) {
> + case QEMU_PSCI_0_2_FN_PSCI_VERSION:
> + ret = QEMU_PSCI_VERSION_1_1;
> + break;
> + case QEMU_PSCI_0_2_FN_MIGRATE_INFO_TYPE:
> + ret = QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED; /* No trusted OS */
> + break;
> + case QEMU_PSCI_1_0_FN_PSCI_FEATURES:
> + ret = QEMU_PSCI_RET_NOT_SUPPORTED;
> + break;
> + default:
> + qemu_log_mask(LOG_UNIMP, "%s: unhandled hypercall %"PRIx64"\n",
> + __func__, env->xregs[0]);
> + return -1;
> + }
> +
> + env->xregs[0] = ret;
> + return 0;
> +}
* Re: [RFC PATCH 10/11] kvm/arm: implement a basic hypercall handler
2025-08-22 7:12 ` Philippe Mathieu-Daudé
@ 2025-08-22 7:55 ` Manos Pitsidianakis
0 siblings, 0 replies; 15+ messages in thread
From: Manos Pitsidianakis @ 2025-08-22 7:55 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: Alex Bennée, qemu-devel, Cornelia Huck, qemu-arm,
Mark Burton, Michael S. Tsirkin, Alexander Graf, kvm,
Peter Maydell, Paolo Bonzini, Pierrick Bouvier
On Fri, Aug 22, 2025 at 10:13 AM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
>
> On 17/6/25 18:33, Alex Bennée wrote:
> > For now just deal with the basic version probe we see during startup.
> >
> > Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> > ---
> > target/arm/kvm.c | 44 +++++++++++++++++++++++++++++++++++++++++
> > target/arm/trace-events | 1 +
> > 2 files changed, 45 insertions(+)
> >
> > diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> > index 0a852af126..1280e2c1e8 100644
> > --- a/target/arm/kvm.c
> > +++ b/target/arm/kvm.c
> > @@ -1507,6 +1507,43 @@ static int kvm_arm_handle_sysreg_trap(ARMCPU *cpu,
> > return -1;
> > }
> >
> > +/*
> > + * The guest is making a hypercall or firmware call. We can handle a
> > + * limited number of them (e.g. PSCI) but we can't emulate a true
> > + * firmware. This is an abbreviated version of
> > + * kvm_smccc_call_handler() in the kernel and the TCG only arm_handle_psci_call().
> > + *
> > + * In the SplitAccel case we would be transitioning to execute EL2+
> > + * under TCG.
> > + */
> > +static int kvm_arm_handle_hypercall(ARMCPU *cpu,
> > + int esr_ec)
> > +{
> > + CPUARMState *env = &cpu->env;
> > + int32_t ret = 0;
> > +
> > + trace_kvm_hypercall(esr_ec, env->xregs[0]);
> > +
> > + switch (env->xregs[0]) {
> > + case QEMU_PSCI_0_2_FN_PSCI_VERSION:
> > + ret = QEMU_PSCI_VERSION_1_1;
> > + break;
> > + case QEMU_PSCI_0_2_FN_MIGRATE_INFO_TYPE:
> > + ret = QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED; /* No trusted OS */
> > + break;
> > + case QEMU_PSCI_1_0_FN_PSCI_FEATURES:
> > + ret = QEMU_PSCI_RET_NOT_SUPPORTED;
> > + break;
> > + default:
> > + qemu_log_mask(LOG_UNIMP, "%s: unhandled hypercall %"PRIx64"\n",
> > + __func__, env->xregs[0]);
> > + return -1;
> > + }
> > +
> > + env->xregs[0] = ret;
> > + return 0;
> > +}
> > +
> > /**
> > * kvm_arm_handle_hard_trap:
> > * @cpu: ARMCPU
> > @@ -1538,6 +1575,13 @@ static int kvm_arm_handle_hard_trap(ARMCPU *cpu,
> > switch (esr_ec) {
> > case EC_SYSTEMREGISTERTRAP:
> > return kvm_arm_handle_sysreg_trap(cpu, esr_iss, elr);
> > + case EC_AA32_SVC:
> > + case EC_AA32_HVC:
> > + case EC_AA32_SMC:
> > + case EC_AA64_SVC:
> > + case EC_AA64_HVC:
> > + case EC_AA64_SMC:
>
> Should we increment $pc for SVC/SMC?
> The instruction operation pseudocode [*] is:
>
> preferred_exception_return = ThisInstrAddr(64);
>
Here's what the trusted firmware handler does.
The exception return address is modified by the handler:
https://github.com/ARM-software/arm-trusted-firmware/blob/da6b3a181c03a492ee52182b0466d0b7cc4091dd/bl31/aarch64/runtime_exceptions.S#L456-L480
> * returns:
> * -1: unhandled trap, UNDEF injection into lower EL
> * 0: handled trap, return to the trapping instruction (repeating it)
> * 1: handled trap, return to the next instruction
An SMC-aware trap handler should do the same.
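To make that convention concrete, here is a small standalone sketch of how a
handler's result could select the resume address. The enum and
trap_resume_pc() are hypothetical names invented for illustration; only the
three return values come from the comment quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the three outcomes quoted above. */
enum trap_result {
    TRAP_UNHANDLED = -1,   /* caller should inject UNDEF */
    TRAP_REPEAT = 0,       /* resume at the trapping instruction */
    TRAP_NEXT = 1,         /* resume at the following instruction */
};

/*
 * Compute the PC to resume at. Returns false when the trap was not
 * handled, in which case *resume_pc is left untouched.
 */
static bool trap_resume_pc(enum trap_result res, uint64_t trap_pc,
                           uint64_t *resume_pc)
{
    switch (res) {
    case TRAP_REPEAT:
        *resume_pc = trap_pc;
        return true;
    case TRAP_NEXT:
        *resume_pc = trap_pc + 4; /* A64 instructions are 4 bytes */
        return true;
    default:
        return false;
    }
}
```

In QEMU terms the TRAP_NEXT case would correspond to the env->pc + 4 that the
sysreg handler in patch 09 currently applies unconditionally.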
> [*]
> https://developer.arm.com/documentation/ddi0602/2022-06/Shared-Pseudocode/AArch64-Exceptions?lang=en
>
> > + return kvm_arm_handle_hypercall(cpu, esr_ec);
> > default:
> > qemu_log_mask(LOG_UNIMP, "%s: unhandled EC: %x/%x/%x/%d\n",
> > __func__, esr_ec, esr_iss, esr_iss2, esr_il);
> > diff --git a/target/arm/trace-events b/target/arm/trace-events
> > index 69bb4d370d..10cdba92a3 100644
> > --- a/target/arm/trace-events
> > +++ b/target/arm/trace-events
> > @@ -15,3 +15,4 @@ arm_gt_update_irq(int timer, int irqstate) "gt_update_irq: timer %d irqstate %d"
> > kvm_arm_fixup_msi_route(uint64_t iova, uint64_t gpa) "MSI iova = 0x%"PRIx64" is translated into 0x%"PRIx64
> > kvm_sysreg_read(const char *name, uint64_t val) "%s => 0x%" PRIx64
> > kvm_sysreg_write(const char *name, uint64_t val) "%s <= 0x%" PRIx64
> > +kvm_hypercall(int ec, uint64_t arg0) "%d: %"PRIx64
>
>