* Saving and restoring state of a KVM VM using GICv2 fails
@ 2025-10-10 15:33 Peter Maydell
2025-10-12 17:14 ` Marc Zyngier
0 siblings, 1 reply; 2+ messages in thread
From: Peter Maydell @ 2025-10-10 15:33 UTC (permalink / raw)
To: open list:ARM
Cc: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K. Poulose,
Zenghui Yu
I was testing saving and restoring the state of a KVM VM that
happened to be using GICv2, and I discovered that it doesn't
seem to work. Running the VM works fine, as does the state save,
but when you try to reload the state it fails:
$ /work/test-images/virtv8/runme ./build/arm/qemu-system-aarch64
-enable-kvm -machine gic-version=2 -loadvm gic2
qemu-system-aarch64: Could not set register op0:3 op1:0 crn:0 crm:1
op2:1 to 11011 (is 10011011)
qemu-system-aarch64: error while loading state for instance 0x0 of
device 'cpu': post load hook failed for: cpu, version_id: 22,
minimum_version: 22, ret: -1
This is QEMU saying that it tried to do the KVM_SET_ONE_REG for
ID_PFR1_EL1 to 0x11011, and failed, and that KVM thinks that register's
value is 0x10011011. The difference is that KVM has the GIC field set
(bits [31:28]).
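
As a quick check of the arithmetic, the short sketch below XORs the two values from the error message and extracts bits [31:28]. The helper name gic_field and the constant names are made up for illustration; only the two register values and the field position come from the message above.

```python
# Values copied from the QEMU error message above.
REQUESTED = 0x11011      # value QEMU tried to write with KVM_SET_ONE_REG
CURRENT = 0x10011011     # value KVM reports for ID_PFR1_EL1

# The GIC field occupies bits [31:28] of the register.
GIC_SHIFT = 28
GIC_MASK = 0xF << GIC_SHIFT


def gic_field(value):
    """Return the 4-bit GIC field of an ID_PFR1_EL1 value (hypothetical helper)."""
    return (value & GIC_MASK) >> GIC_SHIFT


# Bits that differ between the two values.
diff = REQUESTED ^ CURRENT

print(hex(diff))              # 0x10000000
print(gic_field(REQUESTED))   # 0
print(gic_field(CURRENT))     # 1
```

The only differing bit lies inside the GIC field, so everything else in the saved value matches what KVM expects.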
Looking at the kernel code, I think this happens because the kernel
only clears out the GIC field of the idreg in kvm_finalize_sys_regs(),
which gets called when the vcpu is first run. So because state save
happens after the vcpu has run for a bit, it sees the value of the
register with the GIC field set to 0, and that's what it writes out
into the saved state data. But the loadvm operation happens
with a fresh new VM which has never been run. So the kernel still
thinks the GIC field in the idreg should be 1, and it fails the
SET_ONE_REG operation which tries to write it to 0.
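
The ordering problem can be illustrated with a toy model. This is only a sketch of the behaviour as I understand it, not kernel code: the class ToyVcpu and its methods are hypothetical, and the rule that any write changing the GIC field is rejected is an assumption standing in for whatever check the kernel actually performs.

```python
GIC_MASK = 0xF << 28  # GIC field, bits [31:28]


class ToyVcpu:
    """Toy model of a vcpu's ID_PFR1_EL1 handling; all names are hypothetical."""

    def __init__(self):
        # A freshly created vcpu still has the GIC field set to 1.
        self.id_pfr1 = 0x10011011
        self.has_run = False

    def set_one_reg(self, value):
        # Assumption: the GIC field is not writable, so a write that
        # changes it is rejected and the register is left untouched.
        if (value ^ self.id_pfr1) & GIC_MASK:
            return -1
        self.id_pfr1 = value
        return 0

    def get_one_reg(self):
        return self.id_pfr1

    def run(self):
        # Stand-in for the finalize step on first run: with GICv2 the
        # GIC field is cleared here, and only here.
        if not self.has_run:
            self.id_pfr1 &= ~GIC_MASK
            self.has_run = True


# Source VM: runs for a while, then its state is saved.
source = ToyVcpu()
source.run()
saved = source.get_one_reg()

# Destination VM: fresh, never run, state is loaded straight away.
dest = ToyVcpu()
ret = dest.set_one_reg(saved)

print(hex(saved))  # 0x11011
print(ret)         # -1
```

In this model the restore only succeeds if the destination has already gone through the finalize step, which a freshly created VM has not.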
This kernel reports itself as
6.14.0-1012-aws #12~24.04.1-Ubuntu SMP Fri Aug 15 00:07:14 UTC 2025
The failure happens for both aarch32 and aarch64 guests.
I haven't tested whether it happens on a host that has only
64-bit EL1 (i.e. where ID_PFR1_EL1 doesn't exist). It may be
that there's some flexibility about writes to ID_AA64PFR0_EL1.GIC
which needs to also be permitted for ID_PFR1_EL1.GIC.
thanks
-- PMM
* Re: Saving and restoring state of a KVM VM using GICv2 fails
From: Marc Zyngier @ 2025-10-12 17:14 UTC (permalink / raw)
To: Peter Maydell
Cc: open list:ARM, Oliver Upton, Joey Gouly, Suzuki K. Poulose,
Zenghui Yu
On Fri, 10 Oct 2025 16:33:45 +0100,
Peter Maydell <peter.maydell@linaro.org> wrote:
>
> I was testing saving and restoring the state of a KVM VM that
> happened to be using GICv2, and I discovered that it doesn't
> seem to work. Running the VM works fine, as does the state save,
> but when you try to reload the state it fails:
>
> $ /work/test-images/virtv8/runme ./build/arm/qemu-system-aarch64
> -enable-kvm -machine gic-version=2 -loadvm gic2
> qemu-system-aarch64: Could not set register op0:3 op1:0 crn:0 crm:1
> op2:1 to 11011 (is 10011011)
> qemu-system-aarch64: error while loading state for instance 0x0 of
> device 'cpu': post load hook failed for: cpu, version_id: 22,
> minimum_version: 22, ret: -1
>
> This is QEMU saying that it tried to do the KVM_SET_ONE_REG for
> ID_PFR1_EL1 to 0x11011, and failed, and that KVM thinks that register's
> value is 0x10011011. The difference is that KVM has the GIC field set
> (bits [31:28]).
>
> Looking at the kernel code, I think this happens because the kernel
> only clears out the GIC field of the idreg in kvm_finalize_sys_regs(),
> which gets called when the vcpu is first run. So because state save
> happens after the vcpu has run for a bit, it sees the value of the
> register with the GIC field set to 0, and that's what it writes out
> into the saved state data. But the loadvm operation happens
> with a fresh new VM which has never been run. So the kernel still
> thinks the GIC field in the idreg should be 1, and it fails the
> SET_ONE_REG operation which tries to write it to 0.
Right, this fires on upstream as well. We allow writes to
ID_AA64PFR0_EL1.GIC, but we don't let ID_PFR1_EL1.GIC be written,
while we otherwise insist on keeping them in sync.
I think there are a few changes that need making:
- let ID_PFR1_EL1.GIC be written to
- manage ID_{AA64PFR0,PFR1}_EL1.GIC from the point where we create the
  in-kernel GIC
- reserve the 'finalize' treatment for the case where we don't have an
  in-kernel GIC
> This kernel reports itself as
> 6.14.0-1012-aws #12~24.04.1-Ubuntu SMP Fri Aug 15 00:07:14 UTC 2025
Yup, that's consistent with the above being introduced in 6.12.
> The failure happens for both aarch32 and aarch64 guests.
> I haven't tested whether it happens on a host that has only
> 64-bit EL1 (i.e. where ID_PFR1_EL1 doesn't exist). It may be
> that there's some flexibility about writes to ID_AA64PFR0_EL1.GIC
> which needs to also be permitted for ID_PFR1_EL1.GIC.
We treat all the non-AA64 idregs as RAZ/WI when the host is not
AArch32 capable, so at least that particular aspect should be OK (and
GICv2, AArch64-only machines should be relatively rare...). But the
32/64bit feature matching has been off for some time, and we probably
have more of those lurking.
Anyway, I'll post the fixes shortly once I've written commit messages.
Thanks,
M.
--
Jazz isn't dead. It just smells funny.