* [PATCH] KVM: arm64: Discard PC update state on vcpu reset
@ 2026-03-12 14:08 Marc Zyngier
2026-03-13 14:30 ` Joey Gouly
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Marc Zyngier @ 2026-03-12 14:08 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu, stable
Our vcpu reset suffers from a particularly interesting flaw, as it
does not correctly deal with state that will have an effect on the
execution flow out of reset.
Take the following completely random example, never seen in the wild
and that never resulted in a couple of sleepless nights: /s
- vcpu-A issues a PSCI_CPU_OFF using the SMC conduit
- SMC being a trapped instruction (as opposed to HVC which is always
normally executed), we annotate the vcpu as needing to skip the
next instruction, which is the SMC itself
- vcpu-A is now safely off
- vcpu-B issues a PSCI_CPU_ON for vcpu-A, providing a starting PC
- vcpu-A gets reset, gets the new PC, and is sent on its merry way
- right at the point of entering the guest, we notice that a PC
increment is pending (remember the earlier SMC?)
- vcpu-A skips its first instruction...
What could possibly go wrong?
Well, I'm glad you asked. For pKVM as an NV guest, that first instruction
is extremely significant, as it indicates whether the CPU is booting
or resuming. Having skipped that instruction, nothing makes any sense
anymore, and CPU hotplugging fails.
This is all caused by the decoupling of PC update from the handling
of an exception that triggers such update, making it non-obvious
what affects what when.
Fix this train wreck by discarding all the PC-affecting state on
vcpu reset.
Fixes: f5e30680616ab ("KVM: arm64: Move __adjust_pc out of line")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
---
arch/arm64/kvm/reset.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 959532422d3a3..b963fd975aaca 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -247,6 +247,20 @@ void kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 			kvm_vcpu_set_be(vcpu);

 		*vcpu_pc(vcpu) = target_pc;
+
+		/*
+		 * We may come from a state where either a PC update was
+		 * pending (SMC call resulting in PC being incremented to
+		 * skip the SMC) or a pending exception. Make sure we get
+		 * rid of all that, as this cannot be valid out of reset.
+		 *
+		 * Note that clearing the exception mask also clears PC
+		 * updates, but that's an implementation detail, and we
+		 * really want to make it explicit.
+		 */
+		vcpu_clear_flag(vcpu, PENDING_EXCEPTION);
+		vcpu_clear_flag(vcpu, EXCEPT_MASK);
+		vcpu_clear_flag(vcpu, INCREMENT_PC);
 		vcpu_set_reg(vcpu, 0, reset_state.r0);
 	}

--
2.47.3
* Re: [PATCH] KVM: arm64: Discard PC update state on vcpu reset
From: Joey Gouly @ 2026-03-13 14:30 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, stable
On Thu, Mar 12, 2026 at 02:08:50PM +0000, Marc Zyngier wrote:
> Our vcpu reset suffers from a particularly interesting flaw, as it
> does not correctly deal with state that will have an effect on the
> execution flow out of reset.
>
> Take the following completely random example, never seen in the wild
> and that never resulted in a couple of sleepless nights: /s
>
> - vcpu-A issues a PSCI_CPU_OFF using the SMC conduit
>
> - SMC being a trapped instruction (as opposed to HVC which is always
> normally executed), we annotate the vcpu as needing to skip the
> next instruction, which is the SMC itself
>
> - vcpu-A is now safely off
>
> - vcpu-B issues a PSCI_CPU_ON for vcpu-A, providing a starting PC
>
> - vcpu-A gets reset, gets the new PC, and is sent on its merry way
>
> - right at the point of entering the guest, we notice that a PC
> increment is pending (remember the earlier SMC?)
>
> - vcpu-A skips its first instruction...
>
> What could possibly go wrong?
>
> Well, I'm glad you asked. For pKVM as an NV guest, that first instruction
> is extremely significant, as it indicates whether the CPU is booting
> or resuming. Having skipped that instruction, nothing makes any sense
> anymore, and CPU hotplugging fails.
Would the normal method of offlining/onlining via sysfs also be affected?
>
> This is all caused by the decoupling of PC update from the handling
> of an exception that triggers such update, making it non-obvious
> what affects what when.
>
> Fix this train wreck by discarding all the PC-affecting state on
> vcpu reset.
Good job on tracking it down.. makes you wonder why the DSB "fixed" things!
>
> Fixes: f5e30680616ab ("KVM: arm64: Move __adjust_pc out of line")
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Cc: stable@vger.kernel.org
> ---
> arch/arm64/kvm/reset.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index 959532422d3a3..b963fd975aaca 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -247,6 +247,20 @@ void kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> kvm_vcpu_set_be(vcpu);
>
> *vcpu_pc(vcpu) = target_pc;
> +
> + /*
> + * We may come from a state where either a PC update was
> + * pending (SMC call resulting in PC being increpented to
incremented
> + * skip the SMC) or a pending exception. Make sure we get
> + * rid of all that, as this cannot be valid out of reset.
> + *
> + * Note that clearing the exception mask also clears PC
> + * updates, but that's an implementation detail, and we
> + * really want to make it explicit.
> + */
> + vcpu_clear_flag(vcpu, PENDING_EXCEPTION);
> + vcpu_clear_flag(vcpu, EXCEPT_MASK);
> + vcpu_clear_flag(vcpu, INCREMENT_PC);
> vcpu_set_reg(vcpu, 0, reset_state.r0);
> }
>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Thanks,
Joey
* Re: [PATCH] KVM: arm64: Discard PC update state on vcpu reset
From: Suzuki K Poulose @ 2026-03-13 16:16 UTC (permalink / raw)
To: Marc Zyngier, kvmarm, linux-arm-kernel
Cc: Joey Gouly, Oliver Upton, Zenghui Yu, stable
On 12/03/2026 14:08, Marc Zyngier wrote:
> Our vcpu reset suffers from a particularly interesting flaw, as it
> does not correctly deal with state that will have an effect on the
> execution flow out of reset.
>
> Take the following completely random example, never seen in the wild
> and that never resulted in a couple of sleepless nights: /s
>
> - vcpu-A issues a PSCI_CPU_OFF using the SMC conduit
>
> - SMC being a trapped instruction (as opposed to HVC which is always
> normally executed), we annotate the vcpu as needing to skip the
> next instruction, which is the SMC itself
>
> - vcpu-A is now safely off
>
> - vcpu-B issues a PSCI_CPU_ON for vcpu-A, providing a starting PC
>
> - vcpu-A gets reset, gets the new PC, and is sent on its merry way
>
> - right at the point of entering the guest, we notice that a PC
> increment is pending (remember the earlier SMC?)
>
> - vcpu-A skips its first instruction...
>
> What could possibly go wrong?
>
> Well, I'm glad you asked. For pKVM as an NV guest, that first instruction
> is extremely significant, as it indicates whether the CPU is booting
> or resuming. Having skipped that instruction, nothing makes any sense
> anymore, and CPU hotplugging fails.
>
> This is all caused by the decoupling of PC update from the handling
> of an exception that triggers such update, making it non-obvious
> what affects what when.
>
> Fix this train wreck by discarding all the PC-affecting state on
> vcpu reset.
>
> Fixes: f5e30680616ab ("KVM: arm64: Move __adjust_pc out of line")
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Cc: stable@vger.kernel.org
> ---
> arch/arm64/kvm/reset.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index 959532422d3a3..b963fd975aaca 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -247,6 +247,20 @@ void kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> kvm_vcpu_set_be(vcpu);
>
> *vcpu_pc(vcpu) = target_pc;
> +
> + /*
> + * We may come from a state where either a PC update was
> + * pending (SMC call resulting in PC being incremented to
> + * skip the SMC) or a pending exception. Make sure we get
> + * rid of all that, as this cannot be valid out of reset.
> + *
> + * Note that clearing the exception mask also clears PC
> + * updates, but that's an implementation detail, and we
> + * really want to make it explicit.
> + */
> + vcpu_clear_flag(vcpu, PENDING_EXCEPTION);
> + vcpu_clear_flag(vcpu, EXCEPT_MASK);
> + vcpu_clear_flag(vcpu, INCREMENT_PC);
> vcpu_set_reg(vcpu, 0, reset_state.r0);
> }
Wow! That's it, finally! Glad you found the root cause.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
* Re: [PATCH] KVM: arm64: Discard PC update state on vcpu reset
From: Marc Zyngier @ 2026-03-15 15:08 UTC (permalink / raw)
To: Joey Gouly
Cc: kvmarm, linux-arm-kernel, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, stable
On Fri, 13 Mar 2026 14:30:19 +0000,
Joey Gouly <joey.gouly@arm.com> wrote:
>
> On Thu, Mar 12, 2026 at 02:08:50PM +0000, Marc Zyngier wrote:
> > Our vcpu reset suffers from a particularly interesting flaw, as it
> > does not correctly deal with state that will have an effect on the
> > execution flow out of reset.
> >
> > Take the following completely random example, never seen in the wild
> > and that never resulted in a couple of sleepless nights: /s
> >
> > - vcpu-A issues a PSCI_CPU_OFF using the SMC conduit
> >
> > - SMC being a trapped instruction (as opposed to HVC which is always
> > normally executed), we annotate the vcpu as needing to skip the
> > next instruction, which is the SMC itself
> >
> > - vcpu-A is now safely off
> >
> > - vcpu-B issues a PSCI_CPU_ON for vcpu-A, providing a starting PC
> >
> > - vcpu-A gets reset, gets the new PC, and is sent on its merry way
> >
> > - right at the point of entering the guest, we notice that a PC
> > increment is pending (remember the earlier SMC?)
> >
> > - vcpu-A skips its first instruction...
> >
> > What could possibly go wrong?
> >
> > Well, I'm glad you asked. For pKVM as an NV guest, that first instruction
> > is extremely significant, as it indicates whether the CPU is booting
> > or resuming. Having skipped that instruction, nothing makes any sense
> > anymore, and CPU hotplugging fails.
>
> Would the normal method of offlining/onlining via sysfs also be affected?
I'm not sure there's any other. That's how I tested it, anyway, since
you can't do a late onlining with pKVM.
> >
> > This is all caused by the decoupling of PC update from the handling
> > of an exception that triggers such update, making it non-obvious
> > what affects what when.
> >
> > Fix this train wreck by discarding all the PC-affecting state on
> > vcpu reset.
>
> Good job on tracking it down.. makes you wonder why the DSB "fixed" things!
Because it was exactly at the correct spot to hide the crap. But NOP,
or even UDF would have had the same effect.
>
> >
> > Fixes: f5e30680616ab ("KVM: arm64: Move __adjust_pc out of line")
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > Cc: stable@vger.kernel.org
> > ---
> > arch/arm64/kvm/reset.c | 14 ++++++++++++++
> > 1 file changed, 14 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > index 959532422d3a3..b963fd975aaca 100644
> > --- a/arch/arm64/kvm/reset.c
> > +++ b/arch/arm64/kvm/reset.c
> > @@ -247,6 +247,20 @@ void kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> > kvm_vcpu_set_be(vcpu);
> >
> > *vcpu_pc(vcpu) = target_pc;
> > +
> > + /*
> > + * We may come from a state where either a PC update was
> > + * pending (SMC call resulting in PC being increpented to
> incremented
> > + * skip the SMC) or a pending exception. Make sure we get
> > + * rid of all that, as this cannot be valid out of reset.
> > + *
> > + * Note that clearing the exception mask also clears PC
> > + * updates, but that's an implementation detail, and we
> > + * really want to make it explicit.
> > + */
> > + vcpu_clear_flag(vcpu, PENDING_EXCEPTION);
> > + vcpu_clear_flag(vcpu, EXCEPT_MASK);
> > + vcpu_clear_flag(vcpu, INCREMENT_PC);
> > vcpu_set_reg(vcpu, 0, reset_state.r0);
> > }
> >
>
> Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Thanks!
M.
--
Jazz isn't dead. It just smells funny.
* Re: [PATCH] KVM: arm64: Discard PC update state on vcpu reset
From: Marc Zyngier @ 2026-03-15 15:11 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, Marc Zyngier
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu, stable
On Thu, 12 Mar 2026 14:08:50 +0000, Marc Zyngier wrote:
> Our vcpu reset suffers from a particularly interesting flaw, as it
> does not correctly deal with state that will have an effect on the
> execution flow out of reset.
>
> Take the following completely random example, never seen in the wild
> and that never resulted in a couple of sleepless nights: /s
>
> [...]
Applied to fixes, thanks!
[1/1] KVM: arm64: Discard PC update state on vcpu reset
commit: 1744a6ef48b9a48f017e3e1a0d05de0a6978396e
Cheers,
M.
--
Without deviation from the norm, progress is not possible.