* [PATCH 1/2] KVM: arm64: Initialize HCR_EL2.E2H early
2025-02-27 18:05 [PATCH 0/2] KVM: arm64: PSCI relay fixes Mark Rutland
@ 2025-02-27 18:05 ` Mark Rutland
2025-02-28 9:29 ` Leo Yan
2025-02-27 18:05 ` [PATCH 2/2] KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu() Mark Rutland
2025-03-02 8:41 ` [PATCH 0/2] KVM: arm64: PSCI relay fixes Marc Zyngier
2 siblings, 1 reply; 11+ messages in thread
From: Mark Rutland @ 2025-02-27 18:05 UTC (permalink / raw)
To: linux-arm-kernel
Cc: ahmed.genidi, ben.horgan, catalin.marinas, kvmarm, leo.yan,
mark.rutland, maz, oliver.upton, will
On CPUs without FEAT_E2H0, HCR_EL2.E2H is RES1, but may reset to an
UNKNOWN value out of reset and consequently may not read as 1 unless it
has been explicitly initialized.
We handled this for the head.S boot code in commits:
3944382fa6f22b54 ("arm64: Treat HCR_EL2.E2H as RES1 when ID_AA64MMFR4_EL1.E2H0 is negative")
b3320142f3db9b3f ("arm64: Fix early handling of FEAT_E2H0 not being implemented")
Unfortunately, we forgot to apply a similar fix to the KVM PSCI entry
points used when relaying CPU_ON, CPU_SUSPEND, and SYSTEM SUSPEND. When
KVM is entered via these entry points, the value of HCR_EL2.E2H may be
consumed before it has been initialized (e.g. by the 'init_el2_state'
macro).
Initialize HCR_EL2.E2H early in these paths such that it can be consumed
reliably. The existing code in head.S is factored out into a new
'init_el2_hcr' macro, and this is used in the __kvm_hyp_init_cpu()
function common to all the relevant PSCI entry points.
For clarity, I've tweaked the assembly used to check whether
ID_AA64MMFR4_EL1.E2H0 is negative. The bitfield is extracted as a signed
value, and this is checked with a signed-less-than (LT) comparison.
As the hyp code will reconfigure HCR_EL2 later in ___kvm_hyp_init(), all
bits other than E2H are initialized to zero in __kvm_hyp_init_cpu().
Fixes: 3944382fa6f22b54 ("arm64: Treat HCR_EL2.E2H as RES1 when ID_AA64MMFR4_EL1.E2H0 is negative")
Fixes: b3320142f3db9b3f ("arm64: Fix early handling of FEAT_E2H0 not being implemented")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ahmed Genidi <ahmed.genidi@arm.com>
Cc: Ben Horgan <ben.horgan@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/el2_setup.h | 26 ++++++++++++++++++++++++++
arch/arm64/kernel/head.S | 19 +------------------
arch/arm64/kvm/hyp/nvhe/hyp-init.S | 8 +++++++-
3 files changed, 34 insertions(+), 19 deletions(-)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index 25e1626517500..bc8ebd55788ac 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -16,6 +16,32 @@
#include <asm/sysreg.h>
#include <linux/irqchip/arm-gic-v3.h>
+.macro init_el2_hcr val
+ mov_q x0, \val
+
+ /*
+ * Compliant CPUs advertise their VHE-onlyness with
+ * ID_AA64MMFR4_EL1.E2H0 < 0. On such CPUs HCR_EL2.E2H is RES1, but it
+ * can reset into an UNKNOWN state and might not read as 1 until it has
+ * been initialized explicitly.
+ *
+ * Fruity CPUs seem to have HCR_EL2.E2H set to RAO/WI, but
+ * don't advertise it (they predate this relaxation).
+ *
+ * Initalize HCR_EL2.E2H so that later code can rely upon HCR_EL2.E2H
+ * indicating whether the CPU is running in E2H mode.
+ */
+ mrs_s x1, SYS_ID_AA64MMFR4_EL1
+ sbfx x1, x1, #ID_AA64MMFR4_EL1_E2H0_SHIFT, #ID_AA64MMFR4_EL1_E2H0_WIDTH
+ cmp x1, #0
+ b.lt .LnVHE_\@
+
+ orr x0, x0, #HCR_E2H
+.LnVHE_\@:
+ msr hcr_el2, x0
+ isb
+.endm
+
.macro __init_el2_sctlr
mov_q x0, INIT_SCTLR_EL2_MMU_OFF
msr sctlr_el2, x0
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 5ab1970ee5436..2d56459d6c94c 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -298,25 +298,8 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
msr sctlr_el2, x0
isb
0:
- mov_q x0, HCR_HOST_NVHE_FLAGS
-
- /*
- * Compliant CPUs advertise their VHE-onlyness with
- * ID_AA64MMFR4_EL1.E2H0 < 0. HCR_EL2.E2H can be
- * RES1 in that case. Publish the E2H bit early so that
- * it can be picked up by the init_el2_state macro.
- *
- * Fruity CPUs seem to have HCR_EL2.E2H set to RAO/WI, but
- * don't advertise it (they predate this relaxation).
- */
- mrs_s x1, SYS_ID_AA64MMFR4_EL1
- tbz x1, #(ID_AA64MMFR4_EL1_E2H0_SHIFT + ID_AA64MMFR4_EL1_E2H0_WIDTH - 1), 1f
-
- orr x0, x0, #HCR_E2H
-1:
- msr hcr_el2, x0
- isb
+ init_el2_hcr HCR_HOST_NVHE_FLAGS
init_el2_state
/* Hypervisor stub */
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index fc18662260676..3fb5504a7d7fc 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
@@ -73,8 +73,12 @@ __do_hyp_init:
eret
SYM_CODE_END(__kvm_hyp_init)
+/*
+ * Initialize EL2 CPU state to sane values.
+ *
+ * HCR_EL2.E2H must have been initialized already.
+ */
SYM_CODE_START_LOCAL(__kvm_init_el2_state)
- /* Initialize EL2 CPU state to sane values. */
init_el2_state // Clobbers x0..x2
finalise_el2_state
ret
@@ -206,6 +210,8 @@ SYM_CODE_START_LOCAL(__kvm_hyp_init_cpu)
2: msr SPsel, #1 // We want to use SP_EL{1,2}
+ init_el2_hcr 0
+
bl __kvm_init_el2_state
__init_el2_nvhe_prepare_eret
--
2.30.2
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH 1/2] KVM: arm64: Initialize HCR_EL2.E2H early
2025-02-27 18:05 ` [PATCH 1/2] KVM: arm64: Initialize HCR_EL2.E2H early Mark Rutland
@ 2025-02-28 9:29 ` Leo Yan
2025-02-28 9:43 ` Marc Zyngier
0 siblings, 1 reply; 11+ messages in thread
From: Leo Yan @ 2025-02-28 9:29 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ahmed.genidi, ben.horgan, catalin.marinas,
kvmarm, maz, oliver.upton, will
Hi Mark,
On Thu, Feb 27, 2025 at 06:05:25PM +0000, Mark Rutland wrote:
[...]
> +.macro init_el2_hcr val
> + mov_q x0, \val
> +
> + /*
> + * Compliant CPUs advertise their VHE-onlyness with
> + * ID_AA64MMFR4_EL1.E2H0 < 0. On such CPUs HCR_EL2.E2H is RES1, but it
> + * can reset into an UNKNOWN state and might not read as 1 until it has
> + * been initialized explicitly.
For ID_AA64MMFR4_EL1.E2H0 < 0 case, the code actually clears the
HCR_EL2.E2H bit.
Hence, the comment should be corrected as: "... it can reset into an
UNKNOWN state and might not read as 0 until it has been initialized
explicitly".
> + *
> + * Fruity CPUs seem to have HCR_EL2.E2H set to RAO/WI, but
> + * don't advertise it (they predate this relaxation).
> + *
> + * Initalize HCR_EL2.E2H so that later code can rely upon HCR_EL2.E2H
> + * indicating whether the CPU is running in E2H mode.
> + */
I think it is even better to clear the HCR_E2H bit first. This can
avoid any dependency on the passed parameter 'val'.
bic x0, x0, #HCR_E2H
> + mrs_s x1, SYS_ID_AA64MMFR4_EL1
> + sbfx x1, x1, #ID_AA64MMFR4_EL1_E2H0_SHIFT, #ID_AA64MMFR4_EL1_E2H0_WIDTH
> + cmp x1, #0
> + b.lt .LnVHE_\@
> +
> + orr x0, x0, #HCR_E2H
> +.LnVHE_\@:
> + msr hcr_el2, x0
> + isb
> +.endm
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH 1/2] KVM: arm64: Initialize HCR_EL2.E2H early
2025-02-28 9:29 ` Leo Yan
@ 2025-02-28 9:43 ` Marc Zyngier
2025-02-28 9:52 ` Mark Rutland
2025-02-28 10:14 ` Leo Yan
0 siblings, 2 replies; 11+ messages in thread
From: Marc Zyngier @ 2025-02-28 9:43 UTC (permalink / raw)
To: Leo Yan
Cc: Mark Rutland, linux-arm-kernel, ahmed.genidi, ben.horgan,
catalin.marinas, kvmarm, oliver.upton, will
On Fri, 28 Feb 2025 09:29:55 +0000,
Leo Yan <leo.yan@arm.com> wrote:
>
> Hi Mark,
>
> On Thu, Feb 27, 2025 at 06:05:25PM +0000, Mark Rutland wrote:
>
> [...]
>
> > +.macro init_el2_hcr val
> > + mov_q x0, \val
> > +
> > + /*
> > + * Compliant CPUs advertise their VHE-onlyness with
> > + * ID_AA64MMFR4_EL1.E2H0 < 0. On such CPUs HCR_EL2.E2H is RES1, but it
> > + * can reset into an UNKNOWN state and might not read as 1 until it has
> > + * been initialized explicitly.
>
> For ID_AA64MMFR4_EL1.E2H0 < 0 case, the code actually clears the
> HCR_EL2.E2H bit.
>
> Hence, the comment should be corrected as: "... it can reset into an
> UNKNOWN state and might not read as 0 until it has been initialized
> explicitly".
The comment is just fine. It is the code that is wrong, as it avoids
setting E2H when E2H0 < 0 while we want the exact opposite behaviour.
As a result, 'b.lt' really should be a 'b.ge'. Or the original code
kept as is.
>
> > + *
> > + * Fruity CPUs seem to have HCR_EL2.E2H set to RAO/WI, but
> > + * don't advertise it (they predate this relaxation).
> > + *
> > + * Initalize HCR_EL2.E2H so that later code can rely upon HCR_EL2.E2H
> > + * indicating whether the CPU is running in E2H mode.
> > + */
>
> I think it is even better to clear the HCR_E2H bit first. This can
> avoid any dependency on the passed parameter 'val'.
What are you trying to avoid? A random value passed as a parameter to
the macro?
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] KVM: arm64: Initialize HCR_EL2.E2H early
2025-02-28 9:43 ` Marc Zyngier
@ 2025-02-28 9:52 ` Mark Rutland
2025-02-28 10:20 ` Marc Zyngier
2025-02-28 10:14 ` Leo Yan
1 sibling, 1 reply; 11+ messages in thread
From: Mark Rutland @ 2025-02-28 9:52 UTC (permalink / raw)
To: Marc Zyngier
Cc: Leo Yan, linux-arm-kernel, ahmed.genidi, ben.horgan,
catalin.marinas, kvmarm, oliver.upton, will
On Fri, Feb 28, 2025 at 09:43:20AM +0000, Marc Zyngier wrote:
> On Fri, 28 Feb 2025 09:29:55 +0000,
> Leo Yan <leo.yan@arm.com> wrote:
> >
> > Hi Mark,
> >
> > On Thu, Feb 27, 2025 at 06:05:25PM +0000, Mark Rutland wrote:
> >
> > [...]
> >
> > > +.macro init_el2_hcr val
> > > + mov_q x0, \val
> > > +
> > > + /*
> > > + * Compliant CPUs advertise their VHE-onlyness with
> > > + * ID_AA64MMFR4_EL1.E2H0 < 0. On such CPUs HCR_EL2.E2H is RES1, but it
> > > + * can reset into an UNKNOWN state and might not read as 1 until it has
> > > + * been initialized explicitly.
> >
> > For ID_AA64MMFR4_EL1.E2H0 < 0 case, the code actually clears the
> > HCR_EL2.E2H bit.
> >
> > Hence, the comment should be corrected as: "... it can reset into an
> > UNKNOWN state and might not read as 0 until it has been initialized
> > explicitly".
>
> The comment is just fine. It is the code that is wrong, as it avoids
> setting E2H when E2H0 < 0 while we want the exact opposite behaviour.
>
> As a result, 'b.lt' really should be a 'b.ge'. Or the original code
> kept as is.
Ugh, yes. I got confused and got the condition backwards.
Either works. Using 'b.ge' is closer to my intention -- I found the
'tbz' of the sign bit somewhat surprising and that needed a longer line
after the lable name changed.
Would you like me to respin, or would you be hapy to fix up when
applying?
Mark.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] KVM: arm64: Initialize HCR_EL2.E2H early
2025-02-28 9:52 ` Mark Rutland
@ 2025-02-28 10:20 ` Marc Zyngier
2025-02-28 11:13 ` Mark Rutland
0 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2025-02-28 10:20 UTC (permalink / raw)
To: Mark Rutland
Cc: Leo Yan, linux-arm-kernel, ahmed.genidi, ben.horgan,
catalin.marinas, kvmarm, oliver.upton, will
On Fri, 28 Feb 2025 09:52:50 +0000,
Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Fri, Feb 28, 2025 at 09:43:20AM +0000, Marc Zyngier wrote:
> > On Fri, 28 Feb 2025 09:29:55 +0000,
> > Leo Yan <leo.yan@arm.com> wrote:
> > >
> > > Hi Mark,
> > >
> > > On Thu, Feb 27, 2025 at 06:05:25PM +0000, Mark Rutland wrote:
> > >
> > > [...]
> > >
> > > > +.macro init_el2_hcr val
> > > > + mov_q x0, \val
> > > > +
> > > > + /*
> > > > + * Compliant CPUs advertise their VHE-onlyness with
> > > > + * ID_AA64MMFR4_EL1.E2H0 < 0. On such CPUs HCR_EL2.E2H is RES1, but it
> > > > + * can reset into an UNKNOWN state and might not read as 1 until it has
> > > > + * been initialized explicitly.
> > >
> > > For ID_AA64MMFR4_EL1.E2H0 < 0 case, the code actually clears the
> > > HCR_EL2.E2H bit.
> > >
> > > Hence, the comment should be corrected as: "... it can reset into an
> > > UNKNOWN state and might not read as 0 until it has been initialized
> > > explicitly".
> >
> > The comment is just fine. It is the code that is wrong, as it avoids
> > setting E2H when E2H0 < 0 while we want the exact opposite behaviour.
> >
> > As a result, 'b.lt' really should be a 'b.ge'. Or the original code
> > kept as is.
>
> Ugh, yes. I got confused and got the condition backwards.
>
> Either works. Using 'b.ge' is closer to my intention -- I found the
> 'tbz' of the sign bit somewhat surprising and that needed a longer line
> after the lable name changed.
>
> Would you like me to respin, or would you be hapy to fix up when
> applying?
I can fix it on the fly, but it needs retesting, as I don't understand
how things could work in this state.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] KVM: arm64: Initialize HCR_EL2.E2H early
2025-02-28 10:20 ` Marc Zyngier
@ 2025-02-28 11:13 ` Mark Rutland
0 siblings, 0 replies; 11+ messages in thread
From: Mark Rutland @ 2025-02-28 11:13 UTC (permalink / raw)
To: Marc Zyngier
Cc: Leo Yan, linux-arm-kernel, ahmed.genidi, ben.horgan,
catalin.marinas, kvmarm, oliver.upton, will
On Fri, Feb 28, 2025 at 10:20:38AM +0000, Marc Zyngier wrote:
> On Fri, 28 Feb 2025 09:52:50 +0000,
> Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Fri, Feb 28, 2025 at 09:43:20AM +0000, Marc Zyngier wrote:
> > > On Fri, 28 Feb 2025 09:29:55 +0000,
> > > Leo Yan <leo.yan@arm.com> wrote:
> > > >
> > > > Hi Mark,
> > > >
> > > > On Thu, Feb 27, 2025 at 06:05:25PM +0000, Mark Rutland wrote:
> > > >
> > > > [...]
> > > >
> > > > > +.macro init_el2_hcr val
> > > > > + mov_q x0, \val
> > > > > +
> > > > > + /*
> > > > > + * Compliant CPUs advertise their VHE-onlyness with
> > > > > + * ID_AA64MMFR4_EL1.E2H0 < 0. On such CPUs HCR_EL2.E2H is RES1, but it
> > > > > + * can reset into an UNKNOWN state and might not read as 1 until it has
> > > > > + * been initialized explicitly.
> > > >
> > > > For ID_AA64MMFR4_EL1.E2H0 < 0 case, the code actually clears the
> > > > HCR_EL2.E2H bit.
> > > >
> > > > Hence, the comment should be corrected as: "... it can reset into an
> > > > UNKNOWN state and might not read as 0 until it has been initialized
> > > > explicitly".
> > >
> > > The comment is just fine. It is the code that is wrong, as it avoids
> > > setting E2H when E2H0 < 0 while we want the exact opposite behaviour.
> > >
> > > As a result, 'b.lt' really should be a 'b.ge'. Or the original code
> > > kept as is.
> >
> > Ugh, yes. I got confused and got the condition backwards.
> >
> > Either works. Using 'b.ge' is closer to my intention -- I found the
> > 'tbz' of the sign bit somewhat surprising and that needed a longer line
> > after the lable name changed.
> >
> > Would you like me to respin, or would you be hapy to fix up when
> > applying?
>
> I can fix it on the fly, but it needs retesting, as I don't understand
> how things could work in this state.
This happened to work by virtue of coincidence :/
Critically, I have not tested this on a CPU where HCR_EL2.E2H is
writeable but one polarity has no effect, as I don't have such a CPU to
hand. IIUC you tested that with hVHE under NV per commit:
b3320142f3db9b3f ("arm64: Fix early handling of FEAT_E2H0 not being implemented")
... but I don't currently have a good setup to test that configuration.
In other cases, this largely falls out in the wash, e.g.
* On a CPU without E2H, where the HCR_EL2.E2H bit is implemented as
RAZ/WI, the bit always reads as 0. Trying to set the bit has no
effect. Later reads see 0.
Hence this case happens to work.
* On a CPU with E2H and without FEAT_E2H0, where the HCR_EL2.E2H bit is
implemented as RAO/WI, the bit always reads as 1. Trying to clear the
bit has no effect. Later reads see 1.
Hence this case happens to work.
* On a CPU with E2H and with FEAT_E2H0, there the HCR_EL2.E2H bit is
implemented and has an effect, writing to the bit moves the CPU into
E2H mode.
The early boot code handles this the same as FEAT_E2H0 being absent,
and so that happens to work. I haven't dug into how HCR_EL2 gets
properly initialized later by KVM, but testing seems to indicate that
this works.
Mark.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] KVM: arm64: Initialize HCR_EL2.E2H early
2025-02-28 9:43 ` Marc Zyngier
2025-02-28 9:52 ` Mark Rutland
@ 2025-02-28 10:14 ` Leo Yan
1 sibling, 0 replies; 11+ messages in thread
From: Leo Yan @ 2025-02-28 10:14 UTC (permalink / raw)
To: Marc Zyngier
Cc: Mark Rutland, linux-arm-kernel, ahmed.genidi, ben.horgan,
catalin.marinas, kvmarm, oliver.upton, will
On Fri, Feb 28, 2025 at 09:43:20AM +0000, Marc Zyngier wrote:
[...]
> On Fri, 28 Feb 2025 09:29:55 +0000,
> Leo Yan <leo.yan@arm.com> wrote:
> >
> > Hi Mark,
> >
> > On Thu, Feb 27, 2025 at 06:05:25PM +0000, Mark Rutland wrote:
> >
> > [...]
> >
> > > +.macro init_el2_hcr val
> > > + mov_q x0, \val
> > > +
> > > + /*
> > > + * Compliant CPUs advertise their VHE-onlyness with
> > > + * ID_AA64MMFR4_EL1.E2H0 < 0. On such CPUs HCR_EL2.E2H is RES1, but it
> > > + * can reset into an UNKNOWN state and might not read as 1 until it has
> > > + * been initialized explicitly.
> >
> > For ID_AA64MMFR4_EL1.E2H0 < 0 case, the code actually clears the
> > HCR_EL2.E2H bit.
> >
> > Hence, the comment should be corrected as: "... it can reset into an
> > UNKNOWN state and might not read as 0 until it has been initialized
> > explicitly".
>
> The comment is just fine. It is the code that is wrong, as it avoids
> setting E2H when E2H0 < 0 while we want the exact opposite behaviour.
>
> As a result, 'b.lt' really should be a 'b.ge'. Or the original code
> kept as is.
Thanks for correcting.
> > > + *
> > > + * Fruity CPUs seem to have HCR_EL2.E2H set to RAO/WI, but
> > > + * don't advertise it (they predate this relaxation).
> > > + *
> > > + * Initalize HCR_EL2.E2H so that later code can rely upon HCR_EL2.E2H
> > > + * indicating whether the CPU is running in E2H mode.
> > > + */
> >
> > I think it is even better to clear the HCR_E2H bit first. This can
> > avoid any dependency on the passed parameter 'val'.
>
> What are you trying to avoid? A random value passed as a parameter to
> the macro?
Yes, an unexpected value passed to the macro is a concern.
If the intentaion here is only to set the HCR_EL2.E2H bit when E2H0 < 0,
we don't need to cosndier other configurations and current code is just
fine?
Thanks,
Leo
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/2] KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu()
2025-02-27 18:05 [PATCH 0/2] KVM: arm64: PSCI relay fixes Mark Rutland
2025-02-27 18:05 ` [PATCH 1/2] KVM: arm64: Initialize HCR_EL2.E2H early Mark Rutland
@ 2025-02-27 18:05 ` Mark Rutland
2025-02-28 9:56 ` Leo Yan
2025-03-02 8:41 ` [PATCH 0/2] KVM: arm64: PSCI relay fixes Marc Zyngier
2 siblings, 1 reply; 11+ messages in thread
From: Mark Rutland @ 2025-02-27 18:05 UTC (permalink / raw)
To: linux-arm-kernel
Cc: ahmed.genidi, ben.horgan, catalin.marinas, kvmarm, leo.yan,
mark.rutland, maz, oliver.upton, will
From: Ahmed Genidi <ahmed.genidi@arm.com>
When KVM is in protected mode, host calls to PSCI are proxied via EL2,
and cold entries from CPU_ON, CPU_SUSPEND, and SYSTEM_SUSPEND bounce
through __kvm_hyp_init_cpu() at EL2 before entering the host kernel's
entry point at EL1. While __kvm_hyp_init_cpu() initializes SPSR_EL2 for
the exception return to EL1, it does not initialize SCTLR_EL1.
Due to this, it's possible to enter EL1 with SCTLR_EL1 in an UNKNOWN
state. In practice this has been seen to result in kernel crashes after
CPU_ON as a result of SCTLR_EL1.M being 1 in violation of the initial
core configuration specified by PSCI.
Fix this by initializing SCTLR_EL1 for cold entry to the host kernel.
As it's necessary to write to SCTLR_EL12 in VHE mode, this
initialization is moved into __kvm_host_psci_cpu_entry() where we can
use write_sysreg_el1().
The remnants of the '__init_el2_nvhe_prepare_eret' macro are folded into
its only caller, as this is clearer than having the macro.
Fixes: cdf367192766ad11 ("KVM: arm64: Intercept host's CPU_ON SMCs")
Reported-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Ahmed Genidi <ahmed.genidi@arm.com>
[ Mark: clarify commit message, handle E2H, move to C, remove macro ]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ahmed Genidi <ahmed.genidi@arm.com>
Cc: Ben Horgan <ben.horgan@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/el2_setup.h | 5 -----
arch/arm64/kernel/head.S | 3 ++-
arch/arm64/kvm/hyp/nvhe/hyp-init.S | 2 --
arch/arm64/kvm/hyp/nvhe/psci-relay.c | 3 +++
4 files changed, 5 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index bc8ebd55788ac..7774aec91027e 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -270,11 +270,6 @@
.Lskip_gcs_\@:
.endm
-.macro __init_el2_nvhe_prepare_eret
- mov x0, #INIT_PSTATE_EL1
- msr spsr_el2, x0
-.endm
-
.macro __init_el2_mpam
/* Memory Partitioning And Monitoring: disable EL2 traps */
mrs x1, id_aa64pfr0_el1
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 2d56459d6c94c..2ce73525de2c9 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -322,7 +322,8 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
msr sctlr_el1, x1
mov x2, xzr
3:
- __init_el2_nvhe_prepare_eret
+ mov x0, #INIT_PSTATE_EL1
+ msr spsr_el2, x0
mov w0, #BOOT_CPU_MODE_EL2
orr x0, x0, x2
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index 3fb5504a7d7fc..f8af11189572f 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
@@ -214,8 +214,6 @@ SYM_CODE_START_LOCAL(__kvm_hyp_init_cpu)
bl __kvm_init_el2_state
- __init_el2_nvhe_prepare_eret
-
/* Enable MMU, set vectors and stack. */
mov x0, x28
bl ___kvm_hyp_init // Clobbers x0..x2
diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
index 9c2ce1e0e99a5..c3e196fb8b18f 100644
--- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c
+++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
@@ -218,6 +218,9 @@ asmlinkage void __noreturn __kvm_host_psci_cpu_entry(bool is_cpu_on)
if (is_cpu_on)
release_boot_args(boot_args);
+ write_sysreg_el1(INIT_SCTLR_EL1_MMU_OFF, SYS_SCTLR);
+ write_sysreg(INIT_PSTATE_EL1, SPSR_EL2);
+
__host_enter(host_ctxt);
}
--
2.30.2
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH 2/2] KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu()
2025-02-27 18:05 ` [PATCH 2/2] KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu() Mark Rutland
@ 2025-02-28 9:56 ` Leo Yan
0 siblings, 0 replies; 11+ messages in thread
From: Leo Yan @ 2025-02-28 9:56 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ahmed.genidi, ben.horgan, catalin.marinas,
kvmarm, maz, oliver.upton, will
On Thu, Feb 27, 2025 at 06:05:26PM +0000, Mark Rutland wrote:
> From: Ahmed Genidi <ahmed.genidi@arm.com>
>
> When KVM is in protected mode, host calls to PSCI are proxied via EL2,
> and cold entries from CPU_ON, CPU_SUSPEND, and SYSTEM_SUSPEND bounce
> through __kvm_hyp_init_cpu() at EL2 before entering the host kernel's
> entry point at EL1. While __kvm_hyp_init_cpu() initializes SPSR_EL2 for
> the exception return to EL1, it does not initialize SCTLR_EL1.
>
> Due to this, it's possible to enter EL1 with SCTLR_EL1 in an UNKNOWN
> state. In practice this has been seen to result in kernel crashes after
> CPU_ON as a result of SCTLR_EL1.M being 1 in violation of the initial
> core configuration specified by PSCI.
>
> Fix this by initializing SCTLR_EL1 for cold entry to the host kernel.
> As it's necessary to write to SCTLR_EL12 in VHE mode, this
> initialization is moved into __kvm_host_psci_cpu_entry() where we can
> use write_sysreg_el1().
>
> The remnants of the '__init_el2_nvhe_prepare_eret' macro are folded into
> its only caller, as this is clearer than having the macro.
>
> Fixes: cdf367192766ad11 ("KVM: arm64: Intercept host's CPU_ON SMCs")
> Reported-by: Leo Yan <leo.yan@arm.com>
> Signed-off-by: Ahmed Genidi <ahmed.genidi@arm.com>
> [ Mark: clarify commit message, handle E2H, move to C, remove macro ]
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
> Cc: Ahmed Genidi <ahmed.genidi@arm.com>
> Cc: Ben Horgan <ben.horgan@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Leo Yan <leo.yan@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Will Deacon <will@kernel.org>
> ---
> arch/arm64/include/asm/el2_setup.h | 5 -----
> arch/arm64/kernel/head.S | 3 ++-
> arch/arm64/kvm/hyp/nvhe/hyp-init.S | 2 --
> arch/arm64/kvm/hyp/nvhe/psci-relay.c | 3 +++
> 4 files changed, 5 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
> index bc8ebd55788ac..7774aec91027e 100644
> --- a/arch/arm64/include/asm/el2_setup.h
> +++ b/arch/arm64/include/asm/el2_setup.h
> @@ -270,11 +270,6 @@
> .Lskip_gcs_\@:
> .endm
>
> -.macro __init_el2_nvhe_prepare_eret
> - mov x0, #INIT_PSTATE_EL1
> - msr spsr_el2, x0
> -.endm
> -
> .macro __init_el2_mpam
> /* Memory Partitioning And Monitoring: disable EL2 traps */
> mrs x1, id_aa64pfr0_el1
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 2d56459d6c94c..2ce73525de2c9 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -322,7 +322,8 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
> msr sctlr_el1, x1
> mov x2, xzr
> 3:
> - __init_el2_nvhe_prepare_eret
> + mov x0, #INIT_PSTATE_EL1
> + msr spsr_el2, x0
>
> mov w0, #BOOT_CPU_MODE_EL2
> orr x0, x0, x2
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
> index 3fb5504a7d7fc..f8af11189572f 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
> @@ -214,8 +214,6 @@ SYM_CODE_START_LOCAL(__kvm_hyp_init_cpu)
>
> bl __kvm_init_el2_state
>
> - __init_el2_nvhe_prepare_eret
> -
> /* Enable MMU, set vectors and stack. */
> mov x0, x28
> bl ___kvm_hyp_init // Clobbers x0..x2
> diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> index 9c2ce1e0e99a5..c3e196fb8b18f 100644
> --- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> +++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> @@ -218,6 +218,9 @@ asmlinkage void __noreturn __kvm_host_psci_cpu_entry(bool is_cpu_on)
> if (is_cpu_on)
> release_boot_args(boot_args);
>
> + write_sysreg_el1(INIT_SCTLR_EL1_MMU_OFF, SYS_SCTLR);
> + write_sysreg(INIT_PSTATE_EL1, SPSR_EL2);
> +
> __host_enter(host_ctxt);
> }
>
> --
> 2.30.2
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/2] KVM: arm64: PSCI relay fixes
2025-02-27 18:05 [PATCH 0/2] KVM: arm64: PSCI relay fixes Mark Rutland
2025-02-27 18:05 ` [PATCH 1/2] KVM: arm64: Initialize HCR_EL2.E2H early Mark Rutland
2025-02-27 18:05 ` [PATCH 2/2] KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu() Mark Rutland
@ 2025-03-02 8:41 ` Marc Zyngier
2 siblings, 0 replies; 11+ messages in thread
From: Marc Zyngier @ 2025-03-02 8:41 UTC (permalink / raw)
To: linux-arm-kernel, Mark Rutland
Cc: ahmed.genidi, ben.horgan, catalin.marinas, kvmarm, leo.yan,
oliver.upton, will
On Thu, 27 Feb 2025 18:05:24 +0000, Mark Rutland wrote:
> These patches fix some missing initialization in the PSCI relay code
> which can result in host kernel crashes shortly after entering the cold
> entry points used by PSCI CPU_ON, CPU_SUSPEND, and SYSTEM_SUSPEND.
>
> The SCTLR_EL1 issue was originally reported by Leo Yan and debugged by
> Ahmed Genidi. While looking at Ahmed's patch I spotted a more general
> issue with E2H, so I've fixed that up with patch 1, and I've tweaked his
> patch accordingly. Any errors there are my fault.
>
> [...]
Applied to fixes, thanks!
[1/2] KVM: arm64: Initialize HCR_EL2.E2H early
commit: 7a68b55ff39b0a1638acb1694c185d49f6077a0d
[2/2] KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu()
commit: 3855a7b91d42ebf3513b7ccffc44807274978b3d
Cheers,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 11+ messages in thread