* [PATCH 1/2] KVM: arm64: fix config.hcr used uninitialized in __kvm_at_s1e01_fast
@ 2025-04-15 15:46 D Scott Phillips
2025-04-15 15:46 ` [PATCH 2/2] KVM: arm64: Avoid blocking irqs when tlb flushing/ATing with HCR.TGE=0 D Scott Phillips
2025-04-15 18:22 ` [PATCH 1/2] KVM: arm64: fix config.hcr used uninitialized in __kvm_at_s1e01_fast Marc Zyngier
0 siblings, 2 replies; 7+ messages in thread
From: D Scott Phillips @ 2025-04-15 15:46 UTC (permalink / raw)
To: Catalin Marinas, Joey Gouly, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, Will Deacon, Zenghui Yu, kvmarm,
linux-arm-kernel, linux-kernel
In the skip_mmu_switch case, config.hcr was used uninitialized. On my
machine that caused garbage to be written to HCR_EL2 and then the CPU
got stuck at the synchronous exception handler. Also, the restore of
HCR_EL2 was missing at the end of the function in the same case.
In skip_mmu_switch case, initialize config.hcr with HCR_HOST_VHE_FLAGS.
Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
---
arch/arm64/kvm/at.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
index f74a66ce3064b..ff4b06ce661af 100644
--- a/arch/arm64/kvm/at.c
+++ b/arch/arm64/kvm/at.c
@@ -1233,8 +1233,10 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
* the right one (as we trapped from vEL2). If not, save the
* full MMU context.
*/
- if (vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu))
+ if (vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu)) {
+ config.hcr = read_sysreg(hcr_el2);
goto skip_mmu_switch;
+ }
/*
* Obtaining the S2 MMU for a L2 is horribly racy, and we may not
@@ -1299,7 +1301,9 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
if (!fail)
par = read_sysreg_par();
- if (!(vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu)))
+ if (vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu))
+ write_sysreg(config.hcr, hcr_el2);
+ else
__mmu_config_restore(&config);
return par;
--
2.48.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] KVM: arm64: Avoid blocking irqs when tlb flushing/ATing with HCR.TGE=0
2025-04-15 15:46 [PATCH 1/2] KVM: arm64: fix config.hcr used uninitialized in __kvm_at_s1e01_fast D Scott Phillips
@ 2025-04-15 15:46 ` D Scott Phillips
2025-04-15 18:33 ` Marc Zyngier
2025-04-15 18:22 ` [PATCH 1/2] KVM: arm64: fix config.hcr used uninitialized in __kvm_at_s1e01_fast Marc Zyngier
1 sibling, 1 reply; 7+ messages in thread
From: D Scott Phillips @ 2025-04-15 15:46 UTC (permalink / raw)
To: Catalin Marinas, Joey Gouly, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, Will Deacon, Zenghui Yu, kvmarm,
linux-arm-kernel, linux-kernel
pNMIs are intended to be deliverable during operations like guest
tlb flushing or nested AT, however the setting of HCR_EL2 controls
are accidentally blocking async exceptions.
You can see this by doing:
# perf record -e cycles:Hk -g ./dirty_log_perf_test -m 3 \
-i 4 -s anonymous -b 4G -v 32
Where no samples will be collected in __kvm_tlb_flush_vmid_ipa_nsh()
between enter_vmid_context() and exit_vmid_context() then many
samples are collected right after the write to HCR_EL2 in
exit_vmid_context(), where pNMIs actually get unmasked.
Set HCR_EL2.IMO so that pNMIs are not blocked during guest tlb
flushing or nested AT.
Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
---
arch/arm64/include/asm/hardirq.h | 3 ++-
arch/arm64/kvm/at.c | 4 +++-
arch/arm64/kvm/hyp/vhe/tlb.c | 10 ++++++++++
3 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
index cbfa7b6f2e098..6eb3f93851023 100644
--- a/arch/arm64/include/asm/hardirq.h
+++ b/arch/arm64/include/asm/hardirq.h
@@ -41,7 +41,8 @@ do { \
\
___hcr = read_sysreg(hcr_el2); \
if (!(___hcr & HCR_TGE)) { \
- write_sysreg(___hcr | HCR_TGE, hcr_el2); \
+ write_sysreg((___hcr & ~(HCR_AMO | HCR_IMO | HCR_FMO)) |\
+ HCR_TGE, hcr_el2); \
isb(); \
} \
/* \
diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
index ff4b06ce661af..f31f0d78c5813 100644
--- a/arch/arm64/kvm/at.c
+++ b/arch/arm64/kvm/at.c
@@ -1269,7 +1269,9 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
skip_mmu_switch:
/* Clear TGE, enable S2 translation, we're rolling */
- write_sysreg((config.hcr & ~HCR_TGE) | HCR_VM, hcr_el2);
+ write_sysreg((config.hcr & ~HCR_TGE) |
+ HCR_AMO | HCR_IMO | HCR_FMO | HCR_VM,
+ hcr_el2);
isb();
switch (op) {
diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
index 3d50a1bd2bdbc..ecb700bab3b8f 100644
--- a/arch/arm64/kvm/hyp/vhe/tlb.c
+++ b/arch/arm64/kvm/hyp/vhe/tlb.c
@@ -55,6 +55,15 @@ static void enter_vmid_context(struct kvm_s2_mmu *mmu,
* bits. Changing E2H is impossible (goodbye TTBR1_EL2), so
* let's flip TGE before executing the TLB operation.
*
+ * One other fun complication to consider is the target EL for
+ * asynchronous exceptions. We want to allow NMIs during tlb flushing,
+ * so we need to ensure that the target EL for IRQs remains as EL2.
+ * HCR_EL2.{E2H,TGE,IMO} = {1,0,0} would set the target EL for IRQs as
+ * EL1, and IRQs at EL2 would be "C" (Interrupts not taken regardless
+ * of the value of interrupt masks). So we need to set
+ * HCR_EL2.{E2H,TGE,IMO} = {1,0,1} so that NMIs will still be
+ * delivered.
+ *
* ARM erratum 1165522 requires some special handling (again),
* as we need to make sure both stages of translation are in
* place before clearing TGE. __load_stage2() already
@@ -63,6 +72,7 @@ static void enter_vmid_context(struct kvm_s2_mmu *mmu,
__load_stage2(mmu, mmu->arch);
val = read_sysreg(hcr_el2);
val &= ~HCR_TGE;
+ val |= HCR_AMO | HCR_IMO | HCR_FMO;
write_sysreg(val, hcr_el2);
isb();
}
--
2.48.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] KVM: arm64: fix config.hcr used uninitialized in __kvm_at_s1e01_fast
2025-04-15 15:46 [PATCH 1/2] KVM: arm64: fix config.hcr used uninitialized in __kvm_at_s1e01_fast D Scott Phillips
2025-04-15 15:46 ` [PATCH 2/2] KVM: arm64: Avoid blocking irqs when tlb flushing/ATing with HCR.TGE=0 D Scott Phillips
@ 2025-04-15 18:22 ` Marc Zyngier
2025-04-16 23:00 ` D Scott Phillips
1 sibling, 1 reply; 7+ messages in thread
From: Marc Zyngier @ 2025-04-15 18:22 UTC (permalink / raw)
To: D Scott Phillips
Cc: Catalin Marinas, Joey Gouly, Oliver Upton, Suzuki K Poulose,
Will Deacon, Zenghui Yu, kvmarm, linux-arm-kernel, linux-kernel
On Tue, 15 Apr 2025 16:46:55 +0100,
D Scott Phillips <scott@os.amperecomputing.com> wrote:
>
> In the skip_mmu_switch case, config.hcr was used uninitialized. On my
> machine that caused garbage to be written to HCR_EL2 and then the CPU
> got stuck at the synchronous exception handler. Also, the restore of
> HCR_EL2 was missing at the end of the function in the same case.
Huh, how embarrassing. Thanks for spotting this one.
>
> In skip_mmu_switch case, initialize config.hcr with HCR_HOST_VHE_FLAGS.
>
> Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
> ---
> arch/arm64/kvm/at.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
> index f74a66ce3064b..ff4b06ce661af 100644
> --- a/arch/arm64/kvm/at.c
> +++ b/arch/arm64/kvm/at.c
> @@ -1233,8 +1233,10 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
> * the right one (as we trapped from vEL2). If not, save the
> * full MMU context.
> */
> - if (vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu))
> + if (vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu)) {
> + config.hcr = read_sysreg(hcr_el2);
> goto skip_mmu_switch;
> + }
>
> /*
> * Obtaining the S2 MMU for a L2 is horribly racy, and we may not
> @@ -1299,7 +1301,9 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
> if (!fail)
> par = read_sysreg_par();
>
> - if (!(vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu)))
> + if (vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu))
> + write_sysreg(config.hcr, hcr_el2);
> + else
> __mmu_config_restore(&config);
>
> return par;
I think the diff below should do the trick (and incidently matches
your commit message).
Thanks,
M.
diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
index f74a66ce3064b..773e3b4d5c7e5 100644
--- a/arch/arm64/kvm/at.c
+++ b/arch/arm64/kvm/at.c
@@ -1214,7 +1214,7 @@ static u64 handle_at_slow(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
*/
static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
{
- struct mmu_config config;
+ struct mmu_config config = { .hcr = HCR_HOST_VHE_FLAGS, };
struct kvm_s2_mmu *mmu;
bool fail;
u64 par;
@@ -1301,6 +1301,8 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
if (!(vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu)))
__mmu_config_restore(&config);
+ else
+ write_sysreg(config.hcr, hcr_el2);
return par;
}
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] KVM: arm64: Avoid blocking irqs when tlb flushing/ATing with HCR.TGE=0
2025-04-15 15:46 ` [PATCH 2/2] KVM: arm64: Avoid blocking irqs when tlb flushing/ATing with HCR.TGE=0 D Scott Phillips
@ 2025-04-15 18:33 ` Marc Zyngier
2025-04-16 22:59 ` D Scott Phillips
0 siblings, 1 reply; 7+ messages in thread
From: Marc Zyngier @ 2025-04-15 18:33 UTC (permalink / raw)
To: D Scott Phillips
Cc: Catalin Marinas, Joey Gouly, Oliver Upton, Suzuki K Poulose,
Will Deacon, Zenghui Yu, kvmarm, linux-arm-kernel, linux-kernel
On Tue, 15 Apr 2025 16:46:56 +0100,
D Scott Phillips <scott@os.amperecomputing.com> wrote:
>
> pNMIs are intended to be deliverable during operations like guest
> tlb flushing or nested AT, however the setting of HCR_EL2 controls
> are accidentally blocking async exceptions.
>
> You can see this by doing:
>
> # perf record -e cycles:Hk -g ./dirty_log_perf_test -m 3 \
> -i 4 -s anonymous -b 4G -v 32
>
> Where no samples will be collected in __kvm_tlb_flush_vmid_ipa_nsh()
> between enter_vmid_context() and exit_vmid_context() then many
> samples are collected right after the write to HCR_EL2 in
> exit_vmid_context(), where pNMIs actually get unmasked.
>
> Set HCR_EL2.IMO so that pNMIs are not blocked during guest tlb
> flushing or nested AT.
>
> Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
> ---
> arch/arm64/include/asm/hardirq.h | 3 ++-
> arch/arm64/kvm/at.c | 4 +++-
> arch/arm64/kvm/hyp/vhe/tlb.c | 10 ++++++++++
> 3 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
> index cbfa7b6f2e098..6eb3f93851023 100644
> --- a/arch/arm64/include/asm/hardirq.h
> +++ b/arch/arm64/include/asm/hardirq.h
> @@ -41,7 +41,8 @@ do { \
> \
> ___hcr = read_sysreg(hcr_el2); \
> if (!(___hcr & HCR_TGE)) { \
> - write_sysreg(___hcr | HCR_TGE, hcr_el2); \
> + write_sysreg((___hcr & ~(HCR_AMO | HCR_IMO | HCR_FMO)) |\
> + HCR_TGE, hcr_el2); \
> isb(); \
> } \
> /* \
> diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
> index ff4b06ce661af..f31f0d78c5813 100644
> --- a/arch/arm64/kvm/at.c
> +++ b/arch/arm64/kvm/at.c
> @@ -1269,7 +1269,9 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
>
> skip_mmu_switch:
> /* Clear TGE, enable S2 translation, we're rolling */
> - write_sysreg((config.hcr & ~HCR_TGE) | HCR_VM, hcr_el2);
> + write_sysreg((config.hcr & ~HCR_TGE) |
> + HCR_AMO | HCR_IMO | HCR_FMO | HCR_VM,
> + hcr_el2);
> isb();
>
> switch (op) {
> diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
> index 3d50a1bd2bdbc..ecb700bab3b8f 100644
> --- a/arch/arm64/kvm/hyp/vhe/tlb.c
> +++ b/arch/arm64/kvm/hyp/vhe/tlb.c
> @@ -55,6 +55,15 @@ static void enter_vmid_context(struct kvm_s2_mmu *mmu,
> * bits. Changing E2H is impossible (goodbye TTBR1_EL2), so
> * let's flip TGE before executing the TLB operation.
> *
> + * One other fun complication to consider is the target EL for
> + * asynchronous exceptions. We want to allow NMIs during tlb flushing,
> + * so we need to ensure that the target EL for IRQs remains as EL2.
> + * HCR_EL2.{E2H,TGE,IMO} = {1,0,0} would set the target EL for IRQs as
> + * EL1, and IRQs at EL2 would be "C" (Interrupts not taken regardless
> + * of the value of interrupt masks). So we need to set
> + * HCR_EL2.{E2H,TGE,IMO} = {1,0,1} so that NMIs will still be
> + * delivered.
> + *
> * ARM erratum 1165522 requires some special handling (again),
> * as we need to make sure both stages of translation are in
> * place before clearing TGE. __load_stage2() already
> @@ -63,6 +72,7 @@ static void enter_vmid_context(struct kvm_s2_mmu *mmu,
> __load_stage2(mmu, mmu->arch);
> val = read_sysreg(hcr_el2);
> val &= ~HCR_TGE;
> + val |= HCR_AMO | HCR_IMO | HCR_FMO;
> write_sysreg(val, hcr_el2);
> isb();
> }
This seems terribly complicated for no good reasons. Why can't we run
with HCR_xMO set at all times? i.e. like this:
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 974d72b5905b8..bba4b0e930915 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -100,7 +100,7 @@
HCR_FMO | HCR_IMO | HCR_PTW | HCR_TID3 | HCR_TID1)
#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK | HCR_ATA)
#define HCR_HOST_NVHE_PROTECTED_FLAGS (HCR_HOST_NVHE_FLAGS | HCR_TSC)
-#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
+#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H | HCR_AMO | HCR_IMO | HCR_FMO)
#define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM)
#define MPAMHCR_HOST_FLAGS 0
There should never be a case where we don't want physical interrupts
to be taken at EL2 when running VHE, and we should never use these
bits to mask interrupts. This has been relaxed in the architecture to
have an IMPDEF behaviour anyway, as it cannot be implemented on NV.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] KVM: arm64: Avoid blocking irqs when tlb flushing/ATing with HCR.TGE=0
2025-04-15 18:33 ` Marc Zyngier
@ 2025-04-16 22:59 ` D Scott Phillips
0 siblings, 0 replies; 7+ messages in thread
From: D Scott Phillips @ 2025-04-16 22:59 UTC (permalink / raw)
To: Marc Zyngier
Cc: Catalin Marinas, Joey Gouly, Oliver Upton, Suzuki K Poulose,
Will Deacon, Zenghui Yu, kvmarm, linux-arm-kernel, linux-kernel
Marc Zyngier <maz@kernel.org> writes:
> On Tue, 15 Apr 2025 16:46:56 +0100,
> D Scott Phillips <scott@os.amperecomputing.com> wrote:
>>
>> pNMIs are intended to be deliverable during operations like guest
>> tlb flushing or nested AT, however the setting of HCR_EL2 controls
>> are accidentally blocking async exceptions.
>>
>> You can see this by doing:
>>
>> # perf record -e cycles:Hk -g ./dirty_log_perf_test -m 3 \
>> -i 4 -s anonymous -b 4G -v 32
>>
>> Where no samples will be collected in __kvm_tlb_flush_vmid_ipa_nsh()
>> between enter_vmid_context() and exit_vmid_context() then many
>> samples are collected right after the write to HCR_EL2 in
>> exit_vmid_context(), where pNMIs actually get unmasked.
>>
>> Set HCR_EL2.IMO so that pNMIs are not blocked during guest tlb
>> flushing or nested AT.
>>
>> Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
>> ---
>> arch/arm64/include/asm/hardirq.h | 3 ++-
>> arch/arm64/kvm/at.c | 4 +++-
>> arch/arm64/kvm/hyp/vhe/tlb.c | 10 ++++++++++
>> 3 files changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
>> index cbfa7b6f2e098..6eb3f93851023 100644
>> --- a/arch/arm64/include/asm/hardirq.h
>> +++ b/arch/arm64/include/asm/hardirq.h
>> @@ -41,7 +41,8 @@ do { \
>> \
>> ___hcr = read_sysreg(hcr_el2); \
>> if (!(___hcr & HCR_TGE)) { \
>> - write_sysreg(___hcr | HCR_TGE, hcr_el2); \
>> + write_sysreg((___hcr & ~(HCR_AMO | HCR_IMO | HCR_FMO)) |\
>> + HCR_TGE, hcr_el2); \
>> isb(); \
>> } \
>> /* \
>> diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
>> index ff4b06ce661af..f31f0d78c5813 100644
>> --- a/arch/arm64/kvm/at.c
>> +++ b/arch/arm64/kvm/at.c
>> @@ -1269,7 +1269,9 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
>>
>> skip_mmu_switch:
>> /* Clear TGE, enable S2 translation, we're rolling */
>> - write_sysreg((config.hcr & ~HCR_TGE) | HCR_VM, hcr_el2);
>> + write_sysreg((config.hcr & ~HCR_TGE) |
>> + HCR_AMO | HCR_IMO | HCR_FMO | HCR_VM,
>> + hcr_el2);
>> isb();
>>
>> switch (op) {
>> diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
>> index 3d50a1bd2bdbc..ecb700bab3b8f 100644
>> --- a/arch/arm64/kvm/hyp/vhe/tlb.c
>> +++ b/arch/arm64/kvm/hyp/vhe/tlb.c
>> @@ -55,6 +55,15 @@ static void enter_vmid_context(struct kvm_s2_mmu *mmu,
>> * bits. Changing E2H is impossible (goodbye TTBR1_EL2), so
>> * let's flip TGE before executing the TLB operation.
>> *
>> + * One other fun complication to consider is the target EL for
>> + * asynchronous exceptions. We want to allow NMIs during tlb flushing,
>> + * so we need to ensure that the target EL for IRQs remains as EL2.
>> + * HCR_EL2.{E2H,TGE,IMO} = {1,0,0} would set the target EL for IRQs as
>> + * EL1, and IRQs at EL2 would be "C" (Interrupts not taken regardless
>> + * of the value of interrupt masks). So we need to set
>> + * HCR_EL2.{E2H,TGE,IMO} = {1,0,1} so that NMIs will still be
>> + * delivered.
>> + *
>> * ARM erratum 1165522 requires some special handling (again),
>> * as we need to make sure both stages of translation are in
>> * place before clearing TGE. __load_stage2() already
>> @@ -63,6 +72,7 @@ static void enter_vmid_context(struct kvm_s2_mmu *mmu,
>> __load_stage2(mmu, mmu->arch);
>> val = read_sysreg(hcr_el2);
>> val &= ~HCR_TGE;
>> + val |= HCR_AMO | HCR_IMO | HCR_FMO;
>> write_sysreg(val, hcr_el2);
>> isb();
>> }
>
> This seems terribly complicated for no good reasons. Why can't we run
> with HCR_xMO set at all times? i.e. like this:
Ah, good point Marc. You're right, this is much simpler.
Tested-by: D Scott Phillips <scott@os.amperecomputing.com>
Reviewed-by: D Scott Phillips <scott@os.amperecomputing.com>
> diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
> index 974d72b5905b8..bba4b0e930915 100644
> --- a/arch/arm64/include/asm/kvm_arm.h
> +++ b/arch/arm64/include/asm/kvm_arm.h
> @@ -100,7 +100,7 @@
> HCR_FMO | HCR_IMO | HCR_PTW | HCR_TID3 | HCR_TID1)
> #define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK | HCR_ATA)
> #define HCR_HOST_NVHE_PROTECTED_FLAGS (HCR_HOST_NVHE_FLAGS | HCR_TSC)
> -#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
> +#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H | HCR_AMO | HCR_IMO | HCR_FMO)
>
> #define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM)
> #define MPAMHCR_HOST_FLAGS 0
>
> There should never be a case where we don't want physical interrupts
> to be taken at EL2 when running VHE, and we should never use these
> bits to mask interrupts. This has been relaxed in the architecture to
> have an IMPDEF behaviour anyway, as it cannot be implemented on NV.
>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] KVM: arm64: fix config.hcr used uninitialized in __kvm_at_s1e01_fast
2025-04-15 18:22 ` [PATCH 1/2] KVM: arm64: fix config.hcr used uninitialized in __kvm_at_s1e01_fast Marc Zyngier
@ 2025-04-16 23:00 ` D Scott Phillips
2025-04-17 16:13 ` Marc Zyngier
0 siblings, 1 reply; 7+ messages in thread
From: D Scott Phillips @ 2025-04-16 23:00 UTC (permalink / raw)
To: Marc Zyngier
Cc: Catalin Marinas, Joey Gouly, Oliver Upton, Suzuki K Poulose,
Will Deacon, Zenghui Yu, kvmarm, linux-arm-kernel, linux-kernel
Marc Zyngier <maz@kernel.org> writes:
> On Tue, 15 Apr 2025 16:46:55 +0100,
> D Scott Phillips <scott@os.amperecomputing.com> wrote:
>>
>> In the skip_mmu_switch case, config.hcr was used uninitialized. On my
>> machine that caused garbage to be written to HCR_EL2 and then the CPU
>> got stuck at the synchronous exception handler. Also, the restore of
>> HCR_EL2 was missing at the end of the function in the same case.
>
> Huh, how embarrassing. Thanks for spotting this one.
>
>>
>> In skip_mmu_switch case, initialize config.hcr with HCR_HOST_VHE_FLAGS.
>>
>> Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
>> ---
>> arch/arm64/kvm/at.c | 8 ++++++--
>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
>> index f74a66ce3064b..ff4b06ce661af 100644
>> --- a/arch/arm64/kvm/at.c
>> +++ b/arch/arm64/kvm/at.c
>> @@ -1233,8 +1233,10 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
>> * the right one (as we trapped from vEL2). If not, save the
>> * full MMU context.
>> */
>> - if (vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu))
>> + if (vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu)) {
>> + config.hcr = read_sysreg(hcr_el2);
>> goto skip_mmu_switch;
>> + }
>>
>> /*
>> * Obtaining the S2 MMU for a L2 is horribly racy, and we may not
>> @@ -1299,7 +1301,9 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
>> if (!fail)
>> par = read_sysreg_par();
>>
>> - if (!(vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu)))
>> + if (vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu))
>> + write_sysreg(config.hcr, hcr_el2);
>> + else
>> __mmu_config_restore(&config);
>>
>> return par;
>
> I think the diff below should do the trick (and incidently matches
> your commit message).
Looks good Marc, thanks
Reviewed-by: D Scott Phillips <scott@os.amperecomputing.com>
>
> Thanks,
>
> M.
>
> diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
> index f74a66ce3064b..773e3b4d5c7e5 100644
> --- a/arch/arm64/kvm/at.c
> +++ b/arch/arm64/kvm/at.c
> @@ -1214,7 +1214,7 @@ static u64 handle_at_slow(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
> */
> static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
> {
> - struct mmu_config config;
> + struct mmu_config config = { .hcr = HCR_HOST_VHE_FLAGS, };
> struct kvm_s2_mmu *mmu;
> bool fail;
> u64 par;
> @@ -1301,6 +1301,8 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
>
> if (!(vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu)))
> __mmu_config_restore(&config);
> + else
> + write_sysreg(config.hcr, hcr_el2);
>
> return par;
> }
>
> --
> Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] KVM: arm64: fix config.hcr used uninitialized in __kvm_at_s1e01_fast
2025-04-16 23:00 ` D Scott Phillips
@ 2025-04-17 16:13 ` Marc Zyngier
0 siblings, 0 replies; 7+ messages in thread
From: Marc Zyngier @ 2025-04-17 16:13 UTC (permalink / raw)
To: D Scott Phillips
Cc: Catalin Marinas, Joey Gouly, Oliver Upton, Suzuki K Poulose,
Will Deacon, Zenghui Yu, kvmarm, linux-arm-kernel, linux-kernel
On Thu, 17 Apr 2025 00:00:39 +0100,
D Scott Phillips <scott@os.amperecomputing.com> wrote:
>
> Marc Zyngier <maz@kernel.org> writes:
>
> > On Tue, 15 Apr 2025 16:46:55 +0100,
> > D Scott Phillips <scott@os.amperecomputing.com> wrote:
> >>
> >> In the skip_mmu_switch case, config.hcr was used uninitialized. On my
> >> machine that caused garbage to be written to HCR_EL2 and then the CPU
> >> got stuck at the synchronous exception handler. Also, the restore of
> >> HCR_EL2 was missing at the end of the function in the same case.
> >
> > Huh, how embarrassing. Thanks for spotting this one.
> >
> >>
> >> In skip_mmu_switch case, initialize config.hcr with HCR_HOST_VHE_FLAGS.
> >>
> >> Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
> >> ---
> >> arch/arm64/kvm/at.c | 8 ++++++--
> >> 1 file changed, 6 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
> >> index f74a66ce3064b..ff4b06ce661af 100644
> >> --- a/arch/arm64/kvm/at.c
> >> +++ b/arch/arm64/kvm/at.c
> >> @@ -1233,8 +1233,10 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
> >> * the right one (as we trapped from vEL2). If not, save the
> >> * full MMU context.
> >> */
> >> - if (vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu))
> >> + if (vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu)) {
> >> + config.hcr = read_sysreg(hcr_el2);
> >> goto skip_mmu_switch;
> >> + }
> >>
> >> /*
> >> * Obtaining the S2 MMU for a L2 is horribly racy, and we may not
> >> @@ -1299,7 +1301,9 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
> >> if (!fail)
> >> par = read_sysreg_par();
> >>
> >> - if (!(vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu)))
> >> + if (vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu))
> >> + write_sysreg(config.hcr, hcr_el2);
> >> + else
> >> __mmu_config_restore(&config);
> >>
> >> return par;
> >
> > I think the diff below should do the trick (and incidently matches
> > your commit message).
>
> Looks good Marc, thanks
>
> Reviewed-by: D Scott Phillips <scott@os.amperecomputing.com>
Actually, we can do much better, because the more I look at this, the
more I find it silly. How about the patch below? It also fixes a
latent issue where HCR_PTW was never set... I quickly tested it on my
Q box, and nothing exploded. But at the same time, nothing exploded
with the buggy code...
Thanks,
M.
From 0191a6dfbeb52b41b2c8aa81edf7812b56a3612f Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@kernel.org>
Date: Thu, 17 Apr 2025 16:24:29 +0100
Subject: [PATCH] KVM: arm64: Don't feed uninitialised data to HCR_EL2
When the guest executes an AT S1E{0,1} from EL2, and that its
HCR_EL2.{E2H,TGE}=={1,1}, then this is a pure S1 translation
that doesn't involve a guest-supplied S2, and the full S1
context is already in place. This allows us to take a shortcut
and avoid save/restoring a bunch of registers.
However, we set HCR_EL2 to a value suitable for the use of AT
in guest context. And we do so by using the value that we saved.
Or not. In the case described above, we restore whatever junk
was on the stack, and carry on with it until the next entry.
Needless to say, this is completely broken.
But this also triggers the realisation that saving HCR_EL2 is
a bit pointless. We are always in host context at the point where
reach this code, and what we program to enter the guest is a known
value (vcpu->arch.hcr_el2).
Drop the pointless save/restore, and wrap the AT operations with
writes that switch between guest and host values for HCR_EL2.
Reported-by: D Scott Phillips <scott@os.amperecomputing.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/at.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
index ea497ffbd1b19..da5359668b9c9 100644
--- a/arch/arm64/kvm/at.c
+++ b/arch/arm64/kvm/at.c
@@ -464,7 +464,6 @@ struct mmu_config {
u64 sctlr;
u64 vttbr;
u64 vtcr;
- u64 hcr;
};
static void __mmu_config_save(struct mmu_config *config)
@@ -487,13 +486,10 @@ static void __mmu_config_save(struct mmu_config *config)
config->sctlr = read_sysreg_el1(SYS_SCTLR);
config->vttbr = read_sysreg(vttbr_el2);
config->vtcr = read_sysreg(vtcr_el2);
- config->hcr = read_sysreg(hcr_el2);
}
static void __mmu_config_restore(struct mmu_config *config)
{
- write_sysreg(config->hcr, hcr_el2);
-
/*
* ARM errata 1165522 and 1530923 require TGE to be 1 before
* we update the guest state.
@@ -1248,8 +1244,8 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
__load_stage2(mmu, mmu->arch);
skip_mmu_switch:
- /* Clear TGE, enable S2 translation, we're rolling */
- write_sysreg((config.hcr & ~HCR_TGE) | HCR_VM, hcr_el2);
+ /* Temporarily switch back to guest context */
+ write_sysreg(vcpu->arch.hcr_el2, hcr_el2);
isb();
switch (op) {
@@ -1281,6 +1277,8 @@ static u64 __kvm_at_s1e01_fast(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
if (!fail)
par = read_sysreg_par();
+ write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
+
if (!(vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu)))
__mmu_config_restore(&config);
--
2.39.2
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-04-17 16:51 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-15 15:46 [PATCH 1/2] KVM: arm64: fix config.hcr used uninitialized in __kvm_at_s1e01_fast D Scott Phillips
2025-04-15 15:46 ` [PATCH 2/2] KVM: arm64: Avoid blocking irqs when tlb flushing/ATing with HCR.TGE=0 D Scott Phillips
2025-04-15 18:33 ` Marc Zyngier
2025-04-16 22:59 ` D Scott Phillips
2025-04-15 18:22 ` [PATCH 1/2] KVM: arm64: fix config.hcr used uninitialized in __kvm_at_s1e01_fast Marc Zyngier
2025-04-16 23:00 ` D Scott Phillips
2025-04-17 16:13 ` Marc Zyngier
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).